Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

The overall run time complexity should be O(log (m+n)).

 

Example 1:

Input: nums1 = [1,3], nums2 = [2]
Output: 2.00000
Explanation: merged array = [1,2,3] and median is 2.

Example 2:

Input: nums1 = [1,2], nums2 = [3,4]
Output: 2.50000
Explanation: merged array = [1,2,3,4] and median is (2 + 3) / 2 = 2.5.

 

Constraints:

  • nums1.length == m
  • nums2.length == n
  • 0 <= m <= 1000
  • 0 <= n <= 1000
  • 1 <= m + n <= 2000
  • -10^6 <= nums1[i], nums2[i] <= 10^6

Companies: Amazon, Google, Adobe, Bloomberg, Apple, Facebook, Yahoo, Goldman Sachs, Microsoft, TikTok, Uber, MathWorks, Accenture, VMware, LinkedIn, ServiceNow, tcs, Walmart Labs, PayPal, Oracle, Morgan Stanley, Samsung, Yandex, SAP, Arcesium, Sprinklr, Zoho, Capgemini, OLX, Zenefits, Dropbox

Related Topics:
Array, Binary Search, Divide and Conquer

Similar Questions:

Solution 1.

// OJ: https://leetcode.com/problems/median-of-two-sorted-arrays
// Author: github.com/lzl124631x
// Time: O(M + N)
// Space: O(1)
class Solution {
public:
    double findMedianSortedArrays(vector<int>& A, vector<int>& B) {
        int M = A.size(), N = B.size(), a = 0, b = 0;
        // Pops and returns the smallest remaining element of the two arrays.
        auto getMin = [&]() {
            int ans;
            if (b >= N || (a < M && A[a] <= B[b])) ans = A[a++];
            else ans = B[b++];
            return ans;
        };
        // Discard the elements strictly before the middle position(s).
        for (int i = 0; i < (M + N - 1) / 2; ++i) getMin();
        if ((M + N) % 2) return getMin();   // odd total: the single middle element
        return (getMin() + getMin()) / 2.;  // even total: average of the two middle elements
    }
};
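For a quick local check (not part of the original solution), here is a minimal driver that feeds the two examples from the problem statement to the class above; it assumes the Solution class is pasted into the same file:

#include <iostream>
#include <vector>
using namespace std;

// Paste the Solution class from above here.

int main() {
    Solution s;
    vector<int> a1{1, 3}, b1{2};
    cout << s.findMedianSortedArrays(a1, b1) << "\n"; // expected 2
    vector<int> a2{1, 2}, b2{3, 4};
    cout << s.findMedianSortedArrays(a2, b2) << "\n"; // expected 2.5
    return 0;
}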

Solution 2.

Intuition

Because the inputs are sorted arrays and the problem asks for logarithmic time, binary search (or something similar) looks like a promising approach. While it's not obvious how to apply the standard binary-search pattern to this problem directly, let's walk through the steps of a regular binary search and see whether they give us any inspiration. (If you are not familiar with binary search, you can refer to the Binary Search Explore Card.)

Here we use binary search to find target in a sorted array A:

  • Locate the middle index (element) of A.

  • Compare the value of the middle element with target.

  • Reduce the search space by cutting the current array in half and discarding the half which is guaranteed not to contain target.

  • Repeat the above process until we either find target or the search range shrinks to length 0. (A minimal sketch of this procedure is shown below.)
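Here is a minimal iterative sketch of that procedure in C++ (standard binary search on a sorted vector A; nothing here is specific to this problem):

#include <vector>
using namespace std;

// Returns the index of target in the sorted vector A, or -1 if it is absent.
int binarySearch(const vector<int>& A, int target) {
    int lo = 0, hi = (int)A.size() - 1;    // inclusive search range [lo, hi]
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;      // middle index of the current range
        if (A[mid] == target) return mid;
        if (A[mid] < target) lo = mid + 1; // discard the left half
        else hi = mid - 1;                 // discard the right half
    }
    return -1;                             // the range became empty
}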


At each step, the search space is cut in half, so we can quickly get the result. Now back to this problem where we have two sorted arrays. For the sake of convenience, let's call them A and B.


Similarly, we can take and compare their middle values A_mid and B_mid. Without loss of generality, assume A_mid <= B_mid in this example.


What does this comparison imply?

It implies that we can compare sections of A and B.

For the rest of this article, we will use <= to represent the relative magnitude of values in arrays. For example, A_left <= A_right means that every element in A_left is no larger than any element in A_right. We also 'compare' elements in an array with a single element similarly; for example, A_left <= A_mid means that every element in A_left is no larger than the element A_mid.
This may not be the most standard way of expressing it, but it is easy enough to understand.

Recall that both arrays are sorted, so we know that:

  • A_left <= A_mid
  • B_mid <= B_right

Combine these observations with the comparison we just made:

A_mid <= B_mid

We have the following result:

A_left <= A_mid <= B_mid <= B_right

Thus,

A_left <= B_right


Since A is sorted, we know that A_left <= A_right.


Now we know that every element of A_left is no larger than any element of A_right or B_right. Although we still don't know exactly where these elements end up in the final sorted order, we do know that A_left doesn't interleave with A_right + B_right: an invisible boundary divides all the sorted elements into two parts, with the A_left segment entirely before the mixed segment A_right + B_right.

We can apply all the same logic to the mixed segment A_left + B_left and the segment B_right, which also do not intersect. You can try to prove it yourself as an exercise.


Things now look somewhat clearer: we have cleanly separated some subarrays. How do we continue to leverage this knowledge and apply the cut-in-half idea repeatedly?


The following step is the most important one.

Remember that we are looking for the median of sorted(A + B), which is one or two target values. Let k be the index of a target value in sorted(A + B). For example:

  • If the lengths of A and B are 6 and 5, the target index is k = (6 + 5 + 1) / 2 = 6, so we look for the 6th smallest element.

  • If the lengths of A and B are 6 and 6, the target indexes are k = (6 + 6) / 2 = 6 and k + 1 = 7, so we look for the 6th and the 7th smallest elements.

Depending on whether the total number of elements is odd or even, we need the k-th (and maybe the (k+1)-th) element. What matters is that we set an index k at the beginning and we want to find the k-th smallest element using the binary-search-like algorithm discussed previously (for convenience, we will discuss only the k-th element for now).

However, during this binary-search-like algorithm we keep removing one half of an array, so the index k may change. Suppose we removed 3 elements that are smaller than the original k-th smallest element; then we should look for the (k - 3)-th smallest element in the remaining arrays.


More specifically:

If k is larger than half the total number of elements in sorted(A + B), the k-th element is in the second (larger) half of sorted(A + B); thus A_left (or B_left, whichever of the two left sections is the smaller one according to the comparison) is guaranteed not to contain this element. We can safely cut this half and reduce k by the length of the removed half.

If k is not larger than half the total number of elements in sorted(A + B), the k-th element is in the first (smaller) half of sorted(A + B); thus B_right (or A_right, whichever of the two right sections is the larger one according to the comparison) is guaranteed not to contain this element, and we can safely discard it. Note that we don't need to modify k this time, since we removed a larger half, which doesn't affect the position of the k-th smallest element.

We can continue the search like this on the remaining arrays: once we cut off part of A or B, we regard what remains as the modified A and B and restart the algorithm on them. Note that this describes only one case: we assumed a_value < b_value, so we remove either the smaller half of A or the larger half of B. If the comparison gives a_value >= b_value instead, we remove either the smaller half of B or the larger half of A.


That's it. We cut one of the two arrays in half at each step, so this approach has a logarithmic time complexity which we will discuss in detail later.

One more thing!

Above we repeated the process on "modified" arrays, but that was just for the sake of understanding. We won't repeatedly create copies of the two arrays, because that would introduce at least linear time complexity. Instead, we treat a part of the original array as the modified array for the next step, so that we can repeat the process on the original arrays without making any copies. To do this, we maintain four pointers, two for each array; for example, a_start and a_end represent an inclusive range [a_start, a_end] of A.


Algorithm

Let's define a function that finds the k-th smallest element within two inclusive ranges [a_start, a_end] and [b_start, b_end] of arrays A and B.

  1. If one of the ranges (say A's) is empty, i.e. a_start > a_end, then A has been exhausted: exactly a_start elements of A are known to lie before the target, so we just return the (k - a_start)-th element of the other array B. Vice versa if b_start > b_end.

  2. Otherwise, get the middle indexes of the two ranges: a_index = (a_start + a_end) / 2, b_index = (b_start + b_end) / 2.

  3. Get the middle values of the two ranges: a_value = A[a_index], b_value = B[b_index].

  4. Cut one array in half, according to:

    • If a_index + b_index < k, cut one smaller half.
      • If a_value < b_value, cut the smaller half of A.
      • Otherwise, cut the smaller half of B.
    • Otherwise, cut one larger half.
      • If b_value < a_value, cut the larger half of A.
      • Otherwise, cut the larger half of B.
  5. Repeat from step 1 using the new starting and ending indexes of A and B. (A worked trace of these steps follows.)
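To make these steps concrete, here is a hand-traced run on a made-up instance (the arrays are hypothetical; indexes count from 0, matching the code below). Let A = [1, 3, 5, 7, 9, 11] and B = [2, 4, 6, 8, 10]. The total length is 11, so the median is the element at index k = 5 of sorted(A + B), i.e. its 6th smallest element.

  • Ranges [0, 5] and [0, 4]: a_index = 2 (a_value = 5), b_index = 2 (b_value = 6). Since 2 + 2 = 4 < 5 and 5 < 6, cut the smaller half of A, i.e. drop A[0..2].
  • Ranges [3, 5] and [0, 4]: a_index = 4 (a_value = 9), b_index = 2 (b_value = 6). Since 4 + 2 = 6 >= 5 and 6 < 9, cut the larger half of A, i.e. drop A[4..5].
  • Ranges [3, 3] and [0, 4]: a_index = 3 (a_value = 7), b_index = 2 (b_value = 6). Since 3 + 2 = 5 >= 5 and 6 < 7, cut the larger half of A, i.e. drop A[3].
  • Now a_start = 3 > a_end = 2, so A's range is empty and we return the (k - a_start)-th = 2nd element of B, which is B[2] = 6, indeed the median of the merged array [1, 2, ..., 11].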

Then we move on to find the median elements, and get the length of both arrays na = len(A) and nb = len(B).

  • If the total number of elements in A and B is odd, we use the above function once to find the single middle element, with k = (na + nb) / 2 (counting from 0 here, as the code below does).
  • Otherwise, we use the function twice, with k = (na + nb) / 2 - 1 and k = (na + nb) / 2, to find the two middle elements and return their average.

Complexity Analysis

Let m be the size of the array nums1 and n be the size of the array nums2.

  • Time complexity: O(log(m·n))

    • At each step, we cut one half off of either nums1 or nums2. Once one of the arrays is emptied, we can get the target from the other array directly in constant time. Therefore, the total time spent depends on when one of the arrays is cut down to an empty range.
    • In the worst-case scenario, we may need to cut both arrays down before finding the target element.
    • One of the two arrays is cut in half at each step, so it takes logarithmic time to empty an array, and the times to empty the two arrays are independent of each other.


    • Therefore, the time complexity is O(log m + log n), and O(log m + log n) = O(log(m·n)).
  • Space complexity: O(log m + log n)

    • Similar to the analysis of the time complexity, the number of recursion steps depends on the number of iterations before we cut an array down to an empty range. In the worst-case scenario, we need O(log m + log n) recursion steps.

    • However, during the recursive calls we only need to maintain 4 pointers: a_start, a_end, b_start and b_end. The last step of the function is to call itself, so if tail call optimization is implemented, the call stack always holds O(1) records.

    • Please refer to Tail Call for more information on tail call optimization.
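As a rough sanity check with the maximum sizes allowed by the constraints, m = n = 1000: each array can be halved only about log2(1000) ≈ 10 times before its range becomes empty, so the search performs on the order of 20 halving steps in total, and without tail call optimization the recursion depth is bounded by roughly the same number.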


// OJ: https://leetcode.com/problems/median-of-two-sorted-arrays
// Author: github.com/lzl124631x
// Time: O(log(MN))
// Space: O(log(MN))
class Solution {
public:
    double findMedianSortedArrays(vector<int>& A, vector<int>& B) {
        int M = A.size(), N = B.size();
        // Returns the element at absolute 0-indexed position `target` of sorted(A + B),
        // considering only the inclusive ranges A[al..ar] and B[bl..br].
        function<int(int, int, int, int, int)> search = [&](int target, int al, int ar, int bl, int br) {
            if (ar < al) return B[target - al]; // A exhausted: `al` elements of A lie before the target
            if (br < bl) return A[target - bl]; // B exhausted: `bl` elements of B lie before the target
            int amid = (al + ar) / 2, bmid = (bl + br) / 2;
            if (amid + bmid < target) { // target lies in the larger part: cut a smaller half
                if (A[amid] > B[bmid]) return search(target, al, ar, bmid + 1, br);
                return search(target, amid + 1, ar, bl, br);
            }
            // target lies in the smaller part: cut a larger half
            if (A[amid] > B[bmid]) return search(target, al, amid - 1, bl, br);
            return search(target, al, ar, bl, bmid - 1);
        };
        if ((M + N) % 2) return search((M + N) / 2, 0, M - 1, 0, N - 1);
        return (search((M + N - 1) / 2, 0, M - 1, 0, N - 1) + search((M + N) / 2, 0, M - 1, 0, N - 1)) / 2.;
    }
};

Solution 3.

// OJ: https://leetcode.com/problems/median-of-two-sorted-arrays/
// Author: github.com/lzl124631x
// Time: O(log(min(M, N)))
// Space: O(1)
// Ref: https://leetcode.com/problems/median-of-two-sorted-arrays/solution/
// Ref: https://youtu.be/LPFhl65R7ww
class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        if (nums1.size() > nums2.size()) swap(nums1, nums2); // binary search over the smaller array
        // K is the size of the left half of the merged array.
        int M = nums1.size(), N = nums2.size(), L = 0, R = M, K = (M + N + 1) / 2;
        while (true) {
            // Partition: i elements of nums1 and j elements of nums2 form the left half.
            int i = (L + R) / 2, j = K - i;
            if (i < M && nums2[j - 1] > nums1[i]) L = i + 1;      // i is too small: take more from nums1
            else if (i > L && nums1[i - 1] > nums2[j]) R = i - 1; // i is too large: take less from nums1
            else {
                // Valid partition: every element on the left is <= every element on the right.
                int maxLeft = max(i ? nums1[i - 1] : INT_MIN, j ? nums2[j - 1] : INT_MIN);
                if ((M + N) % 2) return maxLeft;
                int minRight = min(i == M ? INT_MAX : nums1[i], j == N ? INT_MAX : nums2[j]);
                return (maxLeft + minRight) / 2.0;
            }
        }
    }
};