Describe an algorithm to find the largest 1 million numbers in 1 billion numbers. Assume that the computer memory can hold all one billion numbers.

**Thoughts:**

We can certainly sort those numbers and return the last 1 million of them. That takes .

If we think about it, we actually do not need to sort them. After all, we just need the largest 1 million numbers, in whatever orders. Therefore we can sort of “partially” sort the numbers and try to find the parts that we need. To do this, we get inspiration from Quicksort, where we get two partitions around pivot during each run:

[elements < pivot][elements >= pivot]For simplicity, Let us denote the number of elements in the right part as .

If is exactly equal to 1 million, we have found the largest 1 million numbers!

If is larger than 1 million, we need to keep looking for the 1 million numbers, but in the left part of the array this time. We can do this in a recursive way.

If is less than 1 million, we first remember those numbers and then we search for the largest (1 million – ) numbers in the left part of the array. Again, recursion here.

Using the random pivot choosing approach, this algorithm takes time complexity and it does not need any additional space since we can do the partition in place.

Codes:public static int quickPartition(int[] array, int low, int high) { if (high <= low) return -1; // randomly choose a pivot int index = new Random().nextInt(high - low + 1) + low; int pivot = array[index]; // swap the pivot to the front int num = array[index]; array[index] = array[low]; array[low] = num; // partition as in Quicksort int L = low + 1; int R = high; while (R >= L) { while (R >= L && array[L] < pivot) L++; while (R >= L && array[R] >= pivot) R--; if (R >= L) { int temp = array[L]; array[L] = array[R]; array[R] = temp; L++; R--; } } // swap the pivot back to appropriate position. num = array[R]; array[R] = array[low]; array[low] = num; return R; } public static List<Integer> largestM(int[] numbers, int M, int low, int high) { // RIndex = index of pivot int RIndex = quickPartition(numbers, low, high); // mFound = # of the largest numbers int mFound = high + 1 - RIndex; List<Integer> results = new LinkedList<Integer>(); if (mFound == M) { // Done! int i = RIndex; for (; i <= high; ++i) results.add(numbers[i]); return results; } else if (mFound > M) { // Keep looking // check if those mFound elements are actually the same boolean duplicates = true; for (int i = RIndex; i <= high; ++i) { if (numbers[i] != numbers[RIndex]) duplicates = false; } if (!duplicates) return largestM(numbers, M, RIndex, high); else { // if they are the same, just return M duplicates of them for (int k = 0; k < M; ++k) results.add(numbers[RIndex]); return results; } } else { // Has found some, keep looking for the rest int i = RIndex; for (; i < numbers.length; ++i) results.add(numbers[i]); results.addAll(largestM(numbers, M - mFound, low, RIndex - 1)); return results; } }Advertisements