## (Merge k Sorted Lists)

Merge k sorted linked lists and return it as one sorted list. Analyze and describe its complexity.

Thoughts:
We start with the first list and then merge it with the second list, then the third list, so on and the so forth. For merging two sorted linked list, we use the linear merging algorithm borrowed from Merge Sort. The total complexity is $O(kn)$.

Code (Java):

```public class Solution {
public ListNode mergeKLists(ArrayList<ListNode> lists) {
for (ListNode node : lists)
}

} else {
}
}
}

Code (C++):
class Solution {
public:
ListNode *mergeKLists(vector<ListNode *> &lists) {
for(int i = 0; i < lists.size(); ++i)
}

private:

} else {
}
}
};

__ATA.cmd.push(function() {
__ATA.initSlot('atatags-26942-5d3249dc2a65e',  {
collapseEmpty: 'before',
sectionId: '26942',
location: 120,
width: 300,
height: 250
});
});

```

## Finding the 1st missing positive int in an array (First Missing Positive)

Given an unsorted integer array, find the first missing positive integer.

For example,
Given [1,2,0] return 3,
and [3,4,-1,1] return 2.

Your algorithm should run in $O(n)$ time and uses constant space.

Thoughts:
The idea is simple. What is the most desired array we want to see? Something like [1,2,3] then we know 4 is missing, or [1, 8, 3, 4] then we know 2 is missing. In other word, “all the numbers are in their correct positions”. What are correct positions? For any $i$, A[i] = i+1. So our goal is to rearrange those numbers (in place) to their correct positions. We then need to decide how to arrange them. Let’s take the [3, 4, -1, 1] as an example. The 1st number, 3, we know it should stay in position 2. So we swap A[0] = 3 with A[2]. We then get [-1, 4, 3, 1]. We can’t do anything about -1 so we leave it there. The 2nd number, 4, we know it should sit in A[3]. So we swap A[1] = 4 with A[3]. We then get [-1, 1, 3, 4]. Now 1 should stay in A[0], so we keep swapping and we get [1, -1, 3, 4]. Notice now every positive number is staying in their correct position (A[0]=1, A[2]=3 and A[3]=4). We then need one more scan to find out 2 is missing.

Code (Java):

```import java.util.HashSet;
public class Solution {
public int firstMissingPositive(int[] A) {
for(int i = 0; i < A.length; ++i) {
while(A[i] != i + 1) {
if(A[i] <= 0 || A[i] > A.length || A[i] == A[A[i]-1])
break;
int temp = A[i];
A[i] = A[temp - 1];
A[temp - 1] = temp;
}
}

for(int i = 0; i < A.length; ++i)
if(A[i] != i + 1)
return i + 1;
return A.length + 1;
}
}

Code (C++):
class Solution {
public:
int firstMissingPositive(int A[], int n) {
for(int i = 0; i < n; ++i) {
while(A[i] != i + 1) {
if(A[i] <= 0 || A[i] > n || A[i] == A[A[i]-1])
break;
int temp = A[i];
A[i] = A[temp - 1];
A[temp - 1] = temp;
}
}
for(int i = 0; i < n; ++i)
if(A[i] != i+  1)
return i + 1;
return n + 1;
}
};

__ATA.cmd.push(function() {
__ATA.initVideoSlot('atatags-370373-5d3249dc2ae10', {
sectionId: '370373',
});
});

```

## Find largest 1M numbers in 1B numbers

Describe an algorithm to find the largest 1 million numbers in 1 billion numbers. Assume that the computer memory can hold all one billion numbers.

Thoughts:
We can certainly sort those numbers and return the last 1 million of them. That takes $O(n\log n)$.
If we think about it, we actually do not need to sort them. After all, we just need the largest 1 million numbers, in whatever orders. Therefore we can sort of “partially” sort the numbers and try to find the parts that we need. To do this, we get inspiration from Quicksort, where we get two partitions around pivot during each run:

```[elements < pivot][elements >= pivot]

For simplicity, Let us denote the number of elements in the right part as $m$.
If $m$ is exactly equal to 1 million, we have found the largest 1 million numbers!
If $m$ is larger than 1 million, we need to keep looking for the 1 million numbers, but in the left part of the array this time. We can do this in a recursive way.
If $m$ is less than 1 million, we first remember those $m$ numbers and then we search for the largest (1 million – $m$) numbers in the left part of the array. Again, recursion here.
Using the random pivot choosing approach, this algorithm takes $O(n)$ time complexity and it does not need any additional space since we can do the partition in place.
Codes:
public static int quickPartition(int[] array, int low, int high) {
if (high <= low)
return -1;

// randomly choose a pivot
int index = new Random().nextInt(high - low + 1) + low;
int pivot = array[index];

// swap the pivot to the front
int num = array[index];
array[index] = array[low];
array[low] = num;

// partition as in Quicksort
int L = low + 1;
int R = high;

while (R >= L) {
while (R >= L && array[L] < pivot)
L++;
while (R >= L && array[R] >= pivot)
R--;
if (R >= L) {
int temp = array[L];
array[L] = array[R];
array[R] = temp;
L++;
R--;
}
}

// swap the pivot back to appropriate position.
num = array[R];
array[R] = array[low];
array[low] = num;

return R;
}

public static List<Integer> largestM(int[] numbers, int M, int low, int high) {
// RIndex = index of pivot
int RIndex = quickPartition(numbers, low, high);
// mFound = # of the largest numbers
int mFound = high + 1 - RIndex;

if (mFound == M) { // Done!
int i = RIndex;
for (; i <= high; ++i)
return results;

} else if (mFound > M) { // Keep looking
// check if those mFound elements are actually the same
boolean duplicates = true;
for (int i = RIndex; i <= high; ++i) {
if (numbers[i] != numbers[RIndex])
duplicates = false;
}

if (!duplicates)
return largestM(numbers, M, RIndex, high);
else {
// if they are the same, just return M duplicates of them
for (int k = 0; k < M; ++k)
return results;
}
} else { // Has found some, keep looking for the rest
int i = RIndex;
for (; i < numbers.length; ++i)

results.addAll(largestM(numbers, M - mFound, low, RIndex - 1));
return results;
}

}

```

## Circus tower sorting problem

A circus is designing a tower routine consisting of people standing atop one another’s shoulders. For practical and aesthetic reasons, each person must be both shorter and lighter than the person below him or her. Given the heights and weights of each person in the circus, write a method to compute the largest possible number of people in such a tower.
EXAMPLE:
Input (ht, wt): (65, 100) (70, 150) (56, 90) (75, 190) (60, 95) (68, 110)
Output: The longest tower is length 6 and includes from top to bottom: (56, 90)
(60,95) (65,100) (68,110) (70,150) (75,190)

My initial thoughts:
Sort the data according to height. If same height, sort according to weight then. After sorting, scan the data to find the maximum ordering sequence. Starting from the 1st element, if the next element is ordered, keep going. Otherwise, mark the 1st invalid element. Next time, start from the invalid element. In each scan, store the length of the sequence. Output the largest one.

My initial codes:

```	public static int highestTower(List<Pair<Integer>> data) {

class HeightComparator implements Comparator<Pair<Integer>> {
@Override
public int compare(Pair<Integer> p1, Pair<Integer> p2) {
if (!p1.getX().equals(p2.getX()))
return p1.getX() - p2.getX();
else {
return p1.getY() - p2.getY();
}
}
}

Collections.sort(data, new HeightComparator());
int count = 1;
int max_count = -Integer.MAX_VALUE;
int fromIndex = 0;
boolean notchanged = true;
while (fromIndex != data.size()) {
Pair<Integer> base = data.get(fromIndex);
for (int i = fromIndex; i < data.size() - 1; ++i) {
Pair<Integer> item = data.get(i);
if (item.getX() > base.getX() && item.getY() > base.getY()) {
count++;
base = item;
} else if (notchanged) {
fromIndex = i;
notchanged = false;
}
if (count > max_count)
max_count = count;
}
}
return max_count;
}

It doesn’t work. Infinite loop. After debugging, the program runs correctly but the logic behind is actually wrong hence the output is not correct. What I try to do is as follows:
Suppose, after sorting we have: (1, 3), (2, 2), (3, 1), (4, 6), (5, 4), (6, 5)
Then after 1st scan, we have the max sequence as (1, 3), (4, 6) and the invalid index is 1 (i.e., from (2, 2)). However, I’m expecting we have (1, 3), (5, 4) and (6, 5) as the max sequence.
Solution:
It goes as follows:

Sort the data according to height. Denote it as $D_{1}$. That is $O(n\log n)$.
Sort the data according to weight. Denote it as $D_{2}$. That is $O(n\log n)$.
Find the length of the Longest Common Sub-sequence of $D_{1}$ and $D_{2}$. Using dynamic programming, it’s $O(n^{2})$.

public static int highestTower(List<Pair<Integer>> data) {

class HeightComparator implements Comparator<Pair<Integer>> {
private int which;

public HeightComparator(int which) {
this.which = which;
}

@Override
public int compare(Pair<Integer> p1, Pair<Integer> p2) {
if (which == 0) {
if (!p1.getX().equals(p2.getX()))
return p1.getX() - p2.getX();
else {
return p1.getY() - p2.getY();
}
} else {
if (!p1.getY().equals(p2.getY()))
return p1.getY() - p2.getY();
else {
return p1.getX() - p2.getX();
}
}
}
}

List<Pair<Integer>> copy = new ArrayList<Pair<Integer>>();
Collections.sort(data, new HeightComparator(0));
Collections.sort(copy, new HeightComparator(1));
return LCSLength(data, copy);
}

private static int LCSLength(List<Pair<Integer>> X,
List<Pair<Integer>> Y) {
int[][] array = new int[X.size() + 1][Y.size() + 1];
for (int i = 0; i < X.size(); ++i)
array[i][0] = 0;
for (int j = 0; j < Y.size(); ++j)
array[0][j] = 0;
for (int i = 0; i < X.size(); ++i) {
for (int j = 0; j < Y.size(); ++j) {
if (X.get(i).equals(Y.get(j)))
array[i + 1][j + 1] = array[i][j] + 1;
else
array[i + 1][j + 1] =
Math.max(array[i + 1][j], array[i][j + 1]);
}
}
return array[X.size()][Y.size()];
}

```

## Sorting a 2GB file with one string per line

If you have a 2GB file with one string per line, which sorting algorithm would you use to sort the file and why?

My initial thoughts:
Because 2GB size of strings are way too huge to be put into main memory, I came up with two ways:

1. K-way merge sort. Divide the file into K pieces, transfer them into main memory and sort them.
2. Bucket sort. Sort each character in order.

Solution:
When an interviewer gives a size limit of 2GB, it should tell you something – in this case, it suggests that they don’t want you to bring all the data into memory.
So what do we do? We only bring part of the data into memory..
Algorithm:
How much memory do we have available? Let’s assume we have X MB of memory available.

1. Divide the file into K chunks, where X * K = 2 GB. Bring each chunk into memory and sort the lines as usual using any $O(nlogn)$ algorithm. Save the lines back to the file.
2. Now bring the next chunk into memory and sort.
3. Once we’re done, merge them one by one.

The above algorithm is also known as external sort. Step 3 is known as N-way merge.
The rationale behind using external sort is the size of data. Since the data is too huge and we can’t bring it all into memory, we need to go for a disk based sorting algorithm.

## Sort an array of strings so that anagrams are next to each other

Write a method to sort an array of strings so that all the anagrams are next to each other.

My initial thoughts:
Do a bubble-sort kind sort. Check if two pairs of strings are anagrams or not, if yes, swap.

My initial codes:

```	private static boolean areAnagrams(String s1, String s2) {
if (s1.length() != s2.length() || s1 == null || s2 == null)
return false;
s1 = s1.toLowerCase();
s2 = s2.toLowerCase();
int[] table = new int[26];
for (int i = 0; i < s1.length(); ++i) {
int index = s1.charAt(i) - 'a';
table[index]++;
}
for (int i = 0; i < s2.length(); ++i) {
int index = s2.charAt(i) - 'a';
if (table[index] <= 0)
return false;
table[index]--;
}
for (int i = 0; i < 26; ++i) {
if (table[i] > 0)
return false;
}
return true;
}

private static void swap(String[] strings, int i, int j) {
String temp = strings[i];
strings[i] = strings[j];
strings[j] = temp;
}

public static void sortAnagrams(String[] strings) {
for (int i = 0; i < strings.length - 1; ++i) {
for (int j = i + 1; j < strings.length; ++j) {
if (areAnagrams(strings[i], strings[j]))
swap(strings, i + 1, j);
}
}
}

It does not guarantee the alphabetical orderings of strings:

INPUT: “xyz”, “ca”, “ab”, “ac”, “ba”, “zyx”
EXPECTED OUTPUT: “ab”, “ba”, “ac”, “ca”, “xyz”, “zyx”
ACTUAL OUTPUT: “xyz”, “zyx”, “ab”, “ba”, “ac”, “ca”

Solution:
public class AnagramComparator implements Comparator<String> {
public String sortChars(String s) {
char[] content = s.toCharArray();
Arrays.sort(content);
return new String(content);
}

public int compare(String s1, String s2) {
return sortChars(s1).compareTo(sortChars(s2));
}
}

public static void main(String[] args) {
String[] strings = { "xyz", "ca", "ab", "ac", "ba", "zyx" };
AnagramComparator comparator = new AnagramComparator();
Arrays.sort(strings, comparator);
}

```

## Merge two sorted arrays

You are given two sorted arrays, A and B, and A has a large enough buffer at the end to hold B. Write a method to merge B into A in sorted order.

My initial thoughts:
This is part of the merge-sort. Denote the size of $A$ as $m$ and size of $B$ as $n$, then we have a $O(m+n)$ linear algorithm.

My initial codes:

```	public static int[] merge(int[] A, int[] B) {
int i = 0;
int j = 0;
int[] merged = new int[A.length + B.length];
int index = 0;
while (i != A.length && j != B.length) {
if (A[i] <= B[j]) {
merged[index++] = A[i];
i++;
} else {
merged[index++] = B[j];
j++;
}
}
if (i != A.length) {
for (int m = i; m < A.length; ++m)
merged[index++] = A[m];
}
if (j != B.length) {
for (int m = j; m < B.length; ++m)
merged[index++] = B[m];
}
return merged;
}

It takes $O(m+n)$ additional space. Plus, I don’t utilize the fact that $A$ has a large enough buffer at the end to hold $B$.
Solution:
public static void bufferMerge(int[] A, int[] B,
// Assume sizeAdata + sizeB = A.length
// i.e., B exactly fits into the buffer at the end of A
int k = A.length - 1; // Index of last location of array A
int i = sizeAdata - 1;// Index of last element in array A
int j = sizeB - 1;// Index of last element in array B

// Start comparing from the last element and merge A and B
while (i >= 0 && j >= 0) {
if (A[i] > B[j]) {
A[k--] = A[i--];
} else {
A[k--] = B[j--];
}
}
while (j >= 0) {
A[k--] = B[j--];
}
}

Highlighted line 17. We don’t need to consider the case where $j$ reaches 0 where $i$ is still greater than 0. In that case, it means array $A$ has some leading elements left. Then we don’t need to move them because:
All of them must be less the smallest element in $B$. So they should still be in the leading positions to maintain the ordering.