Determine if a string has all unique characters

Implement an algorithm to determine if a string has all unique characters What if you can not use additional data structures?

My initial thoughts:

  1. If additional data structures are allowed, then apparently HashTable comes to mind.
  2. There are 128(256) ASCII characters, so we can use an auxiliary array of size 256, each cell stores the count of that particular character. If the count is greater than 1, then it’s the sign of repetition of characters.

My initial codes:

public static boolean doesStringHasAllUniqueChars(String s) {
  int[] table = new int[256];
  for(char c : s) {
    int key = c % 256;
    if(table[key] == 0)
      table[key] = 1;
    else
      return false;
  return true;
}

Comments after running:

  1. Compiling error at highlighted line. String s: Can only iterate over an array or an instance of java.lang.Iterable. Change to s.toCharArray().
  2. Everything else looks just fine. Test for empty string, string with unique chars and string with repeated chars.

Solution:

	public static boolean isUniqueChars2(String str) {
		boolean[] char_set = new boolean[256];
		for (int i = 0; i < str.length(); i++) {
			int val = str.charAt(i);
			if (char_set[val])
				return false;
			char_set[val] = true;
		}
		return true;
	}

Comments:

  1. Using boolean table is much better than my solution. It saves spaces.
  2. You don’t need to mod 256 if you assume there are <256 chars.
  3. Time complexity O(n); space complexity O(n).

Yet another crazy solution is given by:

	public static boolean isUniqueChars(String str) {
		int checker = 0;
		for (int i = 0; i < str.length(); ++i) {
			int val = str.charAt(i) - 'a';
			if ((checker & (1 << val)) > 0)
				return false;
			checker |= (1 << val);
		}
		return true;
	}

Which assumes that the characters are from ‘a’ to ‘z’. Let me explain what this is:

  1. ‘checker’ is the bit vector. checker[i] = 0 if ASCII code ‘a’ + i has not been presented; 1 otherwise. That is what line 6 is doing. | is inclusive OR (i.e., the general sense OR).
  2. Within loop, compare wether a bit-position has already been filled (i.e., value 1). This is done by logical and. That is what line 5 is doing. & is bitwise AND. << shifts the left-hand operand to right-hand bits to the left.

Other thoughts:

  1. Check every char of the string with every other char of the string for duplicate occurrences. This will take O(n^2) time and no space.
  2. If we are allowed to destroy the input string, we could sort the string in O(n log n) time and then linearly check the string for neighboring characters that are identical. Careful, though – many sorting algorithms take up extra space.

4 Comments (+add yours?)

  1. Trackback: Remove Duplicate Characters in a String « Runhe Tian Coding Interview
  2. mohit
    Feb 10, 2014 @ 13:50:46

    excellent solution

    Reply

  3. Kevin
    Mar 23, 2014 @ 20:40:05

    Shamelessly copied from Gayle Laakman Book

    Reply

  4. suptim
    Jul 12, 2014 @ 07:54:22

    space complexity is O(1) not O(n). No matter how long the input string is, the allocated charset array or bit-vector is 128-element long.

    Reply

Leave a comment