Remove Duplicate Characters in a String

Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.
FOLLOW UP
Write the test cases for this method.

My initial thoughts:

  1. Use the same bit-vector logic (here) to detect duplicates.
  2. If found duplicates, shift the remaining chars (skip continuous duplicates, i.e., “abaac” will be “abc”) to the left

My initial codes:

void removeDuplicateChars(char* str) {
  char* end = str;
  while(*end) {
    end++;
  }
  char* curr = str;
  int checker = 0;
  while(curr != end) {
    int value = *curr - 'a';
    if((checker & (1 << value)) > 0) {
      char* ptr = curr;
      char* tmp = curr;
      while(*ptr == *curr) {
        ptr++;
      }
      while(ptr != end) {
        *tmp++ = *ptr++;
      }
      *tmp = *ptr;
    }
    checker |= (1 << value);
    cout << str << endl;
    curr++;
  }
  *curr = '\0';
}

Comments after running:

  1. Empty string, fine. Strings with all unique chars, fine. Strings with one duplicate (e.g., “hello”), fine.
  2. Strings with multiple duplicates, only remove for one. (“aabb” outputs “abb”)

Solution:

	public static void removeDuplicates(char[] str) {
		if (str == null)
			return;
		int len = str.length;
		if (len < 2)
			return;

		int tail = 1;

		for (int i = 1; i < len; ++i) {
			int j;
			for (j = 0; j < tail; ++j) {
				if (str[i] == str[j])
					break;
			}
			if (j == tail) {
				str[tail] = str[i];
				++tail;
			}
		}
		str[tail] = 0;
	}

Comments:

  1. EXCEPTION when input is string that does not contain any duplicates ‘abcd’
  2. WORSE: ‘aaaa’ will be ‘a aa’.
  3. Not sure about putting ‘0’ into a char array in java means terminal of the string or not. We can definitely change this code to C/C++ and that should work.

So I write a C/C++ version based on this solution:

void removeDuplicates(char* str) {
	if(str == 0) return;
	char* end = str;
	int length = 0;
	while(*end) {
		length++;
		end++;
	}
	if(length < 2) return;
	
	int index = 1;
	for(int i = 1; i < length; ++i) {
		int j;
		for(j = 0; j < index; ++j) {
			if(str[i] == str[j]) break;
		}
		if(j == index) {
			str[index] = str[i];
			index++;
		}
	}
	str[index] = '\0';
}

Comments:

  1. This method does not use large additional memory
  2. Time complexity is O(n^2)

If additional memory of constant size is allowed, then we can build a boolean table:

void removeDuplicates(char* str) {
	if(str == 0) return;
	char* end = str;
	int length = 0;
	while(*end) {
		length++;
		end++;
	}
	if(length < 2) return;
	
	bool hit[256];
	for(int i = 0; i < 256; ++i)
		hit[i] = false;

	hit[str[0]] = true;
	int j = 1;
	for(int i = 1; i < length; ++i) {
		int index = str[i];
		if(!hit[index]) {
			str[j] = str[i];
			++j;
			hit[index] = true;
		}
	}
	str[j] = '\0';
}

Test Cases

  1. Null string
  2. Empty string
  3. String does not contain any duplicates, e.g.: abcd
  4. String contains all duplicates, e.g.: aaaa
  5. String with all continuous duplicates, e.g.: aaabbb
  6. String with non-contiguous duplicates, e.g.: abababa
Advertisements

8 Comments (+add yours?)

  1. Cupidvogel
    Nov 15, 2012 @ 12:02:15

    Hmmm, nicely copied from Cracking the Coding Interview by Gayle Lackmann…

    Reply

  2. coolsops
    Jun 07, 2013 @ 21:50:27

    Thoughts about copying – No Problem.

    But you should try to explain the logic.

    Reply

  3. coolsops
    Jun 07, 2013 @ 21:57:49

    No problem about copying but you should consider explaining the logic..

    Reply

  4. coolsops
    Jun 07, 2013 @ 21:59:13

    No problem about copying the code but you should consider explaining the logic to your readers.

    Reply

  5. Ria
    Sep 03, 2013 @ 07:24:23

    Reply

  6. Ria
    Sep 03, 2013 @ 07:27:04

    Solution 2 will also remove single duplicate only. For eg “aabbcc” would output
    “a bbcc”.

    Reply

  7. hunter123
    Feb 18, 2014 @ 22:46:31

    So the book solution is wrong? lol

    Reply

  8. dharshan
    Jul 02, 2014 @ 14:10:23

    proper soln is not available fo anythng

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: