## Remove Duplicate Characters in a String

Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.
Write the test cases for this method.

My initial thoughts:

1. Use the same bit-vector logic (here) to detect duplicates.
2. If found duplicates, shift the remaining chars (skip continuous duplicates, i.e., “abaac” will be “abc”) to the left

My initial codes:

```void removeDuplicateChars(char* str) {
char* end = str;
while(*end) {
end++;
}
char* curr = str;
int checker = 0;
while(curr != end) {
int value = *curr - 'a';
if((checker & (1 << value)) > 0) {
char* ptr = curr;
char* tmp = curr;
while(*ptr == *curr) {
ptr++;
}
while(ptr != end) {
*tmp++ = *ptr++;
}
*tmp = *ptr;
}
checker |= (1 << value);
cout << str << endl;
curr++;
}
*curr = '\0';
}

Empty string, fine. Strings with all unique chars, fine. Strings with one duplicate (e.g., “hello”), fine.
Strings with multiple duplicates, only remove for one. (“aabb” outputs “abb”)

Solution:
public static void removeDuplicates(char[] str) {
if (str == null)
return;
int len = str.length;
if (len < 2)
return;

int tail = 1;

for (int i = 1; i < len; ++i) {
int j;
for (j = 0; j < tail; ++j) {
if (str[i] == str[j])
break;
}
if (j == tail) {
str[tail] = str[i];
++tail;
}
}
str[tail] = 0;
}

EXCEPTION when input is string that does not contain any duplicates ‘abcd’
WORSE: ‘aaaa’ will be ‘a aa’.
Not sure about putting ‘0’ into a char array in java means terminal of the string or not. We can definitely change this code to C/C++ and that should work.

So I write a C/C++ version based on this solution:
void removeDuplicates(char* str) {
if(str == 0) return;
char* end = str;
int length = 0;
while(*end) {
length++;
end++;
}
if(length < 2) return;

int index = 1;
for(int i = 1; i < length; ++i) {
int j;
for(j = 0; j < index; ++j) {
if(str[i] == str[j]) break;
}
if(j == index) {
str[index] = str[i];
index++;
}
}
str[index] = '\0';
}

This method does not use large additional memory
Time complexity is O(n^2)

If additional memory of constant size is allowed, then we can build a boolean table:
void removeDuplicates(char* str) {
if(str == 0) return;
char* end = str;
int length = 0;
while(*end) {
length++;
end++;
}
if(length < 2) return;

bool hit[256];
for(int i = 0; i < 256; ++i)
hit[i] = false;

hit[str[0]] = true;
int j = 1;
for(int i = 1; i < length; ++i) {
int index = str[i];
if(!hit[index]) {
str[j] = str[i];
++j;
hit[index] = true;
}
}
str[j] = '\0';
}

Test Cases

Null string
Empty string
String does not contain any duplicates, e.g.: abcd
String contains all duplicates, e.g.: aaaa
String with all continuous duplicates, e.g.: aaabbb
String with non-contiguous duplicates, e.g.: abababa

__ATA.cmd.push(function() {
sectionId: '370373',
});
});

__ATA.cmd.push(function() {
collapseEmpty: 'before',
sectionId: '26942',
width: 300,
height: 250
});
});

__ATA.cmd.push(function() {
collapseEmpty: 'before',
sectionId: '114160',
width: 300,
height: 250
});
});

Related
```

1. Hmmm, nicely copied from Cracking the Coding Interview by Gayle Lackmann…

2. Thoughts about copying – No Problem.

But you should try to explain the logic.

3. No problem about copying but you should consider explaining the logic..

4. No problem about copying the code but you should consider explaining the logic to your readers.

5. Solution 2 will also remove single duplicate only. For eg “aabbcc” would output
“a bbcc”.

6. So the book solution is wrong? lol

7. dharshan
Jul 02, 2014 @ 14:10:23

proper soln is not available fo anythng