Minimum Deletions to Make Character Frequencies Unique#
Problem statement#
[1]A string s is considered “good” if no two different characters in it have the same frequency, meaning each character appears a unique number of times.
You’re given a string s, and your task is to determine the minimum number of characters you need to delete from s to make it a “good” string.
The frequency of a character in a string is the number of times that character appears in the string. For instance, in the string "aab", the frequency of 'a' is 2, and the frequency of 'b' is 1.
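As a minimal sketch of the frequency definition above, the counts can be collected in a 26-slot array. The helper name countFrequencies is hypothetical (it does not appear in the original text), and the sketch assumes the input contains only lowercase English letters.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical helper (not from the original text): counts how often each
// lowercase letter occurs in s. Index 0 is 'a', index 25 is 'z'.
std::array<int, 26> countFrequencies(const std::string& s) {
    std::array<int, 26> freq{};  // value-initialized: every count starts at 0
    for (char c : s) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase letters only
        freq[c - 'a']++;
    }
    return freq;
}
```

For "aab", the slot for 'a' (index 0) holds 2 and the slot for 'b' (index 1) holds 1, matching the frequencies stated above.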
Example 1#
Input: s = "aab"
Output: 0
Explanation: s is already good.
Example 2#
Input: s = "aaabbbcc"
Output: 2
Explanation: You can delete two 'b's resulting in the good string "aaabcc".
Another way is to delete one 'b' and one 'c' resulting in the good string "aaabbc".
Example 3#
Input: s = "ceabaacb"
Output: 2
Explanation: You can delete both 'c's resulting in the good string "eabaab".
Note that we only care about characters that are still in the string at the end (i.e. frequency of 0 is ignored).
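The strings produced in the explanations above can be checked with a small predicate. This is only a sketch: the name isGood is hypothetical, and it assumes lowercase input. Letters with frequency 0 are skipped, as the note in Example 3 requires.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical checker (not from the original text): returns true when no two
// letters that are present in s share the same frequency.
bool isGood(const std::string& s) {
    std::vector<int> freq(26, 0);
    for (char c : s) {
        assert(c >= 'a' && c <= 'z');  // assumption: lowercase letters only
        freq[c - 'a']++;
    }
    std::set<int> seen;
    for (int f : freq) {
        if (f == 0) continue;                      // absent letters are ignored
        if (!seen.insert(f).second) return false;  // this frequency is already taken
    }
    return true;
}
```

With this predicate, "aaabcc", "aaabbc" and "eabaab" are good, while the inputs "aaabbbcc" and "ceabaacb" are not.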
Constraints#
1 <= s.length <= 10^5.
s contains only lowercase English letters.
Solution: Delete the frequencies in sorted order#
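Your goal is to make all the nonzero frequencies different from each other.
One way of doing that is to sort the frequencies in descending order and then walk through them from the largest, deleting just enough occurrences of each character to bring its frequency strictly below the previous one (or down to 0 when no positive value is left).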
Example 4#
For s = "ceaacbb", the frequencies of the characters are: freq['a'] = 2, freq['b'] = 2, freq['c'] = 2 and freq['e'] = 1. They are already sorted in descending order.
Let the current frequency be the first frequency, freq['a'] = 2.
The next frequency is freq['b'] = 2, equal to the current frequency. Delete one occurrence of 'b' so that the current frequency becomes 1.
The next frequency is freq['c'] = 2, bigger than the current frequency 1. Delete two occurrences of 'c' so that the current frequency becomes 0.
Because the current frequency is 0, delete all occurrences of the remaining characters, which here is only freq['e'] = 1.
In total, there are 1 + 2 + 1 = 4 deletions.
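The walkthrough above can be replayed with a short sketch that returns the frequency each character ends up with. It is an equivalent reformulation rather than the code of this solution: each frequency is capped at one less than the previously kept value, never going below 0. The name keptFrequencies is hypothetical, and the input is assumed to be already sorted in descending order.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical helper (not from the original text): for frequencies sorted in
// descending order, returns the frequency each entry keeps after the deletions.
std::vector<int> keptFrequencies(const std::vector<int>& freq) {
    // assumption: the caller has already sorted freq in descending order
    assert(std::is_sorted(freq.begin(), freq.end(), std::greater<int>()));
    std::vector<int> kept;
    int prev = std::numeric_limits<int>::max();  // no limit for the first entry
    for (int f : freq) {
        // keep at most prev - 1 occurrences, but never a negative amount
        int k = std::min(f, std::max(prev - 1, 0));
        kept.push_back(k);
        prev = k;
    }
    return kept;
}
```

For the frequencies {2, 2, 2, 1} of Example 4 this yields {2, 1, 0, 0}. The original sum is 7 and the kept sum is 3, so the difference is the 4 deletions counted above.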
Code#
#include <algorithm>
#include <functional>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
// take s by const reference so that string literals can be passed directly
int minDeletions(const string& s) {
    // map 'a'->0, 'b'->1, ..,'z'->25
    vector<int> freq(26, 0);
    for (char c : s) {
        // count the frequency of character c
        freq[c - 'a']++;
    }
    // sort freq in descending order
    sort(freq.begin(), freq.end(), greater<int>());
    int deletion = 0;
    int currentFreq = freq.at(0); // start with the max frequency
    // stop at the first zero: letters that never appear need no deletions
    for (size_t i = 1; i < freq.size() && freq.at(i) > 0; i++) {
        if (currentFreq == 0) {
            // no positive frequency is left: delete all occurrences of this letter
            deletion += freq.at(i);
        } else if (freq.at(i) >= currentFreq) {
            // delete just enough to make freq[i] equal to currentFreq - 1
            deletion += freq.at(i) - currentFreq + 1;
            currentFreq--;
        } else {
            // freq[i] < currentFreq is already unique: keep it as is
            currentFreq = freq.at(i);
        }
    }
    return deletion;
}
int main() {
    cout << minDeletions("aab") << endl;
    cout << minDeletions("aaabbbcc") << endl;
    cout << minDeletions("ceabaacb") << endl;
}
Output:
0
2
2
Complexity#
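Runtime: O(N), where N = s.length. Counting the characters takes O(N); sorting and scanning the 26 counters takes constant time.
Extra space: O(1), since freq always has 26 entries.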
Conclusion#
The minimum number of deletions required to make character frequencies unique can be found efficiently by counting the frequency of each character and sorting the counts in descending order.
Processing the sorted frequencies one by one, the solution lowers each frequency only as far as needed to fall below the previous one, which keeps every remaining frequency unique while minimizing the number of deletions.
Exercise#
Minimum Deletions to Make Array Beautiful[2].