Commit 4b1720c

fix(strings): use frequency-based signature for anagrams
Replaced the sorting-based signature implementation with a frequency-based approach using `collections.Counter`. The signature now records each distinct character together with its count, in sorted character order, with the aim of preventing collisions and grouping true anagrams together.

Examples:
- "test" → "e1s1t2"
- "finaltest" → "a1e1f1i1l1n1s1t2"
- "this is a test" → " 3a1e1h1i2s3t3"

Also updated the anagram lookup to use the new frequency-based signatures, with the aim of making results more accurate and avoiding false positives.
1 parent 196e658 commit 4b1720c

1 file changed

Lines changed: 9 additions & 6 deletions

strings/anagrams.py

@@ -8,21 +8,22 @@

 def signature(word: str) -> str:
     """
-    Return a word sorted by its letters.
+    Return a frequency-based signature for a word.

     >>> signature("test")
-    'estt'
+    'e1s1t2'
     >>> signature("this is a test")
-    '   aehiisssttt'
+    ' 3a1e1h1i2s3t3'
     >>> signature("finaltest")
-    'aefilnstt'
+    'a1e1f1i1l1n1s1t2'
     """
-    return "".join(sorted(word))
+    freq = collections.Counter(word)
+    return "".join(f"{ch}{freq[ch]}" for ch in sorted(freq))


 def anagram(my_word: str) -> List[str]:
     """
-    Return every anagram of the given word.
+    Return every anagram of the given word from the dictionary.

     >>> anagram('test')
     ['sett', 'stet', 'test']
@@ -34,9 +35,11 @@ def anagram(my_word: str) -> List[str]:
     return word_by_signature.get(signature(my_word), [])


+# Load word list
 data: str = Path(__file__).parent.joinpath("words.txt").read_text(encoding="utf-8")
 word_list = sorted({word.strip().lower() for word in data.splitlines()})

+# Map signatures to word list
 word_by_signature = collections.defaultdict(list)
 for word in word_list:
     word_by_signature[signature(word)].append(word)
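
To make the change easier to check in isolation, here is a minimal sketch that places the old and new signature bodies from this diff side by side and builds the lookup table from each. The names `sorted_signature`, `frequency_signature`, and `group_by`, along with the small `words` list, are hypothetical stand-ins for illustration; the real module reads its words from `words.txt`. On this letter-only sample the two signatures happen to produce the same groups, and only the keys differ.

```python
import collections
from typing import Callable, Dict, List


def sorted_signature(word: str) -> str:
    """Signature before this commit: the word's characters in sorted order."""
    return "".join(sorted(word))


def frequency_signature(word: str) -> str:
    """Signature after this commit: each distinct character, then its count."""
    freq = collections.Counter(word)
    return "".join(f"{ch}{freq[ch]}" for ch in sorted(freq))


def group_by(signature_fn: Callable[[str], str], words: List[str]) -> Dict[str, List[str]]:
    """Build a signature -> words table, mirroring the loop in the module."""
    table: Dict[str, List[str]] = collections.defaultdict(list)
    for word in sorted(set(words)):
        table[signature_fn(word)].append(word)
    return dict(table)


# Hypothetical sample standing in for words.txt.
words = ["test", "sett", "stet", "tests", "final", "set"]

by_frequency = group_by(frequency_signature, words)
by_sorted = group_by(sorted_signature, words)

print(frequency_signature("test"))                # e1s1t2
print(by_frequency[frequency_signature("test")])  # ['sett', 'stet', 'test']
# Compare the groups produced by each signature on this sample.
print(sorted(by_frequency.values()) == sorted(by_sorted.values()))  # True
```

Running the same comparison against the full `words.txt` would be one way to confirm whether the new signature changes any lookup results for the project's actual dictionary.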
