
goedel777 t1_iwjxp76 wrote

unique(incorrectly_spelled_words)

2

Devinco001 OP t1_iwjxuul wrote

Yes, I have done that. Even after dropping the duplicates, the count still comes to 10M.

1

goedel777 t1_iwjxz5j wrote

Without seeing the code, it will be impossible to help here.

3

Devinco001 OP t1_iwmebpe wrote

Sure, but it's just a for loop that goes through the words in the dictionary and uses the Python library 'python-levenshtein' to calculate the distance between each dictionary word and the misspelled word.
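
A rough sketch of that kind of loop, for illustration only. The names `levenshtein` and `closest_word` are hypothetical, and the distance is written out in plain Python here so the snippet runs on its own; the actual code calls the 'python-levenshtein' library instead. The point is that every misspelled word is compared against every dictionary word, which is why it gets so slow at 10M words.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_word(misspelled, dictionary):
    """Brute force: scan the whole dictionary and keep the nearest word."""
    best_word = dictionary[0]
    best_distance = levenshtein(misspelled, best_word)
    for word in dictionary[1:]:
        distance = levenshtein(misspelled, word)
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word
```

Calling `closest_word` once per misspelled word costs one full pass over the dictionary each time, so the total work grows with (number of misspelled words) x (dictionary size).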

For now, I am skipping Levenshtein in favor of a faster approximate distance, using the SymSpell algorithm. It is highly accurate and much faster. It reduced the computation time from 21 days to 13 hours.
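
To show why this kind of lookup is faster, here is a simplified sketch of the delete-based indexing idea that SymSpell is built around. It is not the SymSpell library itself, and `deletes`, `build_index`, and `candidates` are hypothetical names. Every dictionary word is indexed once under all the strings you get by deleting up to `max_distance` characters; a misspelled word is then expanded the same way and looked up directly, with no scan over the dictionary.

```python
from collections import defaultdict
from itertools import combinations


def deletes(word, max_distance):
    """All strings formed by deleting up to max_distance characters from word."""
    variants = {word}
    for k in range(1, min(max_distance, len(word)) + 1):
        for positions in combinations(range(len(word)), k):
            skip = set(positions)
            variants.add("".join(c for i, c in enumerate(word) if i not in skip))
    return variants


def build_index(dictionary, max_distance=2):
    """Map every delete variant to the dictionary words that produce it."""
    index = defaultdict(set)
    for word in dictionary:
        for variant in deletes(word, max_distance):
            index[variant].add(word)
    return index


def candidates(misspelled, index, max_distance=2):
    """Dictionary words sharing at least one delete variant with the input."""
    found = set()
    for variant in deletes(misspelled, max_distance):
        found |= index.get(variant, set())
    return found


index = build_index(["hello", "world", "python"])
print(candidates("helo", index))
```

The candidate set is only a shortlist: two words that share a delete variant can still be up to `2 * max_distance` edits apart, so a real implementation would usually check the few remaining candidates with an exact distance before picking one.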

0