Viewing a single comment thread. View all comments

sameersoi t1_iwjye93 wrote

First thing that popped into my mind was using a tree structure. A quick Google search (“suffix tree spellcheck”) led to this helpful article below that proposes the use of BK-trees (a tree that essential sorts child noes by Levenshtein edit distance). Crucially when doing a look up “Time Complexity will be O(L1L2log n), here L1 is the average length of word in our dictionary and L2 is the length of misspelled.”

https://www.geeksforgeeks.org/bk-tree-introduction-implementation/amp/

2

Devinco001 OP t1_iwmvton wrote

Yeah, I googled BK tree. The data structure is amazing and reduces a lot of computational time. While searching that, I found another algorithm, symspell. This is even faster with high accuracy but doesn't use levenshtein, so currently going to use that.

But BK trees are the go to method if using pure levenshtein and similar, more accurate, string distances. So will be comparing the accuracy of both algos to choose the better one. Thanks

2