Word Completion and Suggestion System

Project 3: Word Completion and Suggestion System

We’re developing a system that suggests and completes words based on user input, similar to how search engines do it. This makes typing faster and more intuitive.

Ternary Search Tree (TST): To Store a Dictionary of Words and Enable Fast Prefix Searches

What it is: A Ternary Search Tree is a type of search tree used to store a dynamic set or associative array where the keys are usually strings.

Why TST: TSTs allow efficient storage and querying of words, making them ideal for applications like autocomplete and spell-checking.

Ternary Search Tree
Fig 1. A Ternary Search Tree structure. Source: geeksforgeeks

Pros and Cons:

  • Pros: Fast lookups and insertions; space-efficient for storing prefixes; combines advantages of tries and binary search trees.
  • Cons: Can be more complex to implement; requires balancing to maintain efficiency.

GitHub Link: Example TST Implementation

References: IEEE Transactions on Knowledge and Data Engineering

Binary Search Trees (BST): To Maintain and Query Frequently Used Words

What it is: A Binary Search Tree is a data structure in which each node has at most two children, referred to as the left child and the right child.

Why BST: BSTs help in efficiently maintaining and querying frequently used words, allowing quick lookups and updates.

Binary Search Tree
Fig 2. A Binary Search Tree. Source: geeksforgeeks

Pros and Cons:

  • Pros: Provides ordered data; supports efficient insertion, deletion, and lookup operations.
  • Cons: Performance can degrade to \( O(n) \) in the worst case; requires balancing for optimal performance.

GitHub Link: Example BST Implementation

References: IEEE Transactions on Software Engineering

Auto-Suggestions Algorithm: To Implement the Autocomplete Functionality

What it is: An auto-suggestions algorithm generates possible completions for a given input based on a predefined dictionary or dataset.

Why Autocomplete: Autocomplete algorithms enhance user experience by predicting and suggesting possible completions, saving time and effort.

Levenshtein Automata:Levenshtein automata are used to quickly correct strings and perform fuzzy searches of words in a dictionary. The Levenshtein distance is a text similarity measure that compares two words and returns a numeric value representing the distance between them. The distance reflects the total number of single-character edits required to transform one word into another

Reference:Levenshtein