SymSpellCppPy

SymSpellCppPy: Pybind11 binding for SymSpellPy

class SymSpellCppPy.Info

Bases: pybind11_object

property corrected_string

Read-only property to get the word segmented and spelling corrected string.

property distance_sum

Read-only property to get the edit distance sum between input string and corrected string.

get_corrected(self: SymSpellCppPy.Info) -> str

Get the word segmented and spelling corrected string.

get_distance(self: SymSpellCppPy.Info) -> int

Get the edit distance sum between input string and corrected string.

get_probability(self: SymSpellCppPy.Info) -> float

Get the sum of word occurrence probabilities in log scale. This is a measure of how common and probable the corrected segmentation is.

get_segmented(self: SymSpellCppPy.Info) -> str

Get the word segmented string.

property log_prob_sum

Read-only property to get the sum of word occurrence probabilities in log scale. This is a measure of how common and probable the corrected segmentation is.

property segmented_string

Read-only property to get the word segmented string.

set(self: SymSpellCppPy.Info, segmented_string: str, corrected_string: str, distance_sum: int, log_prob_sum: float) -> None

Set the properties of the Info object.

Parameters:
  • segmented_string – Word segmented string.

  • corrected_string – Word segmented and spelling corrected string.

  • distance_sum – Edit distance sum between input string and corrected string.

  • log_prob_sum – Sum of word occurrence probabilities in log scale (a measure of how common and probable the corrected segmentation is).
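
To make log_prob_sum concrete, the sketch below computes a sum of per-word log probabilities in pure Python. It does not call SymSpellCppPy. The helper name log_prob_sum, the base-10 logarithm, the word counts, and the corpus size are all assumptions chosen for illustration, and the library's exact formula may differ.

```python
import math

# Hypothetical word counts and corpus size, for illustration only.
WORD_COUNTS = {"the": 500, "quick": 40, "brown": 60}
CORPUS_SIZE = 1000


def log_prob_sum(words, counts=WORD_COUNTS, total=CORPUS_SIZE):
    """Sum of log10(count / total) over the words of a segmentation."""
    return sum(math.log10(counts[w] / total) for w in words)


# Score one candidate segmentation (assumed base-10 logs).
score = log_prob_sum(["the", "quick", "brown"])
print(round(score, 3))
```

Under these assumptions, a value closer to zero means the segmentation is built from more frequent words.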

class SymSpellCppPy.SuggestItem

Bases: pybind11_object

SuggestItem is a class that contains a suggested correct spelling for a misspelled word.

property count

Gets or sets the frequency of the suggestion in the dictionary (a measure of how common the word is).

property distance

Gets or sets the edit distance between the searched-for word and the suggestion.

property term

Gets or sets the suggested correctly spelled word.

class SymSpellCppPy.SymSpell

Bases: pybind11_object

SymSpell is a class that provides fast and accurate spelling correction using the Symmetric Delete spelling correction algorithm.
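
The Symmetric Delete idea can be illustrated without the library: only deletions are generated, for both the dictionary terms and the input, and two strings become candidate matches when their delete sets overlap. The sketch below is a minimal pure-Python illustration with a hypothetical helper, deletes_within. It is not the library's implementation.

```python
def deletes_within(word, max_distance):
    """Return the word plus every string reachable by deleting up to max_distance characters."""
    seen = {word}
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:
            for i in range(len(w)):
                # Remove the character at position i.
                candidate = w[:i] + w[i + 1:]
                if candidate not in seen:
                    seen.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return seen


# A dictionary term and a misspelling are candidate matches when their delete sets overlap.
shared = deletes_within("hello", 1) & deletes_within("helo", 1)
print(sorted(shared))
```

Candidates found this way would still need a real edit-distance check before being reported as suggestions.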

count_threshold(self: SymSpellCppPy.SymSpell) -> int

Retrieves the minimum frequency count a term must have to be considered a valid word for spelling correction.

create_dictionary(self: SymSpellCppPy.SymSpell, corpus: str) -> bool

Load multiple dictionary words from a file containing plain text.

create_dictionary_entry(self: SymSpellCppPy.SymSpell, key: str, count: int) -> bool

Create or update an entry in the dictionary.

delete_dictionary_entry(self: SymSpellCppPy.SymSpell, key: str) -> bool

Deletes a word from the dictionary and updates internal representation accordingly.

entry_count(self: SymSpellCppPy.SymSpell) -> int

Retrieves the total number of delete entries (deletion variants of words) generated for the dictionary.

load_bigram_dictionary(self: SymSpellCppPy.SymSpell, corpus: str, term_index: int, count_index: int, separator: str = ' ') -> bool

Load multiple bigram dictionary entries from a file of bigram/frequency count pairs.

load_dictionary(self: SymSpellCppPy.SymSpell, corpus: str, term_index: int, count_index: int, separator: str = ' ') -> bool

Load multiple dictionary entries from a file of word/frequency count pairs.
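
The roles of term_index, count_index, and separator can be sketched in pure Python: each line is split on the separator, and the term and count are read from the given column positions. The helper parse_entry is hypothetical and only illustrates the assumed file layout. It does not reproduce the library's parser or its handling of malformed lines.

```python
def parse_entry(line, term_index, count_index, separator=" "):
    """Split one dictionary line and pick the term and count columns."""
    fields = line.rstrip("\n").split(separator)
    if len(fields) <= max(term_index, count_index):
        return None  # the line has too few columns
    try:
        return fields[term_index], int(fields[count_index])
    except ValueError:
        return None  # the count column is not an integer


# A "term count" line with the default space separator.
entry = parse_entry("the 23135851162", term_index=0, count_index=1)
print(entry)
```

With a file laid out this way, the matching call would pass term_index=0 and count_index=1 to load_dictionary.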

load_pickle(self: SymSpellCppPy.SymSpell, filepath: str) -> None

Load internal representation from file.

load_pickle_bytes(self: SymSpellCppPy.SymSpell, bytes: buffer) -> None

Load internal representation from a buffer object, such as ‘bytes’ or ‘memoryview’.

lookup(*args, **kwargs)

Overloaded function.

  1. lookup(self: SymSpellCppPy.SymSpell, input: str, verbosity: SymSpellCppPy.Verbosity) -> List[SymSpellCppPy.SuggestItem]

    Find suggested spellings for a given input word, using the maximum edit distance specified during construction of the SymSpell dictionary.

  2. lookup(self: SymSpellCppPy.SymSpell, input: str, verbosity: SymSpellCppPy.Verbosity, max_edit_distance: int) -> List[SymSpellCppPy.SuggestItem]

    Find suggested spellings for a given input word, using the maximum edit distance provided to the function.

  3. lookup(self: SymSpellCppPy.SymSpell, input: str, verbosity: SymSpellCppPy.Verbosity, max_edit_distance: int, include_unknown: bool) -> List[SymSpellCppPy.SuggestItem]

    Find suggested spellings for a given input word, using the maximum edit distance provided to the function, and include the input word in the suggestions if no words within the edit distance are found.

  4. lookup(self: SymSpellCppPy.SymSpell, input: str, verbosity: SymSpellCppPy.Verbosity, max_edit_distance: int = 2, include_unknown: bool = False, transfer_casing: bool = False) -> List[SymSpellCppPy.SuggestItem]

    Find suggested spellings for a given input word, using the maximum edit distance provided to the function, include the input word in the suggestions if no words within the edit distance are found, and optionally transfer the casing of the input to the suggestions.
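
A small wrapper shows how the lookup overloads above might be used. The helper best_correction is hypothetical. It relies only on the documented lookup(input, verbosity, max_edit_distance) overload and on the term property of SuggestItem. The commented usage assumes the package is installed and that SymSpell() can be constructed with default arguments, which this section does not document.

```python
def best_correction(sym_spell, word, verbosity, max_edit_distance=2):
    """Return the term of the first suggestion, or the input word if there is none."""
    suggestions = sym_spell.lookup(word, verbosity, max_edit_distance)
    return suggestions[0].term if suggestions else word


# Usage sketch (assumes the package is installed and that the default
# constructor is available; neither is confirmed by this section):
#
#   import SymSpellCppPy
#   sym_spell = SymSpellCppPy.SymSpell()
#   sym_spell.create_dictionary_entry("hello", 10)
#   print(best_correction(sym_spell, "helo", SymSpellCppPy.Verbosity.TOP))
```

Falling back to the input word keeps the helper usable when nothing lies within the edit distance. Passing include_unknown=True to lookup is the documented alternative.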

lookup_compound(*args, **kwargs)

Overloaded function.

  1. lookup_compound(self: SymSpellCppPy.SymSpell, input: str) -> List[SymSpellCppPy.SuggestItem]

    LookupCompound supports compound-aware automatic spelling correction of multi-word input strings with three cases:
    1. Mistakenly inserted space into a correct word led to two incorrect terms.

    2. Mistakenly omitted space between two correct words led to one incorrect combined term.

    3. Multiple independent input terms with/without spelling errors.

  2. lookup_compound(self: SymSpellCppPy.SymSpell, input: str, max_edit_distance: int) -> List[SymSpellCppPy.SuggestItem]

    LookupCompound supports compound-aware automatic spelling correction of multi-word input strings with three cases:
    1. Mistakenly inserted space into a correct word led to two incorrect terms.

    2. Mistakenly omitted space between two correct words led to one incorrect combined term.

    3. Multiple independent input terms with/without spelling errors.

  3. lookup_compound(self: SymSpellCppPy.SymSpell, input: str, max_edit_distance: int, transfer_casing: bool) -> List[SymSpellCppPy.SuggestItem]

    LookupCompound supports compound-aware automatic spelling correction of multi-word input strings with three cases:
    1. Mistakenly inserted space into a correct word led to two incorrect terms.

    2. Mistakenly omitted space between two correct words led to one incorrect combined term.

    3. Multiple independent input terms with/without spelling errors.
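
The three cases above can be exercised with a thin wrapper. The helper correct_text and the sample inputs are hypothetical. Only the documented lookup_compound(input, max_edit_distance) overload and the term property of SuggestItem are used, and no particular output is claimed because results depend on the loaded dictionary.

```python
def correct_text(sym_spell, text, max_edit_distance=2):
    """Return the term of the first compound suggestion, or the input unchanged."""
    suggestions = sym_spell.lookup_compound(text, max_edit_distance)
    return suggestions[0].term if suggestions else text


# Hypothetical inputs, one per case (outputs depend on the dictionary in use):
EXAMPLES = [
    "spel ling",      # case 1: space inserted into a correct word
    "spellingerror",  # case 2: space omitted between two correct words
    "speling eror",   # case 3: independent terms with spelling errors
]

# Usage sketch, given a SymSpell instance with a dictionary already loaded:
#
#   for text in EXAMPLES:
#       print(text, "->", correct_text(sym_spell, text))
```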

max_length(self: SymSpellCppPy.SymSpell) -> int

Retrieves the maximum length of words in the dictionary.

purge_below_threshold_words(self: SymSpellCppPy.SymSpell) -> None

Remove all below threshold words from the dictionary.

save_pickle(self: SymSpellCppPy.SymSpell, filepath: str) -> None

Save internal representation to file.

save_pickle_bytes(self: SymSpellCppPy.SymSpell) -> bytes

Save internal representation to bytes.

word_count(self: SymSpellCppPy.SymSpell) -> int

Retrieves the total number of words in the dictionary.

word_segmentation(*args, **kwargs)

Overloaded function.

  1. word_segmentation(self: SymSpellCppPy.SymSpell, input: str) -> SymSpellCppPy.Info

    WordSegmentation divides a string into words by inserting missing spaces at the appropriate positions. Misspelled words are corrected and do not affect segmentation. Existing spaces are allowed and considered for optimum segmentation.

  2. word_segmentation(self: SymSpellCppPy.SymSpell, input: str, max_edit_distance: int) -> SymSpellCppPy.Info

    WordSegmentation divides a string into words by inserting missing spaces at the appropriate positions. Misspelled words are corrected and do not affect segmentation. Existing spaces are allowed and considered for optimum segmentation.

  3. word_segmentation(self: SymSpellCppPy.SymSpell, input: str, max_edit_distance: int, max_segmentation_word_length: int) -> SymSpellCppPy.Info

    WordSegmentation divides a string into words by inserting missing spaces at the appropriate positions. Misspelled words are corrected and do not affect segmentation. Existing spaces are allowed and considered for optimum segmentation.
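
The Info object returned by word_segmentation can be unpacked through its documented read-only properties. The helper segment below is a hypothetical convenience wrapper. It assumes only the single-argument overload and the four Info properties listed in this reference.

```python
def segment(sym_spell, text):
    """Run word_segmentation and return the Info fields as a plain dict."""
    info = sym_spell.word_segmentation(text)
    return {
        "segmented": info.segmented_string,
        "corrected": info.corrected_string,
        "distance_sum": info.distance_sum,
        "log_prob_sum": info.log_prob_sum,
    }


# Usage sketch, given a SymSpell instance with a dictionary already loaded:
#
#   result = segment(sym_spell, "thequickbrownfox")
#   print(result["corrected"], result["distance_sum"])
```

Returning a plain dict makes the result easy to log or serialize without holding on to the bound C++ object.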

class SymSpellCppPy.Verbosity

Bases: pybind11_object

Members:

TOP

The suggestion with the highest term frequency among the suggestions of the smallest edit distance found.

CLOSEST

All suggestions of the smallest edit distance found, ordered by term frequency.

ALL

All suggestions within the maximum edit distance, ordered by edit distance, then by term frequency (highest first).

ALL = <Verbosity.ALL: 2>
CLOSEST = <Verbosity.CLOSEST: 1>
TOP = <Verbosity.TOP: 0>
property name
property value
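
The difference between the three Verbosity members can be sketched in pure Python. The function select and the candidate tuples are hypothetical, and plain strings stand in for the enum members. The sketch mirrors the descriptions of TOP, CLOSEST, and ALL given here and is not the library's implementation.

```python
def select(candidates, verbosity, max_edit_distance):
    """Filter (term, distance, count) tuples according to "TOP", "CLOSEST" or "ALL"."""
    within = [c for c in candidates if c[1] <= max_edit_distance]
    # Order by edit distance, then by count (highest first).
    within.sort(key=lambda c: (c[1], -c[2]))
    if not within or verbosity == "ALL":
        return within
    # Keep only the candidates that share the smallest distance found.
    closest = [c for c in within if c[1] == within[0][1]]
    return closest[:1] if verbosity == "TOP" else closest


# Hypothetical candidates: (term, edit distance, frequency count).
CANDIDATES = [("hello", 1, 50), ("help", 1, 80), ("hold", 2, 90), ("helicopter", 7, 5)]

top = select(CANDIDATES, "TOP", 2)
closest = select(CANDIDATES, "CLOSEST", 2)
everything = select(CANDIDATES, "ALL", 2)
print(top, closest, everything, sep="\n")
```

In this sketch TOP yields a single item, CLOSEST yields every item tied at the smallest distance, and ALL yields everything within the limit.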