Strategies for generating suggestions for incorrectly spelled words.
Data and other attributes defined here:
- OCR = 1
- TYPO = 0
Represents a token in tokenized natural language text.
Methods defined here:
- __init__(self, tokenText, tokenType)
Data and other attributes defined here:
- NONE = 0
- PUNCTUATION = 2
- UNKNOWN = 4
- WHITESPACE = 3
- WORD = 1
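As a rough illustration of how the token type constants might be used, the sketch below filters a token list down to its words. It defines a small stand-in Token class so that it runs without libvoikko; the constant values are copied from the listing above, while the attribute names tokenText and tokenType and the helper words_only are assumptions made for this example.

```python
# Stand-in for the documented Token class, so the example runs without libvoikko.
class Token:
    NONE = 0
    WORD = 1
    PUNCTUATION = 2
    WHITESPACE = 3
    UNKNOWN = 4

    def __init__(self, tokenText, tokenType):
        # Assumption: the constructor stores its arguments under the same names.
        self.tokenText = tokenText
        self.tokenType = tokenType


def words_only(tokens):
    """Return the text of every token whose type is Token.WORD (hypothetical helper)."""
    return [t.tokenText for t in tokens if t.tokenType == Token.WORD]


# Hand-built token list; a real list would come from a tokenizer.
sample = [
    Token("Kissa", Token.WORD),
    Token(" ", Token.WHITESPACE),
    Token("nukkuu", Token.WORD),
    Token(".", Token.PUNCTUATION),
]
print(words_only(sample))
```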
Represents an instance of Voikko. The instance has state, such as
settings related to spell checking and hyphenation, and methods for performing
various natural language analysis operations. One instance should not be
used simultaneously from multiple threads.
Currently no more than one instance can be in initialized state. This is because
libvoikko is not yet thread safe. This restriction should go away in future
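One possible way to share a single instance between threads without using it simultaneously is to funnel every call through a lock. The sketch below is only an illustration under that assumption: SerializedCalls and FakeChecker are hypothetical names, and the stub stands in for a real instance so the code runs without libvoikko. Whether serialized access is sufficient for a given libvoikko version is not something this listing states.

```python
import threading


class SerializedCalls:
    """Forward method calls to `target`, allowing only one call at a time."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.Lock()

    def __getattr__(self, name):
        # Only reached for names not found on the wrapper itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            # Hold the lock for the full duration of the wrapped call.
            with self._lock:
                return attr(*args, **kwargs)

        return locked


class FakeChecker:
    """Hypothetical stub exposing a spell() method like the one documented here."""

    def spell(self, word):
        return word == "kissa"


shared = SerializedCalls(FakeChecker())
results = []


def worker(word):
    results.append(shared.spell(word))


threads = [threading.Thread(target=worker, args=(w,)) for w in ["kissa", "kisssa"]]
for t in threads:
    t.start()
for t in threads:
    t.join()
```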
Methods defined here:
- Creates a new Voikko instance.
- analyze(self, word)
- Analyze the morphology of given word and return the list of
analysis results. The results are represented as maps having property
names as keys and property values as values.
- getHyphenationPattern(self, word)
- Return a character pattern that describes the hyphenation of given word.
' ' = no hyphenation at this character,
'-' = hyphenation point (character at this position
is preserved in the hyphenated form),
'=' = hyphenation point (character at this position
is replaced by the hyphen.)
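The pattern characters described above can be applied to a word with a short loop. The following is a minimal sketch, not the library's own code: apply_hyphenation_pattern is a hypothetical helper, the patterns in the example are written by hand rather than produced by Voikko, and it assumes that at a '-' position the hyphen is inserted before the preserved character.

```python
def apply_hyphenation_pattern(word, pattern):
    """Build the hyphenated form of `word` from a pattern of ' ', '-' and '='."""
    if len(word) != len(pattern):
        raise ValueError("word and pattern must have the same length")
    pieces = []
    for char, mark in zip(word, pattern):
        if mark == " ":
            pieces.append(char)        # no hyphenation at this character
        elif mark == "-":
            pieces.append("-" + char)  # hyphen inserted, character preserved
        elif mark == "=":
            pieces.append("-")         # character replaced by the hyphen
        else:
            raise ValueError("unexpected pattern character: %r" % mark)
    return "".join(pieces)


# Hand-written patterns, used here only to exercise each marker.
print(apply_hyphenation_pattern("kissa", "   - "))
print(apply_hyphenation_pattern("rei'ittää", "   =  -  "))
```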
- grammarErrorExplanation(self, errorCode, language)
- Return a human readable explanation for grammar error code in
- grammarErrors(self, paragraph)
- Check the given paragraph for grammar errors and return a
list of GrammarError objects representing the errors that were found.
- hyphenate(self, word)
- Return the given word in fully hyphenated form.
- init(self, path=None, variant='fi_FI', cacheSize=0)
- Initialize the Voikko instance with the following optional parameters:
path: Extra path that will be checked first when looking for linguistic
variant: Variant of morphological dictionary to use.
cacheSize: Parameter that controls the size of the in-memory cache for
spell checking results. 0 is the default size, 1 is twice as large
as 0, etc. -1 disables the spell checking cache entirely.
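The cacheSize scale can be read as a doubling per step. The sketch below expresses that reading as a formula; relative_cache_size is a hypothetical helper, and extending "1 is twice as large as 0" to a factor of 2 for every further step is an assumption about what "etc." means here.

```python
def relative_cache_size(cacheSize):
    """Return the cache size relative to the default (cacheSize=0).

    Assumption: each step above 0 doubles the size. -1 means the cache is
    disabled, which is reported here as a relative size of 0.
    """
    if cacheSize == -1:
        return 0
    if cacheSize < -1:
        raise ValueError("cacheSize must be -1 or greater")
    return 2 ** cacheSize


for size in (-1, 0, 1, 3):
    print(size, relative_cache_size(size))
```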
- listDicts(self, path=None)
- Return a list of Dictionary objects representing the available
dictionary variants. If path is specified, it will be searched first
before looking from the standard locations. This method can be called
even if the Voikko instance has not yet been initialized.
- setAcceptAllUppercase(self, value)
- Accept words even when all of the letters are in uppercase. Note that this is
not the same as setIgnoreUppercase: with this option the word is still
checked, only case differences are ignored.
- setAcceptBulletedListsInGc(self, value)
- (Grammar checking only): Accept paragraphs if they would be valid within
- setAcceptExtraHyphens(self, value)
- (Spell checking only): Allow some extra hyphens in words. This option relaxes
hyphen checking rules to work around some unresolved issues in the underlying
morphology, but it may cause some incorrect words to be accepted. The exact
behaviour (if any) of this option is not specified.
- setAcceptFirstUppercase(self, value)
- Accept words even when the first letter is in uppercase (start of sentence etc.)
- setAcceptMissingHyphens(self, value)
- (Spell checking only): Accept missing hyphens at the start and end of the word.
Some application programs do not consider hyphens to be word characters. This
is a reasonable assumption for many languages but not for Finnish. If the
application cannot be fixed to use a proper tokenisation algorithm for Finnish,
this option may be used to tell libvoikko to work around this defect.
- setAcceptTitlesInGc(self, value)
- (Grammar checking only): Accept incomplete sentences that could occur in
titles or headings. Set this option to true if your application is not able
to differentiate titles from normal text paragraphs, or if you know that
you are checking title text.
- setAcceptUnfinishedParagraphsInGc(self, value)
- (Grammar checking only): Accept incomplete sentences at the end of the
paragraph. These may exist when text is still being written.
- setHyphenateUnknownWords(self, value)
- (Hyphenation only): Hyphenate unknown words.
- setIgnoreDot(self, value)
- Ignore a dot at the end of the word (needed for use in some word processors).
If this option is set and the input word ends with a dot, spell checking and
hyphenation functions try to analyse the word without the dot if no results
can be obtained for the original form. Also, with this option the string tokenizer
will consider the trailing dot of a word to be a part of that word.
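The fallback described for this option (try the original form first, then the form without the trailing dot) can be sketched as follows. This is a simplified model, not libvoikko's implementation: analyse_with_dot_fallback, toy_analyse and the toy result maps are all hypothetical.

```python
def analyse_with_dot_fallback(analyse, word):
    """Analyse `word`; if that yields nothing and it ends with a dot, retry without the dot."""
    results = analyse(word)
    if not results and word.endswith("."):
        results = analyse(word[:-1])
    return results


# Toy analyser backed by a hand-made table (hypothetical data).
TOY_RESULTS = {
    "kissa": [{"note": "plain word"}],
    "esim.": [{"note": "abbreviation including its dot"}],
}


def toy_analyse(word):
    return TOY_RESULTS.get(word, [])


print(analyse_with_dot_fallback(toy_analyse, "kissa."))  # falls back to "kissa"
print(analyse_with_dot_fallback(toy_analyse, "esim."))   # original form already matches
```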
- setIgnoreNonwords(self, value)
- (Spell checking only): Ignore non-words such as URLs and email addresses.
- setIgnoreNumbers(self, value)
- Ignore words containing numbers.
- setIgnoreUppercase(self, value)
- Accept words that are written completely in uppercase letters without checking
them at all.
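The difference between setAcceptAllUppercase and setIgnoreUppercase can be illustrated with a toy model. Everything in this sketch is hypothetical (the word list, the toy_spell function and its keyword arguments); it only mirrors the documented distinction that one option still checks the word while the other skips the check.

```python
TOY_LEXICON = {"kissa", "koira"}  # hypothetical word list, lowercase only


def toy_spell(word, accept_all_uppercase=False, ignore_uppercase=False):
    """Toy spell check modelling the two uppercase-related options."""
    if word.isupper():
        if ignore_uppercase:
            # Accepted without being checked at all.
            return True
        if accept_all_uppercase:
            # Still checked; only the case difference is ignored.
            return word.lower() in TOY_LEXICON
    return word in TOY_LEXICON


print(toy_spell("KISSA", accept_all_uppercase=True))   # known word, case ignored
print(toy_spell("KISSSA", accept_all_uppercase=True))  # still a misspelling
print(toy_spell("KISSSA", ignore_uppercase=True))      # not checked at all
```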
- setMinHyphenatedWordLength(self, value)
- The minimum length for words that may be hyphenated. This limit is also enforced on
individual parts of compound words.
- setNoUglyHyphenation(self, value)
- Do not insert hyphenation positions that are considered to be ugly but correct
- setSuggestionStrategy(self, value)
- Set the suggestion strategy to be used when generating spelling suggestions.
- spell(self, word)
- Check the spelling of given word. Return true if the word is correct,
false if it is incorrect.
- suggest(self, word)
- Generate a list of suggested spellings for given (misspelled) word.
If the given word is correct, the list contains only the word itself.
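A typical way to combine spell and suggest is to ask for suggestions only when a word fails the check. The sketch below shows that pattern against any object exposing the two methods; misspellings is a hypothetical helper, and FakeChecker is a stub with made-up data so the example runs without libvoikko.

```python
def misspellings(checker, words):
    """Map each word that fails checker.spell() to the list from checker.suggest()."""
    report = {}
    for word in words:
        if not checker.spell(word):
            report[word] = checker.suggest(word)
    return report


class FakeChecker:
    """Hypothetical stub following the documented spell/suggest contract."""

    KNOWN = {"kissa", "koira"}
    FIXES = {"kisssa": ["kissa"]}

    def spell(self, word):
        return word in self.KNOWN

    def suggest(self, word):
        # For a correct word the list contains only the word itself.
        if word in self.KNOWN:
            return [word]
        return self.FIXES.get(word, [])


report = misspellings(FakeChecker(), ["kissa", "kisssa", "koira"])
print(report)
```

An initialized Voikko instance could presumably be passed in place of the stub, since the helper relies only on the spell and suggest methods listed above.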
- Uninitialize this Voikko instance. The instance must be initialized again
before it can be used.
- tokens(self, text)
- Split the given natural language text into a list of Token objects.