Run pip install pythainlp for a minimal installation, with just enough dependencies to run the core functions of PyThaiNLP. Run pip install pythainlp[full] to install every package required for extended functions (such as the machine-learned named-entity recognizer, which relies on Keras).
Some class and function names have changed since 1.7 to align them with PEP 8 (Style Guide for Python Code), to make them more explicit about what they do, or to make them more consistent with related classes and functions. For example:
thainer and thai2rom classes are now ThaiNameTagger and ThaiTransliterator (CapWords for class name)
pythainlp.soundex.LK82, pythainlp.soundex.Udom83, and pythainlp.MetaSound functions are now pythainlp.soundex.lk82, pythainlp.soundex.udom83, and pythainlp.soundex.metasound (lowercase for function names; metasound has also moved to the soundex module)
collation, correction, and romanization functions are now collate, correct, and romanize -- in a verb (action) form, and in line with tokenize and summarize functions.
pythainlp.corpus.alphabets, pythainlp.corpus.tone, etc. constants are now pythainlp.thai_consonants, pythainlp.thai_tonemarks, etc.
They are also now str instead of set.
This follows the example of string.ascii_letters, etc. A str also iterates slightly faster in the use cases these constants usually serve, where each member is a single character.
These changes will break your code if it directly invokes those classes or functions. In general, the change is only at the level of the class or function name; the arguments passed to the class or function should not change. Please refer to the API doc.
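As a rough illustration of why a str constant is convenient for single-character members, the sketch below mirrors how string.ascii_letters is typically used. THAI_TONEMARKS here is a locally defined, hypothetical stand-in (four Thai tone mark code points) so the example runs without PyThaiNLP installed; it is not taken from the library, and the helper names are made up for this example.

```python
import string

# Hypothetical stand-in for a constant such as pythainlp.thai_tonemarks.
# Defined locally (an assumption) so the sketch needs only the standard library.
THAI_TONEMARKS = "\u0e48\u0e49\u0e4a\u0e4b"  # mai ek, mai tho, mai tri, mai chattawa


def count_members(text: str, members: str) -> int:
    """Count the characters of `text` that occur in `members`."""
    return sum(1 for ch in text if ch in members)


def strip_members(text: str, members: str) -> str:
    """Return `text` with every character found in `members` removed."""
    return "".join(ch for ch in text if ch not in members)


if __name__ == "__main__":
    word = "\u0e19\u0e49\u0e33"  # Thai word containing one tone mark
    print(count_members(word, THAI_TONEMARKS))
    print(strip_members(word, THAI_TONEMARKS))
    # The same helpers work unchanged with the standard library constant.
    print(count_members("abc1", string.ascii_letters))
```

Because both constants are plain strings, membership tests (`ch in members`) and iteration behave the same way for either one.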
New evaluation corpus
New features
pythainlp.transliterate.transliterate: grapheme to phoneme (Add pythainlp.g2p.ipa #139)
NorvigSpellChecker class: can be initialized with a custom dictionary (Want to add words to pythainlp.spell #119, Update Peter Norvig's spell checker to suggest words based on probability #137)
pythainlp.util.thai_strftime for date and time formatting (uses standard datetime.strftime directives) (Utility functions: rearrange package locations + add thai_strftime() date and time formatter #160)
pythainlp.util.thaicheck - Thai check (Add check thai word #171)
Bug fixes
Fix metasound soundex to work as described in the Snae & Brückner (2009) paper. (Fix MetaSound + Adjust tokenizer selector + More documentation + clean code #135)
Other improvements and optimizations
Raise ImportError, if there is an import error, instead of calling sys.exit()
tokenize, summarize, etc. will always return something even if the specified engine is not found (will fall back to the default engine) (summarize: Small variable rename and handle engine not found case #131)
Name changes in API
rank, find_keyword, collate, and functions related to date and time are now in the pythainlp.util module. (Utility functions: rearrange package locations + add thai_strftime() date and time formatter #160)
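The two behaviors listed under "Other improvements and optimizations" (falling back to a default engine, and raising ImportError rather than exiting) can be sketched roughly as follows. This is not PyThaiNLP's implementation: the engine names, the toy tokenizers, and the require() helper are all hypothetical and exist only to illustrate the pattern.

```python
import importlib
from types import ModuleType
from typing import Callable, Dict, List


def _whitespace_tokenize(text: str) -> List[str]:
    """Toy engine: split on whitespace."""
    return text.split()


def _char_tokenize(text: str) -> List[str]:
    """Toy engine: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


# Hypothetical engine registry; the names are assumptions for this sketch.
ENGINES: Dict[str, Callable[[str], List[str]]] = {
    "whitespace": _whitespace_tokenize,
    "char": _char_tokenize,
}
DEFAULT_ENGINE = "whitespace"


def tokenize(text: str, engine: str = DEFAULT_ENGINE) -> List[str]:
    """Tokenize with `engine`, falling back to the default if it is unknown."""
    func = ENGINES.get(engine, ENGINES[DEFAULT_ENGINE])
    return func(text)


def require(module_name: str) -> ModuleType:
    """Import an optional dependency, raising ImportError instead of exiting."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise so the caller can decide how to recover.
        raise ImportError(
            f"Optional dependency '{module_name}' is not installed."
        ) from exc


if __name__ == "__main__":
    print(tokenize("ab cd", engine="no-such-engine"))  # falls back to default
    print(tokenize("ab cd", engine="char"))
```

Raising an exception leaves the decision with the caller, who can catch ImportError and continue with reduced functionality, whereas sys.exit() would terminate the whole program.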