Simple Thai Preprocess Functions
This repository provides simple preprocessing techniques for Thai sentences and phrases.
The module supports Python 3.6+.
pip install th-simple-preprocessor
from th_preprocessor.preprocess import preprocess
text = '"::::: อย่างไรก็ตามนูร์ ฮิชัม อับดุลเลาะห์ 21-09-2018 https://www.malaysiakini.com/news/444015"'
words = preprocess(text)
print(words)
# อย่างไรก็ตามนูร์ ฮิชัม อับดุลเลาะห์ WSNUMBER WSNUMBER WSNUMBER WSLINK

The following functions are available:
th_preprocessor.preprocess.normalize_link
th_preprocessor.preprocess.normalize_at_mention
th_preprocessor.preprocess.normalize_email
th_preprocessor.preprocess.normalize_haha
th_preprocessor.preprocess.normalize_num
th_preprocessor.preprocess.normalize_phone
th_preprocessor.preprocess.normalize_accented_chars
th_preprocessor.preprocess.normalize_special_chars
th_preprocessor.preprocess.remove_hashtags
th_preprocessor.preprocess.remove_tag
th_preprocessor.preprocess.remove_dup_spaces
th_preprocessor.preprocess.remove_emoji
th_preprocessor.preprocess.replace_dup_chars
th_preprocessor.preprocess.replace_dup_emojis
th_preprocessor.preprocess.insert_spaces
th_preprocessor.preprocess.normalize_emoji
th_preprocessor.preprocess.remove_others_char
th_preprocessor.preprocess.remove_stopwords
th_preprocessor.preprocess.preprocess
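
To give a rough idea of what placeholder normalization looks like, here is a minimal standalone sketch using only the standard library. It is not this package's implementation: the regular expressions and the function name sketch_normalize are assumptions made for illustration, and only the WSNUMBER and WSLINK tokens are taken from the example output above. Unlike preprocess, this sketch leaves punctuation such as hyphens in place.

```python
import re

# Hypothetical patterns for illustration; the package's real rules may differ.
LINK_RE = re.compile(r"https?://\S+")
NUM_RE = re.compile(r"\d+")
SPACES_RE = re.compile(r"\s+")


def sketch_normalize(text: str) -> str:
    """Replace links and digit runs with placeholder tokens, then tidy spaces."""
    # Links go first so that digits inside a URL are not replaced separately.
    text = LINK_RE.sub(" WSLINK ", text)
    text = NUM_RE.sub(" WSNUMBER ", text)
    # Collapse the extra spaces introduced by the substitutions.
    return SPACES_RE.sub(" ", text).strip()


print(sketch_normalize("21-09-2018 https://www.malaysiakini.com/news/444015"))
# WSNUMBER - WSNUMBER - WSNUMBER WSLINK
```

Replacing links before numbers matters here, since the URL in the example ends in digits that would otherwise be turned into a separate WSNUMBER token.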
All licenses in this repository are copyrighted by their respective authors. Everything else is released under CC0. See LICENSE for details.