Here I collect some Python modules for NLP.
Text vectorization and text similarity
- text2vec: a Chinese text-to-vector tool that covers both word vectorization and sentence vectorization
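from text2vec import SearchSimilarity

a = '如何更换花呗绑定银行卡'  # "How do I change the bank card linked to Huabei?"
b = '花呗更改绑定银行卡'  # "Change the bank card linked to Huabei"
c = '我什么时候开通了花呗'  # "When did I activate Huabei?"
corpus = [a, b, c]
print(corpus)
# Score every sentence in the corpus against the query sentence a
search_sim = SearchSimilarity(corpus=corpus)
print(a, 'scores:', search_sim.get_scores(query=a))
print(a, 'rank similarities:', search_sim.get_similarities(query=a))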
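To give a feel for what a similarity score between short Chinese sentences can look like, here is a minimal standard-library sketch. It is not how text2vec computes its scores: it simply counts character bigrams and compares the counts with cosine similarity. The helper names (char_ngrams, cosine_similarity, rank_by_similarity) are made up for this illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Return a Counter of the overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(left, right):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(count * right[gram] for gram, count in left.items())
    norm_left = sqrt(sum(c * c for c in left.values()))
    norm_right = sqrt(sum(c * c for c in right.values()))
    if norm_left == 0 or norm_right == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_left * norm_right)


def rank_by_similarity(query, corpus, n=2):
    """Return (sentence, score) pairs sorted from most to least similar."""
    query_vec = char_ngrams(query, n)
    scored = [(s, cosine_similarity(query_vec, char_ngrams(s, n))) for s in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Same three sentences as in the text2vec example above.
corpus = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡', '我什么时候开通了花呗']
for sentence, score in rank_by_similarity(corpus[0], corpus):
    print(f'{score:.3f}  {sentence}')
```

With the first sentence as the query, the query itself ranks first, the second sentence (which shares five bigrams with it) comes next, and the third sentence (which shares only one) comes last.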
Word extraction
Word segmentation toolkit (thulac)
Example
Code example 1
import thulac
thu1 = thulac.thulac()  # default mode
text = thu1.cut("我爱北京天安门", text=True)  # segment a single sentence
print(text)
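For readers new to word segmentation, the idea can be sketched with a much simpler technique than the one a real toolkit uses: forward maximum matching against a word list. This is only an illustration and is not THULAC's algorithm. The forward_max_match function and the tiny vocabulary are made up for this example.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Segment text greedily, taking the longest vocabulary word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink it step by step.
        # A single character is always accepted, so the loop always advances.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in vocabulary:
                words.append(candidate)
                i += length
                break
    return words


# Toy vocabulary, made up for this example.
vocabulary = {'我', '爱', '北京', '天安门'}
print(' '.join(forward_max_match('我爱北京天安门', vocabulary)))
# prints: 我 爱 北京 天安门
```

A greedy matcher like this depends entirely on its word list and cannot resolve ambiguous splits, which is one reason to reach for a trained toolkit in practice.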