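toyNLP is an NLP project written in my spare time and is intended for reference only. Implemented algorithms:
1. NER (named entity recognition)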
- Bi-LSTM + CRF [1]
- BERT + CRF
- ALBERT + CRF
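
All three NER models above end in a CRF layer, which picks the best-scoring tag sequence from per-token emission scores (produced by the encoder) and a tag-to-tag transition matrix. The following is a minimal sketch of that decoding step (Viterbi) in plain Python. The function name `viterbi_decode` and the toy scores are illustrative assumptions and are not taken from the toyNLP code.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (from the encoder)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # score[j] = best score of any path that ends in tag j at the current position
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Best previous tag for landing on tag j at position t
            best_i = max(range(num_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag
    best_tag = max(range(num_tags), key=lambda j: score[j])
    path = [best_tag]
    for pointers in reversed(backpointers):
        best_tag = pointers[best_tag]
        path.append(best_tag)
    path.reverse()
    return path


# Toy example with two tags: the transition penalty 0 -> 1 changes the result
emissions = [[2.0, 1.0], [1.0, 1.5]]
transitions = [[0.0, -5.0], [0.0, 0.0]]
print(viterbi_decode(emissions, transitions))
```

Without the transition penalty, picking the best tag per token would give a different sequence; the CRF scores the sequence as a whole.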
2. Text classification
- char Bigram + SVM
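
The character-bigram features that feed the SVM can be sketched as follows. This is a minimal illustration in plain Python: `char_bigrams` and `bigram_counts` are hypothetical helper names, and the SVM training step itself (for example with a library such as scikit-learn) is left out.

```python
from collections import Counter
from typing import Dict, List


def char_bigrams(text: str) -> List[str]:
    """Split text into overlapping two-character pieces."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def bigram_counts(text: str) -> Dict[str, int]:
    """Bag-of-bigrams count vector, stored sparsely as a dict."""
    return dict(Counter(char_bigrams(text)))


print(char_bigrams("自然语言"))
print(bigram_counts("abab"))
```

Character bigrams need no word segmenter, which is one reason they are a common baseline feature for Chinese text.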
3. Text similarity
- MaLSTM [2]
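
MaLSTM (Manhattan LSTM) encodes both sentences with a shared LSTM and scores the pair as exp(-||h_left - h_right||_1), where h_left and h_right are the final hidden states. The sketch below shows only that similarity function in plain Python. The LSTM encoder is omitted, and the name `malstm_similarity` and the example vectors are illustrative assumptions.

```python
import math
from typing import Sequence


def malstm_similarity(h_left: Sequence[float], h_right: Sequence[float]) -> float:
    """exp(-L1 distance) between two sentence encodings; result lies in (0, 1]."""
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same dimension")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


# Identical encodings score 1.0; the score decays as the encodings move apart
print(malstm_similarity([0.2, -0.1], [0.2, -0.1]))
print(malstm_similarity([0.2, -0.1], [0.9, 0.4]))
```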
4. Basic NLP components
- Double-array trie
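
A double-array trie stores a trie's transitions in two integer arrays, `base` and `check`: from state `s` on character code `c`, the next state is `t = base[s] + c`, and the move is valid only if `check[t] == s`. The sketch below is a minimal static builder and membership test in plain Python. The class name, the end-of-word marker code, and the first-fit search for `base` values are illustrative choices and may differ from how toyNLP implements it.

```python
from typing import Dict, Iterable, List

END = 0  # code reserved for the end-of-word marker


class DoubleArrayTrie:
    def __init__(self, words: Iterable[str]) -> None:
        words = sorted(set(words))
        # Map every character to a small positive integer code.
        alphabet = sorted({ch for w in words for ch in w})
        self.codes: Dict[str, int] = {ch: i + 1 for i, ch in enumerate(alphabet)}
        # Step 1: build an ordinary trie as nested dicts keyed by code.
        root: dict = {}
        for w in words:
            node = root
            for ch in w:
                node = node.setdefault(self.codes[ch], {})
            node[END] = {}
        # Step 2: pack the trie into the two arrays.
        # Slot 0 is the root; -1 in check marks a free slot. Slot 0 is never
        # handed out to a child because base values start at 1.
        self.base: List[int] = [0]
        self.check: List[int] = [-1]
        stack = [(0, root)]
        while stack:
            state, node = stack.pop()
            if not node:
                continue  # end-marker state: no outgoing transitions
            b = self._find_base(sorted(node))
            self.base[state] = b
            # Reserve all child slots before descending into any child.
            for code in node:
                self.check[b + code] = state
            for code, child in node.items():
                stack.append((b + code, child))

    def _find_base(self, codes: List[int]) -> int:
        """First-fit search for a base where every child slot is free."""
        b = 1
        while True:
            self._grow(b + codes[-1] + 1)
            if all(self.check[b + c] == -1 for c in codes):
                return b
            b += 1

    def _grow(self, size: int) -> None:
        extra = size - len(self.check)
        if extra > 0:
            self.base.extend([0] * extra)
            self.check.extend([-1] * extra)

    def _step(self, state: int, code: int) -> int:
        """Follow one transition; return the next state or -1 if absent."""
        t = self.base[state] + code
        if t < len(self.check) and self.check[t] == state:
            return t
        return -1

    def __contains__(self, word: str) -> bool:
        state = 0
        for ch in word:
            code = self.codes.get(ch)
            if code is None:
                return False
            state = self._step(state, code)
            if state < 0:
                return False
        return self._step(state, END) >= 0


dat = DoubleArrayTrie(["自然", "自然语言", "语言", "he", "her", "hers"])
print("自然语言" in dat, "自然语" in dat, "her" in dat, "his" in dat)
```

Each lookup step is two array reads and one comparison, which is the main appeal of the structure for dictionary matching; the trade-off is that construction has to search for non-colliding `base` values.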
[1] Lample, Guillaume, et al. "Neural architectures for named entity recognition." arXiv preprint arXiv:1603.01360 (2016).
[2] Mueller, Jonas, and Aditya Thyagarajan. "Siamese recurrent architectures for learning sentence similarity." AAAI Conference on Artificial Intelligence (2016): 2786-2792.