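Si Nianwen, Wang Hengjun, Li Wei, Shan Yidong, Xie Pengcheng. Chinese Part-of-speech Tagging Model Using Attention-based LSTM [J]. Computer Science (计算机科学), 2018, 45(4): 66-70, 82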
Chinese Part-of-speech Tagging Model Using Attention-based LSTM
Received: 2017-05-11  Revised: 2017-07-18
DOI:10.11896/j.issn.1002-137X.2018.04.009
Keywords: Part-of-speech tagging, Long short-term memory, Attention mechanism, Contextual feature
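Author; Affiliation; E-mail
Si Nianwen (司念文); The Third Institute, PLA Information Engineering University, Zhengzhou 450001
Wang Hengjun (王衡军); The Third Institute, PLA Information Engineering University, Zhengzhou 450001; wang-hengjun@163.com
Li Wei (李伟); Unit 66083, Beijing 100144
Shan Yidong (单义栋); The Third Institute, PLA Information Engineering University, Zhengzhou 450001
Xie Pengcheng (谢鹏程); School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049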
Abstract:
      Traditional statistical approaches to Chinese part-of-speech tagging depend heavily on manually designed features. To address this, the paper proposes an attention-based long short-term memory (LSTM) model for Chinese part-of-speech tagging. The model takes basic distributed word vectors as its input units and uses a bidirectional LSTM to extract rich contextual feature representations for each word. An attention hidden layer is added to the network: the attention mechanism assigns probability weights to the hidden states at different time steps so that the hidden layer focuses on the more important features, which improves the quality of the hidden vectors. A state transition probability matrix is introduced in the decoding stage to further improve tagging accuracy. Experiments on the People's Daily (PKU) and Chinese Penn Treebank CTB5 corpora show that the model tags Chinese text effectively: its accuracy is higher than that of traditional methods such as conditional random fields, and it is close to that of the better current part-of-speech tagging models.
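The abstract names two mechanisms without giving formulas: attention weights over the hidden states, and decoding with a state transition matrix. The following is a minimal NumPy sketch of both ideas, not the paper's implementation. The scaled dot-product scoring in `attend`, the function names, and the convention `transitions[i, j]` = score of moving from tag i to tag j are all assumptions made for illustration; the BiLSTM itself is omitted and its outputs are taken as given arrays.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def attend(hidden):
    """Re-weight hidden states with attention probabilities.

    hidden: (T, d) array, one hidden state per time step
            (assumed to come from a bidirectional LSTM).
    Returns the (T, d) attended states and the (T, T) weight matrix.
    The scaled dot-product score is an assumption; the abstract does
    not say which scoring function the paper uses.
    """
    scores = hidden @ hidden.T / np.sqrt(hidden.shape[1])
    weights = softmax(scores, axis=1)  # each row sums to 1
    return weights @ hidden, weights


def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence.

    emissions:   (T, K) per-token tag scores from the network.
    transitions: (K, K) matrix, transitions[i, j] = score of tag i
                 followed by tag j (hypothetical convention).
    Returns a list of T tag indices.
    """
    num_steps, num_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((num_steps, num_tags), dtype=int)
    for t in range(1, num_steps):
        # candidate[i, j]: best score ending in tag i at t-1, then tag j at t
        candidate = score[:, None] + transitions
        backptr[t] = np.argmax(candidate, axis=0)
        score = np.max(candidate, axis=0) + emissions[t]
    # Follow back-pointers from the best final tag.
    best = [int(np.argmax(score))]
    for t in range(num_steps - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]
```

As a usage illustration, with emissions `[[3.0, 1.0], [1.0, 1.2], [2.0, 1.0]]` the per-token argmax is `[0, 1, 0]`. Decoding with a transition matrix that penalizes tag switches, `[[0.0, -5.0], [-5.0, 0.0]]`, yields `[0, 0, 0]` instead, which shows how transition scores can override a locally preferred tag.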