Article abstract
Huang Degen, Li Zezhong, Wan Ru. Chinese organization name recognition using cascaded model based on SVM and CRF[J].,2010,(5):782-787
Chinese organization name recognition using cascaded model based on SVM and CRF
  
DOI:10.7511/dllgxb201005028
Keywords: organization name recognition; conditional random fields (CRF); support vector machine (SVM); cascaded model
Funding: Supported by the Fundamental Research Funds for the Central Universities (DUT10RW202).
Authors / Affiliations
Huang Degen (黄德根), Li Zezhong (李泽中), Wan Ru (万如)
Abstract:
      A two-layer (cascaded) model based on support vector machines (SVM) and conditional random fields (CRF) is proposed for Chinese organization name recognition. The first layer uses a CRF to recognize simple organization names and passes its results to the second layer to support the next stage of recognition. The second layer uses a drive-based method that combines SVM and CRF to recognize complex organization names. Finally, the results of the two layers are merged, and a post-processing step corrects the results that have low confidence. An open test on a large-scale real corpus yields a precision of 94.83% and a recall of 95.02%, which demonstrates the effectiveness of the method.
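The abstract reports precision and recall but no combined score. As a rough aid for comparing against other systems, the balanced F1 (the harmonic mean of the two) can be derived from the reported figures. This is a minimal sketch: the function name `f1_score` is illustrative, and the resulting value is computed here, not a number reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    The result is in the same units as the inputs (here, percent).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
precision = 94.83
recall = 95.02

# Derived value, not reported in the source.
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 94.92%"
```

Because precision and recall are nearly equal here, the harmonic mean sits very close to their arithmetic mean.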