Article abstract
李丽双, 党延忠, 廖文平, 黄德根, 张颖. Recognition of Chinese location names based on CRF and rules [J]. , 2012, (2): 285-289
Recognition of Chinese location names based on CRF and rules (original title: CRF与规则相结合的中文地名识别)
  
DOI:10.7511/dllgxb201202021
Chinese keywords: 中文信息处理; 中文地名识别; 条件随机域; 基于规则的后处理
English keywords: Chinese information processing; Chinese location name recognition; conditional random fields; rule-based post-processing
Funding: National Natural Science Foundation of China (grants 61173101 and 71031002).
Authors and affiliations
李丽双,党延忠,廖文平,黄德根,张颖  
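Abstract (translated from the Chinese original):
      An incremental learning strategy is used to optimize the feature templates of a conditional random fields (CRF) model and thereby improve the recognition of Chinese location names. A rule base is built from linguistic knowledge to compensate for the low recall that results when the machine learning model acquires incomplete knowledge. Together these yield a Chinese location name recognition system that combines CRF with rules. Experimental results show that combining CRF with rules is effective for recognizing location names in Chinese text: in an open test on the MSRA corpus of the Bakeoff2007 NER task, recall, precision, and F value are 94.67%, 92.35%, and 93.50%, respectively.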
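The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea of recall-oriented post-processing, under stated assumptions: the CRF output is stubbed as a list of word-level tags, and both the suffix list `LOCATION_SUFFIXES` and the function `apply_suffix_rule` are hypothetical illustrations, not the authors' rule base.

```python
# Hypothetical suffix list for illustration only; the paper's actual
# rule base is not given in the abstract.
LOCATION_SUFFIXES = ("省", "市", "县", "镇", "村")


def apply_suffix_rule(words, tags):
    """Promote words the statistical model left untagged ("O") to "LOC"
    when they end in a typical location suffix.

    Only "O" tags are changed, so the rule can add locations (raising
    recall) but never removes a location the model already found.
    """
    fixed = list(tags)
    for i, (word, tag) in enumerate(zip(words, tags)):
        # Require at least two characters so a bare suffix such as "市"
        # is not treated as a location name on its own.
        if tag == "O" and len(word) >= 2 and word.endswith(LOCATION_SUFFIXES):
            fixed[i] = "LOC"
    return fixed


# Stubbed model output: assume the CRF found "大连市" but missed "瓦房店市".
words = ["他", "去", "了", "大连市", "和", "瓦房店市"]
crf_tags = ["O", "O", "O", "LOC", "O", "O"]

final_tags = apply_suffix_rule(words, crf_tags)
print(final_tags)
```

A rule this crude would also produce false positives on ordinary words that happen to end in one of these characters, which is one reason a real rule base would need linguistic constraints beyond a suffix match.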
English abstract:
      The feature templates of conditional random fields (CRF) are optimized using an incremental learning strategy to improve the recognition of Chinese location names. A rule base is constructed from linguistic knowledge to offset the low recall caused by the insufficient knowledge obtained by the machine learning model. Finally, a system that combines CRF with rules to identify location names in Chinese texts is implemented. Experimental results show that the proposed method is effective. On the MSRA corpus of the Bakeoff2007 NER task, the recall, precision and F value obtained by this system reach 94.67%, 92.35% and 93.50%, respectively, in open tests.
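As a quick consistency check on the reported figures, the sketch below assumes that "F value" means the standard balanced F-measure, the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f_value` is only illustrative.

```python
def f_value(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (92.35% and 94.67%).
f = f_value(0.9235, 0.9467)
print(f"{f:.4f}")  # prints 0.9350
```

Under that assumption the reported precision and recall reproduce the reported F value of 93.50%.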