Chinese Journal of Tissue Engineering Research ›› 2018, Vol. 22 ›› Issue (20): 3237-3242. doi: 10.3969/j.issn.2095-4344.0302


Named entity recognition based on bidirectional long short-term memory combined with case report form

Yang Hong-mei1, Li Lin2, Yang Ri-dong1, Zhou Yi1, 2   

  1Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, Guangdong Province, China; 2Xinjiang Medical University, Urumqi 830011, Xinjiang Uygur Autonomous Region, China
  • Received: 2018-03-13 Online: 2018-07-18 Published: 2018-07-18
  • Contact: Zhou Yi, M.D., Associate professor, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, Guangdong Province, China; Xinjiang Medical University, Urumqi 830011, Xinjiang Uygur Autonomous Region, China
  • About author: Yang Hong-mei, Master candidate, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, Guangdong Province, China
  • Supported by:

    the National Key Research & Development Precision Medicine Project of China, No. 2016YFC0901602; the Project of NSFC-Guangdong Big Data Science Center, No. U1611261; the Advanced and Key Technology Innovation Project of Guangdong Province, No. 2014B010118003; the Major University-Industry Cooperation Innovation Project of Guangzhou in 2017, No. 201604016136; the Major Project of Health Medicine Cooperation Innovation Project of Guangzhou, No. 201604020016

Abstract:

BACKGROUND: Electronic medical records (EMRs) are an important source of medical data and reflect medical knowledge. EMRs contain patients' clinical features, which can support decision support systems and precision medicine.
OBJECTIVE: To extract key medical entities from EMRs using information extraction, and thereby to uncover knowledge about hepatocellular carcinoma.
METHODS: The EMR database of a grade-A tertiary hospital in Guangdong Province was used. We retrieved the clinical records (18 542 sentences) of 240 patients with hepatocellular carcinoma, including admission notes and discharge summaries. The records were annotated according to predetermined standards. The records of 180 patients (13 839 sentences) were randomly selected for training, and the records of the remaining 60 patients (4 703 sentences) were held out for testing. A bidirectional long short-term memory network combined with a case report form was used to build the named entity recognition (NER) model. The performance of the NER system was evaluated on the test dataset, and precision, recall, and F1 under strict matching were calculated.
RESULTS AND CONCLUSION: On the test dataset, the model achieved an F1-measure of 0.853 5 on admission notes, 0.726 5 on discharge summaries, and 0.805 2 overall. In this study, we built an automatic named entity recognition model for EMRs, but the accuracy of entity extraction requires further investigation.
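
The strict-matching evaluation mentioned in the METHODS can be illustrated with a minimal sketch. Under strict matching, a predicted entity counts as correct only when both its boundaries and its type agree exactly with an annotated entity. The abstract does not describe the tagging scheme or the entity types, so the BIO tags and the labels `SYM` and `DIS` below are assumptions made for illustration only, as are the function names; this is not the authors' evaluation code.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Entity = Tuple[int, int, str]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from one BIO tag sequence (BIO is an assumed scheme)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        # The current entity continues only on an I- tag of the same type.
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def strict_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Precision, recall and F1 where a hit requires identical span and type."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = extract_entities(g_tags), extract_entities(p_tags)
        tp += len(g & p)  # exact (start, end, type) matches only
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels: SYM = symptom, DIS = disease.
    gold = [["B-SYM", "I-SYM", "O", "B-DIS"]]
    pred = [["B-SYM", "O", "O", "B-DIS"]]  # first entity has a wrong right boundary
    print(strict_prf(gold, pred))
```

In the example, the truncated first entity earns no credit even though it overlaps the annotation, which is why strict-matching scores are typically lower than those under relaxed (overlap-based) matching.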

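Key publication topics of the Chinese Journal of Tissue Engineering Research: tissue construction; bone cells; chondrocytes; cell culture; fibroblasts; vascular endothelial cells; osteoporosis; tissue engineering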

Key words: Medical Records Systems, Computerized; Neural Networks (Computer); Liver Neoplasms; Tissue Engineering
