Natural Language Processing on Electronic Health Records
Substance use (smoking, alcohol and drug use) is a significant part of a patient’s history and can be used for clinical care or clinical research purposes. It is essential for clinicians to have a precise picture of each patient’s substance use, which helps to improve the patient’s health care. Substance use is known to be a leading cause of morbidity and mortality. However, the related information is hidden in free text. Recently, NLP methods have received considerable attention for analysing EHR narrative text and extracting the hidden information. An automated system to extract patients’ substance use from unstructured text in medical discharge records has been proposed. A multistage approach has been applied to extract the substance use status and attributes such as type, frequency and amount. An extension of the NegEx negation detection algorithm has been developed and used to detect negation in patients’ records. Various natural language processing techniques along with rule-based techniques have been applied and compared with machine learning techniques. The proposed system performed well in detecting substance use status and attributes from clinical notes.
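To make the rule-based stage concrete, the following is a minimal sketch of how substance mentions, a simplified NegEx-style negation check, and the amount and frequency attributes might be extracted with regular expressions. It is not the system described above: the lexicons, the status labels ("positive"/"negative"), the five-token negation window, and all function names are illustrative assumptions.

```python
import re

# Hypothetical, deliberately tiny lexicons; a real system would need far larger ones.
SUBSTANCE_PATTERNS = {
    "smoking": re.compile(r"\b(?:smok\w*|tobacco|cigarettes?)\b", re.I),
    "alcohol": re.compile(r"\b(?:alcohol|etoh|beers?|wine)\b", re.I),
    "drug": re.compile(r"\b(?:cocaine|heroin|marijuana|illicit drugs?|drug use)\b", re.I),
}
# Simplified NegEx-style cues: triggers open a negation scope, terminators close it.
NEGATION_TRIGGERS = re.compile(r"\b(?:no|denies|denied|never|without|negative for)\b", re.I)
SCOPE_TERMINATORS = re.compile(r"\b(?:but|however|although)\b", re.I)
MAX_SCOPE_TOKENS = 5  # assumed window size, not taken from the source

AMOUNT_PATTERN = re.compile(
    r"\b(\d+(?:\.\d+)?)\s+(packs?|cigarettes?|drinks?|beers?|glasses)\b", re.I
)
FREQUENCY_PATTERN = re.compile(
    r"\b(daily|weekly|occasionally|(?:per|a|each)\s+(?:day|week|month))\b", re.I
)


def is_negated(sentence, term_start):
    """Return True if a negation trigger precedes the term within the scope window."""
    prefix = sentence[:term_start]
    triggers = list(NEGATION_TRIGGERS.finditer(prefix))
    if not triggers:
        return False
    between = prefix[triggers[-1].end():]  # text between last trigger and the term
    if SCOPE_TERMINATORS.search(between):
        return False  # a terminator such as "but" ends the negation scope
    return len(between.split()) <= MAX_SCOPE_TOKENS


def extract_substance_use(text):
    """Extract one finding per substance type per sentence."""
    findings = []
    for sentence in re.split(r"(?<=[.;])\s+", text):
        for substance, pattern in SUBSTANCE_PATTERNS.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            negated = is_negated(sentence, match.start())
            amount = AMOUNT_PATTERN.search(sentence)
            frequency = FREQUENCY_PATTERN.search(sentence)
            findings.append({
                "type": substance,
                "status": "negative" if negated else "positive",
                # Attributes are sentence-level here and are dropped for negated mentions.
                "amount": None if negated or amount is None else amount.group(0),
                "frequency": None if negated or frequency is None else frequency.group(0),
            })
    return findings


if __name__ == "__main__":
    note = "Patient smokes 2 packs per day. Denies alcohol or illicit drug use."
    for finding in extract_substance_use(note):
        print(finding)
```

In this sketch the attributes are attached at sentence level, so a sentence mentioning several substances would share one amount and frequency across them; a fuller multistage pipeline would need to link each attribute to its own mention.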