Machine learning algorithm for information extraction from gynaecological domain in Tamil

M. Rajasekar, Angelina Geetha

Abstract


Information extraction (IE) is a significant task in natural language processing (NLP). It is the process of extracting useful information from unstructured text. Information extraction supports many recent NLP applications in areas such as scientific research, financial investigation, business intelligence, media monitoring, healthcare records management, agriculture, and pharmacy research. Several research approaches to information extraction, using a variety of techniques, exist for English datasets. Because India is a multilingual country, extracting information from text in Indian languages is a challenging task. Such research has been carried out on data from the following domains: travel, food, agriculture, weather forecasting, social media, marketing, and biomedicine. In this research work, relevant information is extracted from gynaecology-related text data in the Tamil language. A combination of a machine learning based classification model and an ontology representation is used to extract the useful information. An auto-filling IE framework is designed to extract the appropriate information in a structured format. The user query is pre-processed and converted into entities, which are checked against the classified data through the ontological representation; the classification is performed by a naïve Bayes model. From the ontological representation, entity-based relation extraction is performed to fill the IE framework. The proposed IE framework gave good results in extracting relevant information based on user queries. It was analyzed on more than 57 user queries regarding gynaecological issues, and an accuracy of 75% was obtained for correctness on these queries.
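The classification and frame-filling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `NaiveBayes`, the helper `fill_frame`, the category labels, and the English placeholder tokens standing in for pre-processed Tamil entities are all assumptions made for illustration, and the ontology lookup is reduced to a simple vocabulary check.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant
        self.class_counts = Counter()            # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()                       # all tokens seen in training

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def fill_frame(model, query_tokens):
    """Fill a hypothetical two-slot frame: predicted category and known entities."""
    return {
        "category": model.predict(query_tokens),
        "entities": [t for t in query_tokens if t in model.vocab],
    }


# Hypothetical toy training data; placeholders for pre-processed Tamil entities.
train_docs = [
    ["pain", "abdomen", "cramps"],
    ["bleeding", "irregular", "pain"],
    ["tablet", "dose", "daily"],
    ["surgery", "rest", "tablet"],
]
train_labels = ["symptom", "symptom", "treatment", "treatment"]

model = NaiveBayes().fit(train_docs, train_labels)
print(fill_frame(model, ["pain", "cramps"]))
```

In a fuller system the `entities` slot would presumably be resolved against ontology concepts and relations rather than the training vocabulary; the vocabulary check here is only a stand-in for that step.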


Published: 2021-09-06

How to Cite this Article:

M. Rajasekar, Angelina Geetha, Machine learning algorithm for information extraction from gynaecological domain in Tamil, J. Math. Comput. Sci., 11 (2021), 7140-7153

Copyright © 2021 M. Rajasekar, Angelina Geetha. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

 
