Clinical information digitalization and structuring system using optical character recognition and natural language processing techniques

Authors

  • Hugo Eduardo Castro Aranzábal, Universidad de Lima
  • Walter Giancarlo Pinedo Barrientos, Universidad de Lima, Lima, Perú

DOI:

https://doi.org/10.26439/ciis2018.5466

Keywords:

optical character recognition, computer vision, natural language processing, international statistical classification of diseases, support vector machine, term frequency – inverse document frequency

Abstract

This work aims to develop a system that digitizes and structures clinical records written by a doctor in the traditional way, using optical character recognition and natural language processing techniques while providing a non-invasive workflow. In addition, the system will make the generated data available for future work.

Published

2021-10-11

How to Cite

Clinical information digitalization and structuring system using optical character recognition and natural language processing techniques. (2021). Actas Del Congreso Internacional De Ingeniería De Sistemas, 181-189. https://doi.org/10.26439/ciis2018.5466