DOI: https://doi.org/10.14483/23448393.21135
Published: 2024-11-27
Issue: Vol. 29 No. 3 (2024): September-December
Section: Computational Intelligence
Advanced Neural Model for Spanish Spell-Checking
Modelo neuronal avanzado para corrección ortográfica en español
Keywords (en): neocortex, deep neural model, spell-checker, pattern recognition.
Keywords (es): neocórtex, modelo neuronal profundo, reconocimiento de patrones, corrector ortográfico.
Abstract (en)
Context: Correcting spelling errors in written content, particularly in Spanish texts, remains a critical challenge in natural language processing (NLP) due to the complexity of word structures and the inefficiency of existing methods when applied to large datasets.
Method: This paper introduces a novel neural model inspired by the brain’s cognitive mechanisms for recognizing and correcting misspelled words. Through a deep hierarchical framework with specialized recognition neurons and advanced activation functions, the model is designed to enhance the accuracy and scalability of spelling correction systems. Our approach not only improves error detection but also provides context-aware corrections.
Results: The results show that the model achieves an F-measure of 83%, significantly surpassing the 73% accuracy of traditional spell-checkers, marking a substantial advancement in automated spelling correction for the Spanish language.
Conclusions: The features of the neural model facilitate spelling correction by emulating the cognitive mechanisms of the human mind. Our model detects more types of orthographic errors and reports fewer false positives. Its main limitation is that the weights assigned to the variables used for recognition must be defined in a supervised manner.
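For readers unfamiliar with the F-measure reported in the results, the following minimal sketch shows how it is commonly computed from detection counts. The function name and the counts below are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from raw counts.

    tp: errors correctly flagged, fp: correct words wrongly flagged,
    fn: errors that were missed. beta=1.0 gives the usual F1 score.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical counts, for illustration only (not the paper's data).
precision, recall, f1 = f_measure(tp=83, fp=17, fn=17)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because the F-measure combines precision and recall, it is not the same quantity as accuracy, so a comparison between the two should be read with that difference in mind.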
License
Copyright (c) 2024 Eduard Gilberto Puerto Cuadros
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Starting with Vol. 23 No. 3 (2018), the Creative Commons "Attribution-NonCommercial-NoDerivatives" license was replaced by the following:
Attribution-NonCommercial-ShareAlike: this license allows others to distribute, remix, adapt, and build upon the work non-commercially, as long as they credit the author and license their new creations under the same terms.