Accuracy Improvement of Khmer Text Recognition by Correcting Post-recognized Characters
Abstract
Key Messages
- The Constitution of Cambodia establishes Khmer as the official language, making accurate digital processing of Khmer documents crucial for national development. Limitations in Khmer text recognition technology currently hinder digital transformation efforts in both the public and private sectors.
- Existing Optical Character Recognition (OCR) tools have significant limitations with Khmer script, and common character recognition errors reduce document processing efficiency. Our post-processing correction method improves Khmer OCR accuracy from 93.4% to 96.4%, a gain of 3.0 percentage points and a meaningful advance in Khmer text digitization.
- The proposed solution can be integrated into existing document management systems without requiring extensive infrastructure changes. Government agencies and private organizations can achieve higher efficiency in document digitization while maintaining Khmer language integrity.
- Government institutions should prioritize the adoption of improved Khmer OCR systems to enhance public service delivery. Investment in Khmer language digital tools will support Cambodia’s digital transformation goals while preserving its linguistic heritage.
Authors
SRUN, S., KEAN, T., & BUN, L. Accuracy Improvement of Khmer Text Recognition by Correcting Post-recognized Characters. Insight: Cambodia Journal of Basic and Applied Research, 6(2), -. https://doi.org/10.61945/cjbar.2024.6.2.05