Combined Contrast Enhanced and Wide-Baseline Technique for Kannada Text Detection in Images
DOI:
https://doi.org/10.17762/jaz.v44iS6.2289
Keywords:
Contrast Limited Adaptive Histogram Equalization, Text Detection, Wide Baseline Image Matching
Abstract
Text characters contained in images are a valuable source of information for content-based indexing and retrieval applications. They are difficult to identify and distinguish because they vary in size and grayscale value and often sit on intricate backgrounds. This paper presents a new method for detecting text in images of any grayscale value. The proposed scheme combines the contrast-limited adaptive histogram equalization (CLAHE) algorithm, which enhances local contrast while limiting noise amplification, with the wide-baseline image matching technique, which helps locate an object in the image. A series of morphological operations and filtering is applied at each stage, and the resulting component is the detected text, which may be a character, a word, or a line segment. MATLAB-based simulation and evaluation on a self-curated dataset of Kannada, a popular South Indian language, and on other standard datasets show that the proposed technique consistently outperforms other methods on precision, recall, and F1-score. Notably, on the Kannada dataset it achieves the highest recall, 98%, because the system is specifically tuned to the script's linguistic features, which demonstrates its robustness. Further, the proposed technique can be extended to image pre-processing for deep learning models, to improve their accuracy, and to text recognition tasks.
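As a brief illustration of the evaluation metrics named in the abstract, the following minimal sketch computes precision, recall, and F1-score from detection counts. The function name `detection_scores` and the counts passed to it are hypothetical and chosen only for illustration; they are not results from the paper, and the authors' MATLAB evaluation may count matches differently.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) for a text detector.

    true_positives:  detected regions that match a ground-truth text region
    false_positives: detected regions that match no ground-truth region
    false_negatives: ground-truth text regions that were not detected
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives

    # Guard against division by zero when nothing was detected or annotated.
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only (not taken from the paper).
precision, recall, f1 = detection_scores(
    true_positives=90, false_positives=10, false_negatives=30
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

A high recall, such as the 98% reported for the Kannada dataset, means that few ground-truth text regions are missed; precision separately reflects how many of the detected regions are actually text.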
License
Copyright (c) 2023 Shahzia Siddiqua, C. Naveena, Sunilkumar S. Manvi

This work is licensed under a Creative Commons Attribution 4.0 International License.