An improved fast edit approach for two-string approximated mean computation applied to OCR

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/142778
Item information
Title: An improved fast edit approach for two-string approximated mean computation applied to OCR
Author(s): Abreu Salas, José Ignacio | Rico-Juan, Juan Ramón
Research group(s) or GITE: Reconocimiento de Formas e Inteligencia Artificial
Centre, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Dataset editing | Shape prototypes | Edit distance | Median string
Publication date: 13-Dec-2012
Publisher: Elsevier
Citation: Pattern Recognition Letters. 2013, 34(5): 496-504. https://doi.org/10.1016/j.patrec.2012.11.019
Abstract: This paper presents a new fast algorithm for computing an approximation to the mean of two strings of characters representing a 2D shape, and its application to a new Wilson-based editing procedure. The approximate mean is built up by including some symbols from the two original strings. In addition, a greedy variant of this algorithm is studied, which reduces the time required to compute an approximate mean. The new dataset editing scheme relaxes the criterion for deleting instances proposed by the Wilson editing procedure. In practice, not all instances misclassified by their nearest neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope that the instance will be classified correctly in the future. The new artificial instance is the approximate mean of the misclassified sample and its same-class nearest neighbor. Experiments carried out over three widely known databases of contours show that the proposed algorithm performs very well when computing the mean of two strings, and outperforms methods proposed by other authors. In particular, the low computational time required by the heuristic approach makes it well suited to long strings. Results also show that the proposed preprocessing scheme can reduce the classification error in about 83% of trials. There is empirical evidence that using the greedy approximation to compute the approximate mean does not affect the performance of the editing procedure.
Sponsor(s): This work is partially supported by the Spanish CICYT under project DPI2006-15542-C04-01, the Spanish MICINN through project TIN2009-14205-CO4-01 and by the Spanish research program Consolider Ingenio 2010: MIPRCV (CSD2007-00018).
URI: http://hdl.handle.net/10045/142778
ISSN: 0167-8655 (Print) | 1872-7344 (Online)
DOI: 10.1016/j.patrec.2012.11.019
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2012 Elsevier B.V.
Peer reviewed: yes
Publisher version: https://doi.org/10.1016/j.patrec.2012.11.019
Appears in collections: INV - GRFIA - Artículos de Revistas
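
The editing scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the approximate mean here is obtained by applying half of a unit-cost Levenshtein edit script, which is a simple stand-in for the symbol-selection strategy of the paper, and the rule for dropping samples is an assumption, since the abstract does not state the exact pruning criterion. All function names (`edit_script`, `approx_mean`, `relaxed_wilson`) are hypothetical.

```python
from typing import List, Tuple

Op = Tuple[str, int, str]  # (kind, position in a, symbol)


def edit_script(a: str, b: str) -> List[Op]:
    """Minimal unit-cost edit operations turning a into b, ordered right to left."""
    n, m = len(a), len(b)
    # d[i][j] = Levenshtein distance between a[:i] and b[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    # Backtrace from the bottom-right corner; matches produce no operation.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("sub", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, ""))
            i -= 1
        else:
            ops.append(("ins", i, b[j - 1]))
            j -= 1
    return ops


def distance(a: str, b: str) -> int:
    """Levenshtein distance, computed as the length of the edit script."""
    return len(edit_script(a, b))


def approx_mean(a: str, b: str) -> str:
    """Apply the first half of the edit script to a (assumed strategy)."""
    ops = edit_script(a, b)
    s = list(a)
    # Operations run right to left, so earlier positions stay valid.
    for kind, pos, sym in ops[: len(ops) // 2]:
        if kind == "sub":
            s[pos] = sym
        elif kind == "del":
            del s[pos]
        else:
            s.insert(pos, sym)
    return "".join(s)


def relaxed_wilson(data: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Leave-one-out 1-NN editing that adds an artificial mean instead of pruning."""
    edited: List[Tuple[str, str]] = []
    for idx, (x, label) in enumerate(data):
        others = [p for k, p in enumerate(data) if k != idx]
        if not others or min(others, key=lambda p: distance(x, p[0]))[1] == label:
            edited.append((x, label))  # correctly classified: keep as is
            continue
        same = [p for p in others if p[1] == label]
        if same:
            # Misclassified: keep the sample and add the approximate mean of
            # the sample and its same-class nearest neighbor.
            friend = min(same, key=lambda p: distance(x, p[0]))
            edited.append((x, label))
            edited.append((approx_mean(x, friend[0]), label))
        # Assumption: a misclassified sample with no same-class neighbor is dropped.
    return edited


if __name__ == "__main__":
    print(approx_mean("kitten", "sitting"))
    print(relaxed_wilson([("aaaa", "A"), ("aaab", "A"), ("bbbb", "B"), ("bbba", "A")]))
```

Because the applied operations are a prefix of an optimal edit script, the resulting string lies on an optimal edit path between the two inputs, so its distances to them differ by at most one. The dynamic program is quadratic in the string lengths, which is the cost the greedy variant mentioned in the abstract is meant to reduce.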

Files in this item:
File | Description | Size | Format
Abreu_Rico-Juan_2013_PatternRecognitLett_final.pdf | Final version (restricted access) | 939.57 kB | Adobe PDF
Abreu_Rico-Juan_2013_PatternRecognitLett_preprint.pdf | Preprint (open access) | 688.72 kB | Adobe PDF


All documents in RUA are protected by copyright. Some rights reserved.