An improved fast edit approach for two-string approximated mean computation applied to OCR

Abreu Salas, José Ignacio; Rico-Juan, Juan Ramón

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/142778

Item information
Title:	An improved fast edit approach for two-string approximated mean computation applied to OCR
Author(s):	Abreu Salas, José Ignacio \| Rico-Juan, Juan Ramón
Research group(s) or GITE:	Reconocimiento de Formas e Inteligencia Artificial
Center, Department or Service:	Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords:	Dataset editing \| Shape prototypes \| Edit distance \| Median string
Publication date:	13-Dec-2012
Publisher:	Elsevier
Citation:	Pattern Recognition Letters. 2013, 34(5): 496-504. https://doi.org/10.1016/j.patrec.2012.11.019
Abstract:	This paper presents a new fast algorithm for computing an approximation to the mean of two strings of characters representing a 2D shape, and its application to a new Wilson-based editing procedure. The approximate mean is built up by including some symbols from the two original strings. In addition, a greedy approach to this algorithm is studied, which reduces the time required to compute an approximate mean. The new dataset editing scheme relaxes the criterion for deleting instances proposed by the Wilson editing procedure. In practice, not all instances misclassified by their near neighbors are pruned. Instead, an artificial instance is added to the dataset in the hope of successfully classifying the instance in the future. The new artificial instance is the approximated mean of the misclassified sample and its same-class nearest neighbor. Experiments carried out over three widely known databases of contours show that the proposed algorithm performs very well when computing the mean of two strings, and outperforms methods proposed by other authors. In particular, the low computational time required by the heuristic approach makes it very suitable when dealing with long strings. Results also show that the proposed preprocessing scheme can reduce the classification error in about 83% of trials. There is empirical evidence that using the greedy approximation to compute the approximated mean does not affect the performance of the editing procedure.
Sponsors:	This work is partially supported by the Spanish CICYT under project DPI2006-15542-C04-01, the Spanish MICINN through project TIN2009-14205-CO4-01 and by the Spanish research program Consolider Ingenio 2010: MIPRCV (CSD2007-00018).
URI:	http://hdl.handle.net/10045/142778
ISSN:	0167-8655 (Print) \| 1872-7344 (Online)
DOI:	10.1016/j.patrec.2012.11.019
Language:	eng
Type:	info:eu-repo/semantics/article
Rights:	© 2012 Elsevier B.V.
Peer reviewed:	yes
Publisher version:	https://doi.org/10.1016/j.patrec.2012.11.019
Appears in collections:	INV - GRFIA - Artículos de Revistas
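
The abstract describes two ideas: an approximate mean of two strings built from symbols of both, and a relaxed Wilson editing step that adds that mean as an artificial instance. The sketch below illustrates both under loud assumptions and is not the authors' algorithm. It assumes unit-cost Levenshtein distance, builds the mean by applying half of one optimal edit script, and uses leave-one-out 1-NN. It also keeps every misclassified instance, because the abstract does not state which ones are pruned. The names `edit_script`, `levenshtein`, `approx_mean` and `relaxed_wilson_edit` are hypothetical.

```python
def edit_script(a, b):
    """One optimal unit-cost edit script turning a into b.

    Operations are listed from the right end of a to the left, so each
    index stays valid while the operations are applied in list order.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("sub", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, None))
            i -= 1
        else:
            ops.append(("ins", i, b[j - 1]))
            j -= 1
    return ops


def levenshtein(a, b):
    """Edit distance equals the length of an optimal edit script."""
    return len(edit_script(a, b))


def approx_mean(a, b):
    """Assumed construction: apply half of the edit script from a to b."""
    ops = edit_script(a, b)
    s = list(a)
    for kind, pos, sym in ops[: len(ops) // 2]:
        if kind == "sub":
            s[pos] = sym
        elif kind == "del":
            del s[pos]
        else:  # "ins"
            s.insert(pos, sym)
    return "".join(s)


def relaxed_wilson_edit(data):
    """data: list of (string, label). Returns data plus artificial instances."""
    out = list(data)
    for idx, (s, label) in enumerate(data):
        others = [(levenshtein(s, t), t, lab)
                  for k, (t, lab) in enumerate(data) if k != idx]
        if not others:
            continue
        if min(others, key=lambda o: o[0])[2] == label:
            continue  # correctly classified by its nearest neighbor
        same = [o for o in others if o[2] == label]
        if same:  # add the mean of the sample and its same-class neighbor
            friend = min(same, key=lambda o: o[0])[1]
            out.append((approx_mean(s, friend), label))
    return out
```

Because the mean is reached by applying part of an optimal script, its distances to the two inputs add up to the distance between them. This sketch also recomputes every pairwise distance, so it illustrates the logic and not the speed that the paper reports.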

Files in this item:
File	Description	Size	Format
Abreu_Rico-Juan_2013_PatternRecognitLett_final.pdf	Final version (restricted access)	939.57 kB	Adobe PDF
Abreu_Rico-Juan_2013_PatternRecognitLett_preprint.pdf	Preprint (open access)	688.72 kB	Adobe PDF
