Boosting Perturbation-Based Iterative Algorithms to Compute the Median String

Please always use this identifier to cite or link to this item: http://hdl.handle.net/10045/141854
Item information
Title: Boosting Perturbation-Based Iterative Algorithms to Compute the Median String
Authors: Mirabal, Pedro | Abreu Salas, José Ignacio | Seco, Diego | Pedreira, Óscar | Chávez, Edgar
Research groups or GITE: Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática
Keywords: Approximate median string | Algorithm initialization | Half space proximal neighbors
Publication date: 23 December 2021
Publisher: IEEE
Bibliographic citation: IEEE Access. 2021, 9: 169299-169308. https://doi.org/10.1109/ACCESS.2021.3137767
Abstract: The most competitive heuristics for computing the median string are perturbation-based iterative algorithms. Because the problem is NP-hard under many formulations, computing an exact solution is not affordable. This work addresses the heuristic algorithms that solve this problem, with emphasis on their initialization and on the policy used to order the candidate edit operations. Both factors weigh significantly on the solution: the choice of initial string influences the algorithm's speed of convergence, as does the criterion used to select the modification made in each iteration. To obtain the initial string, we use the median of a subset of the original dataset; to obtain this subset, we apply the Half Space Proximal (HSP) test to the median of the dataset. This test provides sufficient diversity among the members of the subset while still fulfilling the centrality criterion. We also analyze the stop condition of the algorithm, improving its performance without substantially degrading the quality of the solution. To analyze the results of our experiments, we measured the execution time of each proposed modification of the algorithms, the number of edit distances computed, and the quality of the solution obtained. With these experiments, we empirically validated our proposal.
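Illustrative note: the abstract does not give the paper's pseudocode, so the following is only a minimal sketch of the ideas it names, under stated assumptions. It assumes unit-cost Levenshtein distance, uses the set median (the dataset member with the smallest sum of distances) both as the HSP center and as a stand-in for "the median of the subset", applies the commonly described HSP rule (take the nearest remaining candidate, then discard every candidate closer to it than to the center), and refines by exhaustive best-improvement over single edits, stopping when no edit lowers the cost. All function names (`levenshtein`, `set_median`, `hsp_neighbors`, `refine`) are hypothetical, and the authors' operation-ordering policy and stop condition are not reproduced here.

```python
def levenshtein(a, b):
    """Edit distance with unit costs for insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def total_cost(candidate, strings):
    """Sum of distances from candidate to every string in the dataset."""
    return sum(levenshtein(candidate, s) for s in strings)


def set_median(strings):
    """Dataset member with the smallest sum of distances (first one on ties)."""
    return min(strings, key=lambda s: total_cost(s, strings))


def hsp_neighbors(center, candidates):
    """HSP test: keep the nearest candidate, drop those closer to it than to center."""
    remaining = [c for c in candidates if c != center]
    selected = []
    while remaining:
        nearest = min(remaining, key=lambda c: levenshtein(center, c))
        selected.append(nearest)
        remaining = [c for c in remaining
                     if c != nearest
                     and levenshtein(c, center) <= levenshtein(c, nearest)]
    return selected


def refine(candidate, strings, alphabet, max_iters=100):
    """Greedy perturbation: apply the best single edit until none improves the cost."""
    cost = total_cost(candidate, strings)
    for _ in range(max_iters):
        best, best_cost = candidate, cost
        for i in range(len(candidate) + 1):
            # Insertions before position i.
            variants = [candidate[:i] + ch + candidate[i:] for ch in alphabet]
            if i < len(candidate):
                # Deletion and substitutions at position i.
                variants.append(candidate[:i] + candidate[i + 1:])
                variants += [candidate[:i] + ch + candidate[i + 1:] for ch in alphabet]
            for v in variants:
                c = total_cost(v, strings)
                if c < best_cost:
                    best, best_cost = v, c
        if best_cost >= cost:  # assumed stop condition: no improving edit found
            break
        candidate, cost = best, best_cost
    return candidate


data = ["aaaa", "aaab", "aabb", "bbbb"]
center = set_median(data)             # central member of the dataset
subset = hsp_neighbors(center, data)  # diverse neighbors around the center
initial = set_median(subset)          # assumed stand-in for the subset median
approx_median = refine(initial, data, alphabet="ab")
```

In this sketch the HSP subset only supplies the starting string; the refinement is always scored against the full dataset, so a better start mainly reduces the number of edit-distance computations needed before the loop stops.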
Sponsors: This work was supported in part by the Comisión Nacional de Investigación Científica y Tecnológica - Programa de Formación de Capital Humano Avanzado (CONICYT-PCHA)/Doctorado Nacional/2014-63140074 through the Ph.D. Scholarship, in part by the European Union's Horizon 2020 under the Marie Sklodowska-Curie under Grant 690941, in part by the Millennium Institute for Foundational Research on Data (IMFD), and in part by the FONDECYT-CONICYT under Grant 1170497. The work of Óscar Pedreira was supported in part by the Xunta de Galicia/FEDER-UE refs under Grant CSI ED431G/01 and Grant GRC: ED431C 2017/58, in part by the Office of the Vice President for Research and Postgraduate Studies of the Universidad Católica de Temuco, VIPUCT Project 2020EM-PS-08, and in part by the FEQUIP 2019-INRN-03 of the Universidad Católica de Temuco.
URI: http://hdl.handle.net/10045/141854
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3137767
Language: eng
Type: info:eu-repo/semantics/article
Rights: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Peer reviewed: yes
Publisher version: https://doi.org/10.1109/ACCESS.2021.3137767
Appears in collections: INV - GPLSI - Artículos de Revistas
Research funded by the EU

Files in this item:
File: Mirabal_etal_2021_IEEEAccess.pdf | Size: 1.02 MB | Format: Adobe PDF


All documents deposited in RUA are protected by copyright. Some rights reserved.