Detecting Misleading Headlines Through the Automatic Recognition of Contradiction in Spanish

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/136242
Item information
Title: Detecting Misleading Headlines Through the Automatic Recognition of Contradiction in Spanish
Author(s): Sepúlveda-Torres, Robiert | Bonet-Jover, Alba | Saquete Boró, Estela
Research group(s) or GITE: Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Centre, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Annotation Guideline | Contradiction Detection | Dataset Annotation | Deep Learning Techniques | Disinformation Detection | Human Language Technologies | Natural Language Processing
Publication date: 14 July 2023
Publisher: IEEE
Bibliographic citation: IEEE Access. 2023, 11: 72007-72026. https://doi.org/10.1109/ACCESS.2023.3295781
Abstract: Misleading headlines are part of the disinformation problem. Headlines should give a concise summary of the news story, helping the reader to decide whether to read the body text of the article, which is why headline accuracy is a crucial element of a news story. This work focuses on detecting misleading headlines through the automatic identification of contradiction between the headline and body text of a news item. When the contradiction is detected, the reader is alerted to the lack of precision or trustworthiness of the headline in relation to the body text. To facilitate the automatic detection of misleading headlines, a new Spanish dataset is created (ES_Headline_Contradiction) for the purpose of identifying contradictory information between a headline and its body text. This dataset annotates the semantic relationship between headlines and body text by categorising the relation between texts as compatible, contradictory and unrelated. Furthermore, another novel aspect of this dataset is that it distinguishes between different types of contradictions, thereby enabling a more fine-grain identification of them. The dataset was built via a novel semi-automatic methodology, which resulted in a more cost-efficient development process. The results of the experiments show that pre-trained language models can be fine-tuned with this dataset, producing very encouraging results for detecting incongruency or non-relation between headline and body text.
Sponsor(s): This research work is funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union” or by the “European Union NextGenerationEU/PRTR” through the project TRIVIAL: Technological Resources for Intelligent VIral AnaLysis through NLP (PID2021-122263OB-C22) and the project SOCIALTRUST: Assessing trustworthiness in digital media (PDC2022-133146-C22). Also funded by Generalitat Valenciana through the project NL4DISMIS: Natural Language Technologies for dealing with dis- and misinformation (CIPROM/2021/21), and the grant ACIF/2020/177.
URI: http://hdl.handle.net/10045/136242
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3295781
Language: eng
Type: info:eu-repo/semantics/article
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Peer reviewed: yes
Publisher's version: https://doi.org/10.1109/ACCESS.2023.3295781
Appears in collections: INV - GPLSI - Artículos de Revistas

Files in this item:
File: Sepulveda-Torres_etal_2023_IEEEAccess.pdf | Size: 1.24 MB | Format: Adobe PDF


This item is licensed under a Creative Commons License.