Applying Human-in-the-Loop to construct a dataset for determining content reliability to combat fake news

Bonet-Jover, Alba; Sepúlveda-Torres, Robiert; Saquete Boró, Estela; Martínez-Barco, Patricio; Piad-Morffis, Alejandro; Estévez-Velarde, Suilan

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/137336

Full metadata record

DC Field	Value	Language
dc.contributor	Procesamiento del Lenguaje y Sistemas de Información (GPLSI)	es_ES
dc.contributor.author	Bonet-Jover, Alba	-
dc.contributor.author	Sepúlveda-Torres, Robiert	-
dc.contributor.author	Saquete Boró, Estela	-
dc.contributor.author	Martínez-Barco, Patricio	-
dc.contributor.author	Piad-Morffis, Alejandro	-
dc.contributor.author	Estévez-Velarde, Suilan	-
dc.contributor.other	Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos	es_ES
dc.contributor.other	Universidad de Alicante. Instituto Universitario de Investigación Informática	es_ES
dc.date.accessioned	2023-09-20T11:15:05Z	-
dc.date.available	2023-09-20T11:15:05Z	-
dc.date.issued	2023-09-20	-
dc.identifier.citation	Engineering Applications of Artificial Intelligence. 2023, 126(Part D): 107152. https://doi.org/10.1016/j.engappai.2023.107152	es_ES
dc.identifier.issn	0952-1976 (Print)	-
dc.identifier.issn	1873-6769 (Online)	-
dc.identifier.uri	http://hdl.handle.net/10045/137336	-
dc.description.abstract	Annotated corpora are indispensable tools for training computational models in Natural Language Processing. However, for more complex semantic annotation processes, annotation is a costly, arduous, and time-consuming task, resulting in a shortage of resources to train Machine Learning and Deep Learning algorithms. In response, this work proposes a methodology, based on the human-in-the-loop paradigm, for semi-automatic annotation of complex tasks. This methodology is applied to the construction of a reliability dataset of Spanish news in order to combat disinformation and fake news. Implementing the proposed methodology for semi-automatic annotation yields a high-quality resource while increasing annotator efficacy and speed, with fewer examples. The methodology consists of three incremental phases and results in the construction of the RUN dataset. The resource was evaluated through time reduction (annotation time reduced by almost 64% with respect to fully manual annotation), annotation quality (measuring consistency of annotation and inter-annotator agreement), and performance when training a model with the semi-automatic RUN dataset (accuracy 95%, F1 95%), validating the suitability of the proposal.	es_ES
dc.description.sponsorship	This research work is funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by “ERDF A way of making Europe”, by the “European Union” or by the “European Union NextGenerationEU/PRTR” through the project TRIVIAL: Technological Resources for Intelligent VIral AnaLysis through NLP (PID2021-122263OB-C22) and the project SOCIALTRUST: Assessing trustworthiness in digital media (PDC2022-133146-C22). It is also funded by Generalitat Valenciana, Spain through the project NL4DISMIS: Natural Language Technologies for dealing with dis- and misinformation (CIPROM/2021/21), and the grant ACIF/2020/177.	es_ES
dc.language	eng	es_ES
dc.publisher	Elsevier	es_ES
dc.rights	© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).	es_ES
dc.subject	Natural language processing	es_ES
dc.subject	Fake news detection	es_ES
dc.subject	Assisted annotation	es_ES
dc.subject	Dataset construction	es_ES
dc.subject	Human-in-the-Loop Artificial Intelligence	es_ES
dc.subject	Active learning	es_ES
dc.title	Applying Human-in-the-Loop to construct a dataset for determining content reliability to combat fake news	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.peerreviewed	yes	es_ES
dc.identifier.doi	10.1016/j.engappai.2023.107152	-
dc.relation.publisherversion	https://doi.org/10.1016/j.engappai.2023.107152	es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122263OB-C22	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2022-133146-C22	es_ES
Appears in collections:	INV - GPLSI - Artículos de Revistas

Files in this item:

File	Description	Size	Format
Bonet-Jover_etal_2023_EngApplArtifIntellig.pdf		2.08 MB	Adobe PDF

This item is licensed under a Creative Commons License