SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/86730
Item information
Title: SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation
Author(s): Alcón, Óscar | Lloret, Elena
Research group(s) or GITE: Procesamiento del Lenguaje y Sistemas de Información (GPLSI)
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Natural language processing | Human language technologies | Intelligent information processing | Automatic text summarization | Principal component analysis
Knowledge area(s): Lenguajes y Sistemas Informáticos
Publication date: 2018
Publisher: Slovak Academy of Sciences. Institute of Informatics
Bibliographic citation: Computing and Informatics. 2018, 37: 1126-1148. doi:10.4149/cai_2018_5_1126
Abstract: Text summarization is the task of condensing a document while keeping its relevant information. Integrated into wider information systems, it can help users access key information without having to read everything, increasing efficiency. In this research work, we have developed and evaluated a single-document extractive summarization approach, named SemPCA-Summarizer, which reduces the dimension of a document using the Principal Component Analysis (PCA) technique enriched with semantic information. A concept-sentence matrix is built from the input text document, and PCA is then used to identify and rank the relevant concepts. These concepts are used to select the most important sentences through different heuristics, leading to various types of summaries. The results show that the generated summaries are very competitive, from both a quantitative and a qualitative viewpoint, indicating that the proposed approach is appropriate for briefly providing key information and thus helps users cope more quickly and efficiently with the huge amount of information available.
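As a rough illustration of the pipeline the abstract outlines (concept-sentence matrix, PCA to rank concepts, a heuristic to pick sentences), the following is a minimal sketch and not the authors' implementation. Several choices are assumptions made only for this example: lowercase words longer than three letters stand in for semantic concepts, only the first principal component is used (found by power iteration), and the selection heuristic sums the absolute loadings of the concepts in each sentence. All function names are hypothetical.

```python
import math
import re


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def concepts_of(sentence):
    # Assumption: plain words longer than 3 letters stand in for the
    # semantic concepts used in the paper.
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if len(w) > 3]


def first_principal_component(rows, iterations=200):
    # rows: observations (sentences) x variables (concepts).
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Deterministic, non-uniform start vector, normalized to unit length.
    v = [float(j + 1) for j in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    # Power iteration on X^T X (proportional to the covariance matrix).
    for _ in range(iterations):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along v; keep the current vector
            break
        v = [x / norm for x in w]
    return v


def summarize(text, num_sentences=2):
    sentences = split_sentences(text)
    per_sentence = [concepts_of(s) for s in sentences]
    vocab = sorted({c for cs in per_sentence for c in cs})
    if not vocab or len(sentences) <= num_sentences:
        return sentences
    index = {c: j for j, c in enumerate(vocab)}
    # Concept-sentence matrix: one row per sentence, one column per concept.
    matrix = [[0.0] * len(vocab) for _ in sentences]
    for i, cs in enumerate(per_sentence):
        for c in cs:
            matrix[i][index[c]] += 1.0
    loadings = first_principal_component(matrix)
    # Rank concepts by absolute loading on the first component.
    weight = {c: abs(loadings[index[c]]) for c in vocab}
    # Assumed heuristic: a sentence scores the sum of its concepts' weights.
    scores = [sum(weight[c] for c in set(cs)) for cs in per_sentence]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(top[:num_sentences])]
```

Calling `summarize(document_text, num_sentences=3)` returns three sentences from the input in their original order. Swapping the scoring line for another heuristic would yield a different type of summary, which is the kind of variation the abstract refers to.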
Sponsor(s): This research work has been partially funded by the Generalitat Valenciana and the Spanish Government through the projects PROMETEOII/2014/001, TIN2015-65100-R, and TIN2015-65136-C2-2-R.
URI: http://hdl.handle.net/10045/86730
ISSN: 1335-9150 (Print) | 2585-8807 (Online)
DOI: 10.4149/cai_2018_5_1126
Language: eng
Type: info:eu-repo/semantics/article
Rights: © Computing and Informatics
Peer reviewed: yes
Publisher's version: https://doi.org/10.4149/cai_2018_5_1126
Appears in collections: INV - GPLSI - Artículos de Revistas

Files in this item:
File: 2018_Alcon_Lloret_CompInform.pdf
Size: 595.82 kB
Format: Adobe PDF


All documents in RUA are protected by copyright. Some rights reserved.