Assisting non-expert speakers of under-resourced languages in assigning stems and inflectional paradigms to new word entries of morphological dictionaries

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/71353
Item information
Title: Assisting non-expert speakers of under-resourced languages in assigning stems and inflectional paradigms to new word entries of morphological dictionaries
Author(s): Esplà-Gomis, Miquel | Carrasco, Rafael C. | Sánchez-Cartagena, Víctor M. | Forcada, Mikel L. | Sánchez-Martínez, Felipe | Pérez-Ortiz, Juan Antonio
Research group(s) or GITE: Transducens
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Enlargement of morphological dictionaries | Knowledge elicitation | Resource development for under-resourced languages | Machine translation
Knowledge area(s): Lenguajes y Sistemas Informáticos
Publication date: Dec-2017
Publisher: Springer Science+Business Media Dordrecht
Citation: Language Resources and Evaluation. 2017, 51(4): 989-1017. doi:10.1007/s10579-016-9360-9
Abstract: This paper presents a new method with which to assist individuals with no background in linguistics to create monolingual dictionaries such as those used by the morphological analysers of many natural language processing applications. The involvement of non-expert users is especially critical for under-resourced languages which either lack or cannot afford the recruitment of a skilled workforce. Adding a word to a morphological dictionary usually requires identifying its stem along with the inflection paradigm that can be used in order to generate all the word forms of the new entry. Our method works under the assumption that the average speakers of a language can successfully answer the polar question “is x a valid form of the word w to be inserted?”, where x represents tentative alternative (inflected) forms of the new word w. The experiments show that with a small number of polar questions the correct stem and paradigm can be obtained from non-experts with high success rates. We study the impact of different heuristic and probabilistic approaches on the actual number of questions.
Sponsor(s): This work has been partially funded by the Spanish Ministry of Science & Innovation through project TIN2009-14009-C02-01, by the Spanish Ministry of Economy & Competitiveness through Project TIN2012-32615, by the Generalitat Valenciana through grant ACIF/2010/174 from VALi+d programme, and by the European Commission through Project PIAP-GA-2012-324414 (Abu-MaTran).
URI: http://hdl.handle.net/10045/71353
ISSN: 1574-020X (Print) | 1574-0218 (Online)
DOI: 10.1007/s10579-016-9360-9
Language: eng
Type: info:eu-repo/semantics/article
Rights: © Springer Science+Business Media Dordrecht 2016
Peer reviewed: yes
Publisher's version: http://dx.doi.org/10.1007/s10579-016-9360-9
Appears in collections: INV - TRANSDUCENS - Artículos de Revistas
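
Illustrative sketch (not part of the repository record): the abstract describes eliciting a stem and paradigm by asking a speaker polar questions about tentative inflected forms. The Python sketch below shows one way such a loop could look. Everything in it is an assumption made for illustration: the toy `PARADIGMS` table, the helper names `candidates_for` and `elicit`, and the greedy choice of the form that splits the remaining candidates most evenly. The paper studies several heuristic and probabilistic strategies, and this is not claimed to be any of them.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Candidate = Tuple[str, str]  # (stem, paradigm name); both are hypothetical labels

# Hypothetical toy paradigms: each name maps to the suffixes it can generate.
PARADIGMS: Dict[str, List[str]] = {
    "noun_s": ["", "s"],
    "noun_es": ["", "es"],
    "verb_reg": ["", "s", "ed", "ing"],
}


def candidates_for(word: str) -> Dict[Candidate, FrozenSet[str]]:
    """List every (stem, paradigm) pair that can generate `word`, with its forms."""
    result: Dict[Candidate, FrozenSet[str]] = {}
    for name, suffixes in PARADIGMS.items():
        for suffix in suffixes:
            if word.endswith(suffix):
                stem = word[: len(word) - len(suffix)]
                if stem:
                    result[(stem, name)] = frozenset(stem + s for s in suffixes)
    return result


def elicit(word: str, is_valid: Callable[[str], bool]) -> Tuple[List[Candidate], int]:
    """Ask polar questions until one candidate remains or none can be separated.

    `is_valid(x)` stands in for the speaker answering
    "is x a valid form of the word to be inserted?".
    Returns the surviving candidates and the number of questions asked.
    """
    remaining = candidates_for(word)
    asked = 0
    while len(remaining) > 1:
        def support(form: str) -> int:
            # Number of remaining candidates that generate this form.
            return sum(form in forms for forms in remaining.values())

        all_forms = set().union(*remaining.values())
        # A form is worth asking about only if candidates disagree on it.
        useful = sorted(f for f in all_forms if 0 < support(f) < len(remaining))
        if not useful:
            break  # remaining candidates generate identical form sets
        # Greedy assumption: prefer the form closest to an even yes/no split.
        question = min(useful, key=lambda f: abs(2 * support(f) - len(remaining)))
        answer = is_valid(question)
        asked += 1
        remaining = {
            cand: forms
            for cand, forms in remaining.items()
            if (question in forms) == answer
        }
    return sorted(remaining), asked


if __name__ == "__main__":
    # Simulated speaker who knows the true forms of the verb "walk".
    true_forms = {"walk", "walks", "walked", "walking"}
    print(elicit("walks", lambda form: form in true_forms))
```

In this toy setting the simulated speaker is asked about "walk" and then "walked", after which only the stem "walk" with the hypothetical `verb_reg` paradigm remains.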

Files in this item:
File | Description | Size | Format
2017_Espla-Gomis_etal_LangResources&Evaluation_final.pdf | Final version (restricted access) | 419.11 kB | Adobe PDF
2017_Espla-Gomis_etal_LangResources&Evaluation_postprint.pdf | Revised version (open access) | 225.04 kB | Adobe PDF


All documents in RUA are protected by copyright. Some rights reserved.