Assisting non-expert speakers of under-resourced languages in assigning stems and inflectional paradigms to new word entries of morphological dictionaries
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/71353
Title: | Assisting non-expert speakers of under-resourced languages in assigning stems and inflectional paradigms to new word entries of morphological dictionaries |
---|---|
Author(s): | Esplà-Gomis, Miquel | Carrasco, Rafael C. | Sánchez-Cartagena, Víctor M. | Forcada, Mikel L. | Sánchez-Martínez, Felipe | Pérez-Ortiz, Juan Antonio |
Research group(s) or GITE: | Transducens |
Centre, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | Enlargement of morphological dictionaries | Knowledge elicitation | Resource development for under-resourced languages | Machine translation |
Knowledge area(s): | Lenguajes y Sistemas Informáticos |
Publication date: | Dec-2017 |
Publisher: | Springer Science+Business Media Dordrecht |
Bibliographic citation: | Language Resources and Evaluation. 2017, 51(4): 989-1017. doi:10.1007/s10579-016-9360-9 |
Abstract: | This paper presents a new method with which to assist individuals with no background in linguistics to create monolingual dictionaries such as those used by the morphological analysers of many natural language processing applications. The involvement of non-expert users is especially critical for under-resourced languages which either lack or cannot afford the recruitment of a skilled workforce. Adding a word to a morphological dictionary usually requires identifying its stem along with the inflection paradigm that can be used in order to generate all the word forms of the new entry. Our method works under the assumption that the average speakers of a language can successfully answer the polar question “is x a valid form of the word w to be inserted?”, where x represents tentative alternative (inflected) forms of the new word w. The experiments show that with a small number of polar questions the correct stem and paradigm can be obtained from non-experts with high success rates. We study the impact of different heuristic and probabilistic approaches on the actual number of questions. |
Sponsor(s): | This work has been partially funded by the Spanish Ministry of Science & Innovation through project TIN2009-14009-C02-01, by the Spanish Ministry of Economy & Competitiveness through Project TIN2012-32615, by the Generalitat Valenciana through grant ACIF/2010/174 from VALi+d programme, and by the European Commission through Project PIAP-GA-2012-324414 (Abu-MaTran). |
URI: | http://hdl.handle.net/10045/71353 |
ISSN: | 1574-020X (Print) | 1574-0218 (Online) |
DOI: | 10.1007/s10579-016-9360-9 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © Springer Science+Business Media Dordrecht 2016 |
Peer reviewed: | yes |
Publisher's version: | http://dx.doi.org/10.1007/s10579-016-9360-9 |
Appears in collections: | INV - TRANSDUCENS - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
2017_Espla-Gomis_etal_LangResources&Evaluation_final.pdf | Final version (restricted access) | 419.11 kB | Adobe PDF | |
2017_Espla-Gomis_etal_LangResources&Evaluation_postprint.pdf | Revised version (open access) | 225.04 kB | Adobe PDF | |
All documents in RUA are protected by copyright. Some rights reserved.