A trigram part-of-speech tagger for the Apertium free/open-source machine translation platform
Always use this identifier to cite or link to this item
http://hdl.handle.net/10045/12032
Title: | A trigram part-of-speech tagger for the Apertium free/open-source machine translation platform |
---|---|
Authors: | Sheikh, Zaid Md Abdul Wahab | Sánchez-Martínez, Felipe |
Research groups or GITE: | Transducens |
Centre, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | Hidden Markov Model | Part-of-speech tagger | Machine translation | Apertium |
Knowledge areas: | Lenguajes y Sistemas Informáticos |
Publication date: | November 2009 |
Publisher: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Bibliographic citation: | SHEIKH, Zaid Md Abdul Wahab; SÁNCHEZ-MARTÍNEZ, Felipe. "A trigram part-of-speech tagger for the Apertium free/open-source machine translation platform". In: Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation / Edited by Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Francis M. Tyers. Alicante : Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos, 2009, pp. 67-74 |
Abstract: | This paper describes the implementation of a second-order hidden Markov model (HMM) based part-of-speech tagger for the Apertium free/open-source rule-based machine translation platform. We describe the part-of-speech (PoS) tagging approach in Apertium and how it is parametrised through a tagger definition file that defines: (1) the set of tags to be used and (2) constraint rules that can be used to forbid certain PoS tag sequences, thus refining the HMM parameters and increasing its tagging accuracy. The paper also reviews the Baum-Welch algorithm used to estimate the HMM parameters and compares the tagging accuracy achieved with that achieved by the original, first-order HMM-based PoS tagger in Apertium. |
Sponsors: | Google Summer of Code 2009 program, and the Spanish Ministry of Science and Innovation under project TIN2009-14009-C02-01. |
URI: | http://hdl.handle.net/10045/12032 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Peer reviewed: | yes |
Appears in collection: | Freerbmt09 - Ponencias INV - TRANSDUCENS - Comunicaciones a Congresos, Conferencias, etc. |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
paper9.pdf | | 239.77 kB | Adobe PDF | |
All documents deposited in RUA are protected by copyright. Some rights reserved.