An open-source shallow-transfer machine translation toolbox: consequences of its release and availability
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/27526
Title: | An open-source shallow-transfer machine translation toolbox: consequences of its release and availability |
---|---|
Authors: | Armentano Oller, Carme | Corbí Bellot, Antonio Miguel | Forcada, Mikel L. | Ginestí Rosell, Mireia | Bonev, Boyan | Ortiz Rojas, Sergio | Pérez-Ortiz, Juan Antonio | Ramírez Sánchez, Gema | Sánchez-Martínez, Felipe |
Research groups or GITE: | Transducens | Laboratorio de Investigación en Visión Móvil (MVRLab) |
Centre, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | Machine translation | Shallow-transfer | Open-source |
Knowledge areas: | Lenguajes y Sistemas Informáticos |
Publication date: | September 2005 |
Publisher: | OSMaTran |
Bibliographic citation: | ARMENTANO-OLLER, Carme, et al. "An open-source shallow-transfer machine translation toolbox: consequences of its release and availability". In: OSMaTran: Open-Source Machine Translation, A workshop at Machine Translation Summit X, September 12-16, 2005, Phuket, Thailand, p. 23-30 |
Abstract: | By the time Machine Translation Summit X is held in September 2005, our group will have released an open-source machine translation toolbox as part of a large government-funded project involving four universities and three linguistic technology companies from Spain. The machine translation toolbox, which will most likely be released under a GPL-like license, includes (a) the open-source engine itself, a modular shallow-transfer machine translation engine suitable for related languages and largely based upon that of systems we have already developed, such as interNOSTRUM for Spanish–Catalan and Traductor Universia for Spanish–Portuguese, (b) extensive documentation (including document type declarations) specifying the XML format of all linguistic (dictionaries, rules) and document format management files, (c) compilers converting these data into the high-speed (tens of thousands of words a second) format used by the engine, and (d) pilot linguistic data for Spanish–Catalan and Spanish–Galician and format management specifications for the HTML, RTF and plain text formats. After very briefly describing this toolbox, this paper explores possible consequences of the availability of this architecture, including the community-driven development of machine translation systems for languages lacking this kind of linguistic technology. |
Sponsors: | The development of the toolbox is funded by project FIT-340101-2004-3 (Spanish Ministry of Industry, Commerce and Tourism). |
URI: | http://hdl.handle.net/10045/27526 |
Language: | eng |
Type: | info:eu-repo/semantics/conferenceObject |
Peer reviewed: | yes |
Appears in collections: | INV - TRANSDUCENS - Comunicaciones a Congresos, Conferencias, etc. | INV - MVRLab - Comunicaciones a Congresos, Conferencias, etc. |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
armentano05p.pdf | | 192.92 kB | Adobe PDF | |
All documents deposited in RUA are protected by copyright. Some rights reserved.