Retrieving Music Semantics from Optical Music Recognition by Machine Translation

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/109930
Item information
Title: Retrieving Music Semantics from Optical Music Recognition by Machine Translation
Author(s): Thomae, Martha E. | Ríos-Vila, Antonio | Calvo-Zaragoza, Jorge | Rizo, David | Iñesta, José M.
Research group(s): Reconocimiento de Formas e Inteligencia Artificial
Center, Department, or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Music semantics | Optical music recognition | Machine translation
Knowledge area(s): Lenguajes y Sistemas Informáticos
Publication date: 2020
Publisher: Tufts University
Bibliographic citation: Thomae, Martha E., et al. "Retrieving Music Semantics from Optical Music Recognition by Machine Translation". In: De Luca, Elsa; Flanders, Julia (Eds.), Music Encoding Conference Proceedings, 26-29 May 2020, Tufts University, Boston (USA), pp. 19-24. https://doi.org/10.17613/605z-nt78
Abstract: In this paper, we apply machine translation techniques to solve one of the central problems in the field of optical music recognition: extracting the semantics of a sequence of music characters. So far, this problem has been approached through heuristics and grammars, which are not generalizable solutions. We borrow the seq2seq model and the attention mechanism from machine translation to address this issue. Because it learns from examples, the proposed model should apply to different notations provided there is enough training data. The model was tested on the PrIMuS dataset of common Western music notation incipits. Its performance was satisfactory for the vast majority of examples, flawlessly extracting the musical meaning of 85% of the incipits in the test set: correctly mapping series of accidentals into key signatures, pairs of digits into time signatures, and combinations of digits and rests into multi-measure rests; detecting implicit accidentals; and so on.
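The abstract describes a seq2seq model with an attention mechanism that maps a sequence of recognized music symbols to its semantic encoding. The following minimal PyTorch sketch illustrates that general architecture only; the layer sizes, GRU choice, additive-attention form, and vocabulary sizes are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Seq2SeqWithAttention(nn.Module):
    """Encoder-decoder with additive attention, mapping a sequence of
    'agnostic' symbol tokens to 'semantic' tokens. All dimensions and
    vocabulary sizes below are hypothetical placeholders."""

    def __init__(self, src_vocab, tgt_vocab, emb=64, hid=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb + hid, hid, batch_first=True)
        self.attn = nn.Linear(2 * hid, 1)   # additive attention score
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src, tgt):
        enc_out, h = self.encoder(self.src_emb(src))      # (B, S, H)
        logits = []
        for t in range(tgt.size(1)):                      # teacher forcing
            # Score every encoder state against the current decoder state.
            query = h[-1].unsqueeze(1).expand(-1, enc_out.size(1), -1)
            scores = self.attn(torch.cat([enc_out, query], dim=-1))
            weights = torch.softmax(scores, dim=1)        # (B, S, 1)
            context = (weights * enc_out).sum(dim=1, keepdim=True)
            step_in = torch.cat([self.tgt_emb(tgt[:, t:t + 1]), context],
                                dim=-1)
            dec_out, h = self.decoder(step_in, h)
            logits.append(self.out(dec_out))
        return torch.cat(logits, dim=1)                   # (B, T, V)

# Toy usage: e.g. a run of accidental tokens decoded as a key signature.
model = Seq2SeqWithAttention(src_vocab=100, tgt_vocab=80)
src = torch.randint(0, 100, (2, 12))   # batch of agnostic sequences
tgt = torch.randint(0, 80, (2, 10))    # batch of semantic sequences
print(model(src, tgt).shape)           # torch.Size([2, 10, 80])
```

The attention step is what lets one output token (e.g. a key signature) depend on a variable-length span of input symbols (a series of accidentals), which is the crux of the semantic-extraction task the paper addresses.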
Sponsor(s): This work is supported by the Spanish Ministry's HISPAMUS project TIN2017-86576-R, partially funded by the EU, and by CIRMMT's Inter-Centre Research Exchange Funding and McGill's Graduate Mobility Award.
URI: http://hdl.handle.net/10045/109930
DOI: 10.17613/605z-nt78
Language: eng
Type: info:eu-repo/semantics/conferenceObject
Rights: Creative Commons Attribution-NonCommercial-NoDerivatives License
Peer reviewed: yes
Publisher's version: https://doi.org/10.17613/605z-nt78
Appears in collections: INV - GRFIA - Comunicaciones a Congresos, Conferencias, etc.

Files in this item:
File: Thomae_etal_2020_Music_encoding_conference_proceedings.pdf (545.63 kB, Adobe PDF)


This item is licensed under a Creative Commons License.