Optical music recognition for homophonic scores with neural networks and synthetic music generation
Please always use this identifier to cite or link to this item
http://hdl.handle.net/10045/134683
Title: | Optical music recognition for homophonic scores with neural networks and synthetic music generation |
---|---|
Authors: | Alfaro-Contreras, María | Iñesta, José M. | Calvo-Zaragoza, Jorge |
Research groups or GITE: | Reconocimiento de Formas e Inteligencia Artificial |
Center, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática |
Keywords: | Optical music recognition | Deep learning | End-to-end recognition | Music encoding |
Publication date: | 26 May 2023 |
Publisher: | Springer Nature |
Bibliographic citation: | International Journal of Multimedia Information Retrieval. 2023, 12:12. https://doi.org/10.1007/s13735-023-00278-5 |
Abstract: | The recognition of patterns that have a time dependency is common in areas like speech recognition or natural language processing. The equivalent situation in image analysis is present in tasks like text or video recognition. Recently, Convolutional Recurrent Neural Networks (CRNN) have been broadly applied to solve these tasks in an end-to-end fashion with successful performance. However, their application to Optical Music Recognition (OMR) is not so straightforward due to the presence of different elements sharing the same horizontal position, disrupting the linear flow of the timeline. In this paper, we study the ability of the state-of-the-art CRNN approach to learn codes that represent this disruption in homophonic scores. In our experiments, we study the lower bounds in the recognition task of real scores when the models are trained with synthetic data. Two relevant conclusions are drawn: (1) Our serialized ways of encoding the music content are appropriate for CRNN-based OMR; (2) the learning process is possible with synthetic data, but there exists a glass ceiling when recognizing real sheet music. |
Sponsors: | Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This paper is part of the I+D+i PID2020-118447RA-I00 (MultiScore) project, funded by MCIN/AEI/10.13039/501100011033. The first author is supported by grant FPU19/04957 from the Spanish Ministerio de Universidades. |
URI: | http://hdl.handle.net/10045/134683 |
ISSN: | 2192-6611 (Print) | 2192-662X (Online) |
DOI: | 10.1007/s13735-023-00278-5 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.1007/s13735-023-00278-5 |
Appears in collection: | INV - GRFIA - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
Alfaro-Contreras_etal_2023_IntJMultimedInfoRetr.pdf | | 1.17 MB | Adobe PDF | |
This item is licensed under a Creative Commons License