A Preliminary Study of Few-shot Learning for Layout Analysis of Music Scores

Always use this identifier to cite or link to this item: http://hdl.handle.net/10045/138500
Item information
Title: A Preliminary Study of Few-shot Learning for Layout Analysis of Music Scores
Authors: Castellanos, Francisco J. | Gallego, Antonio-Javier | Fujinaga, Ichiro
Research groups or GITE: Reconocimiento de Formas e Inteligencia Artificial
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos | Universidad de Alicante. Instituto Universitario de Investigación Informática
Keywords: Few-shot learning | Layout analysis | Music scores | Optical Music Recognition
Publication date: November 2023
Publisher: International Workshop on Reading Music Systems
Bibliographic citation: Castellanos, Francisco J.; Gallego, Antonio Javier; Fujinaga, Ichiro. “A Preliminary Study of Few-shot Learning for Layout Analysis of Music Scores”. In: Calvo-Zaragoza, Jorge; Pacha, Alexander; Shatri, Elona (Eds.). Proceedings of the 5th International Workshop on Reading Music Systems: 4th November, 2023, Milan, Italy, pp. 44-48
Abstract: Few-shot techniques offer a promising avenue to reduce the high demand for annotated data required by current machine learning-based applications, such as Optical Music Recognition (OMR). This is a field dedicated to the automatic transcription of music notation from sheet music images. Traditional OMR systems strongly depend on layout analysis, a crucial step involving the identification and segmentation of several components within a music score, such as staff lines, text, or notes. The standard approach requires extensive fully annotated training data, which are resource-intensive and time-consuming to label and curate by domain experts. We present a preliminary study on the use of few-shot learning to alleviate the disadvantages associated with manual annotations. The proposal minimizes the human effort required by employing only partial annotations. For this, we introduce an oversampling technique to train models using a limited set of annotated patches extracted from the score images. Our experimental findings, conducted on four benchmark datasets, underscore the efficacy of the proposed patch extraction. Despite operating with a reduced amount of annotated data, our method achieves performance levels competitive with models trained on the complete dataset. This work points out the potential of few-shot learning in the context of layout analysis for music scores, offering the promise of more efficient and accessible OMR systems.
Sponsors: This research was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033, the Social Sciences and Humanities Research Council (895-2013-1012) and the Fonds de recherche du Québec-Société et Culture (2022-SE3-303927).
URI: http://hdl.handle.net/10045/138500
Language: eng
Type: info:eu-repo/semantics/conferenceObject
Rights: © The respective authors. Licensed under a Creative Commons Attribution 4.0 International License (CC-BY-4.0).
Peer reviewed: yes
Publisher's version: https://doi.org/10.48550/arXiv.2311.04091
Appears in collection: INV - GRFIA - Comunicaciones a Congresos, Conferencias, etc.

All documents deposited in RUA are protected by copyright. Some rights reserved.