Domain adaptation for staff-region retrieval of music score images

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/126680
Item information
Title: Domain adaptation for staff-region retrieval of music score images
Authors: Castellanos, Francisco J. | Gallego, Antonio-Javier | Calvo-Zaragoza, Jorge | Fujinaga, Ichiro
Research Group/s: Reconocimiento de Formas e Inteligencia Artificial
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Unsupervised domain adaptation | Staff retrieval | Music score images | Optical music recognition
Issue Date: 10-Sep-2022
Publisher: Springer Nature
Citation: International Journal on Document Analysis and Recognition (IJDAR). 2022, 25: 281-292. https://doi.org/10.1007/s10032-022-00411-w
Abstract: Optical music recognition (OMR) is the field that studies how to automatically read music notation from score images. One of the relevant steps within the OMR workflow is staff-region retrieval. This process is a key step because any undetected staff will not be processed by the subsequent steps. This task has previously been addressed as a supervised learning problem in the literature; however, ground-truth data are not always available, so each new manuscript requires a preliminary manual annotation. This situation is one of the main bottlenecks in OMR, because of the countless number of existing manuscripts, and the associated manual labeling cost. With the aim of mitigating this issue, we propose the application of a domain adaptation technique, the so-called Domain-Adversarial Neural Network (DANN), based on a combination of a gradient reversal layer and a domain classifier in the inference neural architecture. The results from our experiments support the benefits of our proposed solution, obtaining improvements of approximately 29% in the F-score.
Sponsor: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This paper is part of the I+D+i PID2020-118447RA-I00 (MultiScore) project funded by MCIN/AEI/10.13039/501100011033. The first author acknowledges support from the “Programa I+D+i de la Generalitat Valenciana” through grants ACIF/2019/042 and CIBEFP/2021/72. This work also draws on research supported by the Social Sciences and Humanities Research Council (895-2013-1012) and the Fonds de recherche du Québec-Société et Culture (2022-SE3-303927).
URI: http://hdl.handle.net/10045/126680
ISSN: 1433-2833 (Print) | 1433-2825 (Online)
DOI: 10.1007/s10032-022-00411-w
Language: eng
Type: info:eu-repo/semantics/article
Rights: © The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Peer Review: yes
Publisher version: https://doi.org/10.1007/s10032-022-00411-w
Appears in Collections: INV - GRFIA - Artículos de Revistas
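
Note on the abstract: the gradient reversal layer it mentions can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation; the class name `GradientReversal` and the weighting factor `lam` are assumptions introduced here for illustration only. The layer acts as the identity in the forward pass and negates (and scales) the gradient coming back from the domain classifier, so the feature extractor is pushed toward features that do not distinguish the source domain from the target domain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradientReversal:
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    lam: float = 1.0  # hypothetical weighting factor, not taken from the paper

    def forward(self, features: List[float]) -> List[float]:
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier: List[float]) -> List[float]:
        # Flip the sign so that the feature extractor is updated to make the
        # domain classifier's task harder rather than easier.
        return [-self.lam * g for g in grad_from_domain_classifier]


grl = GradientReversal(lam=0.5)
forwarded = grl.forward([0.2, -1.0, 3.0])
reversed_grad = grl.backward([1.0, -2.0, 4.0])
```

In a full network this layer would sit between the shared feature extractor and the domain classifier, while the staff-retrieval branch receives the same features without any reversal.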

Files in This Item:
File: Castellanos_etal_2022_IJDAR.pdf | Size: 4.59 MB | Format: Adobe PDF


This item is licensed under a Creative Commons License.