Garrido Muñoz, Carlos; Sánchez Hernández, Adrián; Castellanos, Francisco J.; Calvo-Zaragoza, Jorge
Domain Adaptation for Document Image Binarization via Domain Classification
Garrido-Munoz, Carlos, et al. “Domain Adaptation for Document Image Binarization via Domain Classification”. In: Tallón-Ballesteros, Antonio J. (Ed.). Modern Management based on Big Data II and Machine Learning and Intelligent Systems III. Proceedings of MMBD 2021 and MLIS 2021. Amsterdam: IOS Press BV, 2021. ISBN 978-1-64368-224-2, pp. 569-582
URI: http://hdl.handle.net/10045/121474
DOI: 10.3233/FAIA210289
ISBN: 978-1-64368-224-2
Abstract: Binarization plays a key role in many document image analysis workflows. The current state of the art relies on supervised learning, specifically deep neural networks. However, it is very difficult for the same model to work successfully across many document styles, since the set of potential domains is very heterogeneous. We study a multi-source domain adaptation strategy for binarization. Within this scenario, we examine a novel hypothesis: that a specialized binarization model should be selected for each target domain, instead of relying on a single model that tries to generalize across multiple domains. The problem then boils down to deciding, given several specialized models and a new target set, which model to use. We propose a simple way to address this question: a domain classifier that estimates which of the source models should be used to binarize the new target domain. Our experiments on several datasets, including different text styles and music scores, show that our initial hypothesis is promising, although the mechanism for deciding which model to use still leaves considerable room for improvement.
Keywords: Document Image Binarization, Deep Neural Networks, Unsupervised Domain Adaptation, Domain Classifier
IOS Press
info:eu-repo/semantics/bookPart
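
The selection strategy outlined in the abstract (several specialized source models, plus a domain classifier that decides which one to apply to a new target set) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how per-image classifier outputs are combined, so averaging probabilities over the target set is an assumption made here. All names (`select_source_model`, `binarize_target`, `make_threshold_binarizer`, `toy_domain_classifier`) are hypothetical, and the fixed-threshold binarizers and mean-intensity classifier are toy stand-ins for the deep neural networks the abstract refers to.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[float]]            # grayscale image, intensities in [0, 1]
Mask = List[List[int]]               # 1 = ink (foreground), 0 = background
Binarizer = Callable[[Image], Mask]
DomainClassifier = Callable[[Image], Sequence[float]]  # one score per source


def select_source_model(
    target_images: Sequence[Image],
    domain_classifier: DomainClassifier,
    source_names: Sequence[str],
) -> str:
    """Pick the source domain with the highest summed score over the target set.

    Assumption: scores are aggregated by summing (equivalent to averaging)
    the classifier's per-image outputs; the source does not specify this.
    """
    if not target_images:
        raise ValueError("target set is empty")
    totals = [0.0] * len(source_names)
    for image in target_images:
        for i, score in enumerate(domain_classifier(image)):
            totals[i] += score
    best = max(range(len(source_names)), key=lambda i: totals[i])
    return source_names[best]


def binarize_target(
    target_images: Sequence[Image],
    domain_classifier: DomainClassifier,
    binarizers: Dict[str, Binarizer],
) -> Tuple[str, List[Mask]]:
    """Choose one specialized model for the whole target set, then apply it."""
    # Dict insertion order must match the order of the classifier's outputs.
    source_names = list(binarizers)
    chosen = select_source_model(target_images, domain_classifier, source_names)
    model = binarizers[chosen]
    return chosen, [model(image) for image in target_images]


def make_threshold_binarizer(threshold: float) -> Binarizer:
    """Toy stand-in for a trained model: pixels darker than threshold are ink."""
    def binarize(image: Image) -> Mask:
        return [[1 if px < threshold else 0 for px in row] for row in image]
    return binarize


def toy_domain_classifier(image: Image) -> Sequence[float]:
    """Toy stand-in: bright pages score as 'printed', dark ones as 'music'."""
    pixels = [px for row in image for px in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]


if __name__ == "__main__":
    models = {
        "printed": make_threshold_binarizer(0.5),
        "music": make_threshold_binarizer(0.25),
    }
    pages = [[[0.9, 0.8], [0.2, 0.95]], [[0.85, 0.1], [0.9, 0.9]]]
    chosen, masks = binarize_target(pages, toy_domain_classifier, models)
    print(chosen, masks)
```

Selecting one model per target set, rather than per image, mirrors the abstract's framing of choosing a specialized model for a whole target domain; a per-image choice would be a different design.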