Domain Adaptation for Document Image Binarization via Domain Classification

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/121474
Item information
Title: Domain Adaptation for Document Image Binarization via Domain Classification
Authors: Garrido Muñoz, Carlos | Sánchez Hernández, Adrián | Castellanos, Francisco J. | Calvo-Zaragoza, Jorge
Research Group/s: Reconocimiento de Formas e Inteligencia Artificial
Center, Department or Service: Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords: Document Image Binarization | Deep Neural Networks | Unsupervised Domain Adaptation | Domain Classifier
Knowledge Area: Lenguajes y Sistemas Informáticos
Issue Date: 2021
Publisher: IOS Press
Citation: Garrido-Munoz, Carlos, et al. “Domain Adaptation for Document Image Binarization via Domain Classification”. In: Tallón-Ballesteros, Antonio J. (Ed.). Modern Management based on Big Data II and Machine Learning and Intelligent Systems III. Proceedings of MMBD 2021 and MLIS 2021. Amsterdam: IOS Press BV, 2021. ISBN 978-1-64368-224-2, pp. 569-582
Abstract: Binarization plays a key role in many document image analysis workflows. The current state of the art relies on supervised learning, specifically deep neural networks. However, it is very difficult for the same model to work successfully across a range of document styles, since the set of potential domains is very heterogeneous. We study a multi-source domain adaptation strategy for binarization. Within this scenario, we examine a novel hypothesis: that a specialized binarization model should be selected for use on a target domain, instead of a single model that tries to generalize across multiple domains. The problem then reduces to deciding, given several specialized models and a new target set, which model to use. We propose a simple way to address this question by using a domain classifier that estimates which of the source models should be used to binarize the new target domain. Our experiments on several datasets, including different text styles and music scores, show that our initial hypothesis is quite promising, although the method for deciding which model to use still leaves considerable room for improvement.
Sponsor: This paper has been supported by Generalitat Valenciana through grant ACIF/2019/042 and project GV/2020/030, and Universidad de Alicante through project GRE19-04. The first two authors carried out this work as recipients of a grant from the Office for Educational Quality and Innovation of the University of Alicante, within the collaboration agreement with Banco de Santander S.A.
URI: http://hdl.handle.net/10045/121474
ISBN: 978-1-64368-224-2 | 978-1-64368-225-9
DOI: 10.3233/FAIA210289
Language: eng
Type: info:eu-repo/semantics/bookPart
Rights: © 2021 The authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
Peer Review: yes
Publisher version: https://doi.org/10.3233/FAIA210289
Appears in Collections: INV - GRFIA - Capítulos de Libros
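
The model-selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the domain names, the fixed-threshold functions standing in for the specialized binarization models, the hard-coded probabilities standing in for the output of a trained domain classifier, and the choice to aggregate those probabilities by summing them over the target samples.

```python
from typing import Callable, Dict, List, Sequence

Image = List[List[int]]               # grayscale image as rows of 0-255 values
Binarizer = Callable[[Image], Image]  # maps a grayscale image to a 0/1 image


def threshold_binarizer(threshold: int) -> Binarizer:
    """Hypothetical stand-in for a specialized source model: a global threshold."""
    def binarize(image: Image) -> Image:
        # Dark pixels (below the threshold) are marked as ink (1).
        return [[1 if pixel < threshold else 0 for pixel in row] for row in image]
    return binarize


def select_source_domain(domain_probs: Sequence[Dict[str, float]]) -> str:
    """Sum per-sample domain probabilities and return the highest-scoring domain."""
    if not domain_probs:
        raise ValueError("need at least one target sample")
    totals: Dict[str, float] = {}
    for probs in domain_probs:
        for domain, p in probs.items():
            totals[domain] = totals.get(domain, 0.0) + p
    return max(totals, key=lambda domain: totals[domain])


# One specialized model per source domain (names and thresholds are made up).
models: Dict[str, Binarizer] = {
    "printed_text": threshold_binarizer(128),
    "music_scores": threshold_binarizer(90),
}

# Assumed classifier outputs for two samples drawn from the target domain.
target_probs = [
    {"printed_text": 0.20, "music_scores": 0.80},
    {"printed_text": 0.45, "music_scores": 0.55},
]

chosen = select_source_domain(target_probs)
binarized = models[chosen]([[30, 200], [100, 80]])
print(chosen, binarized)
```

In this sketch the selection is made once for the whole target set and the chosen model is then applied to every target image; other aggregation rules, such as majority voting over samples, would fit the same interface.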

This item is licensed under a Creative Commons License.