Efficient multi-task progressive learning for semantic segmentation and disparity estimation

Cuevas-Velasquez, Hanz; Galán Cuenca, Alejandro; Fisher, Robert B.; Gallego, Antonio-Javier

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/143003

Item information
Title:	Efficient multi-task progressive learning for semantic segmentation and disparity estimation
Author(s):	Cuevas-Velasquez, Hanz \| Galán Cuenca, Alejandro \| Fisher, Robert B. \| Gallego, Antonio-Javier
Research group(s) or GITE:	Reconocimiento de Formas e Inteligencia Artificial
Center, Department or Service:	Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords:	Computer vision \| Stereo vision \| Semantic segmentation \| Joint learning \| 3D modelling \| Multi-task \| Disparity estimation
Publication date:	17-May-2024
Publisher:	Elsevier
Bibliographic citation:	Pattern Recognition. 2024, 154: 110601. https://doi.org/10.1016/j.patcog.2024.110601
Abstract:	Scene understanding is an important area in robotics and autonomous driving. To accomplish these tasks, the 3D structures in the scene have to be inferred to know what the objects and their locations are. To this end, semantic segmentation and disparity estimation networks are typically used, but running them individually is inefficient since they require high-performance resources. A possible solution is to learn both tasks together using a multi-task approach. Some current methods address this problem by learning semantic segmentation and monocular depth together. However, monocular depth estimation from single images is an ill-posed problem. A better solution is to estimate the disparity between two stereo images and take advantage of this additional information to improve the segmentation. This work proposes an efficient multi-task method that jointly learns disparity and semantic segmentation. Employing a Siamese backbone architecture for multi-scale feature extraction, the method integrates specialized branches for disparity estimation and coarse and refined segmentations, leveraging progressive task-specific feature sharing and attention mechanisms to enhance accuracy for solving both tasks concurrently. The proposal achieves state-of-the-art results for joint segmentation and disparity estimation on three distinct datasets: Cityscapes, TrimBot2020 Garden, and S-ROSeS, using only a fraction of the parameters of previous approaches.
Sponsor(s):	This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033.
URI:	http://hdl.handle.net/10045/143003
ISSN:	0031-3203 (Print) \| 1873-5142 (Online)
DOI:	10.1016/j.patcog.2024.110601
Language:	eng
Type:	info:eu-repo/semantics/article
Rights:	© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer reviewed:	yes
Publisher's version:	https://doi.org/10.1016/j.patcog.2024.110601
Appears in collections:	INV - GRFIA - Artículos de Revistas

Files in this item:
File	Description	Size	Format
Cuevas-Velasquez_etal_2024_PatternRecognition.pdf		6.07 MB	Adobe PDF
