A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques

Tardío, Roberto; Maté, Alejandro; Trujillo, Juan

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/112029

Item information
Title:	A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques
Author(s):	Tardío, Roberto \| Maté, Alejandro \| Trujillo, Juan
Research group(s) or GITE:	Lucentia
Center, Department or Service:	Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos
Keywords:	OLAP \| Big data \| Benchmarking \| Data warehousing
Knowledge area(s):	Lenguajes y Sistemas Informáticos
Publication date:	4-Dec-2020
Publisher:	MDPI
Bibliographic citation:	Tardío R, Maté A, Trujillo J. A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques. Applied Sciences. 2020; 10(23):8674. https://doi.org/10.3390/app10238674
Abstract:	In recent years, several new technologies have enabled OLAP processing over Big Data sources. Among these technologies, we highlight those that allow data pre-aggregation because of their demonstrated performance in data querying. This is the case of Apache Kylin, a Hadoop-based technology that supports sub-second queries over fact tables with billions of rows combined with ultra-high-cardinality dimensions. However, taking advantage of data pre-aggregation techniques to design analytic models for Big Data OLAP is not a trivial task. It requires very advanced knowledge of the underlying technologies and of user querying patterns. A poor OLAP cube design significantly affects several key performance metrics, including (i) the analytic capabilities of the cube (time and ability to provide an answer to a query), (ii) the size of the OLAP cube, and (iii) the time required to build the OLAP cube. Therefore, in this paper we (i) propose a benchmark to help Big Data OLAP designers choose the most suitable cube design for their goals, (ii) identify and describe the main requirements and trade-offs for effectively designing a Big Data OLAP cube that takes advantage of data pre-aggregation techniques, and (iii) validate our benchmark in a case study.
Sponsor(s):	This work has been funded by the ECLIPSE project (RTI2018-094283-B-C32) from the Spanish Ministry of Science, Innovation and Universities.
URI:	http://hdl.handle.net/10045/112029
ISSN:	2076-3417
DOI:	10.3390/app10238674
Language:	eng
Type:	info:eu-repo/semantics/article
Rights:	© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Peer reviewed:	yes
Publisher's version:	https://doi.org/10.3390/app10238674
Appears in collections:	INV - LUCENTIA - Artículos de Revistas

Files in this item:
File	Description	Size	Format
Tardio_etal_2020_ApplSci.pdf		507.85 kB	Adobe PDF

This item is licensed under a Creative Commons License