A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/112029
Title: | A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques |
---|---|
Author(s): | Tardío, Roberto | Maté, Alejandro | Trujillo, Juan |
Research group(s) or GITE: | Lucentia |
Center, Department or Service: | Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos |
Keywords: | OLAP | Big data | Benchmarking | Data warehousing |
Knowledge area(s): | Lenguajes y Sistemas Informáticos |
Publication date: | 4-Dec-2020 |
Publisher: | MDPI |
Bibliographic citation: | Tardío R, Maté A, Trujillo J. A New Big Data Benchmark for OLAP Cube Design Using Data Pre-Aggregation Techniques. Applied Sciences. 2020; 10(23):8674. https://doi.org/10.3390/app10238674 |
Abstract: | In recent years, several new technologies have enabled OLAP processing over Big Data sources. Among these technologies, we highlight those that allow data pre-aggregation because of their demonstrated performance in data querying. This is the case of Apache Kylin, a Hadoop-based technology that supports sub-second queries over fact tables with billions of rows combined with ultra-high-cardinality dimensions. However, taking advantage of data pre-aggregation techniques to design analytic models for Big Data OLAP is not a trivial task. It requires very advanced knowledge of the underlying technologies and user querying patterns. A wrong OLAP cube design significantly alters several key performance metrics, including: (i) the analytic capabilities of the cube (time and ability to provide an answer to a query), (ii) the size of the OLAP cube, and (iii) the time required to build the OLAP cube. Therefore, in this paper we (i) propose a benchmark to help Big Data OLAP designers choose the most suitable cube design for their goals, (ii) identify and describe the main requirements and trade-offs for effectively designing a Big Data OLAP cube taking advantage of data pre-aggregation techniques, and (iii) validate our benchmark in a case study. |
Sponsor(s): | This work has been funded by the ECLIPSE project (RTI2018-094283-B-C32) from the Spanish Ministry of Science, Innovation and Universities. |
URI: | http://hdl.handle.net/10045/112029 |
ISSN: | 2076-3417 |
DOI: | 10.3390/app10238674 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Rights: | © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
Peer reviewed: | yes |
Publisher's version: | https://doi.org/10.3390/app10238674 |
Appears in collections: | INV - LUCENTIA - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
Tardio_etal_2020_ApplSci.pdf | | 507.85 kB | Adobe PDF | |
This item is licensed under a Creative Commons License