Tech Talk: Taming Large Intermediate Results for Joins over Graph-Structured Relations
Taming Large Intermediate Results for Joins over Graph-Structured Relations
Date: 26th April 2022
Time: 14:00 – 15:30
Querying graph-structured relations, i.e., those with many-to-many (m-n) relationships between entities, is ubiquitous and integral to a wide range of analytical applications such as recommendations on social networks and fraud detection in financial transactional networks. These queries are heavy on joins, can be broadly categorized as cyclic and acyclic, and tend to be challenging as they produce large intermediate results. For cyclic queries, the large intermediate results can be unnecessary and due to the suboptimality of the most common core join algorithms being binary joins. For acyclic queries, while the large intermediate results are part of the final results, they contain a lot of redundancy.
In this talk, I first review two algorithmic solutions to handle the large intermediate results challenge: 1) worst-case optimal joins; and 2) factorized representations. I will present a comprehensive query processor design and implementation that integrates the two solutions and go over various research questions tackled to make the integration possible and efficient. Our query processor is capable of having a robust query plan space with many efficient plans that lead to orders of magnitude speedups. We further implement a dynamic programming query optimizer backed with extra optimization rules to pick efficient plans.
Amine Mhedhbi is a Ph.D. student at the University of Waterloo. His work focuses on query processing, optimization, and storage techniques for large-scale graph-structured data management. As part of his research, he designs and implements the GraphflowDB system. He received the VLDB Best Paper Award in 2018 and the Microsoft Ph.D. Research Fellowship 2020-2022.
To join this talk please use the following details.
Meeting ID： 0184647198