The Paper
JSON Tiles: Fast Analytics on Semi-Structured Data
Paper Link
https://db.in.tum.de/~durner/papers/json-tiles-sigmod21.pdf
Format
We start at 6:10. Don't be late!
The discussion lasts about 1 to 1.5 hours, depending on the paper.
Read the paper (done before you arrive)
Introductions (name and background)
First impressions (1-2 minutes each: "this is what I thought")
Structured review (we move through the paper in order, everyone gets a chance to ask questions, offer comments, and raise concerns)
Free form discussion
Nominate and vote on the next paper
Abstract
Developers often prefer flexibility over upfront schema design, making semi-structured data formats such as JSON increasingly popular. Large amounts of JSON data are therefore stored and analyzed by relational database systems. In existing systems, however, JSON's lack of a fixed schema results in slow analytics. In this paper, we present JSON tiles, which, without losing the flexibility of JSON, enables relational systems to perform analytics on JSON data at native speed. JSON tiles automatically detects the most important keys and extracts them transparently - often achieving scan performance similar to columnar storage. At the same time, JSON tiles is capable of handling heterogeneous and changing data. Furthermore, we automatically collect statistics that enable the query optimizer to find good execution plans. Our experimental evaluation compares against state-of-the-art systems and research proposals and shows that our approach is both robust and efficient.
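To make the abstract's central idea concrete before the discussion, here is a toy sketch of "detect the most important keys and extract them" for a single group of documents. It is not the paper's algorithm: the function name `build_tile`, the `threshold` parameter, and the simple per-key frequency count over top-level keys are all assumptions made for illustration.

```python
import json
from collections import Counter


def build_tile(raw_docs, threshold=0.6):
    """Split one group of JSON documents into extracted columns plus leftovers.

    `threshold` is a hypothetical parameter: a top-level key is extracted
    when at least this fraction of the documents contain it.
    """
    docs = [json.loads(d) for d in raw_docs]
    # Count how many documents contain each top-level key.
    counts = Counter(key for doc in docs for key in doc)
    frequent = sorted(k for k, c in counts.items() if c >= threshold * len(docs))
    # Frequent keys become columns; None fills in where a document lacks the key.
    columns = {k: [doc.get(k) for doc in docs] for k in frequent}
    # Rare keys stay in a per-document remainder, so no data is lost.
    leftovers = [{k: v for k, v in doc.items() if k not in columns} for doc in docs]
    return columns, leftovers


tile = [
    '{"id": 1, "user": "ann", "lang": "en"}',
    '{"id": 2, "user": "bob"}',
    '{"id": 3, "user": "cy", "geo": [48.1, 11.6]}',
]
columns, leftovers = build_tile(tile)
print(columns)    # frequent keys, stored column-wise
print(leftovers)  # rare keys, still attached to their documents
```

In this sketch a query that touches only `id` or `user` could scan the column lists directly, while `lang` and `geo` remain reachable through the leftovers. How the paper actually chooses keys, handles nesting and types, and collects statistics is something to check against the text during the structured review.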