Locality-aware Partitioning in Parallel Database Systems
Erfan Zamanian; Carsten Binnig; Abdallah Salama
In: Timos K. Sellis; Susan B. Davidson; Zachary G. Ives (Eds.). Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM SIGMOD International Conference on Management of Data (SIGMOD-2015), May 31 - June 4, Melbourne, Victoria, Australia, Pages 17-30, ACM, 2015.
Parallel database systems horizontally partition large amounts of structured data in order to provide parallel data processing capabilities for analytical workloads in shared-nothing clusters. One major challenge when horizontally partitioning large amounts of data is to reduce the network costs for a given workload and a database schema. A common technique to reduce the network costs in parallel database systems is to co-partition tables on their join key in order to avoid expensive remote join operations. However, existing partitioning schemes are limited in that respect since only subsets of tables in complex schemata sharing the same join key can be co-partitioned unless tables are fully replicated.
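To illustrate the co-partitioning idea the abstract refers to, the following is a minimal sketch of plain hash co-partitioning on a shared join key. It is not the partitioning scheme proposed in the paper. The table names, the `custkey` column, the helper functions `hash_partition` and `local_join`, and the use of a simple modulo as the hash function are all assumptions made for illustration.

```python
from collections import defaultdict


def hash_partition(rows, key, num_nodes):
    """Assign each row to a node based on its join-key value.

    Assumption: keys are integers and a plain modulo stands in for the
    hash function a real system would use.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key] % num_nodes].append(row)
    return partitions


def local_join(left_parts, right_parts, key, num_nodes):
    """Join two co-partitioned tables node by node.

    Because both tables were partitioned on the same key with the same
    function, matching rows live on the same node, so no row has to be
    shipped to another node.
    """
    result = []
    for node in range(num_nodes):
        # Build a hash index over the left table's local partition.
        index = defaultdict(list)
        for left_row in left_parts.get(node, []):
            index[left_row[key]].append(left_row)
        # Probe the index with the right table's local partition.
        for right_row in right_parts.get(node, []):
            for left_row in index.get(right_row[key], []):
                result.append({**left_row, **right_row})
    return result


# Hypothetical example data: two tables sharing the join key "custkey".
customers = [
    {"custkey": 1, "name": "A"},
    {"custkey": 2, "name": "B"},
    {"custkey": 3, "name": "C"},
]
orders = [
    {"orderkey": 10, "custkey": 1},
    {"orderkey": 11, "custkey": 3},
    {"orderkey": 12, "custkey": 3},
]

NUM_NODES = 2
customer_parts = hash_partition(customers, "custkey", NUM_NODES)
order_parts = hash_partition(orders, "custkey", NUM_NODES)
joined = local_join(customer_parts, order_parts, "custkey", NUM_NODES)
```

The sketch also shows the limitation the abstract points out: a table can be hash-partitioned on only one key, so a third table that joins `orders` on a different column (for example `orderkey`) could not be co-located in the same way without repartitioning or full replication.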