Skip to main content
Publication

Could Blobs Fuel Storage-Based Convergence between HPC and Big Data?

Authors

Matri, Pierre; Alforov, Yevhen; Brandon, Alvaro; Kuhn, Michael; Carns, Philip; Ludwig, Thomas

Abstract

The increasingly growing data sets processed on HPC platforms raise major challenges for the underlying storage layer. A promising alternative to POSIX-IO- compliant file systems are simpler blobs (binary large objects), or object storage systems. Such systems offer lower overhead and better performance at the cost of largely unused features such as file hierarchies or permissions. Similarly, blobs are increasingly considered for replacing distributed file systems for big data analytics or as a base for storage abstractions such as key-value stores or time-series databases. This growing interest in such object storage on HPC and big data platforms raises the question: Are blobs the right level of abstraction to enable storage-based convergence between HPC and Big Data? In this paper we study the impact of blob-based storage for real-world applications on HPC and cloud environments. The results show that blobbased storage convergence is possible, leading to a significant performance improvement on both platforms