【ERC Coffee House Tech Talk Series】Cosine: A Cloud-Cost Optimized Self-Designing Key-Value Storage Engine [Subarna Chatterjee]

Date

Thursday 14th July @ 14:00 – 15:00 (UK time)

Presenter

Subarna Chatterjee

Affiliation

Harvard University

Location

[Online] Meeting link: https://welink.zhumu.com/j/159295680

Abstract

We present Cosine, a self-designing key-value storage engine that can take the shape of a close to “perfect” engine architecture given an input workload, a cloud budget, a target performance, and required cloud SLAs. By identifying and formalizing the first principles of storage engine layouts and core key-value algorithms, Cosine constructs a massive design space comprising sextillion (10^36) possible storage engine designs over a diverse space of hardware and cloud pricing policies for three cloud providers: AWS, GCP, and Azure. Cosine spans diverse designs such as Log-Structured Merge-trees, B-trees, Log-Structured Hash-tables, and in-memory accelerators for filters and indexes, as well as trillions of hybrid designs that do not appear in the literature or industry but emerge as valid combinations of the above. Cosine includes a unified distribution-aware I/O model and a learned concurrency-aware CPU model that can calculate, with high accuracy, the performance and cloud cost of any possible design on any workload and virtual machine. Cosine can then search through that space in interactive time to find the best design, and it materializes the actual code of the resulting storage engine design using a templated Rust implementation. We demonstrate that, on average, Cosine outperforms state-of-the-art storage engines such as write-optimized RocksDB, read-optimized WiredTiger, and very write-optimized FASTER by 23x, 25x, and 20x, respectively, for diverse workloads, data sizes, and cloud budgets across all YCSB core workloads and many variants.

Short Bio

Subarna Chatterjee is a post-doc at Harvard University advised by Stratos Idreos. Her research focuses on improving the performance of modern data systems by reasoning about the read-write tradeoff of the underlying data structures and algorithms. Prior to joining Harvard, she completed her Ph.D. at the Indian Institute of Technology Kharagpur and her first post-doc at Inria, Rennes, France. In 2016, she was selected as one of the “10 Women in Networking/Communications That You Should Watch” and is one of the young scientists to attend the Heidelberg Laureate Forum.
