
Posts

Showing posts from February, 2019

SSTables and Its Wonders!

Recap: In the last post, we discussed the log-structured storage engine, in which each storage segment is a sequence of key-value pairs. These pairs appear in the order in which they were written, and values later in the log take precedence over values for the same key earlier in the log. A step ahead... what if we change the storage format and keep the data sorted by key? We call this format a Sorted String Table, or SSTable for short. But wait! What about the sequential writes that made Bitcask fast? You might have this doubt. Let's set it aside for now. Benefits of the approach: Keeping keys sorted makes compaction (the process of discarding older values for duplicate keys) and merging of segments simpler and more efficient. Compaction works much like the merge step of the well-known merge sort. While merging the segments, if we come across the same keys, then we cho...
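To make the merge idea concrete, here is a minimal sketch in Python. It assumes each segment is an in-memory list of (key, value) pairs already sorted by key, with no key repeated inside a single segment, and that segments are passed oldest first. The name `merge_segments` is hypothetical. Keeping the value from the newest segment on a key collision is an assumption taken from the recap rule that later values take precedence, since the excerpt is cut off at that point.

```python
import heapq
from typing import Iterator, List, Tuple

# A segment is a list of (key, value) pairs sorted by key (assumed layout).
Segment = List[Tuple[str, str]]


def merge_segments(segments: List[Segment]) -> Segment:
    """Merge sorted segments, given oldest first, into one sorted segment.

    Assumption: when a key appears in several segments, the value from the
    newest segment is kept and the older ones are dropped.
    """

    def tagged(index: int, segment: Segment) -> Iterator[Tuple[str, int, str]]:
        # Negate the segment index so that, for equal keys, entries from
        # newer segments sort ahead of entries from older ones.
        for key, value in segment:
            yield (key, -index, value)

    streams = [tagged(i, seg) for i, seg in enumerate(segments)]

    merged: Segment = []
    # heapq.merge lazily merges already-sorted inputs, like the merge step
    # of merge sort, so no segment has to be re-sorted.
    for key, _, value in heapq.merge(*streams):
        if merged and merged[-1][0] == key:
            continue  # an older value for a key we already emitted
        merged.append((key, value))
    return merged


older = [("apple", "1"), ("cherry", "3")]
newer = [("apple", "9"), ("banana", "2")]
result = merge_segments([older, newer])
print(result)
```

Because every input is already sorted, the merge only needs to look at the head of each segment at any moment, which is what makes sorted segments cheap to compact even when they are larger than memory. A real engine would stream from files rather than lists.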

Unpacking Storage Engines!

When it comes to storage, the first thing that comes to mind is a database. A database should store data when given it and read it back when asked. But wait, as developers, why should we care? We know the databases, SQL and the popular NoSQL ones, and there is a vast number of other databases available out there. Each of them is optimized for a different type of workload and is built on a different storage engine. FYI, a storage engine is the software module that a database management system uses to create, read, update, and delete data in a database. To tune a storage engine, you have to understand the type of workload you are going to serve and what the storage engine is doing under the hood. There are different kinds of storage engines, e.g. log-structured storage engines and page-oriented storage engines. ...