Key-value stores are a foundational storage pattern. They are deceptively simple and support some equivalent of these core methods:
- kvs.put(key, value)
- value = kvs.get(key)
- key = kvs.next(key)
Key-value stores are utilized by developers and are typically embedded within a larger software solution. Any program on a given laptop shouldn't generate "that" much data comparatively speaking, so it seems like a common key-value store implementation would suffice for all use cases. Server computing, however, represents a more complex problem space as computers keep getting faster and denser with more cores. This, combined with more and longer running processes, produces more data. Having more data to manage increases performance demands, both in reading, writing, and file management. All of this in turn increases opportunities for failure conditions to emerge with resulting data loss and potentially corrupted files. Also, as with my previous [email protected] post Why are there so many programming languages?, never underestimate the ability of intellectual property and ownership issues to either bring people together, or to separate them, with respect to who owns what code. This post will focus on the evolution of single machine key-value stores, as there is also another category of functionally similar but distinct frameworks for distributed key-value stores.
While the external appearance of a key-value store is straightforward enough, there are more than a few functional and technical attributes under the hood to consider:
- Programming language bindings
- Platform support
- Operational features
- Key-size and value-size constraints
- Result cache
- Stability & crash resistance
- Online backups
- Internal file management
- E.g., log structured merge file management
- Supported file-types: Hash, B+Tree, other
- Development features
- Ease of use/build/distribution
- Affected by all of the above, plus more such as:
- Read performance
- Write performance
- Maximum practical database size
- And still have favorable read/write characteristics
- Language implementation potential constraints
- Open source (various types), commercial
… which is longer than one might initially expect given the humble set of put, get, next, and delete methods. Doing this well at scale is harder than it looks. A single deficiency in any one of the above functional attributes for a use-case could be the reason a team would either branch an existing framework to address said needs or to build anew.
Brief History of Key-Value Stores
Some key-values stores in use today describe themselves as "modern implementations of Unix DBM," which is true as far as it goes. But the broader history of key-value stores dates back further than that, and least a few of the influential stores that have been created since the beginning of computing should be mentioned.
VSAM (Virtual Storage Access Method) Files on the IBM mainframe date to the early 1970's with options such as key-sequenced and entry-sequence were a replacement for even earlier implementations of the same concepts.
RMS (Record Management Services) was effectively VSAM for DEC/VAX platforms.
dBase was an ISAM (Indexed Sequential Access Method) that had great popularity in the 1980's and 1990's along with "xBase" clones as an embedded datastore for business applications, especially on the IBM PC platform, and dates to 1979 with an earlier incarnation from 1978.
DBM - 1979
Irrespective of platform, developers have long had a need to be able to save their data, retrieve their data, iterate over their data, and if-and-when necessary, delete their data, with performance better than O(n). This is such a fundamental need that IBM's System/38 became notable for shipping with a relational database management system that was integrated into the operating system in 1978. That database was one of the first productized implementations of SQL.
With that as a backdrop, DBM appeared in Unix in 1979 as a NoSQL datastore before the term was coined because before System/38 just about everything was, quite literally, "No SQL". While this is the starting point for Unix and Linux DBM history, it didn't appear in a data management vacuum.
NDBM was created as an incremental improvement on DBM, such as being able to open more than one database at a time within a script (but still only single-threaded). Why was NDBM created instead of just extending DBM? I don't know.
GDBM was created as a replacement for NDBM but under the GNU Public License.
BerkeleyDB was created in 1991, and per the Communications article A conversation with Margo Seltzer and Mike Olson, one of the primary drivers was licensed based—specifically, to have DBM/NDBM capabilities that weren't tied to the AT&T Unix codebase, and was an outgrowth of the larger BSD Unix effort for the same purpose. BerkeleyDB became notable for being dual-licensed—GPL but also with a commercial license alternative—and this arrangement was quite attractive to the open-source community. Per above, intellectual property arrangements can sometimes bring folks together. BerkeleyDB also added features not present in NDBM, such as transactions.
SDBM is an DBM implementation for Perl and is an example of the importance of language bindings for framework adoption.
QDBM was created around 2000 and was the first version of the "Cabinet" key-value data stores later this decade.
JDBM is another example of the importance of language bindings as it was a key-value store written in Java that had a lot of traction with Java development projects because of its ease of integration. JDBM morphed into major versions 2, 3, and 4 and then inspired MapDB, which is written in Kotlin but is still in the Java VM ecosystem.
Tokyo Cabinet was created in 2006 as a successor to QDBM, with improved performance.
Kyoto Cabinet was created in 2009 as a successor to Tokyo Cabinet, with a variety of in-memory and on-disk configurations depending on use case.
LevelDB was created in 2011 by Google. Though single-process access, it supports a host of other operationally sophisticated features like snapshots, supporting multiple changes in an atomic batch, and a way to override operating system interactions, among other things.
RocksDB was created in 2012 by Facebook and was heavily inspired by LevelDB. It was optimized for usage in fast storage and high-CPU environments with even more write and read improvements, and more operational pluggability.
Revisiting BerkeleyDB: Oracle changed the BerkeleyDB license in 2013 to AGPL sending shockwaves through the open source community. The framework LMDB (not a typo) was apparently created for OpenLDAP as a replacement for BerkeleyDB because of the license change, which is notable because LDAP was instrumental in BerkeleyDB's history. Additionally, Debian Linux phased out BerkeleyDB distribution after the license change. Sometimes intellectual property and ownership issues can bring people together, but now it separated them.
Tkrzw was created in 2020 as the latest in the "Cabinet" series with even more configuration options than Kyoto Cabinet. As to the significance of the name "Tkrzw," I am reminded of the quote "there are only two hard things in computer science: cache invalidation and naming things."
Have we seen the last new key-value datastore? If history is any guide, I doubt it.
Apologies to any key-value datastore or frameworks not mentioned in this post. There are a lot of them.
- VSAM - https://en.wikipedia.org/wiki/Virtual_Storage_Access_Method
- VAX/RMS - https://en.wikipedia.org/wiki/Record_Management_Services
- dBase - https://en.wikipedia.org/wiki/DBase
- System/38 - https://en.wikipedia.org/wiki/IBM_System/38
- There seemed to be multiple RDBMS efforts at IBM in the 1970's in the early 1980's – System R was a research project, there was SQL/DS, and then there was DB2.
- I'm not sure which RDBMS was in System/38.
- Unix DBM - https://en.wikipedia.org/wiki/DBM_(computing)
- GDBM - https://www.gnu.org.ua/software/gdbm/
- BSD Unix - https://en.wikipedia.org/wiki/Berkeley_Software_Distribution
- BerkeleyDB - https://en.wikipedia.org/wiki/Berkeley_DB
- MapDB - https://mapdb.org/
- Tokyo Cabinet - http://fallabs.com/tokyocabinet/
- Kyoto Cabinet - http://fallabs.com/kyotocabinet/
- LevelDB - https://github.com/google/leveldb
- RocksDB – http://rocksdb.org/
- LMDB - https://www.symas.com/symas-embedded-database-lmdb
- Tkrzw - https://dbmx.net/tkrzw/
- Log Structured Merge Tree - https://en.wikipedia.org/wiki/Log-structured_merge-tree
Distributed Key-Value Datastores
- Bigtable – https://en.wikipedia.org/wiki/Bigtable
- Apache HBase – https://hbase.apache.org/
- And many more, but these are a story for another time.
Doug Meil is a software architect in healthcare data management and analytics. He also founded the Cleveland Big Data Meetup in 2010. More of his ACM posts can be found at https://www.linkedin.com/pulse/publications-doug-meil
Thanks Doug, this article brought back ancient memories of sleepless nights managing a large VSAM database with Alternate Indexes - and actively managing the Control Interval & Control Area splits & free space - oh those were the days! :-)