Open source In-Memory database architecture

My friend and I are considering building an open source in-memory relational database.

We have discussed the database architecture and our goals.

I’ll try to summarize it in this blog post.

We have a few objectives:

    • To create a database using C++.
    • To create a database which will work on Linux, Mac OS X and Windows.
    • To use as many open source components as we can.
    • To create a database that is very fast, has low memory consumption, and needs little administration (no need to rebuild and optimize indices, handle deadlocks, etc.).
    • To be able to handle very large data sets.


In-Memory database architecture diagram

Here is a small diagram of our vision of the database architecture.


1) Management studio and ODBC/JDBC driver

We plan to use an ODBC/JDBC driver for communication.

This way, many languages will be able to use the database without us having to write native drivers.

I also think that phpMyAdmin supports ODBC for connecting to MySQL.

Our plan is to use an SQL dialect similar to MySQL's so that we can reuse phpMyAdmin as the management studio for our database.

We plan to use phpMyAdmin as the management studio at least for the initial database release.

2) WIRE protocol

We can use the same wire protocol as MySQL and reuse the ODBC/JDBC drivers that MySQL uses, but if we do that we will be tied to their standard.

We are also considering WebSockets, as provided by the C++ REST SDK, for handling communication between our database and its clients.

Using this library would let us easily offer a REST service or enable direct communication between the browser and our database.

But if we use WebSockets, we won't be able to reuse the existing ODBC/JDBC drivers.

So the question is: should we use the existing wire protocol and ODBC/JDBC implementation, or create our own wire protocol and drivers?
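
To make the first option more concrete, here is a minimal sketch of the packet framing used by the MySQL client/server protocol: every packet starts with a 4-byte header made of a 3-byte little-endian payload length and a 1-byte sequence id. The names PacketHeader, frame_packet and parse_header are hypothetical helpers invented for this illustration, and nothing here is a decision about our final protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper types; only the 4-byte header layout comes from MySQL.
struct PacketHeader {
    std::uint32_t payload_length;  // 24-bit value on the wire
    std::uint8_t sequence_id;
};

// Prepend a MySQL-style header to a payload.
std::vector<std::uint8_t> frame_packet(const std::string& payload,
                                       std::uint8_t sequence_id) {
    // Payloads of 0xFFFFFF bytes or more must be split across several
    // packets; this sketch does not handle that case.
    assert(payload.size() < 0xFFFFFF);
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());

    std::vector<std::uint8_t> out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<std::uint8_t>(len & 0xFF));          // low byte
    out.push_back(static_cast<std::uint8_t>((len >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((len >> 16) & 0xFF));  // high byte
    out.push_back(sequence_id);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Read the header back from the first four bytes of a packet.
PacketHeader parse_header(const std::vector<std::uint8_t>& bytes) {
    assert(bytes.size() >= 4);
    PacketHeader h;
    h.payload_length = static_cast<std::uint32_t>(bytes[0]) |
                       (static_cast<std::uint32_t>(bytes[1]) << 8) |
                       (static_cast<std::uint32_t>(bytes[2]) << 16);
    h.sequence_id = bytes[3];
    return h;
}
```

A custom protocol over WebSockets would not need this header, since WebSocket messages are already framed, but it would need its own drivers on the client side.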

3) Memory Manager

We will need to create a custom memory manager so that we can support lock-free structures and fine-tune memory allocation.
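
One possible building block, shown only as a sketch: a fixed-size arena that hands out memory by atomically bumping an offset, so concurrent allocations need no lock. BumpArena is a hypothetical name chosen for this example; a real memory manager would also need freeing, growth and probably per-thread arenas.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical lock-free bump allocator over one fixed buffer.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), offset_(0) {}

    // Returns nullptr when the arena is exhausted.
    // `align` must be a power of two; offsets are aligned relative to the
    // buffer start, so alignments above the default new alignment are not
    // guaranteed by this sketch.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t old_offset = offset_.load(std::memory_order_relaxed);
        for (;;) {
            const std::size_t aligned = (old_offset + align - 1) & ~(align - 1);
            if (aligned > capacity_ || size > capacity_ - aligned) {
                return nullptr;
            }
            // On failure, compare_exchange_weak reloads old_offset and we retry.
            if (offset_.compare_exchange_weak(old_offset, aligned + size,
                                              std::memory_order_relaxed)) {
                return buffer_.get() + aligned;
            }
        }
    }

    // Number of bytes consumed so far, including alignment padding.
    std::size_t used() const { return offset_.load(std::memory_order_relaxed); }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::atomic<std::size_t> offset_;
};
```

The compare-and-swap loop is what makes allocation lock-free: a thread that loses the race simply retries with the updated offset.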

4) Query processing

We plan to use SQLite's Lemon parser generator for parsing queries.

We plan to create a custom compiler/optimizer and executor.
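
For the executor, one common design (an assumption for illustration, not something we have decided) is a pull-based pipeline in which each operator asks its child for the next row. The sketch below uses hypothetical Operator, Scan and Filter classes over rows of integers and requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

using Row = std::vector<int>;

// Hypothetical operator interface: next() yields rows until exhausted.
struct Operator {
    virtual ~Operator() = default;
    virtual std::optional<Row> next() = 0;  // std::nullopt when done
};

// Reads every row of an in-memory table in order.
class Scan : public Operator {
public:
    explicit Scan(const std::vector<Row>& table) : table_(table) {}
    std::optional<Row> next() override {
        if (pos_ >= table_.size()) return std::nullopt;
        return table_[pos_++];
    }
private:
    const std::vector<Row>& table_;
    std::size_t pos_ = 0;
};

// Passes through only the rows for which the predicate holds.
class Filter : public Operator {
public:
    Filter(std::unique_ptr<Operator> child, std::function<bool(const Row&)> pred)
        : child_(std::move(child)), pred_(std::move(pred)) {
        assert(child_ != nullptr);
    }
    std::optional<Row> next() override {
        while (auto row = child_->next()) {
            if (pred_(*row)) return row;
        }
        return std::nullopt;
    }
private:
    std::unique_ptr<Operator> child_;
    std::function<bool(const Row&)> pred_;
};

// Drains a plan into a result set.
std::vector<Row> run(Operator& root) {
    std::vector<Row> out;
    while (auto row = root.next()) out.push_back(std::move(*row));
    return out;
}
```

In this picture the compiler/optimizer would turn the parse tree produced by the Lemon-generated parser into a tree of such operators, and the executor would simply call run on the root.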

5) Transaction manager

Our database should support ACID transactions.

The transaction manager will handle all transaction operations.
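
As an illustration of the atomicity part only, here is a minimal sketch of a transaction that records an undo log while it modifies an in-memory key-value store. The Transaction class and the use of std::map as the store are assumptions made for this example; it is single-threaded, requires C++17, and says nothing about isolation or durability.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical transaction over a plain in-memory map.
class Transaction {
public:
    explicit Transaction(std::map<std::string, std::string>& store)
        : store_(store) {}

    void put(const std::string& key, const std::string& value) {
        assert(active_);
        // Remember the previous value (or its absence) so rollback can restore it.
        auto it = store_.find(key);
        if (it == store_.end()) {
            undo_.emplace_back(key, std::nullopt);
        } else {
            undo_.emplace_back(key, it->second);
        }
        store_[key] = value;
    }

    // Keep all changes and discard the undo log.
    void commit() {
        undo_.clear();
        active_ = false;
    }

    // Revert all changes in reverse order of application.
    void rollback() {
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it) {
            if (it->second) {
                store_[it->first] = *it->second;
            } else {
                store_.erase(it->first);
            }
        }
        undo_.clear();
        active_ = false;
    }

private:
    std::map<std::string, std::string>& store_;
    std::vector<std::pair<std::string, std::optional<std::string>>> undo_;
    bool active_ = true;
};
```

After a rollback the store looks exactly as it did before the transaction started, which is the "all or nothing" property we need from the transaction manager.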

6) I/O Manager

This component will handle all reads from and writes to the hard drive.

7) Structures manager

We’ve created a new innovative structure for indices.

The main purpose of this component is to handle operations related to indices and data, as well as some metadata and statistics tables.

8) Recovery manager

This component will create snapshots from the transaction log and enable us to recover quickly from a system failure.
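
The idea can be sketched as follows, assuming a log of key/value writes tagged with increasing log sequence numbers (LSNs). LogRecord, Snapshot, make_snapshot and recover are hypothetical names for this illustration; the real log format is not designed yet.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical log record: one write, tagged with an increasing LSN.
struct LogRecord {
    std::uint64_t lsn;
    std::string key;
    std::string value;
};

// Hypothetical snapshot: the data plus the last LSN already folded into it.
struct Snapshot {
    std::uint64_t last_lsn = 0;
    std::map<std::string, std::string> data;
};

// Build a newer snapshot by applying every record the base has not seen yet.
Snapshot make_snapshot(const Snapshot& base, const std::vector<LogRecord>& log) {
    Snapshot next = base;
    for (const LogRecord& rec : log) {
        if (rec.lsn > next.last_lsn) {
            next.data[rec.key] = rec.value;
            next.last_lsn = rec.lsn;
        }
    }
    return next;
}

// Recovery: start from the latest snapshot and replay only the log tail.
std::map<std::string, std::string> recover(const Snapshot& snapshot,
                                           const std::vector<LogRecord>& log) {
    return make_snapshot(snapshot, log).data;
}
```

Because records at or below the snapshot's LSN are skipped, recovery time depends on the length of the log tail rather than on the whole history.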

Any feedback is welcome. Feel free to suggest improvements to our design or other open source components/projects we could use in this project.