A Story On High Performance Managed Databases In the Cloud

Let’s start with a stock exchange’s trading platform, where hundreds of thousands of transactions may occur in a single second and millisecond-level differences in the arrival times of trade requests matter. That gives some idea of what a high-performance database must deliver. Use cases for high-performance databases are widespread across industries: anywhere ultra-low latency and a high volume of real-time concurrent transactions are needed. Naturally, such performance cannot depend on the speed of hard drives, or even of SSDs.

There have been many technologies and pioneers in this field; one particularly worth mentioning is the open source Redis project, started by Salvatore Sanfilippo more than ten years ago. It is hard to say how many people still think of the ‘Remote Dictionary Server’ origin of the name when it is mentioned so frequently nowadays. Redis has become the name for a popular open source, in-memory data structure store that can serve as a cache, a database, or a real-time data platform.

Redis uses a parent and child process model for persistence: the parent process keeps serving client requests while a forked child process handles writing the data to storage. Clients access data in memory (RAM), but the same data can be stored in a different format, typically on disk, and it can later be reconstructed in memory from that stored copy.
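The parent and child idea can be illustrated with a minimal Python sketch. It is not how Redis itself is implemented (Redis is written in C and uses its own on-disk formats); it only assumes a Unix-like system where `os.fork` is available. The function name `snapshot_in_child`, the JSON format and the file name `dump.json` are hypothetical choices made for illustration.

```python
import json
import os
import tempfile


def snapshot_in_child(data: dict, path: str) -> int:
    """Fork a child that writes its point-in-time view of `data` to `path`.

    Returns the child's process id to the parent. Assumes a Unix-like OS.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: it sees the data as it was at the moment of the fork.
        exit_code = 0
        try:
            with open(path, "w") as f:
                json.dump(data, f)
        except Exception:
            exit_code = 1
        # Leave immediately without running the parent's cleanup handlers.
        os._exit(exit_code)
    return pid


# Parent process: holds the in-memory data and keeps "serving".
store = {"user:1": "alice"}
path = os.path.join(tempfile.mkdtemp(), "dump.json")

child_pid = snapshot_in_child(store, path)
store["user:2"] = "bob"      # a write that arrives after the snapshot started
os.waitpid(child_pid, 0)     # wait for the child to finish writing

# Reconstruct the data back in memory from the stored copy.
with open(path) as f:
    restored = json.load(f)
```

In this sketch the stored copy holds only the data that existed when the child was forked, while the parent's own dictionary also contains the later write. That is the appeal of the approach: serving and storing do not block each other.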

In recent years, explosive growth in data demands has made the case for high-performance datastores even stronger. Microservices applications then put more emphasis on ‘hot data’ (compared with warm data and cold data), further changing data and analytics requirements.

The advancement of cloud services has seen many providers, AWS, Azure and Google Cloud included, offering in-memory caches and datastores. These hassle-free managed services have enabled many more businesses and organisations to use high-performance databases for caching, session management, real-time inventory, leaderboards, fraud mitigation, finance, retail, IoT and much more.

Take Amazon MemoryDB as an example: this Redis-compatible, in-memory database service delivers microsecond-level read latency and single-digit millisecond write latency, while supporting very high request rates and data volumes.

Equally impressive is the fact that Amazon MemoryDB is a fully managed service: durable, secure, reliable and scalable, with the operations handled for you. Its use of resharding and Multi-AZ transaction log replication underpins its high availability, scalability, durability and recoverability.

AWS also offers another Redis-compatible data store, Amazon ElastiCache, which AWS describes as ‘a Redis compatible in-memory data store built for the cloud. Power real-time applications with sub-millisecond latency.’

It is worth noting that Amazon ElastiCache for Redis is a cache service that works in front of a primary database (one backed by SSDs or HDDs), keeping only part of the data in memory according to a chosen caching strategy such as lazy loading, write-through or adding a TTL (time to live). Typically, in-memory caches offer little data persistence. ElastiCache for Redis does better here because it provides replication and snapshots. Even so, on data durability ElastiCache for Redis goes only halfway to what MemoryDB offers, which is full durability with no data loss.
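The three caching strategies named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ElastiCache or Redis client code: plain dictionaries stand in for both the cache and the primary database, and the class name `CachedStore`, its methods and the injectable `clock` parameter are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedStore:
    """A toy cache in front of a toy primary database (both plain dicts)."""

    def __init__(self, db: Dict[str, str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.db = db                    # stands in for the primary database
        self.ttl = ttl_seconds          # how long a cached entry stays valid
        self.clock = clock              # injectable so expiry can be simulated
        self.cache: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)
        self.db_reads = 0               # counts trips to the primary database

    def get(self, key: str) -> Optional[str]:
        """Lazy loading: go to the database only on a miss or expired entry."""
        entry = self.cache.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]             # cache hit
        self.db_reads += 1              # miss or expired: read the database
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = (value, self.clock() + self.ttl)
        return value

    def put(self, key: str, value: str) -> None:
        """Write-through: update the database and the cache together."""
        self.db[key] = value
        self.cache[key] = (value, self.clock() + self.ttl)


# Usage with a simulated clock so that TTL expiry is deterministic.
now = [0.0]
store = CachedStore({"sku:1": "10 in stock"}, ttl_seconds=60,
                    clock=lambda: now[0])

first = store.get("sku:1")       # miss: loaded lazily from the database
second = store.get("sku:1")      # hit: served from the cache
reads_after_two_gets = store.db_reads

store.put("sku:2", "5 in stock")  # write-through: cached at write time
third = store.get("sku:2")        # hit, no database read needed
reads_after_put = store.db_reads

now[0] = 61.0                     # move past the 60-second TTL
fourth = store.get("sku:1")       # expired: reloaded from the database
reads_after_expiry = store.db_reads
```

Lazy loading keeps only requested data in memory at the cost of a slower first read, write-through keeps the cache fresh at the cost of extra work on every write, and the TTL bounds how stale an entry can become. In practice the strategies are often combined, as in this sketch.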

On the other hand, ElastiCache can work with a range of primary databases, be it RDS flavours, S3, MongoDB, Cassandra or others, which leaves the database choice open. MemoryDB, by contrast, is itself a key-value NoSQL database.

Both MemoryDB and ElastiCache for Redis offer Redis APIs and data structures, with some use cases in common and some that differ. MemoryDB comes at a price, as it holds the complete database in memory and adds advanced capabilities in durability and high availability. For the right use cases its value is unparalleled. ElastiCache remains a popular choice for its features and functionality.

For now, a stock exchange’s trading platform probably remains an on-premises, proprietary system. But for the many other applications that need high-performance databases, and their number is growing, managed in-memory datastores in the cloud are an excellent proposition to consider.

-- Simon Wang
