Benchmark series: Amazon MemoryDB and how it stands compared to Amazon ElastiCache for Redis
Until recently, Amazon Web Services (AWS) offered a collection of 10 Database Services. With the addition of Amazon MemoryDB, it now offers 11, resulting in more than 15 Database Engines that users can choose from. Amazon MemoryDB for Redis is a Redis-compatible, durable, in-memory database service that delivers ultra-fast performance for modern, microservices applications. It is based in part on the open source Redis platform, but adds durability and persistence. AWS positions it at customers who require a full-blown database service rather than a cache, on the assumption that the use cases will be quite different. It allows developers to build applications using the same Redis data structures, APIs, and commands they are familiar with. Furthermore, they get access to advanced features such as built-in replication, least recently used (LRU) eviction, transactions, and automatic partitioning.
On the other hand, Amazon ElastiCache for Redis offers a fully managed platform that makes it easy to deploy, manage, and scale a high-performance distributed in-memory data store cluster. It is fully compatible with the usual Redis data structures, APIs, and clients, allowing your existing applications that already use Redis to start using ElastiCache without any code changes. It supports both Redis cluster and non-cluster modes, providing enhanced high availability and reliability, with automatic failover across availability zones.
Getting Started
Upon release, MemoryDB became available in US East (N. Virginia), EU (Ireland), Asia Pacific (Mumbai) and South America (São Paulo). MemoryDB clusters can be created using the AWS Management Console, AWS Command Line Interface (CLI), or AWS SDKs. Unlike other recent service releases from AWS, MemoryDB did not ship with CloudFormation support out of the box. This means that we will have to wait before we can deploy MemoryDB clusters with CloudFormation and the AWS CDK.
There are several screenshot walkthrough guides out there that take you through setting up a MemoryDB cluster in the console. Below you can find a CLI command to get up and running quickly with a cluster.
aws memorydb create-cluster \
--cluster-name memorydb-test \
--node-type db.r6g.large \
--acl-name open-access \
--subnet-group-name memorydb-test-subnet-group \
--security-group-ids <memorydb-test-security-group-id>
A few assumptions:
- memorydb-test-subnet-group refers to a MemoryDB subnet group created with the aws memorydb create-subnet-group command.
- memorydb-test-security-group-id refers to a pre-existing Security Group ID.
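For completeness, the subnet-group prerequisite can be created, and the cluster's provisioning progress checked, with commands along these lines. This is a sketch: the subnet IDs below are placeholders for your own VPC subnets.

```shell
# Create the subnet group referenced by create-cluster above.
# The subnet IDs are placeholders; substitute subnets from your own VPC.
aws memorydb create-subnet-group \
  --subnet-group-name memorydb-test-subnet-group \
  --subnet-ids subnet-0123456789abcdef0 subnet-0fedcba9876543210

# Cluster creation takes several minutes; poll until the status
# reported here becomes "available".
aws memorydb describe-clusters \
  --cluster-name memorydb-test \
  --query 'Clusters[0].Status'
```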
Performance
The first questions that popped into my mind when Amazon MemoryDB was released were related to performance. More specifically, how does Amazon MemoryDB for Redis compare with Amazon ElastiCache for Redis? Can you get the same performance out of the more robust Redis implementation that underpins MemoryDB?
In order to answer the above question, the following experiment came to life:
The memtier_benchmark tool was run from EC2 to test the performance of both ElastiCache and MemoryDB. For reference, version 1.3.0 was used.
In the table below you can find the various configurations that were used for the experiment.
| EC2 instance | ElastiCache | MemoryDB |
| --- | --- | --- |
| m5.xlarge | r6g.large | r6g.large |
| m5.xlarge | r6g.2xlarge | r6g.2xlarge |
| m5.xlarge | r6g.8xlarge | r6g.8xlarge |
memtier benchmark was configured as follows:
- A 50/50 Set to Get ratio
- Each object has random data in the value
For comparison purposes exactly the same configuration was used for ElastiCache and for MemoryDB.
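As a sketch, the configuration above translates to an invocation along these lines. The endpoint is a placeholder for the cluster's actual endpoint, and the --tls flag assumes in-transit encryption is enabled, which is the default for MemoryDB clusters.

```shell
# 50/50 Set:Get ratio with randomized values.
# The hostname is a placeholder; use the endpoint of your own cluster,
# and drop --tls if the cluster has in-transit encryption disabled.
memtier_benchmark \
  -s clustercfg.memorydb-test.example.memorydb.us-east-1.amazonaws.com \
  -p 6379 \
  --tls \
  --ratio=1:1 \
  --random-data
```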
You can find the raw data from the test here.
Throughput
In the next plot you can see the throughput performance of the combined Sets and Gets, also referred to as Total Throughput. You can see the results across the 3 different instance types that were used for ElastiCache and for MemoryDB.
From the results it is quite obvious that MemoryDB delivers less total throughput than ElastiCache across the same instance types. We can also see that when replication is enabled across multiple Availability Zones (AZs), the performance drops even within the same service type. A final remark for this graph: taking into consideration the configuration used for the memtier benchmark, performance seems to flatline after the r6g.2xlarge for ElastiCache, something that cannot be said for MemoryDB, whose performance keeps increasing as we move to the r6g.8xlarge.
Looking at the performance of Sets and Gets separately, we end up with the following throughput plots: one for the 1-node setup and one for a replication set with 3 nodes.
Unsurprisingly, looking deeper into the Sets and Gets, we reach the same conclusions as for the Total Throughput. MemoryDB, in that regard, delivers less performance than ElastiCache. The gap seems to be even greater when multiple replicas are used for the shards.
Latency
In the following plot you can see the latency performance of the combined Sets and Gets, also referred to as Total Latency.
MemoryDB exhibits considerable latency, at least for this benchmarking run. Only when we increase the underlying compute capacity considerably do we get results equivalent to ElastiCache. On the other hand, ElastiCache's latency stays consistent as we go through the different instance types. When the number of replicas increases from 0 to 2, the performance gain from switching from the r6g.large to the r6g.2xlarge is smaller. The above statement holds only for the 99th and 99.9th percentiles.
Below you can see the results for the 1 node scenario split over the Sets and Gets.
Conclusion
It is quite evident that the extra durability of Amazon MemoryDB comes at the cost of the performance that is available with Amazon ElastiCache. Although the results are affected by the configuration that was selected for the benchmarking tool, I believe that they are a good indicator of the performance we can expect from the two services. The two services differ in what they offer, serve distinct use cases, and will likely both remain available in the long run.