Cassandra 2.0.9 to Amazon Keyspaces Migration using DSBULK Utility

Gugatharsan Sivalingam
8 min read · Mar 21, 2023

These days, most organizations are trying to move from their on-premises servers to the cloud, and for this kind of migration they mainly go for the lift-and-shift approach. But even after moving the applications and databases to the cloud, the architecture and your team's responsibilities stay the same, which means you are not getting the full benefits of the cloud migration. This is why the major cloud vendors introduced fully managed services to make our lives easier. From the AWS perspective, relational databases have Amazon RDS, non-relational databases have Amazon DocumentDB, Amazon Keyspaces, and others, and for caching there is Amazon ElastiCache. What are the advantages of a fully managed database over a self-managed one?

  • Patching and minor upgrades without downtime.
  • Automated backups and failover.
  • High availability and durability.
  • Quick access to support from the cloud vendor.

In this article, my main motivation is to cover migrating from Cassandra to Amazon Keyspaces, the issues that come with this migration, and how we can overcome them. As far as I am aware, the only approach for this migration is the DSBULK utility provided by DataStax. The migration involves several challenges. If the Cassandra cluster you are running is an older version, it becomes a real headache, because you first need to find a DSBULK release that is compatible with it. If we are doing this in a production environment, we also need to take the live servers down until all the data has been successfully migrated from Cassandra to Keyspaces, and the downtime grows with the size of the data. So we need to be careful when choosing which keyspace to migrate from Cassandra.
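To make the process concrete, here is a minimal sketch of the two halves of a DSBULK migration: unload from the source cluster, then load into Amazon Keyspaces. Every host name, keyspace, table, and path below is a placeholder, and the Keyspaces connection details (TLS on port 9142, service-specific credentials, local data center) are assumed to live in a driver configuration file, shown here as a hypothetical dsbulk_keyspaces.conf:

```bash
# Step 1: unload the table from the source Cassandra cluster into CSV files.
# Host, keyspace, table, and export directory are placeholders.
dsbulk unload \
  -h cassandra-host.example.com \
  -k my_keyspace -t my_table \
  -url ./export/my_table

# Step 2: load the exported files into Amazon Keyspaces.
# Keyspaces only accepts TLS connections on port 9142; the endpoint,
# credentials, and SSL settings are assumed to be defined in the
# hypothetical dsbulk_keyspaces.conf driver configuration file.
dsbulk load \
  -f ./dsbulk_keyspaces.conf \
  -k my_keyspace -t my_table \
  -url ./export/my_table \
  --dsbulk.executor.maxPerSecond 1000  # throttle writes to match table capacity
```

Throttling the load rate matters on Keyspaces because write throughput is bounded by the table's provisioned (or on-demand) capacity rather than by cluster hardware, so an unthrottled load can fail with capacity errors partway through.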

Another important thing to verify before the migration is the maximum row size in your data, because on the Amazon Keyspaces side it is a hard limit that cannot be changed: the maximum allowed row size is 1 MB. If a row exceeds this limit, you will face plenty of issues during the migration. In that situation, if client-side compression still produces rows larger than 1 MB, you will need to use a pointer and place the record itself in a datastore such as Amazon S3.
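One rough way to spot oversized rows before the load is to scan the files that dsbulk unload produced, since each line of the CSV export corresponds to one row. The line length only approximates the stored row size, and the export directory below is the hypothetical one from the earlier sketch:

```bash
# Flag exported rows whose CSV line exceeds the 1 MB Keyspaces row limit.
# Line length only approximates the stored row size, but it catches the
# obvious offenders before the load fails halfway through.
find ./export/my_table -name '*.csv' -print0 |
  xargs -0 awk 'length($0) > 1048576 { print FILENAME ": row " FNR " is " length($0) " bytes" }'
```

For rows that stay above the limit even after compression, the pointer approach could look roughly like this, with the bucket, keyspace, table, and column names all being hypothetical: the payload goes to S3, and Keyspaces keeps only the object key.

```bash
# Compress the oversized payload and park it in S3.
gzip -c big_payload.json > payload.json.gz
aws s3 cp payload.json.gz s3://my-migration-bucket/my_table/row-42/payload.json.gz

# Store only the S3 object key in Keyspaces; readers fetch the payload
# from S3 using this pointer. cqlsh connects to Keyspaces over TLS on 9142.
cqlsh cassandra.us-east-1.amazonaws.com 9142 --ssl -e "
  INSERT INTO my_keyspace.my_table (id, payload_s3_key)
  VALUES (42, 'my_table/row-42/payload.json.gz');"
```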

