Planning to migrate your on-premise databases to the cloud? Don't start until you read our comprehensive guide detailing how to plan and what to consider at every step in the process.
Data migration to the cloud is the process of transferring data from on-premise storage to a database that has been deployed in the cloud. Although there are a growing number of cloud providers and platforms, AWS, GCP, and Azure are currently the three largest cloud service providers. The vast majority of cloud deployments are done on these three platforms.
Companies migrate data from on-premise systems to the cloud because cloud infrastructure offers benefits such as cost efficiency, reduced maintenance, greater scalability, and faster development. Thanks to these benefits, organizations have been working to migrate all, or at least the majority, of their services and data to the cloud.
However, data migration is a challenging task, especially when your on-premise databases back live services. A cloud migration takes plenty of planning and strategy. For instance, you will need to ensure that the migration does not impact your production service and that the migrated data and services remain continuous after cut-over. In this article, we will discuss the major considerations for migrating databases to the cloud.
How to Plan a Database Migration From On-Premise to Cloud Infrastructure
To migrate data to the cloud, your team will need to perform in-depth research on the existing databases and their architectures. You’ll also need to gather as much information as you can on the target cloud service where these databases will eventually reside. After deciding which cloud service you want to migrate to, you will have to choose which DBMS to deploy. The first question is whether to keep the database engine you’re currently using, for example by migrating from an on-premise instance of MySQL to a cloud-based instance of MySQL. Alternatively, you may choose to move to a new DBMS, for example from an on-premise version of MySQL to a cloud-based deployment of Postgres.
Furthermore, you will want to understand the ecosystem where your new DBMS will reside. Many major cloud service providers offer a suite of products across their different services including DBMS, servers, analytics, data backup, and more. Thus, at a planning stage, it is important to understand how your new DBMS will co-exist and add value or leverage the infrastructure as a whole.
Different Cloud Migration Strategies, and Choosing the Best One
To get a full picture of all the factors included in a database migration from on-premise to cloud, let’s look at the 6 R’s. Each of these R’s is a building block in the strategy for a successful cloud migration.
- Rehost: This first data migration strategy is the most common practice among businesses that want to move their infrastructure to the cloud. This technique is sometimes referred to as the “lift-and-shift” method. It is regarded as one of the easier options for migration since you don’t have to manage both on-prem and cloud infrastructures. It also requires the least drastic change between how things work today and the future state. When you rehost, you essentially replicate your servers and databases in the cloud. Cloud providers usually manage their underlying infrastructure, which allows you to benefit from reduced maintenance overhead. Furthermore, to attract customers whose infrastructure is on-premise, cloud providers will sometimes provide managed migration services.
- Replatform: Replatforming means that you change your existing operating systems or databases as you migrate to the cloud. While the “rehost” method moves the same software and hardware infrastructure, including OS and databases, replatforming updates this infrastructure. For example, if you want to use a more recent operating system version or a new database engine, you would plan that change as part of replatforming. This lets you improve your underlying software infrastructure. Compared to rehosting, replatforming involves considerably more work and risk. In this context, replatforming does not mean changing your application architecture or application code.
- Refactor: In refactoring, you apply extensive changes to the application code to utilize cloud-native services. This makes the application better optimized to run in a cloud environment. For example, you might consider shifting from a server-based application to a serverless architecture. You may also consider using a storage service like Amazon S3 (instead of your current disk-based storage) to store big data and leverage big data technologies on the cloud. Due to the scope of the changes that can come with a refactor, it requires a large amount of effort from your teams and the most regression testing and validation. More work usually entails a higher risk to the project and the business.
- Repurchase: The next option, repurchase, is available if you want to use applications that are delivered as Software as a Service (SaaS). In addition to infrastructure, managing the software used by your organization requires team resources. By replacing your existing application with a pre-built, off-the-shelf product, you can reduce management overhead and optimize costs. Many of these managed SaaS platforms also have data migration help built in.
- Retire: In some cases, some of your existing applications might have become redundant. Before you embark on a data migration journey, it is important to identify which of your digital assets you want to migrate. If you find that some of your applications won’t be used, retiring them can reduce migration efforts and reduce your application portfolio's cost of ownership. Making sure these applications still fit in your architecture and roadmap is a great way to judge if you should retire an application or service, slowly decommission it, or migrate it.
- Retain: The final option is retaining. There can be databases or applications that you do not want to move for certain reasons. If you find that there is no benefit from data migration and too much risk, you may simply leave them as they are on-premise. Some legacy services and databases don’t have an advisable migration path and therefore may be more stable by just keeping them where they are. Cloud providers do sometimes offer solutions to ease those potential pains and risks but the decision still needs to take in all risks associated with a migration and the benefits. However, every business has different circumstances, so the choice to keep an application where it stands might make sense.
These six strategies can help you to explore what available options are out there to migrate databases to the cloud. While they can give you an idea of a direction, you will still need a customized approach to flesh out low-level factors. You can use the following consideration points to pick one of the six strategies.
Return on Investment (ROI)
When businesses fund a project, they keep the financial cost and return as a top priority. Moving databases to the cloud is not a small investment by any means. Management and engineers will have to dedicate their time to this massive task while keeping normal operations for existing services running smoothly. Assuming you want to choose the rehost option, the easiest and quickest option, you can calculate an estimated cost by pricing out the same infrastructure in the cloud. Most cloud service providers offer a cost calculator that lets you project the costs of running the infrastructure and licensing. Using the hardware and software specifications you want, you can get an estimated monthly or yearly cost.
Moving to the cloud often lowers your running costs. You can use the gap between the current cost and the new, lower cost to estimate your return and new cost of ownership after migration. Of course, you will want to factor in other items such as the engineering effort of the migration itself, reduced labor to maintain databases hosted in the cloud, potential gains from faster innovation, and other items that suit your business case.
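The arithmetic behind such an estimate can be sketched in a few lines of Python. This is a minimal illustration rather than a full cost model: the class name, its fields, and every figure below are hypothetical placeholders, and a real estimate would add the other items mentioned above.

```python
from dataclasses import dataclass


@dataclass
class MigrationEstimate:
    """All figures are hypothetical placeholders, in the same currency."""
    on_prem_annual_cost: float      # hardware, licenses, power, staff time
    cloud_annual_cost: float        # taken from the provider's cost calculator
    one_time_migration_cost: float  # engineering effort, tooling, testing

    def annual_savings(self) -> float:
        return self.on_prem_annual_cost - self.cloud_annual_cost

    def payback_years(self) -> float:
        """Years until cumulative savings cover the one-time cost."""
        savings = self.annual_savings()
        if savings <= 0:
            return float("inf")  # the migration never pays for itself
        return self.one_time_migration_cost / savings

    def roi(self, years: int) -> float:
        """Net gain over `years`, divided by the one-time cost."""
        net_gain = self.annual_savings() * years - self.one_time_migration_cost
        return net_gain / self.one_time_migration_cost


estimate = MigrationEstimate(
    on_prem_annual_cost=500_000,
    cloud_annual_cost=350_000,
    one_time_migration_cost=300_000,
)
print(f"Annual savings: {estimate.annual_savings():,.0f}")
print(f"Payback period: {estimate.payback_years():.1f} years")
print(f"3-year ROI: {estimate.roi(3):.0%}")
```

With these placeholder numbers the migration pays for itself after two years; plugging in your own calculator output shows how sensitive that result is to the one-time engineering cost.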
After migration, you will have to point existing applications to the new databases in the cloud. This effort can impact your business in many ways, big and small. If there are other applications and servers to go to the cloud, you may want to start with those that impact the business the least in order to reduce risk and ensure you have a successful migration plan. Depending on your circumstances, you might want to consider a gradual hybrid approach of rehost and retain. This means that you move low-impact assets first and ensure they are stabilized and operating as expected in the new cloud environment. Then, you move the high-impact ones to the cloud once the process has been vetted.
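The low-impact-first ordering described above could be sketched as follows. The `Asset` type, the 1-to-5 impact scale, and the database names are assumptions made for illustration; in practice the impact score would come from your own assessment.

```python
from typing import NamedTuple


class Asset(NamedTuple):
    name: str
    business_impact: int  # 1 = low, 5 = critical (hypothetical scale)


def plan_waves(assets: list[Asset], wave_size: int) -> list[list[Asset]]:
    """Order assets from lowest to highest impact and split them into waves."""
    ordered = sorted(assets, key=lambda a: a.business_impact)
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


inventory = [
    Asset("orders-db", 5),
    Asset("internal-wiki-db", 1),
    Asset("reporting-db", 2),
    Asset("billing-db", 4),
]
for number, wave in enumerate(plan_waves(inventory, wave_size=2), start=1):
    print(f"Wave {number}: {[a.name for a in wave]}")
```

Each wave is migrated and stabilized before the next one starts, so the process is vetted on low-risk databases before it touches critical ones.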
The more your applications depend on the existing databases, the more complex the migration becomes. If a database is highly complex, used heavily, or considered critical infrastructure, you may want to initially rehost it in the cloud. After it has been rehosted, you can build migration pipelines between the on-prem database and the new database in the cloud to sync data to the new target. This approach helps you preserve data integrity and business continuity.
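One way to check that a sync has kept source and target consistent is to compare row counts and a checksum per table. The sketch below uses in-memory SQLite databases as stand-ins for the on-prem source and the cloud target; the table, the column names, and the helper functions are hypothetical. Across different database engines, values may need normalizing before hashing, since types can be represented differently.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> tuple[int, str]:
    """Return (row count, SHA-256 over all rows read in primary-key order)."""
    digest = hashlib.sha256()
    count = 0
    # Table and key names come from trusted configuration here, not user input.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def make_db(rows: list[tuple[int, str]]) -> sqlite3.Connection:
    """Build an in-memory stand-in for a source or target database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


source = make_db([(1, "Ada"), (2, "Grace")])
target = make_db([(2, "Grace"), (1, "Ada")])  # same data, different insert order

in_sync = (
    table_fingerprint(source, "customers", "id")
    == table_fingerprint(target, "customers", "id")
)
print("in sync" if in_sync else "mismatch")
```

Reading rows in primary-key order makes the checksum independent of insert order, so only real differences in content produce a mismatch.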
Measuring current performance and future needs can be an important step for a replatform or refactor option. This will give you a benchmark to dial in your cloud environment to either match the performance or exceed it. In the cloud, you have easy access to deploy in multiple regions and availability zones and almost instantaneous horizontal scaling. Considering these flexible options, you can adjust the specifications of database hardware and software as well with a few clicks or lines of configuration.
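A baseline can be as simple as latency percentiles for a representative query. The following sketch assumes a nearest-rank percentile and uses a stand-in workload in place of a real database call; the function names are illustrative.

```python
import math
import time
from typing import Callable


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies(run_query: Callable[[], object], runs: int) -> list[float]:
    """Time `run_query` repeatedly and return latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


# Stand-in workload; replace with a real query against your database.
latencies = measure_latencies(lambda: sum(range(10_000)), runs=200)
baseline = {f"p{p}": percentile(latencies, p) for p in (50, 95, 99)}
print(baseline)
```

Running the same measurement against the on-premise database and the cloud candidate gives you comparable numbers for sizing the new environment.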
Service Level Agreements (SLA)
The last important factor is understanding the responsibilities of your cloud provider. For example, major cloud companies typically commit to uptime, redundancy, restoration, patching, and protection from attacks such as DDoS, reflection attacks, or SYN floods. If you currently have ongoing expenses for blocking these malicious attacks, you may be able to save on those costs as well.
Now that we have covered data migration strategies, let’s explore what options and tools we can use for database migration. For this, we will focus on the three major cloud providers: AWS, GCP, and Azure.
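Uptime commitments in an SLA are usually expressed as a percentage, and it helps to translate that percentage into the downtime it still allows. A small sketch of the conversion, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Maximum yearly downtime that still meets the given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is worth keeping in mind when comparing SLAs against what your business actually needs.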
Cloud Provider Options and Tools - AWS
Amazon RDS provides a managed database service for the most popular database engines. Supported databases include MySQL, PostgreSQL, MariaDB, Oracle, Microsoft SQL Server, and Amazon Aurora. Chances are your on-premise databases run on one of these well-known platforms. For data migration into these databases, AWS provides the AWS Database Migration Service (AWS DMS). The service assists with migrating databases to AWS quickly and securely. During migration, your source database can remain fully operational, which minimizes downtime and business impact. Using AWS DMS, you can migrate between homogeneous database engines (MySQL to MySQL, for example) or between heterogeneous platforms (Oracle to Postgres, for example).
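An AWS DMS task is told which schemas and tables to migrate through a JSON document of table-mapping rules. The sketch below only assembles such a document and makes no AWS calls; the schema and table names are hypothetical, and the rule fields reflect the DMS selection-rule format as we understand it, so verify them against the current AWS documentation before use.

```python
import json


def selection_rule(rule_id: int, schema: str, table: str = "%") -> dict:
    """Build one selection rule; "%" is a wildcard matching any name."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }


# Hypothetical schema names; replace with the schemas you plan to migrate.
table_mappings = json.dumps(
    {"rules": [selection_rule(1, "sales"), selection_rule(2, "inventory", "stock_%")]},
    indent=2,
)
print(table_mappings)
```

With the AWS SDK for Python (boto3), a string like this would typically be passed as the `TableMappings` argument when creating a replication task, alongside a migration type such as full load, ongoing replication, or both.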
Cloud Provider Options and Tools - GCP
Google Cloud Platform supports major RDBMS platforms such as MySQL, Postgres, and SQL Server under the service called Cloud SQL. In addition, it offers non-traditional database services. These include Cloud Spanner, a cloud-native relational database built for large scale, strong consistency, and up to 99.999% availability. Google also offers Bare Metal Solution, which is designed to lift and shift Oracle workloads to Google Cloud. GCP also has a Database Migration Service to move your data into Cloud SQL. The GCP migration service lets you move data across both homogeneous and heterogeneous DB platforms.
Cloud Provider Options and Tools - Azure
On Azure, you can use SQL Server, PostgreSQL, MySQL, MariaDB, Azure SQL Database, and Azure SQL Managed Instance for relational database service. Like the other two cloud platforms, Azure also has a Database Migration Service to assist with data migration efforts. Using the managed data migration service, you can move data and server objects, including user accounts, agent jobs, and SSIS (SQL Server Integration Services) packages, from an on-premise database to a new RDBMS hosted on the cloud platform. Like the other two providers, Azure allows homogeneous and heterogeneous migrations.
Using Arcion for Migration
Aside from the cloud provider tools, you can also migrate data using Arcion. Arcion supports a wide range of source and target DB engines and big data platforms. It is also cloud agnostic, meaning that you can use it across all of the cloud providers without having to relearn or use different configurations depending on what service you are using.
Using Arcion, you can migrate data to your new platform while enjoying the benefits of zero downtime, zero data loss, and automatic schema migration. You pick a target platform, NoSQL or SQL, and the data and the schema are migrated automatically. Zero downtime and zero data loss are possible because you can also enable data replication. With data replication enabled, changes made to the source after the migration started are captured and applied to the new database, keeping it up to date. When you decide to switch applications and services over to your new database, you can be sure that even the latest data is available.
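The general pattern behind this kind of migration, an initial snapshot followed by replaying captured changes, can be illustrated with plain Python dictionaries. This is a conceptual sketch only and does not represent Arcion's implementation or API; the `Change` type and the sample data are hypothetical.

```python
from typing import NamedTuple, Optional


class Change(NamedTuple):
    op: str                      # "upsert" or "delete"
    key: int
    value: Optional[str] = None  # unused for deletes


def apply_changes(table: dict, changes: list[Change]) -> None:
    """Replay captured changes, in order, against a table."""
    for change in changes:
        if change.op == "upsert":
            table[change.key] = change.value
        elif change.op == "delete":
            table.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")


source = {1: "Ada", 2: "Grace"}

# Step 1: take the initial snapshot while the source stays online.
target = dict(source)

# Step 2: writes that reach the source after the snapshot are captured.
captured = [
    Change("upsert", 3, "Linus"),
    Change("upsert", 1, "Ada L."),
    Change("delete", 2),
]
apply_changes(source, captured)  # what the live application did

# Step 3: replay the captured changes so the target catches up before cut-over.
apply_changes(target, captured)
print(target == source)
```

Because the changes are replayed in the order they occurred, the target converges on the same state as the source, which is what makes a cut-over without data loss possible.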
Common Migration Paths and Helpful Resources
When you decide to migrate databases to the cloud, it makes sense to utilize the cloud provider’s managed database services. These services provide a range of benefits that you cannot access with on-premise databases, including the ability to spin up new production-ready databases in a few clicks. This is a common path when you replatform your databases, for example when you choose a different target database engine for your cloud migration. There can be many different scenarios when it comes to migrating from one database to another. These options could include moving from an Oracle Database to Amazon Aurora, SQL Server to Amazon RDS, or Apache Cassandra to Amazon Keyspaces. Below we will go over five potential migration options and share some of the motivations for choosing each path.
Oracle and SQL Server to Amazon Aurora
The biggest driver to move from Oracle or SQL Server to Amazon Aurora is reduced cost while maintaining the same performance. AWS claims that “If you're using a commercial database such as Microsoft SQL Server or Oracle Database, Aurora can provide the same performance at one-tenth the cost.” In addition to the cost savings, Aurora provides better scalability, availability, and durability thanks to the cloud environment that it is deployed on. You can add up to fifteen low-latency read replicas to enhance scalability and read capacity. Aurora also keeps six copies of your data across three availability zones to protect it, and continuously backs the data up to Amazon S3 for durability. Since this all runs on top of AWS, there are many cost and performance benefits to using Aurora over the mentioned alternatives.
Oracle and SQL Server to Amazon RDS
In Amazon RDS, you can choose to run popular database platforms such as MySQL, PostgreSQL, MariaDB, Oracle, and Microsoft SQL Server. So, if replatforming is not an option, you can take the rehosting approach and use the same DB engine in the cloud. This way, you can minimize the amount of refactoring work while benefiting from what the cloud environment can offer in terms of cost savings and performance enhancements. RDS is a fully managed relational database service that allows users to operate databases flexibly and securely. Since it’s a managed service, AWS takes care of patching and upgrading instances, typically within a maintenance window you choose to keep disruption to a minimum. You can also implement more robust monitoring and alerting using other AWS services such as Amazon CloudWatch and Amazon SNS. Although the cost savings are not as dramatic as with Amazon Aurora, you can still reduce costs while sticking with your original DB engine.
Apache Cassandra databases to Amazon DynamoDB
Apache Cassandra is an open-source NoSQL database engine, and DynamoDB is a managed NoSQL database platform on AWS. DynamoDB can auto-scale and export data into S3 for analysis and for consumption by other analytical services. DynamoDB protects data by encrypting it at rest with AES-256 (Advanced Encryption Standard). The managed NoSQL DB is designed for low-latency, real-time workloads, and you can scale up throughput and storage easily on AWS. To ensure data safety, you can enable automatic backups and use the restore features.
Apache Cassandra to Amazon Keyspaces
Migrating Cassandra to DynamoDB requires a replatforming approach since there are some differences in the underlying data model. In contrast, Amazon Keyspaces is a fully managed service that is compatible with Apache Cassandra. If you want to choose a rehost option for your Cassandra database, you could consider migrating into Keyspaces. The managed NoSQL service is scalable and highly available, taking advantage of AWS cloud infrastructure. You can largely keep the same application code and tools you used for your on-premise Cassandra. Like other managed services, you don’t have to provision, patch, or operate software since it’s all taken care of by AWS at the platform level. Furthermore, Keyspaces works on a serverless architecture, which allows you to pay only for your usage and further reduces running costs.
Netezza to Amazon Redshift
Netezza is a data warehouse solution from IBM. Netezza stores data row by row in data blocks, while Amazon Redshift stores data in a columnar fashion. When you handle massive analytics workloads, column-based storage can give users dramatic performance benefits. Redshift is particularly efficient at query processing. For example, when you have a table that has 100 columns and 1 billion rows and want to calculate the average of a single column, a row-based method scans the entire table, which requires massive I/O overhead. In Redshift, only the queried column is read for the calculation, so the result is returned much faster. Plus, because Redshift is a managed cloud service, users can use concurrency scaling to scale out and back in. Redshift can scale both vertically, by upgrading memory and CPU, and horizontally, by adding nodes.
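The difference between the two layouts can be illustrated with a toy example in Python. This is a simplified model of row-oriented versus column-oriented storage, not a description of how Netezza or Redshift are implemented; the table contents and function names are made up for illustration.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 100.0},
]


def average_row_store(records: list[dict], column: str) -> tuple[float, int]:
    """Row layout: every record is fetched in full to read a single field."""
    values_read = 0
    total = 0.0
    for record in records:
        values_read += len(record)  # the whole row is touched
        total += record[column]
    return total / len(records), values_read


# Column layout: the values of each column are stored together.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def average_column_store(cols: dict[str, list], column: str) -> tuple[float, int]:
    """Column layout: only the queried column is read."""
    values = cols[column]
    return sum(values) / len(values), len(values)


print(average_row_store(rows, "amount"))        # (average, values touched)
print(average_column_store(columns, "amount"))  # same average, fewer values touched
```

Both functions return the same average, but the row layout touches every value in the table while the column layout touches only one column. With 100 columns instead of three, that gap grows accordingly.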
In this article, we discussed how to migrate databases to the cloud and shared the 6 R's strategies and tactics to inform and guide you on deciding the best options and path for your data migration efforts. Depending on your strategy, you could migrate into a new database platform on cloud, maximizing the benefits of a new target database and cloud infrastructure. You could also migrate to the same database engine to minimize refactoring efforts and take advantage of data migration tools and managed services offered by the cloud provider. Before migration, it is important to perform analysis on which database candidates are most feasible for your use case and then gauge efforts and impact.
To learn more about managed database migration into the cloud, check out the documentation supplied by some of the major cloud service providers we discussed today.
- AWS Managed Migration Service
- GCP Managed Migration Concepts Principles
- Azure Managed Migration Service
If you want the most flexibility, you should also consider checking out Arcion for database migrations. Beyond what the cloud providers offer, Arcion enables zero downtime, zero data loss, and automatic schema migration to make even the most complex migrations secure and simple. Sign up for a free Arcion Cloud account (limited connectors) or book a personalized demo to see Arcion Self-hosted in action!