Data Migration Tools: What They Are and Why You Need Arcion

Matt Tanner
Developer Relations Lead
August 4, 2022

Getting started with cloud data migration

Data migration is the movement of data from one source or computing environment to another. The data being transferred could be files, databases, or even entire storage systems. At some point in time, every system and application is going to require some type of data migration. As of late, this happens even more frequently since technology is evolving at an ever-increasing pace. Unlocking the latest use cases for your data usually means using the newest technologies. Using these technologies is likely going to involve some sort of data migration.

These are just a few of the reasons why data migration has become so important, so much so that companies are spending billions of dollars migrating data.

Today, businesses generate huge amounts of data. Data is a company's most valuable asset and its importance is constantly increasing. Companies face a great deal of pressure to extract the maximum value from this data. Failing to do so means risking being left behind by competitors. In other words, to be a successful company, you need to be using the best and fastest technologies for your applications and data. Beyond that, your data needs to be stored efficiently and securely, and it needs to be easily accessible.

This is why data migration is an important piece of cloud migration and of keeping organizations relevant and evolving for the better.

Understanding the factors to keep in mind for data migration tools 

Earlier, I said that data migration is generally done to enhance performance, increase competitiveness, and modernize existing systems. However, for these benefits to be fully realized, the migration has to be done right. Many companies have had failed migration strategies, and an abundance of horror stories circulate within the technical sphere. Much of the time, the failure comes down to the new systems failing to meet expectations.

A complete and accurate data migration strategy ensures business continuity and data integrity. A robust migration plan lowers costs, reduces manual effort, and helps prevent incomplete execution of the migration. Cost overruns, excessive manual work, and incomplete execution are all common reasons that migration projects fail altogether or fall short of their potential.

Therefore, selecting the right data migration tool is a critical decision for the success of the project. The right tool will depend on the company's needs and goals. Below are a few factors you should consider when choosing a data migration service or tool:

Pricing

It's important to start with a look at your budget and the pricing factor for each tool. This will give you an idea of which tools are applicable or not within your projected budget. This is key in figuring out if you are financially able to acquire good data migration tools. 

There are a few ways to go about understanding the overall price for the migration. First, you can choose to go with open source tools. These are usually on the lower end of cost, but to use them effectively, you'd need the right expertise to act as the glue for the whole operation. Licensing costs may be minimal, but you will need to prepare for training and labor costs. Another option is to use cloud-based migration tools and services. This could save you significant amounts on infrastructure and labor costs, expedite the timeline, and free up resources for other projects.

Performance and Scalability 

There's only one solution for this factor that is truly easy and cost-effective: cloud-based tools and services. The elastic capacity of the cloud allows these tools to scale up and down depending on the workload. Scalability is built into the fabric of these platforms, giving you access to what is essentially unlimited compute and storage resources.

In a cloud environment, performance and scalability generally outpace what on-premise tools can offer. On-premise tools are limited by the hardware on which they run. For an on-premise tool to scale, hardware such as additional memory and storage must be added to the server. Alternatively, complete servers may be added to the installation. This can be very cost-intensive, can require downtime, and only allows for scaling upwards; it doesn’t provide the ability to easily scale down as needed.

Location

This is another important factor when considering data migration tools. You have to consider whether you want to migrate data from: 

  • On-premise to on-premise 
  • On-premise to the cloud
  • One cloud to another cloud

Depending on where you’ll be executing the migration, certain tools may or may not be applicable. For instance, most on-premise tools can be deployed on the cloud as well. However, most tools provided on cloud platforms are only available on the cloud, and some may only be available on specific cloud platforms.

Reliability

How much unreliability can you tolerate from the data migration tool you’ll be using? Cloud-based data migration tools tend to be more reliable than on-premise tools since they can more easily achieve close to 100% uptime. This is due to a cloud environment’s ability to easily offer highly redundant architectures. If uptime is high, the chances of a data migration failing in the middle of the transfer are lower.

Security

During data migration, the tools you use are subject to your security and compliance requirements. Data should be encrypted both in flight and at rest. Most cloud data migration solutions and services offer enterprise-grade security by default. At times, relying on the provider can be limiting if your organization needs to adhere to a security standard that the cloud provider does not support.

On-premise solutions must be configured to be secure. This potentially makes on-premise systems less secure since they depend on the security of your overall infrastructure, which could be misconfigured. Even with this possibility, certain security and compliance standards are more straightforward to achieve with an on-premise deployment. HIPAA or PCI compliance, for instance, can sometimes be easier to achieve with an on-premise deployment.

Understanding types of data migration tools

Data migration tools come in a few different flavors: self-scripted, on-premise/self-managed, and cloud-based. Let’s take a look at each of these options.

Self Scripted

If the project is a small one, self-scripted data migration can be an efficient way to tackle it. Unfortunately, this “do-it-yourself”, in-house solution doesn’t scale very well. Usually, this type of approach uses a script that extracts data into an intermediary file, which is then read line by line and loaded into the target system. Sometimes the file is instead ingested through a bulk process on the target system, such as a CSV import.
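
The extract-then-load flow described above can be sketched in a few lines of Python. This is only an illustrative sketch: it uses two in-memory SQLite databases as stand-ins for the source and target systems, and the `customers` table, its columns, and the function names are assumptions made up for the example.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def extract_to_csv(source: sqlite3.Connection, table: str, csv_path: Path) -> int:
    """Dump every row of `table` into an intermediary CSV file; return the row count."""
    cursor = source.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    count = 0
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)  # header row carries the column names
        for row in cursor:
            writer.writerow(row)
            count += 1
    return count


def load_from_csv(target: sqlite3.Connection, table: str, csv_path: Path) -> int:
    """Read the CSV line by line and insert each row into the target table."""
    count = 0
    with csv_path.open(newline="") as handle:
        reader = csv.reader(handle)
        columns = next(reader)
        placeholders = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
        for row in reader:
            # CSV values arrive as strings; SQLite column affinity converts
            # them back here. NULLs would need explicit handling.
            target.execute(sql, row)
            count += 1
    target.commit()
    return count


def demo() -> list:
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    source.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    source.commit()
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "customers.csv"
        extract_to_csv(source, "customers", csv_path)
        load_from_csv(target, "customers", csv_path)
    return target.execute("SELECT id, name FROM customers ORDER BY id").fetchall()


print(demo())
```

Even in this tiny example, type information is lost in the CSV file, and rows written to the source after the extract are never picked up. Both are versions of the cons listed below.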

Pros

  • Usually a quick method to get up and running
  • Requirements are simple 
  • Cost-effective when done for a small project

Cons

  • Coding skills required
  • Changing needs require substantial re-work/increase in cost
  • Changes can be difficult if the code is not well-documented
  • Can be tough to keep data in sync if system is still receiving live data at time of migration

On-Premise Tools

Unlike self-scripted solutions, these tools are designed to migrate data within the network of a large or medium enterprise installation. On-premise tools for data migration are abundant and easily accessible.

Pros

  • The IT team has control of the full stack, from physical hardware to security and application configuration
  • Low latency if moving data within the same network
  • Predictable costs since costs are static and not metered with usage

Cons

  • IT team must manage security and software updates
  • IT team must keep tools up and running
  • Scalability is limited unless major investments are made to increase scale (adding extra servers/memory and licenses)
  • Certain tools may not be available for on-premise deployments

Cloud-Based Tools

Cloud-based data migration tools may be a better choice for organizations moving data from various sources and streams, including on-premise and cloud-based data stores to a cloud-based destination.

Pros

  • Easily scalable to handle changing business needs
  • Pay-as-you-go pricing eliminates spending on unused resources
  • On-demand compute power and storage handles spikes in demand through auto-scaling up and down

Cons

  • Security concerns - real or perceived - may lead to internal resistance
  • The solution may not support all required data sources and destinations
  • Post-paid billing and orphaned resources may lead to surprises for the finance department

Most popular data migration tools

When migrating to modern architecture from legacy systems with no downtime, it’s important to keep both legacy systems and modern distributed databases in sync. That's why it's important to select a database migration tool that supports Change Data Capture (CDC). CDC picks up transactions from the source, transforms the data for the target system, and applies the transactions on the target. To keep both systems in sync, the tool needs to perform continuous data replication from the legacy system to the new distributed database. A good CDC implementation provides transactional integrity and the ability to repair and replay, and it handles DDLs and bi-directional data replication.

Three main requirements of a cloud-native CDC solution are:

1. Transactional integrity, where data on the target is consistent with the source and the global order of transactions matches the source.

2. Scalability, where extract and load move data at high volume and velocity while also being able to auto-scale up and out.

3. Low data latency, with replication that has minimal impact on the source.
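
To make the first requirement more concrete, here is a minimal Python sketch of applying captured changes to a target in source order. It is a simplified illustration, not how any particular CDC product works: the `ChangeEvent` shape, the `sequence` field, and the dictionary standing in for the target table are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    sequence: int               # hypothetical position in the source's transaction log
    operation: str              # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents (unused for deletes)


def apply_changes(target: dict, events: list, last_applied: int = 0) -> int:
    """Apply events to `target` in source order and return the new high-water mark."""
    for event in sorted(events, key=lambda e: e.sequence):
        if event.sequence <= last_applied:
            continue  # already applied, so replaying a batch is safe
        if event.operation in ("insert", "update"):
            target[event.key] = dict(event.row or {})
        elif event.operation == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.operation}")
        last_applied = event.sequence
    return last_applied


# Events may arrive out of order; they are applied by sequence number.
events = [
    ChangeEvent(2, "update", 1, {"name": "Ada L."}),
    ChangeEvent(1, "insert", 1, {"name": "Ada"}),
    ChangeEvent(3, "insert", 2, {"name": "Grace"}),
    ChangeEvent(4, "delete", 2),
]
target = {}
mark = apply_changes(target, events)
replayed_mark = apply_changes(target, events, mark)  # replay changes nothing
print(target, mark)
```

Sorting by sequence number preserves the source's order of transactions, and the high-water mark is what makes a replay harmless. A real pipeline would also need to group events by transaction and persist the mark durably.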

Having covered the details surrounding data migration, let’s now take a look at some of the best data migration tools available on the market.

Arcion

Arcion was designed from the ground up to help organizations power their most complex and advanced data migration use cases. We have developed a revolutionary zero-code solution that enables enterprises to deploy high-performance and high-volume data pipelines in minutes, instead of months. Whether you require an on-premise or cloud-based self-managed deployment, prefer a SaaS-based solution, or want to adopt a hybrid solution, Arcion can handle the job. Arcion offers both Arcion Self-Managed (free download with 30-day free trial, no payment info required) and Arcion Cloud (14-day free trial, no payment info required) to serve all of your needs. Both of these options offer reliable data migration, all with zero downtime.

Here are a few unique advantages offered by Arcion to achieve zero downtime database migration:

  • At Arcion, we have a library of 20+ pre-built connectors that includes OLTP databases (e.g., Oracle, IBM DB2, SAP, SQL Server, MongoDB, Postgres, MySQL, etc.), data warehouses (e.g., Teradata, Oracle Exadata/RAC, RedShift, SAP IQ, SAP Hana, etc.), and cloud analytic platforms (e.g., Databricks, Snowflake, SingleStore, etc.). 
  • Distributed & highly scalable architecture (patent-pending technology): many other existing CDC solutions don’t scale for high-volume, high-velocity data, resulting in slow pipelines and slow delivery to the target systems. Arcion is the only end-to-end multi-threaded CDC solution on the market that auto-scales vertically & horizontally. Any process that Arcion runs on Source & Target is parallelized using patent-pending techniques to achieve maximum throughput. There isn’t a single step within the pipeline that is single-threaded. This gives Arcion users ultra-low latency CDC replication that can keep up with the ever-increasing data volume on Source.
  • Built-in high availability (HA): Arcion is designed with high availability built-in. This keeps the pipeline running without disruption and keeps data available in the target in real time.
  • Auto-recovery (patent-pending technology): Internally, Arcion does a lot of check-pointing. Therefore, any time the process gets killed for any reason (e.g., database, disk, network, server crashes), it resumes from the point where it was left off, instead of restarting from scratch. The entire process is highly optimized with a novel design that makes the recovery extremely fast.  
  • Automatic schema conversion & schema evolution support: Arcion handles schema changes out-of-the-box requiring no user intervention. This helps mitigate data loss and eliminate downtime caused by pipeline-breaking schema changes by intercepting changes in the source database and propagating them while ensuring compatibility with the target's schema evolution. Other alternative solutions would reload (re-do the initial load) the data when there is a schema change in the source databases, which causes pipeline downtime and requires a lot of compute resources (expensive!).
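
The check-pointing idea mentioned in the auto-recovery point above can be illustrated generically. The following Python sketch is not Arcion's implementation; it only shows the general pattern of recording progress so that a restarted job resumes where it left off. The checkpoint file layout, the function names, and the `fail_at` parameter used to simulate a crash are all made up for the example.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> int:
    """Return the index of the next unprocessed record (0 if no checkpoint exists)."""
    if path.exists():
        return json.loads(path.read_text())["next_index"]
    return 0


def save_checkpoint(path: Path, next_index: int) -> None:
    """Write to a temp file, then rename, so a crash never leaves a half-written checkpoint."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"next_index": next_index}))
    tmp.replace(path)


def copy_with_checkpoints(records, sink, checkpoint_path, fail_at=None):
    """Copy records into `sink`, resuming from the last saved checkpoint."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(records)):
        if fail_at is not None and index == fail_at:
            raise RuntimeError("simulated crash")
        sink.append(records[index])
        # A crash between the append and this save would repeat one record
        # on resume (at-least-once delivery), so real sinks should be idempotent.
        save_checkpoint(checkpoint_path, index + 1)


def demo() -> list:
    records = ["a", "b", "c", "d"]
    sink = []
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = Path(tmp) / "checkpoint.json"
        try:
            copy_with_checkpoints(records, sink, checkpoint, fail_at=2)
        except RuntimeError:
            pass  # the job "died" after copying two records
        copy_with_checkpoints(records, sink, checkpoint)  # resumes at index 2
    return sink


print(demo())
```

The second call picks up at the third record instead of starting from scratch.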

Hevo Data Migration

Hevo Data is a data migration tool that allows you to transfer data from different sources to a destination of your choice easily and efficiently. Hevo allows users to move data to a data warehouse, run transformations for analytics, and deliver operational intelligence to business tools.

Some of its key features are:

  • Hevo provides real-time migration
  • Hevo has a scalable infrastructure with over 100 SaaS sources like Google Analytics, Salesforce and other database sources as well
  • Hevo claims to be an automated platform that requires minimal maintenance

Hevo Data Migration Pricing:

According to Hevo Data's website, Hevo has a free tier that comes with 1 million free events, 50+ connectors, and a free initial load. The next tier starts with 5 million events at $239/month and includes access to 150+ connectors.

AWS Database Migration Service

AWS Database Migration Service (DMS) is a database migration service from Amazon with the primary goal of migrating databases to AWS in a secure and reliable manner. Another important part of the AWS migration tooling is AWS Glue. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

Some of its key features are:

  • AWS Data Pipeline, a related service, allows you to create a pipeline graphically through a console or by using the AWS Command Line Interface (CLI).
  • It allows the source database to keep running throughout the migration process
  • It is a popular tool and it is always available, which has made it an ideal solution for continuous data migrations
  • It reduces the time for an application to run

AWS Database Migration Service Pricing:

The pricing details published on the Amazon Web Services website for the related AWS Data Pipeline service state that billing is based on how often your activities and preconditions are scheduled to run and on where they run (AWS or on-premises). To calculate the estimated cost, you can check out the AWS Data Pipeline pricing page. For AWS DMS itself, check the AWS DMS pricing page.

Integrate.io/Xplenty

Integrate.io, formerly known as Xplenty, allows you to quickly and easily prepare your data using a simple data integration cloud service. It also has a user-friendly interface that connects with a large variety of data stores, including relational databases such as Heroku Postgres and PostgreSQL.

Here are some great features of Integrate.io:

  • Integrate.io allows you to integrate data from multiple sources into a single data pipeline and implement a wide range of other data transformations out-of-the-box, without writing additional code.
  • Integrate.io ensures a secure transfer of a company’s data while migrating from one source to another.
  • Great UI that is easy to use and configure.

Integrate.io's pricing:

Integrate.io doesn't disclose pricing. The company offers a 7-day free trial to users who request a product demo. 

IBM Informix 

Informix by IBM is another excellent tool that can be used to move data from one IBM database to another IBM database. Informix has the unique ability to seamlessly integrate and migrate SQL, NoSQL/JSON, time series, and spatial data. With the help of the hybrid cloud infrastructure, Informix helps businesses with data migration projects.

Here are some great features of IBM Informix:

  • Perform analytics close to data sources to enhance local decision-making. Access business intelligence faster with enhanced integration with various tools and applications. 
  • Ensure always-on operations across your grid environment. Upgrade, maintain, and configure the grid with no downtime.  Successfully meet service-level agreements.
  • IBM Informix delivers fast data transfer, which helps enable real-time analytics on transactional workloads.

IBM Informix's pricing:

IBM Informix has a free developer edition. For the Informix Cloud's pricing, the small package starts at $1,250/Instance, the medium package at $2,200/Instance, the large package at $4,000/Instance, and the extra large package at $8,000/Instance. 

Fivetran

Fivetran is a comprehensive ETL tool that takes your data from a system like QuickBooks and inserts it into your centralized data storage system (often referred to as a data warehouse). It offers data integration built on a fully managed ELT (extract, load, transform) architecture. Note that Fivetran mainly focuses on SaaS integrations. In September 2021, it acquired HVR, a CDC solution provider that has a database migration solution. We wrote another detailed blog post that compares popular CDC solutions if you're interested.

Here are some great features of Fivetran:

  • Eliminates the need to hire a data engineer to build data pipelines that connect various SaaS applications. 
  • Fivetran does all this by offering a myriad of source and destination connectors that allow for both pushing and pulling of data
  • After loading the data, Fivetran offers data teams the ability to easily set up custom data transformations. 

Fivetran pricing:

Fivetran offers credits that are based on a customer's monthly active rows (MAR) in a given period. Check the Fivetran website for more on the pricing.

Azure DocumentDB

The Azure DocumentDB Data Migration Tool is owned by Microsoft. This tool allows users to migrate data to DocumentDB, which is a NoSQL document database. The tool enables migration of data to Azure DocumentDB from CSV files, JSON files, SQL, Azure DocumentDB, MongoDB, and many other sources.

Here are some great features of Azure DocumentDB:

  • It supports a wide range of Windows operating systems and requires .NET Framework 4.5.1 or higher.
  • Automatic scaling of backend database storing data and documents
  • It provides real-time data streaming services that are best for transactional data.

Azure DocumentDB Data Migration Tool pricing:

The Azure DocumentDB Data Migration Tool is a free, open source tool, but it only migrates into DocumentDB.

Stitch Data Migration

Stitch is a self-serve ETL tool that you can use to connect all your data sources to your data warehouse. In the Stitch context, data sources are called integrations, and your target storage is called the destination. Stitch is also focused on being a no-code solution that rapidly moves data from 130+ sources into the data warehouse of your choice.

Here are some great features of Stitch:

  • It allows you to spend less time managing your data pipeline, and focus on generating valuable insights from it.
  • Quickly identify and fix any issues you might come across in your data pipeline with Stitch’s efficient error handling mechanism
  • Route your data from a single data source to multiple destinations to power your use-cases

Stitch Data Migration pricing:

Stitch offers three tiered plans ranging from $100 to $1,250 per month, depending on the required number of rows (in millions) per month. It also offers customers a 14-day free trial.

Matillion

An article by TechCrunch described Matillion as “a platform to help companies harness their data so that it can be used, and what’s more, the platform is not just for data scientists, but it’s written with a ‘low-code’ approach that can be used by a wider group of users”. Matillion is a cloud-based ETL platform that focuses on data integration from SaaS applications and a few database connectors to cloud data warehouses. Note that Matillion does not directly do Change Data Capture; it connects users to AWS DMS for CDC features.

Here are some great features of Matillion:

  • Matillion possesses over 100 connectors to popular online SaaS services that you can connect to and pull your data from.
  • Push-down ELT technology uses the power of your data warehouse to process complex joins over millions of rows in seconds.
  • Collaboration baked-in. Build jobs together in disparate locations, like in Google Docs.

Matillion pricing:

Matillion’s pricing depends on the platform on which the customer’s data warehouse runs. A customer can use the tool for free for a 14-day period. It also offers the following plans:

  • Basic at $2.00 per credit
  • Advanced at $2.50 per credit
  • Enterprise at $2.70 per credit

DBConvert Studio

Another popular tool for database migration and synchronization is DBConvert Studio. It converts database structures and data between different database formats.

Here are some great features of DBConvert Studio:

  • With DBConvert, you can seamlessly convert database structure and data between various formats, synchronize data spread across multiple databases, and perform cross-database migration
  • A flexible built-in Scheduler can be used to launch tasks at a specific time without running the GUI
  • It allows total management of databases, enables quick configuration & connection, keeps data secure, and makes managing large amounts of data more efficient.

Informatica Cloud Data Wizard

Informatica Cloud Data Wizard is a flexible data loader for any type of user. It is a Salesforce app that allows its users to easily synchronize common as well as custom Salesforce objects with CSV files, SaaS applications such as NetSuite, and cloud storage like Box.

Here are some great features of Informatica Cloud Data Wizard:

  • Admins can establish connections with external applications and conduct on-the-fly transformations.
  • It provides in-app integration to enhance user productivity.
  • Informatica was given the Leader title in the 2021 Gartner Magic Quadrant for Data Integration Tools

Configero Data Loader

The Configero Data Loader for Salesforce is a native, platform web application for importing thousands of records into Salesforce. Using its grid technology, users can dynamically filter records by any custom or standard object to see a list of relevant records available for mass editing and updating before loading data into Salesforce.

Here are some great features of Configero Data Loader:

  • Loads 50,000+ records in 20 seconds
  • Wizard-driven for ease of use
  • External ID matching (auto-save capabilities enabled)
  • Dynamic multi-column filtering

Rocket Data Migration

Rocket Data Migration solutions cover all aspects of data migration. They are designed to support established migration procedures with minimal manual effort, and they provide any level of support required throughout the migration.

Here are some great features of Rocket Data Migration:

  • Ensures data integrity by safeguarding against data corruption or loss.
  • Reduces storage costs and thereby improves return on investment.
  • Minimizes the interference of migration activities in meeting the daily objectives.

Brocade's DMM (Data Migration Manager) 

Brocade's Data Migration Manager (DMM) can automatically create migration pairs, implement automatic zoning, and can use a scheduler feature to perform tasks at preset times. It is an innovative, powerful and performance-oriented solution for high-performance data migration activities. It works efficiently in heterogeneous storage environments. 

Here are some great features of Brocade's DMM (Data Migration Manager):

  • It delivers throughput of up to one terabyte (TB) per hour.
  • It optimizes migrations across Storage Area Networks (SANs) with an easy setup facility.
  • It has a unique feature to determine the migration time in advance. This feature allows the clients to plan time, budget, and resources associated with the migration activity.

HDS Universal Replicator

Hitachi Universal Replicator (HUR) is an array-side software and journal disk caching solution that offers fast and efficient replication of small or large data volumes.

Here are some great features of the HDS Universal Replicator:

  • Replicate more data in less time to remote locations.
  • HDS replicator reduces resource consumption and provides significant data protection.
  • It maintains redundant data centers in distant geographies for rapid recovery from disasters.

Apex Data Loader

The Apex Data Loader is a Salesforce product. It is a universal tool used for bulk importing and exporting data. It can insert, update, and completely delete large amounts of Salesforce records.

Here are some great features of the Apex Data Loader:

  • It is easy to use and allows customers to get data into Salesforce objects
  • When importing records, the data loader will read, extract, and load information from CSV files or from a database.

Talend Open Studio

Talend Open Studio is a free open source ETL tool for Data Integration and Big Data. It is an Eclipse-based developer tool and job designer. 

Here are some great features of the Talend Open Studio:

  • Provides all features needed for data integration and synchronization with 900 components, built-in connectors, converting jobs to Java code automatically, and much more.
  • Reduces storage costs and thereby improves return on investment.

Rsync

Rsync is a free command-line tool that lets you transfer files and directories to local and remote destinations. Rsync can be used for mirroring, performing backups, or migrating data to other servers.

Here are some great features of Rsync:

  • It acts as a file synchronization and data transfer program.
  • It uses SSH to connect to the remote system and invokes the remote host’s Rsync to determine which parts of data need to be transferred over the secure connection.
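
As a rough illustration of mirroring a directory to a remote server over SSH, the sketch below builds an Rsync command line in Python. The host `backup@example.com` and the `/srv/data` paths are placeholders, and `build_rsync_command` is a helper invented for this example. The flags used (`--archive`, `--compress`, `--rsh`, and `--dry-run`) are standard Rsync options.

```python
import shlex


def build_rsync_command(source_dir: str, remote: str, dest_dir: str, dry_run: bool = True) -> list:
    """Build an rsync argument list that mirrors source_dir to remote:dest_dir over SSH."""
    command = ["rsync", "--archive", "--compress", "--rsh=ssh"]
    if dry_run:
        # Report what would be transferred without changing anything.
        command.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    command += [source_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return command


command = build_rsync_command("/srv/data", "backup@example.com", "/srv/data")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where Rsync and SSH access are set up. Starting with a dry run is a cautious default before moving real data.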

EMC Rainfinity 

EMC Rainfinity is a product of Dell EMC Corporation. It is designed to help organizations to reduce storage management costs. EMC Rainfinity makes moving files to a different system transparent for both Unix and Windows users, with no downtime and minimum disruption.

Here are some great features of EMC Rainfinity:

  • It implements automated file archiving algorithms that can perform data migration across heterogeneous servers and NAS environments.
  • It comes with easy-to-use wizards to transparently move files across NAS and CAS.

Conclusion

Choosing a tool for migrating data can be overwhelming. As you can see above, many tools exist that cater to a variety of use cases. Because the tool is one of the largest factors in the success of a migration project or digital transformation, it is important that you choose the most flexible, reliable, and scalable solution for your needs. A successful data migration project can open up many doors for your business and make life much better for the technical teams building the future of your product. This guide has walked through the options available and their strengths and weaknesses. With this knowledge of the tools available, we hope that your next data migration project is a success.

Back to Blog
Data Migration Tools: What They Are and Why You Need Arcion

Data Migration Tools: What They Are and Why You Need Arcion

Matt Tanner
Developer Relations Lead
August 4, 2022

Understanding the factors to keep in mind for data migration tools 

Earlier, I said the goal of data migration is generally done to enhance performance, increase competitiveness, and to modernize existing systems. However, in order for these benefits to be full seen, it has to be done right. So many companies have had failed migration strategies and an abundance of horror stories often circulate within the technical sphere. Much of the time the failure revolves around the fact that many of the new systems fail to meet expectations. 

A complete and accurate data migration strategy ensures business continuity and data integrity. A robust migration plan looks to lower costs, reduce manual effort, and helps prevent incomplete execution of the migration. All of these reasons are common scenarios that cause migration projects to fail altogether or fall short of their potential.

Therefore, selecting the right data migration tool becomes an imperative decision for the success of the project. Each decision to use a specific tool will depend on the company's needs and goals. Below are a few factors you should consider when choosing a data migration service or tool:

Pricing

It's important to start with a look at your budget and the pricing factor for each tool. This will give you an idea of which tools are applicable or not within your projected budget. This is key in figuring out if you are financially able to acquire good data migration tools. 

Basically, there are a few ways to go about understanding the overall price for the migration. Firstly, you can choose to go with open source tools. These are usually on the lower end of cost but, to use them effectively you'd need the right expertise to act as a glue for the whole operation. Licensing costs may be minimal but you will need to prepare for training and labor costs. Another way would be to use cloud-based migration tools and services. This could save you significant amounts on infrastructure and labor costs and expedite the timeline. This approach can help to free up resources for other projects.  

Performance and Scalability 

There's only one solution for this factor that is truly easy and cost-effective. That solutions is to use cloud-based tools and services. The storage ability of the cloud allows it to be able to scale up and down depending on the workload. The scalability of these platforms is built into the fabric of the tools and services they offer, essentially giving unlimited compute and storage resources.

By using a cloud environment,  the performance and scalability automatically trumps on-premise tools. On-premise tools lack the ability to scale due to being limited by the hardware on which they run. In order for an on-premise tools to scale, hardware, including additional memory and storage, must be added to the server. Alternatively, complete servers may be added to the installation as well. This can be very cost intensive, require downtime, and only allows for scaling upwards and doesn’t provide the ability to easily scale down as needed.

Location

This is another important factor when considering data migration tools. You have to consider whether you want to migrate data from: 

  • On-premise to on-premise 
  • On-premise to the cloud
  • One cloud to another cloud

Depending on the location that you’ll be executing the migration on, certain tools may or may not be applicable. For instance, most on-premise tools can be deployed on the cloud as well. However, most tools provided on cloud platforms are only available on the cloud and some may also only be available on specific cloud platforms.

Reliability

How much unreliability can you tolerate from the data migration tool you’ll be using? Cloud-based data migration tools tend to be more reliable than on-premise tools since they can more easily achieve close to 100% uptime. This is due to a cloud environment’s ability to offer highly redundant architectures. If uptime is high, the chances of a data migration failing in the middle of a transfer are lower.

Security

During data migration, the tools you use are subject to your security and compliance requirements. Data should be encrypted both in-flight and at-rest. Most cloud data migration solutions and services offer enterprise-grade security by default. At times, this can be limiting if your organization needs to adhere to a security standard that the cloud provider does not support.

On-premise solutions must be configured to be secure. This potentially makes on-premise systems less secure since they depend on the security of your overall infrastructure, which could be misconfigured. Even with this possibility, certain security and compliance standards are more straightforward to achieve with an on-premise deployment. HIPAA or PCI compliance, for instance, can sometimes be easier to achieve with an on-premise deployment.

Understanding types of data migration tools

Data migration tools come in a few different flavors: self-scripted, on-premise/self-managed, and cloud-based. Let’s take a look at each of these options.

Self Scripted

If the project is a small one, self-scripted data migration can be an efficient way to tackle it. Unfortunately, this “do-it-yourself”, in-house solution doesn’t scale very well. Usually, this type of approach uses a script that extracts data into an intermediary file, which is then either read line-by-line and loaded into the target system or ingested through a bulk process on the target system, such as a CSV import.
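As a rough illustration of the extract-then-load pattern described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `customers` table, its columns, and the use of two in-memory SQLite databases as stand-ins for the real source and target systems. A real script would connect to your actual databases and would need error handling and type mapping.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def extract_to_csv(source: sqlite3.Connection, table: str, csv_path: Path) -> int:
    """Dump every row of `table` into an intermediary CSV file; return the row count."""
    # The table name is interpolated directly, so only pass trusted identifiers.
    cursor = source.execute(f"SELECT id, name, email FROM {table} ORDER BY id")
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([column[0] for column in cursor.description])  # header row
        count = 0
        for row in cursor:
            writer.writerow(row)
            count += 1
    return count


def load_from_csv(target: sqlite3.Connection, table: str, csv_path: Path) -> int:
    """Read the intermediary file line by line and insert each row into the target."""
    with csv_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        count = 0
        for record in reader:
            target.execute(
                f"INSERT INTO {table} (id, name, email) VALUES (?, ?, ?)",
                (int(record["id"]), record["name"], record["email"]),
            )
            count += 1
    target.commit()
    return count


def migrate_customers() -> list[tuple]:
    # Hypothetical source system with a couple of rows to move.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", "ada@example.com"), (2, "Grace", "grace@example.com")],
    )
    # Hypothetical target system with an empty table of the same shape.
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

    with tempfile.TemporaryDirectory() as workdir:
        csv_path = Path(workdir) / "customers.csv"
        extracted = extract_to_csv(source, "customers", csv_path)
        loaded = load_from_csv(target, "customers", csv_path)
        assert extracted == loaded, "row counts should match after the load"

    return target.execute("SELECT id, name, email FROM customers ORDER BY id").fetchall()


if __name__ == "__main__":
    print(migrate_customers())
```

Note that the script takes a one-time snapshot: rows written to the source after the extract step are not picked up, which is the sync problem listed in the cons below.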

Pros

  • Usually a quick method to get up and running
  • Requirements are simple 
  • Cost-effective when done for a small project

Cons

  • Coding skills required
  • Changing needs require substantial re-work/increase in cost
  • Changes can be difficult if the code is not well-documented
  • Can be tough to keep data in sync if system is still receiving live data at time of migration

On-Premise Tools

Unlike self-scripted migrations, these tools are designed to migrate data within the network of a medium or large enterprise installation. On-premise data migration tools are abundant and easily accessible.

Pros

  • The IT team has control of the full stack, from physical hardware to security and application configuration
  • Low latency if moving data within the same network
  • Predictable costs since costs are static and not metered with usage

Cons

  • IT team must manage security and software updates
  • IT team must keep tools up and running
  • Scalability is limited unless major investments are made to increase scale (adding extra servers/memory and licenses)
  • Certain tools may not be available for on-premise deployments

Cloud-Based Tools

Cloud-based data migration tools may be a better choice for organizations moving data from various sources and streams, including on-premise and cloud-based data stores, to a cloud-based destination.

Pros

  • Easily scalable to handle changing business needs
  • Pay-as-you-go pricing eliminates spending on unused resources
  • On-demand compute power and storage handles spikes in demand through auto-scaling up and down

Cons

  • Security concerns - real or perceived - may lead to internal resistance
  • The solution may not support all required data sources and destinations
  • Post-paid billing and orphaned resources may lead to surprises for the finance department

Most popular data migration tools

When migrating from legacy systems to a modern architecture with no downtime, it’s important to keep the legacy systems and the modern distributed databases in sync. That's why it's important to select a database migration tool that supports Change Data Capture (CDC). CDC picks up transactions from the source, transforms the data for the target system, and applies each transaction on the target. To keep both sides in sync, the tool needs to perform continuous data replication from the legacy system to the new distributed database. A capable CDC solution maintains transactional integrity, can repair and replay changes, and handles DDL changes and bi-directional data replication.

The three main requirements of a cloud-native CDC solution are:

1. Transactional integrity, where data is consistent with the source and the same global order of transactions is maintained as on the source.

2. Scalability, where extract and load operations move data at high volume and velocity while also being able to auto-scale up and out.

3. Low data latency, with replication that has minimal impact on the source.
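To make the pick-up, transform, and apply cycle more concrete, here is a minimal Python sketch of the apply side of CDC. It is not how any particular product works. The `ChangeEvent` shape, the `sequence` field standing in for the source's global transaction order, the `transform` rule, and the in-memory `Target` are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    sequence: int               # position in the source's global transaction order
    operation: str              # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


def transform(row: dict) -> dict:
    """Hypothetical mapping step: the target stores lowercase column names."""
    return {column.lower(): value for column, value in row.items()}


@dataclass
class Target:
    rows: dict = field(default_factory=dict)
    last_applied: int = 0       # highest sequence number already applied

    def apply(self, event: ChangeEvent) -> bool:
        """Apply one event; skip it if it was already applied (safe replay)."""
        if event.sequence <= self.last_applied:
            return False
        if event.operation == "delete":
            self.rows.pop(event.key, None)
        elif event.operation in ("insert", "update"):
            self.rows[event.key] = transform(event.row)
        else:
            raise ValueError(f"unknown operation: {event.operation}")
        self.last_applied = event.sequence
        return True


def replicate(events: list, target: Target) -> int:
    """Apply events in source order and return how many were newly applied."""
    applied = 0
    for event in sorted(events, key=lambda e: e.sequence):
        if target.apply(event):
            applied += 1
    return applied


if __name__ == "__main__":
    events = [
        ChangeEvent(2, "update", 1, {"ID": 1, "NAME": "Ada L."}),
        ChangeEvent(1, "insert", 1, {"ID": 1, "NAME": "Ada"}),
        ChangeEvent(3, "insert", 2, {"ID": 2, "NAME": "Grace"}),
        ChangeEvent(4, "delete", 2),
    ]
    target = Target()
    print(replicate(events, target), target.rows)
    print(replicate(events, target))  # replaying the same events applies nothing new
```

Sorting by `sequence` before applying is what preserves the source's transaction order in this sketch, and tracking `last_applied` is one simple way to make a replay harmless.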

Having covered the details surrounding data migration, let’s now take a look at some of the best data migration tools available on the market.

Arcion

Arcion was designed from the ground up to help organizations power their most complex and advanced data migration use cases. We have developed a revolutionary zero-code solution that enables enterprises to deploy high-performance and high-volume data pipelines in minutes, instead of months. Whether you require an on-premise or cloud-based self-managed deployment, prefer a SaaS-based solution, or want to adopt a hybrid solution, Arcion can handle the job. Arcion offers both Arcion Self-Managed (free download with 30-day free trial, no payment info required) and Arcion Cloud (14-day free trial, no payment info required) to serve all of your needs. Both of these options offer reliable data migration, all with zero downtime.

Here are a few unique advantages offered by Arcion to achieve zero downtime database migration:

  • At Arcion, we have a library of 20+ pre-built connectors that includes OLTP databases (e.g., Oracle, IBM DB2, SAP, SQL Server, MongoDB, Postgres, MySQL, etc.), data warehouses (e.g., Teradata, Oracle Exadata/RAC, RedShift, SAP IQ, SAP Hana, etc.), and cloud analytic platforms (e.g., Databricks, Snowflake, SingleStore, etc.). 
  • Distributed & highly scalable architecture (patent-pending technology): many existing CDC solutions don’t scale for high-volume, high-velocity data, resulting in slow pipelines and slow delivery to the target systems. Arcion is the only end-to-end multi-threaded CDC solution on the market that auto-scales vertically & horizontally. Any process that Arcion runs on Source & Target is parallelized using patent-pending techniques to achieve maximum throughput. There isn’t a single step within the pipeline that is single-threaded. This gives Arcion users ultra-low-latency CDC replication that can keep up with the ever-increasing data volume on Source.
  • Built-in high availability (HA): Arcion is designed with high availability built-in. This keeps the pipeline robust against disruption and keeps data available in the target in real time.
  • Auto-recovery (patent-pending technology): Internally, Arcion does a lot of check-pointing. Therefore, any time the process gets killed for any reason (e.g., database, disk, network, server crashes), it resumes from the point where it was left off, instead of restarting from scratch. The entire process is highly optimized with a novel design that makes the recovery extremely fast.  
  • Automatic schema conversion & schema evolution support: Arcion handles schema changes out-of-the-box requiring no user intervention. This helps mitigate data loss and eliminate downtime caused by pipeline-breaking schema changes by intercepting changes in the source database and propagating them while ensuring compatibility with the target's schema evolution. Other alternative solutions would reload (re-do the initial load) the data when there is a schema change in the source databases, which causes pipeline downtime and requires a lot of compute resources (expensive!).
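The general idea behind checkpoint-based recovery, mentioned in the auto-recovery point above, can be sketched in a few lines of Python. This is only an illustration of the technique, not Arcion's implementation. The checkpoint file format, the `copy_records` function, and the `fail_at` hook that simulates a crash are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> int:
    """Return the offset of the next record to copy, or 0 on a first run."""
    if path.exists():
        return json.loads(path.read_text())["next_offset"]
    return 0


def save_checkpoint(path: Path, next_offset: int) -> None:
    """Write the checkpoint to a staging file, then rename it into place."""
    staging = path.with_suffix(".tmp")
    staging.write_text(json.dumps({"next_offset": next_offset}))
    staging.replace(path)


def copy_records(source: list, target: list, checkpoint: Path,
                 batch_size: int = 2, fail_at: int = -1) -> None:
    """Copy records in batches, saving a checkpoint after each batch.

    `fail_at` simulates a crash just before the batch starting at that offset.
    """
    offset = load_checkpoint(checkpoint)
    while offset < len(source):
        if offset == fail_at:
            raise RuntimeError(f"simulated crash at offset {offset}")
        batch = source[offset:offset + batch_size]
        target.extend(batch)
        offset += len(batch)
        save_checkpoint(checkpoint, offset)


def demo() -> tuple:
    source = list(range(1, 8))  # seven records to migrate
    target = []
    with tempfile.TemporaryDirectory() as workdir:
        checkpoint = Path(workdir) / "checkpoint.json"
        try:
            copy_records(source, target, checkpoint, fail_at=4)
        except RuntimeError:
            pass  # the process "died" after copying the first four records
        copied_before_crash = len(target)
        copy_records(source, target, checkpoint)  # resumes from the saved offset
    return copied_before_crash, target


if __name__ == "__main__":
    print(demo())
```

In this sketch a crash between writing a batch and saving the checkpoint would cause that batch to be copied again on restart, so a real pipeline would also need writes to the target to be idempotent.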

Hevo Data Migration

Hevo Data is a data migration tool that allows you to transfer data from different sources to a destination of your choice easily and efficiently. Hevo allows users to move data to a data warehouse, run transformations for analytics, and deliver operational intelligence to business tools.

Some of its key features are:

  • Hevo provides real-time migration
  • Hevo has a scalable infrastructure with over 100 sources, including SaaS applications like Google Analytics and Salesforce as well as databases
  • Hevo claims to be an automated platform that requires minimal maintenance

Hevo Data Migration Pricing:

According to Hevo Data's website, Hevo has a free tier that comes with 1 million free events, 50+ connectors, and a free initial load. The next tier starts with 5 million events at $239/month, with access to 150+ connectors.

AWS Database Migration Service

AWS Database Migration Service (DMS) is a database migration service from Amazon whose primary goal is to migrate databases to AWS in a secure and reliable manner. Another important part of the AWS migration tooling is AWS Glue, a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

Some of its key features are:

  • AWS Data Pipeline, a related AWS service, allows you to create a pipeline graphically through a console or with the AWS Command Line Interface (CLI).
  • It allows the source database to keep on running throughout the migration process
  • It is a popular, highly available tool, which has made it a common choice for continuous data migrations
  • It reduces the time for an application to run 

AWS Database Migration Service Pricing:

According to the Amazon Web Services website, AWS Data Pipeline is billed based on how often your activities and preconditions are scheduled to run and where they run (AWS or on-premises). To estimate that cost, you can check out the AWS Data Pipeline pricing page. AWS DMS is priced separately, so check the AWS DMS pricing page for the migration service itself.

Integrate.io/Xplenty

Integrate.io, formerly known as Xplenty, allows you to quickly and easily prepare your data using a simple data integration cloud service. It also has a user-friendly interface that connects with a large variety of data stores, including relational databases such as Heroku Postgres and PostgreSQL.

Here are some great features of Integrate.io:

  • Integrate.io allows you to integrate data from multiple sources into a single data pipeline and implement a wide range of other data transformations out-of-the-box, without writing additional code.
  • Integrate.io ensures a secure transfer of a company’s data while migrating from one source to another.
  • Great UI that is easy to use and configure.

Integrate.io's pricing:

Integrate.io doesn't disclose pricing. The company offers a 7-day free trial to users who request a product demo. 

IBM Informix 

Informix by IBM is another excellent tool that can be used to move data from one IBM database to another IBM database. Informix has the unique ability to seamlessly integrate and migrate SQL, NoSQL/JSON, time series, and spatial data. With the help of the hybrid cloud infrastructure, Informix helps businesses with data migration projects.

Here are some great features of IBM Informix:

  • Perform analytics close to data sources to enhance local decision-making. Access business intelligence faster with enhanced integration with various tools and applications. 
  • Ensure always-on operations across your grid environment. Upgrade, maintain, and configure the grid with no downtime.  Successfully meet service-level agreements.
  • IBM Informix delivers fast data transfer, which helps enable real-time analytics on transactional data workloads.

IBM Informix's pricing:

IBM Informix has a free developer edition. For the Informix Cloud's pricing, the small package starts at $1,250/Instance, the medium package at $2,200/Instance, the large package at $4,000/Instance, and the extra large package at $8,000/Instance. 

Fivetran

Fivetran is a comprehensive ETL tool that takes your data from a system like QuickBooks and loads it into your centralized data storage system (often referred to as a data warehouse). It offers data integration built on a fully managed ELT (extract, load, transform) architecture. Note that Fivetran mainly focuses on SaaS integrations. In September 2021, it acquired HVR, a CDC solution provider with a database migration solution. We wrote another detailed blog post that compares popular CDC solutions if you're interested.

Here are some great features of Fivetran:

  • Eliminates the need to hire a data engineer to build data pipelines that connect various SaaS applications. 
  • Fivetran does all this by offering a myriad of source and destination connectors that allow for both pushing and pulling of data
  • After loading the data, Fivetran offers data teams the ability to easily set up custom data transformations. 

Fivetran pricing:

Fivetran's pricing uses credits that are based on a customer's monthly active rows (MAR) in a given period. Check the Fivetran website for more on the pricing.

Azure DocumentDB

The Azure DocumentDB Data Migration Tool is owned by Microsoft. This tool allows users to migrate data to DocumentDB, which is a NoSQL document database. It enables migration of data to Azure DocumentDB from CSV files, JSON files, SQL, Azure DocumentDB, MongoDB, and many other sources.

Here are some great features of Azure DocumentDB:

  • It supports a wide range of Windows operating systems and .NET Framework 4.5.1 or higher.
  • Automatic scaling of backend database storing data and documents
  • It provides real-time data streaming services that are best for transactional data.

Azure DocumentDB Data Migration Tool pricing:

The Azure DocumentDB Data Migration Tool is a free, open source tool, but it only migrates into DocumentDB.

Stitch Data Migration

Stitch is a self-serve ETL tool that you can use to connect all your data sources to your data warehouse. In Stitch's terminology, data sources are called integrations, and your target storage is called the destination. Stitch is also focused on being a no-code solution that rapidly moves data from 130+ sources into the data warehouse of your choice.

Here are some great features of Stitch:

  • It allows you to spend less time managing your data pipeline, and focus on generating valuable insights from it.
  • Quickly identify and fix any issues you might come across in your data pipeline with Stitch’s efficient error handling mechanism
  • Route your data from a single data source to multiple destinations to power your use-cases

Stitch Data Migration pricing:

Stitch offers three pricing tiers ranging from $100 to $1,250 per month, depending on the number of rows (in millions) required per month. It also offers customers a 14-day free trial.

Matillion

An article by TechCrunch described Matillion as “a platform to help companies harness their data so that it can be used, and what’s more, the platform is not just for data scientists, but it’s written with a “low-code” approach that can be used by a wider group of users”. Matillion is a cloud-based ETL platform that focuses on data integration from SaaS applications and a few databases into cloud data warehouses. Note that Matillion does not directly do Change Data Capture; it connects users to AWS DMS for CDC features.

Here are some great features of Matillion:

  • Matillion possesses over 100 connectors to popular online SaaS services that you can connect to and pull your data from.
  • Push-down ELT technology uses the power of your data warehouse to process complex joins over millions of rows in seconds.
  • Collaboration baked-in. Build jobs together in disparate locations, like in Google Docs.

Matillion pricing:

Matillion's pricing plans depend on the platform on which the customer’s data warehouse runs. Customers can use the tool for free for a 14-day period. It also offers the following plans:

  • Basic at $2.00 per credit
  • Advanced at $2.50 per credit
  • Enterprise at $2.70 per credit

DBConvert Studio

Another popular tool for database migration and synchronization is DBConvert Studio. It converts database structures and data between different database formats. 

Here are some great features of DBConvert Studio:

  • With DBConvert, you can convert database structures and data between various formats, synchronize data spread across multiple databases, and perform cross-database migrations
  • A flexible built-in scheduler can be used to launch tasks at a specific time without running the GUI
  • It allows total management of databases, enables quick configuration & connection, keeps data secure, and makes managing large amounts of data more efficient.

Informatica Cloud Data Wizard

Informatica Cloud Data Wizard is a flexible data loader for any type of user. It is an excellent Salesforce app that allows its users to easily synchronize common as well as custom Salesforce objects with CSV files, with SaaS applications such as NetSuite and cloud storage like Box.

Here are some great features of Informatica Cloud Data Wizard:

  • Admins can establish connections with external applications and conduct on-the-fly transformations.
  • It provides in-app integration to enhance user productivity.
  • Informatica was given the Leader title in the 2021 Gartner Magic Quadrant for Data Integration Tools

Configero Data Loader

The Configero Data Loader for Salesforce is a native platform web application for importing thousands of records into Salesforce. Using its grid technology, users can dynamically filter records by any custom or standard object to see a list of relevant records available for mass editing and updating before loading data into Salesforce. 

Here are some great features of Configero Data Loader:

  • Loads 50,000+ records in 20 seconds
  • Wizard-driven for ease of use
  • External ID matching (auto-save capabilities enabled)
  • Dynamic multi-column filtering

Rocket Data Migration

Rocket Data Migration solutions cover all aspects of data migration. They are designed to support established migration procedures with minimal manual effort, and they provide whatever level of support is required throughout the migration.

Here are some great features of Rocket Data Migration:

  • Ensures data integrity by safeguarding against data corruption or loss.
  • Reduces storage costs and thereby improves return on investment.
  • Minimizes the interference of migration activities in meeting the daily objectives.

Brocade's DMM (Data Migration Manager) 

Brocade's Data Migration Manager (DMM) can automatically create migration pairs, implement automatic zoning, and can use a scheduler feature to perform tasks at preset times. It is an innovative, powerful and performance-oriented solution for high-performance data migration activities. It works efficiently in heterogeneous storage environments. 

Here are some great features of Brocade's DMM (Data Migration Manager):

  • It gives a high-throughput speed of up to one terabyte (TB) per hour.
  • It optimizes migrations across Storage Area Networks (SANs) with an easy setup facility.
  • It has a unique feature to determine the migration time in advance. This feature allows the clients to plan time, budget, and resources associated with the migration activity.

HDS Universal Replicator

Hitachi Universal Replicator (HUR) is an array-side software and journal disk caching solution that offers fast and efficient replication of small or large data volumes.

Here are some great features of the HDS Universal Replicator:

  • Replicate more data in less time to remote locations.
  • HDS replicator reduces resource consumption and provides significant data protection.
  • It maintains redundant data centers in distant geographies for rapid recovery from disasters.

Apex Data Loader

The Apex Data Loader is a Salesforce product. It is a universal tool used for bulk importing and exporting data. It can insert, update, and delete large numbers of Salesforce records.

Here are some great features of the Apex Data Loader:

  • It is easy to use and allows customers to get data into Salesforce objects
  • When importing records, a data loader will read, extract, and load information from CSV files or from a database. 

Talend Open Studio

Talend Open Studio is a free open source ETL tool for Data Integration and Big Data. It is an Eclipse-based developer tool and job designer. 

Here are some great features of the Talend Open Studio:

  • Provides all features needed for data integration and synchronization with 900 components, built-in connectors, converting jobs to Java code automatically, and much more.
  • Reduces storage costs and thereby improves return on investment.

Rsync

Rsync is a free command-line tool that lets you transfer files and directories to local and remote destinations. Rsync can be used for mirroring, performing backups, or migrating data to other servers.

Here are some great features of Rsync:

  • It acts as a file synchronization and data transfer program.
  • It uses SSH to connect to the remote system and invokes the remote host’s Rsync to determine which parts of data need to be transferred over the secure connection.

EMC Rainfinity 

EMC Rainfinity is a product of Dell EMC Corporation. It is designed to help organizations to reduce storage management costs. EMC Rainfinity makes moving files to a different system transparent for both Unix and Windows users, with no downtime and minimum disruption.

Here are some great features of EMC Rainfinity:

  • It implements automated file archiving algorithms that can perform data migration across heterogeneous servers and NAS environments.
  • It comes with easy-to-use wizards to transparently move files across NAS and CAS.

Conclusion

Choosing a tool for migrating data can be overwhelming. As you can see above, many tools exist that cater to a variety of use cases. Because the tool is one of the largest factors in the success of a migration project or digital transformation, it is important to choose the most flexible, reliable, and scalable solution for your needs. A successful data migration project can open up many doors for your business and make life much better for the technical teams building the future of your product. This guide has walked through the options available and their strengths and weaknesses. With this knowledge of the tools available, we hope that your next data migration project is a success.
