Announcing Arcion August 2022 Release: Do More With Less

Rajkumar Sen
CTO @ Arcion
August 17, 2022

I am super excited to announce the latest release from the Arcion team. The team put in a lot of hard work to bring you critical features that make our real-time CDC pipelines much faster and easier to use. Not only have we expanded our ever-growing set of enterprise database connectors, we have also added some exclusive platform features:

  • Oracle Native Log Reader functionality
  • Support for DDL/Schema evolution in Real-time CDC (Exclusive to Arcion)
  • No-code, streaming column transformations
  • Support for BigQuery and Azure-Managed SQL Server as sources, and Imply as a target

Let’s look at each of these features in detail.

Oracle Native Log Reader (Beta): 10x Faster Extraction from Oracle with Zero Impact  

Arcion’s Oracle Native Log Reader reads directly from Oracle redo log files (online and archived) to stream CDC records out of an Oracle database. This is a tremendous win for all our users. With Oracle Native Log Reader, Arcion users can stream changes out of Oracle without connecting to the database and running expensive LogMiner queries inside it, which can degrade the performance of the Oracle server and consume a lot of PGA memory.

Another benefit of the Native Log Reader over Oracle APIs such as LogMiner is that it is fully decoupled from an existing Oracle feature that has several limitations (e.g., LogMiner cannot mine binary column data larger than 4KB).

But the secret sauce is the combination: when the Oracle Native Log Reader is paired with our scalable, parallel, end-to-end multi-threaded architecture, Arcion is able to extract CDC records out of Oracle faster than any traditional CDC solution. If you have ever had trouble extracting data out of high-volume Oracle installations such as RAC because of the terabytes of redo log they produce daily, Arcion is your answer.

Oracle Native Log Reader is currently in beta and available with Arcion Self-hosted, and it’ll be available in Arcion Cloud later this year.  

Arcion's Architecture

Automatic DDL/Schema Evolution: Minimize Pipeline Downtime & No More Manual Work

Keeping schemas aligned between source and target is a classic problem for data engineers who set up and maintain pipelines. Traditionally, replication had to be stopped whenever the DDL or schema changed on the source database. The config files or the target schema then had to be updated to reflect the change. Only then could replication resume, if it didn’t need to start over entirely. This leads to downtime, consumes expensive compute resources, and invites user error and data loss when configuration files are edited by hand.

Arcion has always supported automated DDL and schema evolution, and it is currently the only platform to support out-of-the-box DDL replication with Snowflake and Databricks. Our latest iteration aims to make it even more seamless, reliable, and easy to use. Now, when a change is made on the source, Arcion automatically replicates the DDL and schema updates to the target database, eliminating downtime. No manual changes to the configuration files and no manual pipeline restarts are needed; the Arcion platform takes care of everything without user intervention.
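To make the idea concrete, here is a minimal conceptual sketch in Python of a change stream that carries schema changes alongside row changes, so the target can evolve without the pipeline stopping. This is not Arcion's implementation or configuration format: `ChangeEvent`, `TargetTable`, and `apply_events` are hypothetical names made up for illustration, and the "target" is just an in-memory structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChangeEvent:
    """Hypothetical change record: either a row insert or an ADD COLUMN."""
    kind: str                                  # "insert" or "add_column"
    table: str
    row: Optional[Dict[str, object]] = None    # payload for "insert"
    column: Optional[str] = None               # column name for "add_column"


@dataclass
class TargetTable:
    """Hypothetical in-memory stand-in for a table on the target system."""
    columns: List[str]
    rows: List[Dict[str, object]] = field(default_factory=list)


def apply_events(target: Dict[str, TargetTable], events: List[ChangeEvent]) -> None:
    """Apply DML and DDL events in order, evolving the target schema in-line."""
    for ev in events:
        table = target[ev.table]
        if ev.kind == "add_column":
            if ev.column not in table.columns:
                table.columns.append(ev.column)
                # Existing rows get NULL (None) for the newly added column.
                for existing in table.rows:
                    existing[ev.column] = None
        elif ev.kind == "insert":
            # Project the incoming row onto the current target schema.
            table.rows.append({c: ev.row.get(c) for c in table.columns})
        else:
            raise ValueError(f"unsupported event kind: {ev.kind}")


target = {"users": TargetTable(columns=["id", "name"])}
events = [
    ChangeEvent("insert", "users", row={"id": 1, "name": "Ada"}),
    ChangeEvent("add_column", "users", column="email"),  # schema change mid-stream
    ChangeEvent("insert", "users", row={"id": 2, "name": "Bob", "email": "bob@example.com"}),
]
apply_events(target, events)
print(target["users"].columns)
```

The point of the sketch is only that the schema change is handled as one more event in the stream, rather than as a reason to stop, edit configuration, and restart.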

Here is a quick demo of how Arcion handles schema evolution from MySQL to Databricks in real time.

Streaming Column Transformations: Make Data Teams’ Lives Easier Without Custom Code

Arcion users can now apply streaming column transformations and compute new columns in flight while data moves from the source to the target. To achieve this previously, data engineers had to create a staging table in the target platform and then write custom code to transform the data and generate the new columns on the target. This inefficient approach resulted in lower performance, increased engineering effort, and delayed SLAs.

With Streaming Column Transformations, Arcion performs the transformation on the fly without any custom code from the user. Arcion takes care of the configuration for you and automatically transforms your data within the pipeline while it moves to the target. The resulting column is created and its data is loaded during replication. This makes the entire process easier and results in better performance.

Many other CDC solutions offer similar transformation capabilities, but they all require data engineering teams to write custom Java or Python code or some form of declarative code. Arcion does it for you without any custom code.

A typical use of this feature is creating a new computed column in a target analytics platform like Databricks or Snowflake (for example, a new partitioning column), where the new column is computed as a function of some source columns.
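As a rough illustration of what "a new column computed as a function of some source columns" means, the Python sketch below derives a partitioning column and a total from each row as it streams through. It only illustrates the concept; it does not show how Arcion is configured. The column names (`order_ts`, `quantity`, `unit_price`, `order_month`, `total_price`) and the helper `add_computed_columns` are assumptions invented for the example.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]

# Hypothetical mapping: new target column -> function of the source row.
COMPUTED_COLUMNS: Dict[str, Callable[[Row], object]] = {
    # Partitioning column derived from a timestamp column.
    "order_month": lambda row: datetime.fromisoformat(row["order_ts"]).strftime("%Y-%m"),
    # Column derived from two numeric source columns.
    "total_price": lambda row: row["quantity"] * row["unit_price"],
}


def add_computed_columns(
    rows: Iterable[Row],
    computed: Dict[str, Callable[[Row], object]] = COMPUTED_COLUMNS,
) -> Iterator[Row]:
    """Yield each row with the computed columns added, one row at a time."""
    for row in rows:
        out = dict(row)  # copy so the source record is left untouched
        for name, fn in computed.items():
            out[name] = fn(row)
        yield out


source_rows = [
    {"order_id": 1, "order_ts": "2022-08-17T09:30:00", "quantity": 3, "unit_price": 5},
]
transformed = list(add_computed_columns(source_rows))
print(transformed[0])
```

Because the function is a generator, each row is transformed as it passes through, with no staging table in between, which is the contrast with the older approach described above.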

New Connectors: BigQuery, Azure-Managed SQL Server, and Imply

Arcion has expanded support and added more connectors to our arsenal. You can now extract data from Google BigQuery and Azure SQL Managed Instance as supported sources. Both are supported on Arcion Self-hosted, and Arcion Cloud support will come soon.

Founded by the original creators of Apache Druid, Imply is a new generation of database that’s designed for real-time, modern analytics applications. Now with Arcion and Imply, users can build analytics applications that deliver interactive data experiences at any scale. Arcion supports Imply on both Arcion Self-hosted and Arcion Cloud.

Give Arcion a try!

With our latest innovations going live, there’s never been a better time to try Arcion. If your organization uses Oracle and is thinking of integrating with Snowflake or Databricks for your analytics workloads, Arcion can significantly accelerate your data migration and replication efforts with the new Oracle Native Log Reader feature. Moreover, with Automatic Schema Evolution (DDL) and Streaming Column Transformations, Arcion removes the pain of writing custom code and the burden of manual processes that impact data pipelines.

All the new features are available within Arcion Self-hosted, and will be available within Arcion Cloud later this year.

Ready to give it a go? Download Arcion Self-hosted for free (we don't ask for any payment info), and let me know how you like it. Share your feedback in our community Slack channel. Hope you enjoy the new release!
