Dawn of The Data Mobility Era

Luke Smith
Enterprise Solutions Architect
February 17, 2022
Matt Tanner
Developer Relations Lead
February 17, 2022
Gary Hagmueller
February 17, 2022

Having spent the last 20 years in the technology industry, I’ve witnessed firsthand multiple waves of innovation and the technologies that catalyzed them. During this time, I’ve learned that there are clear signs that indicate the imminent arrival of a massive transformation wave. Here’s the secret: the most profound transformations occur when two distinct secular technology trends converge.

In the last 30 years, this has happened only a few times: broadband + internet = the dot-com era. Business applications + distributed computing = the SaaS revolution. There are others, but you get the point. These waves form over years. The technologies that eventually converge are typically very different and, in the early days, don’t seem to have much, if anything, in common. Convergence, which often happens suddenly, is driven by a catalyst that bridges technical gaps and enables a substantial increase in the adoption of, and the value derived from, the integrated technologies. Before long, you can’t tell the trends apart, as they form a massive new utilization model that just makes sense to every user.

Today, it should come as no surprise that we sit at a point where two massive markets are merging into one. The first market is the machine learning (ML) and AI space. The second is the mainstream adoption of the cloud. Both markets have matured immensely in the last five years. The most evident sign of this maturity is the rise of both DevOps teams that drive adoption of the cloud and data teams (data engineers and data scientists) that build and maintain the sophisticated techniques that derive real value out of ML/AI.

But there is a problem… the catalyst is missing

Even though tech publications have heralded “data as the new oil” for a decade now, we’ve actually made few advances in mobilizing data since this notion was first discussed. While we’ve seen early successes in mobilizing the data that lives in SaaS systems, almost none of those applications actually live in the stream of data as it is being generated. Instead, transactional data lives in enterprise data stores, and it is this data that is the lifeblood of every company. Without real-time access to transactions, much of this data’s ability to generate value is lost.

And therein lies the problem. Transaction data stores live in a world of their own. Companies code business rules into the way they handle transactions, which makes these systems highly performant for the business but extremely diverse in the way data is handled and processed. These systems are highly optimized for read and write speed, so querying them creates a performance “tax” that most enterprises are unwilling to pay. Finally, the security requirements of these systems are amongst the most stringent of all, so anything that touches them needs to be highly scrutinized and locked down.

Yet the demand for transactional data in real time is immense. The meteoric rise of our partners such as Databricks and Snowflake (among others) makes it clear that there is massive pent-up demand to build robust analysis, applications and models on top of cloud data stores. But when the onion is peeled back a layer, we see that much of the data flowing into those clouds comes either from peripheral SaaS systems that are already multiple steps removed from transactions or from poorly constructed and very brittle batch-based queries or scripts. There are legacy solutions in the market, but their unscalable single-node architectures and highly manual requirements make them the wrong choice upon which to build a modern data infrastructure.

I have firsthand experience with this problem and its hobbling effect on companies that are desperate to gain competitive advantages from their own data. My last two companies built AI and ML applications for enterprises. The AI we delivered powered fraud detection, customer behavior analysis, churn avoidance, recommendation engines, real-time decision support and many other mission-critical applications. The people we sold to wanted these applications to operate on real-time data. But none could get anything close to real time, because none of their IT teams were willing to engage the rudimentary replication vendors in the market: setup was too service-intensive and the license fees were often prohibitive. Eventually, every application ended up fed by a brittle batch-based script that ran once per week (or month) and frequently broke.

Even with this awful infrastructure, my customers still paid seven-figure annual fees to capture the value inherent in those ML/AI models.

The pain my customers felt is why I’m so excited about Arcion’s mission. Every Quarterly Customer Review meeting would end with an exasperated executive pleading, “If there’s ANYTHING you could do to make these apps refresh closer to real time, you’d be a hero.” From my own frustration at not being able to serve my customers’ demand, it became clear to me that data mobility is one of the most massive opportunities sitting just below the radar. While data mobilization may sound a bit prosaic, if you run a company and want to maximize your competitive advantages and delight your customers, there is no more effective way to do so than to leverage the power of your real-time transactional data.

Other firms in this space may be great at marketing, but Arcion is great at technology and we are truly unique. Our platform uses Change Data Capture (CDC) to tap into the logs that are updated in real time by those transactional databases. No production impact, no security holes, and no latency. We can do this for every major database and are even starting to do it for cloud data warehouses. Many firms will say they can serve this need, but actually delivering real-time CDC is very hard, and there are very few technologists who have the ability to build it. Why? Well, you effectively need to build a database in reverse to do it properly. And building a normal database is hard to begin with. The fact that global enterprises already pay Arcion millions in annual fees before we’ve made any real attempt to commercialize should demonstrate loudly that the Arcion solution delivers value.
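To make the log-based idea a little more concrete, here is a minimal sketch in Python of what consuming a change log can look like in principle. It is not Arcion’s implementation and does not reflect any real database’s log format: `ChangeEvent`, `apply_changes`, and the `lsn` field are hypothetical names invented for illustration, and the “log” is just an in-memory list. The point is only that a consumer reads ordered change events past a checkpoint and applies them to a target, rather than re-querying the source tables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry in a hypothetical, simplified change log."""
    lsn: int                    # position in the log (assumed to increase monotonically)
    op: str                     # "insert", "update", or "delete"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # full row image; None for deletes


def apply_changes(log, replica, last_lsn):
    """Apply every event newer than last_lsn to replica; return the new checkpoint."""
    for event in log:
        if event.lsn <= last_lsn:
            continue  # already applied on an earlier pass
        if event.op == "delete":
            replica.pop(event.key, None)
        elif event.op in ("insert", "update"):
            replica[event.key] = dict(event.row or {})
        else:
            raise ValueError(f"unknown operation: {event.op}")
        last_lsn = event.lsn
    return last_lsn


# A toy log standing in for what a transactional database would record.
log = [
    ChangeEvent(1, "insert", "order-1", {"status": "new", "total": 40}),
    ChangeEvent(2, "insert", "order-2", {"status": "new", "total": 15}),
    ChangeEvent(3, "update", "order-1", {"status": "paid", "total": 40}),
    ChangeEvent(4, "delete", "order-2"),
]

replica = {}
checkpoint = apply_changes(log[:2], replica, 0)       # first pass sees two events
checkpoint = apply_changes(log, replica, checkpoint)  # later pass picks up only the new ones
print(replica, checkpoint)
```

Because the consumer tracks a checkpoint, each pass touches only the events it has not yet seen, which is the property that separates this style from a weekly batch script that re-reads everything. A production system would also have to deal with transactions, schema changes, ordering across tables, and failure recovery, none of which this sketch attempts.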

What is irrefutable is that if you get data mobilization right, magic happens.  Arcion is built on the philosophy that a modern data infrastructure empowers:

  • Users to build apps, models & AI on the platform that best suits the need
  • Data to flow from its sources in real time, no matter where it is generated
  • Builders to understand the business rules and formats that govern transactions, so development becomes fast, efficient, accurate and unbiased
  • Teams to set up and govern connections with zero code, so the network of connections remains secure, robust and highly performant, without creating the need to hire a new team

Arcion’s vision is to catalyze the destruction of the data silo, once and for all. It’s been talked about forever, but Arcion is the first platform that can truly mobilize the data running through any system and make it available to any other. Arcion eliminates the friction in deriving immediate value out of transactional data. It’s this ability to bridge the gap that will integrate the advanced analytical value of ML/AI with the ubiquity of cloud-based consumption.

Matt is a developer at heart with a passion for data, software architecture, and writing technical content. In the past, Matt worked at some of the largest finance and insurance companies in Canada before pivoting to working for fast-growing startups.
Luke has two decades of experience working with database technologies and has worked for companies like Oracle, AWS, and MariaDB. He is experienced in C++, Python, and JavaScript. He now works at Arcion as an Enterprise Solutions Architect to help companies simplify their data replication process.