Sybase To MongoDB: Load Data In Minutes

Luke Smith
Enterprise Solutions Architect
August 25, 2023
Matt Tanner
Developer Relations Lead
August 25, 2023
15 min read

Introduction

In today's data-driven world, organizations constantly search for database technologies that provide greater flexibility, scalability, and performance. While traditional relational databases like Sybase have been valuable assets, evolving data requirements have led many teams to move to NoSQL databases. One of the most popular of those options is MongoDB. This guide provides an in-depth look at Sybase and MongoDB and a comprehensive step-by-step process for migrating your data from Sybase to MongoDB. We will cover two distinct methods: an automated, scalable approach using Arcion, and a more manual approach using native tools. Let’s start by looking at the basics of Sybase and MongoDB.

Introduction to Sybase

Sybase was a software company that specialized in the production of enterprise software, particularly for the storage and analysis of data. Sybase came to represent a suite of database products, including its most popular product, Sybase Adaptive Server Enterprise (ASE). Sybase ASE was a row-based Relational Database Management System (RDBMS) that could perform Online Transaction Processing (OLTP) operations efficiently and scalably. 

Sybase and all of its products were acquired by SAP in 2010, and Sybase’s products were completely absorbed into the SAP ecosystem. The Sybase name was discontinued and removed from all Sybase products in 2014; as part of this, Sybase Adaptive Server Enterprise became known as SAP Adaptive Server Enterprise. It continued to be a viable and widely used RDBMS in many enterprise data stacks until the recent end-of-support announcement. With support ending by 2025, it has become imperative for organizations to migrate legacy data from the platform onto something more modern with long-term support.

Key Features of Sybase

Sybase, specifically the Adaptive Server Enterprise (ASE), offers many features that make it a powerful choice for various applications. Let's take a closer look at some of these key features.

High Performance

Sybase ASE is known for its remarkable performance, especially in transaction-intensive applications. Its ability to process large amounts of data quickly and efficiently makes it popular in industries like finance, where handling high-volume data transactions quickly is vital.

Robust Security

In a time when data security threats are ever-present, Sybase's robust security measures stand out. It offers a comprehensive security framework, including role-based access control and high-grade encryption. These features thoroughly protect sensitive data and ensure only authorized users can access specific information.

Scalability

Scalability is a defining feature of Sybase ASE, accommodating the growing data needs of applications. Whether it's a small application that may grow or a large one handling vast amounts of data, Sybase can efficiently scale with the demand, ensuring smooth data operations and availability.

Reliability

Reliability is crucial for mission-critical applications, and Sybase ASE has an excellent reputation for it. Recognized for its stability and resilience, Sybase’s built-in fault tolerance mechanisms help to minimize disruptions. 

Introduction to MongoDB

MongoDB is an open-source, cross-platform, document-based database, with MongoDB Atlas as its multi-cloud database service. Unlike relational databases, it is based on the NoSQL paradigm: MongoDB does not store data in tabular format or relational tables; data is stored as documents in collections. Essentially, these are large JSON-like objects where the data is stored, which makes it possible to build scalable applications using MongoDB. The flexible schema approach of NoSQL and MongoDB means that data is not modeled with strict relationships. Many developers prefer NoSQL databases when building and deploying large, scalable applications; MongoDB Atlas is a common choice for quickly deploying MongoDB in the cloud to build even faster.

MongoDB is particularly suited for working with large sets of distributed data. As mentioned previously, MongoDB manages document-oriented information using a JSON-like format for storing and retrieving information. MongoDB makes it easy to store structured and unstructured data, handle high data volumes, and scale both vertically and horizontally.

MongoDB provides easy-to-use, easily configured drivers for all major programming languages, so connectivity is simple. Developers can start building applications immediately without first configuring a database schema, since no schema is required to store data.

Key Features of MongoDB

MongoDB offers many features that make it a powerful choice for a wide range of applications and a popular NoSQL solution. Let's take a closer look at some of these key features.

Document-Oriented Storage

MongoDB uses a document-oriented model, storing data in BSON format (a binary representation of JSON-like documents). This allows MongoDB to store complex data types with ease.
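As a small illustration (the record shape here is hypothetical), a single document can embed sub-documents and arrays that a relational schema like Sybase's would spread across several joined tables:

```python
import json

# A customer, their address, and their orders as one self-contained
# document rather than three joined relational tables.
customer = {
    "_id": 1001,
    "name": "Acme Corp",
    "address": {"street": "1 Main St", "city": "Springfield"},
    "orders": [
        {"order_id": 1, "total": 250.00},
        {"order_id": 2, "total": 99.50},
    ],
}

# BSON is a binary superset of JSON, so any JSON-serializable
# structure like this maps directly onto a MongoDB document.
print(json.dumps(customer, indent=2))
```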

Highly Scalable

MongoDB's built-in sharding allows it to handle large amounts of data by distributing it across multiple servers. MongoDB, especially when deploying with MongoDB Atlas, allows for easy and cost-efficient scaling.

Flexible Schema

Unlike SQL databases, MongoDB allows you to have a flexible schema, enabling you to store documents of different structures in the same collection. This is especially useful in the latest generations of applications with complex data structures that don’t fit into traditional tabular forms.
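To sketch what this looks like in practice (the collection and field names here are hypothetical), two documents with different shapes can sit side by side in the same collection, with no schema migration in between:

```python
# Two documents destined for the same hypothetical "products" collection.
# The second carries fields the first lacks; MongoDB accepts both as-is.
products = [
    {"_id": 1, "name": "Basic Widget", "price": 9.99},
    {"_id": 2, "name": "Deluxe Widget", "price": 24.99,
     "tags": ["premium", "metal"],
     "dimensions": {"w_cm": 10, "h_cm": 4}},
]

# Fields present only on the second document:
extra_fields = set(products[1]) - set(products[0])
print(sorted(extra_fields))  # → ['dimensions', 'tags']
```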

How to Migrate Data from Sybase to MongoDB

When migrating data from Sybase to MongoDB, there are many ways to implement a data pipeline to facilitate migration. This section will look at two ways to migrate and replicate data. First, we will look at Arcion to move data from Sybase to MongoDB in an automated fashion. Secondly, we will look at migrating data manually with some internal tools built into the platforms.

Method 1: Sybase to MongoDB Using Arcion

Arcion is a cloud-native Change Data Capture solution that guarantees terabyte-scale scalability and data consistency. It also has integrations with various enterprise databases and data warehouses, making it a highly performant solution for data migration and data replication. In this section, we will use Arcion to migrate data from Sybase ASE to MongoDB (including MongoDB Atlas).

Step 1: Download And Install Arcion Self-hosted

First, we will need to download and install Arcion Self-hosted. This requires a few steps, including downloading Replicant, creating a home directory, and adding your license. Each step is detailed in the quickstart guide in our docs.

Next, we must configure and enable Arcion to connect with Sybase and MongoDB. We will refer to the directory where Replicant was downloaded to as $REPLICANT_HOME in the following steps. Now, let’s go ahead and configure all of the connections with Arcion.

Step 2: Set up Connection Configuration for Sybase ASE

From $REPLICANT_HOME, navigate to the sample connection configuration file:

vi conf/conn/sybasease_src.yaml

If you store your connection credentials in AWS Secrets Manager, you can tell Replicant to retrieve them. For more information, see Retrieve credentials from AWS Secrets Manager.


Otherwise, you can put your credentials like usernames and passwords in plain form like the sample below:

type: SYBASE_ASE
host: HOSTNAME
port: PORT_NUMBER
database: 'DATABASE_NAME'
username: 'USERNAME'
password: 'PASSWORD'
max-connections: 20
max-retries: 10
retry-wait-duration-ms: 1000

Replace the following placeholders in the example above:

  • HOSTNAME: hostname of the SAP ASE server
  • PORT_NUMBER: port number of the SAP ASE server
  • DATABASE_NAME: the name of the SAP ASE database to connect to
  • USERNAME: the username of the database user
  • PASSWORD: the password associated with USERNAME

After this is complete, save the file.

Note: If you want to use the bcp utility for extracting data from your Source ASE, you’ll need to specify some additional parameters in the connection configuration file. For more information, see Use bcp Utility for Extraction.

Step 3: Set up Extractor Configuration for Sybase ASE

From $REPLICANT_HOME, navigate to the Extractor configuration file:

vi conf/src/sybasease.yaml

Arcion supports both snapshot and realtime modes for SAP ASE. In this example, we will set the configuration to use realtime mode.

Step 3.1: Use realtime mode

First, make sure that the ASE account you specified in the Replicant connection configuration file has the following permissions granted:

  • sa_role
  • replication_role
  • sybase_ts_role

After that, you can specify extraction parameters under the realtime section of the configuration file. Below is a working sample:

realtime:
  threads: 1
  fetch-size-rows: 100000
  fetch-interval-s: 10
  _traceDBTasks: true
  heartbeat:
    enable: true
    catalog: tpch
    schema: blitzz
    interval-ms: 10000

It’s important to note that the fetch-interval-s parameter determines the interval between each CDC fetch cycle. Always make sure to keep its value above or equal to 10. For more information, see Limitations in the docs.

For a detailed explanation of configuration parameters in the Extractor file, read Extractor Reference.

Step 4: Set up connection configuration for MongoDB

From $REPLICANT_HOME, navigate to the sample MongoDB Connection configuration file:

vi conf/conn/mongodb_dst.yaml

MongoDB supports several methods for establishing an authenticated connection. In this example, we will simply connect to the server using a MongoDB connection string. The example below shows how the credentials can be added to the connection configuration file.

type: MONGODB
url: "mongodb://localhost:27019/?w=majority"
max-connections: 30
replica-sets:
  mongors1:
    url: "mongodb://localhost:27017/?w=majority&replicaSet=mongors1"
  mongors2:
    url: "mongodb://localhost:27027/?w=majority&replicaSet=mongors2"

Replace the following placeholders in the example above with your values:

  • type: the connection type representing the database. In this case, it’s MONGODB.
  • url: the connection string. You can specify additional connection configurations in the url string according to MongoDB syntax, such as the number of connections, Read Concern options, and Write Concern options. For more information, see Connection String Options.
  • max-connections: the maximum number of connections Replicant uses to load data into the MongoDB system.
  • replica-sets: for multiple replica sets, specify all of them under replica-sets according to the preceding format.

Replicant monitors the replica-sets for oplog entries to carry out real-time replication. Each url of a MongoDB replica set must represent the host:port belonging to the replica set. url must contain the option replicaSet=<replicaSet_name> to represent the URL as a replica set.
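As a quick sanity check of that rule, a few lines of Python (a hypothetical helper, not part of Arcion) can confirm that each replica-set url carries the required replicaSet option:

```python
from urllib.parse import urlparse, parse_qs

def has_replica_set_option(url: str) -> bool:
    # True if the connection string carries replicaSet=<name> in its query.
    return "replicaSet" in parse_qs(urlparse(url).query)

# The urls from the sample configuration above:
urls = {
    "mongors1": "mongodb://localhost:27017/?w=majority&replicaSet=mongors1",
    "mongors2": "mongodb://localhost:27027/?w=majority&replicaSet=mongors2",
}

print(all(has_replica_set_option(u) for u in urls.values()))  # → True
```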

Once these details are added, save the MongoDB configuration file.

Step 5: Set up Applier configuration

From $REPLICANT_HOME, navigate to the sample MongoDB Applier configuration file:

vi conf/dst/mongodb.yaml

The configuration file has two parts:

  • Parameters related to snapshot mode.
  • Parameters related to realtime mode.

Step 5.1: Parameters related to snapshot mode

To operate in snapshot mode, we will use the snapshot section in the mongodb.yaml config file opened at the start of Step 5. Below is an example of how to set up the config file for snapshot mode.

snapshot:
  threads: 16
  batch-size-rows: 5000
  txn-size-rows: 5000
#  map-key-to-id: false
#  skip-tables-on-failures : false
  bulk-load:
    enable: false
    type: FILE  # PIPE, FILE
  handle-failed-opers: true
  initIndexesPostSnapshot: true
#   denormalize:
#     enable: true
#  user-role:
#    init-user-roles: true
#  init-system-tables: true

Step 5.2: Parameters related to realtime mode

To operate in realtime mode, we will use the realtime section in the mongodb.yaml config file opened at the start of Step 5. Below is an example of how to set up the config file for realtime mode.

realtime:
  threads: 8
  batch-size-rows: 1000
  txn-size-rows: 2_0000
  handle-failed-opers: true
#  map-key-to-id: false
#  skip-tables-on-failures : false
#   perTableConfig:
#   - schema: tpch
#     tables:
#       CUSTOMER:
#         skip-upto-cursor: '{"extractorId":0,"replicaSetName":"mongors1","resumeToken":6868517489379115009,"seqNum":3,"v":1,"timestamp":1599201348000}'
# Transactional mode config
# realtime:
#   threads: 1
#   batch-size-rows: 1000
#   replay-consistency: GLOBAL #allowed values are GLOBAL/EVENTUAL
#   txn-group-count: 100
#   skip-upto-cursors: ['{"extractorId":0,"replicaSetName":"mongors1","resumeToken":6868517489379115009,"seqNum":3,"v":1,"timestamp":1599201348000}']

For more information about the configuration parameters for realtime mode, see the Realtime Mode docs.

Now that your MongoDB connector is configured, we can run Replicant, Arcion’s program for running the CDC pipeline.

Step 6: Running Replicant

From $REPLICANT_HOME, run the following command to execute Replicant in realtime mode:

./bin/replicant realtime conf/conn/sybasease_src.yaml \
conf/conn/mongodb_dst.yaml \
--extractor conf/src/sybasease.yaml \
--applier conf/dst/mongodb.yaml \
--id repl2 --replace --overwrite

Once the command is executed, Replicant will start up the pipeline. In real-time mode, Replicant first creates the destination schemas if they are not already present. If the destination schemas are present, Replicant appends to the existing tables. 

In real-time mode, Replicant starts replicating real-time operations obtained from log-based CDC. By default, real-time mode starts replicating from the latest log position, but a custom start position can be specified by the user in the real-time section of the extractor configuration file.

With this, you have set up a real-time CDC pipeline with Arcion. As changes occur in the source Sybase database, they will be replicated in your MongoDB instance. In a matter of minutes, with no code, we have created a robust pipeline that enables many further use cases, including real-time analytics on the MongoDB platform.

Advantages of Using Arcion

The following are the benefits of using Arcion to load data from Sybase to MongoDB:

  • Arcion allows smooth schema management of your data, ensuring data integrity, reliability, and consistency. It reads directly from the transaction logs and not from the database itself.
  • Arcion offers sub-second latency due to its distributed & highly scalable architecture. It supports auto-scaling vertically and horizontally so users can get ultra-low latency and maximum throughput even as their data volume grows.
  • Arcion’s in-built data validation works automatically to ensure data integrity is consistently maintained efficiently. This ensures that zero data loss occurs for scalable data migration and replication.
  • Arcion offers effortless setup and maintenance with its no-code platform that removes DevOps dependencies, so you don’t need to incorporate Kafka, Spark Streaming, Kinesis, or other streaming tools. Whether moving data from Sybase to MongoDB, Microsoft SQL Server, or a big data platform, Arcion makes configuration and maintenance easy.
  • Arcion is SOC 2 Type 1 & Type 2, HIPAA, and PCI compliant. The enterprise-grade security and compliance standards ensure data governance.

Method 2: Sybase to MongoDB by Manually Coding

For those who want a more manual approach, this method involves exporting data from Sybase, transforming it into a MongoDB-friendly format, and then importing it into MongoDB. Below is a step-by-step guide using native tools for manually migrating data from Sybase to MongoDB.

Dump Data from Sybase ASE

Our first step is to dump data from Sybase ASE. We will connect to the Sybase ASE database instance using the following `isql` command. Below is an example of how that command looks.

isql -U username -P password -S server_name

Replace `username`, `password`, and `server_name` with your Sybase ASE username, password, and server name.

Once connected and logged into the Sybase ASE database, we will export the database data using the `bcp` utility. This command exports the data from the specified table in your Sybase database to a CSV file. The script below shows an example bcp command.

bcp your_database.your_table out your_file -c -t, -T -S your_server -U your_username -P your_password

In this command, replace the following values:

  • `your_database` with the name of your Sybase database
  • `your_table` with the name of the table you want to export
  • `your_file` with the desired name for your output CSV file (you can specify a path too)
  • `your_server` with your Sybase server name
  • `your_username` with your Sybase username
  • `your_password` with your Sybase password

Convert the Dumped Data to a Format Compatible with MongoDB

Next, we will convert the dumped data into a format compatible with MongoDB. In this example, we will use a Python script to convert the CSV data to JSON. The script below reads the CSV export and writes one JSON document per line, the newline-delimited format that `mongoimport` reads by default (dumping all rows as a single JSON object would be imported as one document).

import csv
import json

csvFilePath = 'your_file.csv'
jsonFilePath = 'your_file.json'

# bcp does not write a header row, so list the table's columns here in
# export order (if your file does include a header row, omit fieldnames).
fieldnames = ['id', 'name', 'balance']

# Write one JSON document per line, the format mongoimport reads by default.
with open(csvFilePath, newline='') as csvFile, \
        open(jsonFilePath, 'w') as jsonFile:
    for row in csv.DictReader(csvFile, fieldnames=fieldnames):
        row['_id'] = row.pop('id')  # use the primary key as MongoDB's _id
        jsonFile.write(json.dumps(row) + '\n')

In this script, replace the following values:

  • `your_file.csv` with the name (and path, if not in the same directory) of the CSV file you created in the previous step
  • `your_file.json` with the desired name for your output JSON file
  • `id` with the name of the primary key column of your Sybase table
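Before running the conversion against a real export, you can exercise the same CSV-to-document logic on a couple of in-memory rows (the column names and values below are made up for illustration):

```python
import csv
import io
import json

# Two rows as they might come out of bcp with a comma field terminator.
sample = "1,Alice,100.50\n2,Bob,99.00\n"
fieldnames = ["id", "name", "balance"]  # the table's columns, in export order

docs = [dict(row) for row in csv.DictReader(io.StringIO(sample),
                                            fieldnames=fieldnames)]
print(json.dumps(docs[0]))  # → {"id": "1", "name": "Alice", "balance": "100.50"}
```

Note that every value arrives as a string; if your collection needs numeric types, convert them at this stage (for example, `float(row["balance"])`).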

Load the Data into the MongoDB Database

In this last step, we will load the data into the MongoDB database. We will do this using the `mongoimport` utility. Below is an example script using the mongoimport command.

mongoimport --db your_database --collection your_collection --file your_file.json

In this command, replace the following values:

  • `your_database` with the name of your MongoDB database
  • `your_collection` with the name of the MongoDB collection you want to import the data into
  • `your_file.json` with the name of the JSON file you created in the previous step

Verify the Data has been Successfully Loaded

The final part of our migration, optional but recommended, is verifying that the data has been successfully loaded into our MongoDB instance. The simplest way is to connect to the MongoDB database and run a query against the imported collection. If you see the expected documents, you can conclude that the data migration to MongoDB is complete.

Limitations and Challenges of Using Custom Code Method

Even though manually moving data from Sybase to MongoDB with custom code is possible, it does come with some significant limitations and challenges. Let’s take a look at a few of them below.

Time-Consuming

Using custom code can be quite time-consuming, especially for large datasets. It involves multiple steps, including data exporting, data conversion, and data importing, which can significantly prolong the process compared to automated tools.

Technical Knowledge Requirement

The method requires a deep understanding of the Sybase and MongoDB database systems, the `bcp` and `mongoimport` utilities, and proficiency in scripting languages used for data transformation (e.g., Python or JavaScript). It also requires knowledge of data structures, particularly JSON format, which MongoDB prefers for data input.

Risk of Inaccuracies and Data Loss

The manual process increases the chance of errors during data transformation or transfer. This could result in inaccurate data or even data loss. This is why validating the data after migration is crucial to ensure data integrity and accuracy, as we did in the final steps above.

Difficulty Scaling

One of the biggest downsides to using custom code is that it can be difficult to scale. Manual methods might not be ideal for large datasets or complex database structures, and at scale they are often less efficient and harder to manage than automated migration tools.

Conclusion

Migrating from a traditional database like Sybase to a NoSQL solution like MongoDB can be a game-changer for many organizations. These types of migrations allow for increased flexibility, scalability, and performance. However, the migration process requires careful planning and execution. Whether you choose an automated tool like Arcion or a hands-on, manual coding approach, this guide has equipped you with the knowledge to implement either and understand the benefits and risks. 

As we saw, Arcion is a real-time, in-memory Change Data Capture (CDC) solution that guarantees scalability and ensures data consistency when moving data from Sybase to MongoDB. To easily implement a data integration strategy, get started with Arcion today by connecting with our team of Sybase and MongoDB migration and replication experts. CDC pipelines built with Arcion are easy to build, maintain, and scale and are production-ready in minutes.

Matt is a developer at heart with a passion for data, software architecture, and writing technical content. In the past, Matt worked at some of the largest finance and insurance companies in Canada before pivoting to working for fast-growing startups.
Luke has two decades of experience working with database technologies and has worked for companies like Oracle, AWS, and MariaDB. He is experienced in C++, Python, and JavaScript. He now works at Arcion as an Enterprise Solutions Architect to help companies simplify their data replication process.