When migrating data from PostgreSQL, Oracle Database is a prevalent destination. For actually migrating the data, there are plenty of different methods that can be used. In this blog, we will look at two approaches: one using Arcion for an automated approach to data migration and a custom code method that requires various manual steps. First, let’s look at Postgres and Oracle a bit further before diving into the migration details.
Introduction to Postgres
Originating from the work done during the University of California's Ingress project, PostgreSQL, often termed Postgres derived from “POST inGRES”, is a versatile open-source RDBMS. Not only does it adhere to SQL standards, but its distinct extensibility allows for creating unique data types, functions, and more, coupled with ACID compliance underpinning the database. Powered by an active community, it continuously receives a steady flow of innovative features and security patches covering the latest vulnerabilities. As an active open-source database, Postgres is a testament to the power of open collaboration with its comprehensive documentation and vast support.
Key Features of Postgres
As a free-to-use software, Postgres is chosen by many for its cost-effectiveness. It rivals proprietary systems in performance and versatility, mainly due to community contributions that enhance its feature set at an accelerated pace.
With the implementation of write-ahead logs, Postgres ensures data integrity and consistency. It fully embodies the ACID principles and supports complex database operations like foreign keys, joins, and stored procedures.
A solid adherence to SQL standards ensures that familiar SQL tools and commands directly apply to Postgres, eliminating the learning curve for new users.
Postgres has features such as free-text search, JSON data type support, and point-in-time recovery. These allow for a diverse application range, from No-SQL workloads to geospatial systems.
Developers can seamlessly integrate Postgres using popular programming languages like Python, Java, and Ruby, streamlining their projects' database-related tasks.
Introduction to Oracle
Oracle is a popular Relational Database Management System (RDBMS) produced and marketed by Oracle Corporation. It is one of the most used RDBMS to store, organize, and retrieve information. Oracle was the first database to use Enterprise Grid Computing (EGC) to perform tasks. EGC helps retrieve data from any location through a group of computers connected to a network. This makes Oracle a flexible and cost-effective way to manage information and applications.
Oracle runs on multiple types of servers across different operating systems, such as Windows, UNIX, Linux, and macOS. Oracle is a multi-model database system where each database is a data collection treated as a unit to store and retrieve related information. Oracle offers in-memory, NoSQL, and MySQL databases through on-premise deployments or via Oracle Cloud.
Oracle has five editions of its database to choose from: Standard Edition One, Standard Edition, Enterprise Edition, Express Edition (XE), and Personal Edition. For further information on Oracle, visit here.
Key Features of Oracle
Scalability and Performance
Oracle boasts high scalability and performance, being optimized for large enterprise environments. It ensures data concurrency, consistency, and reduced latency for improved querying efficiency.
As an enterprise solution, Oracle integrates top-tier security features. This includes robust authorization mechanisms, data encryption (at rest and in transit), data masking, continuous auditing, and comprehensive security assessments.
Backup and Recovery
With built-in features for data recovery during failovers, Oracle minimizes data loss risks. Its backup and restore capabilities further solidify its resilience against data setbacks.
Oracle is versatile in its deployment options and can be easily deployed on OS’s such as Windows, MacOS, and Linux. It can also be used in cloud or hybrid settings, which is great for tailoring the platform for each user's individual needs.
How to Migrate Data from Postgres to Oracle
In this section, you will be shown how to migrate data from Postgres to Oracle using two methods. The first approach will be through Arcion, a data management platform with sub-second latency, data consistency, and Change Data Capture (CDC) Support. Arcion can simplify the process of migrating data from Postgres to Oracle as it has automatic schema conversion. The second method explored below is the manual method of moving data from Postgres to Oracle using command line tools and other SQL developer resources.
Method 1: Postgres to Oracle using Arcion
To use Arcion to migrate data from Postgres to Oracle, you need to download and install the Arcion Self-Hosted version. Arcion also has a cloud offering, but for this tutorial, you will utilize the version that supports the on-premises installation.
Arcion’s CLI tool, Replicant, is what you will use to migrate data from PostgreSQL to Oracle. To enable this to work correctly, you will need to download Replicant, set up the necessary users and permissions, and configure the required settings. Below are a series of steps to help you achieve this.
Step 1: Download And Install Arcion Self-hosted
You will first need to download and install the self-hosted version of Arcion. To do so, follow the steps outlined on the Arcion Self-hosted webpage. Once you have downloaded Arcion’s CLI tool Replicant, you must create a home directory for it. By default, the home directory will be where Replicant was downloaded. From now on, this home directory will just be referred to as $REPLICANT_HOME in the examples below. The next step is adding your credentials to your Arcion instance, which is covered in the quick start guide that you can reference for more details on how to do this.
Having downloaded and installed Arcion Self-hosted, you can move on to the next steps, where we will configure and enable Arcion to connect with PostgreSQL and Oracle Database.
Step 2: Create a user in PostgreSQL
After the installation of Arcion, the next step is to create a user in PostgreSQL by executing the following steps.
- Connect to your database server and log into the PostgreSQL client.
- Create a user that will be used for replication.
- Grant the permissions listed below.
Step 3: Setup PostgreSQL for Replication
First, you’ll need to open the postgresql.conf file so it can be edited. Below is the command you can use to edit it with vi.
Next, set up the parameters as shown below in postgresql.conf
Lastly, to enable log consumption for CDC replication on the PostgreSQL server, the test_decoding plugin that is, by default, installed in PostgreSQL can be used. Alternatively, you can install the logical decoding plugin wal2json. To use wal2json, first install it using the documentation here.
Next, create a replication slot using the SQL command below.
You can use the command below to verify that the slot has been created.
Step 4: Setup Connection Configuration
From $REPLICANT_HOME, go to the connection configuration file.
Next, you can instruct Replicant to retrieve your connection credentials from AWS Secrets Manager if stored there. If not, simply put them as shown in the sample below.
You can also enable SSL for the connection by including the SSL field and specifying the needed parameters.
Step 5: Setup Filter Configuration for PostgreSQL
From $REPLICANT_HOME, open the filter configuration file.
Specify the data to be replicated based on your replication needs using the template format below.
Step 6: Setup Extractor Configuration
To set up the extractor configuration, you must specify the configuration in the Extractor configuration file according to your requirements. A sample Extractor configuration file, postgresql.yaml, is in the $REPLICANT_HOME/conf/src directory. In this file, you can configure either snapshot, realtime, or delta-snapshot replication modes based on your needs by specifying parameters within each respective mode.
Below is an example of the configuration for snapshot mode.
Step 7: Setting Up The Connection Configuration for Oracle
With your PostgreSQL connection configured, you can now set up the Oracle connection.
For Replicant to work with Oracle Database, you must install Oracle JDBC Driver as a dependency. To do so, follow the steps below.
- Go to the Oracle Database JDBC driver Downloads page.
- Download the Oracle JDBC Driver ojdbc8.jar.
- Put the ojdbc8.jar file inside the $REPLICANT_HOME/lib directory.
Next, create a shared directory to which the Replicant host and the Oracle host will have READ and WRITE access.
After that, you can set up Oracle user permissions for the host replicant user. In the examples below, the shared directory created in the step above is assumed to be in the following path: /data/shared_fs
You must then manually create a user schema and a schema named io_replicate. Grant both of them permission to access a tablespace from an Oracle client.
With the appropriate permissions set, you can set up the connection configuration. From $REPLICANT_HOME, go to the sample configuration and edit it. Below is the command to edit the configuration using vi.
Below is the sample configuration.
Ensure that the specified user for the migration has the following schema privileges that enable the data replication in the tables to be created.
- CREATE TABLE
- CREATE SCHEMA
- CREATE ANY DIRECTORY
- ALTER TABLE
Step 8: Setting Up The Applier Configuration
Next, you need to add the Applier configuration for Oracle. The Applier configuration file contains all the parameters Replicant uses while loading data into the Oracle Database.
From $REPLICANT_HOME, go to the sample Oracle Applier configuration file:
Below is a sample configuration.
Step 9: Running Replicant
From $REPLICANT_HOME, run the following command to execute Replicant in snapshot mode. When Replicant starts up, the pipeline will be immediately activated after executing the command.
After executing the command, the pipeline is started in snapshot mode, and the data migration process begins.
Advantages of using Arcion
Some benefits of using an integrated data management platform like Arcion instead of a manual or native database solution when migrating data from Postgres to Oracle are highlighted below.
Thanks to its distributed & highly scalable architecture, Arcion is the only CDC solution with an underlying end-to-end multi-threaded architecture supporting vertical and horizontal auto-scaling. Its patent-pending technology parallelizes every single Arcion CDC process for maximum throughput. So, users get ultra-low latency and maximum throughput even as data volume grows.
Automated Schema Management
Arcion allows smooth schema management of your data, ensuring data integrity, reliability, and consistency. As the schema evolves on the MySQL instance, these changes will automatically be applied to the BigQuery instance to keep both in sync.
100% Agentless CDC
Arcion is the only CDC vendor that offers 100% agentless CDC to all its supported 20+ connectors. Arcion’s agentless CDC connectors apply to all the complex enterprise databases modern enterprises use, like Microsoft SQL Server, MongoDB, and all versions of Oracle. Arcion reads directly from the transaction logs, never from the database itself. Previously, data teams faced administrative nightmares and security risks associated with running agent-based software in production environments. You can now replicate data in real-time, at scale, with guaranteed delivery — but without the inherent performance issues or the security concerns of agent-based connectors.
Transactional Integrity and Consistency
Arcion provides transactional integrity and data consistency through its CDC technology. To further this effort, Arcion also has built-in data validation support that works automatically and efficiently to maintain data integrity. It offers a solution for scalable data migration and replication while ensuring zero data loss has occurred.
No-code, Easy Setup
Effortless setup & maintenance: Arcion's no-code platform removes DevOps dependencies; you do not need to incorporate Kafka, Spark Streaming, Kinesis, or other streaming tools. So you can simplify the data architecture, saving both time and cost.
Stay Compliant by Default
Arcion is SOC 2 Type 1 & Type 2, HIPAA, and PCI compliant. The enterprise-grade security and compliance standards ensure data governance.
Method 2: Postgres to Oracle (Manual Method)
To manually migrate data from Postgres to Oracle, you must first extract it from the Postgres database, then import it into Oracle DBMS using a tool like Oracle SQL Developer. Oracle SQL Developer is a Graphical User Interface (GUI) tool by Oracle that provides a convenient way for developers to perform database tasks, including importing databases for migration purposes. First, let’s look at extracting the data from PostgreSQL.
Step 1: Extract the Data from PostgreSQL
There are several ways of extracting or exporting data from Postgres. For example, you can use the COPY TO command or the pg_dump utility to back up databases.
To use the COPY TO command, you can use the sample COPY TO command below. This command will extract data from the specified table and load the contents into a CSV file.
Similarly, you can run the pg_dump utility to extract data from a database. Below is a sample command that exports an entire database into a SQL file called database.sql.
Step 2: Import The Data Into Oracle
The next step is to import the extracted data into Oracle DBMS. To achieve this, you will utilize Oracle SQL Developer to bring the PostgreSQL data into the Oracle instance.
In Oracle SQL Developer, go to Database on the menu, then click Import Data to import this data into a new table. Next, input the necessary configuration details in the Data Import Wizard and select your desired CSV file or data to import.
Go to Data Import > Data Formats, where you will specify the data format from the source database, and then go to the Data Import > Mapping tab to change the mapping of the data types if you so desire. You can now click on the Import button to start the migration process. Once the import is complete, your Postgres data will be available in Oracle.
Limitations and Challenges of Using Custom Code Method
The custom code method described above is a manual process that is tedious and prone to errors and data loss. It is a brittle process that must be managed carefully and can lead to poor performance, especially at scale, and a drop in productivity for the developers implementing it. Below are further limitations of the custom code approach.
Data Type and Conversion Challenges
PostgreSQL and Oracle may handle data types and data conversion differently. This can lead to potential challenges when migrating data, especially with complex data types or transformations. To ensure data is migrated successfully, the migration requires careful attention and validation to ensure the data is accurately converted during the migration process.
Manual Effort and Maintenance
Implementing custom scripts for data migration requires manual effort and ongoing maintenance. As the data structures or schemas change over time, the scripts to load the data must be updated to ensure the proper transformations.
Lack of Automation
The custom code approach does not have the automation capabilities offered by data management platforms like Arcion. If migrations are required to be performed on an ongoing basis, building the pipelines to move the data manually will consume much developer time.
This article looked at two popular and powerful database management systems - PostgreSQL and Oracle Database. You were also shown their key features and what makes them stand out amongst database systems. Next, we explored two approaches for migrating data from Postgres to Oracle. The first method utilized a data management platform, Arcion, and the second method was a manual process of moving data from Postgres to Oracle. The key advantages of using a platform like Arcion were outlined alongside the limitations of a custom code approach.
At this point, you should have enough knowledge and the background information required to migrate your data from Postgres to Oracle Database successfully. A solution like Arcion leverages real-time, in-memory Change Data Capture (CDC) that guarantees scalability and ensures data consistency is preferred when moving data from PostgreSQL to Oracle. To efficiently implement a data integration strategy, get started with Arcion today by connecting with our team. CDC pipelines built with Arcion are easy to build, maintain, and scale and are production-ready in minutes.