Postgres To Oracle: Load Data In Minutes

Luke Smith
Enterprise Solutions Architect
August 29, 2023
Matt Tanner
Developer Relations Lead
August 29, 2023
16 min read

When migrating data from PostgreSQL, Oracle Database is a prevalent destination. For actually migrating the data, there are plenty of different methods that can be used. In this blog, we will look at two approaches: one using Arcion for an automated approach to data migration and a custom code method that requires various manual steps. First, let’s look at Postgres and Oracle a bit further before diving into the migration details.

Introduction to Postgres

Originating from the University of California, Berkeley's Ingres project, PostgreSQL, often shortened to Postgres (a name derived from "post-Ingres"), is a versatile open-source RDBMS. It adheres to SQL standards, and its distinct extensibility allows users to create custom data types, functions, and more, with ACID compliance underpinning the database. Powered by an active community, it continuously receives a steady flow of new features and security patches covering the latest vulnerabilities. As an actively developed open-source database with comprehensive documentation and broad support, Postgres is a testament to the power of open collaboration.

Key Features of Postgres

Open-Source

As free-to-use software, Postgres is chosen by many for its cost-effectiveness. It rivals proprietary systems in performance and versatility, largely thanks to community contributions that enhance its feature set at an accelerated pace.

Reliability

With the implementation of write-ahead logs, Postgres ensures data integrity and consistency. It fully embodies the ACID principles and supports complex database operations like foreign keys, joins, and stored procedures.

SQL-Compatible

A solid adherence to SQL standards means that familiar SQL tools and commands apply directly to Postgres, reducing the learning curve for new users.

Advanced Features

Postgres offers advanced features such as full-text search, JSON data type support, and point-in-time recovery. These enable a diverse range of applications, from NoSQL-style workloads to geospatial systems.
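As a small, purely illustrative example of the JSON support, the following statements (with hypothetical table and column names) store and query JSON documents using standard SQL:

-- Illustrative only: store and query JSON documents in Postgres
CREATE TABLE events (id serial PRIMARY KEY, payload jsonb);
INSERT INTO events (payload) VALUES ('{"type": "login", "user": "alice"}');
SELECT payload->>'user' AS user_name FROM events WHERE payload->>'type' = 'login';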

Language Support

Developers can seamlessly integrate Postgres using popular programming languages like Python, Java, and Ruby, streamlining their projects' database-related tasks.

Introduction to Oracle

Oracle is a popular Relational Database Management System (RDBMS) produced and marketed by Oracle Corporation. It is one of the most widely used RDBMSs for storing, organizing, and retrieving information. Oracle was the first database to use Enterprise Grid Computing (EGC), which allows data to be retrieved from any location through a group of networked computers. This makes Oracle a flexible and cost-effective way to manage information and applications.

Oracle runs on multiple types of servers across different operating systems, such as Windows, UNIX, Linux, and macOS. Oracle is a multi-model database system where each database is a data collection treated as a unit to store and retrieve related information. Oracle offers in-memory, NoSQL, and MySQL databases through on-premise deployments or via Oracle Cloud.

Oracle has five editions of its database to choose from: Standard Edition One, Standard Edition, Enterprise Edition, Express Edition (XE), and Personal Edition. For further information, refer to the Oracle Database documentation.

Key Features of Oracle

Scalability and Performance

Oracle boasts high scalability and performance, being optimized for large enterprise environments. It ensures data concurrency, consistency, and reduced latency for improved querying efficiency.

Security

As an enterprise solution, Oracle integrates top-tier security features. This includes robust authorization mechanisms, data encryption (at rest and in transit), data masking, continuous auditing, and comprehensive security assessments.

Backup and Recovery

With built-in features for data recovery during failovers, Oracle minimizes data loss risks. Its backup and restore capabilities further solidify its resilience against data setbacks.

Cross-Platform

Oracle is versatile in its deployment options and can be deployed on operating systems such as Windows, macOS, and Linux. It can also run in cloud or hybrid settings, making it easy to tailor the platform to each user's individual needs.

How to Migrate Data from Postgres to Oracle

In this section, you will be shown how to migrate data from Postgres to Oracle using two methods. The first approach uses Arcion, a data management platform with sub-second latency, data consistency, and Change Data Capture (CDC) support; Arcion can simplify the process of migrating data from Postgres to Oracle thanks to its automatic schema conversion. The second method explored below is the manual approach of moving data from Postgres to Oracle using command-line tools and Oracle SQL Developer.

Method 1: Postgres to Oracle using Arcion

To use Arcion to migrate data from Postgres to Oracle, you need to download and install the Arcion Self-Hosted version. Arcion also has a cloud offering, but for this tutorial, you will utilize the version that supports the on-premises installation.

Arcion’s CLI tool, Replicant, is what you will use to migrate data from PostgreSQL to Oracle. To enable this to work correctly, you will need to download Replicant, set up the necessary users and permissions, and configure the required settings. Below are a series of steps to help you achieve this.

Step 1: Download And Install Arcion Self-hosted

You will first need to download and install the self-hosted version of Arcion. To do so, follow the steps outlined on the Arcion Self-hosted webpage. Once you have downloaded Arcion's CLI tool, Replicant, you must create a home directory for it. By default, the home directory is the location where Replicant was downloaded; from here on, it will be referred to as $REPLICANT_HOME in the examples below. The next step is to add your credentials to your Arcion instance, which is covered in the quick start guide.

Having downloaded and installed Arcion Self-hosted, you can move on to the next steps, where we will configure and enable Arcion to connect with PostgreSQL and Oracle Database.

Step 2: Create a user in PostgreSQL

After installing Arcion, the next step is to create a user in PostgreSQL by executing the steps below (a concrete example follows the list).

  • Connect to your database server and log into the PostgreSQL client.
psql -U $POSTGRESQL_ROOT_USER
  • Create a user that will be used for replication.
CREATE USER <username> PASSWORD '<password>';
  • Grant the permissions listed below.
GRANT USAGE ON SCHEMA "<schema>" TO <username>;
GRANT SELECT ON ALL TABLES IN SCHEMA "<schema>" TO <username>;
ALTER ROLE <username> WITH REPLICATION;
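For illustration, assuming a hypothetical replication user named arcion_repl and the default public schema (the user name, password, and schema are placeholders), the full sequence would look like this:

-- Illustrative values only; substitute your own user, password, and schema
CREATE USER arcion_repl PASSWORD 'StrongPass#1';
GRANT USAGE ON SCHEMA "public" TO arcion_repl;
GRANT SELECT ON ALL TABLES IN SCHEMA "public" TO arcion_repl;
ALTER ROLE arcion_repl WITH REPLICATION;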

Step 3: Setup PostgreSQL for Replication

First, you’ll need to open the postgresql.conf file so it can be edited. Below is the command you can use to edit it with vi.

vi $PGDATA/postgresql.conf

Next, set the following parameters in postgresql.conf:

wal_level = logical
max_replication_slots = 1

Lastly, to enable log consumption for CDC replication on the PostgreSQL server, you can use the test_decoding plugin that is installed in PostgreSQL by default. Alternatively, you can install the wal2json logical decoding plugin; to use wal2json, first install it by following the wal2json documentation.
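As a rough sketch, wal2json is typically built from source against your local PostgreSQL installation; the steps below assume git, a C toolchain, and the PostgreSQL development headers are available, and that pg_config is on your PATH.

# Illustrative build-from-source steps for wal2json; verify against the wal2json documentation
git clone https://github.com/eulerto/wal2json.git
cd wal2json
make                # builds against the pg_config found on your PATH
sudo make install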

Next, create a replication slot using the SQL command below.

SELECT 'init' FROM pg_create_logical_replication_slot('<replication_slot_name>', 'wal2json');

You can use the command below to verify that the slot has been created.

SELECT * from pg_replication_slots;

Step 4: Setup Connection Configuration

From $REPLICANT_HOME, go to the connection configuration file.

vi conf/conn/postgresql.yaml

Next, you can instruct Replicant to retrieve your connection credentials from AWS Secrets Manager if stored there. If not, simply put them as shown in the sample below.

type: POSTGRESQL


host: localhost #Replace localhost with your PostgreSQL host name
port: 5432 #Replace the default port number 5432 if needed


database: "postgres" #Replace postgres with your database name
username: "replicant" #Replace replicant with your postgresql username
password: "Replicant#123" #Replace Replicant#123 with your user's password


max-connections: 30 #Maximum number of connections replicant can open in postgresql
socket-timeout-s: 60 #The timeout value for socket read operations. The timeout is in seconds and a value of zero means that it is disabled.
max-retries: 10 #Number of times any operation on the source system will be re-attempted on failures.
retry-wait-duration-ms: 1000 #Duration in milliseconds Replicant should wait before performing the next retry of a failed operation.


#List your replication slots (slots which hold the real-time changes of the source database) as follows
replication-slots:
  io_replicate: #Replace "io_replicate" with your replication slot name
    - wal2json #plugin used to create the replication slot (wal2json | test_decoding)
  io_replicate1: #Replace "io_replicate1" with your replication slot name
    - wal2json


log-reader-type: {STREAM|SQL} #Specify either STREAM or SQL

You can also enable SSL for the connection by including the SSL field and specifying the needed parameters.

ssl:
  ssl-cert: <full_path_to_SSL_certificate_file>
  root-cert: <full_path_to_SSL_root_certificate_file>
  ssl-key: <full_path_to_SSL_key_file>

Step 5: Setup Filter Configuration for PostgreSQL

From $REPLICANT_HOME, open the filter configuration file.

vi filter/postgresql_filter.yaml

Specify the data to be replicated based on your replication needs using the template format below.

allow:
  catalog: <your_catalog_name>
  schema: <your_schema_name>
  types: <your_object_type>


#If no collections are specified, all the data tables in the provided catalog and schema will be replicated
allow:
  <your_table_name>:
    allow: ["your_column_name"]
    conditions: "your_condition"


  <your_table_name>:
    allow: ["your_column_name"]
    conditions: "your_condition"


  <your_table_name>:

Step 6: Setup Extractor Configuration

To set up the extractor configuration, you must specify the configuration in the Extractor configuration file according to your requirements. A sample Extractor configuration file, postgresql.yaml, is in the $REPLICANT_HOME/conf/src directory. In this file, you can configure either snapshot, realtime, or delta-snapshot replication modes based on your needs by specifying parameters within each respective mode.

Below is an example of the configuration for snapshot mode.

snapshot:
  threads: 16
  fetch-size-rows: 5_000


  _traceDBTasks: true
  min-job-size-rows: 1_000_000
  max-jobs-per-chunk: 32


  per-table-config:
  - catalog: tpch
    schema: public
    tables:
      lineitem:
        row-identifier-key: [l_orderkey, l_linenumber]
        split-key: l_orderkey
        split-hints:
          row-count-estimate: 15000
          split-key-min-value: 1
          split-key-max-value: 60_00

Step 7: Setting Up The Connection Configuration for Oracle

With your PostgreSQL connection configured, you can now set up the Oracle connection.

For Replicant to work with Oracle Database, you must install the Oracle JDBC Driver as a dependency. To do so, follow the steps below.

  1. Go to the Oracle Database JDBC driver Downloads page.
  2. Download the Oracle JDBC Driver ojdbc8.jar.
  3. Put the ojdbc8.jar file inside the $REPLICANT_HOME/lib directory.

Next, create a shared directory to which the Replicant host and the Oracle host will have READ and WRITE access.
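As a minimal sketch on Linux (the path, owner, group, and permissions below are assumptions; in practice this directory is often an NFS mount visible to both hosts):

# Illustrative only: create the shared staging directory used by Replicant and Oracle
sudo mkdir -p /data/shared_fs
sudo chown oracle:oinstall /data/shared_fs   # owner and group are assumptions; match your Oracle installation
sudo chmod 770 /data/shared_fs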

After that, you can set up Oracle user permissions for the host replicant user. In the examples below, the shared directory created in the step above is assumed to be in the following path: /data/shared_fs

GRANT CREATE TABLE TO <USERNAME>;
 --If you are unable to provide the permission above, you must manually create all the tables


 GRANT CREATE ANY DIRECTORY TO <USERNAME>;
 --If you are unable to provide the permission above, you must manually create the following directories:
 CREATE OR REPLACE DIRECTORY csv_data_dir AS '/data/shared_fs';
 CREATE OR REPLACE DIRECTORY csv_log_dir AS '/data/shared_fs';




 GRANT ALTER TABLE TO <USERNAME>;

You must then manually create a schema for the replication user as well as a schema named io_replicate, and grant both of them permission to access a tablespace from an Oracle client. A sketch of these statements is shown below.
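The following is a minimal sketch only, assuming a replication user named replicant, illustrative passwords, and the USERS tablespace; adjust all names, passwords, and tablespaces to your environment.

-- Illustrative only: create the replication user's schema and the io_replicate schema,
-- then give each a quota on an existing tablespace (USERS is assumed here)
CREATE USER replicant IDENTIFIED BY "Replicant#123" QUOTA UNLIMITED ON USERS;
CREATE USER io_replicate IDENTIFIED BY "Replicant#123" QUOTA UNLIMITED ON USERS;
GRANT CREATE SESSION TO replicant;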

With the appropriate permissions set, you can set up the connection configuration. From $REPLICANT_HOME, go to the sample configuration and edit it. Below is the command to edit the configuration using vi.

vi conf/conn/oracle_dst.yaml

Below is the sample configuration.

type: ORACLE


host: localhost #Replace localhost with your oracle host name
port: 1521 #Replace the default port number 1521 if needed
service-name: IO #Replace IO with the service name that contains the schema you will be replicating to


username: 'REPLICANT' #Replace REPLICANT with the username of the user that connects to your Oracle server
password: 'Replicant#123' #Replace Replicant#123 with your user's password


stage:
  type: SHARED_FS
  root-dir: /data/shared_fs  #Enter the path of the shared directory


max-connections: 30 #Maximum number of connections replicant can open in Oracle
max-retries: 10 #Number of times any operation on the system will be re-attempted on failures.
retry-wait-duration-ms: 1000 #Duration in milliseconds Replicant should wait before performing the next retry of a failed operation


#charset: AL32UTF8 #Character set to use when transferring data. This option must match the charset value in the source connection configuration.

Ensure that the user specified for the migration has the following schema privileges so that the tables for data replication can be created and loaded:

  • CREATE TABLE
  • CREATE SCHEMA
  • CREATE ANY DIRECTORY
  • ALTER TABLE

Step 8: Setting Up The Applier Configuration

Next, you need to add the Applier configuration for Oracle. The Applier configuration file contains all the parameters Replicant uses while loading data into the Oracle Database.

From $REPLICANT_HOME, go to the sample Oracle Applier configuration file:

vi conf/dst/oracle.yaml

Below is a sample configuration.

snapshot:
  enable-partition-load: true
  disable-partition-wise-load-on-failure: false


  bulk-load: #Arcion can leverage the underlying support for FILE-based bulk loading into the target system
    enable: true
    type: FILE
    serialize: true #or false
    method: SQL_LOADER
 
 # native-load config is only available for Oracle -> Oracle replication
 #  native-load:
 #    enable: false
 #    stage-type: SHARED_FS
 #    directory: SHARED_STAGE
 #    shared-path: path-to-nfs

Step 9: Running Replicant

From $REPLICANT_HOME, run the following command to execute Replicant in snapshot mode.

./bin/replicant snapshot conf/conn/postgresql.yaml \
conf/conn/oracle_dst.yaml \
--extractor conf/src/postgresql.yaml \
--applier conf/dst/oracle.yaml \
--id repl2 --replace --overwrite

After executing the command, the pipeline is started in snapshot mode, and the data migration process begins.

Advantages of using Arcion

Some benefits of using an integrated data management platform like Arcion instead of a manual or native database solution when migrating data from Postgres to Oracle are highlighted below.

Sub-second latency 

Arcion is the only CDC solution with an underlying end-to-end multi-threaded, distributed, and highly scalable architecture that supports both vertical and horizontal auto-scaling. Its patent-pending technology parallelizes every Arcion CDC process for maximum throughput, so users get ultra-low latency and high throughput even as data volumes grow.

Automated Schema Management

Arcion allows smooth schema management of your data, ensuring data integrity, reliability, and consistency. As the schema evolves on the Postgres instance, these changes will automatically be applied to the Oracle instance to keep both in sync.

100% Agentless CDC

Arcion is the only CDC vendor that offers 100% agentless CDC to all its supported 20+ connectors. Arcion’s agentless CDC connectors apply to all the complex enterprise databases modern enterprises use, like Microsoft SQL Server, MongoDB, and all versions of Oracle. Arcion reads directly from the transaction logs, never from the database itself. Previously, data teams faced administrative nightmares and security risks associated with running agent-based software in production environments. You can now replicate data in real-time, at scale, with guaranteed delivery — but without the inherent performance issues or the security concerns of agent-based connectors.

Transactional Integrity and Consistency

Arcion provides transactional integrity and data consistency through its CDC technology. To further this effort, Arcion also has built-in data validation support that works automatically and efficiently to maintain data integrity. It offers a solution for scalable data migration and replication while ensuring zero data loss.

No-code, Easy Setup

Effortless setup & maintenance: Arcion's no-code platform removes DevOps dependencies; you do not need to incorporate Kafka, Spark Streaming, Kinesis, or other streaming tools. So you can simplify the data architecture, saving both time and cost.

Stay Compliant by Default

Arcion is SOC 2 Type 1 & Type 2, HIPAA, and PCI compliant. The enterprise-grade security and compliance standards ensure data governance.

Method 2: Postgres to Oracle (Manual Method)

To manually migrate data from Postgres to Oracle, you must first extract it from the Postgres database, then import it into Oracle DBMS using a tool like Oracle SQL Developer. Oracle SQL Developer is a Graphical User Interface (GUI) tool by Oracle that provides a convenient way for developers to perform database tasks, including importing databases for migration purposes. First, let’s look at extracting the data from PostgreSQL.

Step 1: Extract the Data from PostgreSQL

There are several ways of extracting or exporting data from Postgres. For example, you can use the COPY TO command or the pg_dump utility to back up databases.

To use the COPY TO command, you can use the sample COPY TO command below. This command will extract data from the specified table and load the contents into a CSV file.

COPY table_name TO 'C:\table.csv' DELIMITER ',' CSV HEADER;
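COPY can also export the result of a query rather than an entire table; the column names, filter, and output path below are purely illustrative.

-- Illustrative only: export selected columns and rows to a CSV file
COPY (SELECT id, name FROM table_name WHERE active = true) TO '/tmp/table_subset.csv' DELIMITER ',' CSV HEADER;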

Similarly, you can run the pg_dump utility to extract data from a database. Below is a sample command that exports an entire database into a SQL file called database.sql.

pg_dump database_name > database.sql
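pg_dump can also narrow the export; for instance, the hypothetical command below dumps only the data (no DDL) of a single table, with the table, user, and database names as placeholders.

# Illustrative only: dump just the rows of one table as plain SQL
pg_dump --data-only --table=table_name -U replicant database_name > table_data.sql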

Step 2: Import The Data Into Oracle

The next step is to import the extracted data into Oracle DBMS. To achieve this, you will utilize Oracle SQL Developer to bring the PostgreSQL data into the Oracle instance.

In Oracle SQL Developer, go to Database on the menu, then click Import Data to import this data into a new table. Next, input the necessary configuration details in the Data Import Wizard and select your desired CSV file or data to import.

Go to Data Import > Data Formats, where you will specify the data format from the source database, and then go to the Data Import > Mapping tab to change the mapping of the data types if you so desire. You can now click on the Import button to start the migration process. Once the import is complete, your Postgres data will be available in Oracle.

Limitations and Challenges of Using Custom Code Method

The custom code method described above is a manual process that is tedious and prone to errors and data loss. It is a brittle process that must be managed carefully and can lead to poor performance, especially at scale, and a drop in productivity for the developers implementing it. Below are further limitations of the custom code approach.

Data Type and Conversion Challenges

PostgreSQL and Oracle may handle data types and data conversion differently, which can lead to challenges when migrating data, especially with complex data types or transformations. The migration therefore requires careful attention and validation to ensure the data is accurately converted during the process.

Manual Effort and Maintenance

Implementing custom scripts for data migration requires manual effort and ongoing maintenance. As the data structures or schemas change over time, the scripts to load the data must be updated to ensure the proper transformations.

Lack of Automation

The custom code approach does not have the automation capabilities offered by data management platforms like Arcion. If migrations are required to be performed on an ongoing basis, building the pipelines to move the data manually will consume much developer time.

Conclusion

This article looked at two popular and powerful database management systems - PostgreSQL and Oracle Database. You were also shown their key features and what makes them stand out amongst database systems. Next, we explored two approaches for migrating data from Postgres to Oracle. The first method utilized a data management platform, Arcion, and the second method was a manual process of moving data from Postgres to Oracle. The key advantages of using a platform like Arcion were outlined alongside the limitations of a custom code approach. 

At this point, you should have enough knowledge and background information to migrate your data from Postgres to Oracle Database successfully. A solution like Arcion, which leverages real-time, in-memory Change Data Capture (CDC) to guarantee scalability and ensure data consistency, is preferred when moving data from PostgreSQL to Oracle. To efficiently implement a data integration strategy, get started with Arcion today by connecting with our team. CDC pipelines built with Arcion are easy to build, maintain, and scale, and are production-ready in minutes.

Matt is a developer at heart with a passion for data, software architecture, and writing technical content. In the past, Matt worked at some of the largest finance and insurance companies in Canada before pivoting to working for fast-growing startups.
Luke has two decades of experience working with database technologies and has worked for companies like Oracle, AWS, and MariaDB. He is experienced in C++, Python, and JavaScript. He now works at Arcion as an Enterprise Solutions Architect to help companies simplify their data replication process.