Are High Performance, No-Code, Live Data Replication Groups a part of your Data Ingestion Solution? They should be.

by Alton Dinsmore

July 17, 2020 1:16pm


Data-powered businesses need a live data replication strategy to ensure accuracy, accessibility, and consistency of data across the various silos in an organization. As data architectures grow more complex, with multi-cloud and hybrid environments, on-premises systems, a mix of vendors and platforms, and an ever-increasing volume and velocity of incoming data, bulk management systems for replication are essential.

Interestingly, in many do-it-yourself builds and newer commercial solutions, replication use cases were simply not considered at the POC stage. When a manual approach must be bolted on down the road to handle initial data capture and to keep that initial capture synchronized with change data capture (CDC), the slog is real.

Replication groups totally change this landscape. They can facilitate large data migrations, cross-platform data warehousing (replicating to a data lake or data warehouse), and the management of many thousands of objects in simple ways. Add a no-code UI for creating these groups and for easily including or excluding tables, and you have a scalable solution with rapid deployment at your fingertips.



Replication Groups in Action with Equalum

When you come into the Equalum dashboard, you begin with the Classic View, showing sources on the left, targets on the right, and flows in the center. To navigate to replication mode, simply click the toggle at the top of the screen, as shown in Fig. 1.



Fig. 1



In this example, we will perform a simple replication from one of the sources to one of the targets: Oracle to Postgres (Fig. 2). Looking at our Postgres target, there are various schemas that you can replicate the data into:

Fig. 2


To identify the tables that we want to replicate, we enter table name patterns for the chosen source. You can use as many table patterns as you like in Equalum when setting up a replication group.

  • Select Your Schema: In this example, we selected the “OT” schema from Oracle.
  • Enter Table Name Pattern: We added “%” to capture all of the tables within the OT schema. You can then explicitly remove certain tables that you do not want to include in the replication, and just as easily click the removed tables to add them back into your replication group.
  • Add Additional Table Pattern Matching: If you would like to further narrow the tables to replicate, you can create a second table pattern match to exclude tables within the selected schema (Fig. 3).



Fig. 3. In this example, we wanted to exclude tables within the OT schema whose names begin with EQ. We selected OT in the Schema drop-down, entered “EQ%” into Table Name Patterns to match every table that follows the EQ naming convention, and chose “Exclude” as the action. We now have a streamlined set of tables in the Replication Group Tables list at the bottom of the window that meets the two criteria set above.
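The include and exclude patterns behave much like SQL LIKE patterns, where “%” matches any run of characters. As a rough mental model only, not Equalum's implementation, the selection logic for the example above can be sketched in Python, with the table names invented for illustration:

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Convert a SQL LIKE-style pattern ('%' = any run, '_' = one char) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)

def resolve_tables(all_tables, rules):
    """Apply (action, pattern) rules in order to build the replication group."""
    selected = set()
    for action, pattern in rules:
        matched = {t for t in all_tables if like_to_regex(pattern).match(t)}
        selected = selected | matched if action == "Include" else selected - matched
    return sorted(selected)

# Hypothetical tables in the OT schema, mirroring the example above.
ot_tables = ["CUSTOMERS", "ORDERS", "ORDER_ITEMS", "EQ_AUDIT", "EQ_STAGING"]
rules = [("Include", "%"), ("Exclude", "EQ%")]
print(resolve_tables(ot_tables, rules))
# ['CUSTOMERS', 'ORDERS', 'ORDER_ITEMS']
```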


The “Validate Tables” button at the bottom of the dashboard window performs two critical tasks.

Ensures Capability and Authority - Once clicked, the system checks that these tables can be created within the target schema and that we have the authority to do so.


Identifies Duplicates against existing Stream Tables - Any tables that are already being replicated from this source to the chosen target will be highlighted, allowing you to avoid replicating the data a second time. You can move the duplicates into the explicitly removed tables window, and then create your replication group.
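To picture what this kind of validation involves on a Postgres target, here is a plain standalone script; it is not Equalum's implementation, and the connection string and table list are hypothetical. It checks whether the connecting role may create tables in the target schema, and which candidate tables already exist there as a rough analogue of the duplicate check:

```python
import psycopg2

# Hypothetical connection details; substitute your own target database.
conn = psycopg2.connect("dbname=analytics user=replicator host=pg.example.com")

with conn, conn.cursor() as cur:
    # Can the connecting role create tables in the target schema?
    cur.execute(
        "SELECT has_schema_privilege(current_user, %s, 'CREATE')",
        ("public",),
    )
    can_create = cur.fetchone()[0]

    # Which candidate tables are already present in the target schema?
    cur.execute(
        """
        SELECT table_name
        FROM information_schema.tables
        WHERE table_schema = %s AND table_name = ANY(%s)
        """,
        ("public", ["customers", "orders", "order_items"]),
    )
    existing = [row[0] for row in cur.fetchall()]

print(f"CREATE privilege: {can_create}, already present: {existing}")
```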


Active Replication Groups Created in a Few Clicks

Head back to the Main Dashboard to the Replications window, where you see your new replication group being processed. In less than a minute, the replication group becomes active, with 8 distinct tables being replicated after just a few clicks (Fig. 4). The replication automatically begins with initial data capture, followed by Change Data Capture from the synchronization offset. This helps achieve exactly-once data replication for any source type that stores data at rest. Trying to manually sync the initial capture with subsequent CDC changes is a tedious task that is prone to failures, given the high frequency of changes in the data set. Equalum’s automatic and reliable system ensures proper synchronization.
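Independent of Equalum's internals, the general pattern behind this hand-off is to record the change-log offset before the bulk snapshot starts, then replay changes from that offset and apply them idempotently. A highly simplified sketch, in which every function name is a hypothetical placeholder:

```python
def replicate_table(source, target, table):
    """Sketch of the snapshot-then-CDC hand-off for one table (hypothetical API)."""
    # 1. Remember where the change log is *before* the snapshot starts,
    #    so nothing that happens during the copy can be missed.
    start_offset = source.current_log_offset()

    # 2. Initial data capture: bulk-copy the table as it exists right now.
    for batch in source.read_snapshot(table):
        target.bulk_insert(table, batch)

    # 3. Change Data Capture from the saved offset. Events already reflected
    #    in the snapshot are applied idempotently (upsert/delete by key),
    #    which is what makes the hand-off effectively exactly-once.
    for event in source.read_changes(table, since=start_offset):
        target.apply_idempotent(table, event)
```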

Fig 4.



Easily Add Data Transformations to Replications in Process

Clicking on the various flows for replication groups from the dashboard lets you see which actions are being taken inside the flow. Using Equalum's drag-and-drop features, you can add a transformation and then specify your Transformation Expression based on your desired outcome with a few clicks (Fig. 5). Myriad options are available to enhance your replication, including data type mapping and other specialized features.
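Equalum's transformation expressions are defined in the UI itself, but to make the idea concrete, here is a hedged Python sketch of the kind of row-level transformation and type mapping a flow might apply; the column names and the transform_row helper are invented for illustration:

```python
from datetime import datetime
from decimal import Decimal

def transform_row(row: dict) -> dict:
    """Illustrative row-level transformation: normalize, derive, and map types."""
    out = dict(row)
    # Normalize a free-text field.
    out["customer_name"] = row["customer_name"].strip().upper()
    # Derive a new column from existing ones.
    out["order_total"] = Decimal(row["unit_price"]) * row["quantity"]
    # Map a source-specific representation to the target's expected type.
    out["order_date"] = datetime.strptime(row["order_date"], "%Y-%m-%d")
    return out

print(transform_row({
    "customer_name": "  acme corp ",
    "unit_price": "19.99",
    "quantity": 3,
    "order_date": "2020-07-17",
}))
```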



Interested in learning more?

Download our white paper in collaboration with Eckerson, “The Why & How of Streaming First Data Architectures.”

DOWNLOAD



