Introduction
Today’s data-rich environments require frequent movement of data from one place to another. It is not uncommon for software engineers, data architects, and other consumers of corporate data to require testing environments that closely resemble the dataset in production. This allows accurate and high-fidelity testing that enables better code and more realistic QA while maintaining the separation of production and testing environments and adhering to security and operational requirements.
The request to mirror environments sounds simple, but the logistics required to make it happen can be complex, tedious, and time-consuming. For example:
How easily can data be backed up on a daily basis?
How easily can the data be cleansed of PII or other sensitive data?
How easily can the data be moved or staged, including archive logs for point-in-time recovery
Is the process automated? Or does it require significant labor?
Do you have to write and maintain your own code to enable this process? Or can you use a simple, out-of-the-box solution?
What happens if an ad hoc request for an unscheduled refresh exists?
Solving these challenges is why we at Tessell have built native features that enable data refreshing across environments. Furthermore, we ensured that the solution is comprehensive, easy to use, and can also be programmatically automated using the Tessell API. Let's dive into how the solution works and how to use it.
Creating the Initial Clone
Data refreshing in Tessell works on a Source → clone model. The source is a live database (typically production) that is snapshotted on a regular basis. The clone is a separate virtual machine whose data is replaced on a scheduled basis from the source. Snapshots from the source are generated daily or on demand. Additionally, archive and transaction logs are stored for a set period of time to allow refreshing to a specific point in time.
Before creating a clone, you will need to determine your refresh strategy. Do you need to refresh daily, weekly, or monthly? This will dictate the SLA you need to define on the source to ensure the snapshot retention is in line with the refresh timeline.
View the SLA assigned to the Production Source
Browse to the overview page for the service instance
On the top right, click on Availability Machine
Once in the Availability Machine, click on Update SLA on the top right
If the SLA retention is not sufficient, you can select a different SLA or create a new SLA that matches your requirements in the DB Governance section of the Tessell interface.
Create the Clone
We can create the initial clone after you have validated that the SLA will retain the historical snapshots.
In the Availability Machine for the source. Select the Clone tab
Click on Create Clone
Select the snapshot to use for the initial data seed
Alternatively, select PITR region if you want to seed the clone with data to an exact point in time
Click on Clone. This will launch the provisioning wizard. The wizard will operate the same as a typical provisioning task in Tessell. However, you will see a section to add a Refresh Schedule. Fill out the refresh schedule with the time frame required. Automated refreshes are set up to run weekly or daily.
The pre and post-script sections allow you to run a script before the database is refreshed. For example, to extract data or settings from the target database before it is over-written. The post-script section allows you to perform actions on the database after the data has been refreshed. For example, re-import data or saved settings in the pre-script step.
You can load scripts in the Tessell Script Library within the interface if required. Scripts can be shell scripts or SQL scripts.
Here is an example of a script that manipulates users after the refresh is complete#!/bin/bash
# Variables
NEW_USER="NEWUSER"
NEW_USER_PASSWORD="myPassword123"
OUTPUT_FILE="output.log"
# SQL script to add a new user
# This assumes your SOURCE DB already has a user named
# CHANGEME and another named DELETEME
SQL_SCRIPT=$(cat <<EOF
CREATE USER $NEW_USER IDENTIFIED BY $NEW_USER_PASSWORD;
GRANT CONNECT, RESOURCE TO $NEW_USER;
DROP USER DELETEME;
ALTER USER CHANGEME IDENTIFIED BY Goodpassword123;
EXIT;
EOF
)
# Create a temporary script to be run as the oracle user
TEMP_SCRIPT="/tmp/create_user.sql"
# Write the SQL script to the temporary file
echo "$SQL_SCRIPT" > $TEMP_SCRIPT
# Execute the SQL script using sqlplus as sysdba and redirect output to a file
sudo su - oracle -c "sqlplus -s / as sysdba < $TEMP_SCRIPT > $OUTPUT_FILE 2>&1"
# Check if the user was added successfully by searching for "ORA-" in the output file
if grep -q "ORA-" $OUTPUT_FILE; then
echo "Failed to manage users. See $OUTPUT_FILE for details."
else
echo "Users managed successfully. See $OUTPUT_FILE for details."
fi
# Clean up the temporary script file
rm -f $TEMP_SCRIPTSubmit the request to provision the clone
Database Rename and Multi-Instance Support
Tessell supports renaming the Database during the cloning operation. This will allow you to logically manage the database names to avoid confusion, for example, Dev1, Dev2, etc. You can also rename the “master” credential to ensure that each database has a unique login (if required).
If you plan to place multiple clones derived from the same source database on a shared instance, you will need to ensure that each database instance has a unique name. Otherwise, Tessell will throw an error indicating that two databases with the same name cannot reside on the same virtual host.
Checking Refresh Status
After submitting the provisioning request, Tessell will build the initial clone. Once the clone is complete, you will see a few new objects that track the status and history of refreshes
Checking the last run
Run history will appear in two places
Last Refresh History in the Clone’s Service Instance Overview page
Entire History in the Events section of the Service Instance
Viewing and Editing Refresh Schedule
The Schedules section of the Clone’s Service Instance screen contains any existing schedule. You can also modify or delete the schedule from here
Manual Refresh
Frequently, refresh requests are ad hoc in nature. Tessell has provided functionality to allow a refresh to launch on demand through the Interface or by calling the API.
On-Demand Refresh - User Interface
The Clone’s Service Instance page displays a “Refresh Data” button at the top right corner. This button allows a user to trigger a refresh event manually.
With the Clone’s Service Instance page, click on Refresh Data
Select the Date to which you wish to refresh. Note that the calendar will only show the dates that have data available based on the Source Instance’s SLA
You can select Snapshots or PITR in the recovery mode
Snapshots will display available snapshots for the selected data
PITR will invoke a clock that allows you to select the exact time to recover
Optionally, you can select Pre Scripts and Post Scripts from the library.
Click Submit to initiate the refresh.
On-Demand Refresh - API
While in the Refresh Data screen, you can click the <> Code
button in the top right corner to see examples of refreshing data using the Tessell API.
The code will dynamically update based on your inputs in the form. As a best practice, fill out the form, then click <> Code
to see the resulting API call.