Skip to main content
Refreshing a Database

Automated Production to Development Refresh for Databases

Bakul Banthia avatar
Written by Bakul Banthia
Updated over 4 months ago

Introduction

Today’s data-rich environments require frequent movement of data from one place to another. It is not uncommon for software engineers, data architects, and other consumers of corporate data to require testing environments that closely resemble the dataset in production. This allows accurate and high-fidelity testing that enables better code and more realistic QA while maintaining the separation of production and testing environments and adhering to security and operational requirements.

The request to mirror environments sounds simple, but the logistics required to make it happen can be complex, tedious, and time-consuming. For example:

  • How easily can data be backed up on a daily basis?

  • How easily can the data be cleansed of PII or other sensitive data?

  • How easily can the data be moved or staged, including archive logs for point-in-time recovery

  • Is the process automated? Or does it require significant labor?

  • Do you have to write and maintain your own code to enable this process? Or can you use a simple, out-of-the-box solution?

  • What happens if an ad hoc request for an unscheduled refresh exists?

Solving these challenges is why we at Tessell have built native features that enable data refreshing across environments. Furthermore, we ensured that the solution is comprehensive, easy to use, and can also be programmatically automated using the Tessell API. Let's dive into how the solution works and how to use it.

Creating the Initial Clone

Data refreshing in Tessell works on a Source → clone model. The source is a live database (typically production) that is snapshotted on a regular basis. The clone is a separate virtual machine whose data is replaced on a scheduled basis from the source. Snapshots from the source are generated daily or on demand. Additionally, archive and transaction logs are stored for a set period of time to allow refreshing to a specific point in time.

Before creating a clone, you will need to determine your refresh strategy. Do you need to refresh daily, weekly, or monthly? This will dictate the SLA you need to define on the source to ensure the snapshot retention is in line with the refresh timeline.

View the SLA assigned to the Production Source

  1. Browse to the overview page for the service instance

  2. On the top right, click on Availability Machine

  3. Once in the Availability Machine, click on Update SLA on the top right

If the SLA retention is not sufficient, you can select a different SLA or create a new SLA that matches your requirements in the DB Governance section of the Tessell interface.

Create the Clone

We can create the initial clone after you have validated that the SLA will retain the historical snapshots.

  1. In the Availability Machine for the source. Select the Clone tab

  2. Click on Create Clone

  3. Select the snapshot to use for the initial data seed

  4. Alternatively, select PITR region if you want to seed the clone with data to an exact point in time

  5. Click on Clone. This will launch the provisioning wizard. The wizard will operate the same as a typical provisioning task in Tessell. However, you will see a section to add a Refresh Schedule. Fill out the refresh schedule with the time frame required. Automated refreshes are set up to run weekly or daily.

  6. The pre and post-script sections allow you to run a script before the database is refreshed. For example, to extract data or settings from the target database before it is over-written. The post-script section allows you to perform actions on the database after the data has been refreshed. For example, re-import data or saved settings in the pre-script step.
    You can load scripts in the Tessell Script Library within the interface if required. Scripts can be shell scripts or SQL scripts.

    Here is an example of a script that manipulates users after the refresh is complete

    #!/bin/bash

    # Variables
    NEW_USER="NEWUSER"
    NEW_USER_PASSWORD="myPassword123"
    OUTPUT_FILE="output.log"

    # SQL script to add a new user
    # This assumes your SOURCE DB already has a user named
    # CHANGEME and another named DELETEME
    SQL_SCRIPT=$(cat <<EOF
    CREATE USER $NEW_USER IDENTIFIED BY $NEW_USER_PASSWORD;
    GRANT CONNECT, RESOURCE TO $NEW_USER;
    DROP USER DELETEME;
    ALTER USER CHANGEME IDENTIFIED BY Goodpassword123;
    EXIT;
    EOF
    )

    # Create a temporary script to be run as the oracle user
    TEMP_SCRIPT="/tmp/create_user.sql"

    # Write the SQL script to the temporary file
    echo "$SQL_SCRIPT" > $TEMP_SCRIPT

    # Execute the SQL script using sqlplus as sysdba and redirect output to a file
    sudo su - oracle -c "sqlplus -s / as sysdba < $TEMP_SCRIPT > $OUTPUT_FILE 2>&1"

    # Check if the user was added successfully by searching for "ORA-" in the output file
    if grep -q "ORA-" $OUTPUT_FILE; then
    echo "Failed to manage users. See $OUTPUT_FILE for details."
    else
    echo "Users managed successfully. See $OUTPUT_FILE for details."
    fi

    # Clean up the temporary script file
    rm -f $TEMP_SCRIPT

  7. Submit the request to provision the clone

Database Rename and Multi-Instance Support

Tessell supports renaming the Database during the cloning operation. This will allow you to logically manage the database names to avoid confusion, for example, Dev1, Dev2, etc. You can also rename the “master” credential to ensure that each database has a unique login (if required).

If you plan to place multiple clones derived from the same source database on a shared instance, you will need to ensure that each database instance has a unique name. Otherwise, Tessell will throw an error indicating that two databases with the same name cannot reside on the same virtual host.

Checking Refresh Status

After submitting the provisioning request, Tessell will build the initial clone. Once the clone is complete, you will see a few new objects that track the status and history of refreshes

Checking the last run

Run history will appear in two places

  1. Last Refresh History in the Clone’s Service Instance Overview page

  2. Entire History in the Events section of the Service Instance

Viewing and Editing Refresh Schedule

The Schedules section of the Clone’s Service Instance screen contains any existing schedule. You can also modify or delete the schedule from here

Manual Refresh

Frequently, refresh requests are ad hoc in nature. Tessell has provided functionality to allow a refresh to launch on demand through the Interface or by calling the API.

On-Demand Refresh - User Interface

The Clone’s Service Instance page displays a “Refresh Data” button at the top right corner. This button allows a user to trigger a refresh event manually.

  1. With the Clone’s Service Instance page, click on Refresh Data

  2. Select the Date to which you wish to refresh. Note that the calendar will only show the dates that have data available based on the Source Instance’s SLA

  3. You can select Snapshots or PITR in the recovery mode

    1. Snapshots will display available snapshots for the selected data

    2. PITR will invoke a clock that allows you to select the exact time to recover

  4. Optionally, you can select Pre Scripts and Post Scripts from the library.

  5. Click Submit to initiate the refresh.

On-Demand Refresh - API

While in the Refresh Data screen, you can click the <> Code button in the top right corner to see examples of refreshing data using the Tessell API.

The code will dynamically update based on your inputs in the form. As a best practice, fill out the form, then click <> Code to see the resulting API call.

Did this answer your question?