How to refresh QA automatically by leveraging DevOps
Doing database devops can be challenging, but it can also be very rewarding.
DevOps is the union of people, process and products to enable continuous delivery of value to your end users.
In this post I am going to cover how to automatically refresh a lower environment commonly used for testing as part of your release (CD) pipeline.
Well, why? – you may be asking.
- In some cases, developers and testers need to test their application code-changes against a fresh copy of production-like data. This helps them do validations prior to publishing their changes to production where their apps are bombarded by end-user live workloads.
- Also, the lower environment may be used for testing, and we can’t overwrite the test data constantly. It needs to be a hot-standby refresh, made available when necessary.
Let’s consider the following diagram.
The diagram indicates that there’s 2 database servers on a private network, ie. SQL Server on Virtual Machines, one for PROD and the other for QA.
The database servers seem to be running T-SQL backup jobs (maybe, full/differential/log) onto a network file share for storage.
Traditionally, a Database Administrator (DBA) may run a series of scripts (Bash/CMD/PowerShell/T-SQL/etc) they’ve built to execute the refresh whenever the developers send a request.
What we want is to empower the developers to invoke that refresh as an additional task when releasing their app code-changes to QA via Azure DevOps and for the DBA to approve it. We also want the DevOps pipeline to leverage the DBA’s scripts that are in a code repository for source control. If the DBA doesn’t approve the refresh task, then the DevOps pipeline will not run the scripts.
- In order to accomplish this, the DBA has created a code repo in their favorite location to source control their work – GitHub.
- The DBA then works with the DevOps admin to create the integration between the project on Azure DevOps and the GitHub repo that contains the source scripts.
- The DevOps admin or DBA will first create a Continuous Integration (CI) pipeline that automatically triggers when there’s a commit to the main branch of the code repository and a path filter to a subfolder of the application userdb where the particular scripts are stored. Second, the CI pipeline needs to use an Azure DevOps self-hosted agent because the nodes are behind a private network. The DevOps admin can deploy the Azure DevOps pool agents on VMs withing that private network which authenticate back to the DevOps project using Personal Access Token (PAT). Third, the CI pipeline will simply have a task that does a Copy all contents (ie. **) of the source folder of the userdb in the repo (ie. hiramdb) to a target folder (C:\RefreshQA) on the agent pool VM.
- Next, the DevOps admin or DBA will create or edit a Continuous Deployment (CD) pipeline for releases of the app code-changes. In this pipeline, we will add a task called RefreshQA with multiple PowerShell tasks for the Agent Job so that we can debug and monitor the tasks as needed. This Agent Job will also run on the DevOps Self-Hosted Agent pool because it will target the QA and PROD nodes in a private network.
- In general the task will be:
- SaveLastQABackup: because we want to keep the backup before the refersh occurs as a snapshot, if we need to rollback the refresh for whatever reason. For practical storage space reasons, this task moves the latest QA full backup file to another folder to avoid loosing it by retention policy.
- ScriptOutPermission: because we want to restore role or object level access of QA back after the refresh, and access settings coming from PROD could be different. This task generates a sql script file with the permissions assigned to the userdb in QA, so that we can use it to reapply the QA access in task 6.
- DiffBak: because we typically need the latest PROD object schema definitions & data refreshed to QA. This task is first going to temporarily disable database Log backup in PROD while the refresh runs. Second, start the Differential backup. Thirds, loop till a successful diff backup ran within 15 minutes ago. Lastly, attempt to kill sessions connected to the QA userdb, so that we may begin to overwrite it without errors related to it being in-use.
- Refresh: because we need to overwrite the userdb in QA with the latest PROD copy. This task will first, again, attempt to kill sessions connected to the QA userdb incase anything is still lingering from step 3 or connected in-between step 3 & 4, to avoid errors related to it being in-use. Second, it will drop the userdb if it exists on the QA server instance to avoid unable to replace errors due to file names due to the OS resources. THE ENTREE: Third, after 30 seconds, it will get the PROD userdb backup history and the last full backup chain (full/diffs/log) and restore them on QA with data/log file mappings and replace option.
- EnableJobs: because we need to re-enable the jobs in PROD we disabled in step 3 that we disabled, such as the T-Log backup job, while we started the Diff backup job to avoid error backup already running.
- ResetQA: because we need to reset QA permissions, database options back to QA values, truncate logs, obfuscate PII prod-data, reset data in tables that were crucial to QA config app settings, etc. First, this task is going to loop every x-seconds until the refreshed QA userdb comes back online. Second, its going to echo out to the console the restore history for developers to see it was refreshed as of a particular point in time. Third, it’s going to set the required database options such as Simpe recovery, Trust Worthy, the database owner, shrink the log file, etc. Forth, its going to run the required reset data logic for the userdb. Fith, is going to reset the userdb security permissions that were scripted out in step 2. Sixth, its going to sync logins incase the SIDs of QA are different than the SIDs of PROD.
- ShowOuput: because we like logs. This task simply going to echo to the console the contents of the OutputRefresh.txt which was a log created by step 4.
- In general the task will be:
Finally, the result look like this for a release.
If you want to import the CI/CD pipelines or try this out, here’s the reference to all things mentioned above. https://github.com/hfleitas/refreshqa
Hopefully, that provides value to your DevOps projects. Let us know if you have any feedback in the comments below.
P.S. if you’re wondering where is part 1 of 2 for this topic, is coming soon…
One thought on “SQL Database DevOps (2 of 2)”
[…] Hiram Fleitas rebuilds the QA environment: […]