Following a recent post from our VP of R&D, Gil Hoffer, discussing the importance of upgrade testing, we decided to share with you how we use our system to perform upgrade testing for our own platform.
At Ravello, upgrade tests are performed automatically on a daily basis. Our upgrade testing comprises two main processes, each designed to verify different aspects of the upgrade.
Database migration validation
The main purpose of this process is to ensure that we won’t encounter problems when we perform the upgrade on the real environment. To do this, we take a replica of our entire platform and set its database up with a replica of our production environment. There are dangers associated with this methodology because the resulting platform and replica production environment hold lots of information that points to the same places as the real production environment does, for example, host IP addresses. For this reason we make sure to strip the replica platform of all possibilities of actually interacting with the real world.
This replica is used to simulate a real production rollout. After we ensure that it is safe, we use Jenkins to build the required revision (trunk or tag), and then verify the success of the migration including whether or not the amount of downtime incurred is acceptable.
This process is not fully automated. Sometimes, after rolling out to the real environment, we need to implement minor fixes on the trunk. As a result, upgrade testing may detect inconsistencies in the database. To resolve them, we might need to run some manual SQL commands against the database replica to make it compatible with the next upgrade. This part is semi-automatic: we edit the command in the code so that it runs against all the data every time, until we next need to edit it. Because the database inconsistencies have already been identified, we save time by using the same script during the production roll-out.
Upgrade scenario testing
This stage is less about checking that the upgrade will work on the real database and environment, and more about checking that scenarios already set up on the current environment stay stable through the upgrade: the applications and VMs, and their capabilities (not just Ravello's, but your apps' as well). You want to know that whatever you have before the upgrade will continue to meet the same expectations after it.
How do we do this? We build a clean environment that holds no data, and run lots of pre- and post-upgrade scenarios. This stage is carried out in three phases:
- Run pre-upgrade scenarios
- Perform upgrade
- Run post-upgrade scenarios
Multiple scenarios are tested in the same manner, before and after the upgrade. For example, using the Ravello UI we can take an application or VM, drag it to the canvas, and publish it. We then upgrade the platform, and then check that the same VM on that application is still alive and can be accessed through SSH, for example.
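The three phases above can be sketched as a small shell harness. This is only an illustration, not our actual test code: the function names `run_phase`, `upgrade_test` and `vm_reachable` are hypothetical, and each scenario is assumed to be a command or shell function that exits with 0 on success.

```shell
# Hypothetical sketch of the pre-upgrade / upgrade / post-upgrade flow.
# Every name here is an assumption made for illustration.

# Run one phase: print what is being run and fail loudly if the command fails.
run_phase() {
    local phase="$1" cmd="$2"
    echo "[$phase] running: $cmd"
    if ! "$cmd"; then
        echo "[$phase] FAILED: $cmd" >&2
        return 1
    fi
}

# Run the pre-upgrade scenario, the upgrade, then the post-upgrade scenario.
# Stops at the first phase that fails.
upgrade_test() {
    local pre="$1" upgrade="$2" post="$3"
    run_phase "pre-upgrade" "$pre" || return 1
    run_phase "upgrade" "$upgrade" || return 1
    run_phase "post-upgrade" "$post" || return 1
}

# Example post-upgrade check: is the VM still reachable over SSH?
# BatchMode avoids password prompts; ConnectTimeout bounds the wait.
vm_reachable() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

A post-upgrade scenario could then be a small function that calls `vm_reachable` with the address of the VM that was published before the upgrade.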
Upgrade testing in the future
As the complexity of systems and environments increases, we expect to see horizontal growth and a wider cross-section of test scenarios implemented. We are currently working on ways to upgrade each component of our product on its own, to allow greater flexibility and accelerate the roll-out process.
Our upgrade process – some technical data
- We create a database backup of our production environment by running the following commands from the production database’s VM:
- pg_dump management > /tmp/dumpfile.dmp
- bzip2 /tmp/dumpfile.dmp
- s3cmd put /tmp/dumpfile.dmp.bz2 s3://PublicBucket/
- We create an application (using ravello app) from an existing blueprint, which saves us from re-creating the environment every time. The blueprint contains the following:
- install s3cmd
- sudo apt-get install s3cmd
- s3cmd --configure
- insert Access key and Secret Access key
- We connect to the application database and then:
- Import db database and billing database
- sudo s3cmd get s3://PublicBucket/management_backup.dmp.bz2 /var/lib/postgresql/lastdb.bz2 >tmp.exe
- sudo bzip2 -d /var/lib/postgresql/lastdb.bz2
- sudo s3cmd get s3://PublicBucket/billing_backup.dmp.bz2 /var/lib/postgresql/lastbillingdb.bz2
- sudo bzip2 -d /var/lib/postgresql/lastbillingdb.bz2
- Create new database templates
- createdb -T template1 management
- createdb -T template1 billing
- Inject data into the new database
- psql management < /var/lib/postgresql/lastdb > tmp_db.log 2>tmp_error_db.log
- psql billing < /var/lib/postgresql/lastbillingdb > tmp_billing.log 2>tmp_error_billing.log
Note: If you don’t want to risk your production environment, you need to take precautions such as disabling entities that interact with the same features as the real environment.
- We run a job from Jenkins to deploy trunk/tag code on to this application and run migration.
Note: If this phase fails, we take the logs from all the VMs on the application and add them to the report.
- We assume our replica environment is up and ready with the trunk or tag version and with the same data as the production environment. We check two things:
- Simple interaction with the environment, for example, create a new application on the Ravello environment.
- Migration assertion – check the log for errors and verify that the migration completed within the given time frame.
- If all of the tests pass, we erase the replica environment.
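The migration assertion step described above could look roughly like the following. This is a minimal sketch under stated assumptions: the helper name `assert_migration` is hypothetical, matching the word "error" case-insensitively is an assumed convention for the log format, and the caller is assumed to have measured the elapsed time itself.

```shell
# Hypothetical helper: succeed only if the migration log has no error lines
# and the migration finished within the allowed number of seconds.
# Usage: assert_migration LOG_FILE ELAPSED_SECONDS MAX_SECONDS
assert_migration() {
    local log_file="$1" elapsed="$2" max="$3"

    # Assumed convention: any line containing "error" (any case) is a failure.
    if grep -qi 'error' "$log_file"; then
        echo "migration log contains errors: $log_file" >&2
        return 1
    fi

    # Fail if the migration exceeded the allowed time frame.
    if [ "$elapsed" -gt "$max" ]; then
        echo "migration took ${elapsed}s, limit is ${max}s" >&2
        return 1
    fi
    return 0
}
```

One way to get the elapsed time is to record `date +%s` before and after the migration job and pass the difference to the helper.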
Ravello works well with Jenkins to enable Continuous Integration. Each time a developer checks in code, new environments can be spun up automatically, allowing a battery of automated tests to execute quickly. Typically, the environments are shut down and their resources released when the tests succeed, but they can be saved for debugging if some tests fail.