In this post, we are going to dig in and illustrate why it’s nearly impossible (in most cases) to take an existing application running in a virtualized data center and move it to the cloud as-is (without any changes whatsoever). Every step of this migration requires you to make decisions that further distance the cloud deployment from the original data center deployment, which makes it highly likely that the project will fail.
You are the development manager for a distributed application running in your data center. Your application consists of a web server, application server, database server and message queue, spread across DMZs, different subnets, firewalls and so on. The business team is always asking for additional features and you are pressured to substantially reduce time-to-market, but IT never allocates enough resources for development and testing. So you end up sharing capacity, never testing as much as you want to, and relying on mocks, stubs and other corner-cutting approaches.
An obvious solution seems to be to use the cloud. In theory, with the cloud you can get infinite capacity on demand and run things in parallel, shortening time-to-market. So, ideally, you would want to:
- Recreate your exact data center application environment in the cloud
- Create multiple replicas of this environment accessible to each developer/tester to minimize conflicts
- Improve Continuous Integration by running a full set of tests on each code check-in, allowing you to find problems early
But if you want to use the cloud for development and testing, you have to:
- Provision and configure the VMs
- Configure the application stack
- Automate the duplication process
- Integrate it with your build system and internal processes
And each one of these steps requires multiple sub-steps that create serious challenges. Let’s take a closer look:
Step 1: Provision and configure the VMs
- Educate yourself about Amazon: instance types, EBS/S3 backed images, private/public IP addresses and on it goes.
- Provision the instances.
- Configure an OS with the same packages, users, daemons, parameters, etc.
- Configure networking
- Install and configure middleware (web server, app server, database, queue etc.)
- SAVE AS AMIs (will make your life easier in Step 3)
As you work through these sub-steps, you are going to encounter some unique challenges. Let’s look at what these challenges are, both if you’re not using Amazon’s Virtual Private Cloud (VPC) and if you are.
The main challenge in a non-VPC environment is that it is practically impossible to recreate your data center environment in the cloud. Your in-house deployment was built around a specific subnet configuration, but here you cannot keep the same IP addresses or the same DNS names, and you can’t use multicast or other protocols your application may depend on.
The main challenge in a VPC is that it is really hard to use. In fact, it’s completely different from the regular Amazon service. Further, you will not be able to create replicas of your environment as many times as you want, because of the limited number of externally accessible public IP addresses and the limited number of VPCs, accounts and regions. To get around the public IP limitation, you will need to set up a VPN, and then you’re back to dealing with your company’s IT organization. Good luck with that.
You might think you can solve some of these issues with a compromise by putting everything on the same subnet. But now, it’s a different configuration from what you have in-house, which might require you to configure the application differently (in-house you would use static IPs or DNS, but in AWS they keep changing) or in the worst case, rewrite portions of your application.
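To make the single-subnet compromise concrete, here is a minimal sketch that checks which addresses from an in-house configuration would still be valid inside one cloud subnet. The role names, IP addresses and CIDR range are hypothetical and only serve as an illustration; the check itself uses Python’s standard `ipaddress` module.

```python
import ipaddress

def fits_in_subnet(ips_by_role, cloud_cidr):
    """Map each role to True if its hard-coded address lies inside cloud_cidr."""
    network = ipaddress.ip_network(cloud_cidr)
    return {
        role: ipaddress.ip_address(ip) in network
        for role, ip in ips_by_role.items()
    }

# Hypothetical static addresses taken from an in-house configuration,
# where each tier lives on its own subnet.
datacenter_ips = {
    "web": "10.1.10.5",
    "app": "10.1.20.5",
    "db": "10.1.30.5",
}

# Hypothetical single subnet chosen as the cloud "compromise".
result = fits_in_subnet(datacenter_ips, "10.1.20.0/24")
for role, ok in result.items():
    print(f"{role}: {'keeps its address' if ok else 'must be reconfigured'}")
```

Every role reported as needing reconfiguration is a place where the cloud copy of the application starts to differ from the one in your data center.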
Step 2: Configure the application stack
In this step, you have to wire the different components of your application together. Some of the sub-steps here may include:
- For the web server you need to configure a reverse proxy.
- For the app server you need to configure the IP/DNS name of the database, schema name/URL – and the same for messaging. Depending on the app there will be additional configurations.
- For the database you need to create relevant users, define the database schema, populate with data and specify the IP addresses of the servers that can connect to the database.
- For the message queue you’ll need to configure queue depth and other parameters
Often developers do not have the expertise or authority to handle all these issues, and they need DBAs, sys admins, network engineers and other IT personnel. Developers who do not wish to rely on all of these other people have a choice: either they give up, or they end up misconfiguring something, further distancing their dev/test environment from the production environment.
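The wiring described in this step can be sketched as a small script that derives the app server settings and the database allow-list from one description of the environment. This is only an illustration: the setting names, addresses, port and queue depth are all made up, and a real stack will have its own configuration format.

```python
def render_app_server_config(db_host, db_port, schema, queue_host, queue_depth):
    """Render the app server's connection settings as key=value lines.

    The key names are hypothetical; substitute whatever your app server reads.
    """
    settings = {
        "database.host": db_host,
        "database.port": db_port,
        "database.schema": schema,
        "queue.host": queue_host,
        "queue.depth": queue_depth,
    }
    return "".join(f"{key}={value}\n" for key, value in settings.items())

def database_allow_list(app_server_addresses):
    """Return the de-duplicated, sorted addresses allowed to reach the database."""
    return sorted(set(app_server_addresses))

# Hypothetical values for one environment.
config_text = render_app_server_config("10.1.30.5", 5432, "orders", "10.1.40.5", 1000)
allowed = database_allow_list(["10.1.20.6", "10.1.20.5", "10.1.20.5"])

print(config_text)
print(allowed)
```

Keeping every address in one place like this does not remove the need for the DBA or network engineer, but it does make it obvious which values change when the environment moves.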
Step 3: Automate the duplication process
To give each of your developers and testers their own instance of the distributed application, you need to automate the entire process. Let’s say that with the help of DBAs, sys admins, network engineers and IT consultants, you manage everything in Step 2, and (a bit of a stretch) you have done an amazing job of documenting the entire configuration. Now, how do you automate?
You can try and script all of this on your own, but more likely you would consider some automation platform that enables provisioning the VMs as well as automating the configuration (steps 1 and 2).
With such a platform you’d have to:
- Model the application
- Provision VMs
- Change VM configuration (for the new environment)
To do this, you might consider:
- A Cloud Management Platform, such as RightScale
- A Configuration Management tool, such as Chef or Puppet
- Tools offered by the cloud provider, such as Amazon’s CloudFormation
All of these approaches come with a steep learning curve. In the case of commercial tools, you will have to learn something proprietary that you are unlikely to use again down the road. And in any case, you still have to deal with individual VMs rather than the entire application, which means provisioning and configuring each piece separately and then tying them all together.
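As a rough illustration of what “model the application” and “change VM configuration for the new environment” involve, here is a minimal sketch that describes the whole application once and then derives one copy per developer, each with its own subnet. It is not how any of the tools above work internally; the class names, image names, owners and address pool are hypothetical.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import List, Tuple

@dataclass(frozen=True)
class VM:
    role: str
    image: str  # hypothetical name of a saved machine image

@dataclass(frozen=True)
class Application:
    name: str
    subnet: str
    vms: Tuple[VM, ...]

def replicate(template: Application, owners: List[str], address_pool: str) -> List[Application]:
    """Create one copy of the application per owner, each on its own /24 subnet."""
    subnets = list(ipaddress.ip_network(address_pool).subnets(new_prefix=24))
    if len(owners) > len(subnets):
        raise ValueError("address pool is too small for the requested replicas")
    return [
        replace(template, name=f"{template.name}-{owner}", subnet=str(subnet))
        for owner, subnet in zip(owners, subnets)
    ]

# Hypothetical three-tier application and two team members.
template = Application(
    name="shop",
    subnet="10.0.0.0/24",
    vms=(VM("web", "web-image"), VM("app", "app-image"), VM("db", "db-image")),
)
replicas = replicate(template, ["alice", "bob"], "10.8.0.0/16")
for copy in replicas:
    print(copy.name, copy.subnet, [vm.role for vm in copy.vms])
```

Note that every replica ends up on a different subnet from the template, so anything in the VMs that depends on the original addresses still has to be reconfigured per copy.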
Step 4: Integrate it with your build system and internal processes
Giving each developer and tester a full application instance is not the end of the story. You want to be able to integrate this into your development process (such as your CI or build systems). Every time someone checks in code, you should be able to update the application and run some tests.
Assuming you are using a build system (such as Maven, Rake, Gradle, NAnt, MSBuild or CMake), you will need to integrate it with the automation solution you’ve created in Step 3 so that a build can provision new VMs (or reuse old ones), and you will also need to enable the deployment of your application on the different VMs that were provisioned.
Deployment differs from one application configuration to another: it can be as easy as copying a single file to a directory (as is the case in many enterprise Java applications), but it can also require manually running scripts and other processes on the VMs, or even custom integration with your runtime environment.
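For the easy case, where deployment really is a single file copy, a minimal sketch might look like the following. The file name and directory layout are hypothetical, and the example runs against a temporary directory rather than a real server.

```python
import shutil
import tempfile
from pathlib import Path

def deploy_artifact(artifact, deploy_dir):
    """Copy a built artifact into the server's deployment directory."""
    deploy_dir = Path(deploy_dir)
    deploy_dir.mkdir(parents=True, exist_ok=True)
    target = deploy_dir / Path(artifact).name
    shutil.copy2(artifact, target)
    return target

# Demonstration in a throwaway workspace; "app.war" stands in for a build output.
with tempfile.TemporaryDirectory() as workspace:
    artifact = Path(workspace) / "build" / "app.war"
    artifact.parent.mkdir(parents=True)
    artifact.write_bytes(b"placeholder build output")

    deployed = deploy_artifact(artifact, Path(workspace) / "server" / "deploy")
    deployed_name = deployed.name
    deployed_ok = deployed.read_bytes() == artifact.read_bytes()

print(deployed_name, deployed_ok)
```

In a multi-VM application the same copy has to happen on the right remote machine, which is where the scripts and custom integration mentioned above come in.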
Once you’ve done that, you can usually integrate your CI system (e.g. Jenkins, CruiseControl, TeamCity) and your IDE (e.g. Eclipse, Visual Studio) fairly easily, so that they use the build tool to deploy the full-blown application on the relevant cloud instances, launch automated test scenarios and analyze the results.
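The check-in flow described in this step can be sketched as follows: acquire an environment (reusing an idle one when possible, provisioning a new one otherwise), deploy the commit and run the tests. The provisioning, deployment and test functions here are stand-ins written for this example; in practice they would call the automation from Step 3 and your build tool.

```python
class EnvironmentPool:
    """Hands out application environments, reusing idle ones before provisioning."""

    def __init__(self, provision):
        self._provision = provision
        self._idle = []
        self.provisioned = 0

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.provisioned += 1
        return self._provision()

    def release(self, env):
        self._idle.append(env)

def on_check_in(commit_id, pool, deploy, run_tests):
    """Deploy one commit to an environment, run the tests, return the result."""
    env = pool.acquire()
    try:
        deploy(env, commit_id)
        return run_tests(env)
    finally:
        # Return the environment to the pool even if deploy or tests fail.
        pool.release(env)

# Stand-in implementations so the sketch can run on its own.
def fake_provision():
    return {"deployed": None}

def fake_deploy(env, commit_id):
    env["deployed"] = commit_id

def fake_tests(env):
    return env["deployed"] is not None

pool = EnvironmentPool(fake_provision)
results = [on_check_in(commit, pool, fake_deploy, fake_tests) for commit in ("a1", "b2")]
print(results, "environments provisioned:", pool.provisioned)
```

Because the two check-ins run one after the other, the second reuses the environment released by the first; running them in parallel would provision one environment per concurrent build.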
How Ravello Makes Dev & Testing in the Cloud Not Suck
As you can see, the problem is not just about learning a new cloud environment and doing things differently (which is not easy by itself). It is largely about needing detailed knowledge of every single moving part in your application.
With Ravello, you don’t need to know or do any of that. You simply take your virtual machines exactly as they are (same OS configuration, packages, application stack, etc.) and upload them to the Ravello service. There you define the networking that you want, and due to our advanced technology, you can very easily replicate your data center networking standards.
That’s it! You’re done. You can now create as many copies of this application as your team needs. Further, you can also snapshot and archive the full multi-VM application instance, and you can work iteratively and collaborate on live application instances.
In our next post, we will describe how we use Ravello to develop the Ravello management system. In the meantime, try it out for free (we pay for everything including your VM costs) and see for yourself.