Case Study / Deutsche Telekom
Director of DevOps, Deutsche Telekom HBS
Case Study Highlights
Deutsche Telekom’s cloud telephony application is a complex multi-tier application containing multiple VMs and various high-end physical network appliances, including appliances from F5 Networks, Brocade and Palo Alto Networks.
The architecture is extremely advanced and agile, built on VMware, Chef and Jenkins. However, it is not immune to more mundane IT problems, specifically a lack of capacity and the hassle of managing physical hardware.
Now with Ravello, developers can start up, on-demand, their own copy of the application environment. This means that developers do not have to share development environments anymore, leading to less waiting and higher quality software.
About Deutsche Telekom Hosted Business Solutions
Deutsche Telekom AG (DT) is a multinational telecommunications company headquartered in Bonn, Germany. It has over 230,000 employees and generates close to $60 billion in revenue. It operates businesses in various countries worldwide, including mobile telephony (under the T-Mobile brand), fixed telephony, broadband Internet and IT services.
The Hosted Business Solutions (HBS) division has its origins in a 2013 acquisition of the telecommunications startup ChooChee. Under its new name and as part of DT, HBS is tasked with the development and roll-out of a next-generation cloud-based telephony service targeted at small and medium businesses.
Requirements for the Cloud Telephony Service
HBS designed the platform for its cloud telephony service on two pillars: extreme agility and rock-solid quality of service.
"The cloud telephony market for SMEs is extremely hot," explains Paul Sebastien, CMO at DT-HBS. "A lot of innovation is happening, and many players are entering this market. We need a platform that allows us to rapidly create new services and enter new markets, depending on how the market develops, and we need to do this faster than our competitors. Therefore extreme agility is an absolute must."
Ram Akuka, Director of DevOps at DT-HBS, adds: "Cloud telephony puts unique requirements on the underlying infrastructure. For example, consistent low latency is essential to call quality. This makes it impossible to host our production in a traditional public cloud. So instead, we implemented a private cloud combining DevOps technologies such as configuration management and Continuous Integration and Continuous Deployment, with an enterprise architecture. It is a 'best of both worlds' platform."
Overview of the Cloud Telephony Platform
The HBS private cloud runs on VMware vSphere and Apache CloudStack. It spans three locations: two production locations hosted with Equinix on the US east and west coasts, and a dev/test location in the HBS office in Mountain View, CA.
The telephony application itself is a complex multi-tier application containing multiple VMs and various high-end physical network appliances, including appliances from F5 Networks, Brocade, Palo Alto Networks and Acme Packet (now owned by Oracle). The application consists of two independent parts, called "BSS" and "OSS". The BSS component is responsible for the end-user interaction, while the OSS component is responsible for handling the actual VoIP traffic. The VMs and hardware appliances are isolated in multiple network tiers to separate traffic and guarantee QoS.
At any given time, multiple instances of the application are running. In addition to the production instances, there are instances for development, unit testing, integration and staging.
Configuration Management with Chef
All configuration in each of the application instances is managed with Chef. This includes typical system configuration, such as user accounts and service configuration on virtual machines, but also the configuration of the routers, firewalls, load balancers, and even DNS.
Continuous Integration / Continuous Delivery with Jenkins
HBS deploys application and OS updates using a fully automated CI/CD pipeline. Jenkins, the well-known open source continuous integration server, drives this pipeline. It monitors the central Git repository, and as soon as it detects a new commit, it builds the software. The result of this build stage is a set of Debian packages that are copied to a central package repository.
After the build, Jenkins creates a data bag in Chef referencing the newly built versions, and provisions a set of VMs in the unit testing environment. It then bootstraps these VMs into Chef, which applies a base configuration determined by the run list and, on top of that, installs the package versions that were just built, as recorded in the data bag. Once the VMs are fully up and running, Jenkins instructs the F5 load balancer to redirect traffic to the new VMs. At this point, an external testing daemon kicks in and runs the unit test suite. The tests for OSS and BSS are run in two separate unit testing environments.
If the unit tests pass, Jenkins moves to the next stage in the CD pipeline: the integration environment. Here a process similar to the one in the unit testing environment runs, but the entire system (OSS + BSS) is tested together. After integration testing, there is a final staging environment, again with its own set of tests. If the tests pass here as well, the code is deployed to both production clusters.
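As an illustration of the data bag step described above, the following is a minimal sketch in Python of how a build job might assemble a data bag item that pins the freshly built package versions. Chef data bag items are JSON objects with a required "id" field; everything else here is an assumption made for illustration, including the function name, the "packages" key, the build identifier and the package names. It is not taken from the HBS pipeline.

```python
import json


def build_data_bag_item(build_id, package_versions):
    """Return a dict shaped like a Chef data bag item for one build.

    The "id" key is required by Chef; the "packages" key and its layout
    are hypothetical and chosen only for this sketch.
    """
    if not package_versions:
        raise ValueError("a build must produce at least one package")
    return {
        "id": build_id,
        # Sort by package name so the generated JSON is stable across runs.
        "packages": dict(sorted(package_versions.items())),
    }


# Hypothetical build number and package names.
item = build_data_bag_item(
    "build-1042",
    {"hbs-oss": "2.3.1-1042", "hbs-bss": "1.8.0-1042"},
)
print(json.dumps(item, indent=2))
```

A recipe applied to the test VMs could then read the item and install exactly those versions, which is what ties a test run to one specific build.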
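The gating logic of the pipeline described above can be sketched in a few lines of Python. This is only a model of the promotion rule (each stage must pass before the next one runs, and production deployment happens only after staging passes), not the actual Jenkins configuration. The stage names and the run_tests callback are hypothetical, and the single "unit" stage stands in for the separate OSS and BSS unit testing environments.

```python
from typing import Callable, Dict, List

# Stage order as described in the text; the names are illustrative.
STAGES = ["unit", "integration", "staging"]


def run_pipeline(run_tests: Callable[[str], bool]) -> Dict[str, object]:
    """Run each stage in order and stop at the first failure.

    run_tests is a hypothetical callback that returns True when the
    test suite for the given stage passes.
    """
    passed: List[str] = []
    for stage in STAGES:
        if not run_tests(stage):
            # A failing stage blocks promotion; nothing reaches production.
            return {"deployed": False, "passed": passed, "failed_at": stage}
        passed.append(stage)
    # All stages passed, so the build would go to both production clusters.
    return {"deployed": True, "passed": passed, "failed_at": None}


result = run_pipeline(lambda stage: stage != "integration")
print(result)
```

In this example the integration stage fails, so the staging tests never run and the build is not deployed.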
Going to the Next Level of Agility with Ravello Systems
While HBS has an extremely advanced and agile architecture, it is not immune to more mundane IT problems, specifically a lack of capacity and the hassle of managing physical hardware. For some time, the dev/test private cloud had been out of capacity, and HBS was hesitant to invest more time and money in it. Paul Sebastien: "We could invest in more on-premise infrastructure. However, this would take away precious resources (staff and money) from developing our platform. Also, having a large physical footprint makes us less able to respond to future changes."
After a quick evaluation, HBS decided to implement Ravello to augment its internal dev/test capacity. The first implementation allows developers to start up, on demand, their own (scaled-down) copy of the application environment. This means that developers no longer have to share development environments, leading to less waiting and higher-quality software.
"Getting a copy of our environment into Ravello was easy," according to Ram Akuka. "We simply uploaded our base image, and virtual appliance form factors of our physical appliances. We then used the Chef bootstrapping support that Ravello offers to bootstrap into our existing Chef infrastructure, and we were off to the races."
As a next step, DT-HBS will also implement Ravello for the unit testing environments. One of the key benefits here is that if a build fails the tests, the environment can be retained for debugging. Previously this was not possible, because subsequent tests ran in different sets of VMs and the previous VMs had to be cleaned up due to capacity constraints. With Ravello, each test run has its own environment, which can be stopped and restarted as many times as needed.
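The retention rule described above can be expressed as a small policy. The sketch below is in Python and does not use the Ravello API. The class, the state names and the environment name are hypothetical, and deleting the environment after a passing run is an assumption; the text only states that a failed run's environment can be kept, stopped and restarted.

```python
from dataclasses import dataclass


@dataclass
class RunEnvironment:
    """A per-test-run environment (hypothetical model for this sketch)."""
    name: str
    state: str = "running"


def finish_test_run(env: RunEnvironment, tests_passed: bool) -> RunEnvironment:
    """Decide what happens to an environment once its tests finish.

    Assumed policy: delete after a passing run, but only stop (and so
    retain) the environment after a failing run so it can be restarted
    later for debugging.
    """
    env.state = "deleted" if tests_passed else "stopped"
    return env


failed = finish_test_run(RunEnvironment("unit-oss-build-1042"), tests_passed=False)
print(failed.state)
```

Because every run gets its own environment, keeping a failed one does not block the next build from being tested.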