Sunday 12 October 2014

Adventures in continuous delivery, our build/deployment pipeline

Overview

We have been undergoing a bit of a DevOps revolution at work. We have been on a mission to automate everything, or at least as much as possible. These are exciting times, but we are still only just setting out on this adventure, so I wanted to document where we currently are.

First, a brief overview of what we have. We have many small Windows services, websites and APIs, each belonging to a service and performing a specific role. I must quickly add that we are a Microsoft shop. More and more, our services are moving towards a proper service-oriented architecture. I hesitate to use the term microservices as it's so hard to pin a definition on it, but let's just say they are quite small and focused on a single responsibility.

We have 5 or 6 SPA apps, mainly written with Durandal and Angular; 7 or 8 different APIs serving data to these apps and to external parties; and 10 to 15 Windows services, which mostly publish and subscribe to NServiceBus queues.

We currently have 8 environments that we need to deploy to (going to be difficult to do this by hand, methinks), including CI, QA*, Test*, Pre-prod* and Live* (* the last 4 are doubled, as we deploy into 2 different regions which both operate slightly differently and have different config and testing). This list is growing with every month that passes. We really needed some automation; when it was just 3 environments in the UK region we just about got by with manual deployments.

I'm going to outline how the build pipeline integrates with the deployment pipelines and the steps that we take in each stage. I'm not really going to concentrate on the actual technical details; this is more of a process document.

1.0 The build pipeline

We operate on a trunk-based development model (most of the time), and every time you check in we run a build that produces a build artifact, pushes it into an artifact repository and then runs unit and integration tests against the artifact.

Fig 1. The build pipeline

Build

1. Run a transform on the assembly info so that the resultant DLL has build information inside its details. This helps us determine what version of a service is running on any environment; just look at the DLL's properties.
2. Create a version.txt file that lives in the root of the service. This is easy to look at on an API or website, as well as in the folder containing a service (see the sketch after this list).
3. We check in the config files for all the environments that we will be deploying to, and use a transform to replace specific parts of a common config file with environment-specific details (e.g. connection strings). Every environment's config is then part of the built artifact.
4. Build the solution, usually with MSBuild, or for the SPA apps, gulp.
5. If all this is successful, upload the built artifact to the artifact repo (the Go server).
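As an illustration of steps 1 and 2, here is a minimal PowerShell sketch of stamping the build number into the assembly info and writing version.txt. The paths and version scheme are hypothetical, not our exact build scripts; the build label comes from the Go agent's environment variables.

# Sketch only: stamp the Go build label into AssemblyInfo.cs and write version.txt.
# The paths and version scheme here are hypothetical.
$sourceRoot = ".\src\MyService"                # hypothetical service folder
$buildLabel = $env:GO_PIPELINE_LABEL           # set by the Go agent for each build

$assemblyInfo = Join-Path $sourceRoot "Properties\AssemblyInfo.cs"

# Replace the placeholder file version with one containing the build label
(Get-Content $assemblyInfo) |
  ForEach-Object { $_ -replace 'AssemblyFileVersion\("[\d\.]+"\)', "AssemblyFileVersion(""1.0.$buildLabel.0"")" } |
  Set-Content $assemblyInfo

# Drop a version.txt into the root of the service so any environment can report what it is running
@("Version: 1.0.$buildLabel.0", "Built:   $(Get-Date -Format s)") |
  Set-Content (Join-Path $sourceRoot "version.txt")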

Test

6. Fetch the built artifact
7. Run unit tests
8. Run integration tests

The test stage is separate so that we can run the tests on a different machine if necessary. It also allows us to parallelise the tests, running them on many machines at once if required.
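For illustration, a rough sketch of what the test stage does, assuming an NUnit-style console runner (the runner path and test assembly names are hypothetical and will differ in practice):

# Sketch only: run unit tests then integration tests against the fetched artifact.
$artifactDir = ".\artifact"                          # fetched from the Go artifact repo
$runner      = ".\tools\nunit\nunit-console.exe"     # hypothetical runner path

& $runner (Join-Path $artifactDir "MyService.UnitTests.dll") /xml:unit-results.xml
if ($LASTEXITCODE -ne 0) { throw "Unit tests failed" }

& $runner (Join-Path $artifactDir "MyService.IntegrationTests.dll") /xml:integration-results.xml
if ($LASTEXITCODE -ne 0) { throw "Integration tests failed" }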

Not shown on this diagram are the acceptance tests; these run in another pipeline. First we need to do a web deploy (as below), then set up some data in various databases and finally run the tests.

2.0 The web deploy pipeline

So far so good: everything is automated on the success of the previous stage. We then have the deployment pipelines, of which only the one to CI is fully automated, so that acceptance tests can be run on the fully deployed code. All the other environments are push-button deploys using Go.
The deployments of all our websites/APIs/SPAs are very similar to each other and the same across all the environments, so we have confidence that they will work when finally run against live.

Fig 2. The web deploy pipeline

Deploy

1. Fetch the build artifact
2. Select the desired config for this environment and discard the rest so there is no confusion later (see the sketch after this list)
3. Deploy to staging (I've written a separate article detailing how this works with IIS, PowerShell and Windows)
a. Delete the contents of the staging websites physical path
b. Copy the new code and config into the staging path
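As a rough sketch of steps 2 and 3 (the environment name, IIS site name and artifact layout are all hypothetical, not our real scripts):

# Sketch only: pick the environment's config and deploy the artifact into the staging site.
Import-Module WebAdministration

$environment = "QA"                          # hypothetical environment name
$artifactDir = ".\artifact"                  # fetched build artifact
$stagingSite = "MyApi-Staging"               # hypothetical IIS site name

# 2. Keep only this environment's config and give it the standard name
Copy-Item (Join-Path $artifactDir "configs\Web.$environment.config") `
          (Join-Path $artifactDir "Web.config") -Force
Remove-Item (Join-Path $artifactDir "configs") -Recurse -Force

# 3a/3b. Clear the staging site's physical path and copy in the new code and config
$stagingPath = (Get-Item "IIS:\Sites\$stagingSite").physicalPath
Remove-Item (Join-Path $stagingPath "*") -Recurse -Force
Copy-Item (Join-Path $artifactDir "*") $stagingPath -Recurse -Force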

Switch blue green

We are using the blue-green deployment model for our deployments. Basically, you deploy to a staging environment, then when you are happy with any manual testing you switch it over to live using PowerShell to swap the physical folders of staging and live in IIS. This gives a quick and easy rollback (just switch again) and minimises any downtime for the website in question.
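In essence the switch is just swapping the two sites' physical paths. A minimal sketch, with hypothetical site names, looks something like this; the follow-up article linked at the end covers the real scripts:

# Sketch only: swap the physical paths of the live and staging sites in IIS.
Import-Module WebAdministration

$liveSite    = "MyApi"                       # hypothetical site names
$stagingSite = "MyApi-Staging"

$livePath    = (Get-Item "IIS:\Sites\$liveSite").physicalPath
$stagingPath = (Get-Item "IIS:\Sites\$stagingSite").physicalPath

# Point live at the freshly deployed code and staging at what was live.
# Running the same swap again gives you the rollback.
Set-ItemProperty "IIS:\Sites\$liveSite"    -Name physicalPath -Value $stagingPath
Set-ItemProperty "IIS:\Sites\$stagingSite" -Name physicalPath -Value $livePath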

3.0 The service deployment pipeline

Much the same as the deployment of websites, except that there is no blue-green. The services mainly read from queues, which makes it difficult to run a staging version at the same time as a live version (not impossible, but a bit advanced for us at the moment).

Fig 3. The service deploy pipeline

Deploy

The install step again utilises PowerShell heavily: first to stop the services, then to back things up and deploy the new code before starting the service up again.
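A minimal sketch of that install step (the service name, paths and backup scheme are hypothetical):

# Sketch only: stop the Windows service, back up the current version,
# copy in the new code and start it again.
$serviceName = "MyCompany.MyService"
$installPath = "D:\Services\MyService"
$artifactDir = ".\artifact"
$backupPath  = "D:\Backups\MyService\$(Get-Date -Format yyyyMMdd-HHmmss)"

Stop-Service $serviceName
# Wait for the process to release its files before touching the folder
(Get-Service $serviceName).WaitForStatus('Stopped', '00:00:30')

Copy-Item $installPath $backupPath -Recurse -Force        # backup for manual rollback
Remove-Item (Join-Path $installPath "*") -Recurse -Force
Copy-Item (Join-Path $artifactDir "*") $installPath -Recurse -Force

Start-Service $serviceName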

There is no blue-green style of rollback here, as there are complications to doing this with Windows services and with reading off the production queues. There is probably room for improvement, but we should be confident that things work by the time we deploy to live, as we have proved the release out in 2 or 3 environments beforehand.

Summary

I'm really impressed with Go as our CI/CD platform; it gives us some great tooling around the value stream map, promotion of builds to other environments, pipeline templates and flexibility. We haven't just arrived at this setup of course; it's been an evolution which we are still undergoing. But we are in a great position moving forward as we need to stand up more and more environments, both on-prem and in the cloud.

Fig 4. The whole deployment pipeline

Room for improvement

There is plenty of room for improvement in all of this though:

* Config checked into source control and built into the artifact
Checking the config into the code base is great for our current team: we all know where the config is, and it's easy to change or add new things. But for a larger team, or where we didn't want the entire team to know the secret connection strings to live DBs, it wouldn't work. Thank goodness we don't have any paranoid DBAs here. There is also a problem if we want to tweak some config in an environment: we need to produce an entirely new build artifact from source code, which might now have other changes in it that we don't want to go live. We can handle this using feature toggles and a branch-by-abstraction mode of working, but it requires good discipline, which we as a team are only just getting our heads around. Basically, if the code is always in a releasable state this is not an issue.

* Staging and live both have the same config
When you do blue-green deployments as we are doing, both staging and live always point to the live resources and databases, so it's hard to test that the new UI in staging works with the new API that is also in staging, as both the staging and the current live UI will be pointing at the live API. Likewise, the live and staging APIs will both be pointing at the live DB or other resources. Blue-green deployments are not designed for integration testing like this; that's what the lower environments are for.
In a similar vein, logging will go to the same log files, which can be a problem if your logging framework takes out locks on files; we use log4net a lot, and it does. There are options to run log4net in a minimal-locking (lock when required) mode, but that can really hit performance. We have solved this by rewriting the path to the log file on the blue-green switch.
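For example, something along these lines runs as part of the switch; the appender name and paths here are hypothetical:

# Sketch only: point the log4net file appender at a slot-specific log path
# so the blue and green instances never fight over the same file.
$configPath = "D:\Sites\MyApi-Green\Web.config"      # hypothetical site path
$slot       = "green"                                # or "blue"

[xml]$config = Get-Content $configPath
$appender = $config.configuration.log4net.appender | Where-Object { $_.name -eq "RollingFile" }
$appender.file.value = "D:\Logs\MyApi\$slot\myapi.log"
$config.Save($configPath)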

* No blue-green style deployments of Windows services
The lack of blue-green deployment for services means that we have a longer period of disruption when deploying and a slower rollback strategy. Added to this, you can't test the service on the production server before you actually put it live. There are options here, but they get quite complicated, and by the time the service is going live you should have finished all your testing anyway.

* Database upgrades are not part of deployment
At the time of writing we are still doing database deployments by hand. This is slowly changing and some of our DBs do now have automated deployments, mainly using the Redgate SQL tool set, but we are still getting better at this. It's my hope that we will get to fully automated deployments of database schemas at some point, but we are still concentrating on the deployment of the code base.

* Snowflake servers
All our servers, both on-prem and in the cloud, are built, installed and configured manually. I've started to use Chocolatey and PowerShell to automate what I can around set-up and configuration, but the fact remains that it's a manual process to get a new server up and running. The consequence is that each server has small differences from other servers that "should" be the same, which means we could introduce bugs in different environments due to accidental differences in the servers themselves.
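For example, the sort of thing I've been scripting so far; the package and feature names are just examples (assuming Server 2012 or later), not a complete server build:

# Sketch only: a partial bootstrap of a new web server with Chocolatey and PowerShell.
choco install git -y
choco install 7zip -y
choco install notepadplusplus -y

# Install the IIS role and ASP.NET support
Install-WindowsFeature Web-Server, Web-Asp-Net45 -IncludeManagementTools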

* Ability to spin up environments as needed for further growth
Related to the above point, as a way to move away from snowflake servers we need to look at technologies like Puppet, Chef, Desired State Configuration (DSC), etc. If we had this automation we could spin up test servers, deploy to other regions/markets, or scale up the architecture by creating more machines.
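As a taste of what that could look like, here is a minimal DSC sketch that ensures IIS is installed and a site folder exists; the configuration name and paths are hypothetical:

# Sketch only: a tiny DSC configuration for a web server baseline.
Configuration WebServerBaseline {
  Node "localhost" {
    WindowsFeature IIS {
      Ensure = "Present"
      Name   = "Web-Server"
    }
    File SiteRoot {
      Ensure          = "Present"
      Type            = "Directory"
      DestinationPath = "D:\Sites\MyApi"
    }
  }
}

# Compile to a MOF and apply it to this machine
WebServerBaseline -OutputPath .\WebServerBaseline
Start-DscConfiguration -Path .\WebServerBaseline -Wait -Verbose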

Relevant Technology Stack (for this article)

• Windows
• PowerShell
• IIS
• SVN and Git
• MSBuild and gulp

Next >>

I've written a follow-up article which details the nuts and bolts of the blue-green deployment technique we are currently using: blue-green-web-deployment-with-IIS-and-powershell.
The code can be found on my GitHub here: https://github.com/DamianStanger/Powershell/
