#4: Sho~dan Core Foundations - Automation and Scaling
The next blog in our Sho~dan series examines the importance of automation and scaling within the payments industry.
As we mentioned earlier in our blog series, #2: Creating Judopay's Next-Gen Tech Platform, our engineering team is lean. We are often asked how we can deliver so much with a lean team. The answer boils down to two things: hiring excellent engineers, and heavily focusing on automation.
David, Judopay’s Head of Infrastructure and Automation Engineering, will take you through some of the automation techniques we use at Judopay. We hope they provide inspiration for your automation efforts!
It’s all about Automation!
Automation engineering is a huge part of Sho~dan. If something needs to be manually performed, then it’s a candidate to be automated!
We have all experienced the mind-numbing boredom of repeating the same task again and again, awaiting the sweet release of death... or 5pm. Whichever comes first. What if we could get a robot to take over these boring repetitive tasks so we could focus on more meaningful ways to deliver value?
Continuous Integration (CI) helps ensure that when a software developer makes a change, we automatically have a production-grade deliverable prepared and ready to be deployed.
Continuous Integration pipelines can be built to perform virtually any task we wish to repeat for every software change made to our system.
This can include:
- Compiling code to ensure it builds without error.
- Automated testing to ensure the change behaves as expected and doesn’t break any existing functionality.
- Code linting to ensure the code change meets code quality and style standards.
- Security scanning code to check for common mistakes.
- Checking for security vulnerabilities in dependencies.
These are some of the most common tasks performed by CI pipelines, but virtually anything repetitive can be automated. Eliminating as many of these boring tasks as possible makes developers happy and more productive.
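To make the idea concrete, here is a minimal sketch of what a pipeline runner does under the hood. It is an illustration only, not Judopay's actual tooling: the `Step` type, the `run_pipeline` function and the step names are all hypothetical. Each step is a check that either passes or fails, and the pipeline stops at the first failure so a broken change never reaches the later stages.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One pipeline stage (hypothetical type, for illustration only)."""
    name: str
    run: Callable[[], bool]  # returns True when the step succeeds


def run_pipeline(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first failure.

    Returns (succeeded, names of the steps that were executed).
    """
    executed: List[str] = []
    for step in steps:
        executed.append(step.name)
        if not step.run():
            return False, executed
    return True, executed


# Example usage: the lint step fails, so the security scan never runs.
pipeline = [
    Step("compile", lambda: True),
    Step("test", lambda: True),
    Step("lint", lambda: False),           # simulated failure
    Step("security-scan", lambda: True),
]
ok, ran = run_pipeline(pipeline)
print(ok, ran)
```

In a real CI system each step would typically invoke a build tool, test runner or scanner rather than a lambda, but the fail-fast ordering is the same.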
At Judopay, we have 100+ CI pipelines covering virtually all our engineering deliverables: from microservices to middleware, infrastructure as code, mobile SDKs, management reporting and even our developer and API documentation!
Continuous Delivery (CD) techniques automate the deployment of changes into test and production environments in a consistent, safe, and well-tested manner, helping to ensure the software platform is not negatively impacted by changes as they are deployed.
Continuous Delivery might include the automation of steps such as:
- Removing servers from load balancers and adding them back after the deployment is complete in order to avoid downtime on the service.
- Deploying database scripts.
- Deploying compiled binaries onto servers, or leveraging tools such as Kubernetes to deploy the new version of your software.
- Triggering end-to-end tests to validate the deployment.
- Triggering a release of the software to App stores or package repositories.
- Automated rollback if any issues are detected.
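The steps above can be sketched as a simple rolling deployment. This is a simplified illustration under stated assumptions, not a description of Judopay's deployment tooling: the function name `rolling_deploy` and the `deploy`, `is_healthy` and `rollback` callbacks are hypothetical, and the load balancer is modelled as a plain set of server names.

```python
from typing import Callable, List, Set


def rolling_deploy(
    servers: List[str],
    pool: Set[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Update one server at a time; undo everything if a health check fails.

    `pool` stands in for the load balancer's set of active servers.
    Returns True if every server was updated, False if we rolled back.
    """
    updated: List[str] = []
    for server in servers:
        pool.discard(server)          # stop sending traffic to this server
        deploy(server)
        updated.append(server)
        if not is_healthy(server):
            # Roll back every server touched so far, newest first.
            for done in reversed(updated):
                pool.discard(done)
                rollback(done)
                pool.add(done)
            return False
        pool.add(server)              # healthy: put it back into rotation
    return True
```

Because only one server leaves the pool at a time, the rest keep serving traffic throughout. The sketch assumes `rollback` reliably restores the previous version; a production pipeline would also verify health after rolling back.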
Continuous Delivery is not just about automating deployments, however. The hint is in the word “continuous”.
Frequently deploying small changes ensures that the deployment pipeline is well tested and reliable. Small changes are also easier to troubleshoot. If issues are encountered after a deployment where only one change was made, it’s much easier to find the culprit than if the deployment included a large number of changes, all of which might be contributing to the issue.
Automation at Judopay doesn’t stop at conventional software delivery!
Continuous Maintenance techniques are less commonly discussed, but they are extremely useful for reducing the manual toil engineers perform and for helping to ensure strong security compliance.
Continuous Maintenance might include automating tasks such as:
- Server patching.
- Configuration drift reconciliation.
- Renewing credentials which should be rotated frequently such as passwords, certificates or service account keys.
- Self-healing, such as replacing or restarting servers or services which are reporting as unhealthy.
All of this can be automated such that these tasks are happening in the background all the time, or perhaps during pre-defined maintenance windows, without any action from engineers.
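As one example, configuration drift reconciliation boils down to comparing the settings a server should have with the settings it actually has, then re-applying the desired values. The sketch below is a hypothetical illustration (the function names and setting keys are invented for the example); real tooling would read the desired state from version-controlled configuration and the actual state from the server itself.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    desired: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {setting: (actual_value, desired_value)} for each mismatch.

    A setting missing from `actual` is reported with an actual value of None.
    """
    drift: Dict[str, Tuple[Optional[str], str]] = {}
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            drift[key] = (have, want)
    return drift


def reconcile(desired: Dict[str, str], actual: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `actual` with every desired setting re-applied."""
    fixed = dict(actual)
    fixed.update(desired)
    return fixed


# Example usage with made-up settings.
desired = {"ssh_root_login": "no", "ntp": "enabled"}
actual = {"ssh_root_login": "yes", "ntp": "enabled", "hostname": "web-1"}
print(find_drift(desired, actual))
print(find_drift(desired, reconcile(desired, actual)))
```

Running the check on a schedule, rather than only at deploy time, is what turns this from a one-off audit into continuous maintenance.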
At Judopay, our systems gracefully remove themselves from load balancers or clusters, apply patches, resolve configuration drift, and add themselves back into the active pool after checking that the service they provide is still healthy. No monthly manual patching tasks for engineers! Of course, this all happens in our staging environments first, where we constantly run end-to-end tests to catch any issues. Once a change has been in the staging environment for two weeks, it is automatically promoted to production.
Additionally, virtually all of our infrastructure is managed as code rather than through traditional manual configuration.
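The promotion rule can be expressed as a small check. This is a sketch rather than our actual implementation: the two-week soak period comes from the description above, while the `ready_for_production` name and the requirement that end-to-end tests are currently passing are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Two weeks in staging, as described above.
SOAK_PERIOD = timedelta(weeks=2)


def ready_for_production(
    staged_at: datetime, now: datetime, tests_passing: bool
) -> bool:
    """Return True once a change has soaked in staging long enough.

    The `tests_passing` gate is an assumption for this sketch: a change
    with failing end-to-end tests is never promoted, however old it is.
    """
    return tests_passing and (now - staged_at) >= SOAK_PERIOD


# Example usage with fixed timestamps.
staged = datetime(2024, 1, 1, 9, 0)
print(ready_for_production(staged, datetime(2024, 1, 10, 9, 0), True))
print(ready_for_production(staged, datetime(2024, 1, 15, 9, 0), True))
```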
When Infrastructure-as-Code is combined with continuous maintenance, you can automate even more of your security governance and ensure consistent configuration across your systems.
Continuous maintenance of server patching alone saves our engineers hundreds of hours every month, all without any impact on our live systems. Combined with the other techniques outlined above, we can ensure strong security, increase ROI as our infrastructure needs grow, and allow engineers to focus on meaningful work.
Whether it’s the New Year Sales or a merchant-specific peak in traffic, platforms must be able to effectively handle a sudden spike or drop in requests.
Platform elasticity isn’t something merchants will be aware of - what they will notice is slow transactions or lost sales when a platform doesn’t have the ability to dynamically scale.
Our Cloud Native infrastructure is configured to auto-scale, dynamically adapting to demand by adding or removing service instances as required. Thus we not only avoid transactions competing for processing power, we also avoid wasting resources on instances that sit idle.
Typically, vertical scaling is the tool most systems use as it's the easiest to implement. Sho~dan’s ability to scale vertically allows us to individually tailor the size and performance of each service instance.
When a system is scaled vertically, the resources allocated to each instance of that system are changed - increasing the number of CPUs allocated or the amount of memory, for example. Right-sizing systems provides a good baseline of cost versus performance.
Taking advantage of Cloud infrastructure, instances can be enlarged or shrunk to meet the average demand of the system without paying for more resources than needed.
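Right-sizing can be thought of as picking the smallest instance size that covers average demand plus some headroom. The sketch below illustrates that idea only: the size catalogue, the 20% default headroom and the `right_size` function are all hypothetical and do not reflect any particular cloud provider's offerings or Judopay's configuration.

```python
from typing import List, Optional, Tuple

# Hypothetical catalogue: (name, vCPUs, memory in GiB), smallest first.
SIZES: List[Tuple[str, int, int]] = [
    ("small", 2, 4),
    ("medium", 4, 8),
    ("large", 8, 16),
]


def right_size(
    avg_cpus: float, avg_mem_gib: float, headroom: float = 0.2
) -> Optional[str]:
    """Return the smallest size covering average demand plus headroom.

    Returns None when no size is big enough, which is a hint that the
    workload should be scaled horizontally instead.
    """
    cpu_target = avg_cpus * (1 + headroom)
    mem_target = avg_mem_gib * (1 + headroom)
    for name, cpus, mem_gib in SIZES:
        if cpus >= cpu_target and mem_gib >= mem_target:
            return name
    return None


# Example usage.
print(right_size(1.5, 3.0))   # small
print(right_size(1.8, 3.0))   # medium
print(right_size(10.0, 4.0))  # None
```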
In horizontal scaling, instead of changing the resources allocated to each instance, we increase or decrease the number of instances. You can think of an instance as a server, or in the case of a tool like Kubernetes, a “replica”.
Horizontal scaling can be more difficult to implement because often the system needs to be aware of the instances. Database clusters might need to replicate data, or load balancers might need to be updated to forward traffic to the new instances.
At Judopay we leverage Kubernetes to host our microservices. This empowers us to spin up more instances to run alongside each other within the Kubernetes cluster, or spin up more instances of Kubernetes itself. Horizontal scaling allows us to implement elasticity, scaling up or down to match real-time demand.
At Judopay we’ve spent a lot of energy putting in place the automation described above, as well as other more custom automation tooling specific to our use cases. While it’s not always easy to balance these proactive efforts against delivering work which directly affects the company’s bottom line, we’ve found that investing in automation pays dividends in the long run.
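To give a feel for how the number of replicas can follow demand, here is a small sketch of the scaling calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. This is an illustration, not a statement of how our clusters are configured; the function name, the default bounds and the example numbers are assumptions, and the real autoscaler also applies a tolerance and stabilisation windows that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Compute a replica count from observed load versus target load.

    Uses ceil(current_replicas * current_metric / target_metric), then
    clamps the result to the [min_replicas, max_replicas] range.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Example usage: 4 replicas at 90% average CPU against a 60% target.
print(desired_replicas(4, 90, 60))  # 6
# Demand drops to 30% average CPU, so the service scales back in.
print(desired_replicas(4, 30, 60))  # 2
```

The upper bound protects against runaway cost during an extreme spike, while the lower bound keeps enough instances running to stay available when traffic is quiet.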
Implementing CI/CD, Continuous Maintenance, and designing scalable, Cloud Native systems is no mean feat. It takes time, effort and determination, but the end result is happier, more effective engineers and far lower overhead costs.
This is all part of why we are so proud of Sho~dan.