Bridging The AutoLiance Gap for PCI, SOX, SOC2, NIST, NERC Compliance
Anirban Banerjee
Automation is now the core mantra for all businesses. Getting the maximum utility out of every software and service investment you have made is the natural way of things. With software orchestration frameworks, Python, shell scripts and monitoring tools, nearly every company now has the means to push efficiency to the max.
There is a downside here, though. In the race to automate anything and everything under the sun, compliance teams are often left holding the bag. It's sexy to think about automating the process of granting access to servers and apps; it's not as sexy to think about how to audit those automated actions, generate reports, and make sure controls are aligned with the frameworks you have to abide by. In this article we will discuss how automation can throw a spanner into compliance processes, and the best practices for balancing the need for efficiency and speed with compliance workflows.
Automate all the things!
From monitoring servers and resizing their RAM and disk space to acting on Zendesk or Jira tickets filed by your colleagues, everything needs to be automated. A common way to automate a range of IT, SecOps and DevOps tasks is to use configuration management software. For example, if your company uses cloud servers from any vendor, be it AWS, Oracle, Google or anyone else, you can use configuration management software such as Chef, Puppet, Salt or Ansible to automatically patch, update and modify these servers. Similarly, you can use plugins such as ScriptRunner for Jira to automate actions based on the contents of a support ticket filed by someone inside the company. For example, when someone files a support ticket requesting access to a server, you can use ScriptRunner to parse the fields in the request and take appropriate action via API calls to a third-party service.
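To make the ticket-driven example concrete, here is a minimal sketch in Python of the "parse the fields in the request" step. It is not ScriptRunner code: the payload shape, the field names (requester, server, reason) and the AccessRequest type are all assumptions made for illustration, and a real ticketing system will have its own schema.

```python
from dataclasses import dataclass

# Hypothetical field names; a real ticketing system defines its own schema.
REQUIRED_FIELDS = ("requester", "server", "reason")


@dataclass(frozen=True)
class AccessRequest:
    ticket_id: str
    requester: str
    server: str
    reason: str


def parse_access_ticket(ticket: dict) -> AccessRequest:
    """Turn a raw ticket payload into a validated access request."""
    ticket_id = str(ticket.get("id") or "").strip()
    if not ticket_id:
        raise ValueError("ticket has no id")

    fields = ticket.get("fields") or {}
    # Normalise every required field to a stripped string; None becomes "".
    values = {name: str(fields.get(name) or "").strip() for name in REQUIRED_FIELDS}

    missing = [name for name, value in values.items() if not value]
    if missing:
        # Refuse to act on an incomplete request instead of guessing.
        raise ValueError(f"ticket {ticket_id} is missing: {', '.join(missing)}")

    return AccessRequest(ticket_id=ticket_id, **values)
```

Rejecting incomplete tickets up front means that whatever downstream API call is made always has a requester, a target and a stated reason attached to it.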
What’s the catch here?
The catch is a bit counterintuitive. It is common sense that automation is a good thing, and nobody should be arguing against it. The tradeoff comes when you automate without paying attention to the compliance frameworks that you and your company need to adhere to. As an example, Company X has IT folks in the loop when a new person is onboarded into the company. Accounts need to be created for the new hire, email accounts need to be provisioned, access granted and more. In the rush to automate all of these processes, enterprises will often buy point solutions that perform their own duties well but do not provide an end-to-end pipeline of automation. Consider IAM and IDaaS solutions. They do well at probing your Active Directory, LDAP, Workday or JumpCloud accounts, figuring out which users have been added and provisioning accounts for them, but they often fall short of completing 100% of the necessary onboarding steps, such as creating accounts on server clusters. Yes, it is important to make sure the new hire gets an account on Salesforce, but they might also need new accounts on Windows machines that are not tied to AD. Then what?
Furthermore, account creation on sensitive assets should not be completely automated. Access to your SCADA and ITAR systems should not be granted blindly, relying on automation alone. In fact, the biggest “AutoLiance (Automation-Compliance) gap” in most automation pipelines is insufficient authorization controls. Who is blessing the request to provide access?
Automation bypasses a lot of authorization controls because the people who develop the automation are often not aware of the implications of taking humans out of the loop. Like it or not, only a tiny percentage of companies around the world have documented everything about the importance of their resources, who uses them, and for what. Most companies still rely on experienced IT folks to dig up the relevant information about which cluster of servers is used for what.
Why does the AutoLiance Gap happen?
The Gap happens because of the way automation is rolled out in organizations: using piecemeal software means authorization is not layered smoothly through the entire process. A very simple example: you are using Chef to manage configurations for your cloud servers, and to patch and upgrade them. Everything sounds good. Then the DevOps team has a brilliant idea to automate user provisioning on the cloud servers by tying in a simple script that probes LDAP/AD and finds out whether a new user has been added to a group. Here are the things that cause concern:
- Is the cloud server on which the account will be provisioned critical to the company, or part of a test cluster? If it is critical, who is authorizing the account creation?
- During the account creation process, is a rationale being logged somewhere to say why the account is being created, and is it being (at least) rubber stamped by someone?
- How long will the account exist? Who sets the default values? Do the default values change over time or depend on the sensitivity of the resource?
All of the questions above are swept under the carpet when automation is rolled out in most organizations. It is only when SOC2, PCI, SOX, NIST or NERC compliance reports need to be filed that teams realize automation has definitely reduced the burden of manual work but has done nothing to provide solid compliance controls. Using piecemeal software to perform automation has upsides and downsides. You will spend less time launching your minimal solution; however, you will not be able to answer the above questions easily and reliably during your audit process.
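One way to stop these questions being swept under the carpet is to write the answers down as policy that the automation must consult. The following Python sketch is illustrative only: the server names, the sensitivity labels and the default lifetimes are invented for the example, and in practice they would come from an asset register and from your GRC team rather than from constants in a script.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical inventory mapping servers to a sensitivity label.
SERVER_SENSITIVITY = {"payments-db-01": "critical", "test-web-07": "test"}

# Hypothetical defaults: more sensitive resources get shorter-lived accounts.
DEFAULT_LIFETIME = {"critical": timedelta(days=7), "test": timedelta(days=90)}


@dataclass(frozen=True)
class ProvisioningDecision:
    allowed: bool
    expires_at: Optional[datetime]
    note: str


def evaluate_request(server: str, rationale: str,
                     approver: Optional[str], now: datetime) -> ProvisioningDecision:
    """Answer the three audit questions before any account is created."""
    # Question 1: is the server critical? Unknown servers are treated as critical.
    sensitivity = SERVER_SENSITIVITY.get(server, "critical")

    # Question 2: is there a recorded rationale, and has someone signed off?
    if not rationale.strip():
        return ProvisioningDecision(False, None, "no rationale recorded")
    if sensitivity == "critical" and not approver:
        return ProvisioningDecision(False, None, "critical server requires a named approver")

    # Question 3: how long does the account live? It depends on sensitivity.
    expires_at = now + DEFAULT_LIFETIME[sensitivity]
    signer = approver or "policy default"
    return ProvisioningDecision(True, expires_at, f"{sensitivity} server, approved by {signer}")
```

Treating unknown servers as critical is a deliberate choice in this sketch: when the inventory is incomplete, the automation fails towards asking a human rather than towards granting access.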
How to fill the AutoLiance Gap?
There are two simple strategies that will help to fill the AutoLiance gap.
Layer Authorization and Explicit Logging
Instead of using individual tools to patch up workflows wherever you see manual effort being expended, we should think in terms of toolchains. The toolchain concept is not new. In fact, people familiar with *nix systems, programming and compiling understand the concept very well. What I want to highlight is that when you identify gaps in the workflow where manual effort needs to be cut out, you should also think about layering in “authorization” and “explicit logging”. A simple example, piggybacking on what we discussed above: you have used Chef to automate account creation, patch management and so on. Use Jira to log tickets for any account creation requests, and use the ScriptRunner plugin for Jira to fire off API calls to Chef. This breaks a bit from the standard automation model, where Chef directly probes your directory structure and makes decisions all by itself. Instead, you are introducing one layer of indirection, in which Chef is told what to do by your support ticketing system. This is a simple example of “gluing” on authorization and explicit logging when designing your automation pipelines.
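The layer of indirection described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a Chef or ScriptRunner integration: the ticket keys (id, requester, server, reason, approved_by) are hypothetical, the provision callable stands in for whatever API call your configuration management tool actually exposes, and the audit log is a plain list of JSON lines rather than a real log store.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional


def handle_ticket(ticket: dict,
                  provision: Callable[[str, str], None],
                  audit_log: list,
                  now: Optional[datetime] = None) -> bool:
    """Provision only what an approved ticket asks for, and log every decision.

    `provision(requester, server)` is a placeholder for the real call to the
    configuration management tool. A ticket missing a required key raises
    KeyError, so malformed requests never reach provisioning.
    """
    now = now or datetime.now(timezone.utc)
    approved = bool(ticket.get("approved_by"))

    record = {
        "time": now.isoformat(),
        "ticket": ticket["id"],
        "requester": ticket["requester"],
        "server": ticket["server"],
        "reason": ticket["reason"],
        "approved_by": ticket.get("approved_by"),
        "decision": "approved" if approved else "rejected",
    }
    # Explicit logging: the decision is written before any action is taken,
    # so rejected requests leave a trail as well.
    audit_log.append(json.dumps(record, sort_keys=True))

    # Authorization: the automation acts only when a human has signed off.
    if approved:
        provision(ticket["requester"], ticket["server"])
    return approved
```

The point of the design is that the provisioning tool never decides anything on its own: every action traces back to a ticket, an approver and a stated reason, which is exactly what an auditor will ask for.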
GRC involvement from the Get Go
Governance, Risk and Compliance (GRC) teams should not be left to make do with what security, IT and DevOps have developed. GRC teams are an integral part of the picture and bring a core perspective on how things should look from an audit standpoint. It is crucial to get GRC teams to sign off on any automation project in order to fend off costly AutoLiance gaps.
As we push for more automation in enterprises, it is important to remember that there are more than one or two stakeholders. We need to involve the right people with the right perspective in the decision-making process, for the simple reason that not everyone has a complete view of the entire pipeline. Furthermore, using toolchains and layering in explicit authorization and logging will help you build auditability into the automation itself.