Why DevOps should be DevSecOps

Introduction

Most people have heard of DevOps, but DevSecOps is still quite new. The “Sec” inserted in the middle of “DevOps” stands for “security.” This article explores what DevOps and DevSecOps mean, and why you should consider integrating DevSecOps practices into your organization.

The Difference between DevOps and DevSecOps

First, let’s define what the terms “DevOps” and “DevSecOps” mean. DevOps is traditionally understood as the intersection between “development” and “operations”, i.e., between the domain of developing the software and the domain of running it. As such, DevOps is akin to a philosophy or a way of working that relies on close collaboration between two previously siloed teams, or on engineers proficient in both domains. One of the main goals of DevOps is to shorten release cycles so that software ships faster. Over the years, DevOps has both expanded and refined the technologies and techniques that fall under its remit, such as:

Automation
Continuous integration, continuous delivery, and continuous deployment
Infrastructure as code
Facilitated code reviews (such as pull requests on GitHub or merge requests on GitLab)
Monitoring and alerting
Automated mitigation procedures triggered in response to those alerts
Designing DevOps-optimized applications (using, for example, the 12-factor app guidelines)
DevSecOps (see below)

The above list is by no means exhaustive. As you can see, DevOps is broad in its application domain and includes DevSecOps.

DevSecOps is a subset of DevOps that covers the intersection between “development,” “operations,” and “security,” with a focus on automation. When automation is not possible, rules and guidelines are established for developers, system engineers, and DevOps engineers to follow.

DevSecOps must be well thought out at every stage of the software lifecycle, both at the human-interaction level and at the automation level. With DevSecOps, both development and operations teams can:

Ensure security training is adequate for the diverse teams working on the software (e.g., developers, testers, operations engineers)
Ensure communication and collaboration between teams are quick, efficient, and as frictionless as possible
Automate security at every stage of the software lifecycle whenever possible, for example, through:

Threat modeling
Static analysis
Linting
Scanning for dependencies that have known vulnerabilities
Dynamic Application Security Testing (DAST)
Hardening
Interactive Application Security Testing/Runtime Application Self-Protection (IAST/RASP)
Centralized logging
Monitoring, alerting, and taking automated actions to remediate alerts

Precursors to security lapses

DevOps teams sometimes overlook security, typically for one or more of the following reasons.

Pressure to deliver

When DevOps engineers are pressured to deliver, they tend to focus on producing something that works and de-prioritize anything that does not fulfill that primary goal. Consequently, they cut corners, and security is often the first casualty.

A typical example of a security measure that can slow development down is encryption at rest and in transit. Once data saved on disk or sent over the network is encrypted, inspecting it while debugging becomes harder, so teams under pressure are tempted to postpone encryption.

Security as the final stage of software development

Very often, security is thought of as a final step in software development, something that someone will do later. The main consequence of this mindset is that security is usually poorly thought-out, is incomplete, and takes the form of additional layers that come on top of the application.

This is a very poor practice, as the software itself is not secure. The right way is to develop the software with security in mind right from the start (i.e., the design phase) and to ensure that security is everyone’s responsibility.

Lack of automation

The implementation and testing of security aspects are often left to security experts at the end of the software lifecycle. This work is typically done manually and is thus very time-consuming and error-prone. In such a situation, it is very easy, even for security experts, to miss vulnerabilities.

The solution is to increase the level of automation, especially in terms of security testing, both in the development phase (linting, static application security testing, etc.) and in the testing phase (penetration testing, fault injection testing, Dynamic Application Security Testing, etc.).

Skills gap

Not all engineers are aware of the latest security developments or have received timely training. Most DevOps engineers come from either a sysadmin background or a software development background. They often lack exposure to security challenges and the technical know-how needed to set up a secure production environment. Typically, the former are unaware of how software should be developed in a secure manner, and the latter do not realize the security challenges related to setting up a production environment as a whole.

In a fast-paced development environment, security is often an afterthought that will be plugged in at some point in the future. The urgency of the moment is usually to implement certain features, fix critical bugs, and get something up and running as quickly as possible. For example, a startup (or a new project in an established organization) might need to get its software running before it can get further funding (or secure the project’s future). In this context, businesses often fail to prioritize security due to limited funding. On the other hand, the later a problem is discovered, the costlier it is to remedy. So, some balance needs to be achieved, based on cost/benefit analysis and automation.

Shift left

The practice of “shifting left” helps mitigate the problems described in the previous section. Although developers might be careful about security while designing and coding the software, they might not understand the bigger picture when the software is integrated into a production environment, along with the vulnerabilities that come with that.

“Shifting left” means changing the way developers and DevOps teams work so that steps usually performed late in the software development life cycle are performed earlier. Here is a non-exhaustive list of actions that could be taken to shift left:

Establish coding standards, especially defensive programming.
Ensure development environments match as closely as possible with the production environments to avoid the typical “but it works for me” response when a bug is reported to a developer.
Implement several tools that should be run automatically as part of a continuous integration pipeline:

A linting tool to validate code quality and adherence to coding standards
A reputable, well-configured static analysis tool
A scanning tool to detect dependencies that have known vulnerabilities
A tool to ensure unit tests are written and run automatically

Automate the deployment of the software to the QA environment and run functional and non-functional tests (such as integration tests, Dynamic Application Security Testing, and performance tests) after each deployment in an automated manner.
Provide developers access to cloud resources and train them to create and manage such resources.
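The automated checks in the list above can be chained into a fail-fast gate. The following is only a minimal sketch of that idea: the stage names and the stage functions are hypothetical stand-ins, and a real pipeline would invoke actual linting, static analysis, and dependency-scanning tools at those points.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A stage is a name plus a function returning a list of findings (empty = pass).
Stage = Tuple[str, Callable[[], List[str]]]


@dataclass
class StageResult:
    name: str
    findings: List[str]

    @property
    def passed(self) -> bool:
        return not self.findings


def run_pipeline(stages: List[Stage]) -> List[StageResult]:
    """Run stages in order and stop at the first one that reports findings."""
    results = []
    for name, check in stages:
        result = StageResult(name, check())
        results.append(result)
        if not result.passed:
            break  # fail fast: later stages are skipped
    return results


# Hypothetical stage implementations standing in for real tools.
def lint() -> List[str]:
    return []


def static_analysis() -> List[str]:
    return ["possible injection risk in orders.py"]


def dependency_scan() -> List[str]:
    return []


results = run_pipeline([
    ("lint", lint),
    ("static analysis", static_analysis),
    ("dependency scan", dependency_scan),
])
for r in results:
    print(r.name, "passed" if r.passed else f"failed: {r.findings}")
```

Stopping at the first failing stage keeps feedback to the developer short and specific, at the price of hiding findings from later stages until the earlier ones are fixed.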

Shifting left brings numerous benefits:

Bugs are either avoided entirely or detected early, which is always cheaper.
Best practices are encouraged right from the start for both developers and DevOps engineers.
Security becomes everybody’s responsibility, not just an afterthought plugged in at the end.

In a traditional siloed organization, DevOps engineers usually deal with the software as provided by the dev team and have little leverage over how the software is developed. Shifting left means that different teams will have to collaborate. It means that the developers and testers will have to learn from the DevOps engineers about how the production environment is architected, and why it is architected in this way. Similarly, DevOps engineers will need to learn about the software the developers are creating and how it works, so that it can be integrated more efficiently into the production environment.

Developers require training in security best practices, and they must be encouraged and given the time to apply those practices to their code.

Shift left will require a deliberate effort from everyone (developers, DevOps engineers, management), and will temporarily slow everybody down until new processes and habits are in place.

Principle of least privilege

The principle of least privilege is a critical aspect of security in the context of software development and DevOps processes. This principle states that entities should be given just enough privileges to perform the tasks they are meant to perform. On the face of it, it makes sense: If the software gets compromised, you want to limit the impact.

However, this principle is often difficult and time-consuming to implement properly. There are many permission systems, but they usually work by either allowing or denying certain actions on certain resources.

Implementation challenges

The initial work required to determine the minimal permissions is often non-trivial. The range of possible actions can usually be enumerated (although this is time-consuming in itself). However, the resources on which those actions may be performed can be very tricky to pin down, as they might depend on dynamic parameters. For example, you might know that a certain API call to, say, Amazon Web Services, will be made to act on EC2 instances, but which EC2 instance(s) can be very difficult to determine in advance. This is especially the case if the system is designed such that EC2 instances are ephemeral, which, in most cases, would be the right thing to do. In truth, the targeted resources aren’t the only aspect of API calls that can be dynamic, as the range of actions performed on these resources can be unpredictable as well; however, the range of such actions is usually much more limited than the range of targeted resources.

Observing what the software does at runtime and granting exactly those permissions is not enough. The range of potential actions and resources might differ if the software is run in a different environment with different configuration parameters and requests. That is why this section speaks of “potential” actions and resources: the actual actions that the software attempts to take on actual resources will most probably vary from one run to the next.
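To make this concrete, here is a small sketch of a job whose permission needs depend on its configuration. Everything in it is hypothetical (the action names, the resource names, and the `observed_permissions` function itself); it only illustrates why a policy derived from one observed run can be incomplete.

```python
from typing import Set, Tuple

Permission = Tuple[str, str]  # (action, resource)


def observed_permissions(config: dict) -> Set[Permission]:
    """Hypothetical job: the (action, resource) pairs it uses under a given config."""
    used = {("queue:Receive", config["queue"])}
    if config.get("archive"):
        # This branch only runs when archiving is switched on.
        used.add(("storage:Write", f"bucket/{config['archive']}"))
    return used


# Permissions recorded while observing a staging run...
staging = observed_permissions({"queue": "jobs-staging"})
# ...versus what the same code needs with the production configuration.
production = observed_permissions({"queue": "jobs-prod", "archive": "audit"})

# Permissions that production needs but the staging observation never showed.
missing = production - staging
print(sorted(missing))
```

Both a different resource (the production queue) and an entirely new action (the archive write) are missing from the staging observation, which matches the two kinds of variation described above.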

Consequently, it is hard to automate the determination of the least privileges for a given piece of software. This is because it depends on not only the software’s code but also dynamic aspects, such as the environment in which the software is running, configuration parameters, and input parameters for a given request or job. Another difficulty with automating the discovery of such permissions is that at the end of the day, somebody (or something) has to decide whether those permissions are acceptable or not, for example, in the context of a regulated industry or the organization’s IT policies.

Even when the initial job of determining the least amount of privilege that a given piece of software requires has been completed, a small change in the code might mandate a review of the existing permissions, which is again time-consuming and hard to automate. In essence, after such a change occurs, the same work that was put into determining the initial set of permissions should be done again, although it will hopefully be much more limited in scope and thus won’t require the same amount of time.

Given the above, it is understandable that DevOps engineers, when working under pressure, are often tempted to give wide-ranging permissions. Even in production workloads, it is common to see such broad permissions being used. It often starts with good intentions, such as simply wanting to “get things going,” with the assurance that the permissions will be revisited and restricted at a later stage. Unfortunately, that later stage often fails to materialize.

The reality of least privilege

Organizations must decide how much security they want versus the cost of implementing such security measures. A startup might be lax with its security requirements, as spending too many resources on crafting permissions on a least-privilege basis might be unaffordable.

On the other hand, a less precarious company with either funding or income might decide to implement the principle of least privilege in part, for example, by using wildcards for resources if the permission system in question allows them. Using such wildcards will at least limit the range of resources that can be acted upon, as opposed to the full access that the startup would use.

A company requiring high security might have strict standards to properly implement the principle of least privilege, whatever the cost, because the cost of a breach might be colossal or because they are legally required to do so.
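The three levels described above (full access, wildcard-scoped resources, and exact resources) can be compared with a toy allow-list evaluator. This is a sketch under stated assumptions: the `Statement` type, the action names, and the resource names are invented for illustration and do not reproduce the policy language of any real cloud provider.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple


class Statement(NamedTuple):
    action: str    # pattern for the allowed action
    resource: str  # pattern for the allowed resource


def is_allowed(policy: List[Statement], action: str, resource: str) -> bool:
    """Deny by default; allow only if a statement matches both action and resource."""
    return any(
        fnmatchcase(action, s.action) and fnmatchcase(resource, s.resource)
        for s in policy
    )


# Full access, as the startup in the example might grant.
full_access = [Statement("*", "*")]
# Partial least privilege: one action, resources limited by a wildcard.
wildcard_scoped = [Statement("storage:Read", "bucket/reports-*")]
# Strict least privilege: one action on one named resource.
exact = [Statement("storage:Read", "bucket/reports-2024")]

print(is_allowed(full_access, "storage:Delete", "bucket/payroll"))
print(is_allowed(wildcard_scoped, "storage:Read", "bucket/payroll"))
print(is_allowed(exact, "storage:Read", "bucket/reports-2025"))
```

The wildcard policy still admits resources nobody listed explicitly (any future `reports-` bucket), but it blocks unrelated resources and all other actions, which is the compromise the second example describes.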

For the funded company and the high-security company described above, a dedicated DevSecOps team working with the DevOps team might be a good idea or even necessary. For smaller companies, a good bet would be to educate the developers about the permission system that will be used to run their software, let them deploy their software in a sandbox environment similar to a production environment, and then let them experiment with it. In this way, they will probably be able to do a lot of the work of determining the least amount of privilege that the software needs. They wrote the code and are thus best placed to know what actions it might want to perform and on which resources.

Examples of malpractice and how they can be avoided

This section will review common practices detrimental to security and how DevSecOps can solve them.

Using vulnerable software

The first and probably the most common one is using vulnerable software packages. The vast majority of software projects use existing libraries and other such packages, whether open-source or proprietary, to re-use existing functionality and not reinvent the wheel for problems already solved. Developers typically try to ensure that the packages they choose are of good quality and reasonably free of defects and vulnerabilities.

The problem arises when organizations lack visibility into the supply chain and developers fail to use the latest versions of the dependencies. New vulnerabilities are discovered constantly. Several databases list vulnerabilities for various libraries; the best known and most widely used is the Common Vulnerabilities and Exposures (CVE) list.

Libraries that were once thought of as secure become less and less secure as security issues are found within them. Once such defects are found, the library’s maintainers will usually quickly issue a patch to fix or mitigate the security concern and publish a new library release. So, developers must ensure that the latest versions of their dependencies are used at all times. Upgrading a dependency is usually a painless process, but incompatibilities may occur every now and then that require some code refactoring. If the developers are under tight deadlines or high pressure, it will be very tempting for them to “just do this later,” which quickly becomes “never.”

The solution to this problem is relatively simple: use a reputable scanner as part of a continuous integration process, so that the code and its dependencies are scanned every time a change is pushed to the code repository. Issues can then be reported immediately to the developer responsible, who can take immediate action. In addition to this automated process, the developer must be allowed to take the time necessary to upgrade the faulty dependencies to their latest versions. Again, on rare occasions, this could be a protracted effort, but it is absolutely necessary from a security standpoint.

Some organizations adopt the mantra of “if it ain't broke, don't fix it,” but upgrading software dependencies is a non-negotiable security measure, except in very specific circumstances, for example, if the software has no interaction whatsoever with any network.
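As a rough sketch of what such a scanner does at its core, the following compares pinned dependency versions against a list of advisories. The package names, advisory identifiers, and data layout are all invented for illustration. Real scanners work from published vulnerability databases and handle affected version ranges, which this simplified "fixed in" comparison does not.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Parse a plain dotted version such as '2.4.1' (no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


def find_vulnerable(
    pinned: Dict[str, str],
    advisories: Dict[str, List[Tuple[str, str]]],
) -> List[str]:
    """Report pinned packages whose version is older than an advisory's fixed version."""
    findings = []
    for name, version in sorted(pinned.items()):
        for advisory_id, fixed_in in advisories.get(name, []):
            if parse_version(version) < parse_version(fixed_in):
                findings.append(f"{name} {version}: {advisory_id}, fixed in {fixed_in}")
    return findings


# Hypothetical project dependencies and hypothetical advisories.
pinned = {"examplelib": "1.2.0", "otherlib": "3.1.4"}
advisories = {
    "examplelib": [("ADVISORY-0001", "1.2.3")],
    "otherlib": [("ADVISORY-0002", "3.0.0")],
}

findings = find_vulnerable(pinned, advisories)
for finding in findings:
    print(finding)
```

In a continuous integration job, a non-empty list of findings would fail the build and notify the developer who pushed the change.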

Implementing high-risk code

Another potential security issue regarding software development is the use of dangerous code constructs. A typical example would be unbounded recursion, which can exhaust the call stack, but there are many others.

The solution to this problem is establishing coding standards for the chosen programming languages and enforcing those standards. Static analysis tools can verify coding standards, flag high-risk coding constructs, and detect potential bugs ahead of time. Such tools can be made part of a continuous integration pipeline, where the code is checked on every change and the developer who pushed such code can be notified of any issues immediately. Static analysis tools might not detect all problems, though, so code reviews by team members remain worthwhile.
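To give a feel for how a static check flags risky constructs, here is a deliberately small sketch built on Python's standard `ast` module. The rule set is an assumption chosen for illustration (calls to `eval`/`exec` and functions that call themselves directly). It is not how any particular static analysis product works, and it misses indirect recursion, methods, and async functions.

```python
import ast
from typing import List


class RiskyConstructVisitor(ast.NodeVisitor):
    """Flags calls to eval/exec and functions that call themselves directly."""

    def __init__(self) -> None:
        self.findings: List[str] = []
        self._function_stack: List[str] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        # Track which function body we are currently inside.
        self._function_stack.append(node.name)
        self.generic_visit(node)
        self._function_stack.pop()

    def visit_Call(self, node: ast.Call) -> None:
        if isinstance(node.func, ast.Name):
            name = node.func.id
            if name in {"eval", "exec"}:
                self.findings.append(f"line {node.lineno}: call to {name}()")
            elif self._function_stack and name == self._function_stack[-1]:
                self.findings.append(f"line {node.lineno}: {name}() calls itself")
        self.generic_visit(node)


def check_source(source: str) -> List[str]:
    visitor = RiskyConstructVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings


SAMPLE = '''
def depth(node):
    return 1 + depth(node.parent)

def run(expr):
    return eval(expr)
'''

for finding in check_source(SAMPLE):
    print(finding)
```

A flagged construct is a prompt for review rather than proof of a defect, which is one reason code reviews by team members still matter.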

Not validating user inputs

Whenever a user is expected to provide some input, especially in the form of free text, there is a danger of code injection attacks. Failing to sanitize and validate such user input is one of the main attack vectors for malicious actors. This aspect of security is very important, and it can be quite difficult to get right and completely secure.

There are libraries for many programming languages to check such user inputs, and organizations should definitely use them and use them properly. In addition, some static analysis tools will spot such potential vulnerabilities and flag them to the developers. Finally, runtime tools, such as Web Application Firewalls, can block many of the most common attacks.
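Two common defenses against injection are allow-list validation and parameterized queries. The sketch below shows both using only Python's standard library. The `users` table, the username rule, and the function names are assumptions made for the example.

```python
import re
import sqlite3

# Allow-list: 3 to 32 letters, digits, or underscores, and nothing else.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,32}")


def validate_username(raw: str) -> str:
    """Reject any input that does not match the allow-list pattern."""
    if not USERNAME_PATTERN.fullmatch(raw):
        raise ValueError("invalid username")
    return raw


def find_user(conn: sqlite3.Connection, raw_username: str):
    username = validate_username(raw_username)
    # The ? placeholder passes the value as data, never as SQL text.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchone()


# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))
try:
    find_user(conn, "alice' OR '1'='1")
except ValueError as error:
    print("rejected:", error)
```

The two layers are independent: even if the validation rule were loosened, the parameterized query would still keep the input from being interpreted as SQL.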

Not testing for bugs

An obvious issue is having bugs in your code, as bugs often have security implications in addition to making the software malfunction. Bugs are an inevitable part of software development, and a static analysis tool can catch some of them, but such a tool is far from enough. What is also required is comprehensive, and if possible automated, testing at various stages of the software lifecycle. Examples of such testing could be:

Unit testing
Integration testing
Stress/load testing
End-to-end testing
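As an illustration of the first item, here is a minimal automated unit test using Python's standard `unittest` module. The function under test, `mask_card_number`, is a hypothetical helper invented for this example; the point is only that a security-relevant behavior (never logging a full card number) can be pinned down by a test that runs on every change.

```python
import unittest


def mask_card_number(number: str) -> str:
    """Hypothetical helper: hide all but the last four digits before logging."""
    digits = [c for c in number if c.isdigit()]
    if len(digits) < 4:
        raise ValueError("card number too short")
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


class MaskCardNumberTest(unittest.TestCase):
    def test_only_last_four_digits_remain(self):
        masked = mask_card_number("4111 1111 1111 1234")
        self.assertEqual(masked, "*" * 12 + "1234")

    def test_short_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mask_card_number("12")


# Run the tests programmatically, as a CI job might.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MaskCardNumberTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```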

As a side note, but still very important, the later a bug is discovered, the costlier it is to rectify.

Not using encryption

Another fairly obvious problem is leaving data in plain text. The solution is straightforward: Use encryption for data at rest and data in transit. Numerous libraries and third-party software can do this, so the cost of implementing such encryption is not that great.
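For data in transit, much of the work is often a matter of configuration rather than new code. The following sketch uses Python's standard `ssl` module to build a client-side TLS context. The choice of TLS 1.2 as the minimum version and the function name are assumptions for illustration. Encryption at rest is not covered here, since it normally relies on a dedicated library or on features of the storage platform.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify certificates and refuse old protocol versions."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: nothing older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_tls_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

Such a context can then be passed to standard-library clients that accept one, for example `http.client.HTTPSConnection(host, context=context)`.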

Not monitoring the production environment

Another issue is not monitoring how the software performs in the production environment. This typically leads to the inability to understand what is going on when things go wrong, what the bottlenecks are, or why the software or system has crashed or otherwise ceased to function properly. Observability (defined as the combination of monitoring metrics, logging, and tracing) is indispensable to understanding how the software behaves under load, as well as preventing and mitigating (ideally in an automated manner) potential issues.
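As a rough illustration of the logging and alerting parts of observability, the sketch below emits log records as JSON (which centralized logging systems can parse) and raises an alert when the share of failed requests in a sliding window exceeds a threshold. The class names, window size, and threshold are assumptions for the example; tracing is not covered.

```python
import json
import logging
from collections import deque


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


class ErrorRateAlert:
    """Signals when the share of failed requests in a sliding window exceeds a threshold."""

    def __init__(self, window: int, threshold: float) -> None:
        self.outcomes = deque(maxlen=window)  # keeps only the most recent outcomes
        self.threshold = threshold

    def record(self, ok: bool) -> bool:
        self.outcomes.append(ok)
        failures = self.outcomes.count(False)
        return failures / len(self.outcomes) > self.threshold


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)

alert = ErrorRateAlert(window=4, threshold=0.5)
fired = []
for ok in [True, True, False, False, False]:
    triggered = alert.record(ok)
    fired.append(triggered)
    if triggered:
        logger.error("error rate above %.0f%%", alert.threshold * 100)
```

In a real system, the point where the alert fires is where an automated remediation step (for example, a rollback or a restart) could be attached.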

As a general note, in many organizations practicing DevOps, IT security is left to chance, or more precisely to the knowledge and goodwill of developers and DevOps engineers. This is probably good enough for a startup, but as an organization grows, it needs to establish and enforce its security posture. Security posture is essentially the formalization of the various security aspects of your IT systems and workloads. It comprises the following:

Asset inventory and its attack surface
Processes in place within the organization to detect and mitigate cyberattacks
Processes in place to recover from successful or partially successful cyberattacks
The degree to which these processes are automated rather than manual

Conclusion

Security is left as an afterthought in many organizations. Strive not to be one of them. You need to assess your security requirements for the various projects and products your organization develops, and perform a cost analysis in accordance with the size, goals, and human and financial resources available within your organization.

If you are a startup with limited financial and human capital developing a minimum viable product, you might decide not to spend too much time and effort on security, but this must be a conscious decision. It is critical to consider the consequences of such a decision, especially how you will deal with the accumulated technical debt.

If your organization is already sizable, with secure funding and/or sales, and security still isn’t a central part of your IT processes and systems, then you should probably look at integrating some DevSecOps concepts.

As illustrated in this article, DevSecOps is about considering security at every step of the software lifecycle, focusing especially on security automation. For example, DevSecOps can help you scan for vulnerable dependencies, potential bugs, or security flaws using a variety of security tools, and monitor the software at runtime under load, with alerting and automated remediation.

In conclusion, DevSecOps is an integral part of DevOps and must be integrated right from the start, or retrofitted as soon as practically possible.
