Book Club: The DevOps Handbook (Chapter 23. Protecting the Deployment Pipeline and Integrating Into Change Management and Other Security and Compliance Controls)

This entry is part of the 25-part DevOps Handbook series.

The following is a chapter summary of “The DevOps Handbook” by Gene Kim, Jez Humble, John Willis, and Patrick Debois, written for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Background on The DevOps Handbook

More than ever, the effective management of technology is critical for business competitiveness. For decades, technology leaders have struggled to balance agility, reliability, and security. The consequences of failure have never been greater―whether it’s the healthcare.gov debacle, cardholder data breaches, or missing the boat with Big Data in the cloud.

And yet, high performers using DevOps principles, such as Google, Amazon, Facebook, Etsy, and Netflix, are routinely and reliably deploying code into production hundreds, or even thousands, of times per day.

Following in the footsteps of The Phoenix Project, The DevOps Handbook shows leaders how to replicate these incredible outcomes, by showing how to integrate Product Management, Development, QA, IT Operations, and Information Security to elevate your company and win in the marketplace.

The DevOps Handbook

Chapter 23

Almost any IT organization of any size will have existing change management processes, which are the primary controls to reduce operations and security risks. The goal is to successfully integrate security and compliance into any existing change management process.

ITIL breaks changes down into three categories:

Standard Changes: lower-risk changes that follow an established and approved process but can also be pre-approved. They can include monthly updates of application tax tables or country codes, website content & styling changes, and certain types of application or operating system patches that have a well-understood impact. The change proposer does not require approval before deploying the change, and change deployments can be completely automated and should be logged so there is traceability.

Normal Changes: higher-risk changes that require review or approval from the agreed-upon change authority. In many organizations, this responsibility is inappropriately placed on the change advisory board (CAB) or emergency change advisory board (ECAB), which may lack the required expertise to understand the full impact of the change, often leading to unacceptably long lead times. Large code deployments may contain hundreds of thousands of lines of new code, submitted by hundreds of developers. In order for normal changes to be authorized, the CAB will almost certainly have a well-defined request for change (RFC) form that defines what information is required for the go/no-go decision.

Urgent Changes: These are emergency and potentially high-risk changes that must be put into production immediately. These changes often require senior management approval but allow documentation to be performed after the fact. A key goal of DevOps practices is to streamline the normal change process such that it is also suitable for emergency changes.
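As a rough illustration of the three categories above, the sketch below maps each one to the approval handling this summary describes. The names `ChangeCategory`, `ApprovalRoute`, and `approval_route` are hypothetical and come from neither ITIL nor the book. Real change management tooling will look different.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChangeCategory(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    URGENT = "urgent"


@dataclass(frozen=True)
class ApprovalRoute:
    approver: Optional[str]       # who must sign off; None means pre-approved
    document_after_deploy: bool   # True if documentation may follow the deployment
    log_deployment: bool = True   # every deployment is logged for traceability


def approval_route(category: ChangeCategory) -> ApprovalRoute:
    """Map a change category to the handling described in the summary (hypothetical helper)."""
    if category is ChangeCategory.STANDARD:
        # Pre-approved: the proposer does not wait for sign-off.
        return ApprovalRoute(approver=None, document_after_deploy=False)
    if category is ChangeCategory.NORMAL:
        # Reviewed by the agreed-upon change authority before deployment.
        return ApprovalRoute(approver="change authority", document_after_deploy=False)
    # Urgent: often needs senior management approval, documentation can follow.
    return ApprovalRoute(approver="senior management", document_after_deploy=True)
```

Modeling the route as data rather than as branching inside a deployment script keeps the policy in one place where it can be reviewed.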

Recategorize The Majority of Lower Risk Changes as Standard Changes

One way to support an assertion that changes are low risk is to show a history of changes over a significant time period and provide a complete list of production issues during that same period. Ideally, deployments will be performed automatically by configuration management and deployment pipeline tools and the results will be automatically recorded.

Creating this traceability and context should be easy and should not create an overly onerous or time-consuming burden for engineers. Linking to user stories, requirements, or defects is almost certainly sufficient.
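The evidence described in this section, a history of changes set against the production issues from the same period, can be summarized with very little code. The following is a minimal sketch under assumed record shapes. The field names `id`, `change_type`, and `caused_by`, and the function `change_failure_summary`, are illustrative only, and the sample data is made up.

```python
from collections import defaultdict


def change_failure_summary(deployments, incidents):
    """Count deployments and linked production issues per change type.

    deployments: list of dicts with 'id' and 'change_type' (assumed shape).
    incidents:   list of dicts with 'caused_by', a deployment id (assumed shape).
    """
    failed_ids = {incident["caused_by"] for incident in incidents}
    totals = defaultdict(int)
    failures = defaultdict(int)
    for deployment in deployments:
        totals[deployment["change_type"]] += 1
        if deployment["id"] in failed_ids:
            failures[deployment["change_type"]] += 1
    return {
        change_type: {
            "deployments": totals[change_type],
            "failures": failures[change_type],
            "failure_rate": failures[change_type] / totals[change_type],
        }
        for change_type in totals
    }


# Made-up sample data, for illustration only.
sample_deployments = [
    {"id": 1, "change_type": "content"},
    {"id": 2, "change_type": "content"},
    {"id": 3, "change_type": "content"},
    {"id": 4, "change_type": "schema"},
    {"id": 5, "change_type": "schema"},
]
sample_incidents = [{"caused_by": 4}]

summary = change_failure_summary(sample_deployments, sample_incidents)
```

A change type with a long record and no linked incidents is the kind of evidence that could support a request to treat it as a standard change.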

What To Do When Changes are Categorized as Normal Changes

The goal is to ensure that the change can be deployed quickly, even if it is not fully automated. Ensure that any submitted change requests are as complete and accurate as possible, giving the CAB everything they need to properly evaluate the change.

Because the submitted changes will be manually evaluated by people, it is even more important that the context of the change is described. The goal is to share the evidence and artifacts that give confidence that the change will operate in production as designed.

Reduce Reliance on Separation of Duties

For decades, developers have used separation of duty as one of the primary controls to reduce the risk of fraud or mistakes in the software development process. As complexity and deployment frequency increase, performing production deployments successfully increasingly requires everyone in the value stream to quickly see the outcomes of their actions.

Separation of duty can often impede this by slowing down and reducing the feedback engineers receive on their work. Instead, choose controls such as pair programming, continuous inspection of code check-ins, and code review.

Ensure Documentation and Proof For Auditors and Compliance Officers

As technology organizations increasingly adopt DevOps patterns, there is more tension than ever between IT and audit. These new DevOps patterns challenge traditional thinking about auditing, controls, and risk mitigation.

“DevOps is all about bridging the gap between Dev and Ops. In some ways, the challenge of bridging the gap between DevOps and auditors and compliance officers is even larger. For instance, how many auditors can read code and how many developers have read NIST 800-37 or the Gramm-Leach-Bliley Act? That creates a gap of knowledge, and the DevOps community needs to help bridge that gap.”

Bill Shinn, a principal security solutions architect at Amazon Web Services

To bridge that gap, teams can work with auditors during the control design process. Assign a single control to each sprint and determine what audit evidence it requires. Send all of that data into the telemetry systems so the auditors can get what they need entirely through self-service.

“In audit fieldwork, the most commonplace methods of gathering evidence are still screenshots and CSV files filled with configuration settings and logs. Our goal is to create alternative methods of presenting the data that clearly show auditors that our controls are operating and effective.”

The DevOps Handbook
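One possible way to make audit evidence self-service is to emit it as structured records that a telemetry system can index. The sketch below is an assumption about what such a record might contain, not a format from the book or from any audit standard. The function `control_evidence`, the control name, and the sample values are all hypothetical.

```python
import json
from datetime import datetime, timezone


def control_evidence(control_id: str, passed: bool, details: dict, observed_at: datetime) -> str:
    """Serialize one piece of audit evidence as a JSON line (hypothetical format)."""
    record = {
        "control_id": control_id,
        "passed": passed,
        "observed_at": observed_at.isoformat(),
        "details": details,
    }
    # sort_keys keeps the output stable, which makes records easier to compare.
    return json.dumps(record, sort_keys=True)


# Illustrative values only; a real pipeline would supply these.
line = control_evidence(
    "peer-review-required",
    True,
    {"commit": "abc123", "reviewer": "example-reviewer"},
    datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
)
print(line)
```

Records like this could be queried directly by auditors instead of being collected as screenshots or exported CSV files.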

Case Study: Relying on Production Telemetry for ATM Systems

Information security, auditors, and regulators often put too much reliance on code reviews to detect fraud. Instead, they should be relying on production monitoring controls in addition to using automated testing, code reviews, and approvals, to effectively mitigate the risks associated with errors and fraud.

“Many years ago, we had a developer who planted a backdoor in the code that we deploy to our ATM cash machines. They were able to put the ATMs into maintenance mode at certain times, allowing them to take cash out of the machines. We were able to detect the fraud very quickly, and it wasn’t through a code review. These types of backdoors are difficult, or even impossible, to detect when the perpetrators have sufficient means, motive, and opportunity.”

“However, we quickly detected the fraud during our regular operations review meeting when someone noticed that ATMs in a city were being put into maintenance mode at unscheduled times. We found the fraud even before the scheduled cash audit process, when they reconcile the amount of cash in the ATMs with authorized transactions.”

The DevOps Handbook
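The check that surfaced the fraud in this case study, maintenance mode being entered outside scheduled times, is the kind of thing production telemetry can flag automatically. The following is a minimal sketch that assumes maintenance events and scheduled windows are available as simple Python values. The function name, the data shapes, and the sample ATM ids are hypothetical and do not describe the system in the case study.

```python
from datetime import datetime


def unscheduled_maintenance(events, windows):
    """Return maintenance-mode events that fall outside every scheduled window.

    events:  list of (atm_id, entered_at) tuples (assumed shape).
    windows: dict mapping atm_id to a list of (start, end) datetimes (assumed shape).
    """
    flagged = []
    for atm_id, entered_at in events:
        scheduled = any(
            start <= entered_at <= end for start, end in windows.get(atm_id, [])
        )
        if not scheduled:
            flagged.append((atm_id, entered_at))
    return flagged


# Made-up sample data, for illustration only.
sample_windows = {
    "atm-001": [(datetime(2024, 3, 1, 2, 0), datetime(2024, 3, 1, 4, 0))],
}
sample_events = [
    ("atm-001", datetime(2024, 3, 1, 3, 0)),    # inside the scheduled window
    ("atm-001", datetime(2024, 3, 2, 23, 30)),  # outside any window
    ("atm-002", datetime(2024, 3, 1, 3, 0)),    # no windows scheduled at all
]

suspicious = unscheduled_maintenance(sample_events, sample_windows)
```

A report like `suspicious` would still need a person to review it, as happened in the operations review meeting described above.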