Book Club: The Phoenix Project (Chapters 17-20)

This entry is part 5 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 13-16 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 17

Bill takes his son to see the trains after quitting but is interrupted by multiple calls from Wes & Patty.

The inventory management systems are down. No one can get inventory levels in the plants or warehouses, and they don’t know which raw materials need to be replenished.

“Well, we’ve pretty much screwed the pooch since you’ve left,” Wes says, sounding genuinely abashed, confirming my worst fears. “Steve insisted that we bring in all the engineers, including Brent. He said he wanted a ‘sense of urgency’ and ‘hands on keyboards, not people sitting on the bench.’ Obviously, we didn’t do a good enough job coordinating everyone’s efforts, and…”


Steve Masters attempts to call Bill after calling his wife Paige. Eventually, Bill returns his call and listens to Steve’s apology.

Steve had promised to get “his hands dirty” with IT but hasn’t lived up to the promise. His delegation of IT to Sarah was a total screwup.

“I’m convinced that IT is a competency that we need to develop here. All I’m asking is that you spend ninety days with me and give it a try.”


Steve Masters convinces Bill to rejoin Parts Unlimited.

Chapter 18

Bill attends Steve’s IT Leadership Off-Site, which is actually located on the Parts Unlimited campus.

Wes, Patty, Chris, Erik, and Steve are all in attendance.

“Erik described the relationship between a CEO and a CIO as a dysfunctional marriage. That both sides feel powerless and held hostage by the other.”


“There are two things I’ve learned in the last month. One is that IT matters. IT is not just a department that I can delegate away. IT is smack in the middle of every major company effort we have and is critical to almost every aspect of daily operations.”


“The second thing I’ve learned is that my actions have made almost all our IT problems worse. I turned down Chris and Bill’s requests for more budget, Bill’s request for more time to do Phoenix right, and micromanaged things when I wasn’t getting the results I wanted.”


Steve apologizes to Bill, taking full responsibility for the failures of Phoenix and the audit.

Steve identifies trust as the primary issue.

“A great team doesn’t mean that they had the smartest people. What made those teams great is that everyone trusted one another. It can be a powerful thing when that magic dynamic exists.”


Five Dysfunctions of a Team: In order to have mutual trust, you need to be vulnerable.

Steve asks each person to share something about themselves.

Steve was the first person in his family to make it to college. He worked in a copper mine to pay for college. He eventually went on to work for a pipe manufacturing plant. Steve joined the ROTC to help pay for school and then joined the US Army.

Steve is an excellent officer with high ratings but none of his subordinates enjoy working with him. Steve commits to changing his ways.

“Over the next three decades, I became a constant student of building great teams that really trust one another. I did this first as a materials manager, then later as a plant manager, as head of Marketing, and later, as head of Sales Operations. Then twelve years ago, Bob Strauss, our CEO at the time, hired me to become the new COO.”


Steve asks for commitment from everyone to develop IT as a competency by starting to trust one another. Everyone in attendance nods in agreement, except for Bill. . .

Chapter 19

Bill nods in agreement as well.

Patty apologizes for reacting so coldly to Bill. She credits Bill for changing the IT Department.

“The goal of this exercise is to get to know one another as people. You’ve learned a bit about me and my vulnerabilities. But that’s not enough. We need to know more about one another. And that creates the basis for trust.”


Chris volunteers to start. He was born in Beirut and speaks four languages. He tells the story of his wife’s pregnancy complications and how it taught him not to be selfish.

Wes participates next. He was engaged three times and called off each before getting married. Wes races cars and has struggled with his weight.

Patty started as an art major but ended up switching majors five times in college. She dropped out of college to become a singer-songwriter, touring the country. She decided to work for Parts Unlimited because she couldn’t make a living as an artist.

Bill grew up in a family with an alcoholic father. He ran away from home and got into trouble. After being arrested, he chose to join the Marines.

Bill cries as he describes the lessons learned from the Marines: “What did I learn? That my main goal is to be a great father, not like the shitty father I had. I want to be the man that my sons deserve.”

“Solving any complex business problem requires teamwork, and teamwork requires trust. Lencioni teaches that showing vulnerability helps create a foundation for that.”


Steve identifies missing every commitment and schedule as a primary problem in IT. He surmises that the team is not good at making internal commitments.

Chris counters that his team hit their targets, including on Phoenix. However, Phoenix was a disaster. If success was Chris getting all the Phoenix tasks done, then they met their target. If success was putting Phoenix into production fulfilling business goals, then they failed.

Development does not factor in the work operations needs to complete.

Part of the problem is planning and architecture. Development is also waiting for operations to deploy because there is a backlog of work.

“Erik has helped me understand that there are four types of IT Operations work: business projects, IT Operations projects, changes, and unplanned work. But, we’re only talking about the first type of work, and the unplanned work that gets created when we do it wrong. We’re only talking about half the work we do in IT Operations.”


Bill realizes while discussing the types of work (the audit project specifically) they have forgotten to invite John. Steve takes a 15-minute break to invite John.

The IT staff is unsure how to make commitment decisions for projects, unlike the manufacturing plant. No capacity or demand analysis is done.

IT takes shortcuts, which means fragile applications in production, and firefighting, which leads to technical debt.

Technical debt compounds over time.

“If an organization doesn’t pay down its technical debt, every calorie in the organization can be spent just paying interest, in the form of unplanned work.”


“Unplanned work has another side effect. When you spend all your time firefighting, there’s little time or energy left for planning. When all you do is react, there’s not enough time to do the mental work of figuring out whether you can accept new work. So projects are crammed onto the plate, with fewer cycles available to each one, which means more bad multitasking, more escalations from poor code, which mean more shortcuts.”


Identify where the constraint is and then protect it. Ensure the constraint’s time is never wasted.

Bill believes Brent is the constraint for Parts Unlimited.

To fix the problems of IT, Bill proposes to stop doing all other non-Phoenix work to focus on improving their processes for two weeks.

Erik agrees, because the goal should be to increase the throughput of the entire system.

Steve promises to send out an email to the company announcing the work stoppage, to prevent managers from “strong arming” Operations into helping pet projects.

The team will identify the top areas of technical debt, which Development will tackle to decrease the unplanned work being created by problematic applications in production.

Chapter 20

The company has made great progress on Phoenix; more accomplished in 7 days than in the prior month.

The company experiences a Sev-1 incident that takes out internal phones and voicemail. The incident was caused by a vendor accidentally making changes to the production phone system. The team will put together a project to monitor critical systems for unauthorized changes.

“How do we currently prioritize our work? When we commit to work on a project, a change, a service request, or anything else, how does anyone decide what to work on at any given time? What happens if there are competing priorities?”


Priorities are typically based on the most senior person making the request or most recent request.

Erik and Bill take another trip to the manufacturing plant.

Understanding the flow of work is the first key to achieving the First Way.

Bill surmises that Brent is a worker supporting way too many work centers, which is why he’s a constraint.

“Every work center is made up of four things: the machine, the man, the method, and the measures. Suppose for the machine, we select the heat treat oven. The men are the two people required to execute the predefined steps, and we obviously will need measures based on the outcomes of executing the steps in the method.”


Bill is standardizing Brent’s work so others can execute it. Documenting the steps helps with consistency and quality.

Bill comes to the conclusion that only those projects that don’t require Brent are safe to begin work on again.

The monitoring project is the most important because it elevates the constraint: it takes unnecessary work off Brent’s plate by bypassing him.

Total Productive Maintenance

  • Do whatever it takes to assure machine availability by elevating maintenance
  • ‘Improving daily work is even more important than doing daily work.’

“The Third Way is all about ensuring that we’re continually putting tension into the system, so that we’re continually reinforcing habits and improving something. Resilience engineering tells us that we should routinely inject faults into the system, doing them frequently, to make them less painful.”


Improvement Kata: Mike Rother says it almost doesn’t matter what you improve, as long as you’re improving something. Because if you are not improving, entropy guarantees that you are getting worse, which ensures that there is no path to zero errors, zero work-related accidents, and zero loss.

Kata: repetition creates habits, and habits are what enable mastery

Just as important as throttling the release of work is managing the handoffs.

The wait time for a given resource is the percentage that resource is busy, divided by the percentage that resource is idle.

If a resource is fifty percent utilized, the wait time is 50/50, or 1 unit. If the resource is ninety percent utilized, the wait time is 90/10, or nine times longer.
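The rule of thumb above is easy to tabulate. The following is a minimal sketch, not something taken from the book: the method name `relative_wait_time` and the sample utilization values are assumptions chosen for illustration, and the result is in the same relative "units" the summary uses.

```ruby
# Relative wait time per the rule of thumb above:
# percent busy divided by percent idle.
# The method name and the sample values are illustrative assumptions.
def relative_wait_time(percent_busy)
  unless percent_busy >= 0 && percent_busy < 100
    raise ArgumentError, 'percent_busy must be at least 0 and below 100'
  end

  percent_busy.to_f / (100 - percent_busy)
end

# Print how quickly the wait grows as utilization approaches 100%.
[50, 80, 90, 95, 99].each do |busy|
  puts format('%2d%% busy -> %.1f units of wait', busy, relative_wait_time(busy))
end
```

The curve is steeply non-linear: moving a resource from 50 to 90 percent busy multiplies the wait by nine, and every further point of utilization costs more than the last.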

“A critical part of the Second Way is making wait times visible, so you know when your work spends days sitting in someone’s queue—or worse, when work has to go backward, because it doesn’t have all the parts or requires rework.”


The Security Projects from John don’t help scalability, availability, survivability, sustainability, security, supportability, or the defensibility of the organization. At present, they are not a good use of time.

From the Pipeline v10.0

This entry is part 10 of 25 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

Software Testing Podcasts

If you’re interested in learning more about testing and love podcasts, Software Testing Magazine has compiled a list of some popular testing podcasts.

A Primer on Continuous Testing

“Continuous testing shortens feedback loops through automated testing that occurs throughout the development lifecycle—hence “continuous.” Testing and QA become the responsibility of everyone working on the software, not just testers. Let’s look at some proven practices from organizations that have used continuous testing effectively to realize tangible benefits.”

Improve Your Test Automation Learning and Delivery with The Three Stream Method

Jon Ferguson Smart is the author of “BDD in Action”, one of my favorite tech books. He posts often on his blog and provides some solid advice on automation. In this post, he briefly discusses the Three Stream Method: the first stream is value, the second stream is quality or technical debt, and the third stream is learning. He links to a new ebook, “The Roadmap From Manual to Automated Testing”, which is recommended for anyone learning to adopt automation. He’s an excellent author so please give it a read.

Production Deploy with Every Check-In? You Gotta Go TWO Low!

Paul Grizzaffi is an automation architect for Magenic. In this guest post for Applitools he describes multiple issues that can occur during a deployment to prod by a developer, from visual issues to timing issues. There are two different costs to consider: cost of change and cost of failure. To learn more about both check out his post.

The Technical Debt Trap (VIDEO)

For a change of pace, here is an excellent conference presentation given by the great Doc Norton on Technical Debt. I highly recommend watching this video to understand the origins of technical debt and why so many orgs don’t devote time towards quality as an upfront cost. “Technical Debt has become a catch-all phrase for any code that needs to be re-worked. Much like refactoring has become a catch-all phrase for any activity that involves changing code. These fundamental misunderstandings and comfortable yet mis-applied metaphors have resulted in a plethora of poor decisions. What is technical debt? What is not technical debt? Why should we care? What is the cost of misunderstanding? What do we do about it? Doc discusses the origins of the metaphor, what it means today, and how we properly identify and manage technical debt. In this talk I’ll share how these four principles power world-famous companies and how they can help you work with greater speed, simplicity, safety and success.”

Cukes and Apples: Advanced Cucumber Steps

Welcome Back

In the previous post, we implemented the Page Object pattern to drive a simple Cucumber scenario. The steps used in that scenario are expressive enough, but not very reusable and not well-organized. In this post, we will explore some good practices for writing and using Cucumber steps for mobile test automation.

Get the code from the previous post here:

Arrange, Act, Assert

We recommend organizing general, reusable step definitions with a pattern used in unit testing: Arrange-Act-Assert. The Arrange-Act-Assert pattern divides step definitions into three logical groupings, predictably: arrange, act, and assert.

  • The “arrange” section sets up the preconditions necessary for a test to succeed or fail correctly. This will include things like logging in and navigating to pages.
  • The “act” section describes an action for which the result must be validated, like tapping a button or performing a gesture.
  • The “assert” section finishes a test by validating the result of the action which preceded it, by checking conditions like the visibility and value of page elements.

Organizing our step definitions according to the Arrange-Act-Assert pattern makes it easier for new contributors to learn the most reusable steps in the test suite and reminds us of the purpose of these steps as we use them.

Some steps are not easy to place in the Arrange-Act-Assert pattern – for example, a step which validates the display of a page might be used most often during the arrange and act sections of a test, but still constitutes a very good assertion. If you are not sure where to place a step, consider how the step will be used most often, how it will provide the most value, and what first impression a new collaborator should have.

Custom Steps

If the Arrange-Act-Assert pattern is followed too literally, and other categories of step are prohibited, the benefits of following the pattern are lost. Collections of arrange, act, and assert steps should contain only steps that are generalized and reusable, so that those steps stay easy to find.

Steps which are too specific for the general collection can be organized separately. For example, a step which handles user login is only applicable to the application under test, and does not describe a general mobile device interaction. Start small by collecting these steps in one place, like “custom_steps.rb”. As the custom steps collection grows in size and becomes unwieldy, identify related steps and create new step files for them.

Select Any Element

In the previous post, we used a step definition that selected a button on the Welcome page: “the user selects Next” or “the user selects Get started”. We could make that step more valuable by making it reusable. If this step could be written to select any element, then it could be used in more scenarios.

First, move the step into a file that aligns with the Arrange-Act-Assert pattern: “step_definitions/action_steps.rb”. This makes sense as an action step – many scenarios are likely to validate the result of tapping an element.

Next, update the step pattern to accept any element name.

We will parse the element name and send it to a page object to invoke the method which will select the named element. The element name that is written in our scenarios – and captured by the step – is likely to be mixed-case, and include spaces, so we need to modify it first. See the string modification below, and the “send” method which accepts it:
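A minimal sketch of that idea follows. It is written as plain Ruby so it runs without the Cucumber gem, and everything named in it is an assumption for illustration: the `WelcomePage` stand-in, its `select_next` and `select_get_started` methods, and the helpers `selector_for` and `select_element`. The page object from the previous post may name its methods differently.

```ruby
# Hypothetical stand-in for the Welcome page object from the previous post.
# The select_* method names are assumptions made for this sketch.
class WelcomePage
  def select_next
    :next_selected
  end

  def select_get_started
    :get_started_selected
  end
end

# Turn the element name written in a scenario into a method name,
# e.g. "Get started" -> "select_get_started".
def selector_for(element_name)
  "select_#{element_name.strip.downcase.gsub(/\s+/, '_')}"
end

# Invoke the matching method on the page object. In a Cucumber suite this
# body would sit inside a step definition in step_definitions/action_steps.rb,
# along the lines of:
#
#   When(/^the user selects (.+)$/) do |element_name|
#     WelcomePage.new.send(selector_for(element_name))
#   end
def select_element(page, element_name)
  page.send(selector_for(element_name))
end

select_element(WelcomePage.new, 'Next')
select_element(WelcomePage.new, 'Get started')
```

If the scenario names an element the page does not define, `send` raises `NoMethodError`, which fails the step with a message that names the missing method.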

That “send” method is an incredibly helpful construct that allows us to invoke a method on a page object without knowing the name of that method until runtime – that is, when we begin executing the test.

Because our scenario was written to select both “Next” and “Get started”, we also need to define a separate button named :get_started.

Now, the step above is capable of selecting any element on the Welcome page… but there aren’t many elements on that page. This step would be much more valuable if it could also select any element, on any page.

Any Element, Any Page

The step can be further modified to select an element on any page, but there is a catch. See @current_page below:

The @current_page variable will be familiar for users of the web automation gem page-object, which uses the same variable name for the same purpose. We can call the “send” method on an instance variable named @current_page and assume that preceding steps have set the variable, but we must update other steps.

Update the step “the app is on the Welcome page” to set @current_page.
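Taken together, the two steps might look like the sketch below. It is an illustration under stated assumptions rather than the code from this series: `ScenarioContext` is a hypothetical stand-in for the object Cucumber runs step blocks against, and the `WelcomePage` methods are invented for the example. The point it shows is that an instance variable set by the arrange step is still visible to the action step later in the same scenario.

```ruby
# Hypothetical page object; the real one comes from the previous post.
class WelcomePage
  def on_page?
    true
  end

  def select_get_started
    :get_started_selected
  end
end

# Stand-in for the object Cucumber runs step blocks against. Steps in one
# scenario share it, so @current_page set in one step is visible in the next.
class ScenarioContext
  # Given the app is on the Welcome page
  def app_is_on_welcome_page
    @current_page = WelcomePage.new
    raise 'Welcome page is not displayed' unless @current_page.on_page?
  end

  # When the user selects <element_name>
  def user_selects(element_name)
    raise 'No preceding step set @current_page' if @current_page.nil?

    method_name = "select_#{element_name.strip.downcase.gsub(/\s+/, '_')}"
    @current_page.send(method_name)
  end
end

scenario = ScenarioContext.new
scenario.app_is_on_welcome_page
scenario.user_selects('Get started')
```

The explicit nil check is a design choice for the sketch: without it, a scenario that skips the arrange step fails with a `NoMethodError` on `nil`, which is harder to diagnose.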

Now the step “the user selects <element_name>” can be used on any page, assuming the preceding step sets @current_page.

The other step, “the app is on the Welcome page”, would be even better if it could be used to describe any page.

Navigate to Any Page

To set @current_page with an instance of any page class, call Kernel.const_get and pass it the name of a page class. As with the element name above, it is necessary to manipulate the string first. Follow the example below to change page names into class names:
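A minimal sketch of that conversion, assuming page classes are named by joining the capitalized words of the page name and appending “Page”. The classes `WelcomePage` and `AccountSettingsPage` and the helpers `page_class_for` and `navigate_to` are hypothetical names for this example.

```ruby
# Hypothetical page classes; the names are assumptions for this sketch.
class WelcomePage
  def on_page?
    true
  end
end

class AccountSettingsPage
  def on_page?
    true
  end
end

# Turn a page name from a scenario into a class,
# e.g. "account settings" -> AccountSettingsPage.
# Kernel.const_get raises NameError if no such class is defined.
def page_class_for(page_name)
  class_name = page_name.split.map(&:capitalize).join + 'Page'
  Kernel.const_get(class_name)
end

# In a Cucumber suite this body would sit inside the step
# "the app is on the <page name> page" and assign the result to @current_page.
def navigate_to(page_name)
  page = page_class_for(page_name).new
  raise "#{page_name} page is not displayed" unless page.on_page?

  page
end

navigate_to('Welcome')
navigate_to('account settings')
```

Note that `capitalize` lowercases the rest of each word, so a page name like “iOS settings” would map to `IosSettingsPage` under this convention.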

Now the step definition “the app is on the <page name> page” can be used to navigate to any page and validate the visibility of that page using the “on_page?” method, assuming “on_page?” is implemented for the named page.

Further Optimizations

Fans of the page-object gem might be looking for the on_page method. Page-object uses a PageFactory module to manage the @current_page, which includes the on_page method used to create new instances of page classes and set the @current_page variable. Our test suite can do the same if we implement a factory method as page-object does.
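A factory method of that kind could be sketched as follows. This is a simplified illustration, not the page-object gem's implementation: the module below only mirrors the idea of creating a page instance, storing it in @current_page, and optionally yielding it to a block. The `ScenarioContext` and `WelcomePage` classes are hypothetical stand-ins.

```ruby
# Simplified, hypothetical factory module modelled on the idea described above.
module PageFactory
  # Create an instance of page_class, remember it as the current page,
  # and yield it to the block if one is given.
  def on_page(page_class)
    @current_page = page_class.new
    yield @current_page if block_given?
    @current_page
  end
end

# Hypothetical page object for the example.
class WelcomePage
  def on_page?
    true
  end
end

# Stand-in for the Cucumber step context. In a real suite the module would
# be mixed in with World(PageFactory) instead of include.
class ScenarioContext
  include PageFactory
  attr_reader :current_page
end

context = ScenarioContext.new
context.on_page(WelcomePage) do |page|
  raise 'Welcome page is not displayed' unless page.on_page?
end
```

With a factory like this, step definitions no longer assign @current_page by hand, which keeps the bookkeeping in one place.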


In this post, we used the Arrange-Act-Assert pattern to organize our steps by category and updated our step definitions to handle any element, on any page. By following the same principles, and leaning on constructs like “@current_page.send” and “Kernel.const_get”, we can write step definitions that will describe almost any user interaction in a generalized and reusable way.

Get the code from this post here:

Coming Up Next

Updating this test framework to support cross-platform execution will require access to some new hardware. Another post will explore execution with iOS and Android in the future, but for now this series will be on hold while we publish some other articles. Stay tuned!


Book Club: The Phoenix Project (Chapters 13-16)

This entry is part 4 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 8 – 12 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 13

The Phoenix crisis is still an issue on Monday, and the problems are front page news on technology sites.

Bill is at a Phoenix status meeting, and Steve says that they are massively screwing their customers and shareholders. He says that Sarah is not off the hook until all of the store managers say that they can transact normally.

Steve also wants to meet with Sarah, Chris, Bill, Kirsten, and Ann once the stores are off life support.

Once Steve leaves and slams the door, Sarah says she wants the usability issues fixed, but Bill and others tell her how impossible that is at the moment.

“We are keeping Phoenix alive by sheer heroics. Wes wasn’t joking when he said that we’re proactively rebooting all the front-end servers every hour. We can’t introduce any more instabilities. I propose code rollouts only twice a day and restricting all code changes to those affecting performance.”


The team produces a plan: all code commits must be tied to a defect number or they will be rejected.

Bill visits Ann and her team across the hall. They have tables covered in faxes that represent orders that need to be deduplicated or reversed.

On the wall, Ann’s team shows that 5,000 customers have had duplicate payments or missing orders, and they estimate that 25,000 more transactions still need to be investigated.

John also stops by to check out the activity. When he looks at an order, he tells Bill that they have a major problem.

John tells Bill that they are storing the CVV2 codes, which is against the law. John wants Bill to destroy all that information, but Bill says that they first need to take care of the transactions.

John remembers that the auditors are actually on site that day. Bill instructs him to not allow the auditors close to Ann’s team and the CVV2 information.

Later, John tells Bill that he may have some extra engineers to spare. Bill is thrilled by this since his team is literally at full capacity and is pulling all nighters.

“I then wonder if the fatigue is getting to me. Something is really screwy in the world when I’m finding reasons to thank Development and Security in the same day.”


Chapter 14

By late Monday, they have finally stabilized the Phoenix situation. The stores have working registers (although the fix is only temporary) and the company is no longer keeping sensitive cardholder data.

The leadership team is waiting outside Steve’s office, and Sarah comes out nearly in tears. Bill and Chris then take their turn to talk to Steve.

Steve says the company has nothing to show for the $20 million they’ve spent on Phoenix. He also says they may have lost loyal customers, and marketing is giving away $100 vouchers.

Bill gets frustrated that Steve didn’t follow his initial advice to delay Phoenix: “No offense, sir, but this is supposed to be news to me? I called you, explaining what would happen, asking you to delay the launch. You not only blew me off, you told me to try to convince Sarah. Where’s your responsibility in all of this? Or have you outsourced all your thinking to her?”

Steve responds by telling Bill that he needs some actual solutions from him. He also says that he needs the business to be able to tell him that it is no longer being held hostage by IT.

Steve goes on to say that the board is considering splitting up the company.

“Second, I’m done playing Russian roulette with IT. Phoenix just shows me that IT is a competency that we may not be able to develop here. Maybe it’s not in our DNA. I’ve given Dick the green light to investigate outsourcing all of IT and asked him to select a vendor in ninety days.”


Bill and Chris are shell-shocked and decide to meet for lunch. Bill mentions that Paige has told him that he shouldn’t trust Chris.

Chris says that maybe his group being outsourced wouldn’t be the worst thing in the world and wonders if it might be time for a change. He says he used to love his work but lately it is so hard to keep up with change.

“It’s harder than ever to convince the business to do the right thing. They’re like kids in a candy store. They read in an airline magazine that they can manage their whole supply chain in the cloud for $499 per year, and suddenly that’s the main company initiative. When we tell them it’s not actually that easy, and show them what it takes to do it right, they disappear. Where did they go? They’re talking to their Cousin Vinnie or some outsourcing sales guy who promises they can do it in a tenth of the time and cost.”


Chris says that it is getting harder and harder to hit dates. He was in a meeting where they were planning out work 3 years in the future, but he says that they can’t even effectively plan for one year.

Chris apologizes for his part in the Phoenix fiasco. He says that when he told Sarah a date for when code could be complete, he didn’t know that she would use it as a go live date.

Bill and Chris agree that they are worried that Sarah might try to stick the whole situation on them. They say she is like “Teflon” because nothing sticks to her.

Bill and Chris agree that they will meet once a week for the next few months.

Once back at the office, Bill gets an email from Chris. Chris tells him that they are throwing a celebration party since the Phoenix deployment is “finished” and invites Bill and his team.

Bill forwards the email to Wes and Patty, but Wes says his team still has a lot of work to do.

Chapter 15

The chapter opens on Wednesday with Bill taking Paige out to breakfast. She says she has never seen him this stressed. Bill tells her that he has no idea when life will be normal again.

Paige says she doesn’t know why Bill decided to accept the job. He thinks to himself that the organization is better off because of his contributions and that he is happy to be one of the people who can try to fend off the outsourcing.

Bill starts thinking about how the pay raise will help his family pay down their debt. Paige catches him wandering off in thought and says she wishes they picked someone else for the job.

Bill drops off Paige at home and sees that he has an email that Wes has forwarded him. The email is giving praise to the new change board and how it saved two different groups from making changes to the database and app servers at the same time.

Patty knocks on Bill’s door and tells him that she thinks they have a problem. She asks him to follow her to the Change Coordination Room.

“I groan. Every time Patty summons me there, it’s because of some new intractable problem. But problems, like dog poop left in the rain, rarely get better just by ignoring them.”


Bill notices that the change boards look different. There are barely any upcoming changes posted, and the cards are missing. Patty tells him that there are about 600 cards of changes that need to be rescheduled due to Phoenix.

Bill discovers that the fourth type of work that Erik had mentioned was unplanned work (the other three are: business projects, internal projects, and changes).

“That’s why Erik called it the most destructive type of work. It’s not really work at all, like the others. The others are what you planned on doing, allegedly because you needed to do it.”


“So much of what I’ve been trying to do during my short tenure as VP of IT Operations is to prevent unplanned work from happening: coordinating changes better so they don’t fail, ensuring the orderly handling of incidents and outages to prevent interrupting key resources, doing whatever it takes so that Brent won’t be escalated to. . .”


Bill goes outside and calls Erik. Erik asks Bill how he is doing “after Phoenix crashed and burned so spectacularly”, and asks him if he can tell him the four categories of work now.

“At the plant, I gave you one category, which was business projects, like Phoenix,” I say. “Later, I realized that I didn’t mention internal IT projects. A week after that, I realized that changes are another category of work. But it was only after the Phoenix fiasco that I saw the last one, because of how it prevented all other work from getting completed, and that’s the last category, isn’t it? Firefighting. Unplanned work.”


Erik asks Bill about the change board he’s been working on, and Bill describes it to him.

“You’ve put together tools to help with the visual management of work and pulling work through the system. This is a critical part of the First Way, which is creating fast flow of work through Development and IT Operations. Index cards on a kanban board is one of the best mechanisms to do this, because everyone can see WIP. Now you must continually eradicate your largest sources of unplanned work, per the Second Way.”


Bill explains all the chaotic events that he has been dealing with lately. Erik responds by pointing out that Brent is Bill’s constraint, and Bill is surprised.

“Well if we’re going to talk about your next steps, you definitely need to know about constraints because you need to increase flow. Right now, nothing is more important.”


Erik tells Bill that he hopes that he read The Goal by Eli Goldratt.

“Goldratt taught us that in most plants, there are a very small number of resources, whether it’s men, machines, or materials, that dictates the output of the entire system. We call this the constraint—or bottleneck. Either term works. Whatever you call it, until you create a trusted system to manage the flow of work to the constraint, the constraint is constantly wasted, which means that the constraint is likely being drastically underutilized.”


Erik describes the first 3 steps (of 5) in The Goal:

  1. Identify the constraint
  2. Exploit the constraint
  3. Subordinate everything else to the constraint

Erik tells Bill his homework is to figure out how to set the tempo of work according to Brent. He also tells Bill he is still missing a piece of the First Way in that he can’t distinguish what is important to the business and what isn’t.

“[Chris] is spending all his cycles on features, instead of stability, security, scalability, manageability, operability, continuity, and all those other beautiful ’itties. Remember, outcomes are what matter—not the process, not controls, or, for that matter, what work you complete.” – Erik

Chapter 16

Bill is at his desk when Ellen runs in with an email printout from Dick. It says something has gone wrong with the company invoicing systems. It was discovered that no customers were invoiced for 3 days.

Leadership gathers in the NOC room. Bill instructs everyone not to touch anything without approval from him.

The team investigates possible causes for the issue, and Patty’s team found over 20 different potential failures. Eventually they narrow it down to 8. They agree to reconvene at 10 pm.

As Bill is reading a book to his son, he checks the emails on his phone. He’s amazed at the difference in his team’s process: “During the last Sev 1 incident that hit our credit card processing systems, the conference call was full of finger-pointing, denials, and, most importantly, wasted time when our customers couldn’t give us money. Afterward, we did the first of a series of ongoing blameless postmortems to figure out what really happened and come up with ideas on how to prevent it from happening again. Better yet, Patty led a series of mock incident calls with all hands on deck, to rehearse the new procedures.”

At 9:15, Bill receives a call from Steve about the incident. Steve tells Bill that he just talked to Dick, and Dick said that Bill is dragging his feet. Steve is clearly angry.

Bill tries explaining his points again, but Steve cuts him off and asks if he’s in the office.
Steve: “We’ll probably miss almost every target that we’ve promised the board: revenue, cash, receivables—everything. In fact, every measure we’ve promised the board is going the wrong way! This screwup may confirm the board’s suspicion that we’ve completely lost control of managing this company!”

Steve tells Bill that he wants to see a sense of urgency, and that he should be getting people out of bed.

“Steve, if I thought it would help, I’d have everyone pull all-nighters in the data center tonight. For Phoenix, some people didn’t go home for nearly a week. Trust me, I know the house is on fire, but right now, more than anything, we need situational awareness. Before we send the teams crashing through the front door with fire hoses, we have to have someone at least quickly walk the perimeter of the yard — otherwise, we’ll end up burning down the houses next door!”


Steve replies to Bill that Brent disagrees with Bill’s approach. Bill responds that he hopes Brent is at home. He doesn’t want him working until they know exactly what’s wrong.

Steve tells Bill that they’re going to start doing things his way. He screams at Bill to call in Brent along with everyone else.

“You think I’m being overly cautious, and that I’m hesitating to do what needs to be done. But you are wrong. Dead wrong.”


Steve still is not convinced. Bill responds by telling Steve to do the work himself, and to expect Bill’s resignation in the morning.

From the Pipeline v9.0

This entry is part 9 of 25 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

Rethinking Your Measurement and Metrics for Agile and DevOps

In this short piece, Michael Sowers challenges the readers to consider updating their telemetry based on organizational change. In particular, start with the following criteria: (1) providing teams with quick feedback on how the quality of the project, product, and user stories is progressing; (2) understanding how the teams are progressing and what the roadblocks are; (3) knowing how effective and efficient the teams’ processes are; and, (4) understanding resource consumption, both human and computer.

Given-When-Then With Style

Gojko Adzic has partnered with Specflow for a series of articles to help people get the most out of Gherkin with some tips and tricks. Each week he will post a challenge for readers to answer about a particular example of Gherkin. In the first challenge, the reader must try to explain a missing value (A “Given” for a value that’s not supposed to be there).

What’s New in Selenium 4?

Selenium is a set of tools used in support of automation. In this article, Manoj walks us through several of the changes coming to Selenium. Relative locators will be a welcome update, as will installing and uninstalling add-ons for Firefox at runtime. The biggest may very well be the ability to use Docker to spin up containers. Anyone interested in checking out the changes can go to for more details.

Is There Such a Thing As Too Much Testing?

Bas Dijkstra posts again about the costs and misconceptions around test automation, namely that it’s the end goal and not a means to an end. Investing in test automation carries long-term maintenance costs, and it shouldn’t be considered the only type of validation performed by a team. Great advice in this piece on scaling automation.

Balance as an Important Part of Website Testing

In this article, Nataliia Syvynska explains two types of balance in web design: symmetrical balance and asymmetrical balance. In symmetrical balance, elements are equally disposed on either side of the center (vertically and horizontally). Asymmetrical balance focuses on one particular object with several elements. The article raises an interesting question about validations from a UI/UX perspective of how the user interacts with the system in a “pleasing” fashion.

Book Club: The Phoenix Project (Chapters 8-12)

This entry is part 3 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 4-7 HERE

The Phoenix Project

Chapter 8

Bill spends all weekend working on a PowerPoint slide deck for his meeting with Steve.

When Bill arrives at Steve’s office, he must wait while Sarah & Steve wrap up a call with analysts about the Phoenix project.

Sarah relays that the industry analysts are excited about Phoenix now, too. Bill wonders if they are over promising. By the time Sarah leaves Steve’s office, she has taken up nearly half of the time that Bill has scheduled with Steve.

Bill explains to Steve that IT is stretched dangerously thin. There are too many different projects competing for attention, and that the new audit project will affect the resources that are supposed to be dedicated to Phoenix. He states that he would like to know the relative priority of the audit work compared to the Phoenix work.

“We’ve started to inventory everything we’re being asked to do, regardless of how big or small. Based on the analysis so far, it’s clear to me that the demand for IT work far exceeds our ability to deliver. I’ve asked them to make more visible what the pipeline of work looks like, so we can make more informed decisions about who should be working on what and when.”

Bill Palmer

“What kind of bullshit prioritization question is this? If I went to my board and told them that I need to do either sales or marketing, and asked them which of those I should do, I’d be laughed out of the room. I need to do both, just like you need to do both! Life is tough. Phoenix is the top company priority, but that doesn’t mean you get to hold the SOX-404 audit hostage.”

Steve Masters

Bill tries to reason with Steve, and tells him that Phoenix and compliance share key resources, the infrastructure is too fragile and breaks often, and that some compliance work should be put on hold if Phoenix truly is the top priority.

Steve replies that delaying the audit work is out of the question, and that there is no way they can hire any more people. Any raises to the budget are out of the question, and it seems like Bill’s team is more likely to lose people rather than be able to hire new ones.

“My suggestion to you? Go to your peers and make your case to them. If your case is really valid, they should be willing to transfer some of their budget to you. But let me be clear: Any budget increases are out of the question. If anything, we may have to cut some heads in your area.”

Steve Masters

Bill tosses his presentation he worked on all weekend into the recycling bin as he leaves.

Bill then goes to the continuation of the CAB meeting. He is blown away by how many change cards are in the room, and the room is covered in white boards. He discovers that there have been 437 change requests submitted for the week.

“Let’s go back to our goals: get the left and right hands to know what the other is doing, give us some situational awareness during outages, and give audit some evidence that we’re addressing change control.”

“‘We need to focus on the riskiest changes,’ I continue. ‘The 80/20 rule likely applies here: Twenty percent of the changes pose eighty percent of the risk.’”

Bill Palmer

The team works on splitting up the cards into two groups: a risky group and a routine change group.

The group also decides to share the changes with business, along with data on how risky each change will be.

“We need to create some standard procedures around these changes—like when we’ll want them implemented—and have key resources not only aware of them but also standing by, just in case things go wrong—even the vendors.”


“There’s no reason why all the responsibility should rest on our shoulders. We can send an e-mail out to the business ahead of time and ask when the best implementation time would be. If we can give them data on the outcomes of previous changes, they may even withdraw the change.”


As the meeting concludes, the group feels positive about the change management work that they are doing. On the negative side, the amount of manual work the process is taking is too high, and the group agrees that it will need to be automated sooner or later.

Chapter 9

Bill sits in a high-level budget meeting with leadership (which he calls “the most ruthless budget meeting I’ve ever attended”) when he gets a text that there is a Sev 1 incident where all of the credit card processing systems are down. He is forced to leave the meeting even though he knows that he won’t have a chance to fight for his budget.

When he gets to the call with Patty and Wes, he is informed that the order entry systems are down, and the team is trying to establish what has changed.

Patty asks what the day’s changes were, but the conversation quickly spirals into defensiveness from each manager and finger pointing.

Bill chooses not to intervene in the conversation, and instead opts to simply sit back and observe the chaos.

Suddenly, someone on the phone speaks up and says, “try it now”. Bill tells everyone to hold it and discovers that the voice on the phone is Brent. Shortly after, someone states that the issue has been fixed.

Bill wraps up the call and calls Wes and Patty to meet privately. He tells Patty that she is in charge of presenting a timeline of all changes during incidents. He also says they will do a fire drill every 2 weeks to practice managing incidents.

Bill asks Wes to impress upon Brent that everyone must discuss their fixes during emergencies rather than just implementing them on their own.

Bill says that his guess is that Brent caused the outage on his own and then rushed to undo the change.

“I want you to host practice incident calls and fire drills every two weeks. We need to get everyone used to solving problems in a methodical way and to have the timeline available before we go into that meeting. If we can’t do this during a prearranged drill, how can we expect people to do it during an emergency?”


Moving forward, Bill and Wes spend nearly all their time in the Phoenix war room. The deployment is only three days away, and things are looking worse and worse.

The group has another CAB meeting, where everything has been organized. The group starts to review all high and medium risk changes.

Things are going very well, but Patty shows the group that they have 173 changes going in on Friday alone. The timeline is adjusted, and some members move their changes up in the week.

“‘If I were air traffic control,’ she continues, ‘I’d say that the airspace is dangerously overcrowded. Anyone willing to change their flight plans?’”


Bill begins thinking to himself about what Erik told him. He names three types of work: business projects, IT projects, and changes.

“Sure, each of these changes is much smaller than an entire project, but it’s still work. But what is the relationship between changes and projects? Are they equally important? And can it really be that before today, none of these changes were being tracked somewhere, in some sort of system? For that matter, where did all these changes come from? If changes are a type of work different than projects, does that mean that we’re actually doing more than just the hundred projects? How many of these changes are to support one of the hundred projects? If it’s not supporting one of those, should we really be working on it? If we had exactly the amount of resources to take on all our project work, does this mean we might not have enough cycles to implement all these changes?”


Chapter 10

The chapter starts in the Phoenix war room. William Mason, director of QA, informs the group that they are finding twice as many broken features as are getting fixed.

The group discovers that Brent is a bottleneck for many tasks.

Bill goes to Brent’s desk. When he arrives, Brent is on the phone and Bill observes him for a minute.

“I appreciate how Brent seems to genuinely care that everyone relying on IT systems can get their work done, but I’m dismayed that everyone seems to be using him as their free, personal Geek Squad. At the expense of Phoenix.”


Bill asks Brent how many calls he gets a day, and if he logs them anywhere. Brent says he does not log anything because it takes too long.

Brent says that his previous phone call was with the VP of logistics, and Bill is angry that executives are strong arming Brent into completing tasks.

Bill tells Brent that from now on his only priority is Phoenix. Bill leaves Brent and calls Patty and Wes to a meeting about how to handle escalations.

“‘Processes are supposed to protect people. We need to figure out how to protect Brent,’ I say. I then describe how I already told Brent to send everyone wanting anything to Wes.”


Patty suggests that Brent may be reluctant to give up his knowledge because he may view it as power. Bill responds, “Maybe. Maybe not. I’ll tell you what I do know, though. Every time that we let Brent fix something that none of us can replicate, Brent gets a little smarter, and the entire system gets dumber. We’ve got to put an end to that.”

Bill says that under the new system, everyone needs approval before talking to Brent, and everyone must document what they learn.

Bill states that to make sure everyone follows the new processes they will send the engineers to whichever conference they want. They will also give Brent a week off work with no on call responsibilities.

Chapter 11

The chapter opens with Patty calling Bill during his lunch because she wants him to check out something weird on the change calendar.

“I’m starting to think this entire change process is a total waste of time. Organizing all these changes and managing all the stakeholder communication is taking up three people full-time. Based on what I’m seeing now, it may be useless.”


Patty tells Bill that over the last week about 60% of scheduled changes have not actually been implemented.

She says they haven’t been implemented for several reasons: personnel, configuration work that wasn’t completed, and the need for Brent.

“Somehow, just like we’re breaking the habits of people asking Brent to help with break-fix work, we need to do the same with change implementation. We’ve got to get all this knowledge into the hands of people actually doing the work. If they can’t grok it, then maybe we have a skills problem in those teams.”


Bill thinks back to his conversation with Erik about WIP. Erik called WIP the silent killer. He had pointed to an ever-growing mountain of work on the plant floor as an indication that floor managers had failed to control their work in process.

Patty states that they will soon pass over 1,000 changes tracked. She wonders why they are doing the tracking work when the changes aren’t ever being implemented.

Bill is starting to believe that Erik was right and there really is a link between plant floor management and IT Operations.

He says that he believes that reversing the process change and allowing change work to go to Brent is the exact wrong thing to do. He also states that this process is worth it because they are now aware of how much scheduled work isn’t getting done, and that they now have “situational awareness”.

Chapter 12

“It’s not a good sign when they’re still attaching parts to the space shuttle at liftoff time.”


The Phoenix deployment was scheduled to start at 5:30 PM Friday, but it still has not started as of 7:30 because Chris’s team is still making changes. Phoenix is not available in the test environment and is still failing critical tests.

There are multiple issues, including the app only running on one developer’s machine and an unopened network port that is preventing the front end from talking to the back end.

Bill calls Wes, Patty and William into his office to talk. Wes says the team is still missing critical files and they are unable to configure the test environment correctly.

William says that his QA team is unable to keep up with all the code changes being made, and that his bet would be that Phoenix will blow up in production. He wants to stop the release but Chris and Sarah won’t allow it.

William doesn’t think they will have anything up by 8 AM the next day (when the stores open).

Wes tells Bill that they still have not reached the point of no return. That point will be when the team starts converting databases to interact with Phoenix and POS systems.

Bill tries to delay the deployment by emailing Steve, Chris and Sarah. He then calls Steve. He explains that he cannot overstate how badly the release has gone so far, and that it is not too late to stop this “train wreck”. He says that failure will jeopardize order data and customer records.

Steve explains that they don’t have a choice but to keep moving ahead. They have already bought ads for that weekend’s newspapers and their partners are ready to go.

Bill asks Steve how bad things have to be to delay the rollout. Steve says that if he can convince Sarah, then he will consider it.

Bill pulls Sarah aside to talk in the hallway. He asks her how it seems things are going from her point of view. She responds, “You know how these things go when we’re trying to be nimble, right? There’s always unforeseen things when it comes to technology. If you want to make omelets, you’ve got to be willing to break some eggs.”

Bill tells Sarah the same things that he told Steve, but she is unconvinced. She says that everyone is ready but Bill, and that they need to keep going. Wes taps Bill on the shoulder and tells him there is a problem.

“Remember when we hit the point of no return around 9 p.m.? I’ve been tracking the progress of the Phoenix database conversion, and it’s thousands of times slower than we thought it would be. It was supposed to complete hours ago, but it’s only ten percent complete. That means all the data won’t be converted until Tuesday. We are totally screwed.”


Wes says that performance is terrible, and even Brent can’t fix the problem. He also says that they cannot use virtualization to fix their server problems because development blamed the performance problems on the virtualization.

“The morning light is starting to stream in from the windows, showing the accumulated mess of coffee cups, papers, and all sorts of other debris. In the corner, a developer is asleep under some chairs.”


Maggie, the Senior Director of Retail Program Management, is kicking off the 7 AM emergency meeting. She says that all the in-store POS systems will be down because of the database issue. The good news is the Phoenix site is up and running.

“We need to get proactive here,” I say to Sarah. “We need to send out a summary to everyone in the stores, as quickly as possible outlining what’s happened and more specific instructions on how to conduct operations without the POS systems.”


At 2pm Saturday, Bill says the bottom is further down than he thought. All transactions are being processed manually. The customers on the website are complaining about how it is slow and unusable.

Bill finally leaves to catch a few hours of sleep while Wes stays behind to look over everything.

Wes calls Bill at 4:30 and says, “Bad news. In short, it’s all over Twitter that the Phoenix website is leaking customer credit card numbers. They’re even posting screenshots. Apparently, when you empty your shopping cart, the session crashes and displays the credit card number of the last successful order.”

Slaying the Hydra: Run-Time State and Splitting Up the Execution

This entry is part 3 of 5 in the series Slaying the Hydra

In this third post of the blog series on parallel test execution, I explain how to execute distributed parallel test automation. The previous entry can be found here.

As discussed previously, the running stage (see below) within the pipeline context is set to execute three builds of the test_runner freestyle job in parallel. Each build receives the following parameters:

  • browser – either equal to ‘ie’ or ‘chrome’
  • total_number_of_builds – equal to ‘3’
  • build_number – equal to ‘1’, ‘2’ or ‘3’

Freestyle Job Overview

In the following sections, I explain which freestyle components need to be utilized when constructing the test_runner job in Jenkins.


As seen from the image above, parameters are being passed from the pipeline job into the freestyle job. We will update the freestyle job to be parameterized. This selection is made when configuring the Jenkins job (see below).

Next the freestyle job is configured with these parameter names:

  • browser –  the value received from the pipeline parameter value.
  • total_number_of_builds –  the value received from the pipeline parameter value.
  • build_number – the value received from the pipeline parameter value.
  • workspace_location – to show a different way of doing things, we can see from the image above that I did not pass a value for workspace location in the pipeline. When I configured the parameter (below), I set a default value in the freestyle job. This default value will be linked to the workspace_location parameter now unless I otherwise specify.
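As a rough illustration of how these four parameters might look from the Ruby side, here is a minimal sketch. It relies on Jenkins exposing build parameters to build steps as environment variables. The helper name job_parameters and the default path are hypothetical, and in the real job the default lives in the Jenkins parameter configuration rather than in code.

```ruby
# Hypothetical default; the post does not state the real path.
DEFAULT_WORKSPACE_LOCATION = 'C:/jenkins/shared_results'

# Collects the four job parameters from an environment-like hash.
def job_parameters(env = ENV)
  {
    browser: env.fetch('browser'),
    total_number_of_builds: Integer(env.fetch('total_number_of_builds')),
    build_number: Integer(env.fetch('build_number')),
    # Falls back to the default when the pipeline passes no value.
    workspace_location: env.fetch('workspace_location', DEFAULT_WORKSPACE_LOCATION)
  }
end

params = job_parameters(
  { 'browser' => 'chrome', 'total_number_of_builds' => '3', 'build_number' => '2' }
)
puts params[:workspace_location]
```

Because no workspace_location is supplied in the usage lines, the sketch prints the default, mirroring what happens when the pipeline leaves that parameter out.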

Node Selection

In this section we restrict where this build can execute to machines associated with the @local tag only. This setting is located in the Manage Jenkins > Manage Nodes section of Jenkins. It ensures that we do not use nodes that are otherwise occupied or not configured to run the cucumber tests in the steps below.

Version Control

In the Source Code Management section, we specify what testing suite to retrieve via version control and utilize for this effort, which will pull the suite down within the workspace. The “clean before checkout” additional behavior (Jenkins functionality) will remove any files in the workspace that are not in the Git repo before pulling the suite down. This allows for a clean slate for every execution.

Splitting Code

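# Splits the @regression scenarios across parallel builds and re-tags this build's share.
class Splitter
  # Number of parallel builds, read from the total_builds environment variable.
  def total_builds
    ENV['total_builds'].to_i
  end

  # 1-based number of this build, read from the build_number environment variable.
  def build_number
    ENV['build_number'].to_i
  end

  def main_run
    scenarios = feature_iterator
    splits = job_splitter(scenarios)
    assignment = job_assigner(splits)
    feature_mod_iterator(assignment, 'features', true)
  end

  # Re-tags every assigned scenario from @regression to @split_builds on disk.
  def feature_mod_iterator(split_assignment, current_location = 'features', assign = true)
    files = return_all_files(current_location, '*', 'feature')
    split_assignment.each do |value|
      mod_value = value.gsub('@regression', '@split_builds')
      # Escape the scenario text so characters such as parentheses match literally.
      regex = /#{Regexp.escape(value)}$/
      files.each do |file|
        output = File.open(file, 'r', &:read)
        modified = output.gsub(regex, mod_value)
        File.open(file, 'w+') { |f| f.print(modified) } if assign
      end
    end
  end

  # Collects every @regression scenario found under current_location.
  # The loop body is missing from the original listing; this is an assumed reconstruction.
  def feature_iterator(current_location = 'features')
    files = return_all_files(current_location, '*', 'feature')
    array = []
    files.each do |file|
      array.concat(return_all_gherkin_scenarios(file))
    end
    array
  end

  # Returns each @regression tag line together with the Scenario line that follows it.
  def return_all_gherkin_scenarios(file)
    output = File.open(file, 'r', &:read)
    output.scan(/(@regression.*\n. (Scenario:|Scenario Outline:)?.*)/).map { |value| value[0] }
  end

  # The body is missing from the original listing; a sorted recursive glob is assumed.
  def return_all_files(current_location, filter = '*', file_type = '*')
    Dir.glob(File.join(current_location, '**', "#{filter}.#{file_type}")).sort
  end

  # Builds one array of scenarios per build. The lines that drained mod_scenarios
  # are missing from the original listing and are reconstructed here with shift.
  def job_splitter(scenarios)
    split = scenarios.length.to_i / total_builds.to_i

    container = []
    total_builds.times { container.push([]) }
    mod_scenarios = scenarios.clone

    total_builds.times do |index|
      # shift removes and returns the first `split` scenarios.
      container[index].concat(mod_scenarios.shift(split))
    end

    # Spread any remainder over the first builds, one scenario each.
    mod_scenarios.each_with_index do |value, index|
      container[index].push(value)
    end
    container
  end

  # Picks the array that belongs to this build.
  def job_assigner(scenarios)
    scenarios[(build_number.to_i - 1)]
  end
end

# Assumed entry point; the original listing is cut off after "one =".
one = Splitter.new
one.main_run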

At a high level, the code block above is creating an array of arrays that split up the regression tests evenly between the number of executors. The build_number value is utilized to access the corresponding index value of the array. All of the tests in that location are re-tagged from @regression to @split_builds locally on the workspace that houses the Ruby/Cucumber code pulled down from version control.

You would have to change the @regression tag to whatever you are utilizing to tag your tests as regression on your team.
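If your team uses a different tag, one option is to make the tag a parameter instead of hard-coding it. The sketch below is only an illustration: the retag helper, its keyword arguments, and the @nightly tag are hypothetical and not part of the original suite.

```ruby
# Re-tags one scenario block inside a feature file's text.
# from_tag is whatever tag your team uses to mark regression tests.
def retag(feature_text, scenario_block, from_tag: '@regression', to_tag: '@split_builds')
  replacement = scenario_block.gsub(from_tag, to_tag)
  # Regexp.escape keeps characters such as parentheses in scenario names literal.
  feature_text.gsub(/#{Regexp.escape(scenario_block)}$/) { replacement }
end

feature = <<~FEATURE
  Feature: Login

    @nightly
    Scenario: Valid login (admin)
      Given a user

    @nightly
    Scenario: Invalid login
      Given a user
FEATURE

block = "@nightly\n  Scenario: Valid login (admin)"
puts retag(feature, block, from_tag: '@nightly')
```

Only the first scenario is re-tagged; the second keeps its @nightly tag because it was not part of the assigned block.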

The cool thing is that this will run on each of the three workspaces and re-tag a unique subset of tests. Because the total_builds value is the same for all the jobs kicked off, it will create the same nested array structure on every workspace. The difference between workspaces comes about because of the build_number parameter that chooses which subset of tests to re-tag.
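To make that concrete, here is a small standalone sketch of the same idea, assuming seven scenarios and three builds. The split_scenarios helper and the scenario names are hypothetical stand-ins for the Splitter class and the real feature files.

```ruby
# Builds one bucket of scenarios per parallel build.
def split_scenarios(scenarios, total_builds)
  per_build = scenarios.length / total_builds
  remaining = scenarios.dup
  # Each build first receives an equal share...
  buckets = Array.new(total_builds) { remaining.shift(per_build) }
  # ...then leftovers are handed out one at a time, starting with build 1.
  remaining.each_with_index { |scenario, index| buckets[index] << scenario }
  buckets
end

scenarios = %w[s1 s2 s3 s4 s5 s6 s7]

(1..3).each do |build_number|
  # Every workspace computes the same buckets; only the index differs.
  assigned = split_scenarios(scenarios, 3)[build_number - 1]
  puts "build #{build_number}: #{assigned.join(', ')}"
end
# build 1: s1, s2, s7
# build 2: s3, s4
# build 3: s5, s6
```

Since every workspace starts from the same scenario list and the same total, no coordination between builds is needed; the build number alone decides which bucket a workspace re-tags.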

Running the Split Code

We should house the code above within our testing framework in version control. Within the Build section of Jenkins we then create a Windows batch command. Next we set the environment variables that the code utilizes, total_builds and build_number, equal to the parameters set within the freestyle job. We can now run the ruby command, passing the path to the .rb file that houses the code within the workspace (in reference to the code above).

Running the Tests

We set up another Windows batch command to set environment variables for browser and or_tags, and in this instance, we kick off the tests utilizing a rake task. Cucumber Rake is a useful tool, but we could just as easily run a Cucumber command.

The important thing is that we are passing the tag that was modified locally on each workspace (@split_builds) so that only the tests assigned to that workspace run. Additionally, we pass along the browser variable that was set within the pipeline and handed to the freestyle job.
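For the plain Cucumber command route, a minimal sketch could look like the following. The cucumber_invocation helper and the results.json report name are hypothetical. The --tags, --format, and --out options are standard Cucumber command-line options, but how the real suite consumes browser and or_tags is not shown in this post, so treat the environment hash as an assumption.

```ruby
# Builds the environment and argv for a Cucumber run limited to this workspace's tag.
def cucumber_invocation(browser:, tag: 'split_builds', report: 'results.json')
  env = { 'browser' => browser, 'or_tags' => tag }
  argv = ['cucumber', '--tags', "@#{tag}", '--format', 'json', '--out', report]
  [env, argv]
end

env, argv = cucumber_invocation(browser: ENV.fetch('browser', 'chrome'))
puts argv.join(' ')
# prints: cucumber --tags @split_builds --format json --out results.json
# To actually launch the run: system(env, *argv)
```

Returning the environment and arguments separately keeps the sketch easy to inspect before anything is executed.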

Storing Results

In our last batch command, we are extracting the json test results file and storing it on the workspace_location as a json file named with the build_number value (either 1, 2, or 3). This workspace location is the same as what we utilized in the clearing stage and what will be utilized in the consolidation stage.  
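The job itself does this with a batch command. As an equivalent illustration in Ruby, the sketch below copies a report into the shared location under the build number. The store_results helper and the temporary paths are hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Copies this build's JSON report into the shared location as <build_number>.json.
def store_results(report_path, workspace_location, build_number)
  FileUtils.mkdir_p(workspace_location)
  destination = File.join(workspace_location, "#{build_number}.json")
  FileUtils.cp(report_path, destination)
  destination
end

# Demonstration in a throwaway directory so nothing real is touched.
Dir.mktmpdir do |dir|
  report = File.join(dir, 'results.json')
  File.write(report, '[]')
  puts store_results(report, File.join(dir, 'shared'), 2)
end
```

Naming each file after its build number keeps the three builds from overwriting one another in the shared location.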

Review and Next Steps

To review, in this post, we figured out how to build the freestyle job that is responsible for splitting, executing, and storing the results of our tests.

In the next post, we discuss how to consolidate the information from the freestyle job builds into a concise cucumber report.

From the Pipeline v8.0

This entry is part 8 of 25 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

From Test Management to Continuous Delivery

Seb Rose and Dana Prey recently hosted a webinar on (now a SmartBear tool) about the evolution of testing to support continuous delivery. “This webinar will define Test Management and Continuous Delivery and go on to explore typical challenges you’ll encounter on your journey towards CD. We’ll describe small steps that you can use to mitigate the risks of changing the way you work, and the value that can be released from the start.”

Information Loss in Software Testing

Matt Heusser describes the level of information loss about a project or product as it moves up through the chain of command, as well as the negative aspects of controlling information about an application for your personal benefit (job security). He provides several alternatives to conveying information such as coverage maps and dashboards to help contain organizational information loss.

Clear, Direct Communication: An Experiment

Kent Beck posts a personal piece about communication with others through his life. The piece is an important introspection from someone who is professionally successful and considered a luminary in our field, yet still struggles with interpersonal connections.

Fighting Against Technical Debt

Cukenfest was held virtually this past week. While the videos are not posted yet, Gaspar Nagy has posted his presentation to SlideShare. His talk about technical debt is distilled into three focus areas: Reversibility, Reaction, and Sustainability.

DevOps Journey Playbook

The DevOps Institute have gathered lots of great background information on aspects of DevOps into a single location as a series of playbooks. “Playbooks are a collaborative body of knowledge of research, knowledge and artifacts to help you understand and SKILup your DevOps capabilities. A playbook is populated with twelve research chapter reports plus additional content for ongoing discovery and support during your DevOps journey. We continuously update the playbook with regional and global perspectives for actionable strategies and implementations.”

Book Club: The Phoenix Project (Chapters 4-7)

This entry is part 2 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 1-3 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 4

Bill is inundated with emails and voicemails after just one day on the job. One high-priority email comes from Sarah Moulton (SVP of Retail Operations) regarding delays in the Phoenix Project.

Development on the Phoenix Project is behind and they have not considered how to test and deploy the application. This is typical for handoffs between Development and IT Operations at Parts Unlimited.

“The majority of our marketing projects can’t be done without IT. High touch marketing requires high tech. But if there’s so many of us assigned to these Marketing projects, shouldn’t they be coming to us?”

Bill Palmer


  • Kirsten Fingle, Project Management Office. She is organized, levelheaded, and a stickler for accountability.
  • Sarah Moulton, SVP of Retail Operations.
  • Chris Allers, VP of Application Development and acting CIO. Has a reputation as a capable and no-nonsense manager.

The Phoenix Project team has grown by 50 people in the last two years, many through offshore development shops.

Steve Masters attends the Phoenix Project project management meeting. The project has been red for four weeks. Sarah Moulton attacks Bill’s team for the delays.

“See, Bill, in order for us to increase market share, we must ship Phoenix. But for some reason, you and your team keep dragging your feet. Maybe you’re not prioritizing correctly? Or maybe you’re just not used to supporting a project this important?”

Sarah Moulton

Parts Unlimited has spent over $20 million on Phoenix and is two years late.

Chris says Phoenix can be delivered in a few weeks but Wes is not convinced. It would take three weeks just to order the infrastructure necessary and the performance of Phoenix is slow. Additionally, Operations does not have a specification on how the production and test systems will be configured.

“I’ve seen this movie before. The plot is simple: First, you take an urgent date-driven project, where the shipment date cannot be delayed because of external commitments made to Wall Street or customers. Then you add a bunch of developers who use up all the time in the schedule, leaving no time for testing or operations deployment. And because no one is willing to slip the deployment date, everyone after Development has to take outrageous and unacceptable shortcuts to hit the date.”

Bill Palmer

Bill tries to convince Steve to delay the release of Phoenix to no avail. Phoenix impacts thousands of point of sale systems and all of the back-office order entry systems.

After the meeting, Bill and Wes conclude that they’re going to have to get a huge team of their employees together in a room to make the release happen and will also need members of Chris’s team. They also need to free up Brent from fire fighting so that he can help solve problems at the roots.

To make things worse, Bill gets the dreaded blue screen of death on his laptop. His new secretary, Ellen, informs them that a lot of people are experiencing the issue.

Bill attends the CAB (change advisory board) meeting which Patty runs. They are the only two people in attendance. Bill sends out an email to the org stating that all relevant people must attend another mandatory CAB meeting on Friday afternoon.

Bill is given a replacement laptop that is ~10 years old since the help desk team was unable to fix his blue screen of death.

Wes talks to Bill and objects to his mandatory CAB meeting. He says last time the org tried to enforce this it bogged down all his developers in paperwork and they were unable to be productive.

Chapter 5

Bill wakes up the next day to an email from Steve. They need to meet with Nancy Mailer, the Chief Audit Executive. The auditors have uncovered some issues that need to be discussed.

The room is quiet when Bill arrives at the 8 AM meeting. Also in attendance are John, Wes, and Tim, an IT auditor.

The auditing team has found nearly a thousand issues, although only 16 of them are “significant deficiencies”.

Nancy requires a management response letter which includes a remediation plan. Normally the remediation of these issues takes months, but Bill’s team is only given a few weeks before the external auditors arrive.

John tries to grandstand and state his team is on top of things, but that doesn’t seem to be the case. Bill finds out that John’s fix that broke the payroll system may not have even been necessary since it’s out of scope for this audit.

When Bill asks what the most important issue is, he is told: “The first issue is the potential material weakness, which is outlined on page seven. This finding states that an unauthorized or untested change to an application supporting financial reporting could have been put into production. This could potentially result in an undetected material error, due to fraud or otherwise. Management does not have any control that would prevent or detect such a change.”

Nancy Mailer

Bill is also told his team was unable to produce any change meeting minutes, which he already knows but pretends that this is news to him.

After some more discussion and confrontation between Wes and John, Bill agrees to get with his team and come up with a plan, even though everyone is already buried with Phoenix project work.

Wes and Bill stick around after the meeting to talk. Bill is beginning to get the impression that it’s hard to do much of anything without Brent. Wes says they tried to hire some other people at the same level as Brent but they have either left or aren’t as good as Brent.

Bill also discovers that there is no overall backlog of work. They have no visibility into how many business projects and infrastructure projects are in progress.

“We also have all the calls going into the service desk, whether it’s requests for something new or asking to fix something. But that list will be incomplete, too, because so many people in the business just go to their favorite IT person. All that work is completely off the books.”


The team (Bill, Patty, and Wes) set out to get a list of organizational commitments from their key resources, with a one-liner on what they’re working on and how long it will take. Bill will take all of Patty and Wes’s data to Steve on Monday to frame an argument for needing more people.

Chapter 6

Bill realizes during a status meeting that the development team is even more behind than he had feared, and almost all testing is being deferred to the next release.

Patty and Wes have put together data for what all their people are working on, and they share it with Bill. They discover that they have a high number of projects compared to the number of people, and their people to projects ratio is going to be about 1:1.

Most of the Operations resources are committed to Phoenix, and the 2nd largest project is Compliance. They also mention the compliance project would take all of their resources almost an entire year.

“Most of our resources are going to Phoenix. And look at the next line: Compliance is the next largest project. And even if we only worked on compliance, it would consume most of our key resources for an entire year! And that includes Brent, by the way.”


The 3rd largest project is incident and break-fix work, which is currently taking about 75% of the staff’s time.

Patty states that the one consistent theme in the interviews was that everyone struggles to get their project work done. When they do have time, the business is constantly making requests.

The numbers show that they will need to hire seven people so that everyone can complete their work.

Later that day, everyone attends a meeting for the Change Advisory Board (CAB).

“We need to tighten up our change controls, and as managers and technical leads, we must figure out how we can create a sustainable process that will prevent friendly-fire incidents and get the auditors off our back, while still being able to get work done. We are not leaving this room until we’ve created a plan to get there. Understood?”


The group starts off by stating that the change management tool is impossible to use. Bill calls a 10-minute break since things are slowly getting away from him. When the meeting reconvenes, Bill states that they must record all the necessary changes that must take place over the next 30 days.

Everyone dives in and starts taking the change management meeting seriously; however, the discussions for individual changes go on for a lot longer than anticipated. To keep it simple, they request (1) who is planning the change, (2) the system being changed, and (3) a one-sentence summary.

The team comes up with a definition of change: “a ‘change’ is any activity that is physical, logical, or virtual to applications, databases, operating systems, networks, or hardware that could impact services being delivered.”

Parts Unlimited IT Operations Team

Later, Patty calls Bill and says that they can expect about 400 changes to be submitted that need to happen the next week. Bill tells Patty that all Monday changes can go through without being authorized, but that all changes for later in the week will have to be reviewed.

Chapter 7

Bill gets a call that a potential new board member, Erik Reid, is in town and needs to talk with all the IT executives. Bill decides to meet with Erik even though it’s been a long day.

Bill mistakes Erik for a deliveryman since Erik is wearing wrinkled khakis and an untucked shirt. Erik seems to have trouble remembering names of people he’s met but has assessed the IT situation accurately.

“It looks like you’re in a world of hurt. IT Operations seems to have lodged itself in every major flow of work, including the top company project. It has all the executives hopping mad, and they’re turning the screws on your Development guy to do whatever it takes to get it into production.”

Erik Reid

Erik then takes Bill to one of the company’s manufacturing plants to learn about WIP. WIP is “work in progress”.

“In the 1980s, this plant was the beneficiary of three incredible scientifically-grounded management movements. You’ve probably heard of them: the Theory of Constraints, Lean production or the Toyota Production System, and Total Quality Management. Although each movement started in different places, they all agree on one thing: WIP is the silent killer. Therefore, one of the most critical mechanisms in the management of any plant is job and materials release. Without it, you can’t control WIP.”

Erik Reid

Erik talks to Bill about prioritizing work, and why bottlenecks are important to selecting work. Bill says that running IT operations is not like running a factory, but Erik disagrees with him.

The Theory of Constraints:

“Eliyahu M. Goldratt, who created the Theory of Constraints, showed us how any improvements made anywhere besides the bottleneck are an illusion. Astonishing, but true! Any improvement made after the bottleneck is useless, because it will always remain starved, waiting for work from the bottleneck. And any improvements made before the bottleneck merely result in more inventory piling up at the bottleneck.”

“Your job as VP of IT Operations is to ensure the fast, predictable, and uninterrupted flow of planned work that delivers value to the business while minimizing the impact and disruption of unplanned work, so you can provide stable, predictable, and secure IT service.”

Erik Reid

The Three Ways:

“The First Way helps us understand how to create fast flow of work as it moves from Development into IT Operations, because that’s what’s between the business and the customer. The Second Way shows us how to shorten and amplify feedback loops, so we can fix quality at the source and avoid rework. And the Third Way shows us how to create a culture that simultaneously fosters experimentation, learning from failure, and understanding that repetition and practice are the prerequisites to mastery.”

Retrieve Fantasy Football Stats using ESPN’s API: Part 2

Hello again and welcome to part two of our tutorial on how to scrape data from ESPN’s fantasy football API using Ruby. Last time we left off with our basic connection to ESPN, and we had retrieved some solid data. Let’s continue to pull more data and parse it.

First, we have a little bit of cleanup. There are some global variables sitting around that we’d like to get rid of, and we’re also going to be adding static data to reference. So let’s create a data module to house these objects and name it DataSource. We can start by moving our SWID, S2, and league ID (if applicable) variables into this file and assigning them as constants instead of global variables.
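As a minimal sketch of what data_source.rb could look like: SWID and S2 are the constant names used later in this post, while LEAGUE_ID and every value shown are placeholders assumed for illustration.

```ruby
# data_source.rb
# Placeholder values only; substitute your own cookie values and league ID.
# LEAGUE_ID is an assumed name; SWID and S2 match the names used in main.rb.
module DataSource
  SWID = '{YOUR-SWID-COOKIE}'.freeze
  S2 = 'YOUR-ESPN-S2-COOKIE'.freeze
  LEAGUE_ID = '0000000'.freeze
end
```

Keeping these values as constants in one module means main.rb no longer needs any global variables for them.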

Now that we are working with more than one file, we’ll need to pull these files into main.rb. Since we expect to have only one directory, we can keep this simple and add just our root directory to the load path. Let’s create a constant in main.rb called ROOT_DIR that will look like this:

ROOT_DIR = File.join(File.dirname(__FILE__))

Then we can add that to our load path with this statement:
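One common way to write this, sketched under the assumption that ROOT_DIR is defined as above; `$LOAD_PATH.unshift` is standard Ruby, though the exact form you use may differ:

```ruby
# Assumes this runs in main.rb; ROOT_DIR mirrors the constant defined above.
ROOT_DIR = File.join(File.dirname(__FILE__))

# Put the root directory at the front of Ruby's load path so that
# `require 'data_source'` can resolve files that live next to main.rb.
$LOAD_PATH.unshift(ROOT_DIR) unless $LOAD_PATH.include?(ROOT_DIR)
```

The `unless` guard simply avoids adding the same directory twice if the file is loaded more than once.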


Now we’ll easily be able to pull any files we create in our Root path. Finally we’ll want to require our DataSource module like so:

require 'data_source'
include DataSource

We could loop through our root directory and require every .rb file, but this might be overkill for now. Now that we have access to our DataSource file, we can remove those ugly global variables and update the references to them in our code.

Now we’re ready to start looping through each week to pull down all the various statistics that we’re looking for. The general flow of our code will be the following:

  1. Make an API call for each week of the season to pull in the data. In this case, we will use 2019.
  2. Loop through each team that played that week.
  3. Loop through each player on that team’s roster and parse out their stats.

Simple enough, right? So, let’s take a look at the data that we pulled down in part 1 to look at what data is relevant to us. For now, we will be concerned with the Teams key in our Hash. The teams key is structured like so:
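An abbreviated sketch of that shape, limited to the fields this tutorial reads. The values are made up for illustration, and the real response carries many more keys:

```ruby
# Hypothetical, trimmed-down version of data['teams'].
teams = [
  {
    'id' => 2,
    'roster' => {
      'entries' => [
        {
          'playerId' => 12345,
          'lineupSlotId' => 0,
          'playerPoolEntry' => {
            'player' => {
              'firstName' => 'Example',
              'lastName' => 'Player',
              'defaultPositionId' => 1,
              'stats' => [
                { 'scoringPeriodId' => 1, 'statSourceId' => 0, 'appliedTotal' => 21.4 },
                { 'scoringPeriodId' => 1, 'statSourceId' => 1, 'appliedTotal' => 18.9 }
              ]
            }
          }
        }
      ]
    }
  }
]

# Drilling down to a single player takes several levels of keys.
first_player = teams.first['roster']['entries'].first['playerPoolEntry']['player']
puts first_player['lastName']
```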

This may seem a little messy but I’ll point out some relevant data as we walk through this. Most of the actual stats will come from the data in that hash, but we’ll also pull a few pieces from the playerPoolEntry. As mentioned above, our first step will be to loop through each week and make an API call that applies to that week. Let’s make two new variables to specify the weeks we want to look at and the applicable season. For testing purposes, we’ll just look at week 1 for now:

weeks = *(1..1)
season = '2019'

If you aren’t familiar with the * syntax, it is the splat operator, which here expands the given range into an array. So in this case it will just create an array of [1], but we can easily expand this later once we’re ready to pull the data for all weeks. We will also want to declare an array called output where we will store all of our data as it is parsed. Now we can set up our loop to iterate through each week:

output = []

weeks.each do |week|
  # The ESPN API base URL from part 1 goes in front of this path.
  url = "#{season}/segments/0/leagues/1009412?view=mMatchup&view=mMatchupScore&scoringPeriodId=#{week}"
  response = RestClient::Request.execute(
    :url => url,
    :headers => {
      'cookies': { 'swid': SWID, 'espn_s2': S2 }
    },
    :method => :get
  )

  # Turn the JSON response body into a Ruby hash.
  data = JSON.parse(response)

In the above code, we’ll need to redefine the URL for our API call for each week. We can interpolate the season and week variables into the URL string to accomplish this. Then we will perform a GET call and parse out the JSON to turn it into a hash. At this point we should have our data for week 1. This will be followed by our next loop which will parse the players for each team. We will iterate through each object in the teams array from the response body:

data['teams'].each do |team|

Now we should be at a point to start pulling out individual pieces of data. The first item we’ll collect is the team ID, or the very first item in the team hash.

This ID will correspond to a team in your league. To find out which team is which, you will have to look at the URL for each team when you are on the ESPN site. To do this you can simply go to the standings page and click through each team.

Here you can see the team ID is set to 2.

This next step is optional, depending on whether you care about having names for each team, but I recommend adding another constant to your DataSource module to map the IDs for each team:
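A sketch of what that constant could look like. The owner names are placeholders, and the keys are strings because the lookup calls to_s on the team ID:

```ruby
module DataSource
  # Maps ESPN team IDs (as strings) to owner names. Names are hypothetical.
  OWNERS = {
    '1' => 'Alice',
    '2' => 'Bob',
    '3' => 'Carol'
  }.freeze
end

# Example lookup with a made-up team hash.
team = { 'id' => 2 }
owner = DataSource::OWNERS[team['id'].to_s]
puts owner # prints "Bob"
```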

So if you have added this, we can write the line:

owner = OWNERS[team['id'].to_s]

(If you did not add an OWNERS constant then simply write team['id'].to_s)

Now we get to add (you guessed it) another nested loop! Is this the best way to write this code? No, it is not. We typically want to minimize our cyclomatic complexity, and as the Zen of Python puts it, “flat is better than nested”. So while this isn’t ideal, we can get our code working now and refactor later to extract some functionality into methods. As we go forward, we can keep a lookout for places where we can reduce complexity and improve readability when we get around to refactoring. But I digress.

Our next loop will be through each roster entry. The data we will collect for each player is as follows:

  1. firstName
  2. lastName
  3. playerId – a unique ID given to each player
  4. lineupSlotId – An ID that signifies which position corresponds to the given player
  5. defaultPositionId
  6. actual points scored
  7. points the player was projected to score

Some of this data we can simply take, and some of it we will have to use to parse out more data. Let’s start with the easy ones. The top of our code block will look like this:

team['roster']['entries'].each do |entry|
  fname = entry['playerPoolEntry']['player']['firstName']
  lname = entry['playerPoolEntry']['player']['lastName']
  player_id = entry['playerId']
  slot = entry['lineupSlotId']

This is fairly straightforward as far as data gathering. On the next line we will want to grab the player’s position code. Since this code doesn’t actually tell us anything useful, we’ll have to map out what these codes represent in our DataSource module. The player codes we’ll use are as follows:

POSITION_CODES = {
  '1' => 'QB',
  '2' => 'RB',
  '3' => 'WR',
  '4' => 'TE',
  '5' => 'K',
  '16' => 'D/ST'
}.freeze

Then we can reference this constant just like we did for our team Owners.

position = POSITION_CODES[entry['playerPoolEntry']['player']['defaultPositionId'].to_s]

We also have to get a little creative with the slot codes that we already grabbed. The slot code doesn’t really tell us much other than if a player is in your starting lineup or on your bench. Luckily this is pretty straightforward. Any number that is less than 9, exactly 16, or exactly 17 represents a starter, and anything else is a bench player. This can be evaluated like so:

starter = (slot < 9 || slot == 17 || slot == 16) ? 'true' : 'false'
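To see the rule in action, here is a small sketch that wraps the same check in a method and runs it against a few slot values. The method name starter_flag is hypothetical, and the sample slots are arbitrary numbers chosen to exercise each branch:

```ruby
# Returns the same 'true'/'false' strings as the one-liner above.
# starter_flag is a hypothetical name, not part of the tutorial's code.
def starter_flag(slot)
  (slot < 9 || slot == 16 || slot == 17) ? 'true' : 'false'
end

# Slots below 9, plus 16 and 17, count as starters; everything else is bench.
[0, 8, 9, 16, 17, 20].each do |slot|
  puts "#{slot}: #{starter_flag(slot)}"
end
```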

Great, now we have a bunch of general info about our given player. Now we want to pull their projected and actual stats, but this requires us to iterate over the stats key from our data. These loops are getting a little out of hand, so let’s stop being lazy and create a new module to help us out. Since we’ll mostly be using this module for parsing player data, let’s call it PlayerHelper (player_helper.rb). We can go ahead and require this at the top of our main.rb file the same way we did with our DataSource. Then we’ll add a method into the PlayerHelper called get_stats.

There are a few entries in the stats array that we are looking at, but we only really care about the entry that corresponds to our given week. We also will need our stats array to parse from. So our method declaration will look like this:

def get_stats(stats_array, week)

Now we will need to use a bit of logic to find the correct entry. First we need to find the entry with the corresponding week in the scoringPeriodId field. Then inside that entry we will need to check the statSourceId. If that ID is a 0, then that is the player’s actual stats. If it is a 1, then that entry represents the player’s projected stats. When we have assigned our actual and projected values, we can return a hash with an actual value and a projected value. So our final method code will look like this:

def get_stats(stats_array, week)
  actual = ''
  projected = ''
  stats_array.each do |stat|
    # Only consider entries for the requested week.
    next unless stat['scoringPeriodId'] == week

    if stat['statSourceId'] == 0
      actual = stat['appliedTotal']      # 0 = actual points
    elsif stat['statSourceId'] == 1
      projected = stat['appliedTotal']   # 1 = projected points
    end
  end
  { actual: actual, projected: projected }
end

And the method call from main.rb will look like this:

stats = get_stats(entry['playerPoolEntry']['player']['stats'], week)
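To check the method without hitting the API, here is a sketch that places get_stats in the PlayerHelper module and calls it on a hand-built stats array. The sample values are invented; only the key names come from the response fields described above:

```ruby
# player_helper.rb (sketch)
module PlayerHelper
  def get_stats(stats_array, week)
    actual = ''
    projected = ''
    stats_array.each do |stat|
      next unless stat['scoringPeriodId'] == week

      if stat['statSourceId'] == 0
        actual = stat['appliedTotal']
      elsif stat['statSourceId'] == 1
        projected = stat['appliedTotal']
      end
    end
    { actual: actual, projected: projected }
  end
end

include PlayerHelper

# Made-up entries: week 1 actual, week 1 projected, week 2 projected.
sample_stats = [
  { 'scoringPeriodId' => 1, 'statSourceId' => 0, 'appliedTotal' => 21.4 },
  { 'scoringPeriodId' => 1, 'statSourceId' => 1, 'appliedTotal' => 18.9 },
  { 'scoringPeriodId' => 2, 'statSourceId' => 1, 'appliedTotal' => 17.2 }
]

stats = get_stats(sample_stats, 1)
puts stats[:actual]    # prints 21.4
puts stats[:projected] # prints 18.9
```

Entries for other weeks are skipped, so the week 2 projection never overwrites the week 1 values.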

That should give us a pretty good list of data to start with. Now let’s think ahead for a minute. Where should we store all of our data when we’re done retrieving it? It would be nice to create our own database, but that’s probably overkill for the moment, not to mention a lot of extra work. We could definitely put it all in a spreadsheet, too, but then we’d have to pull in some extra gems and add more logic. So let’s just stick with a good old CSV for now, which is just comma delimited fields that we can always import into a spreadsheet later. To do this, we can add all of our data so far to one big string:

result = "#{owner},#{week},#{season},#{position},#{fname},#{lname},#{starter},#{stats[:actual]},#{stats[:projected]},#{player_id},"

It’s not the prettiest thing in the world, but it will work for now. Finally, we can add this result string to the output array that we created earlier:

output << result
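Since each entry in output is already a comma-delimited line, turning the array into CSV text only takes one write per row. A rough sketch, where write_rows is a hypothetical helper and the sample row is made up:

```ruby
# Writes each pre-built comma-delimited row on its own line.
# write_rows is a hypothetical helper, not part of the tutorial's code.
def write_rows(rows, io)
  rows.each { |row| io.puts(row) }
end

output = []
output << 'Bob,1,2019,QB,Example,Player,true,21.4,18.9,12345,'

# Print to the console here; pass a File opened with File.open(path, 'w')
# instead of $stdout to save the rows to disk.
write_rows(output, $stdout)
```

Accepting any IO object keeps the helper easy to test and leaves the choice of destination to the caller.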

If we let our program iterate all the way through for week 1, then we should have output that looks similar to this:

Not bad for a day’s work!
Let’s review what we’ve accomplished up to this point:

  1. We created a new DataSource module that we can move our global variables into and establish constants that help us map our data.
  2. We’ve created logic that will loop through and collect all of our basic player data.
  3. We created another new module, PlayerHelper, where we can extract logic going forward to keep main.rb clean.
  4. We’ve identified a few places where we can go back and refactor to clean up our existing code.

One more takeaway: the API does not return our data in a straightforward shape. We have to go pretty deep into nested objects to find what we need, which is typical of web services that return lots of data. It is another reminder to keep our code well organized, or none of this will make much sense to our future selves or to anyone else reading it.

I hope that you’ve found this post helpful and are able to follow along. For part three, we will look at pulling some additional player data and outputting our results into spreadsheets.