The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.
The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.
Background on the Phoenix Project
“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands that Bill fix the mess in ninety days or else Bill’s entire department will be outsourced.
With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.
In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.” - The Phoenix Project
Bill Palmer is the Director of Midrange Technology Operations for Parts Unlimited, a $4 billion per year manufacturing and retail company.
Parts Unlimited's largest retail competitor offers better customer service and a new feature that allows people to customize their cars with their friends online.
Bill is frustrated because the competition outperforms Parts Unlimited, while his group is expected to deliver more with less year after year.
Bill is invited to meet with Steve Masters, the CEO of Parts Unlimited. He is informed that Luke (CIO) and Damon (VP of IT Operations) were let go and Bill is now VP of IT Operations.
“CIO stands for ‘Career Is Over’” - Bill Palmer
IT will temporarily report to Steve until a new CIO is hired.
Steve tells Bill that the company's goal is to regain profitability and increase market share and average order sizes. At present, Parts Unlimited's competitors are beating it.
Steve believes “Project Phoenix” is essential to the company's success. The project is years behind schedule. If the company does not turn things around, the shareholders are likely to split up the company, costing four thousand employees their jobs.
Chris Allers will be interim CIO. Chris is presently the VP of Application Development. Both Chris and Bill will report directly to Steve.
Bill is reluctant to take the position but Steve convinces him.
“What I want is for IT to keep the lights on. It should be like using the toilet. I use the toilet, and hell, I don’t ever worry about it not working. What I don’t want is to have the toilets back up and flood the entire building.” - Steve Masters
Steve informs Bill that the “payroll run is failing.” Fixing it becomes Bill's first task, because a failure to make payroll would affect many factory workers and could get the company into trouble with the Union.
Bill moves to address the payroll issue by first meeting with Dick Landry, CFO.
“In yesterday’s payroll run, all of the records for the hourly employees went missing. We’re pretty sure it’s an IT issue. This screwup is preventing us from paying our employees, violating countless state labor laws, and, no doubt, the union is going to scream bloody murder.” - Dick Landry
Bill & Dick go to meet Ann, the Operations Manager, to get more situational awareness about the problem. The general ledger upload for hourly employees didn’t go through, and all the hourly figures are zero. The salaried employees' numbers are OK.
“To get Finance the data they need, we may have to cobble together some custom reports, which means bringing in the application developers or database people. But that’s like throwing gasoline on the fire. Developers are even worse than networking people. Show me a developer who isn’t crashing production systems, and I’ll show you one who can’t fog a mirror.” - Bill Palmer
As Bill returns to the IT building, he realizes how run-down it is compared to the building that Leadership & Finance work in. Bill heads to the Network Operations Center (NOC) to meet Wes and Patty.
Wes is the Director of Distributed Technology Operations. He is responsible for the Windows server, database, and networking teams. Wes is loud, outspoken, and shoots from the hip.
Patty is the Director of IT Service Support. She owns all the level 1 and 2 help desk technicians. She also owns the trouble ticketing system and monitoring, and she runs the change management meetings. Patty is thoughtful, analytical, and a stickler for processes and procedures.
IT was in the middle of a Storage Area Network (SAN) firmware upgrade when the payroll run failed. They tried to back out the changes but ended up bricking the SAN instead.
Chapter 2 is the first introduction of Brent, the engineer in the middle of many important IT projects. While Brent is tackling this Sev 1 issue, he is not working on Project Phoenix.
Bill, Wes, and Patty go to meet Brent to learn more about the payroll issue.
“I was helping one of the SAN engineers perform the firmware upgrade after everybody went home. It took way longer than we thought—nothing went according to the tech note. It got pretty hairy, but we finally finished around seven o’clock.”
“We rebooted the SAN, but then all the self-tests started failing. We worked it for about fifteen minutes, trying to figure out what went wrong. That’s when we got the e-mails about the payroll run failing. That’s when I said, ‘Game Over.’” - Brent
The team gets an update from Ann. The last pay period was fine but for the new pay period all the data is messed up. The Social Security numbers for the factory hourlies are complete gibberish.
Since only one field is corrupted, the team deduces it’s not a SAN failure. On the conference call for the incident, they find out that a developer was also installing a security application at the same time the SAN firmware was being upgraded.
The security software change was requested by John Pesche, the Chief Information Security Officer.
“The only thing more dangerous than a developer is a developer conspiring with Security. The two working together gives us means, motive, and opportunity.” - Bill Palmer
Information Security at Parts Unlimited often makes urgent demands, so the development teams don’t invite it to many meetings. The InfoSec team does not follow the change management process, which always causes problems.
John reveals that Luke and Damon were perhaps fired over a compliance audit finding from security.
InfoSec had an urgent audit issue around storage of PII (personally identifiable information such as Social Security numbers and birthdays). They found a product that tokenized the information so the SSNs were no longer stored.
“‘Let me see if I’ve got this right…’ I say slowly. ‘You deployed this tokenization application to fix an audit finding, which caused the payroll run failure, which has Dick and Steve climbing the walls?’” - Bill Palmer
John made the changes because the next window for deploying the change was four months away and auditors would be on-site in one week. John never tested the change because there’s no test environment.
Bill requests a list of all the changes made in the past three days so they can examine the timeline and establish cause & effect. Bill finds out that few people use the change management system to make requests.
The Change Advisory Board (CAB) is not well attended. Teams will make changes without approval or notice because of deadline pressures. Bill asks Patty to send out a meeting notice to all the tech leads and announce that attendance is mandatory.
After reviewing the 27 changes made in the past three days, the team finds that only the InfoSec tokenization change and the SAN upgrade could be linked to the payroll failure.
The applications were eventually brought online, but the company had to submit payroll using the prior pay period's data. The local newspaper reports on the payroll failure after the Union complains.