Book Club: The Phoenix Project (Chapters 21-25)

This entry is part 6 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 17-20 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 21

Bill arrives at an audit meeting in building 2. He arrives to a packed conference room with attendees such as Dick, John, Wes, Erik, Ann, Nancy, and the auditors. Bill is surprised at how awful and stressed John looks.

The meeting lasts for 5 hours and ends with everyone surprised by the auditors’ conclusion that the company probably isn’t in trouble.

Bill is surprised to learn after that meeting that Erik and the lead auditor are old friends.

Bill describes the meeting and says that Dick kept marching out business SMEs to demonstrate that they have their own controls outside of IT where fraud would be caught.

John then asks if Bill has a minute to talk. He is still visibly flustered.

“John looks awful. If his shirt were just a little more wrinkled, and maybe had a stain or two in front, he could almost pass as a homeless person.”

Bill

John starts to ramble on to Bill about the systemic IT issues and information security. Erik is still talking to the auditor in the room and guides him out into the hallway upon hearing John starting his rant. John says that no one cares about IT security and the entire dev organization hides their activities from him.

“You all look down on me. You know, I used to manage servers, just like you do. But I found my calling doing information security. I wanted to help catch bad guys. I wanted to help organizations protect themselves from people who were out to get them. It came out of a sense of duty and a desire to make the world a better place.”

John

Erik comes back into the room angrily and grabs a chair.

“You know what your problem is, Jimmy? You are like the political commissar who walks onto the plant floor…sadistically poking your nose in everybody’s business and intimidating them into doing your bidding, just to increase your own puny sense of self-worth. Half the time, you break more than you fix. Worse, you screw up the work schedules of everyone who’s actually doing important work.”

Erik (to John)

Erik continues to go off on John and tells him that he doesn’t have anything else to say to him until John understands what just happened in that room.

“This should be your guiding principle: You win when you protect the organization without putting meaningless work into the IT system. And you win even more when you can take meaningless work out of the IT system.”

Erik

Erik tells John to go to the MRP-8 plant and talk to the safety officer.

Erik leaves John and Bill alone. John says goodbye and pushes his binder off the table. He says he may not be back tomorrow. Bill sees a haiku that John had written:

Here I sit, hands tied

Room angry, I could save them

If only they knew

Chapter 22

The Monday after the audit meeting, John disappeared.

Bill finds Wes and takes him to Patty’s office to talk about the monitoring project. Bill tells them both about how Erik validated that they can release the monitoring project and how important it is so that they can elevate Brent.

Patty seems to entertain the idea that IT is like manufacturing but Wes is still skeptical.

“Let’s use the example of configuring a server. It involves procurement, installing the OS and applications on it according to some specification and then getting it racked and stacked. Then we validate that it’s been built correctly. Each of these steps are typically done by different people. Maybe each step is like a work center, each with its own machines, methods, men, and measures.”

Patty

Patty concludes that she’s not sure what the machine would be in her scenario, and that they are better off trying out the process first. Otherwise, they are just stumbling around in the dark.

Wes is still skeptical, but Bill explains how challenging some of the floor work in the factory was. He says how those workers had to call upon their experience to solve problems and how they earned his respect.

Bill says that they should start the monitoring project as soon as they can.

The next Monday, Bill goes to the change room with Patty. She has put up a new Kanban board. Over the weekend she went to the MRP-8 factory and learned to split work into Ready, Doing, and Done in order to reduce WIP.

Patty plans on putting Kanban boards around key resources to manage their work. She thinks this will help predict lead time and get faster throughput.

“Imagine what this will do to user satisfaction if we could tell them when they make the request how long the queue is, tell them to the day when they’ll get it, and actually hit the date, because we’re not letting our workers multitask or get interrupted!”

Patty

Patty has also implemented an improvement kata and two-week improvement cycles.

Patty wants to implement a Kanban board around Brent in order to further isolate him from crises.

Later, Bill is sitting with Patty and Wes to figure out how to get project work started again. Bill states that they have two queues: business and internal projects.

They decide to only release the five most important business projects. The hard part is prioritizing the 73 internal projects.

Bill remembers what Erik has told him. He says that unless a project increases Brent’s capacity by reducing his workload or allowing someone else to take it over, then it isn’t important.

He asks for 3 lists: one that requires Brent, one that increases Brent’s throughput, and one that is everything else.

“We’re doing what Manufacturing Production Control Departments do. They’re the people that schedule and oversee all of production to ensure they can meet customer demand. When they accept an order, they confirm there’s enough capacity and necessary inputs at each required work center, expediting work when necessary. They work with the sales manager and plant manager to build a production schedule so they can deliver on all their commitments.”

Patty

Two days later, Bill finally gets a new laptop. This is two days earlier than planned.

Chapter 23

The following Tuesday, Bill gets a call from Kirsten. Brent is almost a week late on delivering a Phoenix task and the schedule is in jeopardy again. There are also several other late tasks.

Bill gets to work and joins Patty and Wes in a conference room. Patty explains that the late task is a test environment that was supposed to be delivered to QA. It turns out that this one “task” was more like a small project and involved multiple layers and teams.

Bill goes to the whiteboard and draws a graph:

Bill explains how wait times depend upon resource utilization: “The wait time is the ‘percentage of time busy’ divided by the ‘percentage of time idle.’ In other words, if a resource is fifty percent busy, then it’s fifty percent idle. The wait time is fifty percent divided by fifty percent, so one unit of time. Let’s call it one hour. So, on average, our task would wait in the queue for one hour before it gets worked.”
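As a rough sketch of the rule of thumb Bill describes, the calculation can be written out in a few lines of Ruby. The method name wait_time and the sample utilization figures are illustrative assumptions, not something taken from the book.

```ruby
# Wait time as described in the chapter: fraction of time busy divided by
# fraction of time idle. `wait_time` is a hypothetical helper name;
# utilization is expected as a fraction from 0 up to (but not including) 1.
def wait_time(utilization)
  unless utilization >= 0 && utilization < 1
    raise ArgumentError, 'utilization must be >= 0 and < 1'
  end

  busy = utilization.to_f
  idle = 1.0 - busy
  busy / idle
end

# Sample utilization levels chosen purely for illustration.
[0.5, 0.8, 0.9, 0.95, 0.99].each do |u|
  puts format('%d%% busy -> %.1f units of wait', (u * 100).round, wait_time(u))
end
```

At fifty percent utilization this reproduces the one unit of wait from Bill’s example, and the ratio climbs steeply as the resource gets closer to fully busy.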

Bill recalls the Phoenix deployment when Wes was complaining about an excess amount of tickets that would take weeks to resolve. He concludes that the handoff between Dev and IT Ops is very complex.

Patty states that the graph shows that everyone needs idle time. Otherwise, WIP gets stuck in the system.

The group then decides that they can create a Kanban lane for each large, recurring “task”.

“You know, deployments are like final assembly in a manufacturing plant. Every flow of work goes through it, and you can’t ship the product without it. Suddenly, I know exactly what the Kanban should look like.”

Patty

They decide that Patty will work with Wes’s team to assemble the 20 most frequently recurring tasks.

Chapter 24

The chapter starts with Bill and his family visiting a pumpkin patch on a Saturday. They spend the day together, and then Bill is watching a movie with his wife on the couch.

He gets a call on his phone from John. He answers the phone after a few ignored calls, and John asks Bill to meet him at a bar. Bill eventually agrees.

When Bill sees John, he looks awful. John is also very drunk.

John says that he has just been at home watching TV. He wants to ask Bill one last question before he leaves.

“Just tell me straight. Is it really true that I haven’t done anything of value for you? In all the three years that we’ve worked together, I’ve never, ever been helpful?”

John

Bill answers, “Look, John. You’re a good guy, and I know your heart is in the right place, but up until you helped hide us from the PCI auditors during the Phoenix meltdown, I would have said no. I know that’s not what you want to hear, but. . . I wanted to make sure that I wasn’t feeding you a line of bullshit.”

John downs a glass of scotch after hearing Bill’s response and asks for another, but Bill tells the waitress not to bring it and to order a cab instead.

Bill puts John in the cab and sends him home.

He tries calling John the next day, but John does not answer. There are still rumors circulating at the office regarding what happened to him.

Later Monday night, Bill receives a text from John: Thanks for the lift home the other day. Been thinking. I told Dick that I’ll be joining our 8am mtg tomorrow. Should be interesting.

Bill doesn’t know what meeting John is talking about.

When Bill asks John what meeting he’s talking about, John responds that he’s been arrogant and doesn’t know Dick that well. He says that he and Bill need to change that together. Bill calls John to see what’s going on.

“I kept thinking about our last conversation at the bar. I realized that if I haven’t done anything useful for you, who I should have the most in common with, then it stands to reason that I haven’t been useful to almost everyone else, who I have nothing in common with.”

John

Bill reluctantly agrees to join John in the meeting.

Chapter 25

The next day, Bill heads toward Dick’s office for the meeting with Dick and John. He sees John outside the office, and John has totally cleaned up his appearance from when Bill last saw him. Bill: “With the shaved head, his calm friendly smile and perfect posture, he looks like some sort of enlightened monk.”

Bill is shocked when John asks Dick, “. . . what exactly you do here at Parts Unlimited? What is your exact role?”

Dick plays along and answers the question seriously. He says that when he was hired, he was a traditional CFO, but now he also takes care of planning and operations for Steve. John calls him the de facto COO, and Dick acknowledges that this is now part of his job.

With a very small smile, he adds, “Want to hear something funny? People say that I’m more approachable than Steve! Steve’s incredibly charismatic, and let’s face it, I’m an asshole. But when people have concerns, they don’t want to have their minds changed. They want someone to listen to them and help make sure Steve gets the message.”

Dick

When asked what a good day for himself looks like, Dick says that it’s when they are beating the competition and writing big commission checks to their salesmen.

“Steve would be excited to announce to Wall Street and the analysts how well the company is performing—all made possible because we had a winning strategy, and also because we had the right plan and the ability to operate and execute.”

Dick

Dick says that they haven’t had a day like that in over four years. He says that a bad day looks like the Phoenix project launch.

John asks Dick what his goals for the year are. Dick gives him a list:

  • Health of Company
      • Revenue
      • Market Share
      • Average Order Size
      • Profitability
      • Return on Assets
  • Health of Finance
      • Order to Cash Cycle
      • Accounts Receivable
      • Accurate Financial Reporting
      • Borrowing Costs

Dick continues to the company goals, which he says are more important than the goals for just his department:

  1. Are we competitive?
      • Understanding customer needs and wants: Do we know what to build?
      • Product portfolio: Do we have the right products?
      • R&D effectiveness: Can we build it effectively?
      • Time to market: Can we ship it soon enough to matter?
      • Sales pipeline: Can we convert products to interested prospects?
  2. Are we effective?
      • Customer on-time delivery: Are customers getting what we promised them?
      • Customer retention: Are we gaining or losing customers?
      • Sales forecast accuracy: Can we factor this into our sales planning process?

Dick says all of those measurements are currently at risk. He says they are $20 million into Phoenix and still are not competitive, and the best favor they can do him is to stay focused and get it working.

After the meeting, Bill says that Dick doesn’t realize how much his measurements depend on IT.

Bill calls Erik to get some advice. He wants to convince Dick that IT is capable of screwing up less often and helping the business win.

“Your mission is twofold: You must find where you’ve under-scoped IT—where certain portions of the processes and technology you manage actively jeopardizes the achievement of business goals—as codified by Dick’s measurements. And secondly, John must find where he’s over-scoped IT, such as all those SOx-404 IT controls that weren’t necessary to detect material errors in the financial statements.”

Erik (to Bill)

Erik adds that John needs to learn exactly how the business was able to dodge the audit bullet, and tells Bill to feel free to invite him to the next meeting.

Retrieve Fantasy Football Stats using ESPN’s API: Part 3

Welcome to Part 3 of our series on how to scrape fantasy football data from ESPN’s API. Let’s go ahead and recap what we’ve done so far:

  • set up a main class where we are making API calls to ESPN using the RestClient gem
  • built a data_source class to store all of our data and global variables
  • created a player_helper class to extract some logic out of our main class
  • parsed through one week of data for a league and output in CSV format

We’re a good way through retrieving our data and storing it in a format that is ready for output. We still have a little bit more data that we would like to capture, so let’s go ahead and knock that out now.

It’s nice that we have data such as basic player data (name, position, etc.), projected stats and actual stats. It would be great, however, for us to get even more granular statistics for each player and week. In fantasy football, our players gain points by accumulating stats such as gaining yards and scoring touchdowns. Wouldn’t it be nice to break down this data to see where our players’ points are coming from on a weekly or seasonal basis?

Let’s start by seeing where this data is located in our API response. Last time we were already looking inside the response data at the following location:

data['teams']['roster']['entries']['playerPoolEntry']['player']['stats']

Now we’re going to dig a little deeper into this same data, and append another ['stats'] to the end of the above location. If we look inside here, we will see a hash with a bunch of seemingly random numbers as keys, and more random numbers as values. These keys actually correspond to certain statistical categories. There is no way to know this without digging through the data with a little trial and error, but luckily I will provide the keys we’re going to use.
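To make that nesting a little more concrete, here is a minimal sketch that walks a made-up, heavily trimmed player hash. The field names scoringPeriodId, statSourceId and appliedTotal come from the earlier parts of this series; the numbers and the exact shape of the sample are assumptions for illustration, not a copy of a real ESPN response.

```ruby
# Hypothetical, trimmed-down data for one player. A real response carries
# many more fields, and the values here are invented.
player = {
  'stats' => [
    {
      'scoringPeriodId' => 1,
      'statSourceId' => 0,
      'appliedTotal' => 18.4,
      'stats' => { '3' => 250.0, '4' => 2.0, '20' => 1.0 }
    }
  ]
}

player['stats'].each do |entry|
  # The inner 'stats' hash is the one keyed by numeric category ids.
  entry['stats'].each_pair do |category_id, value|
    puts "category #{category_id}: #{value}"
  end
end
```

Printing the pairs like this is roughly the trial and error mentioned above: compare the values against a box score until the category ids make sense.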

So let’s define a new hash inside our data_source file to map out what these keys correspond to.

STAT_KEYS =
    {
        'pass_attempts' => '0',
        'completions' => '1',
        'pass_yards' => '3',
        'pass_tds' => '4',
        'interceptions' => '20',
        'rush_attempts' => '23',
        'rush_yards' => '24',
        'rush_tds' => '25',
        'receptions' => '41',
        'receiving_yards' => '42',
        'receiving_tds' => '43'
    }

For the sake of simplicity, we will ignore defensive stats and kicker stats (who cares about kickers anyway, right?!) Now, if we remember back to last time, we have our get_stats method inside the player_helper class. We have to do a little searching in there based on the statSourceId and the scoringPeriodId. This is the same entry that we’ll want to pull our detailed stats from. So, let’s add in another method in the player_helper class to retrieve these detailed stats.

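# Maps each readable name in STAT_KEYS to the value found in one stat entry.
# Categories missing from the entry are filled with '0' so that every
# player produces the same set of columns.
def get_detailed_stats(player)
  result = {}
  stats = player['stats']
  STAT_KEYS.each_pair do |key, value|
    result[key] = stats&.key?(value) ? stats[value] : '0'
  end
  result
end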

The purpose here is to take the STAT_KEYS hash, and swap out the numerical values with the actual corresponding player statistic. We could just return the values without the keys to signify what they are, but if we want to do any work with these down the line then they’ll already be nice and organized. The logic here is fairly straightforward.
1. Take the data we already have and look inside the 2nd ‘stats’ key.
2. Take each numerical value from our STAT_KEYS and check to see if that particular key is returned for that player.
3a. If we do have that key, we’ll store that value in a hash with the key from our STAT_KEYS hash.
3b. If we do NOT have that key, we’ll simply plug in a 0. The reason we look for every stat and plug in 0’s is that we want to have the same exact output for each player so that we can plug it neatly into a spreadsheet.
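The steps above can be tried out on a small scale. The sketch below uses a hypothetical variant named detailed_stats_for that takes the key map as an argument, purely so the snippet runs on its own; the shortened key map and the sample numbers are made up for illustration.

```ruby
# Hypothetical stand-alone variant of get_detailed_stats: the key map is
# passed in instead of being read from the STAT_KEYS constant.
def detailed_stats_for(stat_entry, stat_keys)
  result = {}
  stats = stat_entry['stats']
  stat_keys.each_pair do |name, category_id|
    # Fall back to '0' when the entry has no value for this category.
    result[name] = stats && stats.key?(category_id) ? stats[category_id] : '0'
  end
  result
end

# Shortened copy of the key map from the post.
stat_keys = {
  'pass_yards' => '3',
  'pass_tds' => '4',
  'rush_yards' => '24',
  'receptions' => '41'
}

# Made-up stat entry for a quarterback with no rushing or receiving stats.
entry = { 'stats' => { '3' => 250.0, '4' => 2.0 } }
details = detailed_stats_for(entry, stat_keys)
puts details.values.join(',')

# An entry without an inner 'stats' hash still yields one value per category.
empty = detailed_stats_for({}, stat_keys)
puts empty.values.join(',')
```

Both calls return the same four keys in the same order, which is what keeps the spreadsheet columns lined up from player to player.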

We’ll plug in the call to this method inside our get_stats method like so:

def get_stats(stats_array, week)
  actual = ''
  projected = ''
  # Declared before the loop so it is still in scope when we return it.
  details = {}
  stats_array.each do |stat|
    if stat['scoringPeriodId'] == week
      if stat['statSourceId'] == 0
        actual = stat['appliedTotal']
        details = get_detailed_stats(stat)
      elsif stat['statSourceId'] == 1
        .....

We don’t need to do this for the projections for now, although we could parse those out more neatly in a similar fashion if we desired. The output for a single player should look something like this:

We can see here that there are a lot of 0’s plugged in, and that’s ok. Most players will only accumulate a few stats that are actually relevant to their position. Now, we will want to return this data to our main class along with the projected and actual values that we are already sending. Let’s send the whole hash for now, and we can include the logic of how to deal with that down the line. The last line of our get_stats method should now look like this:

{actual: actual, projected: projected, details: details}

When we hop back to our main class, our stats variable will now include these broken down player stats. Last time we were storing all of our data in a comma delimited string named result. To append this data to our string, we can simply loop through our stats[:details] hash and append each value to our result string. This loop should appear immediately after our result variable is populated.

stats[:details].each_value do |value|
  result << value.to_s + ','
end

Now the rest of our logic can stay the same, and we have a nice comma delimited string to output to a file. Now we can finally hop outside of all of our nested loops, and write the code to output the data. Feel free to name your output file whatever you choose. In this case I’ve decided on simply calling it “output” because I’m not very creative.

If you are familiar with file opening modes, we have a few to choose from here. We can use either “w”, “w+”, “a” or “a+”. (If you are not familiar, “w” will write after truncating the destination to size 0 if it already exists, and “a” will append to the end of the file if it already exists. The “+” just changes mode to read-write instead of write only.) Since we don’t really need to read the file and we don’t really want to append anything to prevent a massive CSV file being created over time, we should be fine with choosing “w”. So our output code will read as follows:

File.open(ROOT + '/output.csv', 'w') do |f|
  output.each do |string|
    f << "#{string}\n"
  end
  # No explicit f.close is needed: the block form of File.open closes the file.
end

(In my case this will write a file called “output.csv” to my root directory. You may choose to store your output in a different location.)
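If the difference between the modes is easier to see in action, here is a small self-contained sketch. It writes to a temporary directory rather than to ROOT, and the file name and sample lines are made up for the demonstration.

```ruby
require 'tmpdir'

after_w = nil
after_a = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'modes_demo.csv')

  # 'w' truncates the file on open, so only the second write survives.
  File.open(path, 'w') { |f| f.puts 'first run' }
  File.open(path, 'w') { |f| f.puts 'second run' }
  after_w = File.read(path)

  # 'a' appends, so the new line lands after the existing content.
  File.open(path, 'a') { |f| f.puts 'third run' }
  after_a = File.read(path)
end

puts after_w
puts after_a
```

This is why “w” suits a script that is rerun regularly: each run replaces the previous output instead of stacking onto it.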

And voila! We should now have a nice CSV file with all of our week 1 data. If you open this in Excel, then it will recognize the CSV format for you and move all of your data into the proper columns.

This looks great, but we don’t have any column headers! In order to prevent the need for adding these in every single time, let’s go back and add one last piece of data to our data_source file to easily store our column headers.

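# A plain array of strings is used here because %w splits on whitespace,
# which would break headers such as 'Pass Attempts' into two columns.
OUTPUT_HEADERS =
    [
        'Owner',
        'Week',
        'Season',
        'Position',
        'FName',
        'LName',
        'Starter?',
        'Actual',
        'Projected',
        'PlayerID',
        'Pass Attempts',
        'Completions',
        'Pass Yards',
        'Pass TDs',
        'INTs',
        'Rush Attempts',
        'Rush Yards',
        'Rush TDs',
        'Reception',
        'Rec. Yds',
        'Rec. TDs'
    ]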

Then when we declare our output variable in our main file we can simply add one extra line:

output = []
output << OUTPUT_HEADERS.join(',')
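As a quick sanity check on how the header line and the data lines fit together, here is a minimal sketch with a shortened header list and a single made-up player row. It builds the row as an array and joins it, which is a slightly different approach from appending to a string as we did above, and all names and values in it are hypothetical.

```ruby
# Shortened, hypothetical header list and player row, for illustration only.
headers = ['Owner', 'Week', 'Pass Yards', 'Pass TDs']
row = ['Sample Owner', 1, 250.0, 2.0]

output = []
output << headers.join(',')
output << row.join(',')

# Every line should have the same number of columns as the header line.
column_counts = output.map { |line| line.split(',').length }

puts output
puts "columns per line: #{column_counts.uniq.inspect}"
```

Joining an array also avoids the trailing comma that appending value plus ',' can leave at the end of a line, which may be worth considering if the extra empty column bothers you in Excel.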

This about wraps up the Ruby code we need to write for pulling data. The way we have organized and written our code should allow for further expansion if you want to add new features on your own. The work now shifts over to the Excel side to actually make use of this data and create some pretty graphs and charts. What you want to do with this data is up to you, but here are a few examples of what I have made personally.

  • Breaking down each individual statistic by owner
  • Difference between actual and projected points per owner over the course of the season
  • Season overview of points scored above or below projections

Developing these is a great way to get familiar with VLOOKUPs and other fun formulas. I think going into too much detail is outside the scope of this post, but most of the data isn’t too hard to put together with a little help from Google.

I hope you’ve enjoyed this series of posts, learned a little something, and hopefully have been able to use this code for your own purposes. If you have any questions or issues with any of the code mentioned here, feel free to contact me through the Contact Us section of the site.

Book Club: The Phoenix Project (Chapters 17-20)

This entry is part 5 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 13-16 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 17

Bill takes his son to see the trains after quitting but is interrupted by multiple calls from Wes & Patty.

The inventory management systems are down. No one can get inventory levels in the plants or warehouses, and they don’t know which raw materials need to be replenished.

“Well, we’ve pretty much screwed the pooch since you’ve left,” Wes says, sounding genuinely abashed, confirming my worst fears. “Steve insisted that we bring in all the engineers, including Brent. He said he wanted a ‘sense of urgency’ and ‘hands on keyboards, not people sitting on the bench.’ Obviously, we didn’t do a good enough job coordinating everyone’s efforts, and…”

Wes

Steve Masters attempts to call Bill after calling his wife Paige. Eventually, Bill returns his call and listens to Steve’s apology.

Steve had promised to get “his hands dirty” with IT but hasn’t lived up to the promise. His delegation of IT to Sarah was a total screwup.

“I’m convinced that IT is a competency that we need to develop here. All I’m asking is that you spend ninety days with me and give it a try.”

Steve

Steve Masters convinces Bill to rejoin Parts Unlimited.

Chapter 18

Bill attends Steve’s IT Leadership Off-Site, which is actually located on the Parts Unlimited campus.

Wes, Patty, Chris, Erik, and Steve are all in attendance.

“Erik described the relationship between a CEO and a CIO as a dysfunctional marriage. That both sides feel powerless and held hostage by the other.”

Steve

“There are two things I’ve learned in the last month. One is that IT matters. IT is not just a department that I can delegate away. IT is smack in the middle of every major company effort we have and is critical to almost every aspect of daily operations.”

Steve

“The second thing I’ve learned is that my actions have made almost all our IT problems worse. I turned down Chris and Bill’s requests for more budget, Bill’s request for more time to do Phoenix right, and micromanaged things when I wasn’t getting the results I wanted.”

Steve

Steve apologizes to Bill, taking full responsibility for the failures of Phoenix and the audit.

Steve identifies trust as the primary issue.

“A great team doesn’t mean that they had the smartest people. What made those teams great is that everyone trusted one another. It can be a powerful thing when that magic dynamic exists.”

Steve

Five Dysfunctions of a Team: In order to have mutual trust, you need to be vulnerable.

Steve asks each person to share something about themselves.

Steve was the first person in his family to make it to college. He worked in a copper mine to pay for college. He eventually went on to work for a pipe manufacturing plant. Steve joined the ROTC to help pay for school and then the US Army.

Steve is an excellent officer with high ratings but none of his subordinates enjoy working with him. Steve commits to changing his ways.

“Over the next three decades, I became a constant student of building great teams that really trust one another. I did this first as a materials manager, then later as a plant manager, as head of Marketing, and later, as head of Sales Operations. Then twelve years ago, Bob Strauss, our CEO at the time, hired me to become the new COO.”

Steve

Steve asks for commitment from everyone to develop IT as a competency by starting to trust one another. Everyone in attendance nods in agreement, except for Bill. . .

Chapter 19

Bill eventually nods in agreement as well.

Patty apologizes for reacting so coldly to Bill. She credits Bill for changing the IT Department.

“The goal of this exercise is to get to know one another as people. You’ve learned a bit about me and my vulnerabilities. But that’s not enough. We need to know more about one another. And that creates the basis for trust.”

Steve

Chris volunteers to start. He was born in Beirut and speaks four languages. He describes the story of his wife’s pregnancy complications and how it taught him to not be selfish.

Wes participates next. He was engaged three times and called off each before getting married. Wes races cars and has struggled with his weight.

Patty started as an Art major but ended up switching majors five times in college. She dropped out of college to become a singer-songwriter, touring the country. She decided to work for Parts Unlimited because she couldn’t make a living as an artist.

Bill grew up in a family with an alcoholic father. He ran away from home and got into trouble. After being arrested, he chose to join the Marines.

Bill cries as he describes the lessons learned from the Marines: “What did I learn? That my main goal is to be a great father, not like the shitty father I had. I want to be the man that my sons deserve.”

“Solving any complex business problem requires teamwork, and teamwork requires trust. Lencioni teaches that showing vulnerability helps create a foundation for that.”

Steve

Steve identifies missing every commitment and schedule as a primary problem in IT. He surmises that the team is not good at making internal commitments.

Chris counters that his team hit their targets, including on Phoenix. However, Phoenix was a disaster. If success was Chris getting all the Phoenix tasks done, then they met their target. If success was putting Phoenix into production fulfilling business goals, then they failed.

Development does not factor in the work Operations needs to complete.

Part of the problem is planning and architecture. Development is also waiting for operations to deploy because there is backlog of work.

“Erik has helped me understand that there are four types of IT Operations work: business projects, IT Operations projects, changes, and unplanned work. But, we’re only talking about the first type of work, and the unplanned work that gets created when we do it wrong. We’re only talking about half the work we do in IT Operations.”

Bill

Bill realizes while discussing the types of work (the audit project specifically) they have forgotten to invite John. Steve takes a 15-minute break to invite John.

The IT staff is unsure how they make commitment decisions for projects, unlike the manufacturing plant. No capacity or demand analysis is done.

IT takes shortcuts, which means fragile applications in production, and firefighting, which leads to technical debt.

Technical debt compounds over time.

“If an organization doesn’t pay down its technical debt, every calorie in the organization can be spent just paying interest, in the form of unplanned work.”

Erik

“Unplanned work has another side effect. When you spend all your time firefighting, there’s little time or energy left for planning. When all you do is react, there’s not enough time to do the mental work of figuring out whether you can accept new work. So projects are crammed onto the plate, with fewer cycles available to each one, which means more bad multitasking, more escalations from poor code, which mean more shortcuts.”

Erik

Identify where the constraint is and then protect it. Ensure time is never wasted on the constraint.

Bill believes Brent is the constraint for Parts Unlimited.

To fix the problems of IT, Bill proposes to stop doing all other non-Phoenix work to focus on improving their processes for two weeks.

Erik agrees, because the goal should be to increase the throughput of the entire system.

Steve promises to send out an email to the company announcing the work stoppage, to prevent managers from “strong arming” Operations into helping pet projects.

The team will identify the top areas of technical debt, which Development will tackle to decrease the unplanned work being created by problematic applications in production.

Chapter 20

The company has made great progress on Phoenix: more was accomplished in 7 days than in the prior month.

The company experiences a Sev-1 incident that takes out internal phones and voicemail. The incident was caused by a vendor accidentally making changes to the production phone system. The team will put together a project to monitor critical systems for unauthorized changes.

“How do we currently prioritize our work? When we commit to work on a project, a change, a service request, or anything else, how does anyone decide what to work on at any given time? What happens if there are competing priorities?”

Bill

Priorities are typically based on the most senior person making the request or most recent request.

Erik and Bill take another trip to the manufacturing plant.

Understanding the flow of work is the first key to achieving the First Way.

Bill surmises that Brent is a worker supporting way too many work centers, which is why he’s a constraint.

“Every work center is made up of four things: the machine, the man, the method, and the measures. Suppose for the machine, we select the heat treat oven. The men are the two people required to execute the predefined steps, and we obviously will need measures based on the outcomes of executing the steps in the method.”

Erik

Bill is standardizing Brent’s work so others can execute it. Documenting the steps helps with consistency and quality.

Bill comes to the conclusion that only those projects that don’t require Brent are safe to begin work on again.

The monitoring project is the most important because it elevates the constraint: it takes unnecessary work off Brent’s plate by letting that work bypass him.

Total Productive Maintenance

  • Do whatever it takes to assure machine availability by elevating maintenance
  • ‘Improving daily work is even more important than doing daily work.’

“The Third Way is all about ensuring that we’re continually putting tension into the system, so that we’re continually reinforcing habits and improving something. Resilience engineering tells us that we should routinely inject faults into the system, doing them frequently, to make them less painful.”

Erik

Improvement Kata: Mike Rother says it almost doesn’t matter what you improve, as long as you’re improving something. If you are not improving, entropy guarantees that you are getting worse, which means there is no path to zero errors, zero work-related accidents, and zero loss.

Kata: repetition creates habits, and habits are what enable mastery

Just as important as throttling the release of work is managing the handoffs.

The wait time for a given resource is the percentage that resource is busy, divided by the percentage that resource is idle.

If a resource is fifty percent utilized, the wait time is 50/50, or 1 unit. If the resource is ninety percent utilized, the wait time is 90/10, or nine times longer.
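The arithmetic above can be sketched in a few lines of Ruby. This is only an illustration of the busy-over-idle ratio as stated; the method name `wait_time_units` and the range check are assumptions made for the example.

```ruby
# Wait time as described above: percent busy divided by percent idle.
# The method name and the range check are illustrative assumptions.
def wait_time_units(percent_busy)
  unless percent_busy >= 0 && percent_busy < 100
    raise ArgumentError, 'percent_busy must be at least 0 and below 100'
  end

  percent_idle = 100 - percent_busy
  percent_busy.to_f / percent_idle
end

# Print the ratio for a few utilization levels.
[50, 80, 90, 95].each do |busy|
  puts "#{busy}% busy -> #{wait_time_units(busy).round(2)} unit(s) of wait"
end
```

The ratio grows sharply as utilization approaches 100 percent, which is why a fully loaded resource like Brent turns every handoff into a long queue.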

“A critical part of the Second Way is making wait times visible, so you know when your work spends days sitting in someone’s queue—or worse, when work has to go backward, because it doesn’t have all the parts or requires rework.”

Erik

The Security Projects from John don’t help scalability, availability, survivability, sustainability, security, supportability, or the defensibility of the organization. At present, they are not a good use of time.

From the Pipeline v10.0

This entry is part 10 of 34 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

Software Testing Podcasts

If you’re interested in learning more about testing and love podcasts, Software Testing Magazine has compiled a list of some popular testing podcasts.

A Primer on Continuous Testing

“Continuous testing shortens feedback loops through automated testing that occurs throughout the development lifecycle—hence “continuous.” Testing and QA become the responsibility of everyone working on the software, not just testers. Let’s look at some proven practices from organizations that have used continuous testing effectively to realize tangible benefits.”

Improve Your Test Automation Learning and Delivery with The Three Stream Method

Jon Ferguson Smart is the author of “BDD in Action”, one of my favorite tech books. He posts often on his blog and provides some solid advice on automation. In this post, he briefly discusses the Three Stream Method: the first stream is value, the second stream is quality or technical debt, and the third stream is learning. He links to a new ebook, “The Roadmap From Manual to Automated Testing”, which is recommended for anyone learning to adopt automation. He’s an excellent author so please give it a read.

Production Deploy with Every Check-In? You Gotta Go TWO Low!

Paul Grizzaffi is an automation architect for Magenic. In this guest post for Applitools he describes multiple issues that can occur during a deployment to prod by a developer, from visual issues to timing issues. There are two different costs to consider: cost of change and cost of failure. To learn more about both check out his post.

The Technical Debt Trap (VIDEO)

For a change of pace, here is an excellent conference presentation given by the great Doc Norton on Technical Debt. I highly recommend watching this video to understand the origins of technical debt and why so many orgs don’t devote time towards quality as an upfront cost. “Technical Debt has become a catch-all phrase for any code that needs to be re-worked. Much like refactoring has become a catch-all phrase for any activity that involves changing code. These fundamental misunderstandings and comfortable yet mis-applied metaphors have resulted in a plethora of poor decisions. What is technical debt? What is not technical debt? Why should we care? What is the cost of misunderstanding? What do we do about it? Doc discusses the origins of the metaphor, what it means today, and how we properly identify and manage technical debt. In this talk I’ll share how these four principles power world-famous companies and how they can help you work with greater speed, simplicity, safety and success.”

Cukes and Apples: Advanced Cucumber Steps

Welcome Back

In the previous post, we implemented the Page Object pattern to drive a simple Cucumber scenario. The steps used in that scenario are expressive enough, but not very reusable and not well-organized. In this post, we will explore some good practices for writing and using Cucumber steps for mobile test automation.

Get the code from the previous post here: https://github.com/RussellJoshuaA/cukes_apples_2

Arrange, Act, Assert

We recommend organizing general, reusable step definitions with a pattern used in unit testing: Arrange-Act-Assert. The Arrange-Act-Assert pattern divides step definitions into three logical groupings, predictably: arrange, act, and assert.

  • The “arrange” section sets up the preconditions necessary for a test to succeed or fail correctly. This will include things like logging in and navigating to pages.
  • The “act” section describes an action for which the result must be validated, like tapping a button or performing a gesture.
  • The “assert” section finishes a test by validating the result of the action which preceded it, by checking conditions like the visibility and value of page elements.

Organizing our step definitions according to the Arrange-Act-Assert pattern makes it easier for new contributors to learn the most reusable steps in the test suite and reminds us of the purpose of these steps as we use them.

Some steps are not easy to place in the Arrange-Act-Assert pattern – for example, a step which validates the display of a page might be used most often during the arrange and act sections of a test, but still constitutes a very good assertion. If you are not sure where to place a step, consider how the step will be used most often, how it will provide the most value, and what first impression a new collaborator should have.

Custom Steps

If the Arrange-Act-Assert pattern is followed too literally, and other categories of steps are prohibited, the benefits of following the pattern are lost. The arrange, act, and assert collections should contain only generalized, reusable steps, so that those steps stay easy to find.

Steps which are too specific for the general collection can be organized separately. For example, a step which handles user login is only applicable to the application under test, and does not describe a general mobile device interaction. Start small by collecting these steps in one place, like “custom_steps.rb”. As the custom steps collection grows in size and becomes unwieldy, identify related steps and create new step files for them.

Select Any Element

In the previous post, we used a step definition that selected a button on the Welcome page: “the user selects Next” or “the user selects Get started”. We could make that step more valuable by making it reusable. If this step could be written to select any element, then it could be used in more scenarios.

First, move the step into a file that aligns with the Arrange-Act-Assert pattern: “step_definitions/action_steps.rb”. This makes sense as an action step – many scenarios are likely to validate the result of tapping an element.

Next, update the step pattern to accept any element name.
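The original listing is not reproduced here. As a rough sketch, assuming the step is matched with a regular expression, a capture group can accept any element name. The constant `SELECT_STEP` and the helper `captured_element_name` are hypothetical names used only for this example; in a Cucumber suite the same pattern would be passed to a `When` step definition.

```ruby
# Hypothetical step pattern: the capture group accepts any element name.
SELECT_STEP = /^the user selects (.+)$/

# Returns the captured element name, or nil when the text does not match.
def captured_element_name(step_text)
  match = SELECT_STEP.match(step_text)
  match && match[1]
end

puts captured_element_name('the user selects Next')
puts captured_element_name('the user selects Get started')
```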

We will parse the element name and send it to a page object to invoke the method which will select the named element. The element name that is written in our scenarios – and captured by the step – is likely to be mixed-case, and include spaces, so we need to modify it first. See the string modification below, and the “send” method which accepts it:
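The original code is missing from this copy of the post, so the following is a minimal sketch rather than the author's listing. It assumes page objects expose methods named `select_<element>`; the `WelcomePage` class and the `element_method_name` helper are hypothetical stand-ins.

```ruby
# Hypothetical page object standing in for the Welcome page.
class WelcomePage
  attr_reader :selected

  def initialize
    @selected = []
  end

  # In a real suite these methods would tap the element through the driver.
  def select_next
    @selected << :next
  end

  def select_get_started
    @selected << :get_started
  end
end

# Turns a name as written in a scenario into a method name,
# for example "Get started" becomes "select_get_started".
def element_method_name(element_name)
  "select_#{element_name.strip.downcase.gsub(/\s+/, '_')}"
end

page = WelcomePage.new
page.send(element_method_name('Next'))
page.send(element_method_name('Get started'))
p page.selected
```

Ruby's `public_send` is a stricter alternative to `send` when private methods on the page object should stay out of reach of step text.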

That “send” method is an incredibly helpful construct that allows us to invoke a method on a page object without knowing the name of that method until runtime – that is, when we begin executing the test.

Because our scenario was written to select both “Next” and “Get started”, we also need to define a separate button named :get_started.
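How a button is declared depends on the page object library in use, which this copy of the post does not show. The sketch below assumes a hand-rolled base class with a `button` class macro; `BasePage`, `button`, and the locator strings are all hypothetical and are not the API of any particular gem.

```ruby
# Hypothetical base class: `button` is an illustrative class macro that
# generates a select_<name> method for each declared button.
class BasePage
  def self.button(name, locator)
    define_method("select_#{name}") do
      # A real implementation would find the element by locator and tap it.
      "tapped #{locator}"
    end
  end
end

class WelcomePage < BasePage
  button :next, 'Next'
  button :get_started, 'Get started'
end

page = WelcomePage.new
puts page.send('select_get_started')
```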

Now, the step above is capable of selecting any element on the Welcome page… but there aren’t many elements on that page. This step would be much more valuable if it could also select any element, on any page.

Any Element, Any Page

The step can be further modified to select an element on any page, but there is a catch. See @current_page below:
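The listing is not included in this copy, so here is a hedged sketch of the idea. `StepContext` is a hypothetical stand-in for Cucumber's World, where instance variables are shared between the steps of one scenario, and both page classes are invented for illustration.

```ruby
# Hypothetical page objects; a real suite would drive the app through
# the mobile driver instead of returning strings.
class WelcomePage
  def select_next
    'Welcome: tapped Next'
  end
end

class LoginPage
  def select_sign_in
    'Login: tapped Sign in'
  end
end

# Stand-in for Cucumber's World object.
class StepContext
  attr_writer :current_page

  # Body of a step like "the user selects <element_name>".
  def user_selects(element_name)
    raise 'A preceding step must set @current_page' if @current_page.nil?

    @current_page.send("select_#{element_name.downcase.gsub(' ', '_')}")
  end
end

context = StepContext.new
context.current_page = WelcomePage.new
puts context.user_selects('Next')
context.current_page = LoginPage.new
puts context.user_selects('Sign in')
```

The explicit `nil` check makes the catch visible: the step fails with a clear message when no earlier step has set the page.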

The @current_page variable will be familiar for users of the web automation gem page-object, which uses the same variable name for the same purpose. We can call the “send” method on an instance variable named @current_page and assume that preceding steps have set the variable, but we must update other steps.

Update the step “the app is on the Welcome page” to set @current_page.
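A minimal sketch of that update, under the assumption that each page class implements `on_page?`. The `StepContext` class is again a hypothetical stand-in for Cucumber's World, and the method body corresponds to what would sit inside the `Given` block.

```ruby
# Hypothetical page object; on_page? would check a visible element in a real suite.
class WelcomePage
  def on_page?
    true
  end
end

# Stand-in for Cucumber's World object.
class StepContext
  attr_reader :current_page

  # Body of the step "the app is on the Welcome page".
  def app_is_on_welcome_page
    @current_page = WelcomePage.new
    raise 'Welcome page is not displayed' unless @current_page.on_page?

    @current_page
  end
end

context = StepContext.new
context.app_is_on_welcome_page
puts context.current_page.class
```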

Now the step “the user selects <element_name>” can be used on any page, assuming the preceding step sets @current_page.

The other step, “the app is on the Welcome page”, would be even better if it could be used to describe any page.

Navigate to Any Page

To set @current_page with an instance of any page class, call Kernel.const_get and pass it the name of a page class. As with the element name above, it is necessary to manipulate the string first. Follow the example below to change page names into class names:
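The example from the original post is not reproduced here. The following sketch shows one way to do it, assuming page classes are named `<Name>Page` and defined at the top level; `page_class_for` and both page classes are hypothetical.

```ruby
# Hypothetical page classes defined at the top level.
class WelcomePage; end
class AccountSettingsPage; end

# Turns a page name from a scenario into a class,
# for example "account settings" becomes AccountSettingsPage.
def page_class_for(page_name)
  class_name = page_name.split.map(&:capitalize).join + 'Page'
  Kernel.const_get(class_name)
end

puts page_class_for('Welcome')
puts page_class_for('account settings')
```

An unknown page name raises `NameError`, which surfaces typos in a scenario quickly. Note that `capitalize` lowercases the rest of each word, so a page written as “FAQ” would map to `FaqPage` under this convention.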

Now the step definition “the app is on the <page name> page” can be used to navigate to any page and validate the visibility of that page using the “on_page?” method, assuming “on_page?” is implemented for the named page.

Further Optimizations

Fans of the page-object gem might be looking for the on_page method. Page-object uses a PageFactory module to manage the @current_page, which includes the on_page method used to create new instances of page classes and set the @current_page variable. Our test suite can do the same if we implement a factory method as page-object does.
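As a simplified sketch of such a factory method (not page-object's actual implementation), a small module can create the page, store it in @current_page, and yield it to an optional block. `SimplePageFactory` and `StepContext` are hypothetical names; in a Cucumber suite the module would typically be mixed into the World with `World(SimplePageFactory)`.

```ruby
# Simplified factory method in the spirit of page-object's on_page.
module SimplePageFactory
  def on_page(page_class)
    @current_page = page_class.new
    yield @current_page if block_given?
    @current_page
  end
end

# Hypothetical page object.
class WelcomePage
  def on_page?
    true
  end
end

# Stand-in for Cucumber's World object.
class StepContext
  include SimplePageFactory
  attr_reader :current_page
end

context = StepContext.new
context.on_page(WelcomePage) { |page| puts page.on_page? }
puts context.current_page.class
```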

Summary

In this post, we used the Arrange-Act-Assert pattern to organize our steps by category and updated our step definitions to handle any element, on any page. By following the same principles, and leaning on constructs like “@current_page.send” and “Kernel.const_get”, we can write step definitions that will describe almost any user interaction in a generalized and reusable way.

Get the code from this post here: https://github.com/RussellJoshuaA/cukes_apples_3

Coming Up Next

Updating this test framework to support cross-platform execution will require access to some new hardware. Another post will explore execution with iOS and Android in the future, but for now this series will be on hold while we publish some other articles. Stay tuned!

Book Club: The Phoenix Project (Chapters 13-16)

This entry is part 4 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 8 – 12 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 13

The Phoenix crisis is still an issue on Monday, and the problems are front page news on technology sites.

Bill is at a Phoenix status meeting, and Steve says that they are massively screwing their customers and shareholders. He says that Sarah is not off the hook until all of the store managers say that they can transact normally.

Steve also wants to meet with Sarah, Chris, Bill, Kirsten, and Ann once the stores are off life support.

Once Steve leaves and slams the door, Sarah says she wants the usability issues fixed, but Bill and others tell her how impossible that is at the moment.

“We are keeping Phoenix alive by sheer heroics. Wes wasn’t joking when he said that we’re proactively rebooting all the front-end servers every hour. We can’t introduce any more instabilities. I propose code rollouts only twice a day and restricting all code changes to those affecting performance.”

Bill

The team produces a plan to tie all code commits to a defect number or they will be rejected.

Bill visits Ann and her team across the hall. They have tables covered in faxes that represent orders that need to be deduplicated or reversed.

On the wall, Ann’s team shows that 5,000 customers have had duplicate payments or missing orders, and they estimate that 25,000 more transactions still need to be investigated.

John also stops by to check out the activity. When he looks at an order, he tells Bill that they have a major problem.

John tells Bill that they are storing the CVV2 codes, which is against the law. John wants Bill to destroy all that information, but Bill says that they first need to take care of the transactions.

John remembers that the auditors are actually on site that day. Bill instructs him to not allow the auditors close to Ann’s team and the CVV2 information.

Later, John tells Bill that he may have some extra engineers to spare. Bill is thrilled by this since his team is at full capacity and is pulling all-nighters.

“I then wonder if the fatigue is getting to me. Something is really screwy in the world when I’m finding reasons to thank Development and Security in the same day.”

Bill

Chapter 14

By late Monday, they have finally stabilized the Phoenix situation. The stores have working registers (although the fix is only temporary) and the company is no longer keeping sensitive cardholder data.

The leadership team is waiting outside Steve’s office, and Sarah comes out nearly in tears. Bill and Chris then take their turn to talk to Steve.

Steve says the company has nothing to show for the $20 million they’ve spent on Phoenix. He also says they may have lost loyal customers, and marketing is giving away $100 vouchers.

Bill gets frustrated that Steve didn’t follow his initial advice to delay Phoenix: “No offense, sir, but this is supposed to be news to me? I called you, explaining what would happen, asking you to delay the launch. You not only blew me off, you told me to try to convince Sarah. Where’s your responsibility in all of this? Or have you outsourced all your thinking to her?”

Steve responds by telling Bill that he needs some actual solutions from him. He also says that he needs the business to be able to tell him that it is no longer being held hostage by IT.

Steve goes on to say that the board is considering splitting up the company.

“Second, I’m done playing Russian roulette with IT. Phoenix just shows me that IT is a competency that we may not be able to develop here. Maybe it’s not in our DNA. I’ve given Dick the green light to investigate outsourcing all of IT and asked him to select a vendor in ninety days.”

Steve

Bill and Chris are shell-shocked and decide to meet for lunch. Bill mentions that Paige has told him that he shouldn’t trust Chris.

Chris says that maybe his group being outsourced wouldn’t be the worst thing in the world and wonders if it might be time for a change. He says he used to love his work but lately it is so hard to keep up with change.

“It’s harder than ever to convince the business to do the right thing. They’re like kids in a candy store. They read in an airline magazine that they can manage their whole supply chain in the cloud for $499 per year, and suddenly that’s the main company initiative. When we tell them it’s not actually that easy, and show them what it takes to do it right, they disappear. Where did they go? They’re talking to their Cousin Vinnie or some outsourcing sales guy who promises they can do it in a tenth of the time and cost.”

Chris

Chris says that it’s getting harder and harder to hit dates. He was in a meeting where they were planning out work 3 years in the future, but he says that they can’t even effectively plan for one year.

Chris apologizes for his part in the Phoenix fiasco. He says that when he told Sarah a date for when code could be complete, he didn’t know that she would use it as a go live date.

Bill and Chris agree that they are worried that Sarah might try to stick the whole situation on them. They say she is like “Teflon” because nothing sticks to her.

Bill and Chris agree that they will meet once a week for the next few months.

Once back at the office, Bill gets an email from Chris. Chris tells him that they are throwing a celebration party since the Phoenix deployment is “finished” and invites Bill and his team.

Bill forwards the email to Wes and Patty, but Wes says his team still has a lot of work to do.

Chapter 15

The chapter opens on Wednesday with Bill taking Paige out to breakfast. She says she has never seen him this stressed. Bill tells her that he has no idea when life will be normal again.

Paige says she doesn’t know why Bill decided to accept the job. He thinks to himself that the organization is better off because of his contributions and is happy he’s one of the people that can try to fend off the outsourcing.

Bill starts thinking about how the pay raise will help his family pay down their debt. Paige catches him wandering off in thought and says she wishes they picked someone else for the job.

Bill drops off Paige at home and sees that Wes has forwarded him an email. The email praises the new change board for saving two different groups from making changes to the database and app servers at the same time.

Patty knocks on Bill’s door and tells him that she thinks they have a problem. She asks him to follow her to the Change Coordination Room.

“I groan. Every time Patty summons me there, it’s because of some new intractable problem. But problems, like dog poop left in the rain, rarely get better just by ignoring them.”

Bill

Bill notices that the change boards look different. There are barely any upcoming changes posted, and the cards are missing. Patty tells him that there are about 600 cards of changes that need to be rescheduled due to Phoenix.

Bill discovers that the fourth type of work that Erik had mentioned was unplanned work (the other three are: business projects, internal projects, and changes).

“That’s why Erik called it the most destructive type of work. It’s not really work at all, like the others. The others are what you planned on doing, allegedly because you needed to do it.”

Bill

“So much of what I’ve been trying to do during my short tenure as VP of IT Operations is to prevent unplanned work from happening: coordinating changes better so they don’t fail, ensuring the orderly handling of incidents and outages to prevent interrupting key resources, doing whatever it takes so that Brent won’t be escalated to. . .”

Bill

Bill goes outside and calls Erik. Erik asks Bill how he is doing “after Phoenix crashed and burned so spectacularly”, and asks him if he can tell him the four categories of work now.

“At the plant, I gave you one category, which was business projects, like Phoenix,” I say. “Later, I realized that I didn’t mention internal IT projects. A week after that, I realized that changes are another category of work. But it was only after the Phoenix fiasco that I saw the last one, because of how it prevented all other work from getting completed, and that’s the last category, isn’t it? Firefighting. Unplanned work.”

Bill

Erik asks Bill about the change board he’s been working on, and Bill describes it to him.

“You’ve put together tools to help with the visual management of work and pulling work through the system. This is a critical part of the First Way, which is creating fast flow of work through Development and IT Operations. Index cards on a kanban board is one of the best mechanisms to do this, because everyone can see WIP. Now you must continually eradicate your largest sources of unplanned work, per the Second Way.”

Erik

Bill explains all the chaotic events that he has been dealing with lately. Erik responds by mentioning that Brent is Bill’s constraint, and Bill is surprised.

“Well if we’re going to talk about your next steps, you definitely need to know about constraints because you need to increase flow. Right now, nothing is more important.”

Erik

Erik tells Bill that he hopes that he read The Goal by Eli Goldratt.

“Goldratt taught us that in most plants, there are a very small number of resources, whether it’s men, machines, or materials, that dictates the output of the entire system. We call this the constraint—or bottleneck. Either term works. Whatever you call it, until you create a trusted system to manage the flow of work to the constraint, the constraint is constantly wasted, which means that the constraint is likely being drastically underutilized.”

Erik

Erik describes the first 3 steps (of 5) in The Goal:

  1. Identify the constraint
  2. Exploit the constraint
  3. Subordinate everything else to the constraint

Erik tells Bill his homework is to figure out how to set the tempo of work according to Brent. He also tells Bill he is still missing a piece of the First Way in that he can’t distinguish what is important to the business and what isn’t.

“[Chris] is spending all his cycles on features, instead of stability, security, scalability, manageability, operability, continuity, and all those other beautiful ’itties. Remember, outcomes are what matter—not the process, not controls, or, for that matter, what work you complete.”

Erik

Chapter 16

Bill is at his desk when Ellen runs in with an email printout from Dick. It says something has gone wrong with the company invoicing systems. It was discovered that no customers were invoiced for 3 days.

Leadership gathers in the NOC room. Bill instructs everyone not to touch anything without approval from him.

The team investigates possible causes for the issue, and Patty’s team finds over 20 different potential failures. Eventually they narrow it down to 8. They agree to reconvene at 10 pm.

As Bill is reading a book to his son, he checks the emails on his phone. He’s amazed at the difference in his team’s process: “During the last Sev 1 incident that hit our credit card processing systems, the conference call was full of finger-pointing, denials, and, most importantly, wasted time when our customers couldn’t give us money. Afterward, we did the first of a series of ongoing blameless postmortems to figure out what really happened and come up with ideas on how to prevent it from happening again. Better yet, Patty led a series of mock incident calls with all hands on deck, to rehearse the new procedures.”

At 9:15, Bill receives a call from Steve about the incident. Steve tells Bill that he just talked to Dick, and Dick said that Bill is dragging his feet. Steve is clearly angry.

Bill tries explaining his points again, but Steve cuts him off and asks if he’s in the office.

“We’ll probably miss almost every target that we’ve promised the board: revenue, cash, receivables—everything. In fact, every measure we’ve promised the board is going the wrong way! This screwup may confirm the board’s suspicion that we’ve completely lost control of managing this company!”

Steve

Steve tells Bill that he wants to see a sense of urgency, and that he should be getting people out of bed.

“Steve, if I thought it would help, I’d have everyone pull all-nighters in the data center tonight. For Phoenix, some people didn’t go home for nearly a week. Trust me, I know the house is on fire, but right now, more than anything, we need situational awareness. Before we send the teams crashing through the front door with fire hoses, we have to have someone at least quickly walk the perimeter of the yard — otherwise, we’ll end up burning down the houses next door!”

Bill

Steve replies to Bill that Brent disagrees with Bill’s approach. Bill responds that he hopes Brent is at home. He doesn’t want him working until they know exactly what’s wrong.

Steve tells Bill that they’re going to start doing things his way. He screams at Bill to call in Brent along with everyone else.

“You think I’m being overly cautious, and that I’m hesitating to do what needs to be done. But you are wrong. Dead wrong.”

Bill

Steve still is not convinced. Bill responds by telling Steve to do the work himself, and to expect Bill’s resignation in the morning.

From the Pipeline v9.0

This entry is part 9 of 34 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

Rethinking Your Measurement and Metrics for Agile and DevOps

In this short piece, Michael Sowers challenges the readers to consider updating their telemetry based on organizational change. In particular, start with the following criteria: (1) providing teams with quick feedback on how the quality of the project, product, and user stories is progressing; (2) understanding how the teams are progressing and what the roadblocks are; (3) knowing how effective and efficient the teams’ processes are; and, (4) understanding resource consumption, both human and computer.

Given-When-Then With Style

Gojko Adzic has partnered with Specflow for a series of articles to help people get the most out of Gherkin with some tips and tricks. Each week he will post a challenge for readers to answer about a particular example of Gherkin. In the first challenge, the reader must try to explain a missing value (A “Given” for a value that’s not supposed to be there).

What’s New in Selenium 4?

Selenium is a set of tools used in support of automation. In this article, Manoj walks us through several of the changes coming to Selenium. Relative locators will be a welcome update, as will installing / uninstalling add-ons for Firefox at runtime. The biggest may very well be the ability to use Docker to spin up containers. Anyone interested in checking out the changes can go to Selenium.dev for more details.

Is There Such a Thing As Too Much Testing?

Bas Dijkstra posts again about the costs and misconceptions around test automation, namely that it is treated as the end goal rather than a means to an end. Investing in test automation has associated long-term maintenance costs, and it shouldn’t be considered the only type of validation performed by a team. Great advice in this piece on scaling automation.

Balance as an Important Part of Website Testing

In this article, Nataliia Syvynska explains two types of balance in web design: symmetrical balance and asymmetrical balance. In symmetrical balance, elements are equally disposed on either side of the center (vertically and horizontally). Asymmetrical balance focuses on one particular object with several elements. The article raises an interesting question about validations from a UI/UX perspective of how the user interacts with the system in a “pleasing” fashion.

Book Club: The Phoenix Project (Chapters 8-12)

This entry is part 3 of 8 in the series Phoenix Project

The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.

The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.

Chapters 4-7 HERE

Background on the Phoenix Project

“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.

With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.

In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”

The Phoenix Project

Chapter 8

Bill spends all weekend working on a PowerPoint slide deck for his meeting with Steve.

When Bill arrives at Steve’s office, he must wait while Sarah & Steve wrap up a call with analysts about the Phoenix project.

Sarah relays that the industry analysts are excited about Phoenix now, too. Bill wonders if they are overpromising. By the time Sarah leaves Steve’s office, she has taken up nearly half of the time that Bill has scheduled with Steve.

Bill explains to Steve that IT is stretched dangerously thin. He says that too many different projects are competing for attention, and that the new audit project will affect the resources that are supposed to be dedicated to Phoenix. He states that he would like to know the relative priority of the audit work compared to the Phoenix work.

“We’ve started to inventory everything we’re being asked to do, regardless of how big or small. Based on the analysis so far, it’s clear to me that the demand for IT work far exceeds our ability to deliver. I’ve asked them to make more visible what the pipeline of work looks like, so we can make more informed decisions about who should be working on what and when.”

Bill Palmer

“What kind of bullshit prioritization question is this? If I went to my board and told them that I need to do either sales or marketing, and asked them which of those I should do, I’d be laughed out of the room. I need to do both, just like you need to do both! Life is tough. Phoenix is the top company priority, but that doesn’t mean you get to hold the SOX-404 audit hostage.”

Steve Masters

Bill tries to reason with Steve, and tells him that Phoenix and compliance share key resources, the infrastructure is too fragile and breaks often, and that some compliance work should be put on hold if Phoenix truly is the top priority.

Steve replies that delaying the audit work is out of the question, and that there is no way they can hire any more people. Any raises to the budget are out of the question, and it seems like Bill’s team is more likely to lose people rather than be able to hire new ones.

“My suggestion to you? Go to your peers and make your case to them. If your case is really valid, they should be willing to transfer some of their budget to you. But let me be clear: Any budget increases are out of the question. If anything, we may have to cut some heads in your area.”

Steve Masters

Bill tosses the presentation he worked on all weekend into the recycling bin as he leaves.

Bill then goes to the continuation of the CAB meeting. He is blown away by how many change cards are in the room, which is covered in whiteboards. He discovers that 437 change requests have been submitted for the week.

“Let’s go back to our goals: get the left and right hands to know what the other is doing, give us some situational awareness during outages, and give audit some evidence that we’re addressing change control.”

“‘We need to focus on the riskiest changes,’ I continue. ‘The 80/20 rule likely applies here: Twenty percent of the changes pose eighty percent of the risk.'”

Bill Palmer

The team works on splitting up the cards into two groups: a risky group and a routine change group.

The group also decides to share the changes with the business, along with data on how risky each change will be.

“We need to create some standard procedures around these changes—like when we’ll want them implemented—and have key resources not only aware of them but also standing by, just in case things go wrong—even the vendors.”

Patty

“There’s no reason why all the responsibility should rest on our shoulders. We can send an e-mail out to the business ahead of time and ask when the best implementation time would be. If we can give them data on the outcomes of previous changes, they may even withdraw the change.”

Bill

As the meeting concludes, the group feels positive about the change management work that they are doing. On the negative side, the amount of manual work the process is taking is too high, and the group agrees that it will need to be automated sooner or later.

Chapter 9

Bill sits in a high-level budget meeting with leadership (which he calls “the most ruthless budget meeting I’ve ever attended”) when he gets a text that there is a Sev 1 incident where all of the credit card processing systems are down. He is forced to leave the meeting even though he knows that he won’t have a chance to fight for his budget.

When he gets to the call with Patty and Wes, he is informed that the order entry systems are down, and the team is trying to establish what has changed.

Patty asks what the day’s changes were, but the conversation quickly spirals into defensiveness from each manager and finger pointing.

Bill chooses not to intervene in the conversation, and instead opts to simply sit back and observe the chaos.

Suddenly, someone on the phone speaks up and says, “try it now”. Bill tells everyone to hold it and discovers that the voice on the phone is Brent. Shortly after, someone states that the issue has been fixed.

Bill wraps up the call and calls Wes and Patty to meet privately. He tells Patty that she is in charge of presenting a timeline of all changes during incidents. He also says they will do a fire drill every 2 weeks to practice managing incidents.

Bill asks Wes to impress upon Brent that everyone must discuss their fixes during emergencies rather than just implementing them on their own.

Bill says that his guess is that Brent caused the outage on his own and then rushed to undo the change.

“I want you to host practice incident calls and fire drills every two weeks. We need to get everyone used to solving problems in a methodical way and to have the timeline available before we go into that meeting. If we can’t do this during a prearranged drill, how can we expect people to do it during an emergency?”

Bill

Moving forward, Bill and Wes spend nearly all their time in the Phoenix war room. The deployment is only three days away, and things are looking worse and worse.

The group has another CAB meeting, where everything has been organized. The group starts to review all high and medium risk changes.

Things are going very well, but Patty shows the group that they have 173 changes going in on Friday alone. The timeline is adjusted, and some members move their changes up in the week.

“‘If I were air traffic control,’ she continues, ‘I’d say that the airspace is dangerously overcrowded. Anyone willing to change their flight plans?'”

Patty

Bill begins thinking to himself about what Erik told him. He names three types of work: business projects, IT projects, and changes.

“Sure, each of these changes is much smaller than an entire project, but it’s still work. But what is the relationship between changes and projects? Are they equally important? And can it really be that before today, none of these changes were being tracked somewhere, in some sort of system? For that matter, where did all these changes come from? If changes are a type of work different than projects, does that mean that we’re actually doing more than just the hundred projects? How many of these changes are to support one of the hundred projects? If it’s not supporting one of those, should we really be working on it? If we had exactly the amount of resources to take on all our project work, does this mean we might not have enough cycles to implement all these changes?”

Bill

Chapter 10

The chapter starts in the Phoenix war room. William Mason, director of QA, informs the group that they are finding twice as many broken features as are getting fixed.

The group discovers that Brent is a bottleneck for many tasks.

Bill goes to Brent’s desk. When he arrives, Brent is on the phone and Bill observes him for a minute.

“I appreciate how Brent seems to genuinely care that everyone relying on IT systems can get their work done, but I’m dismayed that everyone seems to be using him as their free, personal Geek Squad. At the expense of Phoenix.”

Bill

Bill asks Brent how many calls he gets a day, and if he logs them anywhere. Brent says he does not log anything because it takes too long.

Brent says that his previous phone call was with the VP of Logistics, and Bill is angry that executives are strong-arming Brent into completing tasks.

Bill tells Brent that from now on his only priority is Phoenix. Bill leaves Brent and calls Patty and Wes to a meeting about how to handle escalations.

“‘Processes are supposed to protect people. We need to figure out how to protect Brent,’ I say. I then describe how I already told Brent to send everyone wanting anything to Wes.”

Bill

Patty suggests that Brent may be reluctant to give up his knowledge because he may view it as power. Bill responds, “Maybe. Maybe not. I’ll tell you what I do know, though. Every time that we let Brent fix something that none of us can replicate, Brent gets a little smarter, and the entire system gets dumber. We’ve got to put an end to that.”

Bill says that under the new system, everyone needs approval before talking to Brent, and everyone must document what they learn.

Bill states that, to make sure everyone follows the new processes, they will send the engineers to whichever conference they want. They will also give Brent a week off work with no on-call responsibilities.

Chapter 11

The chapter opens with Patty calling Bill during his lunch because she wants him to check out something weird on the change calendar.

“I’m starting to think this entire change process is a total waste of time. Organizing all these changes and managing all the stakeholder communication is taking up three people full-time. Based on what I’m seeing now, it may be useless.”

Patty

Patty tells Bill that over the last week about 60% of scheduled changes have not actually been implemented.

She says they haven’t been implemented for several reasons: personnel, configuration work that wasn’t completed, and the need for Brent.

“Somehow, just like we’re breaking the habits of people asking Brent to help with break-fix work, we need to do the same with change implementation. We’ve got to get all this knowledge into the hands of people actually doing the work. If they can’t grok it, then maybe we have a skills problem in those teams.”

Bill

Bill thinks back to his conversation with Erik about WIP. Erik called WIP the silent killer. Erik had pointed to an ever-growing mountain of work on the plant floor as an indication that floor managers had failed to control their work in process.

Patty states that they will soon pass over 1,000 changes tracked. She wonders why they are doing the tracking work when the changes aren’t ever being implemented.

Bill is starting to believe that Erik was right and there really is a link between plant floor management and IT Operations.

He says that he believes that reversing the process change and allowing change work to go to Brent is the exact wrong thing to do. He also states that this process is worth it because they are now aware of how much scheduled work isn’t getting done, and that they now have “situational awareness”.

Chapter 12

“It’s not a good sign when they’re still attaching parts to the space shuttle at liftoff time.”

Bill

The Phoenix deployment was scheduled to start at 5:30 PM Friday, but as of 7:30 it still has not started because Chris’s team is still making changes. Phoenix is not available in the test environment and is still failing critical tests.

There are multiple issues, including the app only running on one developer’s machine and an unopened network port that is preventing the front end from talking to the back end.

Bill calls Wes, Patty and William into his office to talk. Wes says the team is still missing critical files and they are unable to configure the test environment correctly.

William says that his QA team is unable to keep up with all the code changes being made, and that his bet would be that Phoenix will blow up in production. He wants to stop the release but Chris and Sarah won’t allow it.

William doesn’t think they will have anything up by 8 AM the next day (when the stores open).

Wes tells Bill that they still have not reached the point of no return. That point will be when the team starts converting databases to interact with Phoenix and POS systems.

Bill tries to delay the deployment by emailing Steve, Chris and Sarah. He then calls Steve. He explains that he cannot overstate how badly the release has gone so far, and that it is not too late to stop this “train wreck”. He says that failure will jeopardize order data and customer records.

Steve explains that they don’t have a choice but to keep moving ahead. They have already bought ads for that weekend’s newspapers and their partners are ready to go.

Bill asks Steve how bad things have to be to delay the rollout. Steve says that if he can convince Sarah, then he will consider it.

Bill pulls Sarah aside to talk in the hallway. He asks her how it seems things are going from her point of view. She responds, “You know how these things go when we’re trying to be nimble, right? There’s always unforeseen things when it comes to technology. If you want to make omelets, you’ve got to be willing to break some eggs.”

Bill tells Sarah the same things that he told Steve, but she is unconvinced. She says that everyone is ready but Bill, and that they need to keep going. Wes taps Bill on the shoulder and tells him there is a problem.

“Remember when we hit the point of no return around 9 p.m.? I’ve been tracking the progress of the Phoenix database conversion, and it’s thousands of times slower than we thought it would be. It was supposed to complete hours ago, but it’s only ten percent complete. That means all the data won’t be converted until Tuesday. We are totally screwed.”

Wes

Wes says that performance is terrible, and even Brent can’t fix the problem. He also says that they cannot use virtualization to fix their server problems because development blamed the performance problems on the virtualization.

“The morning light is starting to stream in from the windows, showing the accumulated mess of coffee cups, papers, and all sorts of other debris. In the corner, a developer is asleep under some chairs.”

Bill

Maggie, the Senior Director of Retail Program Management, is kicking off the 7 AM emergency meeting. She says that all the in-store POS systems will be down because of the database issue. The good news is the Phoenix site is up and running.

“We need to get proactive here,” I say to Sarah. “We need to send out a summary to everyone in the stores, as quickly as possible outlining what’s happened and more specific instructions on how to conduct operations without the POS systems.”

Bill

At 2 PM Saturday, Bill says the bottom is further down than he thought. All transactions are being processed manually. Customers on the website are complaining that it is slow and unusable.

Bill finally leaves to catch a few hours of sleep while Wes stays behind to look over everything.

Wes calls Bill at 4:30 and says, “Bad news. In short, it’s all over Twitter that the Phoenix website is leaking customer credit card numbers. They’re even posting screenshots. Apparently, when you empty your shopping cart, the session crashes and displays the credit card number of the last successful order.”

Slaying the Hydra: Run-Time State and Splitting Up the Execution

This entry is part 3 of 5 in the series Slaying the Hydra

In this third post of the blog series on parallel test execution, I explain how to execute distributed parallel test automation. The previous entry can be found here.

As discussed previously, the running stage (see below) within the pipeline context is set to execute three builds of the test_runner freestyle job in parallel. Each build receives the following parameters:

  • browser – either equal to ‘ie’ or ‘chrome’
  • total_number_of_builds – equal to ‘3’
  • build_number – equal to ‘1’, ‘2’ or ‘3’

Freestyle Job Overview

In the following sections, I explain which freestyle components are needed when constructing the test_runner job in Jenkins.

Parameters

As seen from the image above, parameters are being passed from the pipeline job into the freestyle job. We will update the freestyle job to be parameterized. This selection is made when configuring the Jenkins job (see below).

Next the freestyle job is configured with these parameter names:

  • browser –  the value received from the pipeline parameter value.
  • total_number_of_builds –  the value received from the pipeline parameter value.
  • build_number – the value received from the pipeline parameter value.
  • workspace_location – to show a different way of doing things, we can see from the image above that I did not pass a value for workspace location in the pipeline. When I configured the parameter (below), I set a default value in the freestyle job. This default value will be linked to the workspace_location parameter now unless I otherwise specify.

Node Selection

In this section, we restrict where this build can execute to machines associated with the @local tag. This setting is located in the Manage Jenkins > Manage Nodes section of Jenkins. It lets us avoid nodes that are already in use or that are not configured to run the Cucumber tests in the steps below.

Version Control

In the Source Code Management section, we specify what testing suite to retrieve via version control and utilize for this effort, which will pull the suite down within the workspace. The “clean before checkout” additional behavior (Jenkins functionality) will remove any files in the workspace that are not in the Git repo before pulling the suite down. This allows for a clean slate for every execution.

Splitting Code

class Splitter
  def total_builds
    ENV['total_number_of_builds'].to_i
  end

  def build_number
    ENV['build_number'].to_i
  end

  def main_run
    scenarios = feature_iterator
    splits = job_splitter(scenarios)
    assignment = job_assigner(splits)
    feature_mod_iterator(assignment, 'features', true)
  end

  def feature_mod_iterator(split_assignment, current_location = 'features', assign = true)
    array = []
    split_assignment.each do |value|
      mod_value = value.gsub('@regression', '@split_builds')
      # Escape the scenario text so characters such as ( or ? match literally.
      regex = /#{Regexp.escape(value)}$/
      files = return_all_files(current_location, '*', 'feature')
      files.each do |file|
        output = File.open(file, 'r', &:read)
        modified = output.gsub(regex, mod_value)
        if assign
          File.open(file, 'w+') { |f| f.print(modified) }
        else
          array.push(modified)
        end
      end
    end
    array
  end

  def feature_iterator(current_location = 'features')
    files = return_all_files(current_location, '*', 'feature')
    array = []
    files.each do |file|
      array.push(return_all_gherkin_scenarios(file))
    end
    array.flatten
  end

  def return_all_gherkin_scenarios(file)
    output = File.open(file, 'r', &:read)
    output.scan(/(@regression.*\n. (Scenario:|Scenario Outline:)?.*)/).map { |value| value[0] }
  end

  def return_all_files(current_location, filter = '*', file_type = '*')
    Dir.glob("#{current_location}/**/#{filter}.#{file_type}")
  end

  def job_splitter(scenarios)
    split = scenarios.length.to_i / total_builds.to_i

    container = []
    total_builds.times { container.push([]) }
    mod_scenarios = scenarios.clone

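    total_builds.times do |index|
      # shift(split) removes and returns the first `split` scenarios. Unlike
      # mod_scenarios[0..(split - 1)], it returns [] when split is 0, so no
      # build receives the whole list when there are fewer scenarios than builds.
      container[index].concat(mod_scenarios.shift(split))
    end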

    mod_scenarios.each_with_index do |value, index|
      container[index].push(value)
    end
    container
  end

  def job_assigner(scenarios)
    scenarios[(build_number.to_i - 1)]
  end
end

one = Splitter.new
one.main_run

At a high level, the code block above creates an array of arrays that splits the regression tests evenly between the number of executors. The build_number value is used to access the corresponding index of that array. All of the tests at that index are re-tagged from @regression to @split_builds locally on the workspace that houses the Ruby/Cucumber code pulled down from version control.

You would have to change the @regression tag to whatever you are utilizing to tag your tests as regression on your team.

The cool thing is that this will run on each of the three workspaces and re-tag a unique subset of tests. Because the total_builds value is the same for all the jobs kicked off, it will create the same nested array structure on every workspace. The difference between workspaces comes about because of the build_number parameter that chooses which subset of tests to re-tag.
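To make the split concrete, here is a minimal sketch of the same idea outside of the Splitter class. The split_evenly helper and the s1 through s7 scenario names are hypothetical stand-ins; in the real job, the entries would be the tagged scenario strings collected by feature_iterator.

```ruby
# Hypothetical helper (not part of the Splitter class above) that mirrors
# the even split plus leftover hand-out performed by job_splitter.
def split_evenly(scenarios, total_builds)
  base = scenarios.length / total_builds
  # Each build first receives `base` scenarios, taken in order.
  container = Array.new(total_builds) { |i| scenarios[i * base, base] }
  # Leftover scenarios are handed out one per build, starting with the first.
  leftovers = scenarios.drop(base * total_builds)
  leftovers.each_with_index { |scenario, i| container[i].push(scenario) }
  container
end

scenarios = %w[s1 s2 s3 s4 s5 s6 s7]
splits = split_evenly(scenarios, 3)
splits.each_with_index do |subset, i|
  puts "build_number #{i + 1}: #{subset.join(', ')}"
end
```

With seven scenarios and three builds, every workspace computes the same three subsets, and each one picks the subset at index build_number - 1.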

Running the Split Code

We should house the code above within our testing framework in version control. Within the Build section of Jenkins, we then create a Windows batch command. Next, we set the environment variables that the code reads, total_number_of_builds and build_number, to the parameters set within the freestyle job. We can now run the ruby command, passing the path to the .rb file that houses the code above within the workspace.
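Because the split depends entirely on those two environment variables, it can help to fail fast when one is missing or out of range. The following is only a sketch under that assumption; read_split_params is a hypothetical helper and is not part of the original Splitter class.

```ruby
# Hypothetical guard: reads and validates the two values the splitter needs.
# `env` defaults to ENV, but any hash-like object with string keys works.
def read_split_params(env = ENV)
  # fetch raises KeyError when a variable is missing;
  # Integer() raises ArgumentError when a value is not a whole number.
  total = Integer(env.fetch('total_number_of_builds'))
  build = Integer(env.fetch('build_number'))
  raise ArgumentError, 'total_number_of_builds must be at least 1' if total < 1
  unless (1..total).cover?(build)
    raise ArgumentError, "build_number must be between 1 and #{total}"
  end
  { total_builds: total, build_number: build }
end

# Usage with a plain hash standing in for ENV:
p read_split_params({ 'total_number_of_builds' => '3', 'build_number' => '2' })
```

Calling a guard like this at the top of the script would stop a misconfigured build before any feature file is re-tagged.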

Running the Tests

We set up another Windows batch command to set environment variables for browser and or_tags, and in this instance, we kick off the tests utilizing a rake task. Cucumber Rake is a useful tool, but we could just as easily run a Cucumber command.

The important thing is that we pass the tag that was modified locally on each workspace (@split_builds) so that only the tests assigned to that workspace run. Additionally, we pass along the browser variable that was set within the pipeline and handed to the freestyle job.
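As a rough sketch of the plain Cucumber command alternative, the helper below assembles the environment and arguments for a run limited to the re-tagged scenarios. The cucumber_invocation name, the results.json path, and the use of a browser environment variable are assumptions for illustration; the original job's rake task is not shown here.

```ruby
# Hypothetical helper: builds the environment and argument list for a
# cucumber run that only executes scenarios carrying the given tag.
def cucumber_invocation(browser:, tag: '@split_builds', json_path: 'results.json')
  env = { 'browser' => browser }
  args = ['cucumber', '--tags', tag, '--format', 'json', '--out', json_path]
  [env, args]
end

env, args = cucumber_invocation(browser: 'chrome')
puts "browser=#{env['browser']} #{args.join(' ')}"
# To launch the run for real (requires the cucumber gem): system(env, *args)
```

Keeping the command as an array avoids shell quoting issues when it is passed to system.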

Storing Results

In our last batch command, we extract the JSON test results file and store it in the workspace_location as a JSON file named with the build_number value (either 1, 2, or 3). This workspace location is the same one we utilized in the clearing stage and the one that will be utilized in the consolidation stage.
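The same storing step could also be written in Ruby rather than batch. This is a minimal sketch, assuming the results were written to a single JSON file; store_results and the temporary paths used below are hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: copies a JSON results file into the shared location,
# naming it after the build number (for example 2.json).
def store_results(results_file, workspace_location, build_number)
  FileUtils.mkdir_p(workspace_location)
  destination = File.join(workspace_location, "#{build_number}.json")
  FileUtils.cp(results_file, destination)
  destination
end

# Usage against a temporary directory so nothing real is touched:
Dir.mktmpdir do |dir|
  source = File.join(dir, 'results.json')
  File.write(source, '[]')
  puts store_results(source, File.join(dir, 'shared'), 2)
end
```

Naming each file after its build number keeps the three builds from overwriting each other's results.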

Review and Next Steps

To review, in this post, we figured out how to build the freestyle job that is responsible for splitting, executing, and storing the results of our tests.

In the next post, we discuss how to consolidate the information from the freestyle job builds into a concise cucumber report.

From the Pipeline v8.0

This entry is part 8 of 34 in the series From the Pipeline

The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. 

From Test Management to Continuous Delivery

Seb Rose and Dana Prey recently hosted a webinar on Cucumber.io (now a SmartBear tool) about the evolution of testing to support continuous delivery. “This webinar will define Test Management and Continuous Delivery and go on to explore typical challenges you’ll encounter on your journey towards CD. We’ll describe small steps that you can use to mitigate the risks of changing the way you work, and the value that can be released from the start.”

Information Loss in Software Testing

Matt Heusser describes the level of information loss about a project or product as it moves up through the chain of command, as well as the negative aspects of controlling information about an application for your personal benefit (job security). He provides several alternatives to conveying information such as coverage maps and dashboards to help contain organizational information loss.

Clear, Direct Communication: An Experiment

Kent Beck posts a personal piece about communicating with others throughout his life. The piece is an important introspection from someone who is professionally successful and considered a luminary in our field, yet still struggles with interpersonal connections.

Fighting Against Technical Debt

Cukenfest was held virtually this past week. While the videos are not posted yet, Gaspar Nagy has posted his presentation to SlideShare. His talk about technical debt is distilled into three focus areas: Reversibility, Reaction, and Sustainability.

DevOps Journey Playbook

The DevOps Institute have gathered lots of great background information on aspects of DevOps into a single location as a series of playbooks. “Playbooks are a collaborative body of knowledge of research, knowledge and artifacts to help you understand and SKILup your DevOps capabilities. A playbook is populated with twelve research chapter reports plus additional content for ongoing discovery and support during your DevOps journey. We continuously update the playbook with regional and global perspectives for actionable strategies and implementations.”