This is the final post in a series on branching from Martin Fowler. The series has been an amazing journey to follow. In the wrap-up, Martin reminds us “branching is easy, merging is harder”. He provides us with a summary of his recommended rules to follow with branching and merging.
A great post from Tanner about establishing a shared understanding on the team about estimates by using metaphors to bring everyone on board. The key point made in this article is: “Estimates are about mathematics. Expectations are about human connection. That difference matters.”
A nice introductory article for those in the testing space wanting to learn about equivalence partitioning and boundary value analysis. “Equivalence partitioning and boundary value analysis are two specification-based techniques that are useful in black box testing. This article defines each of these techniques and describes, with examples, how you can use them together to create better test cases. You can save time and reduce the number of test cases required to effectively test inputs, outputs, and values.”
Hypothesis Driven Development is about changing the mindset of software development from a set of fixed features to experimentation. Every project becomes an experiment that tests a hypothesis about the system – meaning we can refute the hypothesis and roll back the changes or update our hypothesis and alter our approach.
Pavan Belagatti gives an excellent rundown of DevOps practices an organization needs to adopt to be successful. Some of those practices are building a strong culture of learning, automation wherever possible, and adopting cloud computing.
The following is a chapter summary for “The Phoenix Project” by Gene Kim for an online book club.
The book club is a weekly lunchtime meeting of technology professionals. As a group, the book club selects, reads, and discusses books related to our profession. Participants are uplifted via group discussion of foundational principles & novel innovations. Attendees do not need to read the book to participate.
Background on the Phoenix Project
“Bill, an IT manager at Parts Unlimited, has been tasked with taking on a project critical to the future of the business, code named Phoenix Project. But the project is massively over budget and behind schedule. The CEO demands Bill must fix the mess in ninety days or else Bill’s entire department will be outsourced.
With the help of a prospective board member and his mysterious philosophy of The Three Ways, Bill starts to see that IT work has more in common with manufacturing plant work than he ever imagined. With the clock ticking, Bill must organize work flow, streamline interdepartmental communications, and effectively serve the other business functions at Parts Unlimited.
In a fast-paced and entertaining style, three luminaries of the DevOps movement deliver a story that anyone who works in IT will recognize. Readers will not only learn how to improve their own IT organizations, they’ll never view IT the same way again.”
Bill Palmer is the Director of Midrange Technology Operations for Parts Unlimited, a $4 billion per year manufacturing and retail company.
Parts Unlimited’s largest retailing competitor offers better customer service and a new feature that allows people to customize their cars with their friends online.
Bill is frustrated because their competition outperforms Parts Unlimited. His group is expected to deliver more with less year after year.
Bill is invited to meet with Steve Masters, the CEO of Parts Unlimited. He is informed that Luke (CIO) and Damon (VP of IT Operations) were let go and Bill is now VP of IT Operations.
“CIO stands for ‘Career Is Over’”
IT will temporarily report to Steve until a new CIO is hired.
Steve tells Bill the goal of the company is to regain profitability to increase the market share and average order sizes. At present, the competitors for Parts Unlimited are beating them.
Steve believes “Project Phoenix” is essential to company success. The project is years late on delivering. If the company does not turn things around, the shareholders are likely to split up the company, costing the jobs of four thousand employees.
Chris Allers will be interim CIO. Chris is presently the VP of Application Development. Both Chris and Bill will report directly to Steve.
Bill is reluctant to take the position but Steve convinces him.
“What I want is for IT to keep the lights on. It should be like using the toilet. I use the toilet, and hell, I don’t ever worry about it not working. What I don’t want is to have the toilets back up and flood the entire building.”
Bill is informed by Steve that the “payroll run is failing”. This is his first task as failure to make payroll means many factory workers would be affected, potentially getting the company into trouble with the Union.
Bill moves to address the payroll issue by first meeting with Dick Landry, CFO.
“In yesterday’s payroll run, all of the records for the hourly employees went missing. We’re pretty sure it’s an IT issue. This screwup is preventing us from paying our employees, violating countless state labor laws, and, no doubt, the union is going to scream bloody murder.”
Bill & Dick go to meet the Operations Manager Ann to get more situational awareness about the problem. The general ledger upload for hourly employees didn’t go through and all the hourlies are zero. The salaried employees’ numbers are OK.
“To get Finance the data they need, we may have to cobble together some custom reports, which means bringing in the application developers or database people. But that’s like throwing gasoline on the fire. Developers are even worse than networking people. Show me a developer who isn’t crashing production systems, and I’ll show you one who can’t fog a mirror.”
As Bill returns to the IT building, he realizes how run down it is compared to the building that Leadership & Financing work in. Bill heads to the Network Operations Center (NOC) to meet Wes and Patty.
Wes is the Director of Distributed Technology Operations. He is responsible for the Windows server, database, and networking teams. Wes is loud, outspoken, and shoots from the hip.
Patty is the Director of IT Service Support. She owns all the level 1 and 2 help desk technicians. She also owns the trouble ticketing system, monitoring, and running the change management meetings. Patty is thoughtful, analytical, and a stickler for processes and procedures.
IT was in the middle of a Storage Area Network (SAN) firmware upgrade when the payroll run failed. They tried to back out the changes but ended up bricking it instead.
Chapter 2 is the first introduction of Brent, the engineer in the middle of many important IT projects. By having Brent tackle this Sev 1 issue, he is not working on project Phoenix. The team decides to visit Brent to learn more about the payroll issue.
Bill, Wes, and Patty go to meet Brent about the payroll issue.
“I was helping one of the SAN engineers perform the firmware upgrade after everybody went home. It took way longer than we thought—nothing went according to the tech note. It got pretty hairy, but we finally finished around seven o’clock.”
“We rebooted the SAN, but then all the self-tests started failing. We worked it for about fifteen minutes, trying to figure out what went wrong. That’s when we got the e-mails about the payroll run failing. That’s when I said, ‘Game Over.’”
The team gets an update from Ann. The last pay period was fine but for the new pay period all the data is messed up. The Social Security numbers for the factory hourlies are complete gibberish.
Since only one field is corrupted, the team deduces it’s not a SAN failure. They find out on the conference call for the incident that a developer was also installing a security application the same time the SAN firmware was being upgraded.
The security software change was requested by John Pesche, the Chief Information Security Officer.
“The only thing more dangerous than a developer is a developer conspiring with Security. The two working together gives us means, motive, and opportunity.”
Information Security at Parts Unlimited often makes urgent demands, so the development teams don’t invite them to many meetings. The InfoSec team does not follow the change management process, and that always causes problems.
John reveals that Luke and Damon were perhaps fired over a compliance audit finding from security.
InfoSec had an urgent audit issue around storage of PII (personally identifiable information such as Social Security numbers, birthdays, etc.). They found a product that tokenized the information so the SSNs were no longer stored.
“‘Let me see if I’ve got this right…’ I say slowly. ‘You deployed this tokenization application to fix an audit finding, which caused the payroll run failure, which has Dick and Steve climbing the walls?’”
John made the changes because the next window for the change to be deployed was in four months and auditors would be on-site in one week. John never tested the change because there’s no test environment.
Bill requests a list of all the changes made in the past three days so they can examine the timeline and establish cause & effect. Bill finds out few people use the change management system to make requests.
The Change Advisory Board (CAB) is not well attended. Teams will make changes without approval or notice because of deadline pressures. Bill asks Patty to send out a meeting notice to all the tech leads and announce attendance is mandatory.
After review of the 27 changes in the past three days, only the InfoSec tokenization change and the SAN upgrade could be linked to payroll failure.
The applications were eventually brought online but the company had to submit payroll using the prior pay period. The local newspaper reports on the payroll failure after the Union complains.
Doc Norton addresses the “no deadlines” philosophy in Agile. Deadlines happen often and teams are subject to external deadlines such as regulatory changes. Using repetition he explains three aspects of high-functioning teams.
Barry O’Reilly uses the Chernobyl disaster as a backdrop to discuss how we perform failure analysis. There are four factors to consider for failure causes: (1) KPIs drive behavior, (2) the flow of information, (3) value guides behavior, and (4) limited resources put pressure on behavior.
In the previous article, Cukes and Apples: App Automation with Ruby and Appium, I demonstrated a functional workspace setup for mobile automation by using Appium to launch the Google Play Store app while running a Ruby/Cucumber test suite. The implementation described in that post is enough to prove that the workspace is capable of mobile automation, but not yet a functional test suite.
This post will cover the implementation of a working Cucumber test suite. This means launching and closing the app in our tests, writing step definitions that can interact with the app, and designing support code to make tests easier to write.
Since the last post, I’ve acquired a new .apk file and updated my capabilities to install it, as shown in the screenshot below. I’m using the Walmart app – it seemed a better test candidate than the Google Play Store.
The “app” capability is used to reinstall the app when the driver is launched. There are fair arguments that execution speed could be improved by leaving the app installed, but my experience has taught me not to deliberately manage app state – it’s easier to start fresh every time.
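The capabilities screenshot is not reproduced here. As a rough sketch only, the options hash handed to the driver might look something like the following. The device name, the .apk path, and the server URL are placeholders I have assumed, not the values used in the original post.

```ruby
# Hypothetical capabilities for an Android app under test.
# Every value below is a placeholder, not taken from the original post.
APK_PATH = File.join(Dir.pwd, 'apps', 'example.apk') # assumed location of the .apk

CAPABILITIES = {
  caps: {
    platformName: 'Android',
    deviceName: 'Android Emulator', # assumed device name
    app: APK_PATH                   # the "app" capability: installs the app at launch
  },
  appium_lib: {
    server_url: 'http://127.0.0.1:4723/wd/hub' # assumed local Appium server
  }
}.freeze

puts CAPABILITIES[:caps][:app]
```

Keeping the path in one constant makes it easy to point the suite at a different build of the app.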
We need to launch the app at the beginning of every test, and close it at the end. To do this, we can start by moving the sample code from env.rb to a Cucumber hook.
Create a file called hooks.rb in the features/support directory. In that file, create a Before hook and move the sample code into it, as shown in the screenshot below.
Notice how the begin/rescue construct has changed. Cucumber hooks fail quietly, so it is helpful to rescue and print any exceptions that are raised; for that reason, the begin/rescue now encompasses all the driver code.
The remaining code in the env.rb file is short and sweet:
One more hook will ensure the driver is terminated at the end of every test – the After hook. Use a begin/rescue block and call the quit_driver method as shown below.
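The hook screenshots are not reproduced here. The sketch below shows the rescue-and-print idea in plain Ruby so that it runs without a device or the Cucumber gem. `FakeDriver`, `launch_app`, and `close_app` are names I have made up, and re-raising after printing is my own choice rather than something the post specifies. In a real suite, the bodies of these two methods would sit inside Cucumber's `Before` and `After` blocks in hooks.rb.

```ruby
# Stand-in for the Appium driver so the sketch runs anywhere (hypothetical).
class FakeDriver
  attr_reader :running

  def start_driver
    @running = true
  end

  def quit_driver
    @running = false
  end
end

# What a Before hook could do: start the driver, printing any exception
# before passing it on so the failure is visible in the console.
def launch_app(driver)
  driver.start_driver
  driver
rescue StandardError => e
  puts "Before hook failed: #{e.class}: #{e.message}"
  raise
end

# What an After hook could do: always try to quit, and report any problem.
def close_app(driver)
  driver.quit_driver
rescue StandardError => e
  puts "After hook failed: #{e.class}: #{e.message}"
end

driver = launch_app(FakeDriver.new)
puts driver.running # true while the "app" is open
close_app(driver)
puts driver.running # false once the driver has quit
```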
A Simple Scenario
It won’t be possible to observe the Before and After hooks in action until we have a Cucumber scenario to execute. A very simple scenario will help us to test the driver action in those hooks, practice the language we want to use in our tests, and prove that this application can be automated.
Create a feature file under the features directory. I recommend organizing feature files in a single directory under features, like the “gherkin” directory in the screenshot below.
I created a feature called Welcome to describe the experience of launching the app for the first time.
The step definitions that drive this scenario are very simple – each call upon the driver that was created in the Before hook to find an element on the screen, then two of the steps use that element to test a value or perform an action. These step definitions are intentionally crude, and will be improved.
Execute The Scenario
If you are following along with a similar implementation, it should now be possible to execute the scenario to test your work. Check out the following videos to see the automation in action.
Create Page Objects
One of the problems in our step definitions is a lack of clarity. Direct references to the driver (like uses of @driver in the steps screenshot above) generally hurt readability because the logic of an Appium driver is not like the logic of the application under test. Directly referencing the driver will create step definitions that are too technical to understand at a glance.
The Page Object pattern is a fine solution for improving the clarity of application logic in code, and would make a great improvement to our test suite. Implementing the pattern involves creating constructs in code to represent pages of an application, and then imbuing those constructs with data and logic that describe the pages.
The scenario implemented begins with a step that validates the display of the Welcome page by searching for a title element. The code is very simple – it finds an element, but the logic is obscure. How does the reader know that it validates the display of the page? If an object represented the Welcome page, and this step simply asked that object if the page is visible, then the intent of this step definition would be perfectly clear. Such an object, implemented with a Ruby class, might look like this:
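A minimal sketch of such a class follows, assuming a Selenium-style `find_element(:id, ...)` call and a made-up locator. `FakeDriver` and `FakeElement` are hypothetical stand-ins that exist only so the example runs without a device.

```ruby
# Minimal stand-ins for the driver and element so the sketch runs on its own.
FakeElement = Struct.new(:shown) do
  def displayed?
    shown
  end
end

class FakeDriver
  def initialize(elements)
    @elements = elements
  end

  # Mirrors the find_element(how, what) call shape of a Selenium-style driver.
  def find_element(how, what)
    @elements.fetch([how, what])
  end
end

# Page object for the Welcome page.
class WelcomePage
  TITLE_ID = 'welcome_title'.freeze # hypothetical locator

  def initialize(driver)
    @driver = driver
  end

  # True when the title element is on screen.
  def visible?
    @driver.find_element(:id, TITLE_ID).displayed?
  end
end

driver = FakeDriver.new({ [:id, 'welcome_title'] => FakeElement.new(true) })
puts WelcomePage.new(driver).visible?
```

With the real driver in place of the stand-in, a step only has to ask the page whether it is visible; the locator details stay inside the class.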
By moving validation of page visibility into a method of a page object, we make the code reusable and get a descriptive method name for the check.
The step definition which calls upon this page object is now much easier to understand:
To make this work, some additional environment setup is necessary. In the env.rb file, require the new page class like in the screenshot below.
For the following steps, we can make similar changes. Take a look at these updated step definitions and consider whether they are now easier to understand:
Advanced Page Objects
Using the Page Object pattern creates a risk of generating lots of boilerplate code – for example, the initialize method from the Welcome page above would be reproduced in every other page object in the suite. A conscientious developer will quickly begin seeking optimizations to his or her Page Object implementation. An alternative is the Screenplay Pattern.
A great example to follow is page-object, the Ruby gem which implements the Page Object pattern for web automation with selenium webdriver via the Watir gem. Cheezy and other developers on the project have created a very nice framework for describing elements in page classes and for managing references to the driver and current page.
The first optimization that we can apply is to reduce duplicate code by establishing a common base class for pages. This immediately allows us to remove the duplicate initialize logic from pages that use this base.
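As a sketch of that first optimization, the base class below holds the shared `initialize`, and each page inherits it. `BasePage`, `SignInPage`, the locators, and the stub driver are all hypothetical names chosen for illustration.

```ruby
# Tiny stand-ins so the sketch runs without a device (hypothetical).
class StubElement
  def displayed?
    true
  end
end

class StubDriver
  def find_element(_how, _what)
    StubElement.new
  end
end

# Common base: every page keeps a reference to the driver.
class BasePage
  def initialize(driver)
    @driver = driver
  end
end

# Pages inherit initialize instead of repeating it.
class WelcomePage < BasePage
  def visible?
    @driver.find_element(:id, 'welcome_title').displayed? # hypothetical locator
  end
end

class SignInPage < BasePage # hypothetical second page
  def visible?
    @driver.find_element(:id, 'sign_in_title').displayed? # hypothetical locator
  end
end

puts WelcomePage.new(StubDriver.new).visible?
puts SignInPage.new(StubDriver.new).visible?
```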
The next optimization is to programmatically create methods for page elements. In the examples above, I created a method in my page class every time I needed to find an element, click on an element, or get the value of an element – this can quickly get out of control, resulting in page classes that are hundreds of lines long and difficult to read.
An ideal implementation would streamline the process of creating elements. Examine the declaration of element and button in the following example:
That terse expression, which communicates that our page has one plain element and one button, can be accomplished with a little bit of metaprogramming.
Create class methods like element and button in the base page class and use define_method to create new element methods whenever a page class declares an element or button. This implementation is very similar to the page-object gem.
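The approach described above could be sketched roughly as follows. The generated method names (`title`, `title_element`, `click_get_started`) and the locators are my own choices for illustration; they are not the page-object gem's API, and the fake driver exists only so the example runs on its own.

```ruby
# Stand-ins for the driver and its elements (hypothetical).
class FakeElement
  attr_reader :text, :clicks

  def initialize(text)
    @text = text
    @clicks = 0
  end

  def click
    @clicks += 1
  end
end

class FakeDriver
  def initialize(elements)
    @elements = elements
  end

  def find_element(how, what)
    @elements.fetch([how, what])
  end
end

class BasePage
  def initialize(driver)
    @driver = driver
  end

  # Declares a plain element: defines <name>_element (finder) and <name> (its text).
  def self.element(name, locator)
    how, what = locator.first
    define_method("#{name}_element") { @driver.find_element(how, what) }
    define_method(name) { send("#{name}_element").text }
  end

  # Declares a button: everything element provides, plus click_<name>.
  def self.button(name, locator)
    element(name, locator)
    define_method("click_#{name}") { send("#{name}_element").click }
  end
end

class WelcomePage < BasePage
  element :title, id: 'welcome_title'      # hypothetical locators
  button  :get_started, id: 'get_started'
end

button = FakeElement.new('Get Started')
driver = FakeDriver.new({ [:id, 'welcome_title'] => FakeElement.new('Welcome'),
                          [:id, 'get_started'] => button })
page = WelcomePage.new(driver)
puts page.title          # prints "Welcome"
page.click_get_started
puts button.clicks       # prints 1
```

Because the blocks passed to `define_method` run in the context of the page instance, they can read `@driver` even though they are written inside a class method.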
Take a look at the full codebase on GitHub to explore the test suite upgrades implemented in this post:
This is the second in a series of blog posts explaining a way to execute distributed parallel test automation. The first entry can be found here.
In this post I walk you through the process of orchestration and the first orchestrated stage. I will explain the concepts in a way that allows them to be applied to multiple use-cases. Since I am a Rubyist at heart — with a fondness for Cucumber and Jenkins — the examples found here are geared towards them.
Jenkins provides pipelines as a functionality, which serve the purpose of orchestrating multiple jobs into a singular flow. The original intent of a pipeline is for automated continuous delivery of software to a production environment. We utilize the pipeline to orchestrate our parallel testing effort.
The purpose of the pipeline being developed is to provide feedback to our stakeholders as rapidly as we can, given the resources provided. Additionally, we make the framework dynamic to handle configuration changes quickly and efficiently.
The pipeline implementation in Jenkins requires two parts:
The first is the pipeline code, referred to as a Jenkinsfile, which is often stored in the related source code repository. In the example below, the Jenkinsfile is stored in the testing repository.
The second part is the pipeline job within Jenkins, which references the source code that stores our Jenkinsfile. The image below is the configuration of a pipeline job in Jenkins. We provide the URL location, authentication parameters, and name of the Jenkinsfile.
Jenkins jobs allow for parameters configured at runtime to supply dynamic execution, depending on the selection. The image below is an example where we choose between IE and Chrome as the browser to be utilized for the UI tests.
When running a build of the job we can specify between IE and Chrome. If we kick the job off automatically at a certain time it will default to the first option in the drop-down provided (see below).
After constructing the pipeline job in Jenkins, we can proceed to understand the Jenkinsfile. To complete our objectives, we can break the Jenkinsfile down into three sections, or stages.
The above image is a Jenkinsfile, which is what we store with our source code pulled from a repository and utilized as the script for our pipeline.
*Note: while I am providing an overview of a Jenkins pipeline, I cannot cover all the facets of this expansive tool in one blog post. However, jenkins.io has all the information you could ever want, outside of what I supply here.
From the image above we see the node parameter, which allows us to tell Jenkins where we want the pipeline job itself to run. This does not mean every job within the pipeline will run on machines with this tag associated to them, but we will dive into that in the next installment/blog post.
The browser method returns the result of params.browser which is received from the parameter within the pipeline job in Jenkins. This will either equal ‘ie’ or ‘chrome’.
The total_number_of_builds method returns ‘3’ which will come in handy later in our execution stage.
Setting a Clean Slate
In our ‘clearing’ stage we want to build a job named ‘clear_workspace’ that will go out to all impacted machines and clear a file location to ensure we are guaranteed to start with a clean slate.
Executing Our Tests
In our ‘running’ stage we can run three jobs in parallel to provide a faster feedback loop to our end users. I chose the number of jobs randomly; it could just as easily be 20 or 100 and the pipeline would function correctly.
The image below displays a “catchError wrapper” that prevents a failure code from one of the built jobs stopping the whole pipeline execution.
The parallel keyword allows us to execute the jobs at the same time rather than waiting for them to execute sequentially.
Lastly, the parameter sections of the three jobs we are building pass along browser and total_number_of_builds, whose values come from the methods created at the top of the pipeline file. Additionally, we pass a build_number parameter, which is either 1, 2, or 3.
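The post leaves the use of build_number and total_number_of_builds to the next installment. Purely to illustrate why a job might need both values, the sketch below deals a list of feature files round-robin so that each build takes a distinct share. `features_for_build` and the file names are hypothetical, and this is not necessarily how the author's suite distributes tests.

```ruby
# Returns the feature files that one build should run, dealing files
# round-robin across builds. build_number is 1-based, as in the pipeline.
# Job parameters typically arrive as strings, so a real job would likely
# convert them with Integer(...) first (an assumption, not from the post).
def features_for_build(features, build_number, total_number_of_builds)
  features.sort.each_with_index
          .select { |_file, index| index % total_number_of_builds == build_number - 1 }
          .map(&:first)
end

features = %w[cart.feature checkout.feature login.feature search.feature welcome.feature]

(1..3).each do |build_number|
  puts "build #{build_number}: #{features_for_build(features, build_number, 3).join(', ')}"
end
```

Every file lands in exactly one build, so the three jobs together cover the whole suite without overlap.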
Consolidating our Results
Our ‘consolidation’ stage will allow us to access the machines utilized for testing and pull meaningful artifacts from the job and report the results to our stakeholders.
There are two jobs in the consolidation stage: one job goes out and pulls the information from each impacted machine, and the other consolidates that information into a concise report.
There are complications to this stage, which will be discussed in the final installment of this blog post series.
Setting a Clean Slate In-Depth
As previously mentioned, the ‘clear_workspace’ job has the intent to clean up after previous runs of the same job on all utilized workstations.
During execution, the test results are stored in a specific file location on each workstation. We do not want previous results carried into the current execution, so we must go out to each machine being utilized as a node and clear the specified file location.
In Jenkins, we can set a job to iterate a set of workstations via the node parameter plugin. This will execute the job on each node specified sequentially via the default nodes option.
Additionally, we can check the ‘Execute concurrent builds if necessary’ parameter to allow the executions to happen in parallel.
For the actual commands (Windows commands, sorry Mac folks) we need to delete a certain directory and recreate it, to ensure it is empty.
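The post uses Windows commands for this step. As a platform-neutral illustration of the same delete-and-recreate idea, here is a small Ruby sketch; `reset_directory` and the results path are hypothetical names, and this is not the script the job actually runs.

```ruby
require 'fileutils'
require 'tmpdir'

# Deletes the directory (if present) and recreates it, leaving it empty.
def reset_directory(path)
  FileUtils.rm_rf(path)
  FileUtils.mkdir_p(path)
  path
end

# Demonstration against a throwaway temp directory.
Dir.mktmpdir do |tmp|
  results_dir = File.join(tmp, 'results') # hypothetical results location
  FileUtils.mkdir_p(results_dir)
  File.write(File.join(results_dir, 'old_run.json'), '{}')

  reset_directory(results_dir)
  puts Dir.empty?(results_dir)
end
```

Removing and recreating the directory, rather than deleting files one by one, also covers the first run on a new machine where the directory does not exist yet.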
In the image above, the file location that we are clearing (first stage) will be the same file location where the results are stored for consolidation (last stage) of our pipeline. Remember, it is important for those locations to be the same.
In the next installment, we discuss executing the tests in parallel and how we ensure tests are distributed within the parallel executions.
This is a great article about the anti-patterns associated with software testers. Great advice for testers on avoiding common problems like only following existing test scripts, a hyper-focus on automation, growing stale in your learning, or focusing too much on shiny new developments without implementing them.
Eran Kinsbruner provides a great overview of non-functional testing including a list of many of the non-functional testing types. Many of the activities he describes can be assisted with a cloud-based testing platform.
Last month I spoke at ComTrade’s “Quest for Quality” webinar series. This blog post summarizes much of the talk on “The Science of Testing” about how software testers can leverage practices of scientists to help improve the rigor of testing.
The QA Lead podcast is a great resource for testers. In the most recent episode Damian Synadinos is the guest of honor. Damian does a great job of getting to the fundamentals of software testing and solving the oft-overlooked human component of software development.
The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. This week we’ll showcase articles on Jenkins upgrades, DevOps, Checking versus Testing, the Screenplay pattern, and an external post by yours truly.
Jenkins is moving away from the antiquated “slave” name in favor of “agent” because the former name was considered inappropriate. Jenkins Docker images will also expand availability of Windows images, support for additional platforms, and multi-platform Docker images.
Helen Beal provides a high-level summary for troubles surrounding the DevOps evolution that most companies are trying to overcome. It’s not simply renaming the build team to “DevOps” but rather a slow and committed process to change an organization’s culture.
Jason Arbon presents his view (with a bit of humor) regarding the “checking” versus “testing” debate that has occurred online and at conferences across the globe. In the article, he differentiates between automated regression testing and generative automated testing. For Jason, regression tests are those often repeated test scripts to validate existing application behavior. Generative automated testing analyzes the software specifications, implementation, and application itself to automatically generate test coverage.
Matt Wynne provides part 4 in his series on the Screenplay pattern for test automation. You can follow the trail back to part 1. The series is a great comparison piece for people who use the PageObject pattern for their application testing.
This is the second in a series of posts about the strategy and tactics of test automation. The first on common challenges can be found HERE. Our team has experience working at multiple large firms with an enterprise-wide scope. Throughout our time working in IT, we have encountered challenges with existing test automation implementations and committed several mistakes on the way. Our hope is to relay some valuable activities to build robustness into an automation suite so you can defeat the automation supervillains.
“I’m not a great programmer; I’m just a good programmer with great habits.”
– Martin Fowler, Refactoring: Improving the Design of Existing Code
The following is an overview of Regression Analysis, Code Reviews, and Refactoring Sessions for test automation. Just like any programmer, automation testers are developing an application; it so happens the application is designed to test other applications. Automation test suites accumulate technical debt like any other code base. Overly complicated scenarios, single use steps, and data management miscues are just a few of the issues facing an automation test suite. The quality standards one would expect from the application being delivered to stakeholders should also be followed for an automation suite to test that application.
Activity One: Regression Analysis
Regression testing has many definitions depending on the source, which can include a set of automated tests executed regularly, 20% of the tests that cover 80% of an application’s functionality, testing after an application undergoes some change, or any test executed in the past. Regression testing can provide value by informing a team whether a change (new release, upgrade, patch, or hot fix) negatively impacts an application. Michael Bolton has previously offered that regression testing also helps us learn about the relationship between parts of the software, to understand better where future changes might have an impact. One of the concerns surrounding regression testing is “what is the appropriate number of tests” or “test coverage” to adequately observe the system. Regression testing is important, but so is performing new tests that extend coverage to features being developed. Plus, time & budget will often play a limiting factor in how much testing can be done before the change is implemented. Therefore, teams must adopt a standard mechanism to select those tests to be included in regression, which is why conducting a Regression Analysis meeting to add, modify, and remove tests from regression is important to supporting those change events.
A Regression Analysis meeting will determine (1) which tests associated with the release should be considered part of core regression and (2) which regression tests should be removed from the current core regression suite. The core regression should be understood by all members of the team and business representatives to represent automated tests executed for any release, patch, or hot fix. The output of a Regression Analysis meeting is a regression suite that reflects the core functionality of the application so for any of those events the team has confidence the application will behave as expected.
Before the Regression Analysis meeting is held, whoever is taking responsibility as quality lead for the application will compile a list of all new release scripts and all existing core regression scripts. That lead will provide both lists to all expected participants of the meeting ahead of time to give everyone an opportunity to review. A representative of the business will provide metrics on application usage broken down by feature, which can include items such as the most used platforms, popular conversion paths, tracked application exit points, active A/B tests, and any other relevant details they believe the development team should know. The application manager or product owner should provide a list of upcoming projects with high-level feature changes to identify features that may be deprecated or modified in the next release. Lastly, a representative of production support (incident and/or service request) will provide metrics on issues for that application’s most recent release and any issues in the months prior to the release. Therefore, the Regression Analysis meeting will include at least the QA lead, business sponsor, application manager / product owner, and production support representative.
The purpose of having these four roles represented in the meeting is to make educated, evidence-based decisions about testing coverage and effort. Since testing is often limited by both time and budget constraints, all stakeholders of an application should understand the risks of excluding or limiting testing activities for a given event (release, patches, hot fixes). Helping those stakeholders understand the coverage of testing, the time involved, and the division of that work (manual & automated) for a given event aligns expectations with outcomes.
Regression Analysis should be conducted on a regular basis, matching the cadence of the release cycle if applicable. If teams are releasing daily, they should establish a working agreement to add those new tests to the core regression by default and have the meeting at predefined intervals to remove any tests determined not necessary by the above stakeholders. The purpose of this practice is to bias towards lower risk by including those tests rather than allow a coverage gap of weeks or months to build up before another review is held. During a Regression Analysis meeting, the participants will review the individual tests from the release to be added to the core regression and determine which tests to remove based on the data points from the four representatives. This decision process can be left open-ended if all participants agree, or a checklist can be used to help make the decisions on what to include and exclude. It’s important that the meeting be held live rather than over email because, like many team ceremonies, it focuses the attendees on the subject at hand, which is key to establishing a shared understanding.
Outside of updating the core regression suite to reflect the state of the application, the Regression Analysis meeting provides effort estimates to be used in future releases and a list of risks & assumptions the team can use in their working agreements or Test Plans. It’s a powerful event to focus a team on executing valuable tests rather than having a regression suite that becomes overgrown and inaccurate.
Activity Two: Code Reviews
“If you can get today’s work done today, but you do it in such a way that you can’t possibly get tomorrow’s work done tomorrow, then you lose.”
– Martin Fowler, Refactoring: Improving the Design of Existing Code
Code Reviews are a best-practice development activity to ensure mistakes are caught early in the development lifecycle. The activity will help ensure the team has “built the thing right”. Some code review activities include peer reviews by a technical lead, pair programming with another developer, or demonstration to a wider audience. A good practice to follow is leveraging a static code analysis tool (e.g., Cuke Sniffer for Ruby Cucumber) and participating in code reviews. To help ensure a feature has been tested using automation, the team should also conduct an informal walk-through of the feature under development before it’s promoted to higher environments.
Code Reviews conducted by a peer or a larger team should ensure that all requirements for the given feature under development have been met. Additionally, the feature should have all required traceability and follow all accepted team standards of development. These standards can vary significantly team-to-team, so it’s recommended any teams that cross-impact each other establish common standards. Otherwise code and projects that move across multiple teams will only be as strong as the weakest team practice. Most importantly, the automation scripts should actually execute on a regular basis and meet the expectations of pass/fail consistently. At times during the software development lifecycle (SDLC), features provided by the development team aren’t ready for test automation or data is not available. These external factors should be taken into account during a code review, so expectations for pass / fail are met. Any automation script that fails due to outside circumstances is worth noting for review at a later date. Overall, the team should look for the following during a code review:
All possible automation scripts for the feature are indeed scripted
The automation scripts are understandable by the entire team
The automation scripts do not duplicate effort already present
All required environmental, UI-locator, services, and data needs are addressed
The Features and Scenarios best represent the state of the application (living documentation)
All agreed team and enterprise standards & practices are followed (traceability, compatibility, formatting, etc.)
In the above general guidelines, a static code analysis tool was recommended to support team standards programmatically. The advantage of such a tool is that execution can occur frequently to assess the current state of the codebase in a consistent manner. For instance, “Cuke Sniffer” is a Ruby Gem used to find “broken windows” in a Ruby project. Executing this static code analysis tool against a Ruby project will provide a list of issues and recommended improvements. Each problem area is assigned a score, and the more important the issue, the higher the score. The combined scores for the individual areas in a given project make up the overall score; the higher the score, the more improvements are needed for the project. The tool also allows each team to update the standard set of rules to address specific needs for an application. Tracking the score over time provides telemetry about one aspect of quality of the test automation as features are added to the application under test. In addition to the above listed guidelines about code reviews, here are some specific “broken windows” to catch:
Tests without names or descriptions
Tests lacking traceability back to the original requirements
Overly long test descriptions
Imperative style Gherkin steps that focus on the UI and not the behavior in declarative style
Empty files with no tests
Features with too many scenarios
Hard-coded data (the data may work now but not in the future)
Tests that use “selfish data” (data that is used once and then is no longer valid)
Tests that use “toxic data” (data that represents a security risk, especially if that data is pulled from production without sanitization)
Tests that never fail (this is an often-overlooked issue. If the application is unavailable and the test still passes, then you don’t have a test)
. . . the list goes on and on.
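The “tests that never fail” item is the easiest to miss, so a small illustration may help. The sketch below is plain Ruby rather than Cucumber, and the check methods and the simulated outage are hypothetical names used only for illustration. The first check swallows every error and therefore passes even when the application is down; the second lets the failure surface.

```ruby
# Hypothetical stand-in for a call to the application under test.
# It simulates an outage by always raising.
fetch_status = -> { raise IOError, 'application unavailable' }

# Broken window: the blanket rescue hides the outage, so this check can never fail.
def check_that_never_fails(fetch)
  fetch.call == 200
rescue StandardError
  true
end

# Better: no blanket rescue, so an outage surfaces as a failure.
def check_that_can_fail(fetch)
  fetch.call == 200
end

# Reports a pass even though the application is unavailable.
puts check_that_never_fails(fetch_status)

begin
  check_that_can_fail(fetch_status)
rescue IOError => e
  puts "Check failed as it should: #{e.message}"
end
```

A quick way to spot this in review is to ask what would have to happen for the test to go red; if the answer is “nothing”, it is not a test.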
Many of the above listed issues have been encountered by experienced automation developers. It’s incumbent on those individuals to support newer developers in identifying issues and educating colleagues across their organization on practices that avoid these common mistakes. Code Reviews are an effective early detection mechanism and the collaborative nature of the activity between developers helps build technical ability.
Activity Three: Refactoring Sessions
“Whenever I have to think to understand what the code is doing, I ask myself if I can refactor the code to make that understanding more immediately apparent.”
– Martin Fowler, Refactoring: Improving the Design of Existing Code
Code refactoring is an activity to improve existing code without changing its external behavior. The advantages include improved code readability and reduced complexity, which can improve code maintainability and create more expressive features or improve extensibility.
Refactoring is often motivated by noticing a “code smell”. Once a code smell has been identified, the feature can be addressed by refactoring the code or even transforming it, so the feature behaves the same as before but no longer “smells”. There are two main benefits to refactoring:
Maintainability. Easy-to-read code is easier to fix, and its intent is self-apparent. One example is reducing overly long & complex methods into individually concise, single-purpose methods. Other examples are migrating a method to a more appropriate class or removing poor comments.
Extensibility. It’s easier to extend the automation suite if the appropriate (and agreed upon) design patterns are followed, and it provides flexibility to write more automation scripts without adding support code.
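As a rough illustration of the maintainability point, here is a sketch of breaking one longer method into single-purpose methods. The login scenario, method names, and data are all hypothetical; the only claim is that the external behavior stays the same before and after.

```ruby
# Before: one method validates input, builds the payload, and formats the result.
def login_summary_before(user, password)
  raise ArgumentError, 'user is required' if user.to_s.strip.empty?
  raise ArgumentError, 'password is required' if password.to_s.empty?

  payload = { user: user.strip.downcase, password: password }
  "Logging in #{payload[:user]} (password length #{payload[:password].length})"
end

# After: each step lives in its own small, named method.
def validate_credentials!(user, password)
  raise ArgumentError, 'user is required' if user.to_s.strip.empty?
  raise ArgumentError, 'password is required' if password.to_s.empty?
end

def build_payload(user, password)
  { user: user.strip.downcase, password: password }
end

def describe_payload(payload)
  "Logging in #{payload[:user]} (password length #{payload[:password].length})"
end

def login_summary_after(user, password)
  validate_credentials!(user, password)
  describe_payload(build_payload(user, password))
end

# Same input, same output: the refactoring did not change behavior.
puts login_summary_before(' Alice ', 'secret') == login_summary_after(' Alice ', 'secret')
```

The smaller methods can now be reused by other scenarios, which is where the extensibility benefit starts to show up.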
Refactoring should be conducted regularly and with specific goals in mind. Refactoring by making many small changes can result in a larger scale change. A set of guiding principles can help guide a team in refactoring as part of the development process (not as an exception-based activity or occasional activity). Static code analysis tools can be used to supplement the following guiding principles:
Duplication. A violation of the “Don’t Repeat Yourself” (DRY) principle.
Nonorthogonal Design. Code or a design choice that could be made more orthogonal. Orthogonal design examples are scenarios, data management, methods, classes, etc. in an automation suite that are independent of each other.
Outdated Knowledge. Applications can change frequently, and requirements tend to shift during the course of a project. Over time the team’s knowledge of the application improves, and automation written under earlier assumptions becomes a source of many code smells. The automation suite should represent living documentation, reflecting the current state of the application under test.
Performance. Automation scripts should be executed quickly and often. Added wait times and long setup for scenarios should be minimized to improve performance. Explicit wait times, flaky scenarios, and overly long scenarios hinder the feedback loop for automation results. Poor performance of automation scripts is exposed when the development team uses a CI/CD pipeline to deploy frequently, with the automated testing being the bottleneck to build success.
“I’ve found that refactoring helps me write fast software. It slows the software in the short term while I’m refactoring, but it makes the software easier to tune during optimization. I end up well ahead.”
– Martin Fowler, Refactoring: Improving the Design of Existing Code
Similar to Code Reviews, every team should implement Refactoring Sessions on a recurring basis. In each refactoring session, the team should follow a set of standards enforced by a static code analysis tool and working agreements. These standards are in addition to any existing federated standards for their enterprise. The refactoring sessions should be led by a member of each team and supported by an automation developer from outside the team for peer review. The reason for outside assistance is to provide a fresh viewpoint on the state of the automation suite. If the code is not self-documenting, that person should be able to raise concerns. Think of the external representative as another form of Code Review in support of quality.
The refactoring sessions should start at approximately one hour per week and be focused on active project work. The reason for this is to establish a baseline expectation for the team AND make the activity “billable” work if time tracking is a concern. To provide guardrails for the team to determine focus for a given session, there are a few recommendations: (1) utilize a static code analysis tool to identify problem areas, (2) leverage daily Regression/Release executions from execution reports, (3) select a feature being actively developed, and (4) use telemetry on execution performance (speed and consistency of test execution). The following describes the roles & responsibilities during a refactoring session.
The Team Leader is responsible for scheduling the weekly sessions and ensuring attendance by the team for that application under test. The Team Leader can choose to focus on one area or multiple areas, time permitting. The topic responsibility belongs to the Team Leader, but they may choose to rotate selection of the topic to other members of the team to support collective ownership. The Leader can select from multiple topic areas during a session; this is to provide the so-called guardrails, so the team stays within scope and has a fresh topic each session. The topic areas are:
Static Code Analysis Report
Review the rules enforced by the team in the static code analysis tool then execute a fresh report. Use the report to address items in the improvement list (top-down or bottom-up), remove dead steps, improve features & scenarios, refactor step definitions, or refactor hooks. The team can also choose to update any static code analysis at this time regarding enforcement and score. The history of execution should be captured to provide telemetry on the state of the automated suite.
Select a feature from the current or previous cycle then execute the scripts in the appropriate test environment. The team should ensure the feature has the required traceability, proper formatting, and follows all coding standards. Next, the team should ensure all associated data are properly included for successful test execution. The team should confirm functionality is not duplicating existing work. After any updates to the existing test cases, the team will identify technical debt and assign action items for after the session (to add or update any test cases they feel necessary to fulfill the functional and non-functional requirements for the feature). Lastly, the team will re-execute the feature again to confirm expectations of pass / fail.
Daily Release / Regression
The Team Leader will select a feature containing regression scenarios. Execute the scripts in the highest test environment. The team will identify any regression scripts they feel are no longer relevant to core functionality of the application and tag those for Regression Analysis as an action item. The team should ensure the feature has traceability, proper formatting, and follows all coding standards. Any scenarios that have dependency on one another to be successful need to be decoupled. Any functionality in the regression that has been duplicated should be removed. Lastly, the team will re-execute those selected release / regression scripts to confirm expectations of pass / fail.
The Team Leader opens multiple recent CI executions and reviews the results with the team with a focus on performance. The goal is a root cause analysis to determine if the scripts suffer because of: (1) application performance, (2) test environment, (3) data issues, (4) automation timing issues such as explicit waits, or (5) change in expected functionality. Flaky tests should be removed from regular execution until the underlying issue(s) are addressed. Explicit wait times should be eliminated to improve execution time; instead, use implicit waits that execute when the application service or UI is available. Additionally, the team should establish failure criteria in the automated tests for response times that exceed a threshold. After addressing the issue(s), the CI job should be executed, and the project tracking tool updated if needed.
The Automation Guide is responsible for reporting the meeting outcome to the entire development team and tracking results in a location accessible to the organization at large. The purpose of tracking this changelog is to demonstrate improvement over time. Information tracked will include the features addressed in the team meeting, the cause for review or refactoring, and the successful outcome. Consistent problem areas can be incorporated into team & personal development goals if the root cause is automation, or reported to the application development team if the root cause is development or requirements.
The Automation Guide also serves as technical oracle for the team during the meeting. When there are questions about implementation or upholding standards for automation, the guide will act as the point of contact for solving those problems during the meeting and will be responsible for follow-up if the issue cannot be addressed in one meeting. The automation guide plays a support role and should allow the team to select the features and problem areas of focus.
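To make the waiting advice concrete, here is a minimal sketch in plain Ruby. `wait_until` is a hypothetical helper, not part of any library; browser automation tools generally ship their own waits, which would be preferable in a real suite. The idea is to poll for a condition and continue as soon as it holds, and to fail once a time threshold is exceeded, rather than sleeping for a fixed period.

```ruby
# Hypothetical helper: polls the block until it returns a truthy value and
# raises if the timeout (in seconds) is exceeded first.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Condition not met within #{timeout} seconds"
    end

    sleep interval
  end
end

# Simulated application that becomes available on the third check.
checks = 0
ready = wait_until(timeout: 2, interval: 0.05) do
  checks += 1
  checks >= 3
end
puts "Ready after #{checks} checks" if ready
```

The timeout doubles as a simple failure criterion: if the application takes longer than the agreed threshold, the scenario fails instead of quietly running slowly.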
“Functional tests are a different animal. They are written to ensure the software as a whole works. They provide quality assurance to the customer and don’t care about programmer productivity. They should be developed by a different team, one who delights in finding bugs.”
– Martin Fowler, Refactoring: Improving the Design of Existing Code
The above overview of Regression Analysis, Code Reviews, and Refactoring Sessions for test automation describes activities that help build quality in a test automation suite and by extension the application under test. Regression Analysis helps align business partners with their development teams to establish a shared understanding of the application. Code Reviews help ensure the team has “built the thing right” by catching mistakes early in the development process. Refactoring is an activity to improve existing code without changing its external behavior through increased code readability and reduced complexity. It’s not enough for any team to just say they’ll commit to regression analysis or code reviews or refactoring – building rigor around these activities and making them habitual help bias a team toward long-term success.
In the interest of continuous improvement, developers participating in the above activities will gain new understanding of standards & best practices. However, learning does not stop at meeting’s end. Many of the guiding principles for the Regression Analysis, Code Reviews, and Refactoring sessions are derived from seminal works in programming. Additional study is required to progress beyond static code analysis tools and team standards. Listed below are some recommended background reading materials on software craftsmanship:
For the past several years I have been passionate about making things easier in the automation world by taking advantage of APIs. I think many people who don’t have experience working with web services can feel intimidated by them, and might be looking for a good excuse to practice with them. A couple years ago I found this post explaining how to connect to ESPN’s “hidden” API using Python. I’m a huge fantasy football nut, and since I work with Ruby so much I decided to build my own project that would connect to ESPN and extract various data for my fantasy football league.
In this post we will be mainly using the Ruby rest-client gem to send GET requests to the API, and then we will be pulling data from the JSON data that we receive back. The main purpose is to show you how to pull ESPN data, but we will be trying to look at this from a learning perspective and highlight practices that we can use when working with any web service. We’ll be building out several classes that interact with different pieces of data and organize our code in a way that makes sense. First, let’s give a little background on fantasy football and why this is some fun data to pull. Even if you don’t care about fantasy football, hopefully this post will still provide some useful information for you to learn from.
For the uninitiated, fantasy football is when a group of degenerates pit their imaginary football teams against each other in a weekly matchup. Everyone gets to draft real players to fill out their rosters, set their lineups, make trades, pick up free agents, and much more. Points are awarded based on stats such as yards gained and touchdowns. Many fantasy football platforms supply you with lots of good data, but we don’t have the raw data to play around with and analyze. We could just use Selenium to scrape data off the site, but websites are subject to change and APIs tend to be much more stable.
Note: From here on, I will assume that you have a valid installation of Ruby and Rubymine. For instructions on this, see Josh’s previous blog post here and stop at “Install Appium”.
So let’s get started building our new project. We’re going to begin with a new Ruby class called main.rb. We’ll also want to create a Gemfile to bring in the necessary libraries. As mentioned before, the only gem we’ll need for now is rest-client. In my environment, I was also receiving an error for the FFI gem, so we’ll specify a version for that as well.
Go ahead and do a bundle install if you don’t already have these gems (Tools >> Bundler >> Install). Then we’ll want to pull in our gems to our main.rb class:
require 'rest-client'
require 'json'
If you’re following along for your own fantasy league, you may need to pause here. For those of you with private leagues, you will need to go into your browser to retrieve some cookies. Instructions for this can be found here. Those with public leagues can skip that step. This is a good time to point out that oftentimes the hardest part of accessing an API is authentication. Web services use a wide variety of authentication methods, and it is important to keep in mind that simply getting hooked up might take more time than you may think.
The other piece of data we’ll need to get started is our league ID. This ID can be retrieved if you go to your league page in ESPN and look at the URL:
Let’s go ahead and set our league ID to a global variable at the top of our file since we’ll need to use that variable across multiple files. If you have a private league, let’s assign the S2 value and SWID in the same place. This is lazy and is generally bad practice, but we’ll make sure to come back later and move those variables to somewhere more appropriate. Our class should now look something like this:
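A minimal sketch of what the top of main.rb might look like at this point. The variable names and placeholder values are assumptions for illustration; substitute your own league ID and, for a private league, the cookie values retrieved from your browser.

```ruby
# main.rb (sketch): the rest-client and json requires from earlier go above these lines.

# Placeholder value, not a real league ID.
$league_id = 'YOUR_LEAGUE_ID'

# Only needed for private leagues. Placeholders, not real credentials.
$s2 = 'YOUR_ESPN_S2_COOKIE_VALUE'
$swid = 'YOUR_SWID_COOKIE_VALUE'
```

Keeping real cookie values out of source control is one more reason to move these somewhere more appropriate later.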
Now we can test our API call. As a general rule, we want to wrap our rest-client calls inside a begin/rescue block. This is because if a call fails, it will crash our whole suite. This is usually not a desired behavior, because we will either (a) want the test to try again, or (b) do something useful with the failure message so we can see what the issue is. Our rest-client call is going to need a URL, a method, and some headers if we are accessing a private league.
Since we are just retrieving data, our method will be a GET. To simply try our connection, we can use the following URL:
In this URL, the “seasons/2019/” specifies that we want to look at the 2019 season. Then we specify our league ID, and the “scoringPeriodId=1” query parameter tells the API to pull the data for week 1 of the season. For now, let’s assign this value to a variable called “url”.
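Here is a sketch of assembling such a URL in Ruby. Only the “seasons/2019/” segment and the “scoringPeriodId=1” parameter come from the description above; the base address is a deliberate placeholder, and the helper name and the exact path layout around the league ID are assumptions to be checked against the real endpoint.

```ruby
# Hypothetical helper that assembles the league URL from its parts.
def league_url(api_base, season, league_id, scoring_period)
  "#{api_base}/seasons/#{season}/leagues/#{league_id}?scoringPeriodId=#{scoring_period}"
end

# Placeholder base, deliberately not a real address; replace it with the
# ESPN endpoint for your league.
api_base = 'https://example.invalid/ffl'

url = league_url(api_base, 2019, 'YOUR_LEAGUE_ID', 1)
puts url
```

Building the URL from named parts makes it easy to change the season or week later without editing a long string by hand.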
We will get into the API endpoints as we go forward, but this is the main one that we’ll be working with for now. If you are using a private league, we can assign our headers value to a variable as well. You don’t need to specify headers if you are using a public league. Our headers will look like this:
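A hedged sketch of the headers and the wrapped request. The cookie names `espn_s2` and `SWID` are assumptions based on the S2 and SWID values mentioned earlier, and `get_league` is a hypothetical helper. The client is passed in as an argument so the sketch does not depend on the gem being loaded here; in the post's setup it would be `RestClient`, whose `get(url, headers)` call accepts a `:cookies` hash alongside regular headers.

```ruby
# Placeholder cookie values for a private league.
s2 = 'YOUR_ESPN_S2_COOKIE_VALUE'
swid = 'YOUR_SWID_COOKIE_VALUE'

# Cookie names are assumptions; public leagues can use an empty hash instead.
headers = { cookies: { 'espn_s2' => s2, 'SWID' => swid } }

# Hypothetical helper: sends the GET and returns the response, or prints a
# short message and returns nil instead of crashing the whole run.
def get_league(client, url, headers = {})
  client.get(url, headers)
rescue StandardError => e
  puts "Request failed: #{e.message}"
  nil
end

# Usage, once rest-client has been required and url is defined:
#   response = get_league(RestClient, url, headers)
```

A successful call prints nothing, while a failed one prints a single “Request failed” line with the underlying message.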
Here we can point out a few good practices that we’ve implemented for this basic action that will make our lives easier as our scripts get larger. We mentioned that it’s helpful to wrap our requests in begin/rescue blocks. The above code will give us a much cleaner failure than if we let the program output the failure on its own. Also, our call is nice and clean because we have variables defined for the URL and headers.
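Earlier we mentioned that we might want a failed call to try again. One possible way to do that is Ruby's `retry` keyword inside a rescue. The sketch below uses a hypothetical `with_retries` helper and a simulated flaky call rather than a real request, so treat it as an outline of the idea.

```ruby
# Hypothetical helper: runs the block, retrying on error up to `attempts`
# times, then prints the failure and returns nil.
def with_retries(attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    if tries < attempts
      sleep wait
      retry
    end
    puts "Request failed after #{tries} attempts: #{e.message}"
    nil
  end
end

# Simulated flaky call that succeeds on the third attempt.
result = with_retries(attempts: 3) do |try|
  raise 'temporary failure' if try < 3

  'ok'
end
puts result
```

Capping the number of attempts matters; an unbounded retry against a service that is really down would hang the run.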
Go ahead and execute your code. If your console doesn’t show any text, then congratulations! Your call was successful. If our “Request failed” text is displaying, then you may need to go back and verify your league ID or ESPN cookies.
Now let’s explore what we have in our successful response. We have a large JSON block stored inside a RestClient::Response object. Here we can use our JSON library that we required earlier to parse this data into a Hash that we can more easily read.
We can perform this action and assign the hash to a variable with the code:
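A minimal sketch of the parsing step. The `body` string here is a made-up, heavily trimmed stand-in for the real response, and its keys are assumptions; with rest-client the input would be `response.body`.

```ruby
require 'json'

# Made-up stand-in for the response body. The real payload is far larger and
# its exact keys should be checked against your own response.
body = '{"id": 12345, "scoringPeriodId": 1, "teams": [{"id": 1}, {"id": 2}]}'

# With rest-client this would be JSON.parse(response.body).
data = JSON.parse(body)

# Top-level keys give a quick overview of what came back.
puts data.keys.inspect
puts data['teams'].length
```

Printing only the keys first is a handy habit with large responses, since dumping the whole hash to the console quickly becomes unreadable.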
It looks like we’ve received quite a bit of data back! For simply pulling stats, we aren’t going to need most of this at the moment. We can see that we have pulled some league data, some scheduling data, our league ID and scoring period, and other various data. The key that we’re going to be concerned with for now is Teams. When we start to explore this entry, we’re going to be hit in the face with a pretty deep hash:
Now we’re to the fun part! In the interest of added suspense, we’re going to end this post here before we dive into parsing out and organizing our data for use. If you don’t want to wait on me, the Python blog post mentioned at the top should have enough information for you to continue on your own. Let’s review what we covered so far:
ESPN has a semi-hidden API that we can use to pull data from.
We can easily use our rest-client gem to pull data cleanly.
Global variables for data are usually bad and we still need to address how we are storing our data before it starts to pile up.
We should typically be wrapping our API requests in begin/rescue blocks in order to better handle potential errors.
JSON responses can be easily converted into Hash objects in order to make them more usable.
It may not feel like we’ve accomplished much so far, but we are well on our way to pulling lots of useful data that we can have some fun with. Look for Part Two soon!
The following will be a regular feature where we share articles, podcasts, and webinars of interest from the web. This week we’ll showcase articles on CI/CD Pipelines, Continuous Testing, the Spotify Model, Unit Tests, and a Webinar series on Automation.
For those of you working in Azure DevOps, Microsoft recently made an update to their Pipelines feature to help support CI/CD. Entire CI/CD workflows can be defined in a YAML file and be versioned with the rest of the code.
Great article by Perfecto that provides a high-level view of crafting an automated test strategy. Nearly every software company is aiming for CI/CD or maximizing the efficiency of their existing CI/CD. The article provides those steps, from value stream mapping the pipeline to building in flexibility to the testing platform. There are plenty of solid references in the article as well for those looking to learn more about automated testing in general and continuous testing in particular.
A wonderful look at the Spotify Model by Jeremiah Lee from his time at the company. The Spotify Model is revealed to be more aspirational than actual, with the company struggling from the management side of growth to team collaboration. As someone who has previously used the Spotify Health Check Model for teams, I’m fascinated by this look into Spotify and feedback from people who actually worked there.
Michael Feathers posts a fascinating article that questions the size of a unit test. He posits a unit test can be a class, a function, or a cluster of either so long as it’s something “small” that is a unit of the application under test. The unit test should align with and enforce modularity and encapsulation. I think his views offer a smart philosophy to approaching code – if you are having difficulty writing tests then that’s a good indication the code could be more modular so you can see the distinct pieces.
BrowserStack established a free “Summer of Learning” webinar series for people interested in automated testing of web- and mobile-applications. Recently David Burns joined the BrowserStack team. David is a core contributor to Selenium and was previously responsible for GeckoDriver (Firefox) while working at Mozilla. This webinar series is a great idea to uplift skills while most of us are working from home.
Episode 1 — The Basics: Getting started with Selenium: An introduction to Selenium, how to set up/write your first test scripts, and how to pick the right framework. This is a great introductory session for those looking to learn test automation in 60 minutes.
Episode 2 — Introduction to BrowserStack Automate: In this episode, you’ll learn how to set up and run your first test with Automate, how to test on various real devices and browsers on the BrowserStack Real Device cloud, how to test your local instance on the cloud, and how to collaborate and debug better.
Episode 3 — Continuous testing at scale: You’ll learn how to build an efficient, well-integrated CI pipeline that helps release quality software at speed. You’ll also learn how to use BrowserStack to deploy faster and listen to stories from great companies like The Weather Channel, who release to millions of users every day.
Episode 4 — Selenium + BrowserStack at scale: In Episode 4, David Burns, core contributor to Selenium will explain how to plan parallelization more effectively to achieve faster build times, the best ways to maintain test hygiene while scaling your team or automation suite, and how to monitor test feedback effectively.
Episode 5 — Testing for a mobile-first market: There are 9,000 distinct mobile devices in the market—and you most definitely can’t test on them all. But with this episode, you’ll learn the best strategy to pick the right devices for testing your website or mobile app.