Enterprise Bug Busting: From Testing through CI/CD to Deliver Business Results
©2021, Rosalind Radcliffe
All rights reserved. This book or any portion thereof may not be reproduced or used in any manner whatsoever without the express written permission of the publisher except for the use of brief quotations in a book review.
Published by Accelerated Strategies Press acceleratedstrategiespress.com
ISBN: 978-1-09838-149-3 ISBN eBook: 978-1-09838-150-9
“There is no longer any question whether modern DevOps practices can be used to accelerate and improve quality for legacy systems. Rosalind describes DevOps testing concepts in language that makes sense for large enterprise systems and describes large system challenges in a way that makes sense for DevOps testing. If you were looking for a primer of what you need to know and understand before you apply continuous testing in a complex system, you found it. There are two modes for using this book: 1) With a highlighter in hand to mark the ideas you want to apply as you bring enterprise DevOps testing to life; and 2) Hand this book to a peer or leader in your organization and tell them to read it so that you can talk about what to do next. I recommend both modes.”
Mark Moncelle Enterprise DevOps Practitioner
“Enterprise Bug Busting hits home on the all-important topic of testing in the enterprise. It defines a roadmap for implementing a continuous integration practice in your organization, and Rosalind Radcliffe, drawing on a lifetime of DevOps experience, demonstrates how application development leaders can improve software quality while building hybrid cloud applications that are delivered continuously.”
Sherri Hanna Senior Offering Manager, DevOps for IBM Z Hybrid Cloud
“Mainframes are still a critical part of IT infrastructures, and Rosalind Radcliffe is an expert in bringing modern development technologies to transform these systems beyond their traditional role as systems of record.”
Charlene O’Hanlon Chief Content Officer, MediaOps
“In my years in the DevOps community, Rosalind Radcliffe has been a shining, true North Star pointing the way for enterprises to adopt the latest technologies, while leveraging their existing investments to do more, better and faster. This book is a continuation of this common sense, best practices approach to success.”
Alan Shimel CEO and Founder, MediaOps
“Software quality is an immutable theme throughout Rosalind’s remarkable career. She artfully imparts her master’s-level knowledge. Enterprise software teams will immediately benefit from this book and alter course towards improved software quality in this age of digital transformation.”
Mitch Ashley CEO and Managing Analyst, Accelerated Strategies Group
“This book showcases Rosalind’s unique ability at providing common sense transformational guidance to enterprises with mainframe and complex environments.”
Jayne Groll CEO, DevOps Institute
To my husband Bob who has always been there to support me, and for my children so they can understand more of what I do.
Foreword
Do you work in a large organization that has a variety of applications including mainframe systems running your core business? Do you struggle to deliver improvements with the speed the business requires and the quality the customer expects? When working with the mainframe teams, are you confused as to why everything tends to work differently than the rest of the applications? If so, then this book is for you, and Rosalind is the best guide to help you navigate all of these challenges and understand how to create quality systems for complex environments.
I met Rosalind when she presented at the 2015 DevOps Enterprise Summit. It was in the early days of the conference, when most of what was being presented was about improvements on small applications that were not core to the business. Everyone was recommending small, isolated teams working with frontend code. When Rosalind stepped onto the stage to talk about mainframes I thought, “Well, this is going to be interesting.”
She started by clearly describing the differences between the stability provided by the redundancy of distributed systems vs. the stability provided by mainframes that support the core business of most large enterprises. Rosalind stepped through each and every DevOps practice being discussed at the conference and showed how each and every one of the practices can be implemented on mainframes. Rosalind closed by stating that there isn’t anything fundamentally wrong with mainframes, and in fact they run most of the core businesses in the industry. We just shouldn’t be developing on them the same way we did 30 years ago. It was one of the best presentations I have ever seen, and I decided I had to meet her.
As I have gotten to know her better over the years, I have learned she has the unique ability and skill to speak strategically with executives and technically with senior engineers. She can go into an organization and help the executives understand the changes required. When the senior technical leads that have been doing the same thing the same way for 30 years say it won’t work, she can dive into the technical details to address their concerns and help them understand how it actually can and should work. Rosalind is one of the most impressive people I have met in the industry.
She has a passion for quality and 30 years of experience driving improvements both inside IBM and with its customers. Through this experience she has realized quality is not something that can be delegated to the quality assurance organization. It needs to be everyone’s job. It is not just a function of testing. It involves everything from the idea for a new capability to ensuring it meets the needs of the business and the expectations of the customer. An effective quality system involves all the steps in between.
Rosalind starts with this broader view of quality and shows how to provide an end-to-end system. She demonstrates her knowledge of how to apply these concepts to a broad range of applications. More importantly she shows how to apply these concepts to mainframes, which she is uniquely qualified to do. She helps people understand how mainframes are different from the other applications.
She dives deep into how to create effective quality systems for the mainframe, with customer examples and key learnings. Rosalind also covers the latest capabilities of the IBM Z that can support these more modern development approaches that are valuable but not broadly used. She shows how and why organizations that haven’t modified their development approaches as much as they should have over the last 30 years should be leveraging these new capabilities.
If you work in a large enterprise with lots of different applications, including the mainframe, this is a must read. It will help you understand how to create a truly effective quality system and explain how and why the mainframe is different. It will also help you understand the more modern approaches to development that your mainframe teams need to start adopting.
Gary Gruver President of Gruver Consulting
Table of Contents
Who Should Read This Book
Introduction
Section 1: Introduction
  What Software Quality Is
  Metrics
  Story
  Types of Metrics
  Importance of Testing
  What Is Testing and Who Does It?
  What Is Continuous Integration?
  Key Parts of Continuous Integration
    Single source code manager (SCM)
    Fully automated build process
    Build runs on a build machine
    Fix broken builds right away
    Keep the build fast
    Make the builds and tests visible
    Make sure the output is published to easily get it
  Why Continuous Integration
  How Does Continuous Integration Support Building Quality Apps?
  Key Insight: Continuous Integration
  Continuous Delivery, Continuous Deployment
  Key insight: Automated Deployment
  IBM Z: A High-Level View
    IBM Z overview
    Definitions of key terms
    Traditional z/OS development
  Background for Enterprises
  How Organizational Structure Impacts Software Quality
  Story
  Types of Customers
  Types of Application Architectures
  Summary
Section 2: Essential Components
  Pipeline
  Key insights: Pipeline
  Individual Development Environment (IDE)
  Code section
  Source Code Manager (SCM)
  Build
  Key insights: Static Analysis
  Artifact Repository
  Provision and Deploy and Test
  Story: Deployment errors
  Key insight: Manual Changes
  Monitoring
  Planning
  Analyze
  Provisioning vs. Deployment
  Key insights: Application Infrastructure
  Environments for Testing
  Isolated and Shared Environments
  Examples of shared environments
  Story
  Key Insights: Isolated test environments
  Provisioning Environments
  Story
  Key insight: Shared test environments
  Containers
  Test Data Management
  Key insights: Test data creation
  Key insights: Test data
  Stable Quality Signal
  Story: Unreliable Tests
  Key Insight: Stable quality signal
  Get Clean, Stay Clean
  Summary
Section 3: Types of Testing
  Scope of Testing
  Unit Testing
  Key Insight: Automated Unit testing
  Beyond Unit, But Still in the Build
  Component Testing
  Applications End-to-End Testing
  Key Insight: End-to-End Testing
  Types of Testing
  Functional Verification Testing
  API Testing
  Key Insight: API Testing
  Sniff Testing, Smoke Testing
  Story: The 99 pencil test
  Integration/System Testing
    Function testing
    Integration testing/system testing
    User experience testing
  Key Insight: Experience Testing
  Freeform testing
  Malicious testing
  User acceptance testing
  Story
  Regression Testing
  Performance/Scalability Testing
  Key Insights: Performance Testing
  Infrastructure Testing
  Chaos or Fault Injection Testing
  Additional mainframe background
  Production Verification Test
  A/B or Market Testing
  Story
  TDD/BDD
  Testing Frameworks for testing
  Summary
Section 4: Putting It All Together
  Why include IBM Z in this change?
  Story
  Key Insight: Cultural Change
  Cultural Change
  Bringing the Parts Together
  Story: Bringing it all together
  Getting Started
  Story: When it does not work
  Key Insight: Willingness to change
  Story
  Conclusion
  Story
Acknowledgments
Who Should Read This Book
This book is targeted at large enterprise organizations that have systems including IBM Z (the mainframe, as many may consider it), or anyone interested in learning more about software quality including IBM Z. It is aimed at executives in organizations with these large, complex systems, as well as engineers working to improve the overall process. This book does not assume any specific background knowledge and, therefore, includes definitions and terms so those with various backgrounds can gain an understanding of software quality for the large enterprise.
Introduction
No matter where you look today, you will find technology everywhere. This includes software, which plays a significant part in our lives in some way every day. I’ve witnessed this evolution to software everywhere over the course of my lifetime, going from a day when the only software directly affecting my life was in the telephone system (though not in the telephone attached to the wall), when cars were mechanical machines without software guiding, helping and entertaining, and when watches were about fine Swiss movement. Today, we can’t get away from software. Software is in hospitals and cars, it controls the electric grid, and it’s used daily in our phones and our homes. Any financial transaction you make involves software as well; even if cash is used, the processing in the cash register is all software.
With software everywhere, the quality of that software and supporting hardware can affect our lives in many varying ways. If an internet search renders a ‘page not found’ message, it’s a minor inconvenience. But when software fails in a car or in an airplane, it’s life threatening. The combination of software and hardware controls so many facets of our lives in ways many people don’t even recognize. The importance of this must be recognized and acknowledged by those of us in the IT industry.
As noted, there are instances where the quality of software can determine life and death. While most of the time it’s not that critical, it is important to understand that software in any form does impact our lives. This understanding is why it is important to improve the overall quality of the systems being built.
The goal of this book is to provide insights that will help enterprises improve the overall quality of software being created today. This overall quality is achieved throughout the entire development lifecycle based on the inclusion or omission of activities, automation and organizational strategy. I will provide definitions for areas that drive software quality, key insights from years working with various organizations, and stories to explain what others have done. Some stories will be positive and some not so positive. Sharing both the good and the bad is intended to help others learn from and steer clear of the mistakes made by those before them. We all learn through experimentation; I hope to help readers avoid some trouble and offer insights into new options to improve the overall process. I will share stories from my 30+ years of experience working with various companies across the globe, large, small and everywhere in-between, as well as my work within IBM.
One important note is that this book is primarily focused on existing enterprises with existing applications and code bases: organizations that have been around for a while, as opposed to the typical startup whose entire code base may be less than a few years old. Though there are lessons for everyone, the examples provided and discussions will be based on large, long-term enterprises. Startup environments will have a different view, as they will generally have started with modern DevOps practices or cloud native development practices. Common attributes of large enterprises are:
Widely diverse languages, coding styles and application architectures.
Testing and deployment practices are often well established and require a culture change.
Up or downstream systems that “just work” and are seen to be too expensive or risky to change.
Complexity of the system is too large for any one team to understand, develop, or manage. This requires multiple layers, which adds additional complexity when testing the entire system.
The scale, complexity and reliability requirements drive the need for automation that is designed from the start to production standards.
Return on investment needs to be considered in all aspects of change; change does not happen just because something new comes along, it has to provide business value. An example would be the focus on automated testing for a part of the solution that does not change often.
Why is it so important to consider the overall software quality of large complex systems? Large complex systems, including IBM Z, host more than 70% of the total structured data. IBM Z is used by most of the world’s top banks, insurance companies and retailers, and it is heavily used in travel and transportation, not to mention the various government systems. These systems provide the business value to many organizations, yet they are seen as legacy, hard to maintain and hard to work with. However, they are at the core of the business process. By bringing the full complex system, including IBM Z, into the overall software quality process, using automation for the repeatable tasks, and bringing modern processes to the development and operations environment, not only does the quality improve, but the business value can be unlocked for greater use by other parts of the organization as part of the digital transformation.
When I started at IBM, I was just out of college and my first job was working on the then-current version of IBM Z, which was the IBM S/370. My original job was to simply learn the language used for the system, and learn how the system was developed and tested. First assignments were focused on testing, or small coding changes to get used to how the system worked. We did have automation, and I learned early on to automate tasks that had to be repeated. The parts I worked on were critical to the system, so I had to be very careful with changes to make sure nothing that used to work was broken. This took time and conscious effort. To test the various terminal types, I had to go to a terminal room and test the functions on each of the various terminals available. When we got our first PCs, the team had the ability to use the terminal emulator to test all the various options instead of having to spend time in the terminal room; this helped speed up the process and allowed more automation to be used for the testing.
These early days taught me a lot about the importance of quality. One other early opportunity I had was to work with clients directly through our major user groups. This early feedback directly from the client regarding how they actually used the product, what they liked and what they did not like, helped drive changes for an improved user experience. I also spent a number of years in IBM’s User Centered Design organization, the precursor to what we have now, Design Thinking, with a focus on users. Human factors testing allowed us to see how actual users would use the capabilities. With this information we could better design the functions. This focus on what is best for the end user led to the work we did with Common User Access (CUA), which helped define the industry standard user interface. This work also led to the IEEE standard for user interfaces.
Anyone using a computer today still sees the impact of that work: the menu bar across the top, with File, Edit, View and Help in standard locations, and the standard cut, copy and paste keyboard shortcuts – ctrl-c, ctrl-v, ctrl-x, ctrl-z (or cmd on Mac).
Having worked to help drive this standard across all of the major IT vendors at the time was a challenge, but the fact that it lives on 30 years later shows the value of this effort to focus on the end user. This was obviously a team effort, but I was glad to be part of it so early in my career.
Throughout my career I have worked in various roles, but this early drive for quality for our end users has stuck with me. Over the 30 years since, development practices have gone through many changes and evolutions; however, when looking at many IBM Z shops, the development and operations practices have not evolved in the same way. Even today when I talk with some Z developers, I see that their development practices and tools look exactly the same as when I started. Those 30 years of evolution that helped drive the change for digital transformation, for software everywhere, seem to have left out this very critical part of the organization. The IBM Z system itself has not stood still; it has continued to evolve as all other systems have. Now is the time for the people working with IBM Z software to also take advantage of all the new possibilities available.
This book leverages those changes in the system, and brings those experiences together to help organizations around the world learn from the lessons of others.
Development practices have evolved over the years, but fundamentally it’s always the same. Software development requires developers to use their knowledge and skill to create a set of capabilities. These capabilities must be verified to ensure they do what is intended and they must interact with the rest of the system in the way intended. This is a very simple view of the process, but fundamentally, at its core that’s what software development is.
Software development is, at its core, the innovation of individuals creating new functions based on a set of requests. It is not the same as a manufacturing process, where you work to remove variability to help ensure every object comes out the same with the same level of quality. Creating a quality system requires a combination of activities, automation and measurements to ensure that creation satisfies the nonfunctional as well as functional requirements within the right risk profile. The acceptable level of risk will vary based on the type of software being developed, the ease of deploying a fix and the impact of a problem.
The reason to create the software is to provide business value. Building software is done by a team of responsible people, and these teams do not work in isolation from each other. These teams include the customers, internal or external, and representatives of the external users. The requirements for the software form a hypothesis for what can provide that value. That hypothesis then needs to be verified by the users, once created, to allow for adjustment as necessary. The value is useless without quality. Nobody wants a product that does not function to meet their needs.
Since the actual creation of the software cannot be tightly controlled, the activities related to the creation should be as automated as possible to remove the opportunity for error from all the surrounding activities. Simply put, the creative activity of writing the code needs all the flexibility to allow for innovation. The process of building and deploying the software should be completely automated, as there is no need for creativity in this process, but there is a high requirement that it be highly repeatable.
What is built as part of this creative process is based on a set of requirements, based on market, business and user input. These requirements will also include a set of nonfunctional requirements, or at least nonfunctional expectations for the function. This includes characteristics such as availability, response time and security, among others. The requirements driving the software creation, the software creation itself, and all the processes related to it, are often referred to as the software development lifecycle (SDLC).
The creative process of designing software requires methods to test that software to ensure it satisfies the requirements as specified. This testing process is what many people include in a quality assurance program, but in reality the software quality is determined by the entire SDLC.
Over the years various transitions have happened with the SDLC, generally to add additional steps in the process, approvals, and additional checks to help improve the perceived quality of the solution. However, as I will describe in the book, some of the changes alleviated one problem but brought more significant problems into the process. We, therefore, need to address the entire lifecycle in the context of the quality we are working toward.
Another important note in driving software quality is that since it is a creative process, you can’t guarantee there are no problems. The goal should be to deliver the appropriate level of risk for a problem. Determining the level of risk acceptable to each solution, and determining appropriate ways to measure that risk are key factors.
Measurements are an important part of driving software quality, but determining the right metrics to drive the right behaviors is the challenge: providing the right focus at the right time in the process to drive the highest quality possible with the least effort.
I will detail throughout the remaining sections of the book how each aspect of the software development lifecycle contributes in its own way to the final solution running in production. I will discuss the implications for measurements and how they drive behaviors in ways that may not lead to improved quality.
In many ways this book is written for executives and engineers alike who work in organizations that have IBM Z but are not familiar with it; it provides the explanations for why things have come to be done a certain way, as well as suggestions for how to help change those practices. For those of you working with IBM Z, you can skip the definitions of terms you already know, but the suggestions for new ways of working apply to you as well.
I have designed this book so that it can be read front to back; however, if only a particular topic is of interest, each section can stand alone. Each section will contain the definition and foundation information, as well as key insights and stories from various customers’ environments. In addition, background information is included for those less familiar with particular areas such as the IBM Z environment.
Section 1 provides general background information, definitions for key areas such as software quality, metrics, continuous integration and continuous delivery (CI/CD), and background information. It will include a description of the types of customers that I will use as examples in the book, as well as abstract descriptions of those customers. In addition, it provides a high-level overview of IBM Z, definitions of key terms related to IBM Z, and an introduction to traditional z/OS development processes. The goal of this section is to lay the foundation for an understanding of IBM Z as well as areas related to software quality.
Section 2 delves into essential components related to the process of software quality including the overall pipeline, environments for use during the process, fundamentals related to data for testing and overall high-level practices that should be followed. This section describes how modern development practices can now be used including IBM Z, and describes the importance of each aspect as it relates to the process.
Section 3 provides a view of the various types of testing, the personas who perform the tests, and additional considerations for each type of testing. It describes how the IBM Z system can be included and additional considerations for inclusion in each type of testing.
Section 4 offers a summary, options for roadmaps for transformation and additional stories to bring the entire process together.
Section 1: Introduction
This section will discuss various parts of the software development lifecycle as they relate to quality, including various terms currently in practice. It will also provide background for the following sections, as well as additional terms and information about IBM Z systems that may be less well understood. If you are not familiar with IBM Z and its capabilities, I would recommend reading the description of IBM Z and the terms associated with it. This will help provide a high-level understanding.
What Software Quality Is
Software Quality: “The quality of a system is the degree to which the system satisfies the stated and implied needs of its various stakeholders, and thus provides value. Those stakeholders’ needs (functionality, performance, security, maintainability, etc.) are precisely what is represented in the quality model, which categorizes the product quality into characteristics and sub-characteristics.”
The product quality model defined in ISO/IEC 25010 comprises the eight quality characteristics shown in Figure 1:
Figure 1
Software quality is what we strive for: a level of quality that leads to an acceptable risk profile for use in production. But what is software quality? How do we measure software quality, and how do we understand what is the right level of risk when deploying to production? It is important to recognize we can’t and won’t have perfect code; there are always going to be problems or defects. The question becomes, what is the severity of these defects or problems, and what is the frequency of the problems occurring? When we’re driving software quality we’re essentially trying to ensure we meet business demands while not spending too much time trying to test in quality. In fact, it’s important to recognize that you cannot test in quality.
Testing doesn’t actually add quality; it gives you feedback and an understanding of what level of quality you have achieved. It is the software development process itself that drives quality overall. And it’s not just the software you’re writing, it’s all the aspects of the system around the software you’re creating: the middleware, the infrastructure and all the different parts that lead to the end quality, the user experiences. The other important aspect of understanding the risk is the combination of changes going in at the same time for an overall solution. One change that is low risk is a different consideration than thousands of low-risk items going in at the same time.
The topic of software quality was also covered in a paper as part of the DevOps forum. In the paper we discussed outcomes, leading indicators and different aspects that can help drive software quality. One key aspect of the paper is the discussion of leading indicators and misleading indicators. This idea of misleading indicators, or areas that one might use to indicate a measure of software quality that in reality may not provide any indication of software quality, is important to recognize and will be discussed later.
When thinking about software quality, many times the focus is on testing, but looking at the full definition, it’s more than testing; it includes all aspects of creating the system, as well as the running system itself. Keeping a broad view of software quality helps ensure one does not focus on just one aspect; it allows you to look across the board, to understand how each part can be affected. The purpose of this book is to view the various aspects of the Software Development Lifecycle to address each of the above areas. One additional area to be considered is the happiness of the people involved in the process. This item is not specifically mentioned in the prior definition, but individual happiness drives individual performance, which absolutely affects quality. (Happy teams are more productive and productive teams are more happy. https://phys.org/news/2019-10-happy-workers-productive.html)
So, the question becomes, how do you measure software quality? Well, the problem with measuring anything is that people work to satisfy the measurements, not necessarily the goal of the measurements. And developers are like anyone else: they satisfy the measurements. So, for example, if you are measuring productivity via lines of code, developers will write longer code, not necessarily better or more efficient code, just more lines of code. Significant effort has been put into the space of developer productivity; if you want to look into this in more detail, one source of additional information is the paper The SPACE of Developer Productivity (https://queue.acm.org/detail.cfm?id=3454124). The same is true for software quality: it is important to create metrics where people drive to the goal of the metric, and can’t just drive to the metric itself. A simple example of this is the requirement for unit testing. If you have a requirement for unit testing, but no other requirements around it, you can easily create a unit test that says assert equals true. You then have a unit test, but it doesn’t actually test anything.
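To make that concrete, here is a minimal sketch in Python (the apply_discount function and both tests are hypothetical, invented purely for illustration): the first test satisfies a “must have unit tests” metric while verifying nothing, and the second actually exercises the code, including an error path.

```python
import unittest

def apply_discount(price, percent):
    """Hypothetical business function: apply a percentage discount."""
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class GamedTest(unittest.TestCase):
    def test_exists(self):
        # Satisfies a "unit tests required" metric but verifies nothing.
        self.assertTrue(True)

class MeaningfulTest(unittest.TestCase):
    def test_normal_discount(self):
        # Verifies the actual business behavior.
        self.assertEqual(apply_discount(100.00, 25), 75.00)

    def test_invalid_percent_rejected(self):
        # Error paths count too: a real test covers failure scenarios.
        with self.assertRaises(ValueError):
            apply_discount(100.00, 150)

if __name__ == "__main__":
    unittest.main()
```

Both classes count identically toward a “number of unit tests” metric, which is exactly why such a metric can be gamed.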
Metrics are a key problem in the software development lifecycle. How do we create metrics that won’t be misused or misread? What metrics provide insight into what the end users will perceive as quality? Driving to the goal of high quality and customer needs must be the focus, and if this requires changing metrics on a regular basis so they cannot be gamed, that’s what needs to be done. In the end, the customer is the final decision-maker, and it is their view of quality that must be considered.
When we think about software quality, we need to think about the process itself, the tools that we’re using, and the organizational structure and how it impacts software quality. It may seem a little strange, but these different dimensions affect the overall quality in different ways; throughout the book we will discuss the different aspects and how they relate to the overall software quality.
Metrics
Metrics are about creating the right measurement system to ensure you get the outcomes desired. Metrics need to be measured throughout the process itself in order to provide value. Trying to do this manually is an inefficient use of time that leads to inaccurate data. Pulling the metrics from the system itself instead results in reliable consistent data across teams.
Metrics are what drive behavior. What you pull from the system to drive the behaviors you want will partially depend on the teams themselves, what they’re building, whether it’s a new development or primarily an update to a large existing system.
If you ask developers for high code coverage they will create high code coverage, whether or not they are ensuring the code is doing what it should be doing. Therefore, it is important to use the right metrics to drive the right behavior, or to change the metrics with some frequency to drive the right behavior.
Common metrics related to software quality include:
Code coverage.
Whether or not there are unit tests.
Function point coverage.
Having a specific type of test.
Number of defects found within a particular type of test.
Number of identified defects that have yet to be addressed in the system.
The severity of the defects found.
The severity of defects and number of defects found after deployment to production.
Defects in particular are a problem when it comes to metrics. Finding more defects while in the unit test phase is good, but to what level should this be tracked? Creating extra work to track the items found, instead of just allowing the individuals to work through the development process, can be an introduction of waste. You can’t get to actual zero defects in code (zero found defects, possibly), but are all problems worth fixing? Does the issue cause a user interruption? Does it cause data problems? Understanding the severity of any defect is important to determine if it needs to be addressed right away or the next time the code is opened. It may be important to indicate, when finding a defect later in the cycle, if it should have been found in unit testing; properly identifying unit defects after code gets into later-stage testing is a way to drive behavior for more unit testing by the developer.
It is also important to understand that this short list of metrics only covers a very small part of the overall picture of software quality as defined earlier; we need to look at metrics across the entire scope of software quality.
Story
I remember early in my development days when Six Sigma was a driving factor. Based on Six Sigma, the goal was to continue to decrease the number of defects within the system. You continuously improved your quality by reducing the expected defects within the system, and the way this was measured was based on defects found. However, if you were working on a product that had been in the industry for a while and was critical to the system in use, very few defects were going to be found. Ultimately, that meant we would have a release in which we reportedly found zero defects. Since we know this is impossible, we had a problem and could not satisfy our business goals. Meanwhile, there were other teams receiving awards for their Six Sigma work because they had so many defects, which made it very easy for them to improve. This is an example of rewarding the wrong behavior and penalizing the correct behavior. We were actually improving the product that already had high quality, but it was seen as bad. The products that were not high quality were seen as better due to the metrics used (a larger decrease in defects in production), because they started with such a high number.
This kind of metric negatively affected the morale of the teams that already had high quality. This is an example of how metrics can be used incorrectly and how it is important to understand each product and how its capabilities should be measured. These products were not comparable and different metrics for each should have been used.
In the same environment, if we look at the number of defects found in production, or in this case in the customers’ environments, we could easily see and measure the fact that the product with very few defects was of a higher quality, and was of a quality that generally satisfied customers. The goal should be to maintain that level of quality because it’s already high enough. We all knew there were defects in the system somewhere; no one was going to say we were completely free of defects, so looking for defects was not a valuable use of anyone’s time. However, as changes were made, it was important that we made sure no new defects were introduced in order to maintain the level of quality the customer was currently happy with.
Meanwhile, the team that had significant defects in the customer environment needed to invest a significant amount of time and effort to improve quality. They had to understand what areas needed to be significantly rewritten or addressed in order to decrease the number of defects in the system so that customers were not finding them. It is worth noting that there are always some defects caused by a product being used in a way that it was not designed for; however, defects in performance based on what the product was designed for always need to be addressed.
This story brings up the concept of flow distribution, in terms of how much investment is going into providing work in terms of value, risk, debt and mitigating defects. Customers would love for all investment to go toward the creation of new functions; however, that is not realistic. We need to balance the various areas of investment to address security risks before they occur, as well as technical debt, which is an investment in future value delivery.
Types of Metrics
Metrics can relate to different areas:
Availability metrics.
Business value.
Customer satisfaction / retention.
Development.
Learning.
Happiness.
Operational metrics.
Quality.
Security.
Value delivery speed / offering.
Availability metrics provide the measure of the availability of the service or function: what is the uptime of the capability. Availability might not be measured simply as up or down; it could be up but degraded, or it could be limited function. Capturing and understanding the availability required, as well as the achieved availability, fall into this category. Service-level objectives and service-level agreements (SLAs) also fall into this category.
The required availability is a critical part of this metric. If a service only has to be available at a particular time of day in order to complete a task, such as a commitment between parties requiring a transfer to happen each business workday between 3:30 p.m. and 4:00 p.m., measuring that the function is up 24 hours a day is not critical. However, measuring that the function is active/up for those 30 minutes each workday is critical. This may be an unusual example, but it is important to understand what the true availability requirement for a function is. There are some functions, such as online ordering, where no down time is acceptable. Fully understanding the function helps determine the requirements for quality measurement.
Automated processes for measuring availability should be created to provide a clear picture; just because monitoring tools show all the parts of the system are up does not mean the system is available. A measure that tests the availability, including the system’s ability to deliver business value and its performance, should be used to measure the actual availability.
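As an illustration, a synthetic-transaction probe might look like the following minimal Python sketch (the endpoint URL, the response payload, and the weekday 3:30–4:00 p.m. window are all hypothetical, echoing the transfer example above): it exercises a business function during the window that matters, rather than just checking that a process is up.

```python
import json
import urllib.request
from datetime import datetime, time

# Hypothetical service endpoint and business window (see the transfer example).
ENDPOINT = "https://example.internal/api/transfer/health"
WINDOW_START, WINDOW_END = time(15, 30), time(16, 0)

def in_required_window(now=None):
    """Only the 3:30-4:00 p.m. workday window counts toward availability."""
    now = now or datetime.now()
    return now.weekday() < 5 and WINDOW_START <= now.time() <= WINDOW_END

def probe():
    """Run a synthetic business transaction, not just a process check."""
    try:
        with urllib.request.urlopen(ENDPOINT, timeout=5) as resp:
            body = json.load(resp)
            # "Up" means the business function completed,
            # not merely that the port answered.
            return resp.status == 200 and body.get("transfer_test") == "completed"
    except (OSError, ValueError):
        return False

if __name__ == "__main__":
    if in_required_window():
        print("available" if probe() else "DEGRADED: business function failed")
```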
Business value metrics vary based on the function, but are generally a set of key performance indicators defined for the capability. Where possible, these metrics should be automatically measured in areas such as revenue generated, completion of an order, or successful completion of a transaction.
Customer satisfaction and retention show end-customer feedback on the service provided. You will gather information such as Net Promoter Score and customer retention, as well as growth in the services used by existing customers.
Development includes measures of development activity, such as the size of pull requests, understanding whether small incremental changes are being made. This category should include metrics on how the team performs. These are not focused on calling out individual behavior, but rather are team-level metrics for each group. The type of application the team is working on should also be considered by these metrics, as different types of applications will drive different behaviors. These measurements should be pulled from the system into a dashboard without requiring the developers to input additional information. (Many tools are building value-stream management capabilities, which pull these metrics from the underlying pipeline and connected tools.)
Learning is a metric of how much time the team spends on education, experimentation and exploring new alternatives. Making sure individuals are continuously learning as they are working is a metric that helps understand employee engagement. Where possible this should be tracked by systems automatically. If automation is not an option, a simple method of recording time should be created so as not to create excessive additional work.
Happiness is a direct measure of employee satisfaction and/or engagement.
Happiness can be measured through surveys. Further input can be gained from employee attrition rates and team surveys. Happiness needs to be measured at the team level, with possible rollups to larger groups. Understanding each team’s level of happiness is important, to ensure issues are not being hidden by a global sense of goodness.
Operational metrics are a measure of the system performance. This includes measuring the throughput, response time and system usage characteristics. The time to restore the service after a disaster can also be seen as an operational metric. The operational metrics should be pulled directly from the system. Monitoring products provide this level of information; in addition, z/OS has very detailed reporting that can be used to understand the operational characteristics.
Quality includes many metrics generally used to determine the overall software quality, such as defect-to-feature and defect-to-developer ratios, defects that reach production, time to resolution, time to restoration, unit test code coverage, percentage of covered lines, test coverage by component, test coverage by business value process and mean time to fix broken builds. Test coverage by component and business value process look not at the lines of code, but at the capabilities covered in testing: for example, are all the critical business functions covered by the tests? This data should be captured as part of the definition of the tests. The system should provide these metrics without the need for additional developer input.
Though there generally is a type of metric called quality, all the various metrics end up being affected by, or driving, the software quality. Take availability metrics, for example: they are an obvious outcome of the software quality process.
Security and risk includes measures such as time to address security issues, vulnerability patching frequency, and time from available patch to deployment in production. This can also include the output of security scanning tools, both static analysis and runtime penetration testing. These measures should be pulled from the system directly, and be clearly visible as part of the process. Security and risk defects should not be merged in with other defects, as they have a different effect on the end-customer; they need to be clearly identifiable as security or risk related.
Value delivery speed / offering includes the time from when an idea is approved by the business until the time the function is in use by customers and delivering the prescribed value. This category also includes measurements such as flow distribution: the percentage of time on new functions, defects, technical debt and the risk associated with the functions. Also factored in are time from the start of build to deployment into production, time from initial idea to entering production, work items completed per time period, work in progress, work in progress aging, flow efficiency and repository speed (time from merge request to merge). As is clear from all the talk about value stream management, the flow metrics, those that help understand the speed of work through the system, are actually some of the key metrics in helping understand the overall quality. Many tools have been created to help capture the flow metrics.
All of the above metrics come together to help demonstrate the overall software quality. Focusing on only one set of metrics will not provide a full, clear picture. There is no absolute when it comes to software quality, no one metric or function; it is the combination that determines the complete picture.
There are many other possible metrics used that have little if anything to do with driving software quality, items such as the percentage of the team officially trained on a particular methodology: it’s not the fact that they are trained that is the important measure, it’s the activities that result from the training. This type of metric can be seen as a misleading indicator. The idea of misleading indicators came up when working on a paper as part of the DevOps forum 2020. It is as important to understand which metrics don’t help as it is to understand which ones do. The paper, called Measuring Software Quality, can be found at https://itrevolution.com/forum-paper-s/. This collaboration with Cornelia Davis, CTO, Weaveworks; Stephen Magill, CEO, MuseDev; and James Wickett, Head of Research, Verica, helped solidify some common misunderstandings with metrics across the industry.
Importance of Testing
Testing is one of the key ways we understand the quality of our software. As I’ve mentioned before, you don’t test in quality, but testing is used to help you measure the quality; it is also the way we provide feedback on the system. Testing can mean many different things, and in the following sections of the book I will be describing the different types of testing, how they apply and which areas they will drive, to help understand what aspects of software quality are important.
One important note: testing refers to not just testing the software that’s been written, but testing the middleware and the environment it’s going to run in, to ensure you meet the reliability and scalability standards as well as the security that you need. All aspects are important to provide an overall understanding of the software quality as it will be experienced by the end-user.
There is another aspect of testing that is important to understand though, and this aspect of testing is one that confuses many people: testing in production. When we test in production, we’re not testing for software quality, we’re testing for user experience or functionality, to see if that new function or that new idea is of some value to the end-user. Testing in production does not mean that you write code and put it directly into production. The code will have been tested in various ways first; this idea of testing in production has more to do with testing an idea in production to see how users react to it. Another way to describe this difference would be “technical requirements” testing (which would include business functionality, resiliency, performance, security, etc.) and perception testing (UAT, A/B, market testing). “Technical requirements” testing should never be done in production, but “user perception” testing may make sense in a production environment.
The word testing is used in both cases, and it is important to not misconstrue the implications. When people talk about testing in production, generally they’re not talking about software quality or looking for defects; they’re looking for user experience problems and whether or not this idea makes any sense. This is where companies use A/B testing or soft rollouts or specific-area rollouts to experiment with a particular new idea.
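For illustration only, here is a minimal Python sketch of how an A/B rollout might deterministically assign users to the current experience or the new idea (the experiment name, user-ID format and 10% split are assumptions, not a prescribed implementation):

```python
import hashlib

def ab_bucket(user_id: str, experiment: str, new_variant_pct: int = 10) -> str:
    """Deterministically assign a user to 'A' (current) or 'B' (new idea).

    The same user always lands in the same bucket, so their experience
    stays stable while the business measures how the new idea performs.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    if int(digest, 16) % 100 < new_variant_pct:
        return "B"  # small subset of production users sees the new idea
    return "A"

# Example: roll a hypothetical new checkout flow out to roughly 10% of users.
print(ab_bucket("user-12345", "new-checkout-flow"))
```

Because the assignment is a hash of the user ID, each user gets a consistent experience while the organization compares how the two variants perform.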
Testing is how we get feedback; it’s how we get an understanding of whether or not what has been written, or what is being put together, will perform as we anticipated, and whether or not it will provide the value expected. There are other reasons for testing, such as regulatory and compliance requirements. Many audits require that code has been tested before it rolls into production; the questions, however, are what kind of testing, what amount of testing, and what is required to satisfy those audits? The important thing to note is that, generally, audits focus on the software we are writing and they leave out very important other aspects, such as the actual deployment process and whether the environment in which it’s running performs the way it’s designed to. All of these aspects should be part of the testing and verification process.
What Is Testing and Who Does It?
Testing is a key part of the software quality process, but what does the word testing mean? Testing is the process of evaluating a solution, system, or piece of software with the goal of validating it performs as specified in the requirements. Any organization that builds software for its own use or sale, or configures software to its own specifications, needs to validate that it satisfies the requirements that have been specified. Testing is part of the overall software development lifecycle. Testing covers a broad range of activities by a broad range of personas. Testing can be looked at in a number of ways; one such way is how it is performed: manual testing vs. automated testing.
Manual testing: Testers manually use the application’s features to validate correct behavior. This may be done by also running some forms of automation, but the validation is done via a person reviewing the output.
Automated testing: A process where all steps, including verification, are automated; results are a pass or fail with no manual review or intervention. Automated testing generally produces detailed information about actual processes, and can generate code coverage. It is important to keep the full definition of automated testing in mind. Many times tests are considered automated if they kick off the process for testing through automation; however, unless the evaluation of the results of the test is also automated, it actually still requires manual processing, and therefore, does not count as fully automated. One example I hear very often: the batch run is kicked off as a test. Yes, this full process is automated, but at the end of the batch run a user has to validate that the results are as expected. If instead the batch run was done with an automated process at the end to verify the results of the test, then it could be considered an automated test.
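To show the difference, here is a minimal Python sketch that closes the loop on the batch example (the job script and the output/expected file names are hypothetical stand-ins): the run is kicked off and the results are verified automatically, so no user has to review the output by hand.

```python
import subprocess
import sys

def run_batch_and_verify():
    """Fully automated test: kick off the batch run AND verify its output.

    Without the verification step, this would only be automated execution;
    a person would still have to review the results by hand.
    """
    # Hypothetical batch job that writes its results to batch_output.txt.
    subprocess.run(["./run_batch_job.sh"], check=True)

    # Automated pass/fail: compare actual output to the expected results.
    with open("batch_output.txt") as actual, open("expected_output.txt") as expected:
        if actual.read() == expected.read():
            print("PASS: batch output matches expected results")
            return 0
        print("FAIL: batch output differs from expected results")
        return 1

if __name__ == "__main__":
    sys.exit(run_batch_and_verify())
```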
Once the how is determined, the next area to look at is the type or purpose of the testing. These are:
Unit: Smallest level to test, done during development, by the developer, focused on all use cases, including errors and security problems.
Functional testing: Testing specific business function capabilities by testing the specific capability with as little environment around it as possible.
Integration: Testing some set of components together to ensure they can perform together to accomplish business functions.
System: Testing the entire system as it will run in production for specific business use cases.
Regression: Verification that functions, or a set of system-level integrations, that worked previously continue to work.
Performance: Testing how the system performs, how long transactions take to complete and what system resources are required to complete tasks.
Scalability: How system resource utilization changes as the system experiences greater load, understanding how to grow the system to expand the load it can process.
Operational Performance: Ensuring the monitoring systems that will be used in production are also tested and verified early. The monitoring will need to be tuned based on application changes to provide the appropriate early warning system for possible problems: for example, monitoring the performance of the system to identify early slowdowns before a failure might occur.
User experience: Actual user interaction with the system, to determine if it flows as a user would expect, if tasks can be performed easily and what assistance is required.
Exploratory: Random interaction with the system to attempt to find possible problems or cause issues.
User acceptance: End-user verification that capabilities requested have been provided.
A/B type testing: Providing two different options for capabilities to subsets of production users to see which provides better interaction for the end users.
Market testing: Similar to A/B testing, providing a specific capability to a subset of production users to determine if the additional capability provides a better user experience.
Each will be discussed in detailed sections later.
While all tests are defined or automated in some way, they also need to be managed as artifacts within the system. The tests are source code, the same way the application is made of source code, and therefore should be managed in the same way, along with the applications.
When discussing software quality and software testing, it is important to note there are a number of roles that contribute to the process:
Developers: Individuals creating capabilities as part of the software system.
Testers - Automation: Individuals creating automated tests and automated test frameworks.
Testers - Manual: Individuals manually testing capabilities, integration, the system.
End users/Business users: Individuals representing the actual end user, or the actual end user providing feedback on the system.
Specialized testing skills (performance/scalability): Individuals with specialized skills for particular areas such as performance and scalability testing.
System engineers, Systems programmers: Individuals responsible for creating the automation for, and/or updating the system for, infrastructure-related changes or application infrastructure-related changes; this can include DBAs as well. Most individuals working on z/OS from the system side are considered systems programmers, or sysprogs.
Automation engineers: Individuals responsible for creating automation for the system such as for the pipeline or the deployment process.
Deployment/Release managers: Individuals responsible for the deployment process, whether performing the deployment themselves or creating the automation for the deployment.
When thinking about software quality, the role people typically think about is tester. However, anyone involved in the process is contributing; all the roles involved in the Software Development Life Cycle (SDLC) play a part and should be included when addressing software quality.
When we think about testing, maybe we think about the traditional test organization, the application testing done after the development is completed. Or maybe we think about performance test groups that handle testing large-scale applications. Or maybe we think about the developer doing their initial test of the code that they’re writing. All of these are types of tests.
Historically, there have been different transitions in the test practices. Originally, testing was done by the developers writing the code, and there was no other group or team to hand things off to. But over time, as applications grew, their functions became more critical and regulations arrived, things changed. It became necessary to have someone test your code who wasn’t the person who wrote it. Large development teams would now write the code and then turn it over to the test team to do the actual testing.
This process evolved over the years into the waterfall development methodology that involved numerous handoffs and stages. A benefit was the development of skilled resources for testing. People who were good at testing for malicious bugs, for unexpected behaviors, or simple errors that might be made were able to help the developer. Automated testing was also sometimes done, which made testing much more efficient and allowed results to be returned faster.
Often these teams evolved into very large groups of individuals doing manual testing. The developers were responsible for writing the code and making sure it ran, but really nothing more than that. In some cases, if a program compiled it was considered working, because testing the individual program was too difficult, and it got moved forward. This process was not efficient because the developers didn’t get feedback on the code that they wrote until long after they had forgotten what they had done. Sometimes it could be months before the developer would receive any feedback. Also, by building up large teams of manual testers, the test process itself got longer and longer.
These test teams grew over time and many times split into multiple different types of testing: functional testing, system testing, integration testing, performance testing, acceptance testing, etc. Business analysts were used for this testing in many cases, and it was, therefore, done in a very manual way. These were not developers by trade, but people from the business side who only knew the expected function from the user experience, not the development. The use of business analysts provided a better test, but only provided testing for what should happen and what they expected to happen, not necessarily covering all of the possible error scenarios or problem spots that should have also been tested.
How can we be the most efficient at retrieving that feedback, and what roles are we going to have when it comes to testing? The first place to look is the developer; the developer really needs to be responsible for writing that first automated test for the unit they just wrote. Having that automated test for the code that was written helps ensure that the code works the way it is expected to work, including error conditions and failure scenarios. The problem with many of the complex systems is that this automated testing was not created as the systems were developed. Now organizations have large code bases with limited automated unit testing, if any. Dealing with this lack of existing testing is covered under unit testing in Section 3.
The question is, what testing takes place after this developer unit test, and who does it? In a waterfall development environment that would be a team responsible for function testing, and probably separate teams for system testing, performance testing, and so on.
But, with the transition to product teams, we need to look at what the new normal for testing will be. Who will do the testing, and what team will they be part of? That is a larger discussion that will be covered later in the book. The types of testing and the function of each are covered in detail in Section 3. No matter who does the work, the types of testing that need to be done are called out and specified.
What Is Continuous Integration?
According to Martin Fowler: “Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily -- leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.”
The Continuous Integration cultural transformation grew out of the developer’s need for efficiency. Developers began creating tools to automate the mundane portion of their jobs, such as infrastructure and testing, so that more time and energy could be spent doing the interesting work of developing code. As this movement has matured, it has spread further into the software delivery process, so that there is an organizational drive to remove handoffs that are seen to contribute to waste. Enterprises are asking how we can be the most efficient.
Key Parts of Continuous Integration
Single source code manager (SCM)
The first key is to maintain a single source repository. All source files, configuration files, anything that goes into the making of the application that is not binary output, should be stored in the source repository. The most common error is that not all source files are stored together, or even managed at all. It is important to make sure all the source files are together in a single SCM along with the configuration.
NOTE: It is important to understand that this should be a single repository for all files across the organization, not just for distributed code. Any source file should be managed in this single repository; this includes COBOL, PL/I, Assembler, JCL, and REXX along with Java, C/C++, C#, JavaScript, etc. Today z/OS source code is likely kept separate in a z/OS-specific management capability. Managing source has been a requirement, due to regulations, for long enough that the source code management solutions for z/OS were developed and matured before other platforms commonly used such systems. However, these systems are tightly tied to the underlying system, including the compile and output deployment process.
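As one illustration, a single repository covering the whole application might be laid out something like the sketch below. The directory and application names are hypothetical; the point is simply that mainframe and distributed source live side by side under one source code manager.

    payments-app/
        cobol/     COBOL programs and copybooks
        pli/       PL/I source
        asm/       Assembler source
        jcl/       application JCL
        rexx/      REXX execs
        java/      Java services
        web/       JavaScript front end
        config/    configuration files for the application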
Fully automated build process
The next step is a fully automated build process. The build needs to perform all the steps to create the output that will allow the application to run. This does not mean it has to rebuild everything; it should be set up to build the appropriate changes in the context of the full application. The build process should also be self-testing: the process should not just create the output, but should provide enough unit testing, code quality scanning, and security code scanning to meet your company’s requirements. In today’s z/OS environment the process generally only includes compiling those programs that have changed, so it’s fast, but it is not self-testing and does not include code quality scanning.
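As a minimal sketch of what a self-testing build could look like, the following Python script chains the stages together and fails fast on the first error. The four commands are placeholders for whatever compiler, unit test runner, and scanners your organization uses, not real tools:

    import subprocess
    import sys

    # Hypothetical commands; substitute your compiler, unit test runner,
    # and scanners. Every stage must be able to fail the build.
    STAGES = [
        ("compile changed programs", ["build-tool", "--changed-only"]),
        ("run unit tests", ["test-runner", "--fail-fast"]),
        ("code quality scan", ["quality-scanner", "src/"]),
        ("security code scan", ["security-scanner", "src/"]),
    ]

    for name, command in STAGES:
        print(f"=== {name} ===")
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Build failed during: {name}")
            sys.exit(result.returncode)

    print("Build succeeded: output is compiled, tested, and scanned.")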
Build runs on a build machine
One aspect of continuous integration is that the build always runs on a build machine. The build machine should be a defined piece of infrastructure, possibly even a container, but it’s a defined set of infrastructure that ensures it is not changeable by the developer and is controlled so that undesirable code or aspects can’t be added. This also ensures that it is not on any individual developer machine so no individual variances get picked up.
NOTE: This has always been true of z/OS builds; the only place they could be built was on z/OS, so builds were never happening on individual machines. But this also means building in a standard place, not in personal libraries or using personal JCL. The build is controlled by the system.
Fix broken builds right away
Another key to the development process is to make sure that if anything goes wrong with the build it gets fixed immediately; this is critical for the continuous integration process. The person who breaks the build needs to fix their error as quickly as possible, and if there’s a major problem, everyone needs to jump in and fix it to make sure that they can all continue to work. Generally, allowing some timeframe to fix the build before calling on the full team is a helpful addition.
This concept of pulling the cord, as one would do on a manufacturing line, to get the entire team to fix the problem before work piles up is an important aspect of fixing builds fast. A broken build means people cannot get their work done and everything slows down. It’s just as important to recognize that the time allowed to fix the build is based on the time before additional work will need to be contributed.
Another important aspect of this category is making sure you have automated testing as part of your process. You need to make sure the tests that are part of the regular build run each and every time and are stable and clear. This is not about driving a large number of tests as part of the build. The tests are as much a part of the product being built as the function itself; you have to manage the tests over time. The goal should not be simply more tests, but specific ones that test the function you are adding. You can’t have flaky tests as part of the build, or the build errors will begin to be ignored and your builds will always be broken, and you won’t know if the build is really broken or it’s just those flaky tests causing a problem.
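To make the distinction concrete, here is a small hypothetical example in Python. The first test is stable because it exercises only the code under test with fixed inputs; the commented-out second test is flaky because it depends on wall-clock timing and can fail even when the code is correct:

    # A stable unit test exercises only the code under test with fixed
    # inputs, so a failure always signals a real defect.
    def interest(balance, rate):
        return round(balance * rate, 2)

    def test_interest_is_deterministic():
        assert interest(1000.00, 0.05) == 50.00

    # A test like the one below is flaky: it depends on wall-clock
    # timing, so it can fail even when the code is correct. Keep tests
    # like this out of the regular build.
    #
    # def test_completes_quickly():
    #     import time
    #     start = time.time()
    #     interest(1000.00, 0.05)
    #     assert time.time() - start < 0.001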
In traditional z/OS development the concept of a build is a bit different as it is going to compile and link-edit to produce the load module that can then be run. This process does not include automated unit testing, code quality scanning, or security scanning. Moving to an automated process that includes all these steps
is also a change.
Keep the build fast
Another important part of continuous integration is keeping the build fast. We want the build to be fast and to make sure the developer gets feedback quickly. The build not only needs to be fast, it needs to be seen as a critical aspect of the process, such that if it breaks it’s fixed as soon as possible and there is no delay or wait. What does fast mean? That varies by what you’re building, and it is important to make it reasonable so that as soon as a problem can be found it is detected and returned to the developer. For example, compiling the changes first to make sure they are successful, then running the unit tests for the particular code changes, helps ensure the feedback is as quick as possible.
This idea of keeping the builds fast has been a standard part of a traditional z/OS process. Only the files specifically modified are generally recompiled. The exception is if a shared included file, such as a copybook, is changed, then sometimes additional programs that include the copybook are also recompiled.
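A sketch of that selective-rebuild logic might look like the following Python fragment. The program and copybook names are made up, and a real build tool would derive the dependency map by scanning the source rather than hard-coding it:

    # Illustrative dependency map: which programs include which copybook.
    COPYBOOK_USERS = {
        "CUSTREC": ["ACCTUPDT", "ACCTRPT"],
        "TXNREC": ["POSTTXN"],
    }

    def programs_to_rebuild(changed_files):
        """Rebuild only what changed; if a shared copybook changed,
        also rebuild every program that includes it."""
        rebuild = set()
        for name in changed_files:
            if name in COPYBOOK_USERS:      # a shared include changed
                rebuild.update(COPYBOOK_USERS[name])
            else:                           # an ordinary program changed
                rebuild.add(name)
        return sorted(rebuild)

    print(programs_to_rebuild(["POSTTXN", "CUSTREC"]))
    # -> ['ACCTRPT', 'ACCTUPDT', 'POSTTXN']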
Make the builds and tests visible
Another key aspect of continuous integration, and of this cultural transformation to DevOps, is making things visible. The builds, tests, and results all have to be visible to and viewable by everyone in the organization. The results need to be clearly indicated, i.e. green builds vs. red builds. It’s important that you can get green builds on a regular basis. The thing we’re focused on in the build, and the tests in the build, is making sure we have a clear quality signal. From the very first build we want to know that the function works as it has before. Having automated tests that reliably provide information on whether or not the code works is more important than having lots of automated tests. We will discuss later how to get this clear quality signal, but for now it’s important to recognize that we need it as part of this build and test process.
Make sure the output is published so it is easy to get
As the build runs you need to make sure the artifacts created by the build are published into an artifact repository so that they are easily findable and usable as part of the deployment process. The artifact repository is also used for audit and security control. This artifact repository needs to include all artifacts for the entire system. Currently, z/OS artifacts generally reside only on z/OS. The ability to move output to an artifact repository is a capability that has been added to z/OS in recent years.
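As one hedged illustration, assuming an artifact repository that accepts uploads over a simple HTTP PUT (as many do), publishing a build output might look like the Python below. The URL, credentials, and artifact names are all placeholders:

    import requests

    # Placeholder URL and credentials; substitute your repository's details.
    REPO_URL = "https://artifacts.example.com/releases"

    def publish(artifact_path, version):
        # Deployments should pull from the artifact repository,
        # never from a build workspace or a developer machine.
        target = f"{REPO_URL}/payments-app/{version}/{artifact_path}"
        with open(artifact_path, "rb") as f:
            response = requests.put(target, data=f,
                                    auth=("ci-user", "ci-token"))
        response.raise_for_status()

    publish("payments-app.tar", "1.4.2")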
Why Continuous Integration?
Why is continuous integration so important? Organizations have found that the quicker you integrate changes into a common branch, the less rework you have to do. One of the biggest problems is the world of merge hell. The phrase merge hell came about from individuals or teams working independently for so long that trying to get their code back together became a real challenge. And since the function is going to be delivered together, it is important to make sure that it works together. The other value is having developers focus on the smallest possible chunk of code that can work, that does not break, and that can be part of the system. The function might not be completely done, but it can have feature flags around it so that it doesn’t break other systems while it’s being built.
There can be many uses for feature flags: functions that will be exposed by some marketing campaign on a specific date, regulations that will take effect on a particular date, or a change in the interface that all teams are not trained on yet. Any of these changes could be controlled by feature flags to make sure the code has been tested and is in production, so it is not a deployment that turns on the function, but a business rule or flag that activates the function at the right time. By learning to work through continuous integration, smaller functions can be designed, built, and delivered more frequently. This changes how developers have to think about and focus on their assignments.
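A minimal sketch of a date-activated feature flag, in Python with hypothetical flag and function names, might look like this; the deployed code carries both paths, and the business rule decides which one runs:

    from datetime import date

    # Hypothetical flag table; real systems usually read flags from
    # configuration or a flag service so no redeployment is needed.
    FLAGS = {
        # The code is already deployed and tested; this business rule,
        # not a deployment, turns the function on.
        "new_statement_format": date(2021, 7, 1),
    }

    def is_enabled(flag_name, today=None):
        enabled_on = FLAGS.get(flag_name)
        if enabled_on is None:
            return False
        return (today or date.today()) >= enabled_on

    if is_enabled("new_statement_format"):
        print("render the new statement format")
    else:
        print("keep the existing behavior")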
NOTE: In traditional z/OS development, parallel development was limited by the use of a single Dev level, so all changes were visible to everyone and affected everyone right away. In a sense, it was always doing continuous integration, but there was no opportunity for isolated work. The goal with today’s continuous integration is to allow for isolation in early work while getting teams to integrate small additions frequently. The idea of continuous integration not only allows independent isolated work, it supports the early process of validating that the function can be merged in without causing issues.
How Does Continuous Integration Support Building Quality Apps?
Through the use of continuous integration, you are quickly bringing together all the changes being made within an application to get fast feedback. This ability to quickly understand whether you are building what is required helps remove the long-standing problem of building to what you thought the users wanted. By getting feedback early, the rework is reduced.
If continuous integration is so helpful, why is it so hard to adopt? Continuous integration is, at its core, a way of working, a set of practices for people to follow. Changing people’s practices can be hard; it requires changing their daily work, and not just the actions of the daily work, but the way they think about the work. Moving to small increments can be very hard for people who have been working on long-term schedules, who think of things in month-long projects. The team’s reaction does vary based on how long they have been working this way, as well as the type of applications they are working on.
Front-end component teams, building apps for the phone or websites, are most likely to make this transition quickly, as in many ways their work was originally formed around the idea of small incremental changes. They also have the advantage of having actual end-users to help drive the capabilities.
Teams building new capabilities are the second most likely to move quickly to this paradigm of continuous integration as they don’t have an existing base to deal with.
Mid-tier and back-end systems are where the biggest challenges exist. Many of these teams have large monolithic applications they are working on; they have been conditioned to work on multiple projects at once and to get as much into each release as possible, because releases do not happen frequently. The flow of business value is dependent on all the parts delivering, and therefore moves at the speed of the slowest part. The idea of focusing on one small thing, getting it done, and moving on is very different. The monolithic nature of the application is not the issue; it is entirely the focus of the plan, the scope of change, and the way the individuals have been working. This is especially true for teams that have been working the same way for decades, with many of the same team members and not many new people. The challenge is in helping the teams understand the new ways of working. Continuous integration requires that this change be introduced in the smallest possible increments; learning to think this way requires practice and coaching.
Another important aspect of continuous integration that should not be lost, is that it implies the ability to work in isolation before integrating. The ability to work in isolation is very important to this overall process of creating small functions, testing them first, and integrating quickly. The ability to have the isolation is what provides the flexibility and the space for innovation. This ability to work in isolation is something not generally available to traditional z/OS developers, but there’s no reason it could not be provided.
Key Insight: Continuous Integration
The value of continuous integration lies in the ability to work in total isolation and in bringing functions together as quickly as possible. Both are required.
Continuous Delivery, Continuous Deployment
Continuous delivery is interpreted in many different ways. Based on Martin Fowler’s definition, “continuous delivery is the software development discipline where you build software in such a way that the software can be released to production at any time.” The key factor here is that it is ready to be released. The focus is on building in small enough increments to regularly finish capabilities that can be deployed into production. It means focusing on completing changes, getting to what is sometimes referred to as the “done done” state: not just that the code is done, but that all the steps required are done.
If continuous delivery is being ready to deploy to production, what is continuous deployment? Continuous deployment is a process in which software functionalities are delivered frequently through automated deployments. Continuous deployment uses automated testing to ensure issues are addressed early in the lifecycle.
The difference between deployment and delivery is very important. One must be able to do continuous deployment into test environments, development environments etc. very quickly and very effectively. The process of delivery to production must be capable of using the same methods and efficiency in deployment, however, there is a fundamental question about how often you really want to deploy to production. This question is important when we consider the types of transactions and applications that exist within different companies. Within any enterprise there are going to be a variety of applications that will require different frequencies of deployment. As noted by Tapabrata “Topo” Pal, a social transaction does not equal a financial transaction. It is important to recognize that there are different types of transactions, and based on those characteristics, the frequency of deployment may vary.
One example I like to use is a calendaring service, which provides functions to help a company organize based on what day of the week it is, whether or not it’s a holiday, and other factors about the day itself. We use the Gregorian calendar, which was originally introduced in 1582. However, things do occasionally change: maybe new holidays need to be added, daylight saving time changes, or technology itself changes and you want a new API based on the new technology, meaning a new interface would be added. But the service doesn’t really change very often, so measuring frequency of delivery, or even deployment, is not a realistic measure of the function itself.
Let’s use another example, a banking application. The back-end system provides the official books of record, it provides the balance function, which is called when updates are made. Does this function need to change multiple times a day, multiple times a week, multiple times a month? Maybe not. The frequency of change for this function is dependent on the requirements for business change, so again this is an example where measuring frequency of deployment to production may not be the best measure.
Finally, a third example, in this case a mobile banking app that is largely responsible for the user’s experience with the financial institution. This capability provides information about the balance, yes, but also about all the other capabilities, functions, or features the bank wants to offer. This app may need to change very frequently, such as when there’s a new marketing plan or there’s a new way we want to show information or provide additional value to the users. It’s an example of an area where frequency of deployment into production is going to matter.
These are just three simple examples, there are lots more, but the key point here is that different applications, different components and different capabilities have different requirements for deployment into production. Instead of focusing on the production deployment, if we focus on the ability to deploy into an environment, we can help ensure we have that fast ability to deploy capability but don’t necessarily measure something that might be irrelevant for that
particular capability.
All of these pieces are part of a larger system that needs to be managed and controlled together.
Even if the capability does not need to deploy frequently, the ability to deploy quickly is still important. I want to know that I can deploy quickly and effectively into any environment. By improving the process for deployment, you should have the ability to make changes quickly when you need to. For example, let’s say there is a problem that might require an emergency change, which is just a way of introducing a change into the system without going through the normal long process. If my process for deployment is actually fast enough, then I can use the same process and not have to use an expedited, outside-the-process method in order to get that change out. I like to say my goal is to make the pipelines fast enough that no one ever has to bypass them to put anything into production.
Key Insight: Automated Deployment
Having a fully automated process allows for the rapid delivery into production for all changes, so that no change goes into the production environment from outside the process. This removes the need for any manual changes for an application deployment, including the related infrastructure components.
IBM Z: A High-Level View
IBM Z overview:
IBM Z is often referred to as the mainframe; the Z stands for “zero downtime.” IBM z15 is the latest version of the hardware available at the time of this writing. The machine has been updated to fit in a standard 19-inch rack with the same hot and cold aisle design to easily fit into today’s data centers. The hardware is designed with multiple redundancy in mind. It supports OpenShift and runs multiple operating systems including Linux, z/OS, TPF, and z/VM. It is designed to provide pervasive encryption and fast I/O rates with built-in compression, and it is optimized for high-throughput transaction processing that requires reliability, security, and availability. The system is also designed to support dynamic upgrades to allow systems to remain running through most changes. The z15 can scale up to 40 TB of memory on a single system.
Within the z/OS environment the concept of a sysplex, which refers to a tightly coupled cluster of independent instances of z/OS, is supported. As you might imagine from the term cluster, using a parallel sysplex allows you to route work across the environment, allowing for rolling upgrades as you would in other cluster environments. It is also designed to allow rolling upgrades of the full operating system as well as the middleware running within it. In order to manage this system, a coupling facility is provided to support the multiple environments. In a sysplex environment a system can be configured for Db2 data sharing, which again allows one environment to come down while the system remains up on the others. The z/OS middleware provides capabilities to support the sysplex environment, allowing applications a level of flexibility.
z/OS itself supports Unix System Services (USS), which provides a POSIX-compliant environment. It is in this USS environment that many of the tools ported from the distributed environment run. The other side of z/OS is often referred to as MVS, which was its old name. One key difference in z/OS is that the system primarily runs in EBCDIC instead of ASCII. However, z/OS can support ASCII files, so when working in the USS environment, applications can choose to continue to work with ASCII. z/OS supports many different languages and runtimes. Languages such as Assembler, COBOL, and PL/I are generally discussed when working with z/OS, but Java, C/C++, Node.js, and Python are also commonly used on the platform. This is not an exhaustive list of the languages supported, just a set to show that a variety of languages run on the platform; as new languages come up, many of them also come to z/OS based on their popularity.
Definitions of key terms:
Parallel Sysplex (Sysplex): A clustering technology that allows you to operate multiple copies of z/OS as a single system image. Images can be added to or removed from the cluster as needed, while applications continue to run.
Logical Partitions (LPAR): Each LPAR runs its own operating system, which can be any of the supported operating systems for the hardware. Each LPAR is separate; however, the processors can be shared between them based on site configuration.
z/OS: A 64-bit operating system designed for the Z architecture, with security and reliability at its core. Designed to support many applications running on the same environment at once while providing the right separation and priority to specific workloads.
z/Transaction Processing Facility (z/TPF): An operating system and a unique database, all designed to work together as one system to provide a high-volume, high-throughput transaction processor.
z/VM hypervisor: A virtualization technology platform for IBM Z and IBM LinuxONE servers. It supports the Linux, z/OS, z/TPF, and z/VSE operating systems.
Partitioned Data Set, Partitioned Data Set Extended (PDS, PDSE): A file system type available on z/OS, sometimes referred to as a library, where application load modules are stored in order to be executed within the environment. PDSE was introduced to simplify interactions with the PDS. Today most systems should be running primarily with PDSEs, though some remain PDSs, never having moved to PDSEs due to issues with other tools or other historical considerations. PDSEs and PDSs can be linked together in a hierarchy as defined in a search path.
Virtual Storage Access Method (VSAM): A file type and access method available on z/OS, primarily used for application data. You can use VSAM to organize records into four types of data sets: key-sequenced, entry-sequenced, linear, or relative record. The primary difference among these types of data sets is the way their records are stored and accessed.
Unix System Services (USS): Provides the Unix environment within z/OS. This allows many applications from other environments to run on the system after recompiling for the platform. USS supports ASCII as well as Unicode to simplify the move from other platforms.
z/OS File System (zFS): Provides the POSIX file system for z/OS in support of USS, which supports the Unix style of folders and long file names.
Customer Information Control System (CICS): One of the primary transaction processing systems that run on z/OS. CICS provides the ability to run an application online, with multiple users accessing the same files and programs. CICS manages the sharing of the resources, the integrity of the data, and the prioritization of the requests. CICS applications can be written in COBOL, PL/I, Assembler, C, C++, and Java. Many CICS applications use Db2 for z/OS for the data, though they could also use files such as VSAM, or IMS DB.
Job Control Language (JCL): A scripting language used on IBM mainframe operating systems to instruct the system on how to run a batch job or start a subsystem.
Information Management System (IMS): IMS consists of three parts, IMS DB (the database manager), IMS TM (the transaction manager), and a set of services common to IMS DB and TM. Collectively known as IMS DB/DC the three components provide a complete online transaction processing environment.
Green Screen: Term used to describe the native interface to z/OS, z/VM, and TPF. The term arose because the first “screens” available to the system were terminals that displayed only green characters on a black background. Figure 2 represents a typical green screen, showing the editor interface.
Figure 2
Interactive System Productivity Facility/Program Development Facility (ISPF/PDF): ISPF is the base menuing system provided by z/OS to allow for interaction with the system; many people consider ISPF equivalent to the command line on other systems. ISPF is designed as a walk-up-and-use menu system. PDF was built on ISPF to provide the basic development capabilities, including the editors. Figure 3 shows the primary options menu of ISPF/PDF.
Figure 3
Software Configuration and Library Manager (SCLM): Product provided by IBM to manage the software development of z/OS applications written in languages such as COBOL or PL/I. SCLM provided the version control for the source code, the compile and link process (now generally called the build process), and the promotion process to allow for changes to flow between environments using a concatenation hierarchy of PDSs.
Workload manager (WLM): WLM is a part of z/OS that monitors the sysplex environment and determines how much resource should be given to each item of work in the sysplex to meet the goals specified for the system.
Traditionally z/OS applications have been written in COBOL, PL/I, Assembler, or Java. These applications run either in the batch environment or in a runtime such as CICS or IMS.
Traditional z/OS development:
A little background for those of you not familiar with traditional z/OS development. For those who haven’t worked on z/OS, imagine a green screen with a menuing system and a seemingly archaic and very different environment. For those not familiar with the current z/OS systems, you may be surprised that all of the modern tools you currently use in the distributed world can also be used for z/OS.
Traditional z/OS development is managed through production control tools, such as Endevor, ChangeMan, or SCLM. These production control tools, or library managers, provided a structure and the proper controls to satisfy separation of duties and to help ensure the flow of changes from the developer into production. This worked through a process of library concatenation, with changes moved up a defined library concatenation sequence. See the picture below (Figure 4), which represents a simple hierarchy that might exist.
Figure 4
This pictorial representation shows the stages a change would flow through, starting with the development level moving to test, then to the Quality Assurance level then to production. It is possible that there might be more levels, such as a Pre-Production level as well. This represents only the portions of the application running on z/OS.
Some organizations developed a more complex hierarchy to allow for multiple development environments, as shown in Figure 5. However, the flow of code was limited to this hierarchy, and these libraries represented not only the level but the actual environment in which the code would run. The only exception to this was the production environment, where most of the time the production load library did not actually represent the running system that the loads were usually deployed into.
Figure 5
The process of promoting the code from one level to the next would not only move the code up the hierarchy but also move the build output into the next running system.
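To illustrate the search-path behavior behind this, the following Python sketch models a concatenation: each level holds some members, and a lookup walks the levels in order, so the first library containing a member wins. The level and member names are invented, and promotion amounts to copying a member from one level's library to the next level up:

    # Each level pairs a name with the members its library holds; the
    # list order is the concatenation sequence, so the first hit wins.
    CONCATENATION = [
        ("DEV", {"PAYCALC": "dev version"}),
        ("TEST", {"PAYCALC": "test version", "TAXCALC": "test version"}),
        ("QA", {"TAXCALC": "qa version"}),
        ("PROD", {"PAYCALC": "prod version", "TAXCALC": "prod version"}),
    ]

    def resolve(member):
        # Walk down the concatenation and return the first copy found.
        for level, library in CONCATENATION:
            if member in library:
                return level, library[member]
        raise KeyError(member)

    print(resolve("PAYCALC"))   # ('DEV', 'dev version')
    print(resolve("TAXCALC"))   # ('TEST', 'test version')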
These library managers provided not only the library control, but also the build process and the deployment process into the different stages. This deployment would include the database bind process as well as possibly new copies into CICS. It would move the JCL into the appropriate libraries, but it did not include the system-level changes, configuration, or schedule changes.
These library managers were specifically designed around the PDS, later moving to PDSEs, but were not focused on the hierarchical file system, which was added after they began. These library managers usually controlled only the actual source code, database definitions, and sometimes some of the application JCL. In many cases, the system-level JCL was not included. The way the developers would interact with the system was through the ISPF interface (Figure 6); this menuing system has been optimized as a walk-up-and-use system to make it easy for new users.
Figure 6
However, advanced users took advantage of the additional features, such as the ability to create edit macros, to quickly enhance their capabilities. One advantage of using ISPF was that if your system had the appropriate resources it was very fast, and it was always available. If you want to think about it in comparison to a distributed system, you could think of ISPF as the command line to z/OS; it’s the thing you’re going to use as a developer when nothing else is available.
In order to move through the promotion hierarchy there would be a set of rules as to who could move code and when they could move it. Organizations would come up with complex release schedules based on the calendar to indicate when specific code should be in each level. This calendar would determine when things could move to later stages, rather than when the code was actually done. This could lead to a developer writing code and doing their testing in lower stages and then waiting months before the system test, or another level of test, was done.
Another characteristic of these library managers was that they locked the source code to the first person who checked it out; this limited parallel development because there was no real merge feature in the system. Organizations came up with many different ways around this limitation, such as allowing developers to pull a second copy and do their work outside of the system. This led to problems, such as developers having to do manual merges, which created a need for complex communication between developers to ensure no code was lost in the process.
One very common example was the need to do a production hotfix while development was going on; in this case, sometimes developers would be told to remove their lock on the file so the hotfix could move forward. In other cases, the hotfix would flow into a higher-level environment and the developer might not notice until an audit report was run. The developer could finish their code, finish their testing, and only discover at the end that there was a major change they would have to incorporate into their code before it could move forward. This could cause major rework near the end of the phase.
One other characteristic of mainframe applications that’s important to understand is that they are made up of hundreds or thousands of load modules. These load modules can be built and deployed independently. Many organizations take advantage of this incremental-deploy nature of the application, sometimes to the extreme, given that different feature enhancements can be accepted or rejected at late stages in the testing process. This means related changes need to be tracked carefully, so that if it is decided that one feature will not roll forward, it can be carefully removed from all parts; otherwise it will cause late rework in the cycle.
Each of these environments will have its own set of data, and depending on the client, will have varying refresh cycles for that data. This data can be very inconsistent and hard to deal with when testing functions. Due to difficult data, developers often find that the development-level environment is not very useful for testing and they try to move forward to the first level of test as quickly as possible to actually do any real testing. This development-level testing is usually focused on running the entire function in order to test any small code change. Unit testing has been very hard to do as there generally have not been unit test frameworks to assist the developer. The other issue is that load modules within the application are generally highly interconnected, and so, understanding how to run only a portion of the application has been difficult if not impossible.
Background for Enterprises
Quality always needs to be considered when building software, from small one-person organizations to large enterprises. The size of the organization and the background of the organization, however, do influence the process. In this section we will discuss different types of organizations and how they may have arrived at their current software development lifecycle, including how input from customers factors in.
There is large variability in how customers deal with software quality today; in many cases there is a specific focus on testing. Organizations have evolved quality assurance practices differently, primarily by focusing on different aspects of testing. Understanding how and why organizations are where they are currently is key to being able to continuously improve the processes. I will define a number of different customer types, which are abstract versions of customers I have worked with.
The first customer type is the client that has been in business long before technology was a factor. These organizations were converting manual processes into automated processes as technology became available. They started creating the first back-end processing systems on mainframe computers which were designed to replace the existing workflow processes. The early process had specialized resources managing the system and early software developers building COBOL applications, who were also responsible for testing their own code. Over the years these systems grew and evolved, the teams grew larger, regulations ended up being put in place over separation of duties, and additional boundaries were added. With the advent of the personal computer, additional development teams were included to develop multi-tiered systems, still using those back-end applications that were continuing to evolve.
These companies went through the various evolutions of service-oriented architectures, web services, and representational state transfer (REST). As the applications continued to evolve, new front-end systems were created with each new industry wave, which usually involved adding teams. These companies viewed themselves as businesses, not as IT organizations, and therefore IT was generally considered a cost center.
Over time, organizations adopted the waterfall software development approach to have more control over the process. Boundaries were put in place between teams to provide the separation of duties required by regulation. Many times, little effort was placed on automating tasks because each individual team was responsible for only one part. With the recent growing demands for digital transformation and the draw and promise of the cloud, additional development teams were formed to create new digital platforms. More DevOps-like practices were used with the new digital platforms, and due to the cloud, automation was a central focus. The teams would use the latest technology both for tools and development languages. However, these new digital platforms needed to interact with the existing back-end systems and thus created the thinking that was behind two-speed IT, which has proven to be ineffective.
Oftentimes, these customers use what I call scrum fall within the back-end systems for the SDLC, while using continuous integration and deployment on the digital platforms.
Definition: Scrum fall -- Adopting some agile practices within the development effort, while still maintaining the existing separation of teams and environments for testing. The movement of code between the different environments is determined by a predefined schedule for when each release will be in which environment. The development teams may follow more of the agile practices, but the code then flows into the remaining waterfall testing cycle.
Scrum fall is a localized optimization of the value stream in that it can allow a single part of the process to flow more effectively, however, the customer and the company do not benefit until the product is delivered. It can be a step on the way, but understanding the full value stream is required to improve the entire process.
Generally, in these organizations the digital platform gets the most unit and automated testing, other parts may have some automated regression, or system test, but the largest portion of testing remains manual, focused on function and integration. Performance and scalability testing, if performed, is done at the last stage just before moving to production.
The second type of customer may have originated in the same way as the prior, but they decided earlier that IT was a critical part of driving the business. Parts might still be seen as cost centers, such as the data center itself, but the applications and the application development was owned by the business. Many times, this caused additional barriers between each of the lines of business when developing systems. Additional process controls were put in place to control the impact of one business system on another. This essentially led to later stage automation and a focus on final integration testing. Often, due to the business drivers, the move to DevOps was started earlier, which meant more localized automation for testing. Newer applications were built with automation in mind from the beginning, including unit testing.
A third customer type is one that originated with IT; they started with computer systems and designed the workflows from the beginning, assuming automation. These companies have many of the same types of challenges as the other two. Many times, these organizations have been faster to adopt new technologies, which means they may have even more variability within the entire IT landscape: more variability in terms of the types of systems, the languages used, and the architectural patterns used for the applications. This diversity provides additional challenges when trying to tie the systems together.
The way the organization has developed influences the boundaries between the various systems and the investments that may have gone into the systems over the years. Those organizations that treated IT as a cost center generally have had less focus on addressing technical debt and are likely to have more manual processes. One commonality of all of these large organizations, though, is that they have large systems of record running on IBM Z, many of which have not seen the same level of process improvement or focus on continuous improvement; more generally, changes to the systems of record have been made due to regulatory or business requirements.
The other consideration is that many current organizations are the result of multiple acquisitions, each having its own IT history. This typically results in companies having multiple sets of applications doing the same thing. Sometimes one application is replaced by another, but many times due to the differences in the systems, they end up having to work together.
In addition to the way organizations developed, the way software has been developed as projects has also led to inefficiencies. Focusing on a project leads one to focus on the features and functional defects as well as the predefined schedule and budget.
Not only does the enterprise history and project nature come into play when considering software quality, but it’s important to recognize the various personas in those organizations:
A new developer in an organization, usually a recent graduate, but could be anyone new to working in IT, with some recent education related to software development.
A long-term experience developer, with over 20 years of experience working on various platforms and applications.
A long-term z/OS or TPF software developer, with over 20 years of experience working on the platform as well as the applications.
A long-term tester, who generally does manual testing, who may have come from the business originally, but has been in the QA organization for years.
A long-term test automation specialist, who has been working to automate some percentage of the testing for inclusion in the various types of testing, such as regression, performance or scalability testing.
A new systems administrator, or SRE, brought in to the organization, usually a recent graduate, but could be anyone new to working in IT with some recent education related to system administration.
A new systems programmer, recently brought in to work on the infrastructure side of Z hardware, generally knowledgeable about other systems and automation techniques.
A long-term systems administrator, or SRE, who has been working on various systems for over 20 years, sometimes specialized in areas such as network, storage, or automation.
A long-term z systems programmer with over 20 years of experience working on the platform, many times specialized in one particular subsystem or operating system, such as CICS, IMS, Db2, z/OS or TPF.
Business analyst responsible for translating the business requirements into capabilities to be built, as well as the first verification of the capability for the end users.
In the above list there is a clear separation of software development, testing, and system administration or system programming, along with the split between z/OS and everything else. There are further breakdowns in persona types, such as front-end developer or back-end developer, performance tester, and additional persona types such as managers. These are specializations of the personas listed, the hierarchy supporting them, or variances of them. For our purposes here, the goal in calling out the personas is to recognize the split that happens within organizations, as well as the areas that need consideration when deciding how to create product-focused teams.
How Organizational Structure Impacts Software Quality
Organizational structure can impact software quality in a number of different ways. The first is ownership: when you feel responsible for a capability, from development through to use by end customers, your attitude and way of thinking about the function changes. When your responsibility involves only one part, say development, then you don’t have the same ownership. When your part is done, the responsibility gets passed on to the next team, and then on through a number of teams until the last group, the one that will run the software in production, takes over.
As organizations evolve they develop structure and hierarchy; this structure can lead to handoffs and silos.
Handoffs
When responsibility for capabilities moves between teams there is a handoff. When a separate group is required to perform specific tasks there are queues or request systems put in place. When a testing organization is totally separate from the development organization it can feel like a throw it over the wall approach. This separation guarantees that the tester is different from the original writer of the code, but it also means that code must be complete before it is turned over to a separate team for testing. All of this separation can cause delays and wait states for the developers as well as for the testers.
Test organizations have over time created additional resources so they can test code frequently as it goes through the development process. However, this ability to ensure code is tested repeatedly, as drops are delivered, before it moves on to production is entirely manual. This focus on manual testing, to validate that the function works for the end-user, takes significant time and effort, as a full environment needs to be set up for each test. Typically, these tests focused on the end-to-end process. Using this end-to-end type of testing to identify basic code problems is not efficient and adds to the costs associated with the development process.
Initially, these separate teams were created to keep developers from testing their own code, to ensure the separation of duties, in fact, many of the organizational boundaries that were created over time were to address specific issues. Essentially, as problems were identified processes were put in place to address them, which led to heavier and heavier process overhead, and to more and more separation between teams.
This separation also began to affect technology. In some cases, in order to ensure separation, ticketing systems were set up to provide formal communication channels with tracking. This led to literal walls being put in place to avoid communication outside of the formal channels. At the time, all of these controls were put in place for what seemed like a good reason.
Story
As these processes developed over time, so did regulations around separation of duties. As a result, many organizations set up separate teams and put firewalls and controls in place to limit access; this also limited the flow between environments. For many companies this turned into having a set of rules for transfer of information, specifically locking everyone but production control and operations out of the production environments and, many times, out of the late-stage testing environments too. Developers only had access to the development and early testing environments. In order to transfer between the environments, processes were put in place to have the later stage pull from the lower stage, never allowing transfers up. Due to the extensive firewall setup, only resources within a particular level could communicate with each other. Given the difficulty of moving between environments, separate scripts and automation were generally created at each level in order to perform any task. The environments were separately maintained and managed. Over time, you can imagine, this led to very inconsistent environments, where new environment problems would be found as changes progressed through the testing environments.
An additional challenge was that all of the IBM Z was typically on the production side of the firewalls, even though it contained many different levels of environments. This was because of the shared nature of the z/OS and the concern of possible impact to production. Some organizations did not separate out the different environments, they all ran within a single logical partition. This made the environment easier to manage because everything was within a single system, and the workload manager could control what work had access to what resources. However, this meant that any changes required, even for development activities, took the same level of approval and control as a production change. Even in organizations where the development and test environments were split out into their own logical partitions, the change process was generally controlled as production.
One of the really good things about IBM Z and z/OS is the workload manager, which includes the ability to ensure the workload you want to run is prioritized above other work. This makes it very easy to dedicate what is required to production and, when production spikes, starve the development and test environments. This doesn’t mean they won’t run, just that they will run much slower. In some organizations this meant that, at the end of the month, developers would not even be able to log on to the system to do any work.
NOTE: There have been many options available to developers for years to make use of provisioned test environments spun up into cloud environments to address issues such as the one above, but not all organizations take advantage of such options. For organizations transforming, one key aspect has been that of providing that same dev test capacity available to the distributed teams to the z/OS development teams, as well as the automated acceptance test capabilities.
Silos
To address the issues of silos, separate teams, extra waste, and lack of ownership, organizations started using product teams. With the move to product teams, which are cross-functional teams that include test and development as well as operations and everyone else responsible for delivering business value, the idea of just adding manual testers to test the function does not work, not that it really worked before.
In product teams, tasks need to be performed alongside the development activities; as a result, organizations are recognizing they need development skills in order to build automated tests rather than focusing so much on manual testing. Now this does not mean you don’t need test skills. In fact, testers have very specific skills, such as an ability to find the problems that are hardest to discover, as well as being adept at performance, scalability, and security testing. Ensuring that you have the right skills to cover the tests required means that there are some specialized resources being used for those performance or scalability tests.
One topic worth discussing is the concept of the full stack developer; you find job postings asking for such a person, but full stack teams are what really need to be created, as discussed in the DevOps Forum paper “Full Stack Teams, Not Engineers.” The team needs to have the capabilities to deliver business value, but everyone on the team should not be expected to have all skills. Specialists continue to be required; though everyone should understand the process, specific resources may have more skills for front-end, back-end, specialized performance, infrastructure automation, etc.
One important aspect of product teams is understanding the relationship between the teams when delivering the full system for the end user. In large complex systems, including IBM Z, the number of parts that have to come together to provide the full end-to-end system is going to require a number of product teams building capabilities that will end up being used together. This relationship needs to be recognized and managed, without removing the autonomy of the individual teams, while recognizing the value they all contribute. The end-to-end test environment may be the first place all these changes come together, but it should be possible for each of these teams to have ways of spinning up the other capabilities, if required, so they can work independently before the end-to-end environment.
Types of Customers
I have worked with many clients through the years, and those client experiences provide both the positive and the negative: the opportunities for understanding best practices, and the opportunities for learning from failures, or not. In order to discuss these customer scenarios and provide valuable lessons learned, while protecting the clients’ privacy, I have come up with a few ‘customer types’ to represent either individual customers or groups of customers.
The purpose of having individual customers as well as an amalgam of customers is to help show how specific customers have changed over the years, to explain how they’ve evolved and what they’ve learned from earlier experiences. One thing I want to point out: I work with clients across the globe, and the stories are abstracted to make sure that you can’t identify what company they might represent; it could be anywhere in the world, any country, any environment. The goal is not to drill into a specific company’s progress, but to demonstrate how different organizations, in different ways and through the choices that they made, had successful and efficient transformations.
Finance customer: This customer is in the financial services sector. This customer represents a client I’ve worked with over the years, it’s a large financial services company that has many different applications in many different functions with the primary back-end systems, including the primary system of record, running in the z/OS environment. This large financial institution does have development teams spread across different areas and does build a number of different homegrown applications to provide financial services. They provide critical services that are required to be up at all times, with a set of peak hours during each traditional workday. This customer represents the client type that started before IT existed and then worked with IT to automate manual processes.
Insurance customer: This customer is a large insurance company that provides multiple types of insurance. Their back-end systems of record are on z/OS with many different front-end applications and mid-tier applications. This customer more closely resembles the second type of client described above, where IT is considered a critical driver for their business.
Bank customer: This customer is a large bank, this organization again has a large back-end system running on z/OS and is adopting full DevOps processes across the spectrum of the organization. This organization more closely resembles the third customer type above, creating IT systems to provide the value rather than moving from manual to automated, though they did start before IT was generally available. They also represent an organization of mergers.
Retail customer: This company represents a single, large retail organization that has a huge presence of z/OS as well as a significant distributed implementation.
Amalgam customer: This will be an amalgam of companies that will represent a variety of customers from across the financial services sectors.
Types of Application Architectures
The type of application does not in itself have a specific impact on the overall software quality. A monolithic application or a set of micro services can each provide the desired software quality. However, the type of application will affect the overall process, the ways of testing, the implications for deployment, as well as the ways of measuring.
When building micro services, each service can be built and tested independently, which is why many people have been building new applications using this architectural pattern. However, in order to deliver the end business function, many different micro services may be required. For system testing you still need to bring the set of micro services together to support the end-user capabilities.
Some organizations moved to a service-oriented architecture, having a service bus to provide routing of services. These services were generally larger than what would be considered a micro service. The services could each provide a part of, or a complete, business function. These individual services could be tested independently, but to test the entire solution, you need all the services to be present, including the service bus and supporting middleware.
Other organizations have focused on event-driven architectures, where events are published and consumers choose to consume specific events. As with prior architectures, parts can be tested individually, but to get the entire solution, all parts have to come together.
There are many different architectural patterns, with more being developed and some coming back under new names, but the one true fact with all of them is that no enterprise will end up with just one. Over time, different parts of the system will be designed and built using the latest techniques, while other parts of the system will continue to grow and be developed with other architectures.
Realistically, even if an entire system were written at one time, different architectural principles may be used for different parts due to the requirements for the system, including Quality-of-Service requirements. For example, a microservices pattern might be used for a number of aspects of a system, but if a portion has an SLA requiring subsecond response time, that function, even though it might logically decompose into microservices, would likely have to be written as a single service to avoid the network overhead of calls between multiple services.
Another example might be a loosely coupled system. It would be built to allow for independent testing of the system and to easily change and update parts; however, for production, the system could be hard-linked together for better performance.
Customer example: One organization I’ve worked with has developed an environment that allows each team to work on their parts individually. The framework has been developed to allow all parts to be built and tested as stand-alone components, with the framework providing the routing in the early test environments. However, once the application is ready for performance and scalability testing, the functions are compiled together to form more of a monolithic application for runtime. After the performance and scalability test, it gets deployed to production. This pattern allows development flexibility and provides the performance needed for production.
The other consideration when it comes to enterprises is that many have evolved over years, starting with back-end systems that provide the system of record, then over time adding more capabilities spread across distributed systems, and then building new interfaces. This has evolved into three-tier architectures: for example, a front-end component, possibly a mid-tier, and then a back-end system of record. Many different architectures could be used for these different parts, but they all end up needing to work together to deliver the end business value. In any type of application, you want to be able to test the parts independently, but you also need to test them together to ensure you are getting what you expect.
If you think about these independent parts, they are likely to each have their own pipeline, or at least their own instance of a pipeline, with their own build process and their own tests. In large enterprises it will be important to have pipelines for integrated testing as well as the pipelines that test the individual portions.
Let’s look at an example where the front-end portion is written in NodeJS, a mid-tier runs as a Liberty application, and a back-end system of record runs on z/OS in COBOL. In this example, the team responsible for the NodeJS application could have their own pipeline and their own build process, and can work independently through the use of defined APIs. There is a second team responsible for the Liberty application, with its own pipeline, build process and set of tests, and a third pipeline that supports the COBOL application. Hopefully all of these pipelines have some consistent characteristics, such as a common source code manager, a common CI/CD coordinator and a common artifact repository.
The picture below (Figure 7) represents these three simplified parts of the application.
Figure 7
Assuming all of the parts feed a common, defined artifact repository, each of these pipelines can instantiate the different parts of the application they need to test with. For example, the Liberty application pipeline can spin up the front-end portion based on output in the artifact repository, and can also spin up the back-end portion of the application for its own integration testing. Giving each pipeline the ability to spin up the appropriate parts to test with allows the teams to work independently while ensuring that the parts will come together appropriately. There is, however, likely an integration, pre-production, or acceptance system in which the parts will need to be deployed together to allow for final verification. This may be its own pipeline, bringing together the different parts for a release.
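As a hedged sketch of what “spinning up the appropriate parts” might look like in pipeline code, the example below pulls two other components from an artifact repository, starts them, runs integration tests against them, and tears them down. The repository URL, artifact names, and start script are hypothetical placeholders, not any specific product’s API.

```python
import subprocess
import urllib.request
from pathlib import Path

ARTIFACT_REPO = "https://artifacts.example.com"  # hypothetical repository URL

def fetch_artifact(name: str, version: str) -> Path:
    """Download one artifact version from the shared repository."""
    url = f"{ARTIFACT_REPO}/{name}/{version}/{name}.tar.gz"
    target = Path(f"{name}-{version}.tar.gz")
    urllib.request.urlretrieve(url, target)
    return target

def spin_up(archive: Path) -> subprocess.Popen:
    """Unpack a component into its own directory and start it."""
    workdir = Path(archive.name.replace(".tar.gz", ""))
    workdir.mkdir(exist_ok=True)
    subprocess.run(["tar", "-xzf", str(archive), "-C", str(workdir)], check=True)
    return subprocess.Popen(["./start.sh"], cwd=workdir)  # hypothetical start script

# The mid-tier pipeline pulls the front end and a back-end stub it needs,
# runs its integration tests against them, then shuts them down again.
frontend = spin_up(fetch_artifact("frontend", "1.4.2"))
backend = spin_up(fetch_artifact("backend-stub", "2.0.1"))
try:
    subprocess.run(["pytest", "integration_tests/"], check=True)
finally:
    frontend.terminate()
    backend.terminate()
```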
Another important point: many times there are hundreds of applications that actually make up a system. It’s not just a simple 3-tier architecture, but a highly complex set of relationships between systems, and the full impact of the various changes needs to be considered.
These applications may be online systems, as many of the above application models describe, or they may be batch applications. Batch applications are also common in systems of record. These may be used for processes such as end-of-day calculations; this function happens at a particular time and is not driven by an end-user interaction. Other types of batch applications include processing data files sent from one organization, or one part of a business, to another, such as a submission for a set of enrollees in a new healthcare plan.
Later, we’re going to discuss how environments can be created and how shared environments can be used. This process of allowing pipelines to spin up environments is common in the container world and somewhat common even in the distributed world, but today it’s rare in z/OS environments. This transition to spinning up z/OS environments is one of the key factors that allows for the move to a digital organization.
Summary
Quality is ensuring the software meets business expectations. This involves both ensuring the software meets all stated functional and non-functional requirements (security, performance, availability) of the new capability without breaking existing capabilities, and ensuring it delivers the expected business value. The only real way to know it delivers the business value is feedback from the customer, so we want to improve the quality and speed of this feedback. There are two major areas to address for ensuring software meets the requirements. First, all repetitive tasks are automated to ensure consistency and to enable small batch sizes. Second, for the creative work of software development, the writing of code, the only real way to ensure it is meeting all the requirements is to provide feedback. The key is providing the best possible feedback as quickly as possible. The feedback needs to include direct feedback on the code, but also feedback on how the code interacts with the existing systems.
Metrics play a part in driving the behaviors needed for a quality software process; however, it is important to use the right combination of metrics to drive the behavior you want. This may require changing metrics over time to drive the continuous improvement process. Testing, and the various types of testing, play a role in this critical feedback cycle.
This section also provides a summary of key capabilities of IBM Z and z/OS in particular, as well as an introduction to the traditional development practices commonly found for z/OS development to help set the context for the stories provided throughout the book.
Section 2: Essential Components
In this section we’ll discuss the essential components outside of the specific types of testing, the areas often left out when thinking of software quality as related to test only. We’ll start with the value of the pipeline, including the automation within the pipeline. The automation through the pipeline brings one level of stability and consistency to the process. It also removes manual errors that lead to quality issues. Quality is about that fast feedback, and pipeline automation facilitates the ability to get that fast feedback at each stage.
The environments used as part of the development lifecycle are a key contributor to the effort required, the time spent, and the effectiveness of the testing performed. We’ll cover the different types of environments possible, and the effective use of environments, for development and test.
One key aspect related to environments is the data available for those environments. The subject of test data management could be an entire book on its own, but for this discussion we’ll focus on it as it relates to development and test.
The final parts of the section focus on key aspects of the overall process of testing.
Pipeline
There is no way around the fact that software quality is driven by people and process. The culture of the organization itself defines the focus on quality and the importance thereof. To achieve this, automation via a pipeline is needed. The pipeline itself has a set of key characteristics that help determine the success, and the maintainability, of the overall solution.
The pipeline and its associated capability should be considered a key internal product on its own. This is the core foundation for the automation of the processes for all development activities, and it should be supported, architected, maintained and managed as any other product supporting the teams.
The pipeline is a compilation of capabilities put together to provide the foundation for automation for the software lifecycle. The pipeline generally includes:
Source Code Manager (SCM).
Continuous integration coordinator.
Continuous deployment coordinator.
Artifact repository.
Test.
Release Manager.
Build processes.
Security scanning.
Software rules scanning.
Work management.
Monitoring.
Provisioning.
Deployment.
Application understanding/analytics.
Measurements collection/correlation.
The following picture (Figure 8) represents the major portions of the pipeline with their capabilities associated with them.
Figure 8
Key insights: Pipeline
The SCM is critical as the first control and audit point of the process. All source artifacts should be managed in the SCM in an environment neutral way. Source artifacts include not only the source code itself, but all related artifacts such as the “testware” (the artifacts related to testing).
The build process is what converts the source into a runnable form, still environment neutral, to store in the artifact repository. Some artifacts may not be changed as part of the build process, but should still be validated against any applicable rules. The build process should also run any available unit tests for the source code changed or the components built.
The deploy process is what takes the build output and transforms it in any way required to deploy it into the environment itself. The deploy process not only deploys the runnable portion of the application but should also do any required configuration to the runtime the application is being deployed into.
All automation should come from the SCM; that includes build automation and infrastructure as code.
Any artifacts that will possibly be deployed to production should come from the artifact repository. Items that deploy as source should still go through the artifact repository if they are to make it to production, so you have one place to restore or create a new production environment.
Traceability between the SCM and the artifact repository must be provided, allowing understanding from both directions: from SCM to artifact as well as from artifact back to SCM source.
Traceability from the artifact repository to the deployed-to environments must be provided to allow understanding in both directions, from the artifact repository to the environments, to know which artifacts are deployed. Traceability must support the ability to determine what source code represents the artifact that has been deployed in any environment.
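A minimal sketch of this two-way traceability: record the commit each artifact was built from and the artifact each environment is running, so questions in either direction can be answered. The in-memory dicts here merely stand in for whatever the real pipeline records.

```python
# Artifact id -> commit SHA it was built from; environment -> artifact id.
artifacts = {}
deployments = {}

def record_build(artifact_id: str, commit_sha: str) -> None:
    artifacts[artifact_id] = commit_sha

def record_deploy(environment: str, artifact_id: str) -> None:
    deployments[environment] = artifact_id

def source_for_environment(environment: str) -> str:
    """From a running environment back to the exact source commit."""
    return artifacts[deployments[environment]]

record_build("accounts-service-1.7.0", "9f3c2ab")   # done by the build stage
record_deploy("acceptance", "accounts-service-1.7.0")  # done by the deploy stage
print(source_for_environment("acceptance"))  # -> 9f3c2ab
```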
Monitoring must use the same software throughout the process. If a tool is to be used in production, it should be used in earlier-stage environments as well, with the monitoring thresholds set and configured through automation. The one exception could be the unit test environment, where the middleware may not be running and therefore may not be monitorable.
The pipeline automation should be used to integrate the various capabilities together, run the actions in the appropriate order, capture metrics of the process, flow and quality.
The pipeline should have the full automation to deploy all the way through to production; however, there may be steps in the process that require manual complex testing or other manual verification, such as acceptance testing. When the actual deployment to production, or to a later stage of the process, takes place, the pipeline should resume its automation.
The pipeline is not just for application development software, but the same practices apply to infrastructure related changes.
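To make the coordination role concrete, here is a minimal sketch of pipeline automation that runs stages in order, stops on the first failure so feedback reaches the developer quickly, and captures timing and pass/fail metrics for each stage. The stage names and shell commands are illustrative placeholders, not any particular CI/CD product’s API.

```python
import subprocess
import time

STAGES = [
    ("build", ["./build.sh"]),
    ("static-analysis", ["./scan.sh"]),
    ("unit-test", ["./run_unit_tests.sh"]),
    ("deploy-test-env", ["./deploy.sh", "test"]),
    ("integration-test", ["./run_integration_tests.sh"]),
]

def run_pipeline() -> dict:
    """Execute each stage in order, stopping on failure, collecting metrics."""
    metrics = {}
    for name, command in STAGES:
        start = time.monotonic()
        result = subprocess.run(command)
        metrics[name] = {
            "seconds": round(time.monotonic() - start, 1),
            "passed": result.returncode == 0,
        }
        if result.returncode != 0:
            break  # fail fast so feedback is quick
    return metrics
```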
The pipeline, however, is not a simple flow from one end to the other. The goal of the pipeline is to facilitate fast feedback on code that is being developed. The following diagram provides a better picture of the actual flow within a pipeline.
The following picture (Figure 9) includes the feedback loops that should be provided as part of the pipeline. In order to get the speed and agility, as well as improved quality, the goal needs to be improving the speed at which code flows through the circles to provide feedback. Notice in particular the circle from code to build, and then again from provision+deploy to test. Assuming tests pass, the second loop can continue to move the code through environments to prepare it for release into production.
Figure 9
The diagram shows there are inner circles within the circle of the pipeline; it’s a continuous improvement process, with each step providing opportunities to improve the quality of the deliverable. One important note: in this diagram everything is code. That includes the source code we traditionally think of, but also the tests themselves, the automation for the infrastructure, the configuration for the system, and the pipeline automation itself. All are considered code, and all flow through the same process with feedback loops.
Integrated Development Environment (IDE) Code
The first loop is actually within the code box itself. With the availability of modern Integrated Development Environments (IDEs), developers get quick feedback on their source code and can run static code analysis to surface security vulnerabilities or code rule recommendations early. Organizational best practices can also be provided within the IDE even before any build or test is performed.
This first step is actually the first cultural implication. When I started development more than 30 years ago, the compiler was the first check available for the language. Many developers still don’t recognize the value of having a modern IDE to help with development. For some, this is because when IDEs started they did not have the capabilities of today; for others, it’s simply because they are so used to editing the way they always have. This is not just a problem in the legacy development space; I have known many people who just love vi as their editor. While understanding vi can be invaluable under the right circumstances, and it can be very efficient for quick changes to files, it does not provide the language understanding or many other features available in modern IDEs.
The importance of this first step in code quality scanning, code rules and security scanning cannot be overstated. These checks can be delayed until the build process, but that’s extra time taken before the feedback. Using a modern IDE is the important aspect; which IDE should be used can, to some extent, be left to the individual. New IDEs keep evolving. Eclipse has been the dominant IDE when it comes to Java development, but VSCode is growing in popularity among developers as well as system administrators and system programmers.
The latest trend is providing the IDE in a browser, backed by a container in a cloud. The cloud could be internal, private, or public, whatever is appropriate, but having the development environment hosted this way can bring additional security by not having source code leave the control of the administered environments.
IDEs will continue to evolve over time, providing additional support to the developer, as well as assisting in enforcing security or organizational practices. The quicker a problem is identified, the more effective the development process can be. Providing developers access to IDEs with the capabilities required for the language they are developing in can be the first stage in improving quality.
Source Code Manager (SCM)
The second area is that of the SCM. The SCM is central for audit and management, so selecting the right source code manager is an important step. It should:
Provide versioning of the artifacts stored in it.
Provide ability for multiple versions to be edited at the same time.
Provide for merging of the changes from multiple people.
Support all platforms required for build.
Support various file types and code pages.
As of the writing of this book, Git has the largest share of the market and satisfies the requirements for an SCM. Git also has the advantage of being very well known; because it is so easily consumed, it’s used by students at all levels to manage work in school. Git has been ported to z/OS so that it can be used for all types of source code.
Build
The build process controls the creation of the runnable output, as well as providing the first level of scanning for compliance and the first set of tests of the code. The build process needs to be defined such that it is repeatable and used any time an artifact is built, whether by the individual developer or as part of the pipeline. Many tools have been developed over the years to assist with parts or all of these steps, such as Ant, Maven, Gradle, and Make, to name a few. These are generally defined for the application or project such that the individual can do an early build before ever submitting to the pipeline to be built. The goal is to allow the right set of steps at each stage, as fast as required for that stage. For example, when the developer builds their source, the process may not run the static code scans, as that feedback is already available within the IDE. Yet this same process, when run as part of the pipeline, will perform the static analysis scans and store the results.
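As an illustration of one build definition serving both uses, here is a minimal sketch in which the same build runs everywhere, but the static scans are skipped for the fast local developer build (the IDE already provides that feedback) and always run in the pipeline. The compile, test, and scan scripts are hypothetical placeholders.

```python
import subprocess
import sys

def build(stage: str) -> None:
    """One repeatable build definition, parameterized only by stage."""
    subprocess.run(["./compile.sh"], check=True)         # hypothetical compile step
    subprocess.run(["./run_unit_tests.sh"], check=True)  # hypothetical unit tests
    if stage == "pipeline":
        # Full scans run only in the pipeline, where results are stored.
        subprocess.run(["./static_scan.sh", "--store-results"], check=True)

if __name__ == "__main__":
    build(sys.argv[1] if len(sys.argv) > 1 else "local")
```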
In the earlier section discussing continuous integration, one principle is ‘keep the build fast.’ As developers make changes, they need the ability to build their code often for early testing. This very fast, repeatable process is critical to their efficiency when doing development that can be built locally on the machine or in a local container (when using a cloud IDE). This is often a feature of the IDE itself, using the build definitions provided.
NOTE: For z/OS development, this fast build is also required, but the compile must take place on z/OS itself. To facilitate this, functions such as user build have been created to allow the same quick build process from the IDE, using the same build function as that used in the pipeline.
One part of the build, the static analysis scanning for code rules, security rules, local standards, etc. can be a key contributor to the overall software quality.
Scanning early in the process for coding standards, known security issues, and possible performance issues gives you the ability to address the items while they are being coded, which is more effective. Static code scanning can also generate a set of complexity metrics. These complexity metrics, such as Halstead complexity and the maintainability index, have defined standards for how they are generated, but the values don’t necessarily indicate code quality by themselves. Some code is going to be inherently more complex, so these metrics are best used for relative comparisons and may indicate the need for more careful code reviews.
Key insights: Static Analysis
Static analysis rules can identify defects before you run a program. Define which rules matter to the organization, and avoid rules that generate too many false positives. Rules that frequently flag problems that essentially don’t matter, or that are simply wrong, cause delays and waste time.
“Once the appropriate set of static analysis rules is selected, defects found via the scanning need to be addressed right away, not put on the backlog; address them as the code is written so that the scanning stays clean.” This addresses issues as they are written and helps maintain code quality.
For large existing codebases, it is important to understand that coding patterns have changed over time. Old code may have many identified errors or warnings, but if the code has been working for years, unless it’s a security-related issue, it is not worth the effort to address the findings unless the code is changing for other reasons anyway. The goal should be to add no new problems.
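One hedged way to implement this “add no new problems” goal is a baseline comparison: record the findings that exist in the code today, and fail the build only when a change introduces a finding that is not in the baseline. The scanner command and JSON finding format below are assumptions for illustration.

```python
import json
import subprocess

def current_findings() -> set:
    """Run a scanner that emits JSON findings; the command is hypothetical."""
    out = subprocess.run(["./static_scan.sh", "--json"],
                         capture_output=True, text=True, check=True)
    return {(f["file"], f["rule"]) for f in json.loads(out.stdout)}

def check_against_baseline(baseline_path: str = "scan_baseline.json") -> None:
    """Fail only on findings that are not already in the recorded baseline."""
    with open(baseline_path) as fh:
        baseline = {tuple(item) for item in json.load(fh)}
    new = current_findings() - baseline
    if new:
        raise SystemExit(f"{len(new)} new static-analysis findings: {sorted(new)}")

check_against_baseline()
```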
Unit tests, also part of the build process, should be run in the pipeline. Unit testing will be discussed in detail in the next section.
The build process should fail on any significant issue, but assuming the code successfully compiles so it can be run, all the static analysis and unit tests should be run, so that a full picture of the issues is provided at once rather than one after another.
The build process creates the appropriate runnable units. For many applications this is the entire application, such as the jar, war, or exe. Other times this could be part of the overall application such as a dll or a load module.
NOTE: For traditional z/OS applications in COBOL, PL/I, etc., it will likely be a set of load modules that make up part of the application. It is very rare that a change would cause the entire application to be rebuilt.
Artifact Repository
After successful completion of the build process, the artifacts from the build are stored in an artifact repository; from here they can be deployed. The build process may also produce artifacts that will be used in other builds, and those artifacts should also be stored in the artifact repository.
The artifact repository needs to be able to store the build output and make it accessible for later builds and for deployment. This should be the one place all artifacts are pulled from for deployment. The artifact repository should also be able to distinguish artifacts built only for testing from artifacts that can be deployed into production. One example is a build that includes debug options or non-optimized code.
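A minimal sketch of that distinction: the build marks each artifact as production-eligible or test-only, and the deploy step enforces the flag. The artifact metadata shape here is hypothetical.

```python
def deploy(artifact: dict, environment: str) -> None:
    """Refuse production deployment of anything not marked production-eligible."""
    if environment == "production" and not artifact.get("production_eligible"):
        raise ValueError(f"{artifact['id']} was built for test only")
    print(f"deploying {artifact['id']} to {environment}")

debug_build = {"id": "claims-1.2.0+debug", "production_eligible": False}
release_build = {"id": "claims-1.2.0", "production_eligible": True}

deploy(debug_build, "integration")   # fine: test-only builds stay in test
deploy(release_build, "production")  # fine: built for production
deploy(debug_build, "production")    # raises ValueError
```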
Provision, Deploy and Test
The next two boxes in the pipeline, provision and deploy, are where you ensure the application is in a state where whatever the next stage of test is, it can run. The loop from provision and deploy through test indicates that multiple types of tests should be run after each successive successful run.
After successful testing, and the appropriate approvals, the release process installs the application or application updates into production based on the release rules. This may be at a specified time during the day, maybe on a weekend, or whenever is appropriate for the change being released. Release processes will generally roll changes in as required by the change. For example, a database schema change that is not a breaking change could be deployed earlier and staged so that it’s ready when the other changes are deployed. A system configuration change, such as a new queue, could be predefined. When the code change is made, it may be rolled out to a subset of the environments first, to ensure no production issues, before continuing. The release process should have a set of automated verifications built in, which would have been used in all prior deployments as well, to ensure a complete and successful deployment.
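As one hedged illustration of such built-in verification, the sketch below runs the same smoke checks after every deployment, so by the time they gate a production release they have already proven themselves in every earlier environment. The endpoints and expected results are placeholders.

```python
import urllib.request

SMOKE_CHECKS = [
    ("health endpoint", "https://app.example.com/health", 200),
    ("version endpoint", "https://app.example.com/version", 200),
]

def verify_deployment() -> None:
    """Run every smoke check; report all failures at once."""
    failures = []
    for name, url, expected_status in SMOKE_CHECKS:
        try:
            status = urllib.request.urlopen(url, timeout=10).status
        except Exception as exc:
            failures.append(f"{name}: {exc}")
            continue
        if status != expected_status:
            failures.append(f"{name}: got HTTP {status}")
    if failures:
        raise SystemExit("deployment verification failed: " + "; ".join(failures))

verify_deployment()
```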
One of the largest contributors to quality problems, and to long release weekends, is the release process itself. Problems such as a missing part, or a database table not loaded properly with updated reference data, can easily be avoided with fully automated deployment processes that are used throughout the pipeline.
Story: Deployment errors
One organization had a very high-quality system that had very limited problems following a release weekend, when the system was turned over as live. This was achieved with Herculean efforts from highly experienced resources, subject matter experts (SMEs), within the organization. Each release weekend they would have a long spreadsheet with all the steps required by each team, keep a bridge call open, and walk through each step together. As problems occurred (as they always did, since each step was performed by individuals using automation that was being modified for production), the subject matter experts would jump in to identify the issue and get it resolved before the change window closed. One key issue was that they had a release twice a month, which meant their SMEs were spending half of their weekends working excessive hours, and these individuals were getting older and closer to retirement.
This process was not sustainable as the key SMEs had limited work-life balance and as they were approaching retirement would not be available forever. The process had to change before the SMEs retired in order to maintain the high level of quality in production that had been achieved.
Another organization I worked with did have a process for using automation between environments; however, in order to get a change made, a ticket would be created for the change in the lower test environments. The team responsible for the lower environments would work together to make the changes, understand what was required, and move through the testing process. However, once the change was ready to move to final testing and production, a separate ticket had to be created for a new team that would be doing the deployment in the next areas. These two separate infrastructure teams shared nothing in common; different processes and automation were created for the late testing and production environments, and all the lessons learned in getting the updates made had to be relearned when moving into late-stage testing, where additional problems were always discovered on the way to production. All this extra effort added at least a month to the process of moving changes into production.
This separation of teams was caused by separation-of-duties requirements and the strict controls around late-stage changes. In order to get more flexibility, development teams created new environments with fewer controls to allow for more changes, but by creating new environments and teams instead of moving to more automation, the process overhead increased and overall delivery time increased.
A financial institution made a simple change in production, a change to the JCL, but with a comment line incorrectly placed. This caused such a major problem that the bank could not balance accounts for a number of days, though they had to stay open. This was a change done by an individual in production, a small low-risk change, but it was made manually. Processes were changed following this incident to add more control around changes, but the one real fix, never making manual changes in production, did not happen for years. Issues continued to happen, though none as significant. At the core of the problem was making a change and running it for the first time in production.
The above examples could be addressed through the use of automation coded in the SCM and used throughout the process -- standard automation, testing for each deployment, with no manual changes being made.
Key insight: Manual Changes
Do not make changes in production manually, and do not make changes that have not been successfully made in earlier environments. This may seem like an obvious statement, but based on the number of times this happens within organizations, it’s a point worth making multiple times.
Monitoring
Monitoring of the system should include operational metrics, as well as application performance metrics, from all aspects of the application. Proper monitoring helps improve software quality by observing anomalies early, before actual issues occur. Monitoring means not only understanding how things are performing, but also monitoring the monitoring system itself. If an important resource is not providing data, it’s important to understand whether the resource itself or the monitoring solution is having the problem, or whether in fact it’s the network. Having multiple ways of measuring the system helps identify future problems. The growing focus on Artificial Intelligence for IT operations (AIOps) brings additional insights that can automate the identification and resolution of some issues.
Understanding the normal range helps identify what is not normal. This idea of monitoring for values outside normal ranges, and/or monitoring for slowly changing flows, helps identify problems before they become problems. For example, application response time, and the rate at which the application responds to requests, generally follow a standard flow pattern; if, while monitoring, this standard pattern is not observed, a message could be issued to indicate a possible problem to be addressed.
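A minimal sketch of this idea: watch response times against a rolling baseline and flag values well outside the recent normal range. The window size and three-sigma threshold are illustrative assumptions; real tooling would set such thresholds through automation.

```python
from collections import deque
from statistics import mean, stdev

class ResponseTimeMonitor:
    def __init__(self, window: int = 100):
        self.samples = deque(maxlen=window)  # rolling window of recent values

    def observe(self, response_ms: float) -> None:
        if len(self.samples) >= 30:  # need enough history for a baseline
            mu, sigma = mean(self.samples), stdev(self.samples)
            if sigma > 0 and abs(response_ms - mu) > 3 * sigma:
                print(f"possible problem: {response_ms} ms vs normal ~{mu:.0f} ms")
        self.samples.append(response_ms)

monitor = ResponseTimeMonitor()
for value in [120, 118, 125] * 20 + [900]:  # a sudden spike gets flagged
    monitor.observe(value)
```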
Planning
Planning is a key contributor to overall software quality. Managing the work in progress, addressing technical debt, and planning the function as well as the tests should all be part of the planning process.
The planning process itself could cover an entire book, and there are many books published about making work visible and ensuring the right work and the right process limits, such as Making Work Visible by Dominica DeGrandis, among others mentioned throughout this book.
Analyze
With regard to the analyze portion of the pipeline, analytics should be used throughout the process to help identify the work to be done, the dependencies and relationships, and to assist in identifying the tests to be run and the order of the tests.
For large complex systems, understanding the application can be a challenge, specifically for systems that were designed and built years ago. Having a clear understanding of all parts of the system can be difficult. This is where tools are used to provide the picture of the application or set of applications. IDEs themselves generally provide some level of flow within an application, such as program control flow or data flow, but flow across the entire application is also required. Tools such as Application Discovery and Delivery Intelligence (ADDI) allow individuals to see the relationships and interdependencies and to understand the impacts of change.
Provisioning vs. Deployment
In the pipeline picture above, provisioning and deployment are situated next to each other as separate functions. Understanding the difference and the separation helps as individual teams are combined into product teams.
Provisioning refers to the creation of the underlying supporting infrastructure, including the operating system and middleware.
Deployment refers to the application-specific configuration of the middleware as well as the deployment of the application artifacts themselves. Deployment can also refer to deploying test artifacts, data, any resource on top of the infrastructure layer, or any configuration to the infrastructure specific to the application.
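To make the boundary concrete, here is a minimal sketch with provisioning and deployment as separate, independently runnable step lists; every command is a placeholder, and who owns which list mirrors the split described above.

```python
import subprocess

PROVISION_STEPS = [            # typically owned by the infrastructure side
    ["./create_vm.sh"],
    ["./install_middleware.sh"],
]
DEPLOY_STEPS = [               # owned by the application team
    ["./configure_middleware_for_app.sh"],
    ["./install_app_artifacts.sh"],
    ["./load_app_config.sh"],
]

def run(steps):
    for command in steps:
        subprocess.run(command, check=True)

run(PROVISION_STEPS)  # may be skipped when deploying into an existing environment
run(DEPLOY_STEPS)     # always flows with the application pipeline
```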
The following diagram (Figure 10) shows the line between provisioning and deployment:
Figure 10
The picture represents a z/OS environment, but could as easily represent any other environment. In cloud environments, what’s below the red line is generally what is provided by the cloud provider in Infrastructure-as-a-Service (IaaS), and the top portion represents the customizations required on top of the provided image to run the application. For internal data centers, the line generally means the artifacts below are controlled by infrastructure teams, and the items above are controlled by the application teams.
In order to fully automate the process for provisioning and deployment, you must first understand all the parts. Knowing which teams currently control which parts also helps determine who can provide the configuration information and who will likely provide the appropriate automation.
Key insights: Application Infrastructure
There is a separation between infrastructure changes that are related to application changes and infrastructure that is not specifically related to any particular application. These changes should be done in a consistent way; however, the application-related changes must be part of the application pipeline. This is not to say they have to deploy together. Some changes can be pre-staged into an environment; however, application-related changes should flow with the application such that they are controlled in a consistent way and can be managed together.
The application-related infrastructure changes should be deployable into provisioned environments or existing environments in a consistent flow.
Environments for Testing
There are different types of environments used for testing: those for testing performed before the code is deployed into an environment, and those for testing once it’s deployed and running. Here we’ll discuss the environments the application will be deployed into, which can be isolated, shared, or subset environments created with virtual services, or full environments, and just about any combination in between.
Both isolated and shared environments can represent a full environment or a subset of the environment, where virtual services or stubs are used to allow testing part of an application without other interconnected applications.
Isolated and Shared Environments
Isolated test environments are those that are used for a single user, for a specific set of automation, or for a specific type of test, where the entire environment is used only for that purpose. A developer’s laptop is an example of an isolated test environment. Other examples are environments provisioned specifically for a user, or for a specific performance test run, with no other testing going on.
The next picture (Figure 11) shows an isolated environment where the developer has access to their own system, no one else will be interacting with the system.
Figure 11
Isolated environments simplify the testing process because no other unrelated testing is going on in the environment. Generally, isolated environments are spun up or created when needed and are populated with the required application components and test data for whatever test they are being set up for. Isolated test environments make it easier to run automated testing, as the data can be set up for each test to allow for automated verification; only truly isolated environments make reliably repeatable automated testing possible. It’s possible for an always-running environment to be an isolated test environment if its sole purpose is, say, performance testing. However, many times these environments get borrowed to perform other tests, which gives them the qualities of a shared environment.
Shared test environments are those that are used by multiple people, or multiple sets of automation, testing different areas or functions at the same time or in sequence without resetting the environment. These environments are generally maintained over time, always running, and are deployed into by many different users testing different types of changes all at the same time.
The following picture (Figure 12) shows a simplified version of a shared test environment. It shows multiple people and automation running against the same application and the same data, with multiple changes from each of the individuals working in the system. The unintended side effects of the multiple changes in the environment make it harder to understand each individual change.
Figure 12
If we look again at the picture of the complex promotion hierarchy (Figure 13), shown below, we see this development environment would be shared by all developers working on a particular application or in many cases a particular designated release.
Figure 13
Shared environments require scheduling time in the environment, or scheduling when changes can be moved into the environment. This scheduling requirement brings waste into the system, because even when something is ready it may have to wait. A user acceptance test environment is one example where the shared nature can be a positive: when functions are ready, they move into the acceptance area for user acceptance testing, running alongside all the other functions that will also be running in the production environment. In user acceptance test, it would be expected that different users would be testing different functions, as they would in a production environment.
Partially shared environments include environments where the developer or automation is testing in its own component, but that component is connected to other resources that are shared with other testers or automated testing at the same time, such as a back-end system of record, or a shared database.
The other environment that is obviously shared is the pre-production environment. This could be the same as the acceptance test environment, or another environment, maintained specifically as a fully running copy of production before rolling into production. The pre-production environment should be similar to the production environment in setup. However, if it is not used for performance or scalability testing, it does not need to be the full size of the production environment. It does need to include the same structures, such as clusters, or a sysplex if that is what is in production.
Examples of shared environments:
Let’s say your development environment is a shared environment for all the developers working on a particular release. The problem comes when you have multiple people working on multiple different releases at the same time. With shared environments, you either have multiple development-level environments shared by the different teams working on the different releases, or you have people competing in the same environment while working on different releases. This adds challenges for developers, as they won’t have a standard, stable environment to work in. If they are working on a function that requires a database change that won’t be in the next release, but in a future one, they can’t really make that change without breaking everything else. In addition, if the data format is changing, or new fields or new capabilities are being added, that data change could break the development of other teams. This concept of a shared development environment is very common in the z/OS world, but very unusual in other environments.
Generally, if you’re working on a Java application, you can work on your own laptop, which gives you isolation so that you can do the initial development without anyone else interfering with you, or you interfering with anyone else. Traditionally, in the z/OS environment we haven’t had this flexibility; we use those shared environments due to the cost and challenge of building up multiple development-level environments. (Though, as will be discussed later, it is possible to provide this same level of isolation for the z/OS developer with Z Development and Test Environment (ZD&T).)
Some organizations may have created a number of development environments, but it’s usually not one per developer, or even one per few developers; it’s usually related to the release. One of the key problems with shared environments is the data associated with them. The shared environment ultimately ends up with a mess of data that makes it hard for anyone to do any serious amount of testing, because you never know what someone else has done to the data and you don’t have a realistic expectation of what is in the environment.
Shared environments are also used for later-stage testing. For areas such as performance or scalability, where we don’t expect to have separate or multiple environments, we need to accept these shared environments and use them to perform these specialized tests. In this case, the shared environment introduces the problem of scheduling: any time you have to schedule or wait for something, the work gets delayed.
The other thing to understand about shared environments is that they need to be interconnected with the rest of the application environment. The picture below (Figure 14) shows a possible setup, where at each level other parts of the application are connected. This might be an ideal environment in some ways, in that at each level it is possible to have the full application. However, the development environment outside of z/OS is generally connected to the test level, so there is no way to connect the parts at the development level. It also means that the parts of the application outside of the z/OS environment have to wait until the right changes are moved forward.
Figure 14
Story
One client I worked with had a single development environment for all of the back-end z/OS applications. This environment was large enough to allow all of the developers to do their work simultaneously; however, no one ever knew what the state of the data might be. It also meant that if any database changes needed to be made, they could not be made until it was time for that specific release to be in the development environment. To facilitate multiple releases at the same time, the system would be reset based on a schedule, so developers had to work within the defined schedule, and if they were working on something for a release that was not currently in the development environment, they just had to wait.
This type of highly shared environment delayed all the work: if you weren’t quite ready with your code change, you had to wait until the next time the environment was set for your release. This caused many people to do additional task switching, moving between the functions they were building for different releases. The other problem this caused was the staging of source code. If, as a developer, you were working on code for multiple releases, you had to stash it in private libraries to make sure you didn’t lose your changes, but you also had to keep merging in other changes to avoid a big merge at the end. This would many times cause regressions, because a developer would miss specific changes going forward. Because the developers were working in their own Partitioned Data Set (PDS) libraries, they didn’t have full versioning and source code management to help them.
Considering the traditional library managers and the way they mapped to test environments, the promotion hierarchy provided the specification for what test environments would be available and what levels code would have to go through. This generally limited the flexibility in test environments and ensured they were shared environments, at least for a development team, if not for the full application or even for multiple applications within one environment.
Another client I worked with had many different applications, with many different teams doing development, but only a single development-level environment. This single development-level environment was used by all the teams, doing any development effort, for all of the applications. This meant developers spent lots of time trying to coordinate with each other to make sure they could actually get their development done. If they needed to change system-type resources, like adding an extra field to the database, their ability to do that development and testing was highly limited.
They then moved on to a single test environment, used for integration test and used by the quality assurance team to do their testing. This testing was mostly manual, though there was some level of automated regression testing. The system test environment would be used by the QA team for a 3-month test for any release. Following this test, the code would move into the pre-production/acceptance test environment, where it would again sit for at least a month to allow acceptance and stability testing.
This stability testing was really just running the batch cycle every day for a month to make sure it did not uncover major problems. The results were not really verified: it was tested by running the batch and making sure it ran, not by making sure the data was correct. This is what some called the shakedown period.
One of the important factors of these environments was that they were integrated environments; for this client, the development level consisted of all of the distributed as well as mid-tier and back-end systems. If you were doing front-end development, you would be stuck connecting to the development level of the back-end systems, which were also changing at the same time. This made it much harder to do frequent updates of the front-end system against a changing back-end.
This client is an example of a highly shared environment. Most customers do have more environments at each level, broken down by application or application area. This means the applications within an area are tested together, making it hard to do different or unrelated changes, but at least multiple application development changes are not going into the same environment.
The challenge comes when multiple applications are making coordinated changes. These changes won’t come together until later stages in the test environment, and that makes it harder for developers to get a complete test. The code moves forward to different environments based on the schedule, not based on when the code is ready. In this case, where there are multiple development environments, these low-level development environments were usually not actually connected to anything else. So, for example, the front-end development environment had no mid-tier or back-end to connect to, and the back-end development environment was not connected to the front-end that would normally be driving it.
Typically, connections weren’t made until at least the test level, and in some cases not until the third level of testing. This made it much harder to do integrated tests and to verify that the different parts actually could work together. Since moving between environments is determined by a timeline, not by when code is ready, developers could be waiting a while before actually knowing whether their code worked across the different systems.
Key Insights: Isolated test environments
Not having the right isolated environments available to development limits the innovation, as well as the velocity, of development itself. Providing isolated environments for early development activities, as well as for running automated testing, helps increase quality and velocity and improves developer satisfaction.
Providing stubs to allow applications to be worked on independently helps increase flexibility early in development, but it is also important to support spinning up the other parts of the application, so any team can test with the remaining parts as early as they need to.
Provisioning Environments
One of the drivers to public or private cloud environments is the ability to quickly spin up capability and, just as easily, tear it down. This capability is very valuable for production workloads that need to quickly add capacity. It is also important to have this ability to spin test environments up and down. As adoption of the cloud has grown, more organizations are using this capability to spin up entire sets of environments for testing and then remove them.
With pipelines, this process is being included as part of the overall automation. The ability to spin up resources in the cloud is only the first step; the application configuration, and the application itself, also need to be easily spun up and down. With the move to infrastructure as code, configuring the application on the cloud resource becomes possible, along with the application deployment. Many existing applications are still running on virtual machines, or full system images, but some have moved to containers, which makes it even easier to spin the resource up and down, as you pull the entire image from an image repository.
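A minimal sketch of a spin-up/tear-down test environment as a pipeline building block: wrapping it in a context manager guarantees teardown even when the tests fail. The provisioning scripts and environment name are hypothetical.

```python
import subprocess
from contextlib import contextmanager

@contextmanager
def ephemeral_environment(name: str):
    """Provision an environment, yield it for testing, always tear it down."""
    subprocess.run(["./provision_env.sh", name], check=True)
    try:
        yield name
    finally:
        subprocess.run(["./teardown_env.sh", name], check=True)

with ephemeral_environment("integration-pr-1234") as env:
    subprocess.run(["pytest", "integration_tests/", "--env", env], check=True)
```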
Pets vs. Cattle
This ability to spin up and spin down has been growing, but some systems have been harder to deal with, such as large application server clusters. Think of the spin-up/spin-down concept as moving from pets to a herd of cattle: from working with individual systems (pets) to infrastructure as code (cattle), which continues to expand. This is one of the key foundational aspects of moving to repeatable, high-quality deployment automation. One reason for this is that as one moves to fully automated systems that can easily be recreated, you must move to standards-based systems instead of fully customized, handcrafted environments.
The pets of the server environment were each created, maintained, managed and patched separately, even if automation was used. Each also tended to be unique due to the software they were built to run. This uniqueness is what required individual management.
Moving to an environment where we want to spin servers, systems, or containers up and down, the cattle analogy brings us to standardization: instead of patching and managing systems, they are simply replaced.
“In the old way of doing things, we treat our servers like pets, for example Bob the mail server. If Bob goes down, it’s all hands on deck. The CEO can’t get his email and it’s the end of the world. In the new way, servers are numbered, like cattle in a herd. For example, www001 to www100. When one server goes down, it’s taken out back, shot, and replaced on the line.”
From Randy Bias
z/OS-related note: We have to be careful not to take this analogy too far. When a system is running a single application, the entire system’s lifecycle can be based on that application, and spin-up/spin-down, even for production, can work very well. When systems get larger and host multiple interconnected applications, as is the case with z/OS, then no single application in production determines the system’s lifecycle. For production, then, the system is maintained and managed, but it does not follow that development and test systems can’t spin up or spin down. In fact, using spin-up/spin-down systems for development and early types of testing will help ensure the automation is standardized and created so that the process can flow all the way to production.
Many of the z/OS systems have been running for years, handcrafted by the system programmers to optimize the system for the 4-hour rolling average. Though IBM Z supports the ability, through z/VM, to spin up multiple z/OS environments for testing, through Z Development and Test Environment (ZD&T) to spin up a logical partition (LPAR) on a Linux Intel system, or to spin up additional environments within an existing z/OS LPAR, this has not been a normal practice for many shops. As described earlier, the software configuration library management products, with their defined hierarchy, generally limited the environments.
With this ability to spin up capability, partial environments can easily be created for development and test, and could be used for production workload as well. For these environments, data still must be considered. The question becomes: what about the parts of the application that can’t be, or aren’t being, spun up and down? The back-end systems, either large legacy applications running on traditional servers or the z/OS applications, are seen as the factor slowing down the development efforts. The need to tie into the back-end systems generally ends up connecting to one of the existing levels, which will have its own testing going on. This inconsistency of what might be in the environment slows down the process for both the back-end systems and the front-end development. Defining virtual services at the boundary between the front-end systems and the back-end systems does provide a level of testing, but at some point it is critical to bring them together to get the end-to-end test.
Given what has happened with many front-end systems and this spin-up/spin-down nature, moving from pets to cattle, we need to extend these concepts into the other legacy systems and into z/OS. But how do you move to cattle when you have a shared system and there is no easy lifecycle for spinning the entire thing down in production? The first step in moving toward cattle is to create standardization and automation. In order to create the ability to spin up or spin down, the middleware, the configuration of the middleware, and the application considerations for configuration of the operating system need to be understood, as well as what parts make up the application. Pulling out the configuration, storing it in the SCM as with any other system, and creating the automation to create an environment will support the ability to create the systems on demand. But that will take time; these z/OS environments have been built up not just over a few years, but in some cases over the last 30 or 40 years.
Story
This kind of mindset shift, allowing the ability to spin up multiple environments for short periods of time, allowed an insurance company to move from a 25-hour back-end test to 12-minute individual tests.
In order to get started, you can copy an existing z/OS LPAR, with all its configuration, middleware and all the load modules for all the applications. This copy can then be used with ZD&T to spin the z/OS LPAR up on Intel hardware. This means the z/OS environment can be spun up alongside the existing other parts of the application for a full test environment. Since this is z/OS running on Intel hardware, it does not perform as it does on IBM Z hardware, and it is obviously only for early development and test. But it does allow the ability to create additional isolated environments. Using an environment created this way, only application changes, data and test resources would need to be deployed to the system.
It is important to note that ZD&T can also be used to test infrastructure changes, as well as to build up automation capabilities for infrastructure as code. Creating the image in the first place from the existing system provides the base framework; then automation can be created to allow for any updates or changes. ZD&T can also be used for testing system-level changes, not just application-level changes, such as a new version of middleware or a new compiler.
Moving to infrastructure as code for z/OS as well will provide additional flexibility, allowing the spin-up of environments with only the required resources; however, a copy of the system minus data provides the first step toward independent testing. When moving toward infrastructure as code, consider what is being used for the rest of the environments. For example, Ansible is a common infrastructure automation capability, and it can also be used with z/OS to spin up instances of middleware as required on any z/OS LPAR, on IBM Z hardware or hosted on ZD&T.
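As a hedged illustration, a pipeline could drive such an Ansible playbook from Python using the ansible-runner package; the playbook name and directory layout below are assumptions, and the playbook itself would use the available z/OS automation content to start the middleware instances.

```python
import ansible_runner  # pip install ansible-runner

# private_data_dir holds the inventory, environment settings, and playbooks.
result = ansible_runner.run(
    private_data_dir="pipeline/ansible",
    playbook="spin_up_middleware.yml",  # hypothetical playbook
)
if result.rc != 0:
    raise SystemExit(f"middleware provisioning failed: {result.status}")
```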
As automation increases in the pipeline and in the testing process, the ability to have isolated environments becomes more and more critical. Attempting to run automated tests in an environment that is also being used by individuals for their own testing leads to errors due to data mismatches; this can be avoided entirely by having the right systems available.
Key insight: Shared test environments
Using shared environments for automated testing drives instability in the process and makes it more difficult to get and maintain a stable quality signal.
Containers
The creation of containers has brought a dramatic change in testing. With a container, you guarantee that the entire stack you are testing will have no changes in later environments. The full system is included, and therefore the variability of the operating system, operating system patches, middleware versions, etc. is removed. With a container, you can be sure the exact same underlying software, not just the application, is what you tested. As discussed, containers have also brought additional simplicity to the idea of spin up/spin down. Creating a container to do a set of testing is a normal process, and just as important, destroying a container is normal. Containers, once spun up, are not updated; a new container with updated capabilities is spun up to replace the old one.
This changes the way testing can be performed because you aren't having to test for the variability of the different environments; you can focus on the function and only the function. The deployment process cannot introduce additional variability as it does when it deploys into existing systems, even if those systems were built up via automation. There is always some chance for variability, but with a container there is no chance once the image is built.
An issue with images is that the code is built along with the middleware and required system components, which means additional care must be taken for the image creation. The various layers of the image need to be managed appropriately to ensure security vulnerabilities or unwanted packages are not inappropriately brought into the system. Management of the various layers of the images needs to be part of the overall process to help drive software quality.
Ensuring the images are built on a dedicated build system, so there can be no introduction of developer system variability, is critical to ensure stability,
understanding and software quality. The image creation process is now the key point for ensuring all the right parts are included. The images need to be carefully managed, controlled and scanned to ensure no modifications are made outside the process.
Thinking back to the picture of provisioning and deployment, the same line applies in the process of building a container. The base image, including operating system parts and middleware, will most likely be provided by the infrastructure team so that it satisfies security and privacy requirements. The product team will provide the layers above, with the configuration for the base and the application artifacts.
ZD&T provides the ability to spin up z/OS LPARs in an Intel Linux environment for development and test. With ZD&T's support for containers you have the same flexibility to spin up and spin down a full z/OS LPAR. This provides the same dynamic flexibility as other systems, but is based on a full system image, not just the specific portions required for the application.
z/OS itself supports the concept of Z Container Extensions (zCX), which are Linux containers running within the z/OS environment. However, though IBM has made a statement of direction that it will support native z/OS containers, they do not yet exist as of the writing of this book. One of the primary goals of containers is to standardize the environment the application is running in, and when you have thousands of images, trying to maintain consistency is very hard. However, z/OS has been designed and developed such that many organizations build a standard system image known as a sysres. This sysres is then used for all z/OS images, so the base system and software configuration is generally less out of sync than it gets in other environments.
The differences primarily come from configuration differences above the system and in the middleware. Also, with z/OS there are generally many fewer images to manage, in some cases as few as four, in others in the hundred range, so keeping them consistent has been less of a problem. Taking the first step toward infrastructure and configuration as code can address the differences currently being caused in the z/OS environment. So even without containers you can ensure a consistent environment between development, test and production.
Test Data Management
Test data is any data being used as part of the testing process. What test data is used determines which code paths are executed. Test data is required for all types of testing. Each type of testing has a different goal, and therefore, may need different types of test data. The process of managing all of this test data is referred to as test data management.
There are many types of data including:
Reference data.
Input data -- Not Personal Identifiable Information (PII).
Derived data or calculated data.
Personal Identifiable Information (PII).
Within these types of data are different forms of data:
Production data -- The actual data from the running system with user input.
Obfuscated or masked production data -- This is production data that has been processed to obfuscate or mask the PII information.
Fabricated data -- Data created, either manually or via tooling, that is not based on real customers but is made up.
Each type of data has its own controls and security implications. Fabricated data has the fewest controls required, as it does not represent any individual. Data fabrication, however, is not simple; the data has to match the relationships defined in the application for the real data.
The following picture (Figure 15) helps to explain the complexity of data within an environment. This is a simplification, but even with the simplification it demonstrates the interrelated nature of the data.
Figure 15
One way that data is fabricated is by simply inputting data into the system as a real customer would. You'd be using made up information, creating a name, address and other identifiable information not based on a real person, just based on random information. However, when doing this the rules still have to be followed: if the address will be verified via Google Maps, it needs to actually exist and be a real address. Even though it is a real address, since it was created via fabrication it is not PII.
One way to describe data fabrication, as described by a client, is to think about the fabricated data as fictitious combinations of values: a street address picked randomly from a phone book, plus a first and last name picked at random from the same phone book, would have a very small chance of matching a real person's name and address. When you add a date of birth picked at random, plus random values for a few other "sensitive" fields, there is no real chance of having data that could be matched to a real person. Since PII data is not commonly used for complex processing (except for location as described), a fairly large "population" can be generated and then used as a basis for adding more test-specific information. One of the more effective data generation tools uses this technique.
There are tools that can assist in this fabrication. They work by creating rules for each data field describing what the contents could be, and what the relationships between multiple tables are, so if data has to be related via keys, it is. Rules for fabricating an account number need to be the same rules used to generate real account numbers, but producing values that are not part of the production data. Some fabrication tools can assess the data in the production system to determine ratios of data, for example the percentage of male vs. female entries, and use those percentages in the fabrication process. Data fabrication is generally slower, so it's typically used for smaller amounts of data.
One additional benefit of fabrication technology is that it can generate bad data and boundary-condition data based on rules. This allows for a greater spread of testing scenarios.
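To make the technique concrete, here is a minimal sketch in Java of rule-based fabrication along the lines described above. The class name, reference lists and field rules are all illustrative assumptions, not any particular tool's API; a real fabrication tool would drive this from configurable rules and production ratios.

import java.util.List;
import java.util.Random;

// A sketch of rule-based data fabrication: independent random picks from
// reference lists mean no generated record corresponds to a real person.
public class CustomerFabricator {
    private static final List<String> FIRST_NAMES = List.of("Ana", "Raj", "Mei", "Tom");
    private static final List<String> LAST_NAMES = List.of("Okafor", "Larsen", "Silva", "Cho");
    // Real, verifiable addresses, never paired with the people who live there.
    private static final List<String> ADDRESSES = List.of(
        "1 Main St, Springfield", "42 Elm Ave, Riverton");

    private final Random rng = new Random();

    public String fabricate(boolean injectBadData) {
        String name = pick(FIRST_NAMES) + " " + pick(LAST_NAMES);
        // Rule-driven bad data: an impossible date of birth exercises error paths.
        String dob = injectBadData
            ? "1899-02-30"
            : "19%02d-%02d-%02d".formatted(rng.nextInt(100), 1 + rng.nextInt(12), 1 + rng.nextInt(28));
        return name + ", " + pick(ADDRESSES) + ", born " + dob;
    }

    private String pick(List<String> values) {
        return values.get(rng.nextInt(values.size()));
    }

    public static void main(String[] args) {
        CustomerFabricator fabricator = new CustomerFabricator();
        System.out.println(fabricator.fabricate(false)); // plausible record
        System.out.println(fabricator.fabricate(true));  // deliberately bad record
    }
}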
Key insights: Test data creation
If using the input method for creating test data:
It can be a bottleneck unless it is automated. Also, if the input has to run through a complex system in order to get to the state that is needed by the test, it can become expensive to use. This is especially true if you have to run the data through multiple process cycles.
Only test data that can be entered into the system can be created. For example, if you have to test a product that is no longer available for sale, but still has accounts in use, you can't generate the data unless you keep various old versions around.
Direct fabrication via rules can be very complex; creating the rules without a clear understanding of the data can be a challenge. Having data fabrication capabilities that can look at the existing data can help provide base rules, but additional work will be required. The following diagram (Figure 16) helps demonstrate the flow of interaction that complicates the data fabrication challenge.
Figure 16
Reference data is not PII data nor derived from PII data. Examples of reference data include:
The list of states in the United States of America.
The list of cities within a state.
The streets within a city.
The available types of accounts provided within an organization.
The types of insurance policies available.
The list of prescription medications.
This data, when not associated with a particular person, is not PII.
Input data is any data requested from a user through any form, such as their name, address, or desired number of years for a mortgage. Some input data can also be PII data.
Derived data is anything that is calculated or created based on the other available data, such as a checking account balance. It can also represent items such as a credit card number when it is the organization creating the account; however, a credit card number in other circumstances is input data and is also always considered PII.
Understanding the data, the types of data, how the data is used, where it is stored and how it is managed, is a critical part of the data management strategy, of which test data must be a part.
Test data management is one of the most critical aspects of the testing process. Test data management provides the ability to do automated testing. Bad test data, or badly managed test data, is a major problem developers encounter when trying to perform their tests. This is very commonly an issue in shared test environments, where the data becomes so unrealistic, out of spec, confused and wrong that testing and developing become much harder. I have worked with many customers where the development environment is only used to write code, and in order to test anything, the modules have to move forward to an environment that has any reasonable semblance of data.
Test data management isn’t just critical, it is also extremely complex. When we think about test data, we need to make sure we are addressing the right data for the right scope for the right phase. One reason test data has become such a focus is the issue around security and PII. With the focus of General Data Protection Regulation (GDPR) in Europe, you can’t have data just used in any environment. We need to make sure the data that is in the test environments has been cleansed, masked or obfuscated. However, when it comes to masking or obfuscating data, that takes time and effort, and can cause problems with the
actual application because the data is no longer correct. If we use the masked obfuscated data for the later stage performance scalability testing, where you’re testing the actual performance and scalability, rather than the actual results this obfuscated masked data works much better. Later state testing of scalability and performance also needs lots of data, which is where production volume is an asset, you can run the full volume or increase the production volume flow for this type of testing.
In lower environments, test data needs to be controlled so that automated tests can be run. Controlled data means well-defined data that is always the same in order to run the tests. This means the same data that drives the test must also support the automated verification of that test. The other thing about lower environments is that you don't need as much data. When running a unit test, a function test or an actual integration test, the amount of data needed corresponds to the scenarios, to avoid testing the same thing over and over again.
Test data fabrication comes into play in these lower environments. Using fabrication to create the test data here eliminates any worry about PII, and you don't have to run millions of records through the program; you're going to generate a small amount of test data that corresponds to the scenarios you want to test. This small amount of test data should include not only happy path testing, but all the error conditions that you want to test.
Make sure your fabricated data gets all of the error scenarios, the bad data and the impossible data into the system, so you can verify you handle those bad requests, as well as those bad actors trying to crash your system. Production data that has been masked or obfuscated doesn't have all the error scenarios, because by definition it's good data. It got into your system and processed through it, so it has to be good. It might not be great, but it is definitely good data that isn't going to allow you to test all of the error scenarios you should be testing. It also can't contain the incorrect data that never got in, and that kind of data is what you want to test for in the earlier stages. Some customers are moving to test-data-as-a-service, which allows their developers to request data for a particular program or scenario that will be generated for them so they can create an automated test. The fabricated data is then stored with the test, allowing the test to be rerun and verified, always with the same data.
The key capability here is that automation and APIs need to be provided to create, reset and cleanse data. This includes the full process of replacing the data for the test in the environment where the testing will occur. Any manual steps in the process can add hours or days to the full value stream.
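As a minimal sketch of what such an automation contract might look like, consider the following Java interface. The names are hypothetical; the point is simply that create, reset and cleanse are all callable from a pipeline with no manual steps.

import java.util.Map;

// A sketch of a test-data-as-a-service contract; every operation a test
// needs is scriptable, so the pipeline never waits on a manual data request.
public interface TestDataService {

    // Fabricate a small, scenario-specific data set; stored with the test so
    // the same inputs always drive the same expected outputs.
    Map<String, String> createForScenario(String scenarioId);

    // Restore the target environment's data to a known baseline before a run.
    void reset(String environmentId);

    // Remove or irreversibly mask any sensitive data after the run.
    void cleanse(String environmentId);
}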
The following picture (Figure 17) shows where fabricated data vs. data pulled from production environments might be used. Notice the masked indication on the line pulling from production; unless the data is used in an environment as locked down as production, the use of production data is generally not appropriate due to security concerns, and can raise legal concerns as well under GDPR or other data privacy regulations.
Figure 17
A large insurance company has done a lot with test data management, starting with understanding what kind of data they needed in the test environment, what the requirements for that data were and how close to production it really needed to be. For an insurance company, it's important to recognize that the data has to be real; for example, if an insurance company sells homeowners insurance, the house address has to be a real house address that is findable via Google Maps. The data has to be real enough and representative enough to allow for proper testing.
But how do we deal with error situations or bad data, since we know the address has to be valid? A large percentage of their data is used in printed reports or as parts of reports, but isn't actually used for comparison, validation or processing anywhere in the system. So when thinking about test data, it's important to consider whether or not the data is actually tested against, validated against, and used in particular ways, to understand the kind of data that you need.
There are a number of different environments that need to be dealt with, starting with the largest environment closest to production: the pre-production environment. That environment is used for final validation, for ensuring that everything really fits together before it deploys into production, and in this case production data is actually the best data to use. However, there are security requirements, so how do you use production data while still correctly following GDPR and other regulations?
When thinking about this final test environment, this pre-production environment, if we can lock it down the same way we lock down production then maybe it could have real data in it. (This is actually up to the lawyers, but
it’s reasonable that it could be approved if the right access controls are implemented.) But the environment before that can’t have real data. These environments need to have the PII removed from them, obfuscated, masked or cleaned up in some way. Through this process of obfuscation and masking it needs to be impossible to undo, so that if the data is discovered you, or those doing the testing, can’t figure out who the real people are.
Key insights: Test data
Less is more -- Using the smallest set of data that will drive the largest number of scenarios and the largest code coverage increases the speed of the test and the reliability of the results.
Fabricated data for earlier testing -- In earlier types of testing such as unit test and function test, the goal is to test the actual actions. Using fabricated data helps give the right coverage, while also reducing the need for data security as no PII is used.
Production data is not good for early testing -- Production data has only good data, in that it has already made it into the system, and it has many entries that will drive the exact same code paths over and over again. This kind of repetition slows down the testing without adding any additional value.
Production data is the right data for performance and scalability testing -- When trying to drive load on the system, using the production data gives the largest base of available good data. Running a day's actual load at an increased speed provides the ability to test at a larger scale to validate for spikes and increased traffic.
The test data management strategy needs to be developed as a partnership between IT, business and legal so that the risks, controls and value to test are all well understood and balanced to meet the business needs while still being fully compliant with regulations. A comprehensive test data program will have to balance risks with compliance.
Stable Quality Signal
In Engineering the Digital Transformation, Gary Gruver defined the stable quality signal as a way of measuring the system with a consistent result, specifically ensuring your tests consistently return the same results with the same inputs. I have to thank him, because this concept of having a clear indication of quality with your tests is a key part of ensuring your pipeline can deliver code through to production. There have been a number of discussions about this, but the description in Gary's book provides clear specifications as to what is required. He discusses some ways to get a stable quality signal, including the idea of running your tests over and over again to ensure they always produce the same results, as well as randomizing the tests to ensure that they aren't dependent on each other. The idea of reducing your tests to make sure those that you have are of good quality may seem a little strange, but tests that do not always return a clear indication of the quality of the code are not useful. Developers spend too much time debugging test problems and grow to expect that it's a test problem instead of a problem in the actual code; the errors in the build start to be ignored.
The key to getting the stable quality signal is making sure your tests have been developed in such a way that you can run them repeatedly, and in various orders, and have them return the same results. One way to help accomplish this is keeping test data along with tests so you have known inputs that allow you to validate against known outputs. One recommendation I have seen work well is to start building this stable quality signal on a set of API tests. As I will describe later, API tests are some of the easiest tests to build and maintain because they are based on a defined contract.
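A minimal sketch of both ideas in JUnit 5 follows: the Random method orderer shuffles execution order on every run, so hidden dependencies between tests surface as failures, and each test carries its own known inputs and expected outputs. The InterestCalculator class under test is a hypothetical stand-in.

import org.junit.jupiter.api.MethodOrderer;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestMethodOrder;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Hypothetical code under test.
class InterestCalculator {
    double simpleInterest(double principal, double rate, int years) {
        return principal * rate * years;
    }
}

// Random ordering shuffles the tests each run; any dependency between them
// shows up as an intermittent failure instead of hiding behind a fixed order.
@TestMethodOrder(MethodOrderer.Random.class)
class InterestCalculatorTest {

    @Test
    void simpleInterestOnKnownInputs() {
        // Fresh fixture and known inputs per test; no shared mutable state.
        assertEquals(50.0, new InterestCalculator().simpleInterest(1000.0, 0.05, 1), 0.001);
    }

    @Test
    void zeroPrincipalYieldsZeroInterest() {
        assertEquals(0.0, new InterestCalculator().simpleInterest(0.0, 0.05, 1), 0.001);
    }
}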
In defining a test for part of the stable quality signal, you need to focus on the individual components as well as the integrated system. This stable quality signal should be achieved through the series of regression tests that can be run in
an organized fashion following each build. This regression bucket should be able to run starting first with the area of code that specifically changed, and then following with the remainder of the tests. This regression bucket should over time include enough tests to give you the confidence that the code could be ready for production. In the beginning, starting with a few tests provides at least a level of feedback to development. The regression bucket needs to be managed in the same way as the rest of the code: maintained, groomed and updated.
The regression bucket should not just become a compilation of all the automated tests that have been created, but should be a defined set of tests that cover key business processes in a way that gives you the confidence necessary to deploy. For the regression bucket, it is important to focus on components, applications and then also the final integration type tests. Integration across a set of applications is a key area for the regression bucket, to ensure that you are not breaking the interdependencies between the applications.
Story: Unreliable Tests
I can remember working with a team who seemed to always have a yellow build status. When I took the time to investigate, I discovered they had updated the system to allow for yellow builds: any builds whose tests failed would be marked as such, but could be ignored, and the function would continue on to the next environment. No one spent any time looking into these failures if they were not specifically associated with the change they had just made. It was just always this way, as there were too many tests and no one could remember when it was last green. When I then looked into the test failures in the later stage environments, it was clear that we were running into problems that should have been caught, and were caught, by the earlier tests, but since there were so many problems no one noticed. This letting the code through was seen as a positive, but really it was delaying the feedback on issues, and it took longer to figure out what was wrong since it had moved on to integration testing rather than finding the issue in the original testing.
One reason for the failure was the use of unstable UI tests. All of these UI tests were finally removed from the early process, and critical functions were tested via APIs instead. This allowed the team to receive a green build for the first time in a long time. They still had a while to go to build up enough coverage with the new testing methodologies, but they now had a process that would allow them to see if there were problems and not ignore the tests.
Key Insight: Stable quality signal
A stable quality signal is critical to make successful use of the pipeline, providing quick feedback on changes. Without a stable quality signal, errors in the pipeline will be ignored, as it's not clear if it's a problem with the code or a problem with the test.
Get Clean, Stay Clean
If every time you do a build there’s a failure in some test somewhere, but nobody knows why, and it’s just always there, people are going to stop paying attention to these failures. They will lose confidence in your testing. This is why it is important that you get your automation and tests clean.
Focusing on ensuring you have a clean pipeline with a clean set of tests is more important than the number of tests or full code coverage to begin with. Making sure the quality of the tests is high enough that users don't question the pipeline results matters, because if a problem does surface, then they know it's a legitimate problem. If you're starting out with lots of automated tests, you need to re-focus: begin working with subsets, find out which tests are reliable, and provide the right level of coverage for the function.
As described in Gary’s book, making sure you have this clear quality signal is what is meant by ‘get clean and stay clean.’ This process does take effort and must be planned for. Managing the test and the test data is as important as the application itself in order to be able to make updates quickly. In addition, focusing on a small set of quality tests reduces the work of maintaining the tests.
It’s key to include tests as part of all development efforts, this keeps the tests working and providing value. It is not a develop and forget process. One good way to reduce maintenance efforts is to used defined interfaces for testing. Any time a defined interface is updated, the corresponding test should be considered for the new updates, but the existing tests should continue to work, unless it’s a breaking change, which is also good to understand early.
Summary
Software development at its core, no matter what type of system you are building, requires people to use creative thought; this includes writing the code and all associated tests. Outside of these activities, the processes are repeatable tasks that should be automated to remove the variability from the system, as well as to remove any activities from the software engineer that can be automated, allowing them to spend time creating business value.
The pipeline automation is at the core of this automated process. It's not just automation, but the framework to bring together all the parts of the automation with appropriate quality gates to deliver feedback to the development team as quickly as possible. The pipeline provides the ability to go through the various stages as soon as the prior quality gate is satisfied. The pipeline also provides a standardization of the process across all technologies and platforms. The commonality brought to the system by using the same technology for the pipeline, where appropriate, breaks down the differences between the teams, allowing for greater collaboration and understanding.
In large complex systems, including IBM Z, it is important to note that standardizing on one pipeline technology can help with audit and controls, but you will have multiple instances of pipelines, and likely pipelines of pipelines, to bring the various parts of the system together.
The pipeline provides the automation, but requires environments to be available for the various levels of testing. In large complex systems, including IBM Z, one of the biggest inhibitors to the agility required in today's rapidly changing environment is the availability of environments for development and testing. Providing the ability to spin environments up and down, including z/OS, provides the flexibility to allow for the innovation required and fast feedback to the teams. Along with environments is the ability to have the right test data available at the right level for the type of test being performed. As discussed above, test data fabrication and small sets of data in early environments can help drive appropriate behaviors to build quality in from the beginning.
The value of this ability to have isolated development environments cannot be overstated. This is a key factor in the ability to deliver at the speed business requires today.
Section 3: Types of Testing
Testing can be broken down into the scope of the testing as well as the type of testing. For any application there will be a need to include various different types of testing, in addition to the other methods described, to drive software quality. This will not be an exhaustive list, but an attempt to describe the most common varieties and types, with the multiple names commonly used to describe them. The goal in describing each of these specific areas is to help provide additional insights for designing a quality process. These testing practices are not unique to complex systems; they apply to all types of development. They are discussed here to ensure they're considered as part of the overall solution.
Scope of Testing
One of the best ways to differentiate between the different types of testing is based on their scope, or the focus of the testing efforts. To begin, we will break down the following scopes:
Unit -- method/program level.
Component level.
Application level.
End-to-end.
According to Martin Fowler, "The test pyramid is a way of thinking about how different kinds of tests should be used to create a balanced portfolio. Its essential point is that you should have many more low-level unit tests than high-level tests running through a GUI." The following picture (Figure 18) represents the standard test pyramid.
Figure 18
Another way to look at the pyramid is what can be done before deploying into an environment vs. what has to be done once the function is deployed.
There are many different types of testing as well as scopes for the tests. For now, we'll focus on the scope of tests rather than the types of tests. When we think of the scope of a test, we're thinking about what percentage or what part of the application or system we're testing. We can start as low as the scope of the unit test, and then move up to a function level test, a small integration test, a larger integration test, or system tests. Different organizations use different terms for the different scopes, but in each case there is some level of separation between the scopes of testing.
For some organizations, when testing, you're almost always testing the entire application; there isn't a way or an environment in which you have the ability to run part of the application without the rest of the application available. This is because many of the applications have ended up growing into monoliths, and these monoliths make it harder to test a component or a part of the application that you're working on. One focus in improving automated testing is to work toward having the ability to test at various levels, from unit to component to application to system, and finally to the full organization test.
Within each one of these scopes there may be different types of testing you also want to perform. For example, you can do performance testing, even at the unit level, to understand what you've just written and make sure that it doesn't slow things down by itself, so we need to think about performance testing at the scope currently being tested. Ensuring that we do all of the testing we can at each scope helps us get fast feedback to the developer. There's no reason to wait for a large-scale performance test if you can find out that the database query that you
wrote performs badly from the start, and that’s just an example of something that could cause a problem.
There are many other examples of things that could cause a performance problem, such as adding in a loop that performs badly and does not exit as soon as the items are found, or performing extra processing after you've already resolved an issue. Making sure that the performance of the individual or smallest level item is tested along the way helps find these issues as early as possible, when they are much easier and cheaper to fix.
Unit Testing
A unit is the smallest possible piece of code that can be tested. A unit should have few inputs and limited outputs, ideally one. For procedural languages this is usually the program; for object-oriented languages this is generally the method. Unit testing is a critical part of the quality process improvement. The concept of writing automated unit tests is not normal for many developers of complex systems, including IBM Z. Historically there have been few unit testing frameworks targeted at languages such as COBOL and PL/I, and the concepts evolved after many large systems had been created and teams were working in a waterfall development methodology.
Unit testing is a set of software that is written to test the smallest possible unit of code. The tests are created to drive the various code paths, including error conditions. Problems found with unit testing are fixed as part of the development process and are generally not tracked individually. Unit testing frameworks assist with the testing along with stubs, mocks and drivers. The unit testing frameworks assist in the process of creating the automated test so that the code can be run without other parts of the application or middleware. The unit test is runnable as part of the build process, so each time the method or program is compiled, it is also tested.
Unit tests are created by the developer either before or as they create the code. This is the one case where the test is created by the same person that does the initial implementation.
Frameworks such as jUnit, xUnit and zUnit help create the test cases. For example, when building Java applications, jUnit tests should be created to test the methods. Along with the testing framework, the capability to record code
coverage while running the tests helps provide a view into what has been tested. Code coverage can be used to determine if enough scenarios have been covered as part of the unit test.
The unit tests are another piece of code checked into the SCM alongside the actual code and should be part of the code review process. Code coverage can be tracked as a metric for software quality, but as described in the metrics section, care should be taken to drive the right behaviors. One hundred percent code coverage is not necessarily a good goal, but having the right coverage or improving the coverage when testing existing programs, may be the right goal. The following picture (Figure 19) represents the full scope of the unit test.
Figure 19
In building the unit test we need to make sure that we are actually testing the various functions, and not just putting in an assert-equals-true as part of the test. It is possible with jUnit to build a test that doesn't actually test the full capabilities of the method, but instead just asserts that true equals true. This isn't truly unit testing, but rather a way to satisfy the requirement for unit testing without actually doing the work. In order to stop such practices we need to make sure that unit testing goes along with code coverage reports; this will ensure the developers are actually testing the code that they have written.
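The contrast looks something like the following JUnit 5 sketch, where AccountValidator and its validateAccountNumber method are hypothetical stand-ins for code under test.

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

// Hypothetical code under test.
class AccountValidator {
    boolean validateAccountNumber(String value) {
        return value != null && value.matches("\\d{8}");
    }
}

class AccountValidatorTest {

    @Test
    void hollowTestThatProvesNothing() {
        assertTrue(true); // Always passes, exercises no code; coverage reports expose this.
    }

    @Test
    void acceptsWellFormedAccountNumber() {
        assertTrue(new AccountValidator().validateAccountNumber("12345678"));
    }

    @Test
    void rejectsAccountNumberWithWrongLength() {
        // Error path, not just the happy path.
        assertFalse(new AccountValidator().validateAccountNumber("123"));
    }
}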
In thinking about unit tests and code coverage, we want to focus on making sure that we're getting the right coverage of the code that we're writing. It needs to be enough coverage, or enough information, to indicate that we have tested the capability. We want to make sure we're not just doing the happy path testing, but we do want to be realistic about what is being tested. We also need to think about the fact that some code already exists, so if what I'm doing is adding new code or changing existing code, my unit test needs to focus on this new and changed code, not on all of the existing codebase.
When we think about z/OS we need to think about unit testing in the same way: we still want to test the new and changed lines of the code that we're working on without having to worry about all the existing code that's already in the module. In order to do this, we use tools like zUnit, which allow developers to create a unit test of the particular code they're changing without having to cover all of the code in the module. We need to provide the right code coverage results to show that those new and changed lines have been tested and include both the happy path and the error paths.
When thinking about z/OS we also have to think about what the smallest
possible unit is. In this environment the smallest possible unit, at this point, is probably the program itself, or the load module. If we look at each individual load module it’s part of the larger application, and usually load modules cannot be tested independently unless they happen to be a batch program. With the zUnit capability, similar to jUnit capability, you have the ability to create a unit test to test individual programs without having to have the middleware around. This means you can do a record and playback method to capture the data necessary to run the unit test in the future. Once you have that unit test built you can test the program entirely independently.
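zUnit's record and playback is specific to z/OS, but the underlying idea can be illustrated in any language. The following Java sketch is not zUnit's API; it simply shows how calls a program would make to middleware can be answered from previously captured data, so the program under test runs with no middleware present.

import java.util.Map;

// Stands in for a middleware access layer the program would normally call.
interface CustomerStore {
    String lookup(String customerId);
}

// Replays answers captured during a real run instead of calling middleware.
class PlaybackCustomerStore implements CustomerStore {
    private final Map<String, String> recorded;

    PlaybackCustomerStore(Map<String, String> recorded) {
        this.recorded = recorded;
    }

    @Override
    public String lookup(String customerId) {
        return recorded.getOrDefault(customerId, "NOT-FOUND");
    }
}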
By allowing developers to do automated unit tests as part of their process, they can get fast feedback on whether or not the code change they made actually does what they expected it to do. Being able to test quickly, without having to deploy into an environment, means the developer can get their test done as soon as their code has been written. They don't have to wait for other people, they don't have to wait for environment setup and they don't have to wait for other resources. They can start testing as soon as they finish writing the code or, even as they are writing the code, they can test what they are doing as they are doing it. This may not be true test-driven development, but at least the test is developed at the same time as the code is being written. Adding these automated tests means the code can be verified along with its build process, so each time the program changes you can retest to make sure the old unit tests continue to work.
How does this definition of unit testing compare to what is happening today? In many cases, developers are running their code however they can as part of the application test, to see if the function does what they intend. This process takes time and requires an environment for the developer to run tests in. It is also a manual process, so each time the program or function is changed, the manual process starts again. Moving to true unit testing changes the way developers work, changing their responsibilities to include the automated unit test, but it also provides additional value to the developers because the existing unit tests can be run to verify that any old functions that should work, still do.
It’s important to recognize that with large complex systems, that have little unit testing done, trying to go back and create unit tests for everything is not realistic. Instead, focusing on the areas where changes are being made and adding unit testing to the changed programs will slowly increase the overall coverage. Adding unit testing to a program that has not changed in 10 years and is not likely to change in the next 10 is not a valuable use of resources.
Key Insight: Automated Unit testing
The use of automated unit testing is critical to get fast feedback to the developer. Providing measurement to show the unit test covered the new and changed functions, including error conditions, is important. However, this is a significant change; capabilities such as the zUnit framework, which allows the developer to test without any of the related middleware or other parts of the application, should be used to help demonstrate value to the developers.
Beyond Unit, But Still in the Build
In the z/OS space, after unit testing, there comes the challenge of how to test further. Do you have to build up an entire environment in order to run the test, or is there a simpler way? One specific value about z/OS is that the middleware systems and environments have provided a set of exit points that actually allow you to do more than you might expect. The system also has a highly documented and highly stable control block structure for calling programs.
The next picture (Figure 20) represents a modified version of the standard pyramid where an additional separation is indicated based on pre-deployment and post-deployment testing.
Figure 20
When I started working at IBM, learning the z/OS control block structure was a challenge, but very important. Due to this structure, the standard linkage mechanisms and the exits in z/OS, it is actually possible to literally stub out all middleware without changing the program that has been compiled. With this capability, IBM Z Virtual Test Platform (ZVTP) was created. This uses the well-defined structure of z/OS and the middleware to enable a record and playback of programs that are in a transaction or batch flow. This means you can run your application without having to deploy it, to do the next stage of testing.
This concept of being able to run a z/OS application inside the build process is relatively new as of 2020, but it is a game changer in the way we can do automated testing. With this capability you can now record a transaction or an application to capture all of the calls from the program to the middleware or to other programs. This allows you to build a fully virtualized environment in which to run your program or set of programs on a system that doesn't even have the middleware installed. This greatly simplifies the ability to do automated testing and allows for testing within the build process to give that really fast feedback to the developer.
This kind of testing changes the game for z/OS, allowing for the testing of copybook changes or compiler changes in a way that was never available before. While this capability is specific to z/OS, in terms of simplicity, development on z/OS now begins to align more closely to that of other systems and middleware, such as Liberty, where test environments are quickly deployed to laptops. With ZD&T you have been able to run z/OS on your Intel system, but this is a full z/OS environment that you have to deal with, from the standpoint of running the middleware and all of the components. With ZVTP using ZD&T, I can now simply use the programs and the data file in a way much more similar to that Liberty-on-my-laptop example.
In the z/OS environment we have these shared copybooks, similar to shared include files in other languages; however, these copybooks in z/OS are really the data structures. They are the definition of the sharing of data between the programs and applications, so they are really more like the API definition in other languages. COBOL copybooks or PL/I include files define the structure in order to appropriately access the data. The copybook is used for API definitions with z/OS Connect, or in calling other programs. These copybooks are sometimes included in a large percentage of the programs that make up a company's applications; they can be shared within an application or across applications. When a copybook change is made, as long as the change is not a breaking change, it doesn't require other programs to be recompiled. This means your applications could have many various versions of this copybook included in the programs, which can lead to errors, problems and issues. So if compiling everything that includes the copybook increases the risk of a problem, how do you test all of those changes? With ZVTP you can easily run the data file with the programs after they've been recompiled to verify that nothing has actually changed, and that the programs continue to run as they did before.
An important value of testing using the Z Virtual Test Platform is that large portions of the application can be verified based on simple playback (Figure 21), to validate that changes made in one program don't negatively affect the rest of the flow. In a highly interconnected environment this simplifies the ability to get fast feedback to avoid unintended consequences of a change.
Figure 21
Component Testing
Component testing could equate to a microservice, the smallest segment that performs a unique function. Component testing is designed to verify that the component performs the required capabilities. Sometimes unit testing is also known as component testing, but for the purposes of this discussion component testing is a level above unit testing.
A component is a set of units that come together to provide a specific function. For some applications this could be represented as a transaction. Components are brought together to perform a full service.
Components generally have defined interfaces for communication with other components within the application. This interface could be a REST API, an MQ queue, or any number of standards-based or proprietary interfaces.
Component testing is focused on function and interaction, specifically, to make sure the function performs as expected, whereas with unit testing, the focus is on the code performing the steps. Component testing is usually performed through API testing. In many ways a z/OS transaction might relate to a component, or it might be a series of transactions. Using a capability to test the transactions once deployed into the system is the next level of automated testing.
Applications
Application is a term used to represent a set of capabilities that are scoped together to provide value to the organization. Sometimes applications are defined based on organizational boundaries, sometimes they are clearly independent of other capabilities. Applications usually stand on their own, with their own access to data. However, sometimes applications become intertwined.
Application is a term used to mean a variety of different things. Git is considered an application. GitLab, though it has more capabilities than just base Git, as it provides additional server functions, is also considered an application. SAP is many times called an application, though it has many diverse capabilities.
With application testing, it's important to consider that the definition of "application" varies; understanding what represents an application within the system, to provide a set of boundaries for testing, can be a useful exercise. It could be that beyond component testing (Figure 22), the next logical type of test is system or end-to-end testing.
Figure 22
End-to-End Testing
End-to-end testing is defined as testing all of the components of the application required to complete a transaction, from the starting event until complete. For example, with a financial institution, the user has the ability to deposit a check electronically from their phone and have the correct currency amount deposited in the appropriate account. Or a retail example: a user selecting an item to order from a web page, adding it to their cart, completing the purchase, paying for the item, having the money requested from the payment system, and having the ordered item shipped. These are two examples of end-to-end processes. When doing end-to-end testing, organizations select the most critical end-to-end functions for their business and those are tested.
This end-to-end testing is where many organizations have focused their main testing efforts. By doing more early testing, when code arrives into the end-to-end test environment, the focus can be on the true end-to-end scenarios and integration points.
Key Insight: End-to-End Testing
Other types of testing will not remove the requirement for end-to-end testing, but the time it takes to do end-to-end testing and the types of problems found should be significantly reduced.
Types of Testing
Within each scope there can be different types of testing:
Functional verification testing: Verification of specific defined capabilities being added.
API testing: Testing against a defined API; it could be any API, though today APIs are often thought of as REST APIs. With defined APIs, tests are easier to create programmatically and to maintain.
Sniff testing, smoke testing: Verification of the environment and the basic functioning of the system with a quick high-level test.
Integration/system testing: Verification of the interrelationships of the various applications that make up the system.
User testing: Verification of the UI to ensure it performs the functions as expected in a way that is easy to use.
Regression testing: Testing to verify prior capabilities continue to function as they did before a change; regression testing can be at any level of testing, generally defined at unit, function and system.
Performance/scalability testing: Verifying the system can perform within specified agreements, measuring the system resources required for the function at a particular throughput, and understanding the resource growth requirements as the throughput or load increases.
Infrastructure testing: Verification of the underlying capabilities required to run the application, including hardware, network, operating system and middleware. Even when using cloud deployments, verifying the capabilities are connected appropriately and running is still important.
Chaos or fault injection testing: Injecting faults or errors randomly to ensure a system continues to perform within tolerance, even with unpredictable failures.
Production verification test: Verification of the solution running in production after a deployment update, to ensure the update was successful and no parts were left out or misconfigured.
A/B or market testing: Providing alternative solutions to the end users to understand which implementation or design is better received.
Test Driven or Behavior Driven (TDD/BDD) testing: Building the test before the implementation of any function to allow for the verification of the function as it is built.
Functional Verification Testing
Functional verification testing provides specific checks for functions being added to a system. These tests are focused on individual functions within a capability to ensure it performs as specified. Functional verification should not only test the successful path, but should also verify the specified error paths. Functional verification tests are also created to check that the fix for any defect that is found resolves the problem. Functional verification tests are generally added to a functional verification regression bucket.
Functional verification tests should be created by a team member that did not create the code for the solution. These tests should be created based on the same requirements for the implementation, at the same time as the implementation, so they can be run whenever the function is ready.
Functional verification testing is likely the first type of testing in the pipeline after the changes are deployed into a running system. Verifying that the function performs as specified in the requirements, by someone other than the developer, can provide the first check and feedback on the function itself vs. the implementation. This functional verification should be done in an isolated environment, as stubbed out as possible, to allow for testing only the function without all the other interconnected parts. Functional verification should be done on the smallest amount of data possible that tests the scenarios, including appropriate errors. This is a great place for fabricated data that can be stored with the tests themselves so they can be rerun easily and reliably.
API Testing
API testing is the process of testing against the specific API that has been defined. When people today refer to an API, they often actually mean a REST API. REST APIs are one form of API, but there are many other forms that can also be good targets for testing. API tests have a defined interface and a defined response. This programmatic definition makes it much easier to write automation to call a program with various inputs and then validate the response.
API testing may be one of the easiest to automate as well as the most useful beyond unit testing. Testing via APIs, especially when those APIs are used across components or applications, not only tests the capabilities but also verifies that the callers will receive the correct response.
With API testing it is as important to test the defined requests, as it is to test with errors and malformed requests, in order to simulate possible problems in the environment.
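A minimal sketch of this kind of test, using Java's built-in HTTP client, might look like the following. The endpoint, payloads and expected status codes are illustrative assumptions about a hypothetical mortgage service, not a real contract.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// A sketch of contract-based API testing: one well-formed request and one
// malformed request, each checked against the response the contract promises.
public class MortgageApiCheck {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        URI endpoint = URI.create("https://test.example.com/api/mortgages");

        // Well-formed request: the contract promises a 201.
        HttpResponse<String> ok = client.send(post(endpoint,
            "{\"principal\": 250000, \"termYears\": 15}"),
            HttpResponse.BodyHandlers.ofString());
        assertStatus(201, ok.statusCode());

        // Malformed request: the contract promises a 400, never a 500.
        HttpResponse<String> bad = client.send(post(endpoint,
            "{\"principal\": \"not-a-number\"}"),
            HttpResponse.BodyHandlers.ofString());
        assertStatus(400, bad.statusCode());
    }

    private static HttpRequest post(URI uri, String json) {
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    private static void assertStatus(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}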
With APIs it is also possible to create virtual services, based on API boundaries, to allow other parts of applications or external applications to code to the virtual service, without needing the real service. One of the many goals of microservices is to provide this way of testing to validate each individual part separately.
API tests can be created by the designers of the API specification, but should not be built by the developer of the API itself. These tests are created based on the API specification, not the code itself.
API tests are a critical part of the process for large complex systems because these systems generally interact with the APIs. By providing both stubs and API testing, the teams can work independently longer.
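A virtual service can be as simple as a stub that honors the contract with canned responses. The sketch below uses the JDK's built-in HTTP server; the path and payload are the same hypothetical mortgage contract assumed above.

import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// A sketch of a virtual service: canned responses that honor the API
// contract, so calling teams can develop and test before the real service exists.
public class MortgageServiceStub {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/api/mortgages", exchange -> {
            byte[] body = "{\"quoteId\": \"STUB-1\", \"rate\": 4.25}"
                .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(201, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start(); // callers point at http://localhost:8080 instead of the real service
    }
}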
Key Insight: API Testing
API testing is one of the most efficient types of tests to create, as they are based on a defined contract. Creating both the API test, as well as a stub for calling applications, allows for greater independence for development activities while providing improved quality. Since API tests are based on the defined contract for the service, this should be a stable test, and when providing forward and backward compatibility in APIs, the existing tests should continue to run.
Sniff Testing, Smoke Testing
Sniff testing, sometimes referred to as smoke testing, is a type of testing that uses a set of quick tests to verify the application is in a running state and that all key components are functional. The goal of the sniff test is not to verify all the capabilities or the functions, but to validate that the system is fundamentally working. Sniff tests should be built so they can be run in any environment to validate it is functioning, such as when a new environment is spun up for testing, or even a new production environment, or an update to a production environment. Sniff testing can be used when a disaster recovery site comes online to validate all the parts are running and functioning.
Sniff tests vary in design and complexity. Sometimes they are simply a verification that all required services, components, databases and middleware components are running, and that the network connecting them is flowing traffic. Other times tests, such as the 99 pencil test, have been created to test how the entire process works. Sniff tests need to be fast to identify issues with an environment before any processing or testing is performed.
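At the simple end of that range, a sniff test can be little more than fast reachability checks against the key components. The following Java sketch assumes illustrative host names and ports; a real sniff test would check your own topology and likely a few application-level health endpoints as well.

import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Map;

// A sketch of a sniff test: verify the key components of an environment are
// reachable before any functional testing starts.
public class SniffTest {
    public static void main(String[] args) {
        Map<String, Integer> endpoints = Map.of(
            "app-server.test.example.com", 443,
            "database.test.example.com", 5432,
            "mq-broker.test.example.com", 1414);

        boolean healthy = true;
        for (var entry : endpoints.entrySet()) {
            try (Socket socket = new Socket()) {
                // A short timeout keeps the sniff test fast, which is the point.
                socket.connect(new InetSocketAddress(entry.getKey(), entry.getValue()), 2000);
                System.out.println("OK   " + entry.getKey() + ":" + entry.getValue());
            } catch (Exception e) {
                System.out.println("FAIL " + entry.getKey() + ":" + entry.getValue());
                healthy = false;
            }
        }
        System.exit(healthy ? 0 : 1); // non-zero stops the pipeline before deeper tests
    }
}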
Story: The 99 pencil test
One client example involves ordering 99 pencils. This organization set up a process to order 99 pencils; however, it would not actually ship or bill the credit card. It would fundamentally test the entire process from the very beginning of requesting the 99 pencils, creating a shopping cart, doing all of the necessary steps, checking inventory, even printing a shipping label; however, in the final shipping stage the order of 99 pencils would not ship nor would the credit card get billed. In this particular case there were simple checks done at the shipping and billing portion to ensure that the 99 pencils order would not be fulfilled at this end stage, but it could go through the entire rest of the process.
By having this setup, they could do integration tests in any environment, as well as run the same test in production. This test could be used for scalability and performance, and also monitor performance in the production environment.
Not all organizations can do something like this; however, figuring out what kind of test can be performed, even in production, for verification is critical. Having a way to do a simple end-to-end process check in the production environment helps ensure that all of the necessary integrated pieces are actually available.
Integration/System Testing
Integration testing is a phrase used by many to mean many different things, so let's start with some of what people consider integration testing. Integration testing could be anything from the smallest integration of two programs to the entire system of an organization; anything larger than a single program could be considered integration testing. The key aspect of integration testing is that you are literally testing the connections between the programs, for the impacts one program, or a change to one program, has on the other programs that call it or that it calls. I think it's very important to recognize that there are various levels of integration testing and that you need to focus on those levels that provide the most value and the most feedback.
If integration testing is testing the connections between the programs then what is system testing? System testing is generally used for testing an entire system to provide business value. Sometimes this is actually all of the applications that make up a corporate environment. A system is a set of components necessary to deliver some function.
When we get larger than the unit and outside the scope of just the developer's work, we get into the next phases of testing. These phases could be described as function testing, integration testing or system testing; all of these terms tend to be used interchangeably. However, I want to define them in a way that provides some clarity and some focus on the possible differences.
Function testing
Function testing is the process of testing an individual capability that is being added or changed in the business system. Function testing focuses on that particular capability, based on the business requirements, to ensure that the requirements are satisfied. This is usually larger than a program, but smaller than the full application or system. If we think about it in terms of a microservice, it might actually be a set of microservices required to provide a particular function.
A function could be something such as adding the ability to provide 10-year mortgages when the system currently provides only 15-year and 30-year mortgages. Or a function could be adding the ability to specify a longer middle name in the system. Both of these could be described as functions being added, but as you can see they are at very different scales. This is one of the problems with testing, and the traditional ways of talking about testing, because different people interpret the words in different ways. The problem gets even bigger as we think about integration or system testing, because the terms are generally used interchangeably, though they could mean very different things.
Integration testing/system testing
The key purpose of integration testing is to test how things come together, but at what level? It could mean two programs calling each other, a set of microservices, or how applications talk to each other. When we think of system testing, we more commonly think of the entire environment coming together, and the term generally refers to ensuring that all the different parts of the different applications can work together to provide the overall business value.
For the purposes of our discussion, I’m going to focus on integration testing as any testing whose purpose is to test the integration between different parts of an application, and system testing to be the focus of testing across applications.
The question is then what’s the difference between integration testing and API testing. For this it really depends on how you see API testing, as was described previously, many people really see API testing as testing at the REST interface level, not at any API level.
User Testing
User experience testing
One critical area of testing is UI testing. Developers don't necessarily make the best user experience people, so making sure you have the right user experience test for the UI is a critical part of the process. This is one place where manual testing, actually having a user use the function, is absolutely the only way to perform this kind of testing. Yes, you can do automated UI testing to validate that the UI continues to perform as it has in the past, but this is not experience testing. Actually having users experience the capability is critical to ensure that you have the most user-friendly interface, as well as the most usable interface for your end clients.
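To make the distinction concrete, here is a minimal sketch of the automated kind of UI check, using Selenium as one common tool (the book does not prescribe one); the URL, element ids, and page title are hypothetical. A script like this can confirm the UI still behaves as it did, but only a person can judge the experience.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

// Automated UI regression sketch: verifies the login flow still works.
public class LoginUiRegression {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://example.internal/login");   // hypothetical test URL
            driver.findElement(By.id("username")).sendKeys("test-user");
            driver.findElement(By.id("password")).sendKeys("test-password");
            driver.findElement(By.id("submit")).click();

            // The regression assertion: the same navigation still lands on the
            // same page. A human, not this script, judges the experience.
            if (!driver.getTitle().contains("Account Overview")) {
                throw new AssertionError("Login flow no longer reaches the overview page");
            }
        } finally {
            driver.quit();
        }
    }
}
```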
User experience testing can happen at all phases of the development process and does not actually require the code to be complete. Using wireframe drawings to make sure you start in the right place, and doing early playbacks to make sure the experience is being designed around the actual end user, will give you that feedback as early as possible and allow you to build the most successful interface.
It is important to recognize that as design software has gained more capabilities, drawings can now closely resemble the finished UI. Wireframes that do not resemble the actual interface, but provide the basic flow and idea, are a better way to start. Users who see more realistic wireframes are less likely to focus on the flow and function, and more likely to focus on the minute details.
Key Insight: User Experience Testing
User experience testing is one type of testing that has to be done in a manual way, and it should be done with individuals as close to the actual end users as possible. This testing can start with wireframes and does not even require a line of code. Getting the right user experience improves the perception of quality of the system.
Freeform testing
Freeform testing is the kind of testing that is hardest to scope and hardest to size, but it is a very important part of your manual testing. In fact, freeform testing is probably the best manual testing you can have, because it is not done in a standard way, it is not something that can be automated, and it is likely to find those really weird cases or things you never expected someone to do.
The key part of freeform testing is to have someone unrelated to the area do the testing, so that they aren't biased by the design or pre-work. The goal of freeform testing is to do those things that are not expected, such as attempting to break the system. One area of focus is trying to corrupt the data by inserting unexpected input. Another is additional security testing, for example, causing loops or hangs to drive resource utilization and interfere with normal operations. This is highly related to the malicious testing described below. Sometimes architects or business-related users make the best freeform testers. There are also some individuals who can be very good freeform and malicious testers, and they can be used to validate that the application will hold up even with unusual use cases. Freeform testing does not focus on the areas covered by automated tests, but instead on the variability that users can cause.
Malicious testing
Another type of testing is malicious testing, where you have users working to break the system, to find holes to break through, and to break the product. Malicious testing is another area that ends up being manual, and it should be performed on the system whenever possible. Malicious testing is not usually done when the code is only partially complete, as it would be too easy to find holes; it needs to be done when a function is believed to be complete, so that all the right error checking is already in the system.
Malicious testing is a critical part of the process, and it is often associated with security testing, penetration testing, and other types of testing such as chaos testing, all meant to make sure the system can withstand whatever is going to happen to it.
User acceptance testing
One of the final phases of testing is generally user acceptance testing. In this phase, actual users or user representatives test the system to ensure it satisfies the fundamental requirements and is usable. User acceptance testing generally happens right before the system goes to production, to ensure that in its final form it will satisfy the requirements and perform in a way that is acceptable.
User acceptance testing is, for some systems, the reason why code is not automatically delivered into production, but must first wait in a user acceptance space to give the users time to validate it before it moves forward.
It is important to recognize that there are many different types of tests, and some of them will require final validation before going to production. These final validation tests don't generally get run early in the process, as they are only required for final production deployment. That doesn't mean you aren't checking with the users early on to make sure the function works as expected, but there is a final verification stage that can be required for many types of functions.
Story
One fun example of freeform or malicious testing comes from when I was working in our system test organization. I was responsible for building up customer-like tests for a set of our products. This set of customer-like integration testing provided a great first test of how our products would be installed and work together in a customer environment.
As part of this testing, I would do upgrades and fresh installs, and run through many different scenarios in a series of freeform experiments. When getting ready for a particular release, the organization had completed the development and all of the standard testing, and the question was being asked whether or not the product was ready to ship. It was already delayed, and there were concerns about not getting the function out, so if tests were completing successfully there was confidence that the code was really ready. However, my testing had not been so successful; I had run into a number of different problems in the integration and in the upgrade process. The system I was using was one I had built over the years, so it had gone through many different upgrades and was in some ways somewhat corrupted. However, this system represented what many customers who had been using the products for many years might have at this point in the release cycle.
I should have been able to do a successful upgrade; however, I could not. In reviewing the ship readiness, it became clear that the only thing holding up the release was my upgrade test, and since my system was not clean, it was initially determined that it would be treated as an exception and my problems closed. However, the executives responsible for the products understood that this product had been in the field for a long time and that we had many customers who were highly dependent on the monitoring system. So my test became the final test, and it had to pass before the product could be released.
Now, I was lucky. I had set up the system a long time ago, but I had used VMware images and snapshots along the way, so I had the ability to restart this test at any time based on whatever release was put out. This allowed the development team to take a copy and completely debug and understand what was really going on and what in the system was causing the failure.
There were two key factors in successfully testing this system. The first was that we could easily restore the system and re-create the problem: we could always go back, and we could always ensure we could move forward for each release. The system always ran on the network and therefore had a continuous set of activity and a continuous set of data. This made it a perfect representative environment as well as a very good test system. The second key factor was that the system had been used in different ways based on how our customers actually set up their systems, and it had configurations done in the past to test various customer scenarios and various customer problems. Having a system that really represented the running environment made it much easier to understand all of the upgrade and configuration scenarios.
Looking at the prior example, we need to recognize the importance of the two key factors. The first is the ability to re-create a system with all its errors and all its faults; it is important to be able to test in a realistic environment, so that malicious or freeform testing can help you find problems before you're in production. The second is having data that has gone through various upgrades and various stages and really represents a real running system. Both of these factors need to be considered as we build up later-stage testing.
Regression Testing
Regression testing is a type of testing that can apply at any scope. The goal of regression testing is to validate that a function that was already working continues to work as it did. The questions with regression testing are how much to test, which tests to run, and in which order to run the tests to get the most value and find errors the fastest. One important aspect of regression testing is that the regression bucket itself must be maintained along with all other code and tests; as the applications change, the expected results may change. With a regression bucket, when something fails, the first question should be: is that a valid failure? For example, did the change that was just made change the results such that the regression test should fail?
Regression buckets are often created by simply adding every new automated test into the bucket to be run. On the surface this might seem like a good idea; however, if tests are continuously added, the regression bucket becomes large and difficult to manage. Building regression buckets based on the scope they apply to can help manage the tests.
The first level of regression testing is the unit level; generally, all unit tests are added as future regression tests for the program. But when running these tests, only the programs that were changed would have their unit tests run as part of the build, thereby keeping build times short.
When considering regression buckets at the component and application level, the tests that are quick to run but provide comprehensive coverage should be selected.
When building the end-to-end regression test bucket, the most critical business transactions, the areas with the most problems, and those that are frequently changed should be included. Another way to select the tests is to identify the tests related to the code changes and run those first, to get the feedback as quickly as possible.
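A minimal sketch of that last idea follows, assuming a mapping from programs to the tests that cover them is maintained alongside the code; the program and test names are hypothetical. Tests touching changed programs run first, so a failure related to the change surfaces as early as possible.

```java
import java.util.*;

// Sketch: order a regression bucket so change-related tests run first.
public class RegressionSelector {

    // e.g. "PAYCALC" -> ["payroll-smoke", "payroll-edge-cases"]
    private final Map<String, List<String>> testsByProgram;

    public RegressionSelector(Map<String, List<String>> testsByProgram) {
        this.testsByProgram = testsByProgram;
    }

    /** All tests, with those covering changed programs moved to the front. */
    public List<String> prioritize(Set<String> changedPrograms) {
        LinkedHashSet<String> ordered = new LinkedHashSet<>();
        // First: tests for programs that changed in this build.
        for (String program : changedPrograms) {
            ordered.addAll(testsByProgram.getOrDefault(program, List.of()));
        }
        // Then: the rest of the bucket, for full regression coverage.
        testsByProgram.values().forEach(ordered::addAll);
        return new ArrayList<>(ordered);
    }
}
```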
Performance/Scalability Testing
Performance testing could represent many different things. It could represent the CPU utilization of an application, the time it takes to complete a transaction, or the time it takes to complete a transaction when the system is under heavy load. It could be measuring whether transactions perform consistently under various loads, or it could be about understanding how to scale a system up or out, depending on the load. We can look at performance from these various aspects as we try to understand the application.
Running performance tests on individual components within the system can provide a view into what each part takes; running a performance test against the entire system gives a view into the total transaction time and cost. When we look at performance, we may be looking to optimize the actual processing time, or we may be looking to optimize the system utilization while it's processing.
What is scalability testing and why does it matter? First, are you testing how many users can interact with the system, how much processing can go through the system, or a combination of those? The scalability of an application varies based on its different parts, as well as on how those parts have been implemented. When we think of the scalability of an application, we need to take all of the parts of the system into consideration. If the back-end system can handle any number of transactions, but the mid-tier has a limit on how many it can process, it doesn't matter that the back-end could handle more; the work won't be sent through. The same is true from the front-end to the back-end: if the front-end can't scale out to the number of clients coming in, it doesn't matter what the back-end can handle, because the data can't get through.
Looking at the entire system to understand how it scales, how it can scale, and how you can improve that scalability when you need to is important. The purpose of scalability testing is to understand what it takes to allow the system to scale. Does it scale linearly? Do I need to add a consistent set of resources or memory for every additional user? Do I need to add capacity when I hit certain boundaries? Are there limitations that require me to deploy a second system, for example, or can my system simply scale up with additional resources to handle the situation? And does the environment itself handle this scaling?
With scalability it’s also important to think about availability and reliability in providing a cluster, for example, to allow scaling out we could also be providing additional reliability and availability characteristics by removing single points of failure, but we are at the same time increasing the complexity of the system. Deg the system to be the simplest possible, while still providing the right availability reliability and scalability, should be the goal of the scalability testing.
So, how do you test for scalability? Can you run a set of dummy transactions into the system to simulate an environment? Can you scale to the actual production load? Can you generate the thousands of transactions you might receive? The inability to scale to full production volume often keeps organizations from doing scalability testing, because they can't scale that high. But just because you can't get to production scale doesn't mean you can't test different loads on the system to understand their effects.
By measuring the system under different loads and understanding how each increase in load affects it, you can build models to understand and extrapolate the overall performance. This is done by monitoring different-sized loads on the system to understand how the system scales; it is important to understand how each part scales when trying to extrapolate. With z/OS, the system can provide detailed information on the utilization of system resources related to the running application, as well as the performance of the application itself. This is only one part of the overall complex system; the network connecting to any other part of the system, as well as all the other components that make up the system, must also be considered.
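As a minimal sketch of this kind of extrapolation, assuming roughly linear scaling over the measured range (the sample figures are illustrative, not from any client): fit the measured loads, then predict resource use at a load the test environment cannot reach. Where new measurements diverge from the prediction is exactly where the system stops scaling linearly.

```java
// Sketch: build a simple model from measured loads and extrapolate.
public class ScalingModel {

    /** Least-squares slope/intercept for resource use as a function of load. */
    static double[] fit(double[] load, double[] usage) {
        int n = load.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += load[i]; sy += usage[i];
            sxx += load[i] * load[i]; sxy += load[i] * usage[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        return new double[] { slope, (sy - slope * sx) / n };
    }

    public static void main(String[] args) {
        double[] txPerSec = { 100, 200, 400, 800 };    // measured loads
        double[] cpuPct   = { 8, 15, 29, 58 };         // measured CPU at each load
        double[] model = fit(txPerSec, cpuPct);
        // Extrapolate to a production-sized load the test system cannot reach.
        double predicted = model[0] * 2000 + model[1];
        System.out.printf("Predicted CPU at 2000 tx/sec: %.1f%%%n", predicted);
    }
}
```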
The other thing one can do is measure scalability and performance in production. This does not mean you're testing in production; it means you are taking the appropriate measurements in the production system so that, based on the production load, you can understand how the system performs and compare those measurements against the measurements taken with smaller loads. These measurement comparisons help allow the models to be extrapolated if, or when, the system grows even more.
Key Insights: Performance Testing
Doing performance testing as early as possible in the pipeline is important to identify problems with code design. This does not require a full performance test, but instead testing the individual parts to ensure each area does not in itself indicate a problem.
Doing full performance testing of the end-to-end system is also required, with a full system and data load, to identify problems before going to production. If this can be at the full size of the production environment, that is best; at a minimum it needs to be representative.
The performance testing can help define the monitoring thresholds to be set in production as an early warning system.
Infrastructure Testing
One area that we don’t normally think of for testing is infrastructure. The infrastructure testing process itself involves creating the infrastructure in order to run the application. Having a set of verification tests to ensure connectivity and security are all in place, so it’s not infrastructure problems that are causing application problems.
I can recall many times working with clients dealing with issues where, when there is a problem in the environment, the first answer is "it's the network." Many times, when there's a bridge call for a major outage, the first suspect is the network, because it sits between everything, and if the network isn't working, nothing else is going to work. The other area commonly blamed is security and/or certificates; this is because certificate management, along with security, is critical in today's systems to ensure they are properly controlled and managed.
However, if certificates expire, or if there's a problem with a certificate, it can cause various problems that may be hard to identify in the middle of an application. Having verification built in to test the network, the appropriate security and certificates, and the availability of any middleware or connected system makes it easier to rule out those kinds of problems. As we move to infrastructure as code, this verification should be built into the process along with the process for creating the infrastructure.
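A minimal sketch of one such verification test follows: connect to an endpoint, confirm the TLS handshake succeeds, and warn if the certificate is close to expiring. The host name and the 30-day warning window are assumptions; a real verification suite would run checks like this against every middleware and connected system in the environment.

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.util.Date;

// Infrastructure verification sketch: TLS connectivity and certificate health.
public class CertificateCheck {
    public static void main(String[] args) throws Exception {
        String host = "middleware.example.internal";    // hypothetical endpoint
        int port = 443;
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake();                    // fails fast on TLS problems
            for (Certificate cert : socket.getSession().getPeerCertificates()) {
                if (cert instanceof X509Certificate x509) {
                    x509.checkValidity();               // throws if already expired
                    Date soon = new Date(System.currentTimeMillis()
                            + Duration.ofDays(30).toMillis());
                    if (x509.getNotAfter().before(soon)) {
                        System.out.println("WARNING: certificate expires "
                                + x509.getNotAfter() + " for "
                                + x509.getSubjectX500Principal());
                    }
                }
            }
        }
    }
}
```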
Chaos or Fault Injection Testing
One area that you might not consider traditional testing is chaos engineering and/or fault injection. Chaos engineering drives an understanding of the reliability of the system overall, which is a form of testing. Chaos engineering or fault injection helps drive an understanding of how the system will perform when significant errors occur; this is another example of non-happy-path testing. It is just as important to test for errors and problems, as these are the places where we most need visibility into how the system will perform.
Building software to do the desired thing is much easier than building software to do the desired thing while also considering all of the possible error cases and error situations. But this is absolutely critical to understanding how the system will truly perform. It is important to note that there are places where you may not want to put a tool called Chaos Monkey into a production environment, but you still need to understand and build for appropriate failures.
Long before the idea of Chaos Monkey, many organizations were testing for faults by failing parts of the system to ensure the remaining system stayed functional and the failed parts were restored appropriately, or generated the appropriate alerts and messages to indicate a problem quickly.
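As a minimal sketch of that long-standing practice, assuming a hypothetical dependency wrapped behind a Supplier, the class below fails a configurable fraction of calls so you can verify that the rest of the system degrades gracefully and raises the right alerts.

```java
import java.util.Random;
import java.util.function.Supplier;

// Fault-injection sketch: wrap a dependency so some calls fail on purpose.
public class FaultInjector<T> {

    private final Supplier<T> realCall;     // the dependency being exercised
    private final double failureRate;       // e.g. 0.05 = fail 5% of calls
    private final Random random = new Random();

    public FaultInjector(Supplier<T> realCall, double failureRate) {
        this.realCall = realCall;
        this.failureRate = failureRate;
    }

    public T call() {
        if (random.nextDouble() < failureRate) {
            // Simulate the dependency being down; callers must handle this.
            throw new RuntimeException("Injected fault: dependency unavailable");
        }
        return realCall.get();
    }
}

// Usage sketch: wrap an inventory lookup and confirm order processing still
// responds (with a fallback or a clean error) instead of hanging or corrupting data.
//   FaultInjector<Integer> inventory =
//       new FaultInjector<>(() -> lookupStock("PENCIL"), 0.05);
```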
Introducing chaos can help you better understand and deal with the areas where you could possibly have a failure. Most people deal with Chaos Monkey when they're dealing with many different services, containers, microservices, and so on. Having parts of the system totally fail because they weren't built on reliable hardware means one has to deal with failures of the hardware as well as failures of the software and of the system. One way to address some of this is to use reliable hardware, such as the IBM Z system, which has a reliability characteristic of seven 9s. However, even with an IBM Z system you would not want a single point of failure; all systems can have problems, so ensuring you can deal with those errors is critical. Having no single point of failure is a good design principle for any system.
Additional mainframe background:
The IBM Z system has been designed from the hardware up to ensure this high level of reliability, starting with extra processors at the chip level so that if there are failures, there is something to take over. Having the hardware itself be reliable gives the system one additional level of capability; on top of that, the operating system has been designed, if you take advantage of it, to provide additional operating-system-level transaction reliability.
For example, a sysplex environment provides high availability in the sense that you can take down parts of the system while other parts are still up, and traffic will be routed appropriately. The system handles and ensures the capacity and capabilities required to complete your application transactions. This, along with the workload manager, helps ensure that the critical work is always performed first, and that each transaction performs with the same speed and the same reliability each time. On top of the sysplex, a database can be part of a data sharing group, ensuring that with multiple Z systems, even if you had a hardware failure on one system, which is highly unlikely, you would still have a system that continues to respond at expected levels. Combined with transactional integrity, which ensures a transaction either completes fully or is rolled back, this gives you a set of capabilities that the applications can assume, instead of having to build all of this extra work into the applications themselves.
By having this reliability built into the operating system and middleware, every application is relieved of the need to implement it on its own, and the overall software quality improves, because the capability provided by the system has not only been tested, but proven through years of production use within customer environments.
There are also some additional characteristics of the IBM Z hardware that are important to recognize from the reliability and security aspects. With its built-in encryption capability, you have the ability to turn on pervasive encryption; this means the applications have less work to do, or no changes to make, in order to have all of the data encrypted, instead of each application having to write code to handle the encryption. This increases software quality by removing the need for that additional development code; the system provides it instead.
Production Verification Test
How do we test in production to ensure that a change was successful? What do I mean by testing in production? I don't mean testing for the first time to see if the code does what is expected. I mean that after the code has gone through all its initial tests, you make sure that the deployment you just completed actually deployed into the production environment and provides the same capabilities that were tested in earlier environments. The production verification test is critical to ensure that the changes have been successfully deployed and integrated into the system.
Creating these production verification tests can be rather complex: you need to ensure you're not executing a transaction you can't actually perform, while still making sure you can exercise the system. For example, I don't believe you can have a fake bank to test transfers in and out of, and I don't know any users who would want their bank being used as a testbed for transfers in and out. It's critical to understand what kinds of functions can be tested or verified, and what kinds of activities can be run in production as a test.
Monitoring the actual performance of transactions through the system, to measure the overall speed, the individual segment speeds, and the system utilization for each part of the activity, is also necessary. This is often done through application performance monitoring, which watches the system and the transactions as they flow through, to help you identify where bottlenecks and slowdowns might be and to understand what a full transaction takes to process. Application performance monitoring in production can give you early views of problems, as well as a set of data for comparing how transactions perform relative to environments outside production, to allow for proper performance planning.
A/B or Market Testing
A/B or market testing is a way of using actual end users to determine the best solution: deploying a function to a specific market to test its viability, or deploying multiple different ways of performing an action to different users to see how they react. This is done by deploying the capabilities into production, but only exposing them to a subset of the users.
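A minimal sketch of the exposure mechanism follows; the feature name and the 10% rollout figure are illustrative, not from the book. Bucketing on a hash of the user id keeps each user's experience stable across visits, so the two groups can be compared cleanly.

```java
// Feature-flag sketch: expose a capability to a deterministic subset of users.
public class FeatureFlag {

    private final String featureName;
    private final int rolloutPercent;   // 0..100

    public FeatureFlag(String featureName, int rolloutPercent) {
        this.featureName = featureName;
        this.rolloutPercent = rolloutPercent;
    }

    /** Deterministically assigns a user to the A or B experience. */
    public boolean isEnabledFor(String userId) {
        int bucket = Math.floorMod((featureName + userId).hashCode(), 100);
        return bucket < rolloutPercent;
    }
}

// Usage sketch: show the new quote flow to 10% of users and capture telemetry
// (flow taken, completion rate) for both groups, never personal information.
//   FeatureFlag newQuoteFlow = new FeatureFlag("quote-flow-v2", 10);
//   boolean useNewFlow = newQuoteFlow.isEnabledFor(sessionUserId);
```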
Story
For example, an insurance organization used this kind of capability to see if changes could increase the quote/bind percentage. The fast feedback allowed for adjustment of the interface, or of the back-end product function, to improve online sales.
This kind of end-user testing is done after all other testing has been completed and the function has passed whatever quality measures are in place. Often, when people refer to testing in production, this is the type of test they are referring to.
The important aspect of this type of testing is to make sure there are measurements in place, or telemetry from the system being captured, to understand what the actual feedback is. It's a clear way to get feedback without assuming users are going to complete a survey or create a problem record. While capturing the telemetry, it's important that you capture only generic usage and experience data, what people are using, what flow they are taking, and not any personally identifiable information. This gets you the information without the extra burdens of processing personal data.
Test Driven Development (TDD)/Behavior Driven Development (BDD) Testing
The focus so far has mostly been on some of the traditional forms of testing, but there are also test concepts such as TDD and BDD. These focus on the idea of building the test first, or describing the behavior first, and then building the actual function. Making sure you know what you are going to build, and being able to test for it before you start coding, brings a different perspective to the process.
TDD/BDD are found when you’re building something new or relatively new, not generally in existing large-scale systems. However, the concepts of TDD/BDD are important even for what one might call the legacy systems because they drive a different way of thinking and ensure you have a clear understanding of the desired outcome. TDD/BDD are a way of shifting your focus to testing first.
TDD is a software development process in which a test is written first, and only when it fails do you write code, or refactor existing code, to make it pass. The focus is on the smallest code change that makes the failing test pass. More failing tests are added, and made to pass, until the function is complete. TDD is based on using a unit test framework that allows a test to be created and to fail before the code is built.
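A minimal sketch of one such red-green cycle, using JUnit 5 and reusing the hypothetical 10-year mortgage example from earlier (the class and method names are assumptions, not from the book):

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Step 1: write this test first and watch it fail, because MortgageTerms
// does not yet know about 10-year mortgages.
class MortgageTermsTest {
    @Test
    void tenYearTermIsOffered() {
        assertEquals(120, MortgageTerms.months("10-year"));
    }
}

// Step 2: make only the smallest change that turns the failing test green.
class MortgageTerms {
    static int months(String term) {
        switch (term) {
            case "10-year": return 120;   // added to make the new test pass
            case "15-year": return 180;
            case "30-year": return 360;
            default: throw new IllegalArgumentException("Unknown term: " + term);
        }
    }
}
// Step 3: refactor if needed, keeping all tests green, then add the next
// failing test until the function is complete.
```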
BDD builds on the concepts of TDD, but extends them to focus on the story from the outside in, implementing the behaviors most aligned to the business requirements and describing those behaviors in a way that is clearly understood by the domain experts as well as by the developers and testers. BDD is usually adopted after a team has adopted TDD practices.
These are ways of approaching the development activities themselves to help limit the code written to what is specifically required, and to ensure it's testable from the very beginning. Many teams find this too large a jump when moving from a more traditional waterfall approach. Requiring automated unit tests for each new addition, and developing the right set of automated tests for the existing function, may be a more realistic way to move toward improved quality. For teams with large existing codebases and limited existing automated tests, this move toward automation will be an easier first step.
Frameworks for testing
Testing frameworks provide an additional level of standardization across the various languages and application types. Unit testing frameworks help developers with the particular language they use, such as jUnit for Java or zUnit for COBOL and PL/I. Use of a framework makes it possible to test in a consistent way and have a consistent level of information provided, with each framework optimized for the language it is designed for.
Testing frameworks for the stages past unit testing can provide a consistent way of working across the various types of testing and types of applications. Frameworks such as galasa.dev, an open source testing framework, facilitate automated testing no matter the type of application or language, including large complex systems such as IBM Z. Using common frameworks for testing across the integrated application can help bring the teams together. Another advantage of the frameworks being developed today is that they assume everything is source code, so they are designed around the assumption of "testware" being managed alongside the application.
Summary
There are various types of testing that all come together to provide the full solution. With large complex systems including IBM Z, the ability to do unit testing to ensure each part of the code behaves as expected, including error conditions, bad data, etc., reduces the time required for the integration and end-to-end tests. Finding problems early in the cycle reduces cost and helps drive improved quality.
One key area to address is the need to change the mindset of developers, to help them understand the value of the automated unit tests. Since historically the role of test has been separate, especially with z/OS developers (but really with many developers of large complex systems), helping the developers understand the value and the reasons why they now do these extra steps will be critical to the transformation. Today they have to spend extra time trying to get data correct, or working to make sure others are not changing related items or data. By providing the developers isolated systems, allowing them to work independently, providing the right unit testing frameworks to simplify the work of testing in isolation, and effective ways of generating test data, the developers can work more effectively, doing more of what they enjoy. Getting all the groundwork in place to allow this transformation helps demonstrate the commitment to quality and facilitates the ability to actually do the early testing.
As mentioned earlier, having a stable quality signal as part of the pipeline is critical; managing the tests and the test data as you manage the rest of the code for the application helps maintain the ability to get fast feedback on code changes.
End-to-end testing will still be required, but by focusing on identifying problems earlier in the pipeline, the time required for end-to-end testing should be greatly reduced. Also, by automating the key set of business-priority flows, end-to-end testing can be performed effectively as part of the pipeline.
Section 4: Putting It All Together
Why include IBM Z in this change?
I get the question all the time. Why invest in this change? Why spend the energy and money on the large back-end systems that in some cases don’t change often? The developers working on the systems are comfortable with the processes today, and in general they provide the reliability and stability required.
These systems provide the backbone for the processing, with key business logic that has been developed over many years. Many of these systems provide the competitive advantage for companies moving into new opportunities with their digital transformations. The existing developers may be comfortable with the current ways of working, but new developers find the processes and the lack of modern tools a challenge.
Most importantly, the value of this new way of working, bringing automation to the full pipeline, removes the manual work currently being done, which allows those people involved to instead focus on business value. It streamlines the processes to reduce deployment errors and the time it takes to do those deployments, reducing or removing the need for the hours of bridge calls with skilled resources working through deployment weekends. Today, even though many of the parts of the system are not changing, large manual tests are performed to ensure the entire system works together. Having an automated set of verification tests reduces this effort and allows deployments more often.
In addition, with a modern pipeline and a base set of automation, it is now easier to make larger changes to those back-end systems, such as refactoring to expose services for easier consumption and removing unused code with confidence. This allows the organization to get the most out of the existing systems and easily update those systems to deliver business value.
Story
There are clients who are a long way into this journey, those that recognized that including IBM Z systems as part of the value streams would help ensure their ability to deliver at the speed required today. One such client, a banking client, has evolved through these new ways of working to the point that they update and build new capabilities as part of the system, currently deploying to production on z/OS 20 times a day; basically, as often as they need to deliver business value. This new way of working allows them to focus on building capabilities on the system that provides the qualities-of-service they need for the specific capabilities they are building.
Another story, from an insurance company, is about how they moved to modern pipelines across the entire organization. There had been a number of security problems within the overall industry, so the company required additional security scanning and security-by-design changes as part of the process. Since the entire organization had already moved to the pipeline, the additional security checks could be added to the pipeline for all systems within weeks, including the full z/OS development process. This had not been true the last time a new requirement was added to the development process; before moving to the pipeline, any change to the z/OS process had required months at a minimum, and closer to a year for all teams.
Key Insight: Cultural Change
It will seem like a very big change for traditional organizations that have been doing work in a similar fashion for a long time, but as the stories show, organizations that have invested in the transformation, including their IBM Z systems, are delivering business value quicker and with better quality than they ever thought possible.
Both of these clients had a mix of online and batch workload, and neither made application changes in order to move into the modern ways of working. Since moving into the new processes and practices, they have been able to more quickly change and adapt to provide new services and capabilities from the existing systems, as well as add new capabilities into the IBM Z environment to support their digital transformations.
Including everyone in the change also brings an additional level of consistency to the organization: the differences that provide no business value are removed, new skills can come to the platform more effectively, and the full value of the existing systems can be realized.
Cultural Change
It’s all about developing a culture of quality. As has been described, there are many different parts that go into the overall software quality, no one part alone will address the problem, but building a culture of quality will help ensure all the various parts do come together. A culture of quality will also empower individuals to want to drive toward new opportunities to improve and deliver on quality capabilities.
We have to remember that people model their behaviors on others they see; if the focus is on meeting a deadline above all else, quality suffers. It is, therefore, important to make sure quality is recognized, valued, and appreciated at all levels of the organization. When people who made improvements in the process are recognized, the result is an increased desire to achieve the highest quality possible.
As previously noted, people respond to metrics. Using the right metrics, in addition to the leadership support, can help drive the cultural change. The key is the right metrics at the right time within the organization, to drive the practices that improve overall quality. As described, metrics that help you understand the flow and allow the teams to remove waste from the system are important.
Changing to a culture of quality will take time and focus. A culture of quality is one of continuous improvement. The idea of continuous improvement applies to the system being built, as well as the system used to create it. The processes and practices of the organization need the flexibility to continuously improve and evolve.
From a culture and practice standpoint, the transition to product teams and the focus on the flow of value through the system are key ways to support the transition to a quality-focused organization. There are many great books and papers already written on this subject that I would recommend, such as Project to Product by Mik Kersten and Sooner Safer Happier by Jonathan Smart with Zsolt Berend, Myles Ogilvie, and Simon Rohrer.
Bringing the Parts Together
Using automation through a pipeline to bring the parts together, as well as to capture the metrics of the process, starts to lay the foundation for the change. Each part above described aspects of the process that help drive quality; bringing them together through the coordination of the pipeline helps drive the change in ways of working and improve quality. Testing is part of this. As outlined, there are many different types of testing, but let's be clear about how they relate, how they come together, how they should work together, and when and how each part fits.
The first part is the developer interpreting the requirement to begin to build a set of capabilities; this could start in the form of drawings, wireframes, or design, to get initial feedback from the representative user. With the right level of understanding, the coding begins; this is where code rules, security rules, scanning, and unit testing all fit into the first stage. Unit tests are then stored in the SCM and run any time the code is changed. This includes unit testing and the step just beyond unit testing: anything that can be done within the build process without deployment into an environment. This first build stage should be the first stage of the pipeline, but should also be easily runnable by the developer within their own environment.
Focusing on this very first step is important, as it is likely a very large change for the developers. Understanding that they will "slow down to speed up" is important. They each need individual time to learn the new ways of working, adjust to building small capabilities with automated testing, and integrate with others frequently.
After the build process, the output is stored in an artifact repository and can move on to later types and stages of testing. Each environment that code is deployed into should have a defined sniff test: verification that a portion, or all, of the environment works and that the deployment was successful. After this sniff test, automated tests of the appropriate type should be run. By using an automated pipeline, this cycle of provisioning, deployment, and testing should be run as long as there are no significant failures. Getting as much feedback as quickly as possible to the developer is the goal. The process could end with deploying into an environment where additional manual testing needs to be performed. As the automation is defined, the goal should be to reduce the manual testing as much as possible, unless it is actual user testing.
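As a minimal sketch of such a sniff test, assuming hypothetical health endpoints for the environment just deployed to, the check below fails the pipeline stage immediately if any endpoint is unhealthy, before any of the larger test suites run:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Post-deployment sniff test sketch: verify the environment before testing.
public class SniffTest {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        String[] checks = {                      // hypothetical endpoints
                "https://test-env.example.internal/health",
                "https://test-env.example.internal/orders/ping"
        };
        for (String url : checks) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .timeout(Duration.ofSeconds(10))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() != 200) {
                // A non-OK answer means the deployment, not the new code, is
                // the problem; stop before running the full test suite.
                throw new AssertionError(url + " returned " + response.statusCode());
            }
            System.out.println("OK: " + url);
        }
    }
}
```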
Reminder of the pipeline stages (Figure 23) where release is the actual step of releasing function into the production environment.
Figure 23
When first starting out, the pipeline might only deploy into a development environment where additional testing is performed; as automated tests are created, the pipeline can add an additional isolated environment in which to run those tests before deploying into the development environment. The additional isolated environment could be a spin-up of a ZD&T environment to which the application updates, tests, and test data are deployed in order to run the automated tests, or it could be an isolated environment running on real Z hardware.
The key is to allow for the isolated environment, with clean, defined data, so the automated tests can run without other users or interactions causing additional, unrelated errors. The pipeline needs to maintain the "stable quality signal" described earlier: failing tests should be due to the changes made, not to random failures in test cases.
One key aspect of this early automation effort should be to create regression tests, with their defined test data, that cover just enough of the critical flows for that application or component to give a clear indication that the system is generally OK. When it comes to large complex systems, this is likely not to be a full regression suite to start, but the minimum subset needed for some level of confidence.
One goal of this pipeline should be to get the function deployed into an environment where the actual end user, or end-user representative, can test the function to validate that it satisfies the actual business requirements. The faster this happens, the faster you get feedback on the validity of the function.
Even after automating all the steps that can be automated, and with an acceptable level of early testing, there is still the stage of end-to-end testing before release into production. Various components will be able to be updated independently, but with large complex systems it is important to ensure there is end-to-end verification testing of the various changes before they flow into production. Even if the parts can work independently and are loosely coupled to allow for independent deployment, in the end it's a system delivering business value; all the parts have to come together to deliver that value. The focus on early testing does not eliminate the need for this final test. This end-to-end test, however, will not have to deal with all the small individual component problems it once had to; instead, it should be a verification of end-to-end flows.
Story: Bringing it all together
When I look across the organizations that have taken this approach to transformation, automation, and shifting left on testing, they are not only delivering business value at the speed the business requires, but they have also removed the differences causing silos and a lack of understanding between the teams. They have also removed the barriers to bringing on new talent; those coming out of college who would traditionally have had no interest in working on the system now see it's not such a problem, and they can work on the critical backbone of the company, not just the front-end. Thinking of each of the customer types, no matter what the history of the company or how it evolved to the state it was in before starting, each has been able to make this leap to seeing the opportunities available through the transformation. By focusing on the similarities, and working to remove the differences where there is no value, they could bring this new quality focus to the entire organization.
Looking at one in particular, their executive stated the goal of making the z/OS team work exactly like the cloud development teams, with exceptions only where a difference would be of value to the company. They started down this journey after hearing my session at the DevOps Enterprise Summit in November 2018. The first proof of technology started at the beginning of 2019, and by September of that year they already had 9 application teams with their applications flowing in the new ways of working. They went from no pipeline and no automated unit testing to automated unit testing as part of the pipeline, with every stage of the process using the same process steps and same tools as the cloud developers. (The only tooling difference was the build tool, as they needed to be able to build COBOL. The only process difference is that they continue to deploy incrementally to z/OS, to retain the value of being able to deploy the smallest possible change without deploying the entire application.)
The important point was not just that they were now all working the same way, but that they had already reduced the development process time by 30%. As time has passed, they have added additional levels of automated testing and rolled the new processes and tools out to other teams, who are now asking to be onboarded to the new ways of working.
Getting Started
Throughout the book, starting from the beginning, I have described many different customer types, and customers from various industries. All of these came from different histories and evolutions, from various acquisitions or internal growth over the years. One common factor is that they have IBM Z; the other common factor is that they recognized they had an opportunity to improve. This desire to change is the first critical factor in the transformation.
Story: When it does not work
The stories I have shared have been positive ones, and the insights have been based on those learnings, as well as on learnings from some not-so-positive stories. In one not-so-positive story, I was working with a client looking to modernize their tools who was not ready to change their processes and practices. The plan was put in place for the transition, but as the tools were moving forward to make the change, additional issues continued to be raised about not doing things the old way. This continued for long enough that the project to move forward with the change was killed.
This team did not have the executive focus to change the processes and practices, nor did it have the support of the individuals who needed to change. Looking back, those involved recognized it failed due to an unwillingness to change at all. The story does not actually end there; just a few years later, the organization came back to say they were ready for change. They now understood the different processes were limiting their ability to grow and handle new workloads, and that they needed to bring on new talent. The project has kicked off again with the desire to change from all levels of the organization, and the full migration will have been completed as this book goes to print.
Key Insight: Willingness to change
There has to be a willingness to change, an acceptance that change can be for good, and a recognition that change will be difficult. The transition needs to be supported at all levels within the organization.
I have discussed many different concepts, drawing on the industry learnings from others and bringing in a variety of customer scenarios. But every customer is different, and every environment is different. The best way for any group or organization to get started is through a value stream assessment. Walking through your entire software development lifecycle, from idea to running the system, to learn where your handoffs are, where your waste is, and what your real process looks like, not just the documented one, will help you understand how and where to start. In addition, it will help you understand how the process affects your current software quality and how you need to address it.
When walking through the value stream, the first step is understanding who plays a part and which parts are required to deliver the end value to the end users. This may easily cross various teams and organizational boundaries, but having the full team available helps identify the biggest problems, all the handoffs, and the segmentation currently causing delays that could be reducing the overall quality of the system.
Value stream assessments are useful for multiple reasons: not only do they help identify the waste in the system and the places where handoffs occur, but they also help start the cultural transformation by getting people together to discuss all the work that is done at each step. These assessments help break down the silos by bringing people together to hear from each other about the process they go through, from the very beginning of the idea, through development activities and deployment, to running the system in production. These assessments are there to understand the current state and help start the cultural transformation.
Improving the quality of the system is a step-by-step process that goes hand in hand with improving the quality of the software development lifecycle overall. Each is a continuous improvement process; each sprint, iteration, cycle, or whatever methodology the teams adopt needs to include quality improvement measures. These could be increased coverage through automation, or improvements to the automation itself.
The related area of value stream management is an automated way to identify where value work is flowing, and more importantly, where it is not. There are many tools that now provide some level of automated value stream management to help capture this information, and this space will continue to expand; however, these tools assume you already have some infrastructure in place from which to capture the information.
Story
I spend much of my time doing simplified value stream assessments, which focus on identifying the waste and handoffs within an organization. During these assessments, we bring together everyone from the business side to those involved in running the system. In some cases, this is the first time the development teams and operations teams have actually met. As we walk through the process, each separate team begins to understand all the tasks the other teams actually have to go through, what the handoffs are causing, and so on. Within one organization, I literally had to introduce the head of development to the head of operations. I did this before the meeting so we could be productive during the session. At the end of the day, the client described the experience as "truly cathartic." Really understanding what other teams were going through gave everyone more empathy for the other teams.
During the value stream assessment, we identify where the biggest areas of concern are; by addressing those first, more opportunities are gained, allowing more time for more change. Each of the places where there are handoffs, wait states, or transfers of assignment is an area where errors can be introduced and quality reduced.
In many cases, automated testing is the greatest opportunity for improvement, starting with unit testing. But other times, automated deployment, or the availability of environments for testing, is the greatest challenge. These areas play together: automating testing without the necessary environments to run the tests can be less productive, and automating tests without proper test data management can be very difficult, if not impossible, when trying to validate the actual function. Keeping all of this in mind as you work through the value stream helps identify where you can start to get the value you need to improve the quality of the systems.
The key here is that not all product teams are at the same point; each team needs to understand its flow and quality metrics to determine the right place to start, or the next biggest problem to address. This continuous improvement process includes continuously improving every aspect of building a quality system.
Conclusion
My goal in writing this book is to bring into focus software quality overall for enterprise organizations: it's not just testing, but the entire software development process, and it includes everyone. With our world becoming so dependent on software, we all need to focus on the quality of that software. None of the systems can be left out of the transformation.
In order to improve quality in large complex systems that include IBM Z, it is important to take the broader view of quality. It will require a focus on automating all repetitive tasks, which is not happening as much as it should be for these systems. These systems provide the backbone for many of today’s industries, as such they provide significant business value. By focusing on the overall software quality of these systems, not only can you improve the quality, but these modern practices will help open opportunities for greater business value.
Enterprise organizations have a large challenge: they have large existing codebases that provide huge business value, but because these have been developed over many years, they have more technical debt built up around them. Addressing this by modernizing the processes, practices, and culture around the entire lifecycle helps bring this existing business value forward for the next thirty years.
Story
This reminds me of another client I work with. The term legacy is used many times for existing applications and systems, but this organization decided vintage might be a more appropriate term. Referring to the existing applications running on IBM Z as vintage conveyed a higher level of respect for the systems. They also focused on ensuring these systems were seen for the value they provided, and exposed them as functions that could be easily called by other systems, so they did not slow things down but facilitated the ability to deliver quickly.
When thinking about software quality, it is important to consider all aspects of the process, from capturing ideas through running the system, automating all the parts that can be automated, and making the work visible. Software quality cannot be thought of as just the testing process, but as the entire lifecycle. The software practices of DevOps, or DevSecOps, provide a good foundation for improving quality. At the core, the organization has to provide the vision that quality is job one for everyone, and that delivering high-quality value to the clients is the goal.
When I think about airplanes flying with only one sensor, cars driving themselves with limited sensors, or software that exposes systems to control by outsiders, I know software quality matters to all our lives. When we consider financial markets, core banking systems, and the travel and transportation industries, all these areas depend on software across a wide variety of platforms and operating systems. Improving whatever area or industry we work in means improving the culture to drive all aspects of quality; only then can we all live better lives.
Acknowledgments
This book is the result of years spent working for IBM along with many of the world's largest companies. Thanks to a number of managers at IBM, I had so many opportunities to solve difficult problems and work with a variety of people throughout my years. Thanks to Gene Kim and the DevOps forum for reminding me of the need to share knowledge and experiences. Thanks to IBM for once again making it possible for me to write a book and to share my experiences in published form, and to the many individuals who helped along the way.
First, thanks go to Mark Moncelle, who spent the time to review and provide feedback, to help improve the book and bring clarity to the content.
Thanks to Carmen De Ardo, who has always supported me and who provided, in many ways, my start to DevOps before the term even existed. The support and encouragement from Carmen pushed me to continue to publish to help others learn.
Thanks to Gary Groover, who I was so lucky to meet at my first DevOps Enterprise Summit. Gary helped remind me that I could publish this book, and he took the time to push me to make it what it is today.
Thanks to my colleagues at IBM who provided insightful feedback and challenges that added to the content of the book.
Finally, I have to recognize my family. Thanks to my father for his focus on writing and sharing his knowledge, and to my mother, who spent years editing content for me. If it were not for all the English classes and the reading I did as a young person, I doubt I would have had the drive to write as I do today. Thanks to my husband, who always pushes me to be the best I can be. He has been so patient and supportive over the years while I have been on the road traveling for work. Thanks to my sons, who provide constant inspiration that continues to drive me to be the best I can be.