Cover Page

SAS® Data Analytic Development

Dimensions of Software Quality

Troy Martin Hughes

Title Page

Wiley & SAS Business Series

The Wiley & SAS Business Series presents books that help senior-level managers with their critical management decisions.

Titles in the Wiley & SAS Business Series include:

  1. Agile by Design: An Implementation Guide to Analytic Lifecycle Management by Rachel Alt-Simmons
  2. Analytics in a Big Data World: The Essential Guide to Data Science and Its Applications by Bart Baesens
  3. Bank Fraud: Using Technology to Combat Losses by Revathi Subramanian
  4. Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics by Evan Stubbs
  5. Business Forecasting: Practical Problems and Solutions edited by Michael Gilliland, Len Tashman, and Udo Sglavo
  6. Business Intelligence Applied: Implementing an Effective Information and Communications Technology Infrastructure by Michael Gendron
  7. Business Intelligence and the Cloud: Strategic Implementation Guide by Michael S. Gendron
  8. Business Transformation: A Roadmap for Maximizing Organizational Insights by Aiman Zeid
  9. Data-Driven Healthcare: How Analytics and BI Are Transforming the Industry by Laura Madsen
  10. Delivering Business Analytics: Practical Guidelines for Best Practice by Evan Stubbs
  11. Demand-Driven Forecasting: A Structured Approach to Forecasting, Second Edition by Charles Chase
  12. Demand-Driven Inventory Optimization and Replenishment: Creating a More Efficient Supply Chain by Robert A. Davis
  13. Developing Human Capital: Using Analytics to Plan and Optimize Your Learning and Development Investments by Gene Pease, Barbara Beresford, and Lew Walker
  14. Economic and Business Forecasting: Analyzing and Interpreting Econometric Results by John Silvia, Azhar Iqbal, Kaylyn Swankoski, Sarah Watt, and Sam Bullard
  15. Financial Institution Advantage and the Optimization of Information Processing by Sean C. Keenan
  16. Financial Risk Management: Applications in Market, Credit, Asset, and Liability Management and Firmwide Risk by Jimmy Skoglund and Wei Chen
  17. Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection by Bart Baesens, Veronique Van Vlasselaer, and Wouter Verbeke
  18. Harness Oil and Gas Big Data with Analytics: Optimize Exploration and Production with Data Driven Models by Keith Holdaway
  19. Health Analytics: Gaining the Insights to Transform Health Care by Jason Burke
  20. Heuristics in Analytics: A Practical Perspective of What Influences Our Analytical World by Carlos Andre Reis Pinheiro and Fiona McNeill
  21. Hotel Pricing in a Social World: Driving Value in the Digital Economy by Kelly McGuire
  22. Implement, Improve and Expand Your Statewide Longitudinal Data System: Creating a Culture of Data in Education by Jamie McQuiggan and Armistead Sapp
  23. Killer Analytics: Top 20 Metrics Missing from Your Balance Sheet by Mark Brown
  24. Mobile Learning: A Handbook for Developers, Educators, and Learners by Scott McQuiggan, Lucy Kosturko, Jamie McQuiggan, and Jennifer Sabourin
  25. The Patient Revolution: How Big Data and Analytics Are Transforming the Healthcare Experience by Krisa Tailor
  26. Predictive Analytics for Human Resources by Jac Fitz-enz and John Mattox II
  27. Predictive Business Analytics: Forward-Looking Capabilities to Improve Business Performance by Lawrence Maisel and Gary Cokins
  28. Statistical Thinking: Improving Business Performance, Second Edition by Roger W. Hoerl and Ronald D. Snee
  29. Too Big to Ignore: The Business Case for Big Data by Phil Simon
  30. Trade-Based Money Laundering: The Next Frontier in International Money Laundering Enforcement by John Cassara
  31. The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions by Phil Simon
  32. Understanding the Predictive Analytics Lifecycle by Al Cordoba
  33. Unleashing Your Inner Leader: An Executive Coach Tells All by Vickie Bevenour
  34. Using Big Data Analytics: Turning Big Data into Big Money by Jared Dean
  35. Visual Six Sigma, Second Edition by Ian Cox, Marie Gaudard, and Mia Stephens

For more information on any of the above titles, please visit www.wiley.com.

To Mom,
who dreamed of being a writer and,
through unceasing love, raised one,
and Dad,
who taught me to program
before I could even reach the keys.

Preface

Because SAS practitioners are software developers, too!

Within the body of SAS literature, an overwhelming focus on data quality eclipses software quality. Whether discussed in books, white papers, technical documentation, or even posted job descriptions, nearly all references to quality in relationship to SAS describe the quality of data or data products.

The focus on data quality and diversion from traditional software development priorities is not without reason. Data analytic development is software development, but ultimate business value is delivered not through software products but rather through subsequent, derivative data products. In aligning quality only with data, however, data analytic development environments can place an overwhelming focus on software functional requirements to the detriment or exclusion of software performance requirements. When SAS literature does describe performance best practices, it typically demonstrates only how to make SAS software faster or more efficient while omitting other dimensions of software quality.

However, what about software reliability, scalability, security, maintainability, or modularity—or the host of other software quality characteristics? For all the SAS practitioners of the world—including developers, biostatisticians, econometricians, researchers, students, project managers, market analysts, data scientists, and others—this text demonstrates a model for software quality promulgated by the International Organization for Standardization (ISO) to facilitate the evaluation and pursuit of software quality.

Through hundreds of Base SAS software examples and more than 4,000 lines of code, SAS practitioners will learn how to define, prioritize, implement, and measure 15 dimensions of software quality. Moreover, nontechnical stakeholders, including project managers, functional managers, customers, sponsors, and business analysts, will learn to recognize the value of quality inclusion and the commensurate risk of quality exclusion. With this more comprehensive view of quality, SAS software quality is finally placed on par with SAS data quality.

Why this text and the relentless pursuit of SAS software quality? Because SAS practitioners, regardless of job title, are inherently software developers, too, and should benefit from industry standards and best practices. Software quality can and should be allowed to flourish in any environment.

OBJECTIVES

The primary goal is to describe and demonstrate SAS software development within the framework of the ISO software product quality model. The model defines characteristics of software quality codified within the Systems and software Quality Requirements and Evaluation (SQuaRE) series (ISO/IEC 25000:2014). Through the 15 intertwined dimensions of software quality presented in this text, readers will be equipped to understand, implement, evaluate, and, most importantly, value software quality.

A secondary goal is to demonstrate the role and importance of the software development life cycle (SDLC) in facilitating software quality. Thus, the dimensions of quality are presented as enduring principles that influence software planning, design, development, testing, validation, acceptance, deployment, operation, and maintenance. The SDLC is demonstrated in a requirements-based framework in which ultimate business need spawns technical requirements that drive the inclusion (or exclusion) of quality in software. Requirements initially provide the backbone of software design and ultimately the basis against which the quality of completed software is evaluated.

A tertiary goal is to demonstrate SAS software development within a risk management framework that identifies the threats of poor quality software to business value. Poor data quality is habitually highlighted in SAS literature as a threat to business value, but poor code quality can equally contribute to project failure. This text doesn't suggest that all dimensions of software quality should be incorporated in all software, but rather aims to formalize a structure through which threats and vulnerabilities can be identified and their ultimate risk to software calculated. Thus, performance requirements are most appropriately implemented when the benefits of their inclusion as well as the risks of their exclusion are understood.

AUDIENCE

Savvy SAS practitioners are the intended audience and represent the professionals who utilize the SAS application to write software in the Base SAS language. An advanced knowledge of Base SAS, including the SAS macro language, is recommended but not required.

Other stakeholders who will benefit from this text include project sponsors, customers, managers, Agile facilitators, ScrumMasters, software testers, and anyone with a desire to understand or improve software performance. Nontechnical stakeholders may have limited knowledge of the SAS language, or software development in general, yet nevertheless generate requirements that drive software projects. These stakeholders will benefit through the introduction of quality characteristics that should be used to define software requirements and evaluate software performance.

APPLICATION OF CONTENT

The ISO software product quality model is agnostic to industry, team size, organizational structure (e.g., functional, projectized, matrix), development methodology (e.g., Agile, Scrum, Lean, Extreme Programming, Waterfall), and developer role (e.g., developer, end-user developer). The student researcher working on a SAS client machine will gain as much insight from this text as a team of developers working in a highly structured environment with separate development, test, and production servers.

While the majority of Base SAS code demonstrated is portable between SAS interfaces and environments, some input/output (I/O) and other system functions, options, and parameters are OS- or interface-specific. Code examples in this text have been tested in the SAS Display Manager for Windows, SAS Enterprise Guide for Windows, and the SAS University Edition. Functional differences among these applications are highlighted throughout the text and discussed in Chapter 10, “Portability.”

While this text includes hundreds of examples of SAS code that demonstrate the successful implementation and evaluation of quality characteristics, it differs from other SAS literature in that it doesn't represent a compendium of SAS software best practices, but rather the application of SAS code to support the software product quality model within the SDLC. Therefore, code examples demonstrate software performance rather than functionality.

ORGANIZATION

Most software texts are organized around functionality: either a top-down approach in which a functional objective is stated and various methods to achieve that goal are demonstrated, or a bottom-up approach in which uses and caveats of a specific SAS function, procedure, or statement are explored. Because this text follows the ISO software product quality model and focuses on performance rather than functionality, it eschews the conventional organization of functionality-driven SAS literature. Instead, each of 15 chapters highlights one dynamic or static performance characteristic, a single dimension of software quality. Code examples often build incrementally throughout each chapter as quality objectives are identified and achieved, and related quality characteristics are highlighted for future reference and reading.

The text comprises 18 chapters, organized as an overview followed by two parts:

  1. Overview. Three chapters introduce the concept of quality, the ISO software product quality model, the SDLC, risk management, Agile and Waterfall development methodologies, exception handling, and other information and terms central to the text. Even the reader who is eager to reach the more technically substantive performance chapters should skim Chapters 1, “Introduction,” and 2, “Quality,” to glean the context of software quality within data analytic development environments.
  2. Part I. Dynamic Performance. These nine chapters introduce dynamic performance requirements, which are software quality attributes that are demonstrated, measured, and validated through software execution. For example, software efficiency can be demonstrated by running code and measuring run time and system resources such as CPU and memory usage. Chapters include “Reliability,” “Recoverability,” “Robustness,” “Execution Efficiency,” “Efficiency,” “Scalability,” “Portability,” “Security,” and “Automation.”
  3. Part II. Static Performance. These six chapters introduce static performance requirements, which are software quality attributes that are assessed through code inspection rather than execution. For example, the extent to which software is modularized cannot be determined until the code is opened and inspected, either through manual review or automated test software. Chapters include “Maintainability,” “Modularity,” “Readability,” “Testability,” “Stability,” and “Reusability.”

Text formatting constructs are standardized to facilitate SAS code readability. Formatting is not intended to demonstrate best practices but rather standardization. All code samples are presented in lowercase, but the following conventions are used where code is referenced within the text:

Acknowledgments

So many people have made this project possible, directly and indirectly, through their contributions to my life and their endurance and encouragement throughout this journey.

To the family and friends I ignored for four months while road-tripping through 24 states to write this, thank you for your love, patience, understanding, and couches.

To my teachers who instilled a love of writing, thank you for years of red ink and encouragement: Sister Mary Katherine Gallagher, Estelle McCarthy, Lorinne McKnight, Dolores Cummings, Millie Bizzini, Patty Ely, Jo Berry, Liana Hachiya, Audrey Musson, Dana Trevethan, Cheri Rowton, Annette Simmons, and Dr. Robyn Bell.

To the mentors whose words continue to guide me, thank you for your leadership and friendship: Dr. Cathy Schuman, Dr. Barton Palmer, Dr. Kiko Gladsjo, Dr. Mina Chang, Dean Kauffman, Rich Nagy, Jim Martin, and Jeff Stillman.

To my SAS spirit guides, thank you not only for challenging the limits of the semicolon but also for sharing your successes and failures with the world: Dr. Gerhard Svolba, Art Carpenter, Kirk Paul Lafler, Susan Slaughter, Lora Delwiche, Peter Eberhardt, Ron Cody, Charlie Shipp, and Thomas Billings.

To SAS, thank you for distributing the SAS University Edition and for providing additional software free of charge, without which this project would have been impossible.

Finally, thank you to John Wiley & Sons, Inc. for support and patience throughout this endeavor.

About the Author

Troy Martin Hughes has been a SAS practitioner for more than 15 years, has managed SAS projects in support of federal, state, and local government initiatives, and is a SAS Certified Advanced Programmer, SAS Certified Base Programmer, SAS Certified Clinical Trials Programmer, and SAS Professional V8. He has an MBA in information systems management and additional credentials, including: PMP, PMI-ACP, PMI-PBA, PMI-RMP, CISSP, CSSLP, CSM, CSD, CSPO, CSP, and ITIL v3 Foundation. He has been a frequent presenter and invited speaker at SAS user conferences, including SAS Global Forum, WUSS, MWSUG, SCSUG, SESUG, and PharmaSUG. Troy is a U.S. Navy veteran with two tours of duty in Afghanistan and, in his spare time, a volunteer firefighter and EMT.