To Karen, Katherine, and my parents
W.Q.M.

To Lida, Juan, Catalina, Daniela, and my mother Inés
L.A.E.

Contents

Preface

Acknowledgments

1. Reliability Concepts and Reliability Data

1.1. Introduction

1.2. Examples of Reliability Data

1.3. General Models for Reliability Data

1.4. Repairable Systems and Nonrepairable Units

1.5. Strategy for Data Collection, Modeling, and Analysis

2. Models, Censoring, and Likelihood for Failure-Time Data

2.1. Models for Continuous Failure-Time Processes

2.2. Models for Discrete Data from a Continuous Process

2.3. Censoring

2.4. Likelihood

3. Nonparametric Estimation

3.1. Introduction

3.2. Estimation from Singly Censored Interval Data

3.3. Basic Ideas of Statistical Inference

3.4. Confidence Intervals from Complete or Singly Censored Data

3.5. Estimation from Multiply Censored Data

3.6. Pointwise Confidence Intervals from Multiply Censored Data

3.7. Estimation from Multiply Censored Data with Exact Failures

3.8. Simultaneous Confidence Bands

3.9. Uncertain Censoring Times

3.10. Arbitrary Censoring

4. Location-Scale-Based Parametric Distributions

4.1. Introduction

4.2. Quantities of Interest in Reliability Applications

4.3. Location-Scale and Log-Location-Scale Distributions

4.4. Exponential Distribution

4.5. Normal Distribution

4.6. Lognormal Distribution

4.7. Smallest Extreme Value Distribution

4.8. Weibull Distribution

4.9. Largest Extreme Value Distribution

4.10. Logistic Distribution

4.11. Loglogistic Distribution

4.12. Parameters and Parameterization

4.13. Generating Pseudorandom Observations from a Specified Distribution

5. Other Parametric Distributions

5.1. Introduction

5.2. Gamma Distribution

5.3. Generalized Gamma Distribution

5.4. Extended Generalized Gamma Distribution

5.5. Generalized F Distribution

5.6. Inverse Gaussian Distribution

5.7. Birnbaum–Saunders Distribution

5.8. Gompertz–Makeham Distribution

5.9. Comparison of Spread and Skewness Parameters

5.10. Distributions with a Threshold Parameter

5.11. Generalized Threshold-Scale Distribution

5.12. Other Methods of Deriving Failure-Time Distributions

6. Probability Plotting

6.1. Introduction

6.2. Linearizing Location-Scale-Based Distributions

6.3. Graphical Goodness of Fit

6.4. Probability Plotting Positions

6.5. Probability Plots with Specified Shape Parameters

6.6. Notes on the Application of Probability Plotting

7. Parametric Likelihood Fitting Concepts: Exponential Distribution

7.1. Introduction

7.2. Parametric Likelihood

7.3. Confidence Intervals for θ

7.4. Confidence Intervals for Functions of θ

7.5. Comparison of Confidence Interval Procedures

7.6. Likelihood for Exact Failure Times

7.7. Data Analysis with No Failures

8. Maximum Likelihood for Log-Location-Scale Distributions

8.1. Introduction

8.2. Likelihood

8.3. Likelihood Confidence Regions and Intervals

8.4. Normal-Approximation Confidence Intervals

8.5. Estimation with Given σ

9. Bootstrap Confidence Intervals

9.1. Introduction

9.2. Bootstrap Sampling

9.3. Exponential Distribution Confidence Intervals

9.4. Weibull, Lognormal, and Loglogistic Distribution Confidence Intervals

9.5. Nonparametric Bootstrap Confidence Intervals

9.6. Percentile Bootstrap Method

10. Planning Life Tests

10.1. Introduction

10.2. Approximate Variance of ML Estimators

10.3. Sample Size for Unrestricted Functions

10.4. Sample Size for Positive Functions

10.5. Sample Sizes for Log-Location-Scale Distributions with Censoring

10.6. Test Plans to Demonstrate Conformance with a Reliability Standard

10.7. Some Extensions

11. Parametric Maximum Likelihood: Other Models

11.1. Introduction

11.2. Fitting the Gamma Distribution

11.3. Fitting the Extended Generalized Gamma Distribution

11.4. Fitting the BISA and IGAU Distributions

11.5. Fitting a Limited Failure Population Model

11.6. Truncated Data and Truncated Distributions

11.7. Fitting Distributions that Have a Threshold Parameter

12. Prediction of Future Random Quantities

12.1. Introduction

12.2. Probability Prediction Intervals (θ Given)

12.3. Statistical Prediction Interval (θ Estimated)

12.4. The (Approximate) Pivotal Method for Prediction Intervals

12.5. Prediction in Simple Cases

12.6. Calibrating Naive Statistical Prediction Bounds

12.7. Prediction of Future Failures from a Single Group of Units in the Field

12.8. Prediction of Future Failures from Multiple Groups of Units with Staggered Entry into the Field

13. Degradation Data, Models, and Data Analysis

13.1. Introduction

13.2. Models for Degradation Data

13.3. Estimation of Degradation Model Parameters

13.4. Models Relating Degradation and Failure

13.5. Evaluation of F(t)

13.6. Estimation of F(t)

13.7. Bootstrap Confidence Intervals

13.8. Comparison with Traditional Failure-Time Analyses

13.9. Approximate Degradation Analysis

14. Introduction to the Use of Bayesian Methods for Reliability Data

14.1. Introduction

14.2. Using Bayes’s Rule to Update Prior Information

14.3. Prior Information and Distributions

14.4. Numerical Methods for Combining Prior Information with a Likelihood

14.5. Using the Posterior Distribution for Estimation

14.6. Bayesian Prediction

14.7. Practical Issues in the Application of Bayesian Methods

15. System Reliability Concepts and Methods

15.1. Introduction

15.2. System Structures and System Failure Probability

15.3. Estimating System Reliability from Component Data

15.4. Estimating Reliability with Two or More Causes of Failure

15.5. Other Topics in System Reliability

16. Analysis of Repairable System and Other Recurrence Data

16.1. Introduction

16.2. Nonparametric Estimation of the MCF

16.3. Nonparametric Comparison of Two Samples of Recurrence Data

16.4. Parametric Models for Recurrence Data

16.5. Tools for Checking Point-Process Assumptions

16.6. Maximum Likelihood Fitting of Poisson Process

16.7. Generating Pseudorandom Realizations from an NHPP Process

16.8. Software Reliability

17. Failure-Time Regression Analysis

17.1. Introduction

17.2. Failure-Time Regression Models

17.3. Simple Linear Regression Models

17.4. Standard Errors and Confidence Intervals for Regression Models

17.5. Regression Model with Quadratic μ and Nonconstant σ

17.6. Checking Model Assumptions

17.7. Models with Two or More Explanatory Variables

17.8. Product Comparison: An Indicator-Variable Regression Model

17.9. The Proportional Hazards Failure-Time Model

17.10. General Time Transformation Functions

18. Accelerated Test Models

18.1. Introduction

18.2. Use-Rate Acceleration

18.3. Temperature Acceleration

18.4. Voltage and Voltage-Stress Acceleration

18.5. Acceleration Models with More than One Accelerating Variable

18.6. Guidelines for the Use of Acceleration Models

19. Accelerated Life Tests

19.1. Introduction

19.2. Analysis of Single-Variable ALT Data

19.3. Further Examples

19.4. Some Practical Suggestions for Drawing Conclusions from ALT Data

19.5. Other Kinds of Accelerated Tests

19.6. Potential Pitfalls of Accelerated Life Testing

20. Planning Accelerated Life Tests

20.1. Introduction

20.2. Evaluation of Test Plans

20.3. Planning Single-Variable ALT Experiments

20.4. Planning Two-Variable ALT Experiments

20.5. Planning ALT Experiments with More than Two Experimental Variables

21. Accelerated Degradation Tests

21.1. Introduction

21.2. Models for Accelerated Degradation Test Data

21.3. Estimating Accelerated Degradation Test Model Parameters

21.4. Estimation of Failure Probabilities, Distribution Quantiles, and Other Functions of Model Parameters

21.5. Confidence Intervals Based on Bootstrap Samples

21.6. Comparison with Traditional Accelerated Life Test Methods

21.7. Approximate Accelerated Degradation Analysis

22. Case Studies and Further Applications

22.1. Dangers of Censoring in a Mixed Population

22.2. Using Prior Information in Accelerated Testing

22.3. An LFP/Competing Risk Model

22.4. Fatigue-Limit Regression Model

22.5. Planning Accelerated Degradation Tests

Epilogue

Appendix A. Notation and Acronyms

Appendix B. Some Results from Statistical Theory

B.1. cdfs and pdfs of Functions of Random Variables

B.2. Statistical Error Propagation—The Delta Method

B.3. Likelihood and Fisher Information Matrices

B.4. Regularity Conditions

B.5. Convergence in Distribution

B.6. Outline of General ML Theory

Appendix C. Tables

References

Author Index

Subject Index

Preface

Over the past 10 years there has been a heightened interest in improving quality, productivity, and reliability of manufactured products. Global competition and higher customer expectations for safe, reliable products are driving this interest. To meet this need, many companies have trained their design engineers and manufacturing engineers in the appropriate use of designed experiments and statistical process monitoring/control. Now reliability is being viewed as the product feature that has the potential to provide an important competitive edge. A current industry concern is in developing better processes to move rapidly from product conceptualization to a cost-effective highly reliable product. A reputation for unreliability can doom a product, if not the manufacturing company.

Data collection, data analysis, and data interpretation methods are important tools for those who are responsible for product reliability and product design decisions. This book describes and illustrates the use of proven traditional techniques for reliability data analysis and test planning, enhanced and brought up to date with modern computer-based graphical, analytical, and simulation-based methods. The material in this book is based on our interactions with engineers and statisticians in industry as well as on courses in applied reliability data analysis that we have taught to MS-level statistics and engineering students at both Iowa State University and Louisiana State University.

Audience and Assumed Knowledge

We have designed this book to be useful to statisticians and engineers working in industry as well as to students in university engineering and statistics programs. The book will be useful for on-the-job training courses in reliability data analysis. There is a challenge in addressing such a wide-ranging audience. Communication among engineers and statisticians, however, is not only necessary but essential in the industrial research and development environment. We hope that this book will aid such communication. To produce a book that will appeal to both engineers and statisticians, we have placed primary focus on applications, data, concepts, methods, and interpretation. We use simple computational examples to illustrate ideas and concepts but, as in practical applications, rely on computers to do most of the computations. We have also included a collection of exercise problems at the end of each chapter. These exercises will give readers a chance to test their knowledge of basic material, to explore conceptual ideas of reliability testing, data analysis, and interpretation, and to see possible extensions of the material in the chapters.

It will be helpful for readers to have had a previous course in intermediate statistical methods covering basic ideas of statistical modeling and inference, graphical methods, estimation, confidence intervals, and regression analysis. Only the simplest concepts of calculus are used in the main body of the text (e.g., probability for a continuous random variable is computed as area under a density curve; a first derivative is a slope or a rate of change; a second derivative is a measure of curvature). Appendix B and some advanced exercises use calculus, linear algebra, basic optimization ideas, and basic statistical theory. Concepts, however, are presented in a relaxed and intuitive manner that we hope will also appeal to interested nonstatisticians. Throughout the book we have attempted to avoid the heavy language of mathematical statistics.

A detailed understanding of underlying statistical theory is not necessary to apply the methods in this book. Such details are, however, often important to understanding how to extend methods to new situations or developing new methods. Appendix B, at the end of the book, outlines the general theory and provides references to more detailed information. Also, many derivations and interesting extensions are covered in advanced guided exercises at the end of each chapter.

Particularly challenging exercises (i.e., exercises requiring knowledge of calculus or statistical theory) are marked with a triangle. Exercises requiring computer programming (beyond the use of standard statistical packages) are marked with a diamond.

Special Features of the Book

Special features of this book include the following:

1. We emphasize general methods that can be applied to the wide range of problems found in industrial reliability data analysis—specifically, nonparametric estimation of a failure-time distribution function, probability plotting, and maximum likelihood estimation of important reliability characteristics (failure probabilities, distribution quantiles, and hazard functions), and associated statistical intervals. In the basic chapters (3, 6, 7, 8, 17, and 19), we apply these methods to the most frequently encountered models in reliability data analysis. In special chapters (which can be skipped without loss of continuity or understanding), we apply the general methods to important but less frequently occurring situations (e.g., problems involving truncation and prediction).
2. Throughout the book we use computer graphics for displaying data, for displaying the results of analyses, and for explaining technical concepts.
3. We use simulation methods to complement large-sample asymptotic theory (practical sample sizes are often small to moderate in size). We explain and illustrate modern, more accurate (but computationally demanding) methods of inference: likelihood and bootstrap methods for constructing statistical intervals.
4. For both nonparametric and parametric analyses, we illustrate the use of general likelihood-based methods of handling arbitrarily censored data (including left, right, and interval censoring with overlapping intervals) and truncated data that frequently arise in statistical reliability studies.
5. We provide methods for planning reliability studies (length of test, number of specimens, and levels of experimental factors).
6. We cover methods for analyzing degradation data. Such data are becoming increasingly important where there are requirements for extremely high reliability.
7. Almost all of our examples and exercises use real data, including many data sets that have not previously appeared in any book. In order to protect proprietary information, some data have been changed by a scale factor and, in some cases, generic product names have been used (e.g., Device-A, Component-B, Alloy-A).
8. Numerical examples in this book were done using the S-PLUS system for graphics and data analysis (a product of MathSoft, Inc., Seattle, WA). A suite of special S-PLUS functions was developed in parallel with this book. Although we have not included explicit information about software use in the chapters, the suite of special S-PLUS functions and a listing of the S-PLUS commands used to do the examples in the book are available from the authors via anonymous ftp at the Wiley ftp site. Instructions about how to access the software are given below.

How to Download the Software Examples

The Wiley public ftp site includes special S-PLUS function examples created for the applications discussed in this book. The files can be accessed through either a standard ftp program or the ftp client of a Web browser using the http protocol. You can access the files from a Web browser through the following address:

http://www.wiley.com/products/subject/mathematics

On the Mathematics and Statistics home page you will see a link to the ftp Software Archive, which includes a link to information about the book and access to the software.

To gain ftp access, type the following at your Web browser’s URL address input box:

ftp://ftp.wiley.com

You can set an ID of anonymous; no password is required.

The files are located in the public/sci_tech_med/reliability directory. Be sure to also download and read the README.TXT file, which includes directions on how to install and use the program.

If you need further information about downloading the files, you can reach Wiley’s tech support line at 212-850-6753.

Other Software to Use with the Book

Today there are many commercial statistical software packages. Unfortunately, only a few of these packages have adequate capabilities for doing reliability data analysis (e.g., the ability to do nonparametric and parametric estimation with censored data). Nelson (1990a, pages 237-240) outlines the capabilities of a number of commercial and noncommercial packages that were available at that time. As software vendors become more aware of their customers’ needs, capabilities in commercial packages are improving. Here we describe briefly the capabilities of a few packages that we and our colleagues have found to be useful.

MINITAB (1997), SAS PROC RELIABILITY (1997), SAS JMP (1995), S-PLUS (1996), and a specialized program called WinSMITH (Abernethy 1996) can do nonparametric and parametric product censored data analysis to estimate a single distribution (Chapters 3, 6, 7, and 8). SAS JMP can also analyze data with more than one failure mode (Chapter 15). MINITAB, SAS PROC RELIABILITY, SAS JMP, and S-PLUS can do parametric regression and accelerated life test analyses (Chapters 17 and 19), as well as semiparametric Cox proportional hazards regression analysis. SAS PROC RELIABILITY can, in addition, do the nonparametric repairable systems analyses (Chapter 16).

Overview and Paths Through the Book

There are many possible paths that readers and instructors might take through this book. Chapters 1–16 cover single distribution models without any explanatory variables. Chapters 17–21 describe failure-time regression models. Chapter 22 presents case studies that illustrate, in the context of real problems, the integration of ideas presented throughout the book. This chapter also usefully illustrates how some of the general methods presented in the earlier chapters can be extended and adapted to deal with new problems.

Chapters 1–3 and 6–8 provide basic material that will be of interest to almost all readers and should be read in sequence. Chapter 4 discusses parametric failure-time models based on location-scale distributions and Chapter 5 covers more advanced distributional models. It is possible to use only a light reading of Chapter 4 and to skip Chapter 5 altogether before proceeding on to the important methods in Chapters 6–8. Chapter 9 explains and illustrates the use of bootstrap (simulation-based) methods for obtaining confidence intervals. Chapter 10 focuses on test planning: evaluating the effects of choosing sample size and length of observation. Chapters 11–16 cover a variety of special more advanced topics for single distribution models. Some of the material in Chapter 5 is prerequisite for the material in Chapter 11, but it is possible simply to work in Chapter 11, referring back to Chapter 5 only as needed. Otherwise, each of Chapters 10 through 14 has only material up to Chapter 8 as prerequisite. Chapter 15 introduces some important system reliability concepts and shows how the material in the first part of the book can be used to make statistical statements about the reliability of a system or a population of systems. Chapter 16 explains and illustrates the fundamental ideas behind analyzing system-repair and other recurrence data (as opposed to data on components and other replaceable units).

There are several groups of chapters on special topics that can be read in sequence.

Appendix A provides a summary and index of notation used in the book. Appendix B outlines the general maximum likelihood and other statistical theory on which the methods in the book are based. Appendix C gives tables for some of the larger data sets used in our examples.

Use as a Textbook

A two-semester course would be required to cover thoroughly all of the material in the book. For a one-semester course, aimed at engineers and/or statisticians, an instructor could cover Chapters 1–4, 6–8, and 17–19, along with selected material from the appendices (according to the background of the students), and a few other chapters according to interests and tastes.

This book could be used as the basis for workshops or short courses aimed at engineers or statisticians working in industry. For an audience with a working knowledge of basic statistical tools, Chapters 1–3, key sections in Chapter 4, and Chapters 6–8 could be covered in one day. If the purpose of the short course is to introduce the basic ideas and illustrate with examples, then some material from Chapters 17–20 could also be covered. For a less experienced audience or for a more relaxed presentation, allowing time for exercises and discussion, two days would be needed to cover this material. Extending the course to three or four days would allow covering selected material in Chapters 9–22.

WILLIAM Q. MEEKER
LUIS A. ESCOBAR

Ames, Iowa
Baton Rouge, Louisiana
April 1998

Acknowledgments

A number of individuals provided helpful comments on all or part of draft versions of this book. In particular, we would like to acknowledge Chuck Annis, Hwei-Chun Chou, William Christensen, Necip Doganaksoy, Tom Dubinin, Michael Eraas, Shuen-Lin Jeng, Gerry Hahn, Joseph Lu, Michael LuValle, Enid Martinets, Silvia Morales, Peter Morse, Dan Nordman, Steve Redman, Ernest Scheuer, Ananda Sen, David Steinberg, Mark Vandeven, Kim Wentzlaff, and a number of anonymous reviewers. Over the years, we have benefited from numerous technical discussions with Vijay Nair. Vijay provided a number of suggestions that substantially improved several parts of this book. We would also like to acknowledge our students (too numerous to mention by name) for their penetrating questions, high level of interest, and useful suggestions for improving our courses.

We would like to make special acknowledgment to Wayne Nelson, who gave us detailed feedback on earlier versions of most of the chapters of this book. Additionally, much of our knowledge in this area has its roots in our interactions with Wayne and Wayne’s outstanding books and other publications in the area of reliability and reliability data analysis.

Parts of this book were written while the first author was visiting the Department of Statistics and Actuarial Science at the University of Waterloo and the Department of Experimental Statistics at Louisiana State University. Support and use of facilities during these visits are gratefully acknowledged.

We greatly benefited from facilities, traveling support, and encouragement from Jeff Hooper and Michèle Boulanger at AT&T Bell Laboratories, Gerald Hahn at General Electric Corporate Research and Development Center, Enrique Villa and Victor Pérez Abreu at CIMAT in Guanajuato, México, Guido E. Del Pino at Universidad Católica in Santiago, Chile, Héctor Allende at Universidad Técnica Federico Santa María in Valparaiso, Chile, Yves L. Grize in Basel, Switzerland, and Stephan Zayac at Ford Motor Company.

Dean L. Isaacson, Head, Department of Statistics, Iowa State University, Lynn R. LaMotte, former Head, and E. Barry Moser, Interim Head, Department of Experimental Statistics, Louisiana State University, provided helpful encouragement and support to both authors. We would also like to thank our secretaries Denise Riker and Elaine Miller for excellent support and assistance while writing this book.

Finally, we would like to thank our wives and children for their love, patience, and understanding during the recent years in which we worked most weekends and many evenings to complete this project.

W. Q. M.
L. A. E.

CHAPTER 1

Reliability Concepts and Reliability Data

Objectives

This chapter explains:

Overview

This chapter introduces some of the basic concepts of product reliability. Section 1.1 explains the relationship between quality and reliability and outlines how statistical studies are used to obtain information that can be used to assess and improve product reliability. Section 1.2 presents examples to illustrate studies that resulted in different kinds of reliability data. These examples are used in data analysis and exercises in subsequent chapters. Section 1.3 explains, in general terms, important qualitative aspects of statistical models that are used to describe populations and processes in reliability applications. Section 1.4 emphasizes the important distinction between studies focusing on data from repairable systems and nonrepairable units. Section 1.5 describes a general strategy for exploring, analyzing, and drawing conclusions from reliability data. This strategy is illustrated in examples throughout the book and in the case studies in Chapter 22.

1.1. INTRODUCTION

1.1.1 Quality and Reliability

Rapid advances in technology, development of highly sophisticated products, intense global competition, and increasing customer expectations have put new pressures on manufacturers to produce high-quality products. Customers expect purchased products to be reliable and safe. Systems, vehicles, machines, devices, and so on should, with high probability, be able to perform their intended function under usual operating conditions, for some specified period of time.

Technically, reliability is often defined as the probability that a system, vehicle, machine, device, and so on will perform its intended function under operating conditions, for a specified period of time. Improving reliability is an important part of the larger overall picture of improving product quality. There are many definitions of quality, but general agreement that an unreliable product is not a high-quality product. Condra (1993) emphasizes that “reliability is quality over time.”
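
In symbols, one common way to write this definition is sketched below. The notation is introduced here only for illustration and is an assumption, not something taken from the text above: T denotes the failure time of a unit and R(t) its reliability at time t.

\[
R(t) = \Pr(T > t) = 1 - F(t), \qquad t > 0,
\]

where F(t) = Pr(T ≤ t) is the cumulative distribution function of the failure time, that is, the probability that a unit fails by time t.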

Modern programs for improving reliability of existing products and for assuring continued high reliability for the next generation of products require quantitative methods for predicting and assessing various aspects of product reliability. In most cases this will involve the collection of reliability data from studies such as laboratory tests (or designed experiments) of materials, devices, and components, tests on early prototype units, careful monitoring of early-production units in the field, analysis of warranty data, and systematic longer-term tracking of products in the field.

1.1.2 Reasons for Collecting Reliability Data

There are many possible reasons for collecting reliability data. Examples include the following:

1.1.3 Distinguishing Features of Reliability Data

Reliability data can have a number of special features requiring the use of special statistical methods. For example:

This book emphasizes the analysis of data from studies conducted to assess or improve product reliability. Data from reliability studies, however, closely resemble data from time-to-event studies in other areas of science and industry including biology, ecology, medicine, economics, and sociology. The methods of analysis in these other areas are the same or similar to those used in reliability data analysis. Some synonyms for reliability data are failure-time data, life data, survival data (used in medicine and biological sciences), and event-time data (used in the social sciences).

1.2. EXAMPLES OF RELIABILITY DATA

This section describes examples and data sets that illustrate the wide range of applications and characteristics of reliability data. These and other examples are used in subsequent chapters to illustrate the application of statistical methods for analyzing and drawing conclusions from such data.

1.2.1 Failure-Time Data with no Explanatory Variables

In many applications reliability data will be collected on a sample of units that are assumed to have come from a particular process or population and to have been tested or operated under nominally identical conditions. More realistically, there are physical differences among units (e.g., strength or hardness) and operating conditions (e.g., temperature, humidity, or stress) and these contribute to the variability in the data. The assumption used in drawing inferences from such single distribution data is that these differences accurately reflect the variability in life caused by the actual differences in the population or process of interest.

Example 1.1 Ball Bearing Fatigue Data. Lieblein and Zelen (1956) describe and give data from fatigue endurance tests for deep-groove ball bearings. The ball bearings came from four different major bearing companies. There was disagreement in the industry on the appropriate parameter values to use to describe the relationship between fatigue life and stress loading. The main objective of the study was to estimate values of the parameters in the equation relating bearing life to load.

The data shown in Table 1.1 are a subset of n = 23 bearing failure times for units tested at one level of stress, reported and analyzed by Lawless (1982). Figure 1.1 shows that the data are skewed to the right. Because of the lower bound on cycles (or time) to failure at zero, this distribution shape is typical of reliability data. Figure 1.2 illustrates the failure pattern over time.

Modern electronic systems may contain anywhere from hundreds to hundreds of thousands of integrated circuits (ICs). In order for such a system to have high reliability, it is necessary for the individual ICs and other components to have extremely high reliability, as in the following example.

Example 1.2 Integrated Circuit Life Test Data. Meeker (1987) reports the results of a life test of n = 4156 integrated circuits tested for 1370 hours at accelerated conditions of 80°C and 80% relative humidity. The accelerated conditions were used to shorten the test by causing defective units to fail more rapidly. The primary purpose of the experiment was to estimate the proportion of defective units being manufactured in the current production process and to estimate the amount of “burn-in” time that would be required to remove most of the defective units from the product population. The reliability engineers were also interested in whether it might be possible to get the needed information about the state of the production process, in the future, using much shorter tests (say, 200 or 300 hours). The data are reproduced in Table 1.2. There were 25 failures in the first 100 hours, three more between 100 and 600 hours, and no more failures out to 1370 hours, when the test was terminated. Ties in the data indicate that failures were detected at inspection times. A subset of the data is depicted in Figure 1.3.
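
The counts quoted in this example are enough for a rough first look at the fraction of units failing by each reported time. The following is a minimal sketch in Python (not the S-PLUS code used for the book); the variable and function names are hypothetical, and it assumes that every unfailed unit was still running when the test ended at 1370 hours, so that the simple proportion failed is a reasonable estimate at these times.

```python
# Counts below are taken directly from Example 1.2.
N_UNITS = 4156          # integrated circuits placed on test
TEST_END_HOURS = 1370   # test terminated here; survivors are right censored

# (end of interval in hours, failures observed in that interval)
interval_failures = [(100, 25), (600, 3), (1370, 0)]


def cumulative_fraction_failing(intervals, n_units):
    """Return (time, cumulative failures, fraction failing) for each interval end."""
    rows = []
    cumulative = 0
    for end_time, n_fail in intervals:
        cumulative += n_fail
        rows.append((end_time, cumulative, cumulative / n_units))
    return rows


summary = cumulative_fraction_failing(interval_failures, N_UNITS)
n_censored = N_UNITS - summary[-1][1]  # units with no failure by the end of test

for end_time, cum_fail, frac in summary:
    print(f"by {end_time:>4} h: {cum_fail:>2} failures, fraction failing = {frac:.5f}")
print(f"units still running at {TEST_END_HOURS} h (censored): {n_censored}")
```

The flat fraction between 600 and 1370 hours reflects the absence of further failures over that interval, which bears on the engineers' question of whether a much shorter test might supply similar information.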

Table 1.1. Ball Bearing Failure Times in Millions of Revolutions

Figure 1.1. Histogram of the ball bearing failure data.

Figure 1.2. Display of the ball bearing failure data.

Table 1.2. Integrated Circuit Failure Times in Hours

Figure 1.3. General failure pattern of the integrated circuit life test, showing a subset of the data where 28 out of 4156 units failed in the 1370-hour test.

Table 1.3. Failure Data from a Circuit Pack Field Tracking Study

Example 1.3 Circuit Pack Reliability Field Trial. Table 1.3 gives information on the number of failures observed during periodic inspections in a field trial of early-production circuit packs employing new technology devices. The circuit packs were manufactured under the same design, but by two different vendors. The trial ran for 10,000 hours. The 4993 circuit packs from Vendor 1 came straight from production. The 4993 circuit packs from Vendor 2 had already seen 1000 hours of burn-in testing at the manufacturing plant under operating conditions similar to those in the field trial. Such circuit packs were sold at a higher price because field reliability was supposed to have been improved by the burn-in screening of circuit packs containing defective components. Failures during the first 1000 hours of burn-in were not recorded. This is the reason for the unknown entries in the table and for having information out to 11,000 hours for Vendor 2. The data in Table 1.3 are for the first failure in a position; information on circuit packs replaced after initial failure in a position was not part of the study.

Inspections were costly and were spaced more closely at the beginning of the study because more failures were expected there. The early “infant mortality” failures were caused by component defects in a small proportion of the circuit packs. Such failures are typical for an immature product. For such products, burn-in of circuit packs can be used to weed out most of the packs with weak components. Such burn-in, however, is expensive, and one of the manufacturer’s goals was to develop robust design and manufacturing processes that would eliminate or reduce, as quickly as possible, the occurrence of such defects in future generations of similar products.

There were several goals for this study:

images

Example 1.4 Diesel Generator Fan Failure Data. Nelson (1982, page 133) gives data on diesel generator fan failures. Failures in 12 of 70 generator fans were reported at times ranging between 450 hours and 8750 hours. Of the 58 units that did not fail, the reported running times (i.e., censoring times) ranged between 460 and 11,500 hours. Different fans had different running times because units were introduced into service at different times and because their use-rates differed. The data are reproduced in Appendix Table C.1. Figure 1.4 provides an initial graphical representation of the data. Figure 1.5 shows the censoring data. The data were collected to answer questions like:

images

Example 1.5 Heat Exchanger Tube Crack Data. Nuclear power plants use heat exchangers to transfer energy from the reactor to steam turbines. A typical heat exchanger contains thousands of tubes through which steam flows continuously when the heat exchanger is in service. With age, heat exchanger tubes develop cracks, usually due to some combination of stress-corrosion and fatigue. A heat exchanger can continue to operate safely when the cracks are small. If cracks get large enough, however, leaks can develop, and these could lead to serious safety problems and expensive, unplanned plant shut-down time. To protect against having leaks, heat exchangers are taken out of service periodically so that their tubes (and other components) can be inspected with nondestructive evaluation techniques. At the end of each inspection period, tubes with detected cracks are plugged so that water will no longer pass through them. This reduces plant efficiency but extends the life of the expensive heat exchangers. With this in mind, heat exchangers are built with extra capacity and can remain in operation up until the point where a certain percentage (e.g., 5%) of the tubes have been plugged.

Figure 1.4. Histogram showing failure times (light shade) and running times (dark shade) for the diesel generator fan data

images

Figure 1.5. Failure pattern in a subset of the diesel generator fan data. There were 12 fan failures and 58 right-censored observations

images

Figure 1.6 illustrates the inspection data, available at the end of 1983, from three different power plants. At that time, Plant 1 had been in operation for 3 years, Plant 2 for 2 years, and Plant 3 for only 1 year. Because all of the heat exchangers were manufactured according to the same design and specifications and because the heat exchangers were operated in generating plants run under similar tightly controlled conditions, it seemed reasonable to combine the data from the different plants for the sake of making inferences and predictions about the time-to-crack distribution of the heat exchanger tubes. Figure 1.7 illustrates the same data displayed in terms of amount of operating time instead of calendar time.
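The change of time scale between Figures 1.6 and 1.7 can be sketched as follows. This Python fragment assumes, for illustration only, that each plant started operation at the beginning of a calendar year and was inspected at the end of each year; the start years are inferred from the operating times stated above, and the function name is hypothetical.

```python
# Years in operation at the end of 1983, as stated in the text.
years_in_operation = {"Plant 1": 3, "Plant 2": 2, "Plant 3": 1}
last_inspection_year = 1983

def operating_years(plant, inspection_year):
    """Operating time (years) at the end-of-year inspection in inspection_year.

    Assumption: the plant started at the beginning of a calendar year.
    """
    start_year = last_inspection_year - years_in_operation[plant] + 1
    elapsed = inspection_year - start_year + 1
    if elapsed < 1 or inspection_year > last_inspection_year:
        raise ValueError(f"{plant} has no inspection at the end of {inspection_year}")
    return elapsed

for plant, n_years in years_in_operation.items():
    start = last_inspection_year - n_years + 1
    ages = [operating_years(plant, y) for y in range(start, last_inspection_year + 1)]
    print(plant, "inspection ages in operating years:", ages)
```

On the operating-time scale, inspections from different plants line up at common ages, which is what makes pooling the data across plants straightforward.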

The engineers were interested in predicting tube life of a larger population of tubes in similar heat exchangers in other plants, for purposes of proper accounting and depreciation and so that the company could develop efficient inspection and replacement strategies. They also wanted to know whether the tube failure rate was constant over time or whether the suspected wearout mechanisms (corrosion and fatigue) would begin to cause failures to occur with higher frequency as the heat exchangers age.

images

Figure 1.6. Heat exchanger tube crack inspection data in calendar time.

images

Figure 1.7. Heat exchanger tube crack inspection data in operating time

images

Example 1.6 Transmitter Vacuum Tube Data. Table 1.4 gives life data for a certain kind of transmitter vacuum tube (designated as “V7” within a particular transmitter design). Although solid-state electronics has made vacuum tubes obsolete for most applications, such tubes are still widely used in the output stage of high-power transmitters. These data were originally analyzed in Davis (1952). As seen in many practical situations, the exact failure times were not reported. Instead, we have only the number of failures in each interval or bin. Such data are known as grouped data, interval data, binned data, or read-out data.
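To make the idea of grouped data concrete, the sketch below shows how exact failure times collapse into interval counts once only read-out times are recorded. It is an illustration under stated assumptions: the times and read-out endpoints are hypothetical and are not the V7 tube data, and the function name is invented for this example.

```python
from bisect import bisect_left

def bin_failure_times(times, endpoints):
    """Count failures in the intervals (endpoints[i-1], endpoints[i]].

    The first interval starts at 0. Times beyond the last endpoint are
    counted separately; such units are right-censored at the last endpoint.
    """
    counts = [0] * len(endpoints)
    beyond = 0
    for t in times:
        i = bisect_left(endpoints, t)  # index of first endpoint >= t
        if i == len(endpoints):
            beyond += 1
        else:
            counts[i] += 1
    return counts, beyond

# Hypothetical exact failure times (arbitrary time units), NOT the V7 data.
example_times = [12, 25, 25, 40, 51, 75, 130]
example_endpoints = [25, 50, 75, 100]  # hypothetical read-out times

counts, beyond = bin_failure_times(example_times, example_endpoints)
print("failures per interval:", counts)
print("unfailed at last read-out:", beyond)
```

Once the data are binned, the exact times cannot be recovered; an analysis can use only the interval in which each failure occurred.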

images

Table 1.4. Failure Times for the V7 Transmitter Tube

images

Example 1.7 Turbine Wheel Crack Initiation Data. Nelson (1982) describes a study to estimate the distribution of time to crack initiation for turbine wheels. Each of 432 wheels was inspected once to determine if it had started to crack or not. At the time of the inspections, the wheels had different amounts of service time (age). A unit found to be cracked at its inspection was left-censored at its age (because the crack had initiated at some unknown point before its inspection age). A unit found to be uncracked at its inspection was right-censored at its age (because a crack would be initiated at some unknown point after that age). The data in Table 1.5, taken from Nelson (1982), show the number of cracked and uncracked wheels in different age categories, showing the midpoint of the time intervals given by Nelson. The data were put into intervals to facilitate simpler analyses.
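The censoring rule described in this example can be written down directly. The following Python sketch is an illustration only: the inspection records are hypothetical and are not Nelson's data, and the function name is invented. Each wheel's unknown initiation time is bounded by an interval determined by its age and crack status.

```python
import math

def censoring_interval(age, cracked):
    """Return (lower, upper) bounds on the unknown crack initiation time.

    A cracked wheel is left-censored: initiation occurred before its age.
    An uncracked wheel is right-censored: initiation, if any, comes after.
    """
    if age <= 0:
        raise ValueError("age must be positive")
    return (0.0, float(age)) if cracked else (float(age), math.inf)

# Hypothetical records (age in arbitrary time units, cracked?).
records = [(400, False), (1800, True), (2600, False), (4200, True)]

for age, cracked in records:
    lower, upper = censoring_interval(age, cracked)
    kind = "left-censored" if cracked else "right-censored"
    print(f"age {age}: initiation between {lower} and {upper} ({kind})")
```

No record pins down an initiation time exactly, which is why every observation in this data set is censored.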

In some applications, components with an initiated crack could continue in service for rather long periods of time with the expectation that in-service inspections, scheduled frequently enough, could detect cracks before they grow to a size that could cause a safety hazard.

The important objectives of the study were to obtain information that could be used to:

Table 1.5. Turbine Wheel Inspection Data Summary at Time of Study

images

Figure 1.8. Turbine wheel inspection data summary at time of study

images

The failure/censoring pattern of these data is quite different from the previous examples and is illustrated in Figure 1.8. The analysts did not know the initiation time for any of the wheels. Instead, all they knew about each wheel was its age and whether a crack had initiated or not.

images

1.2.2 Failure-Time Data with Explanatory Variables

Example 1.8 Printed Circuit Board Accelerated Life Test Data. Meeker and LuValle (1995) give data from an accelerated life test on failure of printed circuit boards. The purpose of the experiment was to study the effect of the stresses on the failure-time distribution and to predict reliability under normal operating conditions. More specifically, the experiment was designed to study a particular failure mode—the formation and growth of conductive anodic filaments between copper-plated through-holes in the printed circuit boards. Actual growth of the filaments could not be monitored. Only failure time (defined as a short circuit) could be observed directly. Special test boards were constructed for the experiment. The data described here are part of the results of a much larger experiment aimed at determining the effects of temperature, relative humidity, and electric field on the reliability of printed circuit boards.

Figure 1.9. Scatter plot of printed circuit board accelerated life test data

images

Spacing between the holes in the test boards was chosen to simulate the spacing in actual printed circuit boards. Each test vehicle contained three identical 8 × 18 matrices of holes with alternate columns charged positively and negatively. These matrices, or “boards,” were the observational units in the experiment. Data analysis indicated that any clustering effect of boards within test boards was small enough to ignore in the study.

Meeker and LuValle (1995) give the number of failures that was observed in each of a series of 4-hour and 12-hour long intervals over the life test period. This experiment resulted in interval-censored data because only the interval in which each failure occurred was known. In this example all test units had the same inspection times. A graph of the data in Figure 1.9 plots the midpoint of the intervals containing failures versus relative humidity. The graph shows that failures occur earlier at higher levels of humidity.

images

Example 1.9 Accelerated Test of Spacecraft Nickel-Cadmium Battery Cells. Brown and Mains (1979) present the results of an extensive experiment to evaluate the long-term performance of rechargeable nickel-cadmium battery cells that were to be used in spacecraft. The study used eight experimental factors. The first five factors shown in the table were environmental or accelerating factors (set to higher than usual levels to obtain failure information more quickly). The other three factors were product-design factors that could be adjusted in the product design to optimize performance and reliability of the batteries to be manufactured. The experiment ran 82 batteries, each containing 5 individual cells. Each battery was tested at a combination of factor levels determined according to a central composite experimental plan (see page 487 of Box and Draper, 1987, for information on central composite experimental designs).

images

Figure 1.10. Alloy-A fatigue crack size as a function of number of cycles

images

1.2.3 Degradation Data with no Explanatory Variables

Example 1.10 Fatigue Crack-Size Data. Figure 1.10 and Appendix Table C.14 give the size of fatigue cracks as a function of number of cycles of applied stress for 21 test specimens. This is an example of degradation data. The data were reported originally in Hudak, Saxena, Bucci, and Malcolm (1978). The data were collected to obtain information on crack growth rates for the alloy. The data in Appendix Table C.14 were obtained visually from Figure 4.5.2 of Bogdanoff and Kozin (1985, page 242). For our analysis in the examples in Chapter 13, we will refer to these data as Alloy-A and assume that a crack of size 1.6 inches is considered to be a failure.
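Defining failure as a crack reaching 1.6 inches turns each degradation path into a failure time, or into a right-censored observation if the path never reaches that size. The sketch below illustrates one simple way this could be done, using linear interpolation between adjacent measurements. The crack path is hypothetical and is not the Alloy-A data, and both the function name and the interpolation rule are assumptions made for illustration.

```python
def first_crossing(cycles, sizes, threshold=1.6):
    """Interpolated cycle count at which crack size first reaches threshold.

    Returns None if the path never reaches the threshold (a right-censored
    unit). Assumes cycles are strictly increasing.
    """
    if sizes[0] >= threshold:
        return float(cycles[0])
    for i in range(1, len(sizes)):
        s0, s1 = sizes[i - 1], sizes[i]
        if s0 < threshold <= s1:
            c0, c1 = cycles[i - 1], cycles[i]
            # Linear interpolation between the two bracketing measurements.
            return c0 + (threshold - s0) * (c1 - c0) / (s1 - s0)
    return None

# Hypothetical crack path, NOT the Alloy-A data.
cycles = [0, 20, 40, 60, 80]        # thousands of cycles
sizes = [0.9, 1.1, 1.4, 1.8, 2.3]   # crack size in inches

print("failure at about", first_crossing(cycles, sizes), "thousand cycles")
print("censored path:", first_crossing(cycles[:3], sizes[:3]))
```

A path that ends below the threshold returns None, which corresponds to a unit that is right-censored at its last measurement.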

images