Table of Contents

Title Page

Dedication

Preface

Introducing Excel
So How Did We Get to Here?
Intended Level of the Textbook
Textbook Organization
Leading by Example(s)

Acknowledgments

The Authors

Part 1

Chapter 1: Statistics and Excel
1.1 How This Book Differs from Other Statistics Texts
1.2 Statistical Applications in Health Policy and Health Administration
1.3 What Is the “Big Picture”?
1.4 Some Initial Definitions
1.5 Five Statistical Tests
Key Terms
Chapter 2: Excel as a Statistical Tool
2.1 The Basics
2.2 Working and Moving Around in a Spreadsheet
2.3 Excel Functions
2.4 The =IF() Function
2.5 Excel Graphs
2.6 Sorting a String of Data
2.7 The Data Analysis Pack
2.8 Functions That Give Results in More than One Cell
2.9 The Dollar Sign ($) Convention for Cell References
Key Terms
Chapter 3: Data Acquisition: Sampling and Data Preparation
3.1 The Nature of Data
3.2 Sampling
3.3 Data Access and Preparation
3.4 Missing Data
Key Terms
Chapter 4: Data Display: Descriptive Presentation, Excel Graphing Capability
4.1 Creating, Displaying, and Understanding Frequency Distributions
4.2 Using the Pivot Table to Generate Frequencies of Categorical Variables
4.3 A Logical Extension of the Pivot Table: Two Variables
Key Terms
Chapter 5: Basic Concepts of Probability
5.1 Some Initial Concepts and Definitions
5.2 Marginal Probabilities, Joint Probabilities, and Conditional Probabilities
5.3 Binomial Probability
5.4 The Poisson Distribution
5.5 The Normal Distribution
Key Terms
Chapter 6: Measures of Central Tendency and Dispersion: Data Distributions
6.1 Measures of Central Tendency and Dispersion
6.2 The Distribution of Frequencies
6.3 The Sampling Distribution of the Mean
6.4 Mean and Standard Deviation of a Discrete Numerical Variable
6.5 The Distribution of a Proportion
6.6 The t Distribution
Key Terms

Part 2

Chapter 7: Confidence Limits and Hypothesis Testing
7.1 What Is a Confidence Interval?
7.2 Calculating Confidence Limits for Multiple Samples
7.3 What Is Hypothesis Testing?
7.4 Type I and Type II Errors
7.5 Selecting Sample Sizes
Key Terms
Chapter 8: Statistical Tests for Categorical Data
8.1 Independence of Two Variables
8.2 Examples of Chi-Square Analyses
8.3 Small Expected Values in Cells
Key Terms
Chapter 9: t Tests for Related and Unrelated Data
9.1 What Is a t Test?
9.2 A t Test for Comparing Two Groups
9.3 A t Test for Related Data
Key Terms
Chapter 10: Analysis of Variance
10.1 One-Way Analysis of Variance
10.2 ANOVA for Repeated Measures
10.3 Factorial Analysis of Variance
Key Terms
Chapter 11: Simple Linear Regression
11.1 Meaning and Calculation of Linear Regression
11.2 Testing the Hypothesis of Independence
11.3 The Excel Regression Add-In
11.4 The Importance of Examining the Scatterplot
11.5 The Relationship between Regression and the t Test
Key Terms
Chapter 12: Multiple Regression: Concepts and Calculation
12.1 Introduction
Key Terms
Chapter 13: Extensions of Multiple Regression
13.1 Dummy Variables in Multiple Regression
13.2 The Best Regression Model
13.3 Correlation and Multicollinearity
13.4 Nonlinear Relationships
Key Terms
Chapter 14: Analysis with a Dichotomous Categorical Dependent Variable
14.1 Introduction to the Dichotomous Dependent Variable
14.2 An Example with a Dichotomous Dependent Variable: Traditional Treatments
14.3 Logit for Estimating Dichotomous Dependent Variables
14.4 A Comparison of Ordinary Least Squares, Weighted Least Squares, and Logit
Key Terms

Appendix A: Multiple Regression and Matrices

An Introduction to Matrix Math
Addition and Subtraction of Matrices
Multiplication of Matrices
Matrix Multiplication and Scalars
Finding the Determinant of a Matrix
Matrix Capabilities of Excel
Explanation of Excel Output Displayed with Scientific Notation
Using the b Coefficients to Generate Regression Results
Calculation of All Multiple Regression Results

References

Glossary

Index

End User License Agreement

List of Illustrations

Chapter 2: Excel as a Statistical Tool

Figure 2.1 Initial view of an Excel spreadsheet
Figure 2.2 Excel arithmetical conventions
Figure 2.3 Moving around a data set
Figure 2.4 Result of Ctrl+Shift+Right arrow
Figure 2.5 Highlighting an entire column
Figure 2.6 Copying a formula to several cells
Figure 2.7 Moving a data range with drag and drop
Figure 2.8 The Undo button
Figure 2.9 Insert Function dialog box
Figure 2.10 Function Arguments dialog box
Figure 2.11 Calculation of average
Figure 2.12 Summing two noncontiguous areas
Figure 2.13 Use of the =IF() function
Figure 2.14 Nested =IF() functions
Figure 2.15 Insert Chart dialog box
Figure 2.16 A basic bar graph
Figure 2.17 Excel's chart pop-up menu
Figure 2.18 Select Data Source dialog box
Figure 2.19 A different view of the same data
Figure 2.20 Chart formatting pop-up menu
Figure 2.21 Sort dialog box
Figure 2.22 Sort dialog box and data to be sorted
Figure 2.23 Sort dialog box with two-column sort options
Figure 2.24 Result of data sort on two variables
Figure 2.25 Data Analysis option
Figure 2.26 Excel Options button
Figure 2.27 Excel Options dialog box with add-ins screen displayed
Figure 2.28 Add-Ins dialog box with Analysis ToolPak selected
Figure 2.29 Data Analysis dialog box
Figure 2.30 Frequency calculation
Figure 2.31 Matrix math example
Figure 2.32 Calculations of percentages

Chapter 3: Data Acquisition: Sampling and Data Preparation

Figure 3.1 A small data set
Figure 3.2 Constructed data for an imaginary clinic
Figure 3.3 Beginning of random number generation
Figure 3.4 Paste Special dialog box
Figure 3.5 Sort dialog box
Figure 3.6 Result of sort operation
Figure 3.7 Partial list of women who will receive each intervention
Figure 3.8 Random Number Generation dialog box
Figure 3.9 Five sets of 10 random numbers
Figure 3.10 Value and probability input range (example)
Figure 3.11 Text Import Wizard, Step 1
Figure 3.12 Text Import Wizard, Step 2
Figure 3.13 Text Import Wizard, Step 3
Figure 3.14 Data as initially imported from a text file
Figure 3.15 Making imported dates century-correct
Figure 3.16 Text file imported to Excel with ID and Variable labels
Figure 3.17 Calculation of length of stay
Figure 3.18 Calculation of age
Figure 3.19 Imported file ready for analysis

Chapter 4: Data Display: Descriptive Presentation, Excel Graphing Capability

Figure 4.1 =MIN() and =MAX() functions
Figure 4.2 Frequency distribution of age
Figure 4.3 Formulas for frequency distribution of age
Figure 4.4 Data Analysis dialog box
Figure 4.5 Histogram dialog box
Figure 4.6 Final output for histogram example
Figure 4.7 Chart of the age frequency distribution
Figure 4.8 Chart of age showing BIN ranges
Figure 4.9 Line chart depiction of age frequencies
Figure 4.10 Bar chart depiction of age frequencies
Figure 4.11 Pie chart depiction of age frequencies
Figure 4.12 XY(Scatter) chart of age and LOS
Figure 4.13 Cumulative frequency and percentage distributions
Figure 4.14 Formula view of Figure 4.13
Figure 4.15 Graph of age showing actual and cumulative values
Figure 4.16 Four distribution types
Figure 4.17 Graph of Medicare payments
Figure 4.18 Infant mortality for 149 countries of the world
Figure 4.19 Infant mortality for states of the United States
Figure 4.20 Simulation of the roll of a fair die 100 times
Figure 4.21 Create PivotTable dialog box
Figure 4.22 Pivot table layout screen
Figure 4.23 Finished pivot table for Sex category
Figure 4.24 Value Field Settings dialog box
Figure 4.25 Pivot table layout screen
Figure 4.26 Two-variable pivot table for DRG category and Sex
Figure 4.27 A Pareto chart for DRG categories

Chapter 5: Basic Concepts of Probability

Figure 5.1 Possible combinations of five coin flips
Figure 5.2 Venn diagram for two mutually exclusive events
Figure 5.3 Children ever born to 2,556 women
Figure 5.4 Sequential events
Figure 5.5 First 20 observations in an emergency room visit file
Figure 5.6 Contingency table of shift and emergency status
Figure 5.7 Joint probabilities for shift and emergency status
Figure 5.8 Venn diagram of two events that are not mutually exclusive
Figure 5.9 Joint probability “or” for shift and emergency status
Figure 5.10 Conditional probabilities for arrival during any shift
Figure 5.11 Conditional probabilities for high- and low-income women and number of children
Figure 5.12 All possible outcomes of the flip of a coin five times
Figure 5.13 All possible outcomes of five emergency room visits
Figure 5.14 Probabilities of number of visits that are actual emergencies
Figure 5.15 Probabilities of number of visits using formulas
Figure 5.16 Formulas used for calculations of probabilities
Figure 5.17 The =BINOMDIST() function
Figure 5.18 Binomial distribution for emergencies in an eight-hour shift
Figure 5.19 Binomial distributions for 0.75 and 0.85 correct
Figure 5.20 Poisson distribution of emergency room arrivals in 15-minute intervals
Figure 5.21 Calculated Poisson distribution of emergency room arrivals in 15-minute intervals
Figure 5.22 Calculated Poisson distribution of emergency room arrivals: Excel formulas
Figure 5.23 Poisson distribution for gloves that are not usable in a box of 100
Figure 5.24 A normal distribution

Chapter 6: Measures of Central Tendency and Dispersion: Data Distributions

Figure 6.1 Time spent by physician with patients
Figure 6.2 Ordered time spent by physician with patients
Figure 6.3 Calculation of variance
Figure 6.4 All samples of two from a population of four
Figure 6.5 Sum of absolute differences
Figure 6.6 Sum of absolute differences, second example
Figure 6.7 Sum of squared differences
Figure 6.8 Frequency distribution with mean and standard deviation, HDI data
Figure 6.9 Histogram (graph) of HDI values
Figure 6.10 Normal distribution
Figure 6.11 Calculations for normal distribution
Figure 6.12 Cumulative normal distribution
Figure 6.13 Cumulative normal probabilities for the weight of newborns
Figure 6.14 Length of stay for one year of discharges from a 200-bed hospital
Figure 6.15 Distribution of 250 sample means
Figure 6.16 Comparison of the population variance divided by two and the variance of the mean of all samples of size two
Figure 6.17 Probabilities of prenatal visits
Figure 6.18 Data Analysis dialog box
Figure 6.19 Random Number Generation dialog box
Figure 6.20 Random samples of visits
Figure 6.21 Example of calculations of means, standard deviations, and standard errors for 100 samples
Figure 6.22 Distribution of means from 250 samples of size 100
Figure 6.23 Calculation of the mean and standard deviation of a discrete numerical variable
Figure 6.24 Portion of correctly and incorrectly completed forms
Figure 6.25 Calculation of probability of 85 percent correct
Figure 6.26 Probability of 70 percent or less
Figure 6.27 Use of the =NORMDIST() function
Figure 6.28 Degrees of freedom
Figure 6.29 Two t distributions
Figure 6.30 Exact probabilities of t distributions

Chapter 7: Confidence Limits and Hypothesis Testing

Figure 7.1 Distribution of costs for 12,000 discharges
Figure 7.2 Confidence limits from 10 samples
Figure 7.3 Calculation of means and limits
Figure 7.4 Distribution of sample means around 48 inches
Figure 7.5 Distribution of sample means for 48 inches and 49 inches
Figure 7.6 Distribution around true means of 45 and 48 inches
Figure 7.7 Sixty-eight percent confidence limits for a distribution around 48 inches
Figure 7.8 Distributions around three true means
Figure 7.9 Distributions around 48 and 44.6 inches
Figure 7.10 Two distributions for a sample of 290
Figure 7.11 Upper limit for $c07-math-0104$
Figure 7.12 Positioning $c07-math-0106$
Figure 7.13 Low beta value
Figure 7.14 Effect of sample size on standard error and measurement error

Chapter 8: Statistical Tests for Categorical Data

Figure 8.1 Marginal frequencies
Figure 8.2 Marginal frequencies with most probable internal frequencies
Figure 8.3 Marginal and conditional probabilities
Figure 8.4 Different conditional probabilities
Figure 8.5 Template for the chi-square
Figure 8.6 The =CHITEST function
Figure 8.7 The chi-square distribution for one degree of freedom
Figure 8.8 Example from the Halpern et al. (2001) article
Figure 8.9 An n-by-two chi-square
Figure 8.10 Number of children and desire for more
Figure 8.11 Adequacy of treatment of three conditions
Figure 8.12 Yates's correction
Figure 8.13 Small expected values in df > 1

Chapter 9: t Tests for Related and Unrelated Data

Figure 9.1 Distribution of 250 t tests when H0 = $5,905
Figure 9.2 Distribution of 250 t tests when H0 = $4,500
Figure 9.3 Region of rejection for two-tail test
Figure 9.4 Region of rejection for one-tail test
Figure 9.5 Type II error for true means of $5,000 and $7,000
Figure 9.6 Type II error for true means of $5,000 and $7,000 and sample size 150 for each group
Figure 9.7 Results of a breast cancer experiment
Figure 9.8 F distribution
Figure 9.9 Dialog box for t test for equal variance
Figure 9.10 Results of Excel t test for equal variance
Figure 9.11 Results of Excel t test for unequal variance
Figure 9.12 Calculation of before and after t test
Figure 9.13 Excel add-in for before and after t test

Chapter 10: Analysis of Variance

Figure 10.1 Average cost distribution for two hospitals
Figure 10.2 Average cost distribution for four hospitals
Figure 10.3 Analysis of variance for four hospitals
Figure 10.4 Data arrangement for the Excel Single-Factor ANOVA add-in
Figure 10.5 ANOVA: Single-Factor dialog box
Figure 10.6 Output of the ANOVA: Single-Factor data analysis add-in
Figure 10.7 Test of differences between two means in ANOVA
Figure 10.8 Bartlett test for homogeneity of variance: Interpreting the Bartlett test
Figure 10.9 Data for ANOVA for repeated measures
Figure 10.10 Results of ANOVA repeated measures
Figure 10.11 Calculation of SS_R
Figure 10.12 Degrees of freedom in repeated measures
Figure 10.14 ANOVA factorial analysis
Figure 10.15 Degrees of freedom in two-factor factorial ANOVA
Figure 10.16 Data arrangement for ANOVA: Two-Factor with Replication
Figure 10.17 Results of ANOVA: Two-Factor with Replication
Figure 10.18 Simple data for repeated measures in a factorial design
Figure 10.19 ANOVA results for repeated measures in a factorial design
Figure 10.20 Sources of variation and degrees of freedom in factorial designs
Figure 10.21 Appropriate analysis for repeated measures, two-factor design

Chapter 11: Simple Linear Regression

Figure 11.1 Examples of relationships
Figure 11.2 Positive relationship with the best-fitting straight line
Figure 11.3 Twenty hospital stays
Figure 11.4 Length of stay and charges
Figure 11.5 Calculation of coefficients
Figure 11.6 Calculation of R² and F
Figure 11.7 Total variance, regression variance, and error variance
Figure 11.8 Excel Data Analysis add-in dialog box
Figure 11.9 Regression dialog box
Figure 11.10 Results of using the regression add-in
Figure 11.11 Four data sets
Figure 11.12 Scatterplots of four data sets where y = .5x + 1.55
Figure 11.13 Regression as a t test

Chapter 12: Multiple Regression: Concepts and Calculation

Figure 12.1 Cost data for 10 hospitals
Figure 12.2 Initial Regression dialog box
Figure 12.3 Multiple regression output
Figure 12.4 Calculation of sums
Figure 12.5 Successive elimination for b_j

Chapter 13: Extensions of Multiple Regression

Figure 13.1 Data for 20 hospital days with sex as a dummy variable
Figure 13.2 Results of regression for 20 hospitals
Figure 13.3 Graph of cost data by length of stay
Figure 13.4 Hospital charges with dummy and interaction
Figure 13.5 Regression coefficients with dummy and interactions
Figure 13.6 Hospital charges with dummy and interaction: Modified example
Figure 13.7 Regression coefficients with dummy only: Modified example
Figure 13.8 Regression coefficients with dummy and interaction: Modified example
Figure 13.9 Hospital charges with dummy and interaction graphed
Figure 13.10 Charge data showing an interaction effect
Figure 13.11 Coefficients for charge data showing an interaction effect
Figure 13.12 Coefficients for charge data showing only the interaction effect
Figure 13.13 Graph of charge data with predicted lines
Figure 13.14 Data from The State of the World's Children (1996)
Figure 13.15 Coefficients for each of the predictor variables for U5Mort independently
Figure 13.16 Multiple regression coefficients of predictors for U5Mort
Figure 13.17 Multiple regression coefficients of best predictors for U5Mort: Backward elimination
Figure 13.18 Multiple regression coefficients of best predictors for U5Mort: Forward inclusion
Figure 13.19 Multiple regression coefficients of best predictors for U5Mort: Health predictors
Figure 13.20 Multiple regression coefficients of best predictors for U5Mort: Cultural predictors
Figure 13.21 Correlation among the predictors for U5Mort
Figure 13.22 CPR and TFR together as predictors for U5Mort
Figure 13.23 CPR and TFR together as predictors for U5Mort: Reduced sample
Figure 13.24 Relationship between GNP and U5Mort
Figure 13.25 Sample of data for second-degree curve analysis of U5Mort
Figure 13.26 Relationship between GNP and U5Mort with U5Mort predicted: Second-degree curve
Figure 13.27 Relationship between LogGNP and U5Mort
Figure 13.28 Relationship between LogGNP and LogU5Mort
Figure 13.29 Relationship between GNP and U5Mort with HiGNP dummy
Figure 13.30 LogGNP versus Log U5Mort with Format Data Series menu displayed
Figure 13.31 Format Trendline dialog box
Figure 13.32 Selections in the Format Trendline dialog box
Figure 13.33 Best-fitting line for LogGNP and LogU5Mort: Linear model
Figure 13.34 Best-fitting line for GNP and U5Mort: Logarithmic model
Figure 13.35 Best-fitting line for GNP and U5Mort: Power model
Figure 13.36 Best-fitting line for GNP and U5Mort: Exponential model

Chapter 14: Analysis with a Dichotomous Categorical Dependent Variable

Figure 14.1 Data for 32 mothers
Figure 14.2 One-zero conversion of data for 32 mothers
Figure 14.3 Chi-square analysis of immunization by age of mother
Figure 14.4 Results of OLS for immunization data
Figure 14.5 Calculation of weights for WLS
Figure 14.6 Weighted least squares results
Figure 14.7 Regression setup for weighted least squares analysis
Figure 14.8 Calculation of pseudo R square for WLS
Figure 14.9 Graph of two relationships between independent and dependent variables
Figure 14.10 First step in finding logL
Figure 14.11 Complete layout for maximizing logL
Figure 14.12 Solver Parameters dialog box for maximizing logL
Figure 14.13 Solver solution for Logit model
Figure 14.14 Calculation of chi-square for Logit
Figure 14.15 First step in the calculation of the information matrix
Figure 14.16 Calculation of q_i × x_ij
Figure 14.17 Formation of the transpose matrix
Figure 14.18 Information matrix and t tests
Figure 14.19 Comparison of OLS-, WLS-, and Logit-predicted values

Appendix A: Multiple Regression and Matrices

Figure A.1 Hospital data
Figure A.2 Arrays y, X, and X′
Figure A.3 Arrays X′X and X′y
Figure A.4 Inverse of X′X
Figure A.5 The b coefficients
Figure A.6 Calculation of standard errors of b
Figure A.7 Calculation of all results from Figure 12.3

List of Tables

Chapter 2: Excel as a Statistical Tool

Table 2.1 Glossary of Key Excel Terms Used throughout Text

Chapter 3: Data Acquisition: Sampling and Data Preparation

Table 3.1 Formula for Figure 3.17
Table 3.2 Formula for Figure 3.18
Table 3.3 Formula for Figure 3.19

Chapter 6: Measures of Central Tendency and Dispersion: Data Distributions

Table 6.1 Formulas for Figure 6.3
Table 6.2 Formulas for Figure 6.8

Chapter 7: Confidence Limits and Hypothesis Testing

Table 7.1 Formulas for Figure 7.3

Chapter 9: t Tests for Related and Unrelated Data

Table 9.1 Formulas for Figure 9.7
Table 9.2 Formulas for Figure 9.12

Chapter 10: Analysis of Variance

Table 10.1 Formulas for Figure 10.3
Table 10.2 Formulas for Figure 10.7
Table 10.3 Formulas for Figure 10.8
Table 10.4 Formulas for Figure 10.9
Table 10.5 Formulas for Figure 10.10
Table 10.7 Formulas for Figure 10.14
Table 10.8 Formulas for Figure 10.21

Chapter 11: Simple Linear Regression

Table 11.1 Formulas for Figure 11.5
Table 11.2 Formulas for Figure 11.6

Chapter 12: Multiple Regression: Concepts and Calculation

Table 12.1 Formulas for Figure 12.4

Chapter 14: Analysis with a Dichotomous Categorical Dependent Variable

Table 14.1 Formulas for Figure 14.11
Table 14.2 Comparing linear regression, logistic regression, and survival analysis

Appendix A: Multiple Regression and Matrices

Table A.1 Formulas for Figure A.7

Published by Jossey-Bass

A Wiley Brand

One Montgomery Street, Suite 1000, San Francisco, CA 94104-4594, www.josseybass.com

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-646-8600, or on the Web at www.copyright.com. Requests to the publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, 201-748-6011, fax 201-748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Readers should be aware that Internet Web sites offered as citations and/or sources for further information may have changed or disappeared between the time this was written and when it is read.

Jossey-Bass books and products are available through most bookstores. To contact Jossey-Bass directly call our Customer Care Department within the U.S. at 800-956-7739, outside the U.S. at 317-572-3986, or fax 317-572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Names: Kros, John F. | Rosenthal, David A. | Veney, James E. Statistics for health care professionals working with Excel.

Title: Statistics for health care management and administration : working with Excel / John F. Kros, David A. Rosenthal.

Other titles: Statistics for health care professionals working with Excel.

Description: Third edition. | Hoboken : Wiley, 2016. | Series: Public health/epidemiology and biostatistics | Revised edition of Statistics for health care professionals working with Excel, 2009. | Includes index.

Identifiers: LCCN 2015019360 (print) | LCCN 2015039488 (ebook) | ISBN 9781118712658 (paperback) | ISBN 9781118712641 (pdf) | ISBN 9781118712764 (epub)

Classification: LCC RA409.5 .K76 2016 (print) | LCC RA409.5 (ebook) | DDC 610.2/1–dc23

LC record available at http://lccn.loc.gov/2015019360

Cover design: Wiley

The second edition was published under the title Statistics for Health Care Professionals: Working With Excel.

THIRD EDITION

Preface

The study and use of statistics have come a long way since the advent of computers. In particular, computers have reduced both the effort and the time involved in the statistical analysis of data. But this ease of use has been accompanied by some difficulties. As computers became more and more proficient at carrying out statistical operations of increasing complexity, the operations themselves, and what they actually meant and did, became more and more distant from the user. It became possible to carry out a wide variety of statistical operations with a few lines or words of commands to the computer. But the average student, and even the average serious user of statistics, found the increasingly complex operations increasingly difficult to access and understand.

Introducing Excel

Sometime in the late 1980s, Microsoft Excel became available, and with it came the ability to carry out a wide range of statistical operations, and to understand the operations that were being carried out, in a spreadsheet format. John's first introduction to Excel was a revelation. It came during his MBA studies and continued through his doctoral studies and into his first industry job. In fact, John quickly became somewhat indispensable in that first industry job for the simple reason that he was the most proficient of his peers at Excel. Through the years he found himself using Excel to complete all kinds of tasks (since he was too stubborn to learn to program properly). He discovered that Excel was not only a powerful statistical tool but also, more important, a powerful learning tool. When he began to teach the introductory course in business decision modeling to MBA students, Excel seemed to him to be the obvious medium for the course.

So How Did We Get to Here?

At the time John started using Excel in his teaching, there were a few textbooks devoted to statistics using Excel. However, none fit his needs very well, so he wrote Spreadsheet Modeling for Business Decisions. That was about the time John met David. David had earned his doctorate in technology management and had worked in the health care industry for more than 10 years (which ensures that the health care–specific examples and scenarios used in this book are appropriate). He discovered the power of Excel's statistical analysis functionality by using it to calculate the multiple regression and correlation analysis required for his doctoral dissertation.

Through his friend, Scott Bankard, John learned that the author of a successful text in the use of Excel to solve statistical problems in health care administration was looking for someone to revise that text. In turn, John and David became the coauthors of the revised text.

Intended Level of the Textbook

The original text was designed as an introductory statistics text for students at the advanced undergraduate level or for a first course in statistics at the master's degree level. It was intended to stand alone as the book for the only course a student might have in statistics. The same is true for the revised text, which includes some enhancements and updates that provide a good foundation for more advanced courses as well. Furthermore, since the book relies on Excel for all the calculations of the statistical applications, it was also designed to provide a statistical reference for people working in the health field who may have access to Excel but not to other dedicated statistical software. This is valuable in that a copy of Excel resides on the PC of almost every health care professional. Further, no additional appropriations would have to be made for proprietary software and there would be no wait for the “stat folks.”

Textbook Organization

The revised edition of the text has been updated for use with Microsoft Office Excel 2013. It provides succinct instruction in the most commonly used techniques and shows how these tools can be implemented using the most current version of Excel for Windows. The revised text also focuses on developing both algebraic and spreadsheet modeling skills. Algebraic formulation and spreadsheets are juxtaposed to help develop conceptual thinking skills. Step-by-step instructions in Excel 2013 and numerous annotated screenshots make examples easy to follow and understand. Emphasis is placed on the model formulation and interpretation rather than on computer code or algorithms.

The book is organized into two major parts: Part 1, Chapters 1 through 6, presents Excel as a statistical tool, and Part 2, Chapters 7 through 14, discusses hypothesis testing. Part 1 introduces the use of statistics in health policy and health administration–related fields, Excel as a statistical tool, data preparation and the data display capabilities of Excel, and probability, the foundation of statistical analysis. Students and other users of the book who are already familiar with Excel could cover much of the material in Chapters 2, 3, and 4, in particular, very quickly.

Part 2, which includes Chapters 7 through 14, is devoted to the subject of hypothesis testing, the basic function of statistical analysis. Chapter 7 provides a general introduction to the concept of hypothesis testing. Each subsequent chapter provides a description of the major hypothesis testing tool for a specific type of data. Chapter 8 discusses the use of the chi-square statistic for assessing data for which both the independent and dependent variables are categorical. Chapter 9, on t tests, discusses the use of the t test for assessing data in which the independent variable is a two-level categorical variable and the dependent variable is a numerical variable. Chapter 10 is devoted to analysis of variance, which provides an analytical tool for a multilevel categorical independent variable and a numerical dependent variable. Chapters 11 through 13 are devoted to several aspects of regression analysis, which deals with numerical variables both as independent and dependent variables. Finally, Chapter 14 deals with numerical independent variables and dependent variables that are categorical and take on only two levels and introduces the use of Logit.
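The pairing of variable types and tests described above can be summarized as a simple lookup. The following is only an illustrative sketch in Python, not something the book itself uses (the book works entirely in Excel); the name `choose_test` and the category labels are hypothetical and were chosen for this example.

```python
# Hypothetical lookup table, for illustration only:
# (independent variable type, dependent variable type) -> tool and chapter,
# as described in the overview of Part 2.
TESTS = {
    ("categorical", "categorical"): "chi-square (Chapter 8)",
    ("two-level categorical", "numerical"): "t test (Chapter 9)",
    ("multilevel categorical", "numerical"): "analysis of variance (Chapter 10)",
    ("numerical", "numerical"): "regression (Chapters 11-13)",
    ("numerical", "two-level categorical"): "Logit (Chapter 14)",
}


def choose_test(independent: str, dependent: str) -> str:
    """Return the tool the overview associates with a pair of variable types."""
    # Normalize the labels so that case and stray spaces do not matter.
    key = (independent.strip().lower(), dependent.strip().lower())
    if key not in TESTS:
        raise ValueError(f"No tool listed for variable types {key}")
    return TESTS[key]


print(choose_test("two-level categorical", "numerical"))
```

The table covers only the five combinations named in the overview; any other pair raises an error rather than guessing at a test.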

Leading by Example(s)

Each chapter of the book is structured around examples demonstrated extensively with the use of Excel displays. The chapters are divided into sections, most of which include step-by-step discussions of how statistical problems are solved using Excel, including the Excel formulas. Each section in a chapter is followed by exercises that address the material covered in that section. Most of these exercises include the replication of examples from that section. The purpose is to provide students an immediate reference with which to compare their work and determine whether they are able to correctly carry out the procedure involved. Additional exercises are provided on the same subjects for further practice and to reinforce the learning gained from the section. Data for all the exercises are included on the web at www.wiley.com/go/kros3e and may be accessed by file references given in the examples themselves. Additional materials, such as videos, podcasts, and readings, can be found at www.josseybasspublichealth.com.

A supplemental package available to instructors includes all answers to the section exercises. In addition, the supplemental package will contain exam questions with answers and selected Excel spreadsheets that can be used for class presentations, along with suggestions for presenting these materials in a classroom. However, the book can be effectively used for teaching without the additional supplemental material.

Users who would like to provide feedback, suggestions, corrections, or examples of applications can e-mail me at krosj@ecu.edu. Please feel free to contact me with any comments you feel are appropriate.

Acknowledgments

As always, this newly revised version of the text would not have been possible without the support and guidance of numerous colleagues, friends, and family, and all those poor souls who had to listen to us bounce ideas off them or, for that matter, anyone who just had to listen to us!

Many people contributed to this book as it now appears—in particular, several faculty at the University of North Carolina who contributed to the original edition and to this revision in various ways.

We would like to thank proposal reviewers Nan Liu, Xinliang Liu, Lawrence A. West, Jr., James Porto, and Graciela E. Silva, who provided valuable feedback on the original book proposal. Echu Liu, John P. Gaze, and James Porto provided thoughtful and constructive comments on the complete draft manuscript.

John thanks his wife, Novine, and his daughters, Samantha and Sabrina, for always being by his side and encouraging him in the special way they do when the light at the end of the tunnel starts to dim. At present Samantha is 13 years old and always reminds her dad that she loves him, and maybe someday he will be cool again (no date has been set yet). Sabrina is 11 years old and tells her dad that she expects great things from the East Carolina University Pirate football squad, the Texas Longhorn and Nebraska Cornhusker football teams, and the Virginia Cavaliers. She also invites anyone interested to eat brunch on any given Sunday at her favorite establishment, the West End Dining Hall on East Carolina University's campus. John would like to say thank you to Novine, Samantha, and Sabrina and that he loves them very much. John also must thank his parents, Bernie and Kaye, who have always supported him, even when they didn't exactly know what he was writing about. Finally, John has to thank Scott Bankard of Vidant Health for setting things in motion way back in 2007 and suggesting the project.

David thanks his wife Allyson for her unending and unwavering support, and his son, daughter, and son-in-law for being so easily impressed by his modest professional achievements.

—John F. Kros, PhD, and David A. Rosenthal, PhD

The Authors

John F. Kros is the Vincent K. McMahon Distinguished Professor of Business in the Marketing and Supply Chain Management Department in the College of Business at East Carolina University, in Greenville, North Carolina. He teaches business decision modeling, statistics, operations and supply chain management, and logistics and materials management courses. Kros was honored as the College of Business's Scholar/Teacher for 2004–2005, again in 2009–2010, and was awarded the College of Business Commerce Club's highest honor, the Teaching Excellence Award, for 2006 and again in 2011. Kros earned his PhD in systems engineering from the University of Virginia, his MBA from Santa Clara University, and his BBA from the University of Texas at Austin. His research interests include health care operations, applied statistics, design of experiments, multi-objective decision making, Taguchi methods, and applied decision analysis. In 2014, the fourth edition of his textbook titled Spreadsheet Modeling for Business Decisions was printed. He is also coauthor of Health Care Operations and Supply Chain Management, published in 2013. He enjoys spending his free time with his beautiful red-headed wife, Novine, and their two beautiful daughters, Samantha and Sabrina, traveling, snow skiing, vegetable gardening, spending time with his family and old fraternity brothers, watching college football, and attempting to locate establishments that provide quality food and liquid refreshment.

David A. Rosenthal is a professor and chair of Health Care Management at Baptist College of Health Sciences in Memphis, Tennessee. Rosenthal earned a master of public administration degree from Valdosta State University in 1996, and a PhD in technology management from Indiana State University in 2002. He has over 20 years of health care experience in both academic and practitioner settings, having served in roles specific to health care information technology leadership, multispecialty practice management, and ambulatory services project management. Rosenthal served for two years as director of the state of Tennessee's Health Information Exchange (HIE) Evaluation Project while a faculty member in the division of Health Systems Management and Policy in the School of Public Health at the University of Memphis from 2009 to 2013. He also served as director of Statewide eHealth Initiatives while a faculty member in the Department of Health Informatics and Information Management at the University of Tennessee Health Science Center in Memphis from 2007 to 2009. Rosenthal currently resides in Memphis, Tennessee, with his wife, Allyson, and their extended family.

STATISTICS FOR HEALTH CARE MANAGEMENT AND ADMINISTRATION

WORKING WITH EXCEL

Third Edition

Part 1