Mon, 23 Dec 2024 13:58:09 GMT

Statistical


Synopsis


'Refreshingly clear and engaging' Tim Harford

'Delightful . . . full of unique insights' Prof Sir David Spiegelhalter

There's no getting away from statistics. We encounter them every day. We are all users of statistics whether we like it or not.

Do missed appointments really cost the NHS £1bn per year?

What's the difference between the mean gender pay gap and the median gender pay gap?

How can we work out if a claim that we use 42 billion single-use plastic straws per year in the UK is accurate?

What did the Vote Leave campaign's £350m bus really mean?

How can we tell if the headline 'Public pensions cost you £4,000 a year' is correct?

Does snow really cost the UK economy £1bn per day?

But how do we distinguish statistical fact from fiction? What can we do to decide whether a number, claim or news story is accurate? Without an understanding of data, we cannot truly understand what is going on in the world around us.

Written by Anthony Reuben, the BBC's first head of statistics, Statistical is an accessible and empowering guide to challenging the numbers all around us.

Summary

Chapter 1: Introduction to Statistical Inference

This chapter introduces the basic concepts of statistical inference and discusses its role in scientific research. It covers topics such as population parameters, sample statistics, and the goals of statistical inference, such as estimating population parameters and testing hypotheses.

Example: A researcher wants to estimate the average height of all adults in a country. They select a random sample of 100 adults and measure their heights. The sample mean is 170 cm. The researcher can use this sample statistic to estimate the population mean height.
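As a rough illustration of the example above, the sketch below draws a random sample of 100 heights and uses the sample mean as a point estimate of the population mean. The population is simulated: the mean of 170 cm and standard deviation of 8 cm are assumptions made for illustration, not figures from the text.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of adult heights in cm (invented for illustration).
population = [random.gauss(170, 8) for _ in range(100_000)]

# Draw a simple random sample of 100 adults, as in the example.
sample = random.sample(population, 100)

sample_mean = statistics.mean(sample)              # point estimate of the population mean
sample_sd = statistics.stdev(sample)               # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / len(sample) ** 0.5    # estimated standard error of the mean

print(f"sample mean = {sample_mean:.1f} cm, standard error = {standard_error:.2f} cm")
```

The standard error gives a sense of how far the sample mean is likely to sit from the population mean.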

Chapter 2: Sampling Distributions and the Central Limit Theorem

This chapter discusses sampling distributions and the central limit theorem. A sampling distribution is the distribution of a sample statistic, such as the sample mean or proportion, across repeated samples of the same size. The central limit theorem states that, for independent observations from a population with finite variance, the distribution of the sample mean is approximately normal when the sample size is large enough, regardless of the shape of the population distribution.

Example: The researcher from Chapter 1 takes multiple random samples of 100 adults and calculates the sample mean for each sample. The distribution of these sample means will be approximately normal, even if the population distribution is not normal.
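The behaviour described above can be checked with a small simulation. This sketch assumes a deliberately skewed population (an exponential distribution with mean 1, chosen only for illustration) and looks at the means of 2,000 samples of size 100.

```python
import random
import statistics

random.seed(1)

def sample_mean(n):
    """Mean of n draws from an exponential distribution with mean 1 (right-skewed)."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Repeat the sampling many times to approximate the sampling distribution.
means = [sample_mean(100) for _ in range(2000)]

center = statistics.mean(means)   # theory: close to the population mean, 1
spread = statistics.stdev(means)  # theory: close to 1 / sqrt(100) = 0.1

# For a normal distribution, about 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center = {center:.3f}, spread = {spread:.3f}, within 1 sd = {within_one_sd:.2f}")
```

Even though individual draws are far from normal, the sample means cluster symmetrically around the population mean.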

Chapter 3: Confidence Intervals

This chapter introduces confidence intervals, which estimate a population parameter as a range with a stated margin of error. The confidence level describes the procedure rather than any single interval: if the sampling were repeated many times, about that proportion of the resulting intervals would contain the true population parameter.

Example: The researcher from Chapter 1 calculates a 95% confidence interval for the average height of all adults in the country. The interval is from 168.5 cm to 171.5 cm. The researcher can say with 95% confidence that the true population mean height falls within this interval, meaning that the method used captures the true mean in about 95% of repeated samples.
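A minimal sketch of how such an interval might be computed is shown below. The sample mean (170 cm) and sample size (100) come from the example, but the text gives no standard deviation, so the value of 7.65 cm is an assumption chosen for illustration. The sketch also uses the normal approximation in place of the t distribution, which is reasonable only for large samples.

```python
import random
import statistics
from statistics import NormalDist

def mean_ci(sample_mean, sample_sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    margin = z * sample_sd / n ** 0.5
    return sample_mean - margin, sample_mean + margin

# Mean and n are from the example; the standard deviation is assumed.
low, high = mean_ci(170.0, 7.65, 100)
print(f"95% CI: {low:.1f} cm to {high:.1f} cm")

# Coverage check on simulated data: across repeated samples from a
# hypothetical population whose true mean is 170, roughly 95% of the
# intervals should contain 170.
random.seed(2)
trials = 2000
hits = 0
for _ in range(trials):
    sample = [random.gauss(170, 7.65) for _ in range(100)]
    lo, hi = mean_ci(statistics.mean(sample), statistics.stdev(sample), len(sample))
    hits += lo <= 170 <= hi
coverage = hits / trials
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The coverage figure is what the 95% refers to: it is a property of the interval-building procedure over many samples.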

Chapter 4: Hypothesis Testing

This chapter discusses hypothesis testing, which is used to determine whether there is enough evidence to reject a null hypothesis in favor of an alternative hypothesis. It covers steps such as formulating the null and alternative hypotheses, setting a significance level, selecting a test statistic, and making a decision.

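Example: The researcher from Chapter 1 wants to test whether the average height of all adults in the country is less than 170 cm. The null hypothesis is that the mean is 170 cm, and the alternative hypothesis is that it is less than 170 cm. The researcher conducts a one-sided t-test and concludes that there is not enough evidence to reject the null hypothesis. This does not show that the null hypothesis is true, only that the data do not contradict it strongly enough.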
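The steps of such a test can be sketched as follows. The heights are simulated rather than real, and the function name is hypothetical. Because the Python standard library has no t distribution, the p-value here uses the standard normal as an approximation, which is an assumption that only holds up for large samples.

```python
import random
import statistics
from statistics import NormalDist

def one_sample_lower_test(sample, mu0):
    """Test H0: mean = mu0 against H1: mean < mu0.

    Returns the t statistic and an approximate one-sided p-value
    (standard normal used in place of the t distribution).
    """
    n = len(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    p_value = NormalDist().cdf(t_stat)  # lower-tail probability
    return t_stat, p_value

random.seed(3)
heights = [random.gauss(170, 8) for _ in range(100)]  # simulated data

alpha = 0.05  # significance level, fixed before looking at the data
t_stat, p_value = one_sample_lower_test(heights, mu0=170)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, p = {p_value:.3f}: {decision}")
```

A small p-value would count as evidence for the alternative; a large one leaves the null hypothesis standing without proving it.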

Chapter 5: Two-Sample Inference

This chapter covers statistical methods for comparing two independent samples, such as t-tests for comparing means and chi-square tests for comparing proportions.

Example: The researcher wants to compare the average height of men and women. The researcher takes two random samples, one of men and one of women, and finds that the sample mean height for men is 175 cm and the sample mean height for women is 165 cm. The researcher can use a two-sample t-test to determine whether the difference in average height between men and women is statistically significant.
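One way to carry out this comparison is Welch's version of the two-sample t-test, which does not assume equal variances in the two groups; the text does not say which variant is intended, so this is a sketch under that assumption. The data are simulated around the sample means in the example, the standard deviations are invented, and the p-value again uses a normal approximation.

```python
import random
import statistics
from statistics import NormalDist

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a) / na  # squared standard error, group a
    vb = statistics.variance(sample_b) / nb  # squared standard error, group b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t_stat, df

random.seed(4)
men = [random.gauss(175, 7.0) for _ in range(100)]    # simulated heights in cm
women = [random.gauss(165, 6.5) for _ in range(100)]  # simulated heights in cm

t_stat, df = welch_t(men, women)
# Two-sided p-value from the normal approximation (adequate for large df).
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))
print(f"t = {t_stat:.2f}, df = {df:.1f}, p = {p_two_sided:.4f}")
```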

Chapter 6: Analysis of Variance (ANOVA)

This chapter introduces ANOVA, which extends the two-sample t-test to compare means across three or more groups with a single overall test.

Example: A company wants to compare the performance of three different marketing campaigns. The company assigns each participant to one of the campaigns and measures the resulting sales. ANOVA can be used to determine whether mean sales differ significantly between at least two of the three groups.
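The core of a one-way ANOVA is the F statistic, the ratio of variation between group means to variation within groups. The sketch below computes it by hand; the sales figures and the function name are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return the F statistic and its degrees of freedom for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sales per participant under each campaign.
campaign_a = [12, 15, 14, 10, 13]
campaign_b = [18, 20, 17, 19, 21]
campaign_c = [11, 9, 12, 10, 13]

f_stat, df_between, df_within = one_way_anova_f([campaign_a, campaign_b, campaign_c])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

Turning F into a p-value requires the F distribution, which the standard library does not provide; a statistics package such as SciPy (scipy.stats.f_oneway) could be used for that step.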

Chapter 7: Regression Analysis

This chapter discusses regression analysis, a statistical method used to model the relationship between a dependent variable and one or more independent variables.

Example: A researcher wants to predict sales based on advertising expenditure. The researcher collects data on advertising expenditure and sales for several different months. Regression analysis can be used to model the relationship between advertising expenditure and sales.
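For a single independent variable, the model is a straight line fitted by ordinary least squares. The following sketch fits one by hand; the monthly figures and the function name are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical monthly data, both in thousands of pounds.
ad_spend = [10, 15, 20, 25, 30, 35]
sales = [25, 32, 41, 48, 61, 66]

slope, intercept, r_squared = fit_line(ad_spend, sales)
predicted = intercept + slope * 40  # predicted sales at a spend of 40
print(f"sales = {intercept:.2f} + {slope:.2f} * spend, R^2 = {r_squared:.3f}")
print(f"predicted sales at spend 40: {predicted:.1f}")
```

A good fit describes an association in the observed data; on its own it does not show that advertising causes the change in sales.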

Chapter 8: Nonparametric Statistics

This chapter covers nonparametric statistical methods, which are used when the assumptions of traditional parametric tests, such as normally distributed data, are not met.

Example: The researcher from Chapter 1 wants to compare the heights of two groups of adults, but the height data is not normally distributed. The researcher can use the nonparametric Mann-Whitney U test, which compares the two groups using the ranks of the observations rather than their means.
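A minimal sketch of the Mann-Whitney U test follows. It counts, over all pairs, how often a value from the first group exceeds a value from the second, then uses a normal approximation for the p-value. The approximation omits the tie and continuity corrections, so it should be read as illustrative; the height data are hypothetical.

```python
from statistics import NormalDist

def mann_whitney_u(a, b):
    """U statistic for sample a and an approximate two-sided p-value."""
    # Count pairs where the value from a is larger; ties count as half.
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    n_a, n_b = len(a), len(b)
    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5  # no tie correction
    z = (u_a - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, p_value

# Hypothetical heights in cm; each group has one unusually tall person.
group_a = [158, 160, 161, 163, 165, 166, 170, 184]
group_b = [164, 167, 168, 169, 171, 172, 174, 190]

u_stat, p_value = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

Because only the ordering of the values matters, the two unusually tall people have no more influence than any other observation.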

Chapter 9: Bayesian Statistics

This chapter introduces Bayesian statistics, an alternative approach to statistical inference that combines prior knowledge with observed data and expresses uncertainty about a parameter as a probability distribution.

Example: A researcher wants to estimate the probability of rain on a given day. The researcher expresses prior knowledge of the weather patterns as a prior distribution and uses Bayes' theorem to update it with historical rainfall data, obtaining a posterior distribution for the probability of rain.
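One simple way to set up this kind of analysis is the beta-binomial model, sketched below. Both the prior and the rainfall counts are assumptions made for illustration: the prior Beta(3, 7) stands for a belief worth roughly "3 rainy days in 10", and the data are 12 rainy days out of 30.

```python
import random

# Prior belief about the daily probability of rain (assumed for illustration).
prior_a, prior_b = 3, 7

# Hypothetical observations: 12 rainy days out of 30.
rainy_days, total_days = 12, 30

# With a beta prior and binomial data, the posterior is also a beta
# distribution: add rainy days to a and dry days to b.
post_a = prior_a + rainy_days
post_b = prior_b + (total_days - rainy_days)
posterior_mean = post_a / (post_a + post_b)

# Approximate 95% credible interval from simulated posterior draws.
random.seed(5)
draws = sorted(random.betavariate(post_a, post_b) for _ in range(20_000))
low = draws[int(0.025 * len(draws))]
high = draws[int(0.975 * len(draws))]

print(f"posterior mean = {posterior_mean:.3f}")
print(f"95% credible interval: {low:.3f} to {high:.3f}")
```

The posterior mean lies between the prior mean (0.3) and the observed proportion (0.4), and moves closer to the data as more days are observed.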