data:image/s3,"s3://crabby-images/f06a5/f06a55e9ab99a836afd02e313041457454a4d5d2" alt="Training Systems Using Python Statistical Modeling"
One-way analysis of variance (ANOVA)
One-way ANOVA tests whether all groups share a common mean with their own sample. The null hypothesis assumes that all populations share the same mean, while the alternative hypothesis simply states that the null hypothesis is false. One-way ANOVA assumes that data was drawn from normal distributions with a common standard deviation. While normality can be relaxed for larger sample sizes, the assumption of common standard deviation is, in practice, more critical.
Before performing this test, let's consider doing a visual check to see whether the data has a common spread. For example, you could create side-by-side box and whisker plots. If the data does not appear to have a common standard deviation, you should not perform this test.
The f_oneway() function from SciPy can perform this test; so, let's start performing one-way ANOVA.
Your company now has multiple processes. Therefore, before you were able to return your report for the other two, you were given data for processes C, D, and E. Your company wants to test whether all of these processes have the same mean level of resistance or whether this is not true—in other words, whether one of these processes has a different mean level of resistance. So, let's get into it:
- We will first define the data for these other processes, as follows:
data:image/s3,"s3://crabby-images/3dcda/3dcda017b6fecef1ef8c7106dc4afd8791525e36" alt=""
- We're going to use the f_oneway() function from SciPy to perform this test, and we can simply pass the data from each of these samples to this function, as follows:
data:image/s3,"s3://crabby-images/975f4/975f4dc771341870a6ffc1349f54c27d8d440e78" alt=""
- This will give us the p value, which, in this case, is 0.03:
data:image/s3,"s3://crabby-images/37cc9/37cc9a6744e207f1d6398f1321f67e98894f52d1" alt=""
This appears to be small, so we're going to reject the null hypothesis that all processes yield resistors with the same level of resistance. It appears at least one of them has a different mean level of resistance.
This concludes our discussion of classical statistical methods for now. We will now move on to discussing Bayesian statistics.