WALTER PECK: Hi, again. So if you're looking at this video, you're probably convinced that you have numerical data to analyze. Numerical data is something that you can measure, like weight or time. So you end up having a mean and a standard deviation. Some numbers are higher, some numbers are lower than the mean. And you want to figure out, well, does this data indicate I should accept one hypothesis or the other hypothesis?
Let's just review a couple things about that kind of data. So let's imagine we have a population. Oh, I'm going to make up an example here. I'm going to imagine that a paleoanthropologist, a person who studies ancient evolution of human beings, has defined a new species of Homo, Homo ibericus, from the Iberian Peninsula.
And so you could imagine that if you take all the members of this species who ever existed, if you could do that, and measure every last one of those individuals' weights, you'd get a distribution. You'd get a population mean mu. You'd get a population standard deviation sigma. And you now get a normal curve, a bell curve, where between plus and minus 1 standard deviation of the mean about 68% of the individuals are located.
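The share of a normal curve that lies within 1 standard deviation of the mean can be checked directly. This is a minimal sketch; the values of `mu` and `sigma` are made up for illustration and do not come from any real species.

```python
from statistics import NormalDist

# Hypothetical population parameters (assumed, in kilograms).
mu = 50.0
sigma = 8.0

weights = NormalDist(mu, sigma)

# Fraction of the population between mu - sigma and mu + sigma.
within_one_sd = weights.cdf(mu + sigma) - weights.cdf(mu - sigma)
print(round(within_one_sd, 4))
```

The result does not depend on the particular `mu` and `sigma` chosen; it is a property of the normal curve itself.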
An interesting question we see in paleoanthropology is, what was the social structure like for those early humans? One thing that really tells us a lot about that possible social structure is something called sexual dimorphism. Sexual-- male and female, di-- 2, morph-- form. Two forms between the males and the females. In other words, how do the males and the females differ?
One really basic thing we notice in primates, in the anatomy of the various species of primates, is that if you have a species which has monogamous pair bonding, the males and females tend to be about the same size. Whereas if you have a species where the males have harems, the males tend to be much bigger than the females. An example of that would be gorillas. The example of the one where the male and female are very similar would be gibbons.
So when we look at ancient humans, sometimes we want to say, gee, I wonder how they lived. So what anthropologists have actually done with real species, not artificial ones like this, is gone out and taken a sample of all the bones of, say, Homo erectus, or Homo habilis, or whatever, and looked at the males and the females and seen whether or not they were the same size or whether they were different and then make some inferences about how they behaved.
Well, I made up a species called Homo ibericus. And we're going to just take an imaginary sample and see whether they're sexually dimorphic or not. Now, when we test a hypothesis, we usually have two hypotheses-- the null hypothesis H0 and the alternative hypothesis Ha. More often than not, the null hypothesis means that there's no difference between two populations, or that some treatment, a medication, let's say, has no effect upon disease rates, mortality, and morbidities.
Well, in this case, I'm going to make my null hypothesis that the population mean for all the males is equal to the population mean for all the females when it comes to weight. In other words, they're not sexually dimorphic.
I'm going to make my alternative hypothesis the idea that the males and the females are not equal in weight. One sex or the other is larger. In primates, that would usually mean that the males are bigger, but that's not necessarily true. In fact, in many animal species, it's the opposite, but not in primates.
So let's see what it would look like if H0 were true in terms of the distribution of weights for the males and the females. We're going to start with the distribution of weights for the females. There is the average mass, average weight of the females. Some are heavier, some are lighter. You get a nice bell curve. This would be counting every female that ever existed in the species. Hard to do? Nah. Impossible to do.
So on top of this, if H0 is true, I'm going to draw the distribution of weights for the males of the species. I've got a red pen for that, and I'm going to throw it right on top of it. So the mean for the males would be the same as the mean for the females. I happened to also draw them so that they had the same standard deviation, but that's not necessary. That would be H0 being true.
If, on the other hand, the males and the females are not the same size, Ha is true. H0 would be rejected in this case. We don't get one distribution on top of the other. If here's the females, the males might be shifted to a higher average weight like that.
The thing is, we can't take a census of all of these individuals. First of all, they're not alive anymore. And most of them, their bones are gone. We have to make inferences based upon the samples we can take. In other words, what we're going to do is we're going to take a sample, and we're going to get not mu, which is the population mean, and sigma, which is the population standard deviation, we're going to get x-bar and s. x-bar is the sample mean. s is the sample standard deviation.
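As a small illustration of x-bar and s, here is a sketch that computes both from a sample. The ten weights are invented for the example; they are not measurements from any real fossils.

```python
import statistics

# Made-up estimated weights (kg) for a sample of 10 female skeletons.
female_weights_kg = [48.2, 51.0, 46.5, 53.1, 49.7, 50.4, 47.8, 52.3, 45.9, 50.6]

# x-bar: the sample mean.
x_bar = statistics.mean(female_weights_kg)

# s: the sample standard deviation (divides by n - 1, not n).
s = statistics.stdev(female_weights_kg)

print(x_bar, s)
```

Note that `statistics.stdev` gives the sample standard deviation s, while `statistics.pstdev` would treat the list as the whole population and give sigma.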
So here's the distribution for females. In a sample, in other words, we say we've got 20 sets of bones, 10 skeletons. I don't know. It doesn't matter, some smaller number than the entire census. And we're going to figure out how much these individuals purportedly weigh. Well, there's the average for the females. There's the standard deviation up and down.
Well, let's imagine for a second that Ha is true. In other words, if we do a sampling of the males, hopefully the x-bar for the males is over here with a standard deviation as well, and there's going to be a separation between the x-bar for the males and the x-bar for the females. That's the kind of results we would hope to get.
What statistical testing does, in particular this statistical test on quantitative numerical data called a t-test, is it tells us whether the difference in the x-bars is sufficiently big given the standard deviations and the sample sizes for us to conclude that Ha is true or for us to conclude that H0 is true. That's the whole point of statistical analysis.
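To make the idea concrete, here is a sketch of how a t statistic weighs the difference in the x-bars against the standard deviations and the sample sizes. The video does not say which form of the t-test is meant, so this assumes the unequal-variance (Welch) form; the function name `welch_t` and the sample weights are hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean: s^2 / n.
    se_sq_a = statistics.variance(sample_a) / n_a
    se_sq_b = statistics.variance(sample_b) / n_b
    # Difference in x-bars, scaled by the combined standard error.
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df

# Made-up estimated weights (kg); not real data.
males = [58.1, 61.4, 55.9, 63.0, 59.2, 60.7, 57.3, 62.5]
females = [48.2, 51.0, 46.5, 53.1, 49.7, 50.4, 47.8, 52.3]

t, df = welch_t(males, females)
print(t, df)
```

A bigger gap between the x-bars, smaller standard deviations, or larger samples all push t further from zero. Turning t into a p-value needs the t distribution, which a statistics library such as SciPy (`scipy.stats.ttest_ind`) can handle.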
In scientific studies, there are two kinds of errors that you really can't avoid-- type I errors and type II errors. You can't eliminate them completely. A type I error is when you reject a true null hypothesis. In other words, there is no difference between the two means, but you conclude, on the basis of your data, that there is a difference.
The type II error is when you reject a true Ha. In other words, on the basis of your data, you decide that there is not a difference, but there is a difference. You can't eliminate error completely, and unfortunately that's true with any scientific study. And what you have to do is you have to be smart. You have to look at your study and say, gosh, what kind of error am I willing to put up with?
Let's say, for example, you're testing a new medication, and you don't want to make errors that might hurt somebody. So you have a real sensitivity to error of a certain kind. But if you're just doing prospective-- oh, I don't know, creative work in paleoanthropology, nobody's going to get hurt. The bones are already dead. So you're willing to put up with a lot more possible error in that case, and you have to deal with your statistics appropriately.
Let's go back to the drawings of the population means and get an idea of what that means. So imagine we have the population. We have every last male of the species and every last female of the species, and you did a census. And you discover, lo and behold, that the means are the same. You want to accept H0. In other words, the two means are the same.
But when you do a sample, you don't get to touch-- you don't get to measure every individual in that population. You have to make inferences there. Imagine for a second that all the females in your sample are over here. They just happen to be, because you had such a small sample, all be on the light side. And let's imagine that you had all the males over here. They just happened, by dumb luck, to be on the heavy side.
Well, you'd end up with an x-bar for the females over here, an x-bar for the males over here. And lo and behold, the difference might be big enough for you, from your statistical analysis, to conclude that the difference is real and therefore reject your H0. That would be a type I error. The chance of making that error is the number that we're going to be calling alpha. You'll see that in a few minutes. We're willing to put up with a little bit of type I error in most cases.
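The dumb-luck scenario can be simulated. In this sketch both sexes are drawn from the same made-up population, so H0 is true by construction, and every rejection is a type I error. The population values, the sample size of 10, and the function names are assumptions for illustration. The cutoff 2.101 is the two-sided 5% critical value of t for 18 degrees of freedom, which fits two samples of 10 with equal spread.

```python
import math
import random
import statistics

def t_statistic(a, b):
    # Difference in sample means divided by its standard error.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def type_one_error_rate(trials=4000, n=10, mu=50.0, sigma=8.0, t_crit=2.101, seed=1):
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(trials):
        # Same mu and sigma for both sexes: H0 is true.
        females = [rng.gauss(mu, sigma) for _ in range(n)]
        males = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(males, females)) > t_crit:
            false_rejections += 1  # rejected a true H0
    return false_rejections / trials

rate = type_one_error_rate()
print(rate)
```

The printed fraction should land near 0.05, which is the type I error rate that this cutoff is designed to allow.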
The type II error would happen if Ha were true, and we ended up rejecting Ha. Again, imagine we have two populations of males and females here, or subpopulations of males and females. And there is a real difference in their average weights. Let's say, just by dumb luck, that the females that you found in your sample tend to be on the heavy side, and the males tend to be on the light side.
Well, you might end up with the x-bar for the males and the females right there at the very same spot. And therefore, you would conclude, from your statistical analysis, that H0 is true, that there's no difference, and reject Ha. That would be called a type II error. Unfortunately, when you decrease the chance of a type I error, you increase the chance of a type II error. They work opposite each other. You can't eliminate them both completely.
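The trade-off between the two kinds of error can be sketched with a simulation in which Ha is true by construction. The population means, the spread, the sample size of 10, and the function names are all made up for illustration. The cutoffs 2.101 and 2.878 are the two-sided critical values of t for 18 degrees of freedom at the 5% and 1% levels.

```python
import math
import random
import statistics

def t_statistic(a, b):
    # Difference in sample means divided by its standard error.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def type_two_error_rates(trials=2000, n=10, mu_f=50.0, mu_m=58.0, sigma=8.0, seed=2):
    rng = random.Random(seed)
    t_values = []
    for _ in range(trials):
        # Different population means: Ha is true.
        females = [rng.gauss(mu_f, sigma) for _ in range(n)]
        males = [rng.gauss(mu_m, sigma) for _ in range(n)]
        t_values.append(t_statistic(males, females))
    # Allowed type I error rate -> critical t value (18 degrees of freedom).
    cutoffs = {0.05: 2.101, 0.01: 2.878}
    # A type II error is any trial where the real difference goes undetected.
    return {
        level: sum(abs(t) <= t_crit for t in t_values) / trials
        for level, t_crit in cutoffs.items()
    }

rates = type_two_error_rates()
print(rates)
```

Because the stricter 1% cutoff is harder to exceed, it can never miss fewer real differences than the 5% cutoff on the same simulated samples, which is the opposition between the two errors described above.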
As part of the NIH-funded ASSET Program, students and teachers in middle and high school science classes are encouraged to participate in student-designed independent research projects. Veteran high school teacher Walter Peck, whose students regularly engage in independent research projects, presents this series of five videos to help teachers and students develop a better understanding of basic statistical procedures they may want to use when analyzing their data.