Posted
That's not necessary.

 

The first mistake HA made was calling the behavior seen in his scenario "regression towards the mean." That phrase has a very specific meaning to statisticians, and the behavior exhibited by his scenario does not fit that meaning. You'll notice that this is why I always refer to it as "the specific scenario" and not "regression towards the mean."

 

However, in HA's defense, he has never claimed to be a trained or professional statistician, so it is perfectly reasonable to assume he would not know that the phrase "regression towards the mean" has such a specific meaning/application in traditional statistics. Furthermore, because his scenario demonstrates how the presence of normally distributed measurement error contributes to the movement of sample means towards population means after retest, I don't think it is much of a crime for an amateur statistician with a basic stats education to call it "regression towards the mean."

 

The other mistakes HA has made involve being stubborn and arrogant, but that is definitely not in short supply amongst his detractors, either. Hell, I am as stubborn and arrogant as they come.

859871[/snapback]

Let me ask you this question: do you feel this article has mislabeled regression toward the mean? I'm not trying to argue with you, but it really seems to me the person who put that website together knows his stuff.

Posted
But is it a bulletin board crime, if people with more than a basic stats education tell him that it's not "regression to the mean," yet he continues to insist that it is?

859878[/snapback]

 

Well, I would definitely urge him to call the behavior in his scenario something besides regression towards the mean, definitely. But is it "a bulletin board crime" to continue to CALL it "regression towards the mean" (when in some literal sense, it is regression towards a mean...)? No, it's just being stubborn.

 

I would caution HA if he is trying to say his scenario is simulating the traditional definition of "regression towards the mean." But I don't think he is saying that, nor do I think he believes that now (nor am I sure he ever believed that).

Posted
My apologies.  It was a dig at him and his treatment of you as some sort of magical statistical talisman.  Not you.

859881[/snapback]

 

Understood. I am just worried about my credibility with people just jumping into this argument for the first time. Not that I have much credibility.

Posted
Well, I would definitely urge him to call the behavior in his scenario something besides regression towards the mean, definitely. But is it "a bulletin board crime" to continue to CALL it "regression towards the mean" (when in some literal sense, it is regression towards a mean...)? No, it's just being stubborn.

 

I would caution HA if he is trying to say his scenario is simulating the traditional definition of "regression towards the mean." But I don't think he is saying that, nor do I think he believes that now (nor am I sure he ever believed that).

859887[/snapback]

People in general aren't good at seeing their own flaws, so I don't know the extent to which I'm guilty of being unreasonably stubborn. That said, this is the first time anyone's said to me that the phenomenon I've been describing exists, but that I've been mislabeling it. Mostly the message I've been hearing from Bungee Jumper and Ramius is that the phenomenon I've been describing doesn't exist, that I'm an idiot for saying it does exist, and that I'm an idiot in general. If I've thrown insults at those two, it's because they've first insulted me. In this case, their insults were directed at me for saying things which are objectively correct and not seriously disputed.

Posted
Let me ask you this question: do you feel this article has mislabeled regression toward the mean? I'm not trying to argue with you, but it really seems to me the person who put that website together knows his stuff.

859885[/snapback]

 

I would absolutely say he has mislabeled regression toward the mean, or at least oversimplified it.

 

Without knowing a person's true intelligence and the error distribution of the test, it is impossible to say what a subsequent test score of A SINGLE, SPECIFIC PERSON will be. If the example person's true IQ is 790 and he scores a 750, the probability that a subsequent score will be even lower than 750 is incredibly small (assuming the IQ test has only a reasonable amount of error), and suggesting 725 as a likely score is ludicrously low.

 

The author should be referring to a sample of people who scored 750 (as your scenario does). This example is confounding two behaviors that Bungee Jumper has brought up: the probability distribution of the underlying population, and regression of the error towards the mean ERROR (typically zero) with subsequent retests.

Posted
People in general aren't good at seeing their own flaws, so I don't know the extent to which I'm guilty of being unreasonably stubborn. That said, this is the first time anyone's said to me that the phenomenon I've been describing exists, but that I've been mislabeling it. Mostly the message I've been hearing from Bungee Jumper and Ramius is that the phenomenon I've been describing doesn't exist, that I'm an idiot for saying it does exist, and that I'm an idiot in general. If I've thrown insults at those two, it's because they've first insulted me. In this case, their insults were directed at me for saying things which are objectively correct and not seriously disputed.

859895[/snapback]

 

You sh--head. I've told you REPEATEDLY you're mislabelling it, that you're mistaking regression of THE ERROR toward the mean ERROR of zero for regression of THE POPULATION toward the mean of the population! You're incorrectly extrapolating a poorly chosen subset of data to a global effect over a whole population. If you understood your own example and could do the math, you could see it yourself. :)

Posted
The author should be referring to a sample of people who scored 750 (as your scenario does). This example is confounding two behaviors that Bungee Jumper has brought up: the probability distribution of the underlying population, and regression of the error towards the mean with subsequent retests.

859899[/snapback]

 

I just want to stress for HA: the error toward the mean ERROR OF AN INDIVIDUAL TEST.

 

If you integrate the error over all space (sum over the number of tests), the net error is zero (approaches zero, for a sum). It has to be; otherwise, the normal probability distribution of the population integrated over all space wouldn't be equal to one...

 

...which gets back to another question I asked that you never answered, HA: What is the integral of a gaussian over all space, and why is it important to the topic at hand?
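For anyone following along, here are the two textbook results the question leans on, written out. The symbols (mu, sigma for the population; sigma_epsilon for the error spread) are notation chosen here for illustration, not something from the thread, and the second line assumes the error is normal and centered at zero.

```latex
% Normalization: any normal density integrates to one.
\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx = 1

% Zero-centered normal error has mean zero (the integrand is odd).
\mathbb{E}[\varepsilon] = \int_{-\infty}^{\infty} \varepsilon \,
  \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}
  \exp\!\left(-\frac{\varepsilon^2}{2\sigma_\varepsilon^2}\right) d\varepsilon = 0
```

The first line says the density accounts for all of the probability; the second says that positive and negative errors cancel on average.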

Posted
That said, this is the first time anyone's said to me that the phenomenon I've been describing exists, but that I've been mislabeling it.

859895[/snapback]

 

I am an engineer first and foremost, so I like real world examples. So I am going to give you an extremely simple example of what statisticians traditionally refer to as "regression towards the mean" that I've worked with often:

 

Weight of a plastic part is a very good metric for a whole bunch of process parameters in injection molding. If the weights of a series of plastic parts are consistent and stable over time, you know that the injection unit of an injection molding machine is also behaving in a controlled and stable manner. You also know that the part dimensions are likely to be behaving in a controlled, stable manner.

 

When dealing with very small parts that require very small and precise shot sizes from a molding machine, part weight may have to be measured with a large degree of precision. When you start measuring part weight to ten-thousandths of a gram, outside factors such as ambient temperature and humidity of the room air have huge effects, even in a controlled environment. These uncontrollable factors result in measurement error. It is typically normally distributed and centered at zero.

 

So parts are weighed repeatedly over time. If a part is only measured once, you have no idea whether the weight you've gotten is the true weight, near the true value, or an extreme value that had a 0.1% chance of happening. Repeated measurements dampen out these effects. This is because measurement error is normally distributed and will regress towards the mean (of the ERROR, not the population; this says nothing about the weights of any other part).

 

Repeated measurements remove the effects of measurement error and cause regression towards the "true" mean. There is a large body of knowledge regarding how to determine how big a sample size (in this case, repeated measurements) is necessary to get an accurate reflection of the true mean (and the answer is always "how accurate do you need to be?").

 

EDIT: Note, repeated measurements in this example are not done because you think parts are changing over time. Just want to make that clear.
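A minimal sketch of the repeated-weighing idea above, assuming zero-mean normal measurement error. The part weight (0.2500 g), the error spread (0.0005 g), and the function names `measure` and `required_sample_size` are all made up for illustration; they are not from any real process or library.

```python
import math
import random
import statistics


def measure(true_weight, error_sd, n, rng):
    """Return n readings of one part: true weight plus zero-mean normal error."""
    return [true_weight + rng.gauss(0.0, error_sd) for _ in range(n)]


def required_sample_size(error_sd, margin, z=1.96):
    """Readings needed so the average lands within +/- margin of the true
    weight at roughly 95% confidence (z = 1.96), assuming normal error."""
    return math.ceil((z * error_sd / margin) ** 2)


if __name__ == "__main__":
    rng = random.Random(42)
    true_weight = 0.2500  # grams (hypothetical value)
    error_sd = 0.0005     # grams (hypothetical value)
    for n in (1, 10, 100, 1000):
        avg = statistics.fmean(measure(true_weight, error_sd, n, rng))
        # The standard error of the average shrinks like 1/sqrt(n).
        print(n, round(avg, 5), "standard error:", error_sd / math.sqrt(n))
    print("readings needed:", required_sample_size(error_sd, margin=0.0001))
```

With these made-up numbers, getting the average within 0.0001 g takes 97 readings, which is the "how accurate do you need to be?" trade-off in concrete form: halving the margin roughly quadruples the count.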

Posted
Repeated measurements remove the effects of measurement error and cause regression towards the "true" mean. There is a large body of knowledge regarding how to determine how big a sample size (in this case, repeated measurements) is necessary to get an accurate reflection of the true mean (and the answer is always "how accurate do you need to be?").

A good example, and I agree--repeated measurement causes regression toward the true mean.

Posted
A good example, and I agree--repeated measurement causes regression toward the true mean.

859925[/snapback]

 

"True mean" OF THE ERROR, you nitwit. OF THE ERROR.

 

Wraith capitalized it himself, I'm guessing because he knew you'd misunderstand. You measure the same item multiple times (or multiple identical items once each), to eliminate the error...which you can do because the error's normally distributed, and the variance in the weights allows you to calculate the weight where the MEAN ERROR is zero.

Posted
"True mean" OF THE ERROR, you nitwit.  OF THE ERROR.

 

Wraith capitalized it himself, I'm guessing because he knew you'd misunderstand.  You measure the same item multiple times (or multiple identical items once each), to eliminate the error...which you can do because the error's normally distributed, and the variance in the weights allows you to calculate the weight where the MEAN ERROR is zero.

859928[/snapback]

I understood the example perfectly. I think Wraith was trying to provide a better understanding of the technical definition of the term "regression toward the mean." However, I've said all along that those who obtain extreme scores on a partially luck-based test tend to score somewhat closer to the population mean upon being retested. Wraith was not disputing this. Are you?

Posted
I understood the example perfectly. I think Wraith was trying to provide a better understanding of the technical definition of the term "regression toward the mean." However, I've said all along that those who obtain extreme scores on a partially luck-based test tend to score somewhat closer to the population mean upon being retested. Wraith was not disputing this. Are you?

859931[/snapback]

 

I am absolutely disputing it, and I will continue to do so until you prove "luck" is a mathematically valid concept.

Posted
I am absolutely disputing it, and I will continue to do so until you prove "luck" is a mathematically valid concept.

859932[/snapback]

Instead of calling it luck, you're welcome to call it measurement error if that makes you happy. Assuming people's results on a given test are due at least in part to measurement error, those who obtain extreme scores on the test will tend to score closer to the population mean upon being retested. Do you dispute that?

Posted
Call it measurement error if that makes you happy. The underlying phenomenon is still there. Do you dispute that?

859933[/snapback]

 

Oh, I definitely dispute that it's measurement error.

 

And those little details are important, not least because they show you don't understand the first thing about it. The underlying phenomenon is there not because of luck, or measurement error, or statistics gnomes. The effect - specifically, regression toward the mean - exists because the probability of small deviations from the mean is greater than the probability of large deviations from the mean for independent measurements within a normal distribution.

Posted
Oh, I definitely dispute that it's measurement error. 

 

And those little details are important, not least because they show you don't understand the first thing about it. The underlying phenomenon is there not because of luck, or measurement error, or statistics gnomes. The effect - specifically, regression toward the mean - exists because the probability of small deviations from the mean is greater than the probability of large deviations from the mean for independent measurements within a normal distribution.

859945[/snapback]

Measurement error is a necessary component in the phenomenon I've been describing. If you had an error-free I.Q. test, and 100 people who scored a 140 the first time they took the test, that group's expected average score on the retake would be 140. But if the test involves measurement error, someone who scored a 140 on the test could be a lucky 130 or an unlucky 150. A group of 100 people who scored a 140 on this error-possible test will, upon retaking it, obtain a lower average score the second time around. (Some people will obtain higher scores the second time around, but there will be far more people who obtain lower scores.)
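The scenario in this post can be simulated directly. The following is only a sketch under stated assumptions: true I.Q. is normal with mean 100 and standard deviation 15, measurement error is normal with mean 0 and standard deviation 5, and "scored a 140" means a first score within half a point of 140. None of those numbers come from the thread, and `simulate_retest` is a hypothetical helper name.

```python
import random
import statistics


def simulate_retest(n_people=600_000, pop_mean=100.0, pop_sd=15.0,
                    error_sd=5.0, target=140.0, half_width=0.5, seed=1):
    """Test everyone once, keep those who scored about `target`, retest them."""
    rng = random.Random(seed)
    first_scores, retest_scores = [], []
    for _ in range(n_people):
        true_iq = rng.gauss(pop_mean, pop_sd)        # fixed per person
        first = true_iq + rng.gauss(0.0, error_sd)   # first sitting
        if abs(first - target) <= half_width:
            # Same true I.Q., fresh independent error on the retest.
            retest_scores.append(true_iq + rng.gauss(0.0, error_sd))
            first_scores.append(first)
    if not first_scores:
        raise ValueError("nobody scored near the target; raise n_people")
    lower = sum(r < f for f, r in zip(first_scores, retest_scores))
    return {
        "n_selected": len(first_scores),
        "mean_first": statistics.fmean(first_scores),
        "mean_retest": statistics.fmean(retest_scores),
        "frac_lower": lower / len(first_scores),
    }


if __name__ == "__main__":
    print("with error:", simulate_retest())
    print("error-free:", simulate_retest(error_sd=0.0))
```

Under these particular assumptions the expected retest average for the 140 group works out to 100 + (225/250)(140 - 100) = 136, and with `error_sd=0.0` the retest average equals the first average exactly. The size of the drop depends on both the population spread and the error spread, so different assumed values will give a different number.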

Posted
Measurement error is a necessary component in the phenomenon I've been describing. If you had an error-free I.Q. test, and 100 people who scored a 140 the first time they took the test, that group's expected average score on the retake would be 140. But if the test involves measurement error, someone who scored a 140 on the test could be a lucky 130 or an unlucky 150. A group of 100 people who scored a 140 on this error-possible test will, upon retaking it, obtain a lower average score the second time around. (Some people will obtain higher scores the second time around, but there will be far more people who obtain lower scores.)

859958[/snapback]

 

That's because you're describing regression of the ERROR towards the mean of the ERROR.

 

You're just too much of an idiot to know that.

Posted
That's because you're describing regression of the ERROR towards the mean of the ERROR.

859961[/snapback]

 

Yup, that's pretty much it.

 

Holcomb's Arm was misled a bit by that article. The article describes a scenario where exceptional scores move towards the population mean when retested, and calls it regression towards the mean. HA simulates that scenario, sees that the scenario only works because of measurement error, and concludes that measurement error causes regression towards the population mean. I can see how it could happen.

Posted
You're just too much of an idiot to know that.

859961[/snapback]

Yup, that's pretty much it, too.

 

Why, for the love of god, is this discussion STILL going on? I haven't read anything this unbelievable since Ed announced he was engaged to a woman.

Posted
Yup, that's pretty much it.

 

Holcomb's Arm was misled a bit by that article. The article describes a scenario where exceptional scores move towards the population mean when retested, and calls it regression towards the mean. HA simulates that scenario, sees that the scenario only works because of measurement error, and concludes that measurement error causes regression towards the population mean. I can see how it could happen.

859980[/snapback]

 

I think this should fairly well wrap it up.

Posted
Yup, that's pretty much it, too.

 

Why, for the love of god, is this discussion STILL going on? I haven't read anything this unbelievable since Ed announced he was engaged to a woman.

859989[/snapback]

 

Because it's worth it for gems like "measurement causes a rubber band to stretch". :)
