There is a "paradox" discovered by a guy named Lord concerning the proper way to analyze pre/post test score data. In his setup, a comparison of difference scores (i.e. post - pre) finds no statistically significant difference between groups, but the same data analyzed using regression shows a statistically significant difference. Lord demonstrated this paradox using weights of college freshmen recorded at the beginning and end of the school year. An analysis of the difference in weights for each student reveals that neither the women nor the men gain weight on average, which implies that the school "diet" acts no differently for men than for women. On the other hand, a multiple regression of the weight change on starting weight and sex reveals a statistically significant difference between the sexes: women tend to gain less weight than men at the same starting weight. Two reasonable methods, two reasonable but opposite conclusions.
I have created a simple dataset that shows Lord's Paradox in all its glory. The average value of GAIN is zero for both groups. On the other hand, a multiple regression to predict GAIN using SEX and SEPT weight fits a line for both men and women with a common slope but with a statistically significant difference in intercept. That is, at a given September weight the expected GAIN is less for women than for men.
What's the truth? In this example, we randomly generated a TRUE weight for each student, with men having a higher mean than women and a higher SD as well. The September weight for both men and women is the true weight plus an independent normal error (SD around 5 lbs). The June weight is generated the same way: the same true weight plus a fresh error with the same distribution. Thus, for each student the pair of weights is bivariate normal with correlation around .9. So under this model, there is NO DIFFERENCE between men and women with respect to the average weight change. So how do we explain the multiple regression?
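Here is a minimal sketch of how a dataset like this could be generated and analyzed both ways. It is not the original dataset. The group means (180 and 130) are the ones quoted later in this post; the true-weight SDs (17 and 13), the sample size, and the seed are my own assumptions, picked so the September/June correlation lands near .9. Instead of calling a regression library, it uses the closed form for a least-squares fit with a sex dummy and a common slope (the pooled within-group slope), which gives the same coefficients.

```python
import random
from statistics import fmean

def simulate(n, true_mean, true_sd, noise_sd, rng):
    """Return (sept, june): one stable true weight plus independent noise each time."""
    sept, june = [], []
    for _ in range(n):
        true_w = rng.gauss(true_mean, true_sd)
        sept.append(true_w + rng.gauss(0, noise_sd))
        june.append(true_w + rng.gauss(0, noise_sd))
    return sept, june

def ancova(x_m, y_m, x_f, y_f):
    """Least-squares fit of y ~ intercept + sex + x with a common slope.
    Returns (slope, male-minus-female intercept difference, its t statistic)."""
    sxx = sxy = syy = 0.0
    for x, y in ((x_m, y_m), (x_f, y_f)):
        mx, my = fmean(x), fmean(y)
        sxx += sum((a - mx) ** 2 for a in x)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        syy += sum((b - my) ** 2 for b in y)
    slope = sxy / sxx                      # pooled within-group slope
    dx = fmean(x_m) - fmean(x_f)           # gap in mean starting weight
    dy = fmean(y_m) - fmean(y_f)           # gap in mean gain
    sex_effect = dy - slope * dx           # coefficient on the sex dummy
    n = len(x_m) + len(x_f)
    resid_var = (syy - slope * sxy) / (n - 3)
    se = (resid_var * (1 / len(x_m) + 1 / len(x_f) + dx ** 2 / sxx)) ** 0.5
    return slope, sex_effect, sex_effect / se

rng = random.Random(0)                     # assumed seed, for reproducibility
# Assumed parameters: means 180/130, true SDs 17/13, measurement SD 5.
sept_m, june_m = simulate(2000, 180, 17, 5, rng)
sept_f, june_f = simulate(2000, 130, 13, 5, rng)
gain_m = [j - s for s, j in zip(sept_m, june_m)]
gain_f = [j - s for s, j in zip(sept_f, june_f)]

mean_gain_m = fmean(gain_m)
mean_gain_f = fmean(gain_f)
slope, sex_effect, t_stat = ancova(sept_m, gain_m, sept_f, gain_f)

print("mean GAIN, men:  ", round(mean_gain_m, 2))
print("mean GAIN, women:", round(mean_gain_f, 2))
print("common slope on SEPT:", round(slope, 3))
print("sex effect (men - women) at equal SEPT:", round(sex_effect, 2))
print("t statistic for sex effect:", round(t_stat, 1))
```

With these assumed settings, the mean gains should both sit near zero while the sex coefficient comes out clearly positive, which is the paradox in miniature.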
In the simulation it is clear that the multiple regression adjusts for the wrong variable. A man and a woman who weigh the same in September, say 155 lbs, are not the same: he is below the men's mean, so he is more likely to be under his true weight, while she is above the women's mean, so she is more likely to be above her true weight. Both regress to their own group mean, but the man regresses upwards and the woman downwards. If we repeat the regression, this time using SCALED (strictly speaking, centered) starting weights, that is, we subtract the group mean from every individual, the sex effect disappears! That is, there is no difference in expected weight gain for a man and a woman who are at the same weight in September RELATIVE to their group mean (180 for men, 130 for women).
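The centering claim can be checked with a small sketch along the same lines. Under the model, the expected GAIN given September weight is proportional to minus the distance from the group mean, so a regression on raw weight has to absorb the gap between group means into the sex coefficient. The code below is an illustration under assumptions, not the original analysis: the true-weight SDs (17 and 13), sample size, and seed are mine, and it centers on the sample group means rather than on exactly 180 and 130. The fit is again the closed-form common-slope regression.

```python
import random
from statistics import fmean

def make_group(n, true_mean, true_sd, noise_sd, rng):
    """Return (sept, gain) for one group under the measurement-error model."""
    sept, gain = [], []
    for _ in range(n):
        true_w = rng.gauss(true_mean, true_sd)
        s = true_w + rng.gauss(0, noise_sd)   # September reading
        j = true_w + rng.gauss(0, noise_sd)   # June reading, same true weight
        sept.append(s)
        gain.append(j - s)
    return sept, gain

def common_slope_fit(men, women):
    """Fit gain ~ intercept + sex + x with one shared slope (closed-form OLS).
    Each argument is an (x, gain) pair of lists.
    Returns (slope, male-minus-female intercept difference)."""
    sxx = sxy = 0.0
    for x, y in (men, women):
        mx, my = fmean(x), fmean(y)
        sxx += sum((a - mx) ** 2 for a in x)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    dx = fmean(men[0]) - fmean(women[0])
    dy = fmean(men[1]) - fmean(women[1])
    return slope, dy - slope * dx

def center(x):
    """Subtract the sample mean of the group from every value."""
    m = fmean(x)
    return [a - m for a in x]

rng = random.Random(1)                        # assumed seed
# Assumed parameters: means 180/130, true SDs 17/13, measurement SD 5.
sept_m, gain_m = make_group(2000, 180, 17, 5, rng)
sept_f, gain_f = make_group(2000, 130, 13, 5, rng)

slope_raw, sex_raw = common_slope_fit((sept_m, gain_m), (sept_f, gain_f))
slope_ctr, sex_ctr = common_slope_fit((center(sept_m), gain_m),
                                      (center(sept_f), gain_f))

print("raw SEPT:      slope", round(slope_raw, 3), " sex effect", round(sex_raw, 2))
print("centered SEPT: slope", round(slope_ctr, 3), " sex effect", round(sex_ctr, 2))
```

Centering within each group leaves the slope unchanged, because the within-group spread is the same; only the sex coefficient moves, collapsing to the plain difference in mean GAIN.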
Is there a conclusion here? Yes: multiple regression can easily lead to terrible error. Is Lord's Paradox not really a paradox at all? Perhaps it's just the Regression Fallacy.