Matt Yglesias sees the recent Washington Post article describing the fall in SAT scores after the most recent test changes and asks what’s the big deal. On the face of it, Matt makes some good points. Scores are only down 7 points (5 in reading and 2 in math), and for any given student, that really isn’t a big deal. The Washington Post article, while suggesting that there may be some underlying causes (e.g., the test changes or students not learning enough), also notes that it could be a one-year blip.

I took a quick look at some of the data for the SAT. It turns out that in 2005, the mean reading score was 508 (N = 1,475,623, SD = 113). In 2006, it was 503 (N = 1,465,744, SD = 113). For math, the 2005 mean was 520 (N = 1,475,623, SD = 115) and the 2006 mean was 518 (N = 1,465,744, SD = 115).

Armed with this information, and the reasonable assumption that the data are normal, we can compute the likelihood that the difference between these two years is due to random sampling of students. Let’s do a Student’s t-test. For reading, we get a t-value of 37.94; for math, 14.91. The probability that a drop this large in the reading scores could happen by chance is effectively 0, as in less than the epsilon of my stats package (R). For math, it’s on the order of 10^-50.
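For anyone who wants to check the arithmetic, here is a minimal sketch of that test, written in Python rather than R and working only from the summary statistics above. The function names are my own. Because both years report the same standard deviation, the pooled and unpooled standard errors coincide. With nearly three million degrees of freedom the t distribution is practically a normal one, so the sketch assumes a normal approximation for the two-sided p-value, which keeps it inside the standard library.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for the difference of two group means, from summary stats."""
    # Standard error of the difference between the two sample means.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se


def two_sided_p_normal(t):
    """Two-sided p-value under a normal approximation (an assumption here,
    justified by the very large sample sizes)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Test-taker counts quoted in the post.
n_2005, n_2006 = 1_475_623, 1_465_744

t_reading = t_from_summary(508, 113, n_2005, 503, 113, n_2006)
t_math = t_from_summary(520, 115, n_2005, 518, 115, n_2006)

print(f"reading: t = {t_reading:.2f}, p = {two_sided_p_normal(t_reading):.3g}")
print(f"math:    t = {t_math:.2f}, p = {two_sided_p_normal(t_math):.3g}")
```

This gives t-values of about 37.94 for reading and 14.91 for math. The reading p-value falls below the smallest normal double-precision number, so it prints as zero or something indistinguishable from it.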

Given that, we know that there is a difference here. It’s not a statistical blip. If we look further at the data, we see that SATs are remarkably consistent as long as the test isn’t changed. Essentially, we are dealing with such a huge number of students (1.5 million per year) that blips don’t show up. So that leaves us with two possibilities:

- graduating seniors have gotten dumber over the past year; or
- the change in the test is significant

I can’t definitively disprove hypothesis 1, but given the large number of students and the general SAT consistency, I think that it is not likely to be the issue. That leaves us with hypothesis 2: the change in the test is likely to be significant. I don’t see a way to avoid saying this is the case.

Of course, on a per-student basis, the change isn’t that large, but it is real and not a statistical artifact, and it means that you can’t compare SATs for students from 2005 to students from 2006.
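To put a rough number on “not that large,” one gauge is the size of each drop relative to the standard deviation. This is my own illustration rather than part of the analysis above, and the variable names are made up for the example.

```python
# Size of each drop in units of the reported standard deviation.
# Illustrative only; the figures are the 2005 and 2006 summary statistics.
reading_drop_in_sd = (508 - 503) / 113
math_drop_in_sd = (520 - 518) / 115

print(f"reading: {reading_drop_in_sd:.3f} SD")
print(f"math:    {math_drop_in_sd:.3f} SD")
```

Both drops come out to less than a twentieth of a standard deviation, which is why they are invisible for any one student yet unmistakable across 1.5 million of them.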