A Proxy on All Your Measures
When you can’t measure something directly, you use a proxy measure. For example, climate scientists can’t directly measure the weather in 1492, but they can make some inferences about average temperature and rainfall from variations in the width and certain other properties of tree rings. Tree rings can serve as proxy measures for climate.
So do we ever use proxy measures in education? A better question is: Do we ever not? Almost every metric commonly used in education is either obviously a proxy, or else it is a direct measurement of something we should not really care about, pressed into service as a proxy for something we do care about.
Here are some examples:
| What we care about | Proxy measure |
| --- | --- |
| Extent to which education is challenged by the poverty of children in a school or district | Percentage of students on free/reduced lunch |
| Students’ developing knowledge and abilities | Students’ performance on tests |
| Students’ and families’ commitment to learning and/or engagingness of the learning environment | Attendance and participation rates |
| Teacher quality | Teacher performance ratings, graduate credits, and (if we dare) their students’ academic outcomes |
| Students’ preparation for life after school | College matriculation rates, employment rates |
| Students’ ethnic/cultural background | Self-designated race or ethnicity chosen from a short list of options |
| Students’ aptitude and intelligence | Students’ performance on aptitude and/or intelligence assessments |
| Suitability of school’s learning climate | Incident rates, suspension rates, teacher turnover rates |
Two Things You Should Never Do with Proxy Measures:
1) Forget they are proxy measures
2) Exploit them to blur the truth
Don’t Forget that Proxies are Proxies, Not the Real Thing
When we think of individual examples, it is easy to see how proxies sometimes fail. Averaged over a few hundred students, a college matriculation rate is a pretty good proxy for how well a school prepares students for college.
But what about that straight-A senior who took 9 AP courses, got a 35 on the ACT, graduated, and moved to Guatemala to start a surfing school? Did her high school fail to prepare her for college?
What about a child whose parents are a thoracic surgeon and an army colonel: is he at ‘high risk’ of dropping out because he is Latino and male?
What about the new principal who arrived at a chaotic high school plagued by gang violence, instituted strict new reporting policies, and saw a 400% increase in ‘serious incident’ reports along with higher attendance, improved student report card grades, and greater parent satisfaction? Was that school four times more dangerous than it was under the previous administration?
Well, of course not. And this is not to say that proxies aren’t valuable and useful; they are useful most of the time. We just have to remember that we are not measuring exactly what we care about.
Don’t Let Exploiting Proxies Detract from Your Real Mission
Once, when I was working in the accountability office at a school district, an assistant principal called me and asked about a certain withdrawal code in our student information system. In our district, if a high school student stopped coming to school, with no information from the child or the parent about why or to where, we counted that student against our dropout rate. Our new superintendent had launched an initiative to reduce the dropout rate. So the assistant principal asked me, “If I code a student as ‘Withdrawn—location unknown’, does that count as a dropout?” I told her that it would. “But I was told that if I didn’t want a student to count as a dropout, I could use that code,” she said. “What code should I use to make sure my kids don’t count as dropouts?”
At first I was a little confused—she should, I told her, use whatever code was closest to the actual information she had about the student. If the school had no information about the student’s whereabouts, she was using the correct code. But she persisted: “If I use ‘transferred to another school district’, will that count as a dropout?” I told her that it would not. “OK,” she told me, “I will tell my principal we’re safe if we use ‘transferred to another school district.’” I explained that our policy was to use that withdrawal code only if we knew—either from a records request or a parental notification—that the child had enrolled elsewhere. “But you don’t really track in the system whether we got that request, right?” she asked. “So there’s no way you could really tell.” And there was not. And she was not even embarrassed to discuss her plan with me. Her charge was to reduce the dropout rate. She was reducing the dropout rate.
In the early days of high-stakes state testing, Walt Haney noticed that some Texas high schools had figured out a nifty way to raise 10th grade proficiency rates: keep all the low-performing students in 9th grade until they dropped out. The 10th grade proficiency rate is almost a triple proxy: a student’s test score is a proxy for her academic ability, the percent of students proficient is an oversimplified summary statistic for a group, and the 10th grade was serving as a proxy for the whole school. It should hardly be a surprise that it was abused.
In those cases, there were clear ethical lines being crossed. But proxies can be manipulated in ways that are less obviously wrong. The most common one I see is focusing resources not on students who have the most need, or who are most likely to benefit from a particular program, but instead on students who are closest to the ‘proficiency’ tipping point.
It is never this easy to predict outcomes, but imagine you are a superintendent choosing between two intervention programs. Both cost $500 per child per year. On average, students in Program A improve by 10 points on the state standardized test. On average, students in Program B improve by 5 points. Which program should you choose?
The choice is easy, right? Only let’s give it a twist. Program A, you find out, is most effective in serving children who are, on average, 20 points below the proficiency threshold. Program B, on the other hand, is most effective for students who are within 5 points of the proficiency threshold. If you invest in Program A for your ‘below basic’ students, about 25% of the students served are likely to reach proficiency. If you invest in Program B for your ‘on the bubble’ children, about 50% of them are likely to reach proficiency. In other words, Program A gives you demonstrably more bang for the buck in helping your neediest scholars catch up. But Program B gets you double the gains in percent of students proficient. If you still choose Program A, I salute you! But your boss will not be impressed.
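The arithmetic behind that tradeoff can be sketched in a few lines. This is a minimal illustration, not a model: the cohort size is invented, and the gains and conversion rates are the hypothetical figures from the scenario above.

```python
# Hypothetical sketch of the Program A vs. Program B tradeoff.
# Costs, gains, and conversion rates come from the scenario in the text;
# the cohort size of 100 students is invented for illustration.

COST_PER_STUDENT = 500  # dollars per child per year, both programs
STUDENTS = 100          # invented cohort size served by each program

# Program A: serves 'below basic' students ~20 points under the threshold;
# average gain 10 points; ~25% of served students reach proficiency.
gain_a, reach_a = 10, 0.25

# Program B: serves 'bubble' students within ~5 points of the threshold;
# average gain 5 points; ~50% of served students reach proficiency.
gain_b, reach_b = 5, 0.50

total_cost = COST_PER_STUDENT * STUDENTS  # identical for both programs

# The thing we care about: learning gains per dollar spent.
points_per_dollar_a = gain_a * STUDENTS / total_cost
points_per_dollar_b = gain_b * STUDENTS / total_cost

# The proxy that gets reported: students newly counted as proficient.
newly_proficient_a = reach_a * STUDENTS
newly_proficient_b = reach_b * STUDENTS

print(f"A: {points_per_dollar_a:.3f} pts/$, {newly_proficient_a:.0f} newly proficient")
print(f"B: {points_per_dollar_b:.3f} pts/$, {newly_proficient_b:.0f} newly proficient")
# Program A delivers twice the learning gain per dollar; Program B moves
# twice as many students across the 'proficient' line -- the proxy rewarded.
```

With these numbers, the program that looks worse on the proxy (percent proficient) is the one that produces more actual learning per dollar, which is exactly the perverse choice the proxy invites.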
Any metric can create perverse incentives, but proxy measures are particularly prone to misuse, and professionals in a system will generally make the choice that most directly affects the measures on which they are formally evaluated or publicly judged.
A final note: If you are a superintendent, or a researcher, or a data analyst designing evaluation and accountability systems, you would do well to combine the two “Don’ts”: Don’t forget that it’s easy and tempting to exploit proxies.