While in class, as Don was challenging us to remember that a test is not "valid", but rather, the interpretation of the test can be valid or not, I almost felt a lightbulb go off in my head. (I probably stopped paying attention at this point, but we'll just skip over that fact for now.)
This, exactly, I realized, is how I was taught to practice medicine. Over the years, as I failed (miserably) to convey this idea to others, I finally arrived at the hemoglobin metaphor to explain what I mean.
What does hemoglobin have to do with assessment results?
Here is the scenario. If I approached a physician and said to them: "Hey, Dr. X, I have a patient whose hemoglobin is 90. Should I give her a blood transfusion?", no competent physician would give me a yes or no answer without knowing a lot more about the patient. Or, to put it more simply, as a recent victim of this discussion said: "What's the story?" Physicians would want to know who the patient is, where the patient is, what their symptoms are, why the test was done, what medical conditions the patient has, and a myriad of other things. Most would engage me in a long discussion about the patient, weighing the pros and cons of transfusion and exploring other options. Many would question whether I was sure the hemoglobin was "valid" - i.e., did that measurement represent what was going on with the patient's blood at the time the test was done, or could there have been a measurement error? They likely would not want to make a reasonably high-stakes decision based on only one measurement.
If you are reading this and you do not deal with hemoglobin values regularly, you might have already gleaned that this is not a straightforward question with this particular lab value. Specifically, a hemoglobin of 90 is squarely in the gray zone. If someone had a much higher value (say, 140), then no one would ever consider transfusing (OK, rarely ever), and if it were much lower (like 40), most people would advise strongly considering transfusing the patient, barring extenuating circumstances. However, it's not clear at what value the answer changes from "it depends" to yes or no. Likely, if you asked 20 clinicians, you would get 10-20 different answers.
The other issue that is important in this scenario is that giving a patient a blood transfusion is a reasonably high-stakes intervention. If the intervention were much lower stakes, with little potential for harm, the answers would be quite different. Many would say: "Well, there would be little harm (note I didn't say 'no harm') in doing X or Y, so you might as well try it", but no one would say that (I hope) with respect to a transfusion.
OK, so can we get to test scores already?
Yes, we can. What hit me is that the exact same clinicians who would provide a very thoughtful analysis of the scenario above were often taught to blindly accept test scores as 100% "valid". Student 'a' gets 60.1% on a test, and thus, they pass. Student 'b' gets 59.9% on a test, and thus, they fail. Yes, most of the time (in our institution, anyway) the test items are analyzed so that we can delete problem questions, but again, there is a certain leap of faith involved in going from "the item analysis statistics are acceptable" to "it's a good question so we'll keep it in".
In undergraduate medical education, where I work, these pass/fail decisions are high stakes for our students, as a failed exam at best means they have to write a supplemental exam, or at worst, they fail a course and may have to repeat a portion of the curriculum - a not insignificant setback by anyone's standards.
These decisions are also, of course, high stakes for everyone else, as no one (faculty, administration, or the general public) wants medical students to move on if they are not ready to do so. (The question of whether students are going to be able to 'succeed' in medicine is not part of the hemoglobin metaphor, thankfully; it is an important discussion, but beyond the scope of this post.)
The other parallel is around the "what is the magic number?" question. At what point is a score too low? We are currently grappling with the 60% issue, since 60% is the pass mark in the student assessment policy at our school. If a test is well-constructed, and if the teaching/learning opportunities were aligned with the test, then many people would argue that "only" knowing 60% of, say, anatomy, might not be sufficient to move on. If we are moving towards criterion-referenced testing (which we're trying to do), where we are focusing on "need to know" vs. "nice to know", then the 60% pass mark may well be too low.
There are a lot of "ifs" in the above paragraph. To my mind, this simply highlights the need to ask the exact same question: "What is the story?" What are the possible outcomes of the decision, and what factors led to the test score?
So, now what?
Here is where I admit that the hemoglobin metaphor doesn't actually answer the question about how to interpret a test score. It simply highlights some of the thinking processes that are required in order to interpret a score. Oversimplified, this might boil down to "treat the patient, not the number".
I hope, simply, that this metaphor might help us to have the hard discussions - is 60% enough? If it's not, what is? How are we going to determine that? Are our tests constructed to answer these high stakes questions?
I'll feel I've succeeded if I've left you with more questions than you had before you read this blog.