The Evaluator in the Room
The Invisible Work Behind Visible Outcomes
The most important work in evaluation never shows up in the dashboard.
I brought that line into We All Count’s Talking Data Equity series when I presented this framework. What was striking was not that the idea felt new, but that it was immediately recognizable across contexts. That kind of recognition points to something structural: the work that makes evidence usable is widely experienced, but rarely named or accounted for within evaluation itself.
Many people knew this work, and the space between what gets counted and what has to be understood. They knew the moment when a dashboard satisfied the reporting requirement but left the real question untouched. They knew the quiet labor that happens before the report arrives, before the findings deck is shared, before the clean chart appears in front of a board, a funder, or a leadership team. That is the work I wanted to name and share in that talk: not evaluation as a method alone, but evaluation as decision infrastructure.
Evidence does not become decision-ready because it has been counted. It becomes decision-ready because someone has done the invisible labor of making the evidence honest, usable, and aligned with the decision it is meant to inform.
Where the work actually happens
The following examples demonstrate the invisible work evaluation does to examine evidence and hold the questions no one wants to ask.
Local Arts Nonprofit
An organization working with middle school students had a funder focused on learning outcomes, program staff who understood the work as identity formation and belonging, and community members whose relationship to the program was cultural in ways neither framework fully captured. Each perspective was legitimate, but they were not measuring the same thing.
The visible outcome was a report. It included the number of students served, performances attended, and hours of instruction. Those numbers were real, but they could not carry the full meaning of the work on their own.
The invisible labor was translation.
The work was not to force everyone into one flat story, but to build a framework in which different stakeholders could recognize their own understanding within the same body of evidence without pretending those understandings were identical.
That meant staying inside the tension long enough to ask the question the dashboard could not answer: what is the closest honest proxy for what this work produces, and what do we need to say clearly about what that proxy cannot capture? The report did not show that labor, but everything built from it depended on it.
The question the room has not asked
International Health Organization
The organization had screened tens of thousands of people for a disease. The number was large, the coverage documented, and the funder report showed scale. Everyone in the relationship had reason to be satisfied.
The question the evaluator held back from asking was this: of the positive screens, how many received treatment? Of those who received treatment, how many completed it?
The data were fragmented across clinic records, community health worker logs, and follow-up documentation, and they had never been pulled together. Nobody had asked for them in that form, but the outcome question was sitting there, not outside the evaluation but inside it.
The moment of judgment was not only methodological. It was professional, relational, and structural. Is it the evaluator’s job to ask a question no one in the relationship has required?
That is where invisible labor stops sounding soft.
A number that satisfied everyone in the room except the question is not a finding. It is a gap wearing the shape of a finding.
The evaluator’s job is not to embarrass the room or turn every engagement into a confrontation, nor to punish organizations for measuring what they were asked to measure. But evaluation cannot serve decision-making if it only protects the questions the room already knows how to hold. That is the distinction that matters.
The invisible labor of holding the questions no one wants to ask is the work of pushing against the pull toward the questions the room already knows how to hold. Without it, evidence can look complete while remaining decision-poor. A dashboard can be accurate and still not be useful. A finding can satisfy a relationship and still fail the question the investment was always asking.
That is why the invisible labor matters. Not because evaluators need to be praised for what no one sees, though many have carried this work without naming it, but because the labor is part of the evaluation’s actual value. Serious evaluation has to name the labor, not just the methodology. It has to measure toward intent, not toward ease, and hold the question long enough for evidence to become more than proof that activity occurred.
Evidence should help capital move toward what it intends to produce. That requires more than data collection. It requires an evaluator in the room willing to do the invisible work behind visible outcomes.
Rhonda Williams, Ph.D., is an evaluator and philanthropic practitioner examining how capital deploys toward or away from the outcomes it intends to produce.
This essay is part of Evidence 2 Decision, where I trace the distance between capital deployed and change produced. Explore the work at evidence2decision.com.




