The Situation
A strong instinct, and a gap between that instinct and the full picture.
Design and standards teams at a global real estate organization were spending months doing benchmarking manually — gathering industry data, building comparisons, and trying to draw conclusions from a process that was slow, inconsistent, and deeply dependent on individual effort. The team's instinct was right: an internal tool to automate that process and surface industry comparisons faster would be genuinely valuable.
The initial product direction reflected that instinct. Build for Design and Standards teams. Give them a faster way to compare their facility specifications against industry benchmarks. Enable better, more consistent decisions with better, more accessible data.
What research did wasn't redirect that vision — it deepened it. As teams described how they actually used benchmarking data day-to-day, four things emerged that the original scope hadn't accounted for, and that would have significantly limited the tool's usefulness if they hadn't been surfaced before design began.
Research Approach
Four functions. Two continents. One question guiding every conversation.
Discovery interviews ran across Design, Standards, Risk, and Operations Engineering — in North America and Europe — before any product decisions were locked. The deliberate breadth across functions and regions was necessary: benchmarking data serves different purposes for different roles, and what makes it useful for a design engineer making a specification call is different from what makes it useful for a risk manager or a standards lead defending a company-wide decision.
"What decisions are you actually trying to make — and what would data need to look like for you to feel confident enough to act on it?"
That framing kept conversations away from feature wishlists and toward something more useful: the gap between what teams needed to make a decision and what they currently had. Decision context mapping made those gaps concrete — specific enough to translate directly into product requirements rather than general themes.
Key Findings
Four things a well-directed team hadn't fully scoped — because they hadn't asked yet.
There were two benchmarking questions, not one
Industry benchmarking — how do we compare to the market? — was exactly what the team had planned for. But research uncovered an equally important question teams were constantly trying to answer: how are we solving this problem across our own regions and business units, and what can we learn from ourselves? When a region had already solved a flooding risk, or found a lower-cost material that held up, or reduced clear height without impacting operations — that knowledge wasn't reaching other teams. Internal cross-regional comparison wasn't a secondary feature. It was the other half of how decisions actually got made.
Data only drives decisions when cost is attached
Teams weren't thinking in specifications — they were thinking in trade-offs. The real question was never "is 70ft better than 65ft?" It was "is that 5ft difference worth $2M per facility?" Research surfaced a model that reframed how the tool should present data: start with a baseline facility at industry-standard cost, then show each company-specific choice as a line item — what it costs and what it buys. That framing gave leadership something to actually decide on, rather than just review. Without cost context, the tool would have been a reference database. With it, it could support real-time decisions in leadership conversations.
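One way to picture the baseline-plus-line-item model is as a small data structure: a baseline cost plus a list of priced, justified choices. The sketch below is illustrative only; the field names and dollar figures are assumptions, not the tool's actual data model.

```python
from dataclasses import dataclass

@dataclass
class LineItem:
    """One company-specific choice, expressed as a cost delta over the industry baseline."""
    specification: str  # e.g. "Clear height: 70ft vs. 65ft industry standard"
    cost_delta: float   # what the choice costs per facility, in dollars
    rationale: str      # what the choice buys -- the "why" behind the decision

def facility_cost(baseline_cost: float, line_items: list[LineItem]) -> float:
    """Total cost: the industry-standard baseline plus every company-specific choice."""
    return baseline_cost + sum(item.cost_delta for item in line_items)

# Illustrative numbers only, echoing the 70ft-vs-65ft example above.
items = [LineItem("Clear height: 70ft vs. 65ft", 2_000_000, "Extra racking capacity")]
print(facility_cost(baseline_cost=30_000_000, line_items=items))  # 32000000
```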
Not every industry comparison is a fair one
Some specifications look above-market until you understand the operational context behind them. A building shaped differently than industry norms, or a yard wider than standard, isn't over-engineering — it's built for a fundamentally different throughput model. Comparing those specs to industry averages without accounting for that context produces conclusions that are technically accurate and practically misleading. Research produced a simple framework to distinguish where industry comparison is meaningful, where it's directional, and where internal comparison is the more honest benchmark — so the tool could guide users rather than just surface data.
Decisions outlast the people who made them
Teams kept encountering standards with no documented rationale — specifications that had persisted long after the original reason was gone, or been challenged without any record of why they existed. Capturing the "why" behind a decision, not just the number, became a product requirement. Without it, the tool would surface data without the context needed to challenge or defend it — leaving teams in the same position they were already in, just faster.
The Comparability Framework
A tool that guides the right comparison — not just any comparison.
One of the most concrete outputs of the research was a framework for distinguishing when different types of comparison are appropriate. Built into the tool's logic, it prevents the most common benchmarking failure: reaching a confident conclusion from a comparison that was never valid in the first place.
Industry comparison is the right benchmark
When the operational context is comparable — similar building type, throughput model, and use case — industry data provides a valid external reference point for evaluating whether a specification is above, at, or below market.
Industry comparison is informative, not conclusive
When context differs but isn't entirely dissimilar, industry data can signal direction and prompt questions — but shouldn't be treated as a definitive verdict. Useful for identifying whether further investigation is warranted.
Internal comparison is the more honest benchmark
When the operational model is sufficiently distinct from industry norms — different throughput, footprint, or building configuration — internal cross-regional comparison is more valid than external. The organization's own data becomes the reference.
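The case study doesn't spell out how this framework is encoded in the tool, but as a rough sketch, the three tiers above could map onto a simple rule over the context dimensions mentioned earlier (building type, throughput model, use case). The function name and the match-counting threshold are assumptions for illustration, not the tool's actual logic.

```python
from enum import Enum

class ComparisonMode(Enum):
    INDUSTRY = "industry comparison is the right benchmark"
    DIRECTIONAL = "industry comparison is informative, not conclusive"
    INTERNAL = "internal comparison is the more honest benchmark"

def recommend_comparison(building_type_matches: bool,
                         throughput_model_matches: bool,
                         use_case_matches: bool) -> ComparisonMode:
    """Map operational-context similarity to the type of benchmark the tool should suggest."""
    matches = sum([building_type_matches, throughput_model_matches, use_case_matches])
    if matches == 3:
        return ComparisonMode.INDUSTRY     # fully comparable context: external reference is valid
    if matches >= 1:
        return ComparisonMode.DIRECTIONAL  # partially comparable: a signal, not a verdict
    return ComparisonMode.INTERNAL         # distinct operating model: internal data is the reference

# Example: similar building type and use case, but a different throughput model
print(recommend_comparison(True, False, True))  # ComparisonMode.DIRECTIONAL
```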
What Changed
Four findings. Four expansions to the product vision — before a single design decision was made.
Each insight translated directly into a specific product implication. Research didn't redirect the project — it ensured the right version of the project got built.
Internal cross-regional benchmarking became a core module — Not a future phase, not a nice-to-have addition. Research made clear this was half of how decisions actually got made, so it was built into the product vision from the start, grounded in how teams actually investigate and decide rather than in assumption.
Cost context became central to the data experience — The baseline-plus-line-item model shifted the tool from a reference database to a decision-support surface. Specifications could now be evaluated as trade-offs, not just comparisons — giving leadership the framing needed to make actual choices, not just review data.
The comparability framework was built into the tool's logic — Rather than surfacing any data for any comparison, the tool guides users toward the right type of benchmark for their context — preventing the most common failure mode of confident conclusions drawn from invalid comparisons.
Two adoption risks identified and addressed before launch — Research surfaced organizational gaps that would have undermined adoption if discovered post-launch. Early identification gave the team time to address them — the difference between a product that launches into a problem and one that's ready for the environment it's entering.
The Outcome
A more complete product — built from what teams actually needed, not what was assumed at the start.
The original direction was right. Research made it more complete. The benchmarking tool that emerged from discovery was meaningfully different from the one that would have been built without it — not because the instinct was wrong, but because the instinct alone didn't capture the full picture of how benchmarking decisions actually worked.
The most important product decisions are the ones made before design begins. What the tool does, what it doesn't do, what type of comparison it encourages, how it presents data relative to cost, and what organizational context it needs to be useful — those questions have answers that can only come from talking to the people who will actually use it, before assumptions become architecture.
Research impact
2 benchmarking modes built in — industry comparison and internal cross-regional, both grounded in research
4 product expansions from discovery — scope, cost model, comparability framework, decision rationale capture
2 adoption risks identified before launch — giving the team time to close organizational gaps before go-live