Does AI mean we don’t need structured procurement data anymore?
A question keeps coming up in our conversations with donors, policymakers, and civil society partners: if AI can now read and make sense of messy, unstructured documents, why should we keep investing in structured data and data standards for procurement?
It’s a fair question. The recent wave of AI capabilities is genuinely impressive. Large language models can extract information from scanned contracts, parse inconsistent spreadsheets, and summarize thousands of pages of tender documents. If you’ve used these tools yourself (for example, to draft an email, build a simple app, or analyze a report), you might wonder whether the hard work of structuring procurement data is still necessary.
We think it is. But the answer deserves more than a reflexive defence of existing approaches. So let’s walk through what AI actually changes, and what it doesn’t.
AI is a digitization assistant, not a replacement for structured data
AI is good at helping to create structured data from unstructured sources. A government that has years of procurement records locked in PDF contracts and scanned documents can now use AI to extract key fields, such as contract values, supplier names, or even timelines, far more quickly than manual data entry would allow. That’s a real and valuable capability.
But notice what’s happening in that scenario: the goal is still to produce structured data. AI is a means of getting there, not an argument against getting there. The destination hasn’t changed; the journey has gotten easier.
The risk we see is a kind of magic-wand thinking: the assumption that because AI can process messy documents, we no longer need to worry about the messiness. That confuses a coping mechanism with a solution. Yes, AI can work with unstructured data. But “can work with” is not the same as “works best with” or “produces reliable results from.”
Two different starting points
It helps to distinguish between two situations, because the role of AI looks quite different in each.
- Where a digital procurement system already exists, structured data is already being produced as a natural byproduct of the process. Suppliers submit bids through forms with defined fields. Contracting authorities record decisions in databases. The data comes out structured because the system was designed that way.
In this case, AI can augment what’s already there: for example, by helping incorporate additional information from unstructured sources like contract amendments or performance reports that sit outside the core system. But it would make no sense to dismantle the structured system and rely on AI to reconstruct the data from documents instead. That would be like sending stock market trading floors back to shouting and hand signals, then using AI to transcribe the chaos into structured records. The structured system is already doing the job, and doing it more reliably.
- Where no digital system exists yet, the question becomes: what should we invest in building? Here, AI can genuinely help as a bridge to more permanent structured data collection. It can help digitize historical records, and it can even help design and build the systems that will produce structured data going forward.
But it matters what kind of system you end up with. A system that deterministically produces structured data, where a form field captures a contract value and stores it correctly, gives you a reliable, verifiable answer. A system that relies on AI to interpret documents and extract that value introduces a layer of uncertainty. The AI might get it right most of the time. But “most of the time” is a problem when you’re trying to track public spending or hold officials accountable. As a colleague put it: do you want the AI to interpret what you meant, or do you want to express yourself clearly from the beginning?
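To make the contrast concrete, here is a minimal Python sketch. It is an illustration under our own assumptions, not a description of any real procurement system: the names `capture_from_form`, `ExtractedValue`, and `needs_human_review`, the confidence score, and the 0.95 threshold are all hypothetical. The form path either stores exactly what was entered or rejects it; the extraction path produces a value that still has to be checked.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class ContractValue:
    amount: Decimal
    currency: str


def capture_from_form(amount_field: str, currency_field: str) -> ContractValue:
    """Deterministic capture: valid input is stored as entered, anything else is rejected."""
    try:
        amount = Decimal(amount_field)
    except InvalidOperation:
        raise ValueError(f"amount is not a number: {amount_field!r}")
    if not amount.is_finite() or amount < 0:
        raise ValueError(f"amount must be a non-negative number: {amount_field!r}")
    if len(currency_field) != 3 or not currency_field.isalpha():
        raise ValueError(f"currency must be a three-letter code: {currency_field!r}")
    return ContractValue(amount, currency_field.upper())


@dataclass(frozen=True)
class ExtractedValue:
    """A value read from a document by an AI tool (hypothetical shape)."""
    value: ContractValue
    confidence: float  # assumed score between 0 and 1 reported by the tool
    source_page: int   # where in the document the value was found


def needs_human_review(extracted: ExtractedValue, threshold: float = 0.95) -> bool:
    """Extraction is probabilistic, so low-confidence values are flagged for checking."""
    return extracted.confidence < threshold
```

The point of the sketch is where the uncertainty lives. In the first function there is none to manage. In the second case someone has to choose a threshold, review the flagged records, and accept that some errors will pass unflagged.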
What are you trying to do with the data?
There’s another distinction worth making: between matching and measuring.
AI is remarkably good at fuzzy matching and pattern recognition. It can look at a tender notice and say, “This looks like you’re buying fire engines. Here are three similar tenders from other jurisdictions.” That kind of capability doesn’t depend on perfectly structured data. It works with messy text, varied formats, and inconsistent terminology. This is genuinely useful for discovery, benchmarking, and connecting buyers with relevant market information.
But measuring is different. Are prices competitive? Are contracts being delivered on time? Are small businesses winning a fair share? If you want to understand how a market is performing, you need specific, consistently collected data points. You need intentionality about what you’re collecting and why. No amount of AI processing can conjure a data point that was never captured, or reliably standardize information that was recorded in fundamentally different ways across hundreds of contracting authorities.
Matching is about finding patterns in what exists. Measuring is about designing systems to capture what matters. AI can help a lot with the first. It doesn’t eliminate the need for the second.
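A small Python sketch shows why measuring depends on what was captured. Everything in it is hypothetical: the `Award` record, the `supplier_is_sme` flag, and the sample figures are invented for illustration. It computes the share of award value going to small businesses, and reports how much of the data the answer actually rests on.

```python
from typing import List, Optional, Tuple, TypedDict


class Award(TypedDict):
    authority: str
    amount: float
    supplier_is_sme: Optional[bool]  # None means the flag was never recorded


def sme_share(awards: List[Award]) -> Tuple[Optional[float], float]:
    """Return (SME share of known award value, fraction of awards with the flag recorded)."""
    known = [a for a in awards if a["supplier_is_sme"] is not None]
    if not known:
        return None, 0.0
    known_total = sum(a["amount"] for a in known)
    sme_total = sum(a["amount"] for a in known if a["supplier_is_sme"])
    coverage = len(known) / len(awards)
    share = sme_total / known_total if known_total else None
    return share, coverage


awards: List[Award] = [
    {"authority": "A", "amount": 100.0, "supplier_is_sme": True},
    {"authority": "A", "amount": 300.0, "supplier_is_sme": False},
    {"authority": "B", "amount": 600.0, "supplier_is_sme": None},  # never captured
]
share, coverage = sme_share(awards)
```

In this invented sample, the share is calculated from only two of the three awards, and the largest one is invisible to the indicator. No processing step after the fact can recover the missing flag; it has to be collected at the source.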
Standardization is infrastructure
Finally, even in a world where AI is doing some of the heavy lifting in extracting and processing data, you still need a common standard to transform that data into. AI doesn’t remove the need for a shared schema. It actually makes it more important.
Think of it this way: standards are infrastructure, and AI is a powerful vehicle. You don’t stop building roads because you’ve invented better cars. The cars work better when the roads exist. A procurement data standard like the Open Contracting Data Standard, used by more than 50 public buyers worldwide, gives AI something to target: a defined structure to populate and a common language to map diverse sources onto. Without that, you just have a very powerful tool producing outputs that can’t easily be compared, combined, or verified.
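As a rough illustration of what “something to target” means, the Python sketch below maps records from two imaginary sources onto one common structure. The source names, column names, and helper functions are all hypothetical, and the target shape (`id`, plus `value` with `amount` and `currency`) is only a simplified structure loosely modeled on how a contract might be described in a standard such as OCDS. It is not the full standard, so check the published schema before relying on any field name.

```python
from typing import Any, Dict

# Hypothetical column names from two imaginary contracting authorities.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "city_a": {"Contract No.": "id", "Contract Amt": "value.amount", "Curr": "value.currency"},
    "city_b": {"ref": "id", "total_value": "value.amount", "currency_code": "value.currency"},
}

# Fields every record must end up with, whatever its source looked like.
REQUIRED_PATHS = ("id", "value.amount", "value.currency")


def set_path(target: Dict[str, Any], dotted_path: str, value: Any) -> None:
    """Write value into a nested dict, creating intermediate dicts as needed."""
    keys = dotted_path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def get_path(source: Dict[str, Any], dotted_path: str) -> Any:
    """Read a nested value, returning None if any step is missing."""
    node: Any = source
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_common_structure(source_name: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one source record onto the shared structure; unmapped fields are ignored."""
    mapping = FIELD_MAPS[source_name]
    contract: Dict[str, Any] = {}
    for field, value in record.items():
        if field in mapping:
            set_path(contract, mapping[field], value)
    missing = [p for p in REQUIRED_PATHS if get_path(contract, p) is None]
    if missing:
        raise ValueError(f"{source_name}: missing required fields {missing}")
    return contract
```

Once both sources land in the same shape, their values can be added up or compared directly, whether the input rows came from a database export or from an AI extraction step. Without the agreed target, each source would need its own one-off interpretation.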
So what’s the answer?
The answer isn’t “ignore AI.” AI is already making it faster and cheaper to digitize procurement records, extract information from documents, and bridge gaps in data coverage.
But the answer also isn’t “stop investing in structured data because AI will sort it out.” That trades a reliable foundation for a probabilistic one, and it underestimates how much the value of procurement data depends on consistency, comparability, and trust.
The smart investment is in both: structured systems that produce high-quality data by design, and AI tools that help fill gaps, augment coverage, and make the data more useful. The two are complementary, not competing. And the standard, the shared structure that makes data comparable and accountable, remains the foundation that makes everything else work.