Contracting has many stakeholders. They may have many different information and data needs in different settings and at different times. Use cases provide a way to identify particular users, and to describe their particular information and data needs.
Over the past few weeks we held two community web meetings on use cases pertaining to open contracting data. These demand-side meetings are meant to engage users to share their thoughts on the importance of the open contracting process and, more specifically, of open contracting data.
1st Community Web Meeting: Media use cases for Open Contracting Data
In the first community web meeting (May 16), we discussed media use cases for open contracting data. Public contracts can be found in the media in many ways: from tender notices in newspapers, to investigative reporting that digs into cases of corruption, overspending, or poor management of public finances. Journalists may turn to contracting information to inform existing stories, or might start looking in contract data to find the stories hidden within it.
With more and more data becoming available, the need for media may appear to be growing, and yet the resources available for investigative reporting are on the decline. In this web meeting we explored how the media are using data on contracts, and how they could use data in future, identifying key use cases for an open contracting data standard. We were interested in how the media make use of public contracting information right now. How does this vary across the world? What would happen if the media had better access to public contracting information? And what media user stories, and related requirements, are priorities for an open contracting data standard?
- Roberto Rocha (@robroc): Interactives and data editor, The Montreal Gazette, Canada
- Friedrich Lindenberg (@pudo): Coder, developer. Digging Deeper with Data Journalism, Germany
Our first presenter, Roberto Rocha from The Montreal Gazette, focused on investigative journalism. The Gazette offers an online database of municipal contracts awarded by the city of Montreal so that the public can track government spending.
Our second presenter, Friedrich Lindenberg from Open Tenders Electronic Daily (OpenTED), a portal designed to create a joint European market for government services and works, focused on digging deeper with data journalism.
Roberto discussed the challenges facing investigative journalists in Montreal relating to contract data. First, finding good, reliable data is difficult. Journalists sometimes produce their own data, either by scraping online databases or by using APIs, as at the Montreal Gazette. At times, only a skilled programmer can get data out of databases that are hard to access. The journalist also needs to understand whether the data was released for political reasons, who put the data together, and what might be missing. Second, contract and tendering systems are outdated (particularly in Montreal). Roberto noted that it is simple to bribe bureaucrats and engineers, which explains the opaqueness of these systems in terms of disclosing contract information. Given this, journalists need to spend weeks at city hall looking through documents, making connections, and finding anomalies. Third, cleaner, more standardized data is needed. Often 80% of the time is spent cleaning up ‘dirty’ data and getting it into a format in which it can be analyzed, instead of telling the story. This is especially important given shrinking North American media budgets. It is often still the norm for documents to be shared as PDFs rather than as machine-readable data. Reporters at the Montreal Gazette often still use old methods to find contracting-related stories, since that is the only way to investigate public contracts in Montreal.
Friedrich discussed the need to dig deeper with data journalism. First, what can be done to allow journalists to look at the data themselves? In Friedrich’s experience, one cannot expect journalists to have much more technical knowledge than using Excel. Therefore it is key to make data available in forms that journalists can use and are familiar with, such as Excel. In addition, simple tools should be created for journalists to use. Second, what are some journalistic contracting questions relating to OpenTED?
- What are the biggest suppliers in my city, state, country?
- What are the biggest authorities in terms of contracting?
- Which industries are taking up a lot of money?
- Is there a lot of defence spending or a lot of infrastructure spending?
In terms of asking these questions, the most interesting bit is not the tender announcement (which EU TED was made for), but the actual contract awards. Third, what is needed to improve OpenTED and its use of this contracting data? It is important to provide training for those who might need it. Data quality improvements are necessary given the existing challenges. Paper documents need to be digitized, as 5% of procurement documents are still received on paper. Company names are entered as plain strings. Authority identifiers are needed, similar to the work of OpenCorporates on corporate identifiers; a possible solution is Public Bodies. For one third of all tenders, the contract award is not published. And the list of needs for improvement goes on!
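As a rough illustration of how the questions above could be asked of award data, here is a minimal sketch. The records, field names, and values are invented for illustration and are not the OpenTED or EU TED schema. The `normalise_name` helper is a deliberately crude assumption about how supplier names typed as strings might be grouped; real matching would need proper identifiers.

```python
from collections import defaultdict

# Hypothetical award records; field names and values are illustrative only.
AWARDS = [
    {"supplier": "Acme Roads Ltd", "authority": "City A",
     "sector": "infrastructure", "value": 500_000},
    {"supplier": "ACME Roads Ltd.", "authority": "City A",
     "sector": "infrastructure", "value": 250_000},
    {"supplier": "Beta Defence", "authority": "Ministry B",
     "sector": "defence", "value": 600_000},
    {"supplier": "Gamma IT", "authority": "City A",
     "sector": "it", "value": 100_000},
]


def normalise_name(name):
    """Crude grouping key for free-text names: lowercase, drop punctuation."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def totals_by(records, field, key=lambda v: v):
    """Sum award values per distinct value of `field`, largest first."""
    totals = defaultdict(int)
    for record in records:
        totals[key(record[field])] += record["value"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Biggest suppliers, biggest authorities, and spending per industry.
top_suppliers = totals_by(AWARDS, "supplier", key=normalise_name)
top_authorities = totals_by(AWARDS, "authority")
top_sectors = totals_by(AWARDS, "sector")

for name, total in top_suppliers:
    print(f"{name}: {total}")
```

Without the normalisation step, the two spellings of the same supplier would be counted separately, which is one reason string-only company names make these questions harder to answer.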
Additionally, a ‘Blaming and Shaming’ site may be useful to show which authorities have missing fields, have not published awards, or have left out certain fields. In many cases laws already exist on what data needs to be provided publicly. The European procurement office cannot and will not monitor this, so an external office or body is needed. It is also pertinent to build bridges between journalists, government, and technologists. The Data Harvest conference brings together journalists from all EU states. Friedrich has been working to connect different groups such as the European Publications Office, companies that can analyze data, and journalists.
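To make the ‘Blaming and Shaming’ idea concrete, the sketch below counts missing fields per authority. It is only an illustration: the list of required fields, the notice records, and the authority names are all hypothetical, and a real site would take its required fields from the applicable publication rules.

```python
from collections import defaultdict

# Hypothetical list of fields an authority is expected to publish.
REQUIRED_FIELDS = ("award_date", "supplier", "value")

# Invented notices; None or "" stands for a field that was not published.
NOTICES = [
    {"authority": "City A", "award_date": "2014-03-01",
     "supplier": "Acme", "value": 1000},
    {"authority": "City A", "award_date": None,
     "supplier": "Acme", "value": None},
    {"authority": "Ministry B", "award_date": "2014-04-10",
     "supplier": "", "value": 500},
    {"authority": "Ministry B", "award_date": "2014-04-12",
     "supplier": "Beta", "value": 700},
]


def missing_field_report(notices, required=REQUIRED_FIELDS):
    """Return (authority, missing count, share missing), worst first."""
    counts = defaultdict(lambda: {"notices": 0, "missing": 0})
    for notice in notices:
        entry = counts[notice["authority"]]
        entry["notices"] += 1
        entry["missing"] += sum(
            1 for field in required if notice.get(field) in (None, "")
        )
    report = []
    for authority, c in counts.items():
        share = c["missing"] / (c["notices"] * len(required))
        report.append((authority, c["missing"], round(share, 2)))
    return sorted(report, key=lambda row: row[2], reverse=True)


report = missing_field_report(NOTICES)
for authority, missing, share in report:
    print(f"{authority}: {missing} missing values ({share:.0%} of required)")
```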
Following the presentations, there was a fruitful discussion on improving the publication and use of open contracting data. Lisette Garcia (FOIA Resource Centre) noted the importance of understanding and promoting the legal statutes that already exist. Friedrich (OpenTED) noted that journalists want more data on those who have put in offers but have not won, and he was not aware of anyone publishing this type of data. He added that answering certain questions, such as what services were to be provided under what timelines, requires the actual contract documents. Jose Alonso (Web Foundation) concurred with Lisette's comments pertaining to the legal system. Jose had a related chat with a person working on transparency and environmental resource issues in Europe, where there was a similar discussion on how the open data community lacked legal expertise (e.g., on laws and regulations). These instruments, if combined properly, could help the community push the government to release the things the community really wants. Juan (Programmer, Paraguay) is working with contracting data (JSON-LD) within the government in Paraguay and noted that it is hard to find a model they could use that covers the phases of the contracting process. He stressed the need for a project such as the Open Contracting Data Standard and how the same model could then be used for other systems around the world. He is keen to use this standard. Tim (Web Foundation) noted that contracts are often only published above a certain threshold, making it hard to aggregate spending if much of it falls under the threshold.