Written By Duncan Dewhurst and Tim Davies, 30 Jun 2016

As we learn from the feedback we receive on implementations of the Open Contracting Data Standard, certain topics keep coming up. We’d like to pull them together and share them in a series that looks at the technical implementation of the data standard. This first post looks at different options for your API structure. If you are working on the OCDS, you might also want to look at our process for updating it.

The Open Contracting Data Standard (OCDS) provides a common schema that can be used to publish data on all stages of the contracting process in a range of different data file formats (CSV, Excel, JSON).

In the ideal scenario:

  • Individual releases and records for each contracting process should be available at unique persistent URLs;
  • Bulk downloads in CSV (and, if appropriate, Excel) format should be available covering set periods of contracting;
  • Users should be able to easily locate the collections of releases and records they want.

Although this can be achieved by writing individual files to a web-accessible file system, in most cases we are seeing publishers opt for a database and API approach.

We are currently consulting on a simple API specification for OCDS, but in this blog post I want to reflect on the different approaches that can be taken in the architecture of an API.

Inside-out: getting data to the web

Publishing OCDS data usually involves a conversion step, where data is mapped from the publisher’s internal data model to the common OCDS schema. This can happen at a number of different points. Below we outline the approaches we’ve observed.

(1) Direct publication from live systems

In this scenario, each originating system publishes data directly in OCDS format via an API. Data is not stored in OCDS format; instead, it is converted each time an API call is received.

[Diagram: direct publication from live systems, with key]

Pros:

  • Fewer systems to maintain

Cons:

  • Where data originates in multiple systems, multiple APIs must be maintained and 3rd party systems may have to make calls to multiple APIs
  • Complex or high volume API calls may place additional load on live system
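As a rough illustration of converting on request, here is a minimal Python sketch. The internal row layout, the status code mapping, the `ocds-xxxxxx` prefix and the function names are all assumptions made up for this example; a real mapping would cover far more of the schema.

```python
# Hypothetical internal row, as a live procurement system might store it.
# The column names are assumptions for illustration only.
INTERNAL_TENDERS = {
    "T-001": {
        "ref": "T-001",
        "title": "Road resurfacing",
        "status_code": 2,
        "agency": "Ministry of Works",
        "updated": "2016-06-01T09:00:00Z",
    }
}

# Assumed mapping from internal status codes to OCDS tender status values.
STATUS_MAP = {1: "planned", 2: "active", 3: "complete", 4: "cancelled"}

# Placeholder prefix; a real publisher would use its own registered prefix.
OCID_PREFIX = "ocds-xxxxxx"


def to_release(row):
    """Map one internal row to a minimal OCDS-style release dict."""
    ocid = f"{OCID_PREFIX}-{row['ref']}"
    return {
        "ocid": ocid,
        "id": f"{ocid}-{row['updated']}",
        "date": row["updated"],
        "tag": ["tender"],
        "initiationType": "tender",
        "buyer": {"name": row["agency"]},
        "tender": {
            "id": row["ref"],
            "title": row["title"],
            "status": STATUS_MAP[row["status_code"]],
        },
    }


def handle_release_request(ref):
    """Simulate an API call: read live data and convert it on the fly."""
    row = INTERNAL_TENDERS.get(ref)
    if row is None:
        return 404, {"error": "not found"}
    return 200, to_release(row)
```

Nothing is stored in OCDS format here: every call to `handle_release_request` reads the live data and runs the conversion again, which is where the extra load on the live system comes from.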

(2) Separate OCDS datastore, pull and convert

In this scenario a middleware system sits between live systems and the internet facing API. An automated process pulls data from live systems to the middleware system which performs the conversion to OCDS and maintains a datastore in OCDS format.

[Diagram: separate OCDS datastore, pull and convert, with key]

Pros:

  • Modular approach
  • Single API to maintain and for 3rd party systems to call
  • Complex or high volume API calls do not place additional load on live system
  • Possible to share and re-use open source code for providing the datastore and API

Cons:

  • Additional system to maintain
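The pull step can be sketched as a scheduled job that asks the live database for rows changed since the last run, converts them and writes them to the OCDS datastore. The sketch below is only illustrative: the `tenders` table, its columns and the dict standing in for the datastore are assumptions, and an in-memory SQLite database plays the part of the live system.

```python
import json
import sqlite3


def make_live_db():
    """Stand-in for a live system database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tenders (ref TEXT, title TEXT, updated TEXT)")
    db.executemany(
        "INSERT INTO tenders VALUES (?, ?, ?)",
        [
            ("T-001", "Road resurfacing", "2016-06-01T09:00:00Z"),
            ("T-002", "School meals", "2016-06-02T14:30:00Z"),
        ],
    )
    return db


def convert(ref, title, updated):
    """Convert one internal row to a minimal OCDS-style release."""
    ocid = f"ocds-xxxxxx-{ref}"  # placeholder prefix
    return {
        "ocid": ocid,
        "id": f"{ocid}-{updated}",
        "date": updated,
        "tag": ["tender"],
        "initiationType": "tender",
        "tender": {"id": ref, "title": title},
    }


def pull_and_convert(live_db, datastore, since):
    """Pull rows changed after `since`, convert and store them.

    Returns the new high-water mark to pass to the next run. Timestamps
    are compared as strings, which works here because they share one
    fixed UTC format.
    """
    rows = live_db.execute(
        "SELECT ref, title, updated FROM tenders "
        "WHERE updated > ? ORDER BY updated",
        (since,),
    ).fetchall()
    for ref, title, updated in rows:
        release = convert(ref, title, updated)
        datastore[release["id"]] = json.dumps(release)
        since = updated
    return since
```

A scheduler would call `pull_and_convert` at intervals, keeping the returned timestamp between runs. The public API then reads only from `datastore`, so heavy API traffic never reaches the live database.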

A similar approach has been adopted by European Dynamics to support OCDS output from a new e-procurement system for the Zambian Public Procurement Agency, the key difference being that data is pushed rather than pulled from the live e-procurement systems whilst conversion takes place at the middleware layer.

(2.5) Separate OCDS datastore, convert and push

This scenario can be viewed as a combination of the two previous scenarios. Live systems perform the conversion of data to OCDS format and push this to a middleware system which maintains an OCDS format datastore and an internet facing API.

[Diagram: separate OCDS datastore, convert and push, with key]

Pros:

  • Modular approach
  • Single API to maintain and for 3rd party systems to call
  • Complex or high volume API calls do not place additional load on live system
  • Possible to share and re-use open source code for providing the datastore and API
  • Middleware complexity reduced over scenario (2)

Cons:

  • Additional system to maintain
  • Where data originates in multiple systems, multiple OCDS conversions must be maintained
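To make the division of labour concrete, here is a small Python sketch in which the live system converts and pushes, and the middleware only checks and stores what it receives. The `Middleware` class, the `on_tender_saved` hook and the status codes returned are hypothetical, and a direct method call stands in for what would normally be an HTTP request. The field check is a minimal assumed one, not full schema validation.

```python
# Top-level fields the middleware insists on before storing a release.
REQUIRED_FIELDS = ("ocid", "id", "date", "tag", "initiationType")


class Middleware:
    """Hypothetical OCDS datastore that accepts pushed releases."""

    def __init__(self):
        self.releases = {}

    def receive(self, release):
        """Store an already-converted release; no conversion happens here."""
        missing = [f for f in REQUIRED_FIELDS if f not in release]
        if missing:
            return 400, {"error": f"missing fields: {', '.join(missing)}"}
        created = release["id"] not in self.releases
        # Storing by release id makes a repeated push harmless.
        self.releases[release["id"]] = release
        return (201 if created else 200), {"id": release["id"]}


def on_tender_saved(row, middleware):
    """Hook a live system might run after saving a change.

    The conversion to OCDS happens here, on the live system side.
    """
    ocid = f"ocds-xxxxxx-{row['ref']}"  # placeholder prefix
    release = {
        "ocid": ocid,
        "id": f"{ocid}-{row['updated']}",
        "date": row["updated"],
        "tag": ["tender"],
        "initiationType": "tender",
        "tender": {"id": row["ref"], "title": row["title"]},
    }
    return middleware.receive(release)
```

Each live system would need its own version of `on_tender_saved`, which is the cost noted above, while the middleware stays small because it never needs to understand any internal data model.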

A similar approach has been adopted by the OpenProcurement system, developed in Ukraine and used as the basis for the Prozorro platform, which uses OCDS building blocks as the foundation for the live systems’ data models, easing the conversion process. It should be noted that OCDS is not a framework for building an e-procurement system. However, mapping against OCDS can help ensure e-procurement systems are capturing relevant data for disclosure.

(3) Separate OCDS datastore, manual import

In this scenario a middleware system sits between live systems and the internet facing API.

Data is manually exported from live systems for upload to the middleware system which performs conversion to OCDS and maintains a datastore in OCDS format.

[Diagram: separate OCDS datastore, manual import, with key]

There’s a good documented example of this approach from the work Development Gateway have been carrying out in Vietnam.

Pros:

  • Modular approach
  • Single API to maintain and for 3rd party systems to call
  • Complex or high volume API calls do not place additional load on live system

Cons:

  • Additional system to maintain
  • Manual export/import process introduces potential for failure

 Things to think about

  • Search endpoints. Your API may do more than just provide individual releases and records. You may want to provide endpoints which can be used to fetch all the contracting processes involving a particular product type, a particular supplier, or a particular procuring agency. Consider whether these endpoints will provide only JSON, or whether they can also provide custom exports of CSV and Excel data for users who are more familiar with spreadsheets.
  • Documents. OCDS is not just about metadata and data on contracting processes; it is also about disclosure of documents. In many cases we’ve found that where systems link out to documents on external platforms, link rot can quickly set in. The best systems will ensure that documents are archived, and kept available permanently.
  • Generating records. OCDS has the idea of releases (snapshot information about a contracting process), and records (summary of the current state of the process, and links to what has gone before). Every time there is a new release in your system, you will need to update the corresponding record. This could be done at the time data is converted, or could be a separate process, triggered on each update.
  • Generating bulk exports. Periodic exports of your data are very useful to researchers, analysts and other users. Think about how data will be segmented across bulk files, particularly if you are bulk exporting records.
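Two of these points, generating records and segmenting bulk exports, can be sketched in a few lines of Python. The merge below is deliberately simplified: later releases overwrite earlier values and nested objects are merged key by key, but lists are replaced wholesale, which is not the full OCDS merge routine. The function names, the `release_url` callback and the choice of monthly segments are assumptions for illustration.

```python
from collections import defaultdict


def deep_merge(base, update):
    """Return a copy of `base` with values from `update` laid over it."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def build_record(releases, release_url):
    """Build a record for one contracting process from its releases.

    `release_url` is a hypothetical callback returning the persistent
    URL at which a given release is published.
    """
    ordered = sorted(releases, key=lambda r: r["date"])
    compiled = {}
    for release in ordered:
        compiled = deep_merge(compiled, release)
    return {
        "ocid": ordered[0]["ocid"],
        "releases": [
            {"url": release_url(r), "date": r["date"], "tag": r["tag"]}
            for r in ordered
        ],
        "compiledRelease": compiled,
    }


def segment_by_month(releases):
    """Group releases into bulk-file buckets keyed by 'YYYY-MM'."""
    buckets = defaultdict(list)
    for release in releases:
        buckets[release["date"][:7]].append(release)
    return dict(buckets)
```

Calling `build_record` again whenever a new release arrives keeps the record in step with the releases. Monthly buckets work naturally for releases, each of which has one date. Records span the whole life of a process, so a different segmentation rule (for example, by the date of the first release) may suit them better.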

 

 

  • Mireille Raad

    Would you say that one of the benefits of “(2.5) Separate OCDS datastore, convert and push” is that it enables more real-time OCDS data publication (or at least makes it easier)?

    My logic is that it would be “more real time” to push data to the OCDS data store as soon as updates happen… in the pull scenario, the pull has to be scheduled or there has to be a subscription system or some sort of web hooks (in which case it is a push-pull system)

    Let me know your thoughts for any realtime considerations.

    • Tim Davies

      Agreed.

      The reason to list both is that push systems require adaptations to live deployments, whereas pull systems can be developed with an independent tool that just needs to access the live database: hence sometimes they are practically easier to implement.

      However, I agree a push system has a number of benefits over a pull system in terms of making sure data is kept updated as things change (and avoiding, for example, only getting one of N updates to a contracting process, if those changes took place between times data was ‘pulled’).

  • Mireille Raad

    I would also add “Aggregation” to “Things to think about”.

    One of the use cases of OCDS is value-for-money analysis for government. Satisfying this use case needs a lot of analytics, so one of the things to think about is how to position the aggregation framework in the diagrams.

    • Tim Davies

      This is a really good point. We’ve not dug deeply into aggregation API design yet – although we had some work on it in December at the OGP Summit with Yohanna, who has recently joined the LATAM helpdesk hub, and hopefully will be helping us move this forward.