Ensuring Data Integrity.

November 29, 2022
Data

Not all data is the same. It might have come from the same source, but how it gets treated is vital. If a data company doesn’t have good data hygiene practices, things can get messy very quickly, making it hard to understand the data and undermining your valuable analysis. Here are the seven practices we follow.

  1. Always know where the data came from → We record the precise source of every record, so that our users can go back to the original and validate each one.
  2. Always know when the data was collected → We don’t just record where we got the data from, we also record when we collected it, so if the source changes, we can change with it.
  3. Never overwrite source data → Some of our data needs to be improved, for example when a date is incomplete or a better category can be added. When we make these improvements, we always add to the data rather than overwrite the underlying record.
  4. Generate metadata → Our clients want to be able to filter data on attributes that we create, such as the language of a record. We augment our data with useful metadata every time we gather a record.
  5. Handle duplicates with sensitivity → We see a lot of duplicates, and some records that look like duplicates but aren’t. So we don’t provide a binary ‘on’ or ‘off’ analysis of duplicates. We look at eight key attributes and score them to provide a good understanding of whether something is a duplicate.
  6. Matching needs manual checks → Entity matching is incredibly hard to get right. Algorithms can help, but in the end every match that isn’t an exact match needs to be checked to make sure it is correct. We do this because the details matter: if we get a contract award wrong, it can impact investment decisions.
  7. Be ready to highlight anomalies → We wish that some of the records we gather were better formed, had better information, or just had the data that they were supposed to have. We have to accept that this isn’t always the case. So where things aren’t right, we don’t shy away and we don’t pretend that everything is rosy. We tell our users where the problems are and let them judge what’s best.
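
To make the first five practices concrete, here is a minimal Python sketch of how they could fit together. It is an illustration under stated assumptions, not a description of our production system: the class names, the `collect` helper, and in particular the eight attribute names in `DUPLICATE_ATTRIBUTES` are hypothetical placeholders chosen for the example, and the scoring is a simple unweighted share of agreeing attributes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Mapping

# Hypothetical attribute names: the article scores eight attributes
# but does not say which ones.
DUPLICATE_ATTRIBUTES = (
    "title", "buyer", "supplier", "value",
    "currency", "award_date", "country", "reference",
)


@dataclass(frozen=True)
class SourceRecord:
    """A record as collected. Frozen fields plus a read-only mapping
    mean the source data cannot be overwritten (practice 3)."""
    source_url: str          # practice 1: where it came from
    collected_at: datetime   # practice 2: when it was collected
    raw: Mapping[str, Any]   # the untouched source fields


@dataclass
class EnrichedRecord:
    """Improvements and metadata layered on top of the source record."""
    source: SourceRecord
    additions: dict[str, Any] = field(default_factory=dict)

    def add(self, key: str, value: Any) -> None:
        # Practices 3 and 4: corrections and metadata are stored
        # alongside the source values, never in place of them.
        self.additions[key] = value

    def get(self, key: str) -> Any:
        # Prefer an added value, fall back to the source value.
        if key in self.additions:
            return self.additions[key]
        return self.source.raw.get(key)


def collect(source_url: str, raw: Mapping[str, Any]) -> EnrichedRecord:
    """Wrap freshly gathered fields with their source and a UTC timestamp."""
    source = SourceRecord(
        source_url=source_url,
        collected_at=datetime.now(timezone.utc),
        raw=MappingProxyType(dict(raw)),  # read-only copy
    )
    return EnrichedRecord(source)


def duplicate_score(a: EnrichedRecord, b: EnrichedRecord) -> float:
    """Practice 5: share of the eight attributes on which both records
    have a value and those values are equal, from 0.0 to 1.0."""
    agreeing = sum(
        1
        for attr in DUPLICATE_ATTRIBUTES
        if a.get(attr) is not None and a.get(attr) == b.get(attr)
    )
    return agreeing / len(DUPLICATE_ATTRIBUTES)
```

Under this sketch, two records that agree on three of the eight attributes score 0.375 rather than being flagged as simply "duplicate" or "not duplicate". Where to draw the line on that score, and how to weight each attribute, is left open here.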

We think that how we approach quality matters. We don’t tell you things you want to hear just to get a sale; we tell you what we know. We want to build partnerships, not future problems. If you’d like to know more about our data or our research services, get in touch.

 

