Does Your Data Have Integrity?

October 20, 2022
Data

Not all data is the same. It might have come from the same source, but how it gets treated is vital. If a data company doesn’t have good data hygiene practices, things can get messy very quickly, making the data hard to understand and undermining your valuable analysis.

These are our rules for ensuring data integrity:

  1. Always know where the data came from → We record the precise source of every record, so that our users can go back to the original and validate each one.
  2. Always know when the data was collected → We don’t just record where we got the data from; we also record when we collected it, so if the source changes, we can update our records to match.
  3. Never overwrite source data → We know that some of our data needs to be improved, for example when a date is incomplete or a better category can be added. When we make these improvements, we always add to the data rather than overwrite the underlying source.
  4. Generate metadata → Our clients want to be able to filter data on attributes that we create, such as the language of a record. We augment our data with useful metadata every time we gather a record.
  5. Handle duplicates with sensitivity → We see a lot of duplicates, and some records that look like duplicates but aren’t. So we don’t provide a binary ‘on’ or ‘off’ analysis of duplicates; instead we look at eight key attributes and score them to give a good understanding of whether something is a duplicate.
  6. Matching needs manual checks → Entity matching is incredibly hard to get right. Algorithms can help, but in the end every match that isn’t exact needs to be checked to make sure it is correct. That’s what we do, because the details matter: if we get a contract award wrong, it can impact investment decisions.
  7. Be ready to highlight anomalies → We wish that some of the records we gather were better formed, had better information, or just had the data that they were supposed to have. We have to accept that this isn’t always the case. So where things aren’t right, we don’t shy away and we don’t pretend that everything is rosy. We tell our users where the problems are, and let them judge what’s best.
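
To make rules 1 to 5 a little more concrete, here is a minimal Python sketch of how they might fit together. It is an illustration only, not a description of our production system: the class names, the `enrich` and `get` helpers, the equal weighting, and in particular the eight attribute names in `DUPLICATE_ATTRIBUTES` are all hypothetical, chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class SourceRecord:
    """A record exactly as it was collected. It is never modified afterwards."""
    source_url: str          # rule 1: where the data came from
    collected_at: datetime   # rule 2: when the data was collected
    raw: Mapping[str, Any]   # rule 3: the untouched source fields

    def __post_init__(self) -> None:
        # Store a read-only copy so the source fields cannot be overwritten.
        object.__setattr__(self, "raw", MappingProxyType(dict(self.raw)))


@dataclass
class EnrichedRecord:
    """Improvements and metadata are layered on top of the source record."""
    source: SourceRecord
    enrichments: Dict[str, Any] = field(default_factory=dict)

    def enrich(self, key: str, value: Any) -> None:
        # Rules 3 and 4: add to the data; the raw record is left as collected.
        self.enrichments[key] = value

    def get(self, key: str) -> Any:
        # Prefer the enriched value, fall back to the source value, else None.
        if key in self.enrichments:
            return self.enrichments[key]
        return self.source.raw.get(key)


# Hypothetical attribute names: eight attributes are scored, but these
# particular names are assumptions made for this example.
DUPLICATE_ATTRIBUTES = (
    "title", "buyer", "supplier", "value",
    "currency", "published_date", "reference", "country",
)


def duplicate_score(a: EnrichedRecord, b: EnrichedRecord) -> float:
    """Rule 5: return the fraction of attributes that are present and equal.

    Equal weighting is an assumption; a real scorer might weight attributes.
    """
    matches = sum(
        1
        for attr in DUPLICATE_ATTRIBUTES
        if a.get(attr) is not None and a.get(attr) == b.get(attr)
    )
    return matches / len(DUPLICATE_ATTRIBUTES)


now = datetime.now(timezone.utc)
a = EnrichedRecord(SourceRecord(
    "https://example.com/notices/1", now,
    {"title": "Road works", "buyer": "City Council", "value": 1000,
     "currency": "GBP", "published_date": "2022-10"},
))
b = EnrichedRecord(SourceRecord(
    "https://example.com/notices/2", now,
    {"title": "Road works", "buyer": "City Council", "value": 1000,
     "currency": "GBP", "published_date": "2022-10-03"},
))

a.enrich("language", "en")    # metadata is added alongside the source fields
print(duplicate_score(a, b))  # a score between 0 and 1, not a yes/no answer
```

The point of the design is that a score leaves room for judgement: two records that agree on half of their attributes can be flagged for a closer look rather than silently merged or silently kept apart.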

We know that data quality matters. We won’t tell you things you want to hear just to get a sale; we’ll tell you what we know. We want to build partnerships, not future problems.

If you’d like to know more about our procurement data, our API or our research services, get in touch.
