
7 Rules for Data Integrity

December 17, 2020
Data

Not all data is the same. It might have come from the same source, but how it gets treated is vital. If a data company doesn’t have good data hygiene practices, things can get messy very quickly, making the data hard to understand and undermining your valuable analysis.

  1. Always know where the data came from → We record the precise source of every record, so that our users can go back to the original and validate each one.
  2. Always know when the data was collected → We don’t just record where we got the data from, we also record when we collected it, so if the source changes, we can update our records to match.
  3. Never overwrite source data → Some of our data needs to be improved, for example when a date is incomplete or a better category can be added. When we make these improvements, we always add to the data rather than overwrite the underlying source.
  4. Generate metadata → Our clients want to be able to filter data on attributes that we create, such as the language of a record. We augment our data with useful metadata every time we gather a record.
  5. Handle duplicates with sensitivity → We see a lot of duplicates, and some records that look like duplicates but aren’t. So we don’t provide a binary ‘on’ or ‘off’ analysis of duplicates. Instead, we look at eight key attributes and score them to provide a good understanding of whether something is a duplicate.
  6. Matching needs manual checks → Entity matching is incredibly hard to get right. Algorithms can help, but in the end every match that isn’t an exact match needs to be checked to make sure it is correct. We do this because the details matter: if we get a contract award wrong, it can affect investment decisions.
  7. Be ready to highlight anomalies → We wish that some of the records we gather were better formed, had better information, or just had the data that they were supposed to have. We have to accept that this isn’t always the case. So where things aren’t right, we don’t shy away and we don’t pretend that everything is rosy. We tell our users where the problems are and let them judge what’s best.
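
To make a few of these rules concrete, here is a minimal Python sketch of how rules 1, 2, 3 and 5 might look in code. It is an illustration only, not our production system. The class and function names, the eight attribute names and the equal weighting of attributes in the score are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class SourceRecord:
    """A record exactly as it was collected (hypothetical structure)."""
    source_url: str          # rule 1: where the record came from
    collected_at: datetime   # rule 2: when it was collected
    raw: Mapping[str, Any]   # rule 3: read-only view of the source values


def make_source_record(source_url: str, raw: dict,
                       collected_at: Optional[datetime] = None) -> SourceRecord:
    """Copy the raw values and wrap them so they cannot be overwritten."""
    when = collected_at or datetime.now(timezone.utc)
    return SourceRecord(source_url, when, MappingProxyType(dict(raw)))


@dataclass
class EnrichedRecord:
    """Layers improvements on top of a source record without changing it."""
    source: SourceRecord
    additions: dict = field(default_factory=dict)

    def enrich(self, key: str, value: Any) -> None:
        # Improvements are stored separately; the source value stays intact.
        self.additions[key] = value

    def get(self, key: str) -> Any:
        # Prefer the improved value, fall back to the original source value.
        if key in self.additions:
            return self.additions[key]
        return self.source.raw.get(key)


# Assumption: these eight attribute names are placeholders for illustration.
DUPLICATE_ATTRIBUTES = (
    "title", "buyer", "supplier", "value",
    "currency", "award_date", "reference", "description",
)


def duplicate_score(a: EnrichedRecord, b: EnrichedRecord,
                    attributes=DUPLICATE_ATTRIBUTES) -> float:
    """Return the share of attributes (0.0 to 1.0) on which two records agree.

    Assumption: every attribute counts equally, and a value missing from
    either record never counts as agreement.
    """
    matches = 0
    for attr in attributes:
        left, right = a.get(attr), b.get(attr)
        if left is not None and left == right:
            matches += 1
    return matches / len(attributes)
```

In this sketch an incomplete date can be improved with `record.enrich("award_date", ...)` while `record.source.raw["award_date"]` still holds the value as published. The duplicate check returns a score rather than a yes or no, so the decision about what counts as a duplicate can be made with the evidence in view.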

We think that how we approach quality matters. We don’t tell you things you want to hear just to get a sale; we tell you what we know. We want to build partnerships, not future problems. If you’d like to know more about our data or our research services, get in touch.

contact@spendnetwork.com
