7 Rules for Data Integrity
Not all data is the same. Two datasets might come from the same source, but how each one is treated makes all the difference. If a data company doesn’t have good data hygiene practices, things can get messy very quickly, making the data hard to understand and undermining your valuable analysis.
- Always know where the data came from → We record the precise source of every record, so that our users can go back to the original and validate each one.
- Always know when the data was collected → We don’t just record where we got the data from, we also record when we collected it, so if the source changes, we can change with it.
- Never overwrite source data → Some of our data needs to be improved, for example when a date is incomplete or a better category can be added. When we make these improvements, we always add to the data rather than overwrite the underlying source data.
- Generate metadata → Our clients want to be able to filter data on attributes that we create, such as the language of a record. We augment our data with useful metadata every time we gather a record.
- Handle duplicates with sensitivity → We see a lot of duplicates, and some records that look like duplicates but aren’t. So we don’t provide a binary ‘on’ or ‘off’ analysis of duplicates. Instead, we look at eight key attributes and score them to provide a good understanding of whether something is a duplicate.
- Matching needs manual checks → Entity matching is incredibly hard to get right. Algorithms can help, but in the end every match that isn’t an exact match needs to be checked by a person to make sure it is correct. We do this because the details matter: if we get a contract award wrong, it can impact investment decisions.
- Be ready to highlight anomalies → We wish that some of the records we gather were better formed, had better information, or just had the data that they were supposed to have. We have to accept that this isn’t always the case. So where things aren’t right, we don’t shy away and we don’t pretend that everything is rosy. We tell our users where the problems are, and let them judge what’s best.
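To make a few of these rules concrete, here is a minimal Python sketch of how provenance, append-only enrichment and scored duplicate detection could fit together. It is an illustration only, not a description of our production system: the `Record` class, its field names, the eight attribute names and the equal weighting of attributes are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Mapping

# Hypothetical attribute names: the rules above say eight attributes are
# compared, but do not say which ones.
DUPLICATE_ATTRIBUTES = (
    "title", "buyer", "supplier", "value",
    "currency", "award_date", "country", "reference",
)


@dataclass(frozen=True)
class Record:
    source_url: str          # where the record came from
    collected_at: datetime   # when it was collected
    raw: Mapping[str, Any]   # source data, never modified after creation
    enrichments: Mapping[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        # Copy both mappings and expose them as read-only views, so neither
        # the source data nor earlier enrichments can be changed in place.
        object.__setattr__(self, "raw", MappingProxyType(dict(self.raw)))
        object.__setattr__(
            self, "enrichments", MappingProxyType(dict(self.enrichments))
        )

    def enrich(self, **metadata: Any) -> "Record":
        """Return a new record with extra metadata; raw data is untouched."""
        merged = {**self.enrichments, **metadata}
        return Record(self.source_url, self.collected_at, self.raw, merged)


def _normalise(value: Any):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return None if value is None else str(value).strip().casefold()


def duplicate_score(a: Record, b: Record) -> float:
    """Fraction of attributes (0.0 to 1.0) present and equal in both records.

    Equal weighting is an assumption; a real scheme might weight a shared
    reference number more heavily than a shared currency.
    """
    matches = 0
    for attr in DUPLICATE_ATTRIBUTES:
        left = _normalise(a.raw.get(attr))
        right = _normalise(b.raw.get(attr))
        if left is not None and left == right:
            matches += 1
    return matches / len(DUPLICATE_ATTRIBUTES)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    first = Record("https://example.org/notice/1", now,
                   {"title": "Road works", "buyer": "City Council"})
    second = Record("https://example.org/notice/2", now,
                    {"title": "road works ", "buyer": "City Council"})
    tagged = first.enrich(language="en")  # adds metadata, leaves raw as-is
    print(tagged.enrichments["language"], duplicate_score(first, second))
```

Returning a score instead of a yes or no answer means the threshold for treating two records as duplicates stays an explicit choice, and borderline pairs can be passed to a person for checking.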
We think that how we approach quality matters. We don’t tell you things you want to hear just to get a sale; we tell you what we know. We want to build partnerships, not future problems. If you’d like to know more about our data or our research services, get in touch.