We’re working hard on a new set of machine learning models for categorisation. It has proven to be really challenging.
Developing an accurate categorisation for all our data is our ultimate goal. At the moment, some records have classifications, some do not. Some of the records with classifications aren’t correct.
We want our users to be able to choose a category and know that they’ll find all the records in that category, regardless of the source, regardless of a publisher’s mistakes. It would make their lives so much easier.
The advances in machine learning have created algorithms that are incredibly powerful. The promise is much greater accuracy, the cost is added complexity.
We’ve been working on our models for months, changing one thing at a time, recording the results, finding what improves the results, finding what doesn’t.
We’ve learned that there isn’t a silver bullet. However, with multiple models, each one optimised to solve a specific problem, we’re confident we can achieve the end goal of giving our users exactly what they need.
To find out more about our work to create the best data in the world, get in touch. firstname.lastname@example.org
Image by Philipp Marquetand