Doing science in the 21st century is a complicated business: more data, from different sources, each with its own characteristics, which take time to acquire and thought to combine. Researchers often spend a lot of time collating disparate pieces of information just to begin to understand the data, even when the data have been documented.
High-quality metadata that lets researchers navigate and understand the relationships between the items in datasets is in increasing demand as complexity meets scale, but there are few scalable mechanisms or incentives to provide it.
But that does not mean we have to start completely from scratch. By building on existing resources, especially in areas such as social science that already have rich metadata, we can use recent developments in machine learning to enhance them and build sustainable, rich, high-quality information.