New Open Data Citation Corpus Released

The lack of a centralized resource for citations to datasets has hindered the evaluation of how open data is used. The Data Citation Corpus addresses this challenge by providing a comprehensive, centralized resource that compiles data citations from a variety of sources and makes them accessible. A major milestone in the Make Data Count initiative, this first release makes eight million data citations openly available and usable for the first time via an interactive dashboard and a public data file.

The corpus dashboard allows users to visualize the current content of the corpus or narrow the results according to specific filters, such as the affiliation associated with the dataset or the repository where the dataset is hosted.

The next stages of the Data Citation Corpus will involve ingesting data citation metadata from additional sources, enhancing the dashboard and corpus visualizations, and filling existing gaps in subject information for data citations.

More information about the corpus is available via a webinar on Feb 22, 2024, at 11:00 AM ET, and here:

Data Citation Corpus
The Launch of the First Release