This page describes and links to all the data dumps of OpenCitations Meta and the OpenCitations Index. The dumps are made available online with the support of Figshare and the Internet Archive.
The OpenCitations Meta database stores and delivers bibliographic metadata for all publications involved in the OpenCitations Index.
Dump created on 2023-10-24. Compared to the previous dump, this one adds the metadata contained in OpenAIRE and in the Crossref dump dated September 2023. This dump includes information on:
105,953,699 bibliographic entities
338,173,282 authors and 2,523,200 editors (counted by their roles, without disambiguating individuals)
691,262 publication venues
36,679 publishers
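To illustrate what "counted by their roles" means, here is a minimal sketch on made-up data. The records and names below are hypothetical and are not rows from the dump; the point is only that each authorship counts once per bibliographic entity, so one person who authored two works adds two to the total.

```python
# Hypothetical toy records: (bibliographic entity, author name).
# Illustrative only, not taken from the OpenCitations Meta dump.
authorships = [
    ("br/1", "Doe, Jane"),
    ("br/2", "Doe, Jane"),
    ("br/2", "Roe, Richard"),
]

# Counting by role: every (entity, author) pair counts once.
role_count = len(authorships)

# Counting individuals would require real disambiguation; as a naive
# stand-in, identical name strings are treated as the same person here.
distinct_names = len({name for _, name in authorships})

print(role_count, distinct_names)  # prints "3 2"
```

Under this reading, the author and editor figures above are upper bounds on the number of distinct people.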
Type and format | Archive | Size |
---|---|---|
Metadata (CSV) | ZIP | 43 GB (10 GB zipped) on NTFS |
Metadata and provenance (RDF) | ZIP | 38.1 GB zipped on NTFS. The size does not change once extracted because the archive contains zipped JSON files |
In addition, a CSV dump is available that maps all the bibliographic resources identified by their own PIDs (e.g., DOI, PMID, ISSN) to their corresponding OMID (e.g., br/12345):
Type and format | Archive | Size |
---|---|---|
BR OMID map (CSV) | ZIP | 4.6 GB (0.8 GB zipped) |
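Because the CSV dumps are large, it can help to read them directly from the ZIP archive one row at a time instead of extracting everything first. The following is a minimal sketch, not an official loader. It assumes the archive holds plain `.csv` members, and the column names `omid` and `id` (with space-separated PIDs in `id`) are assumptions for illustration that should be checked against the header of the downloaded file.

```python
import csv
import io
import zipfile
from typing import Dict, Iterable, Iterator


def iter_csv_rows(zip_path) -> Iterator[Dict[str, str]]:
    """Yield rows as dicts from every .csv member of a ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with archive.open(name) as raw:
                # Decode the binary member stream as UTF-8 text for csv.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)


def build_pid_to_omid(
    rows: Iterable[Dict[str, str]],
    omid_col: str = "omid",  # ASSUMPTION: hypothetical column name
    pid_col: str = "id",     # ASSUMPTION: hypothetical column name
) -> Dict[str, str]:
    """Map each external PID (e.g. a DOI) to its OMID.

    ASSUMPTION: several PIDs for one resource are separated by spaces.
    """
    lookup: Dict[str, str] = {}
    for row in rows:
        for pid in row[pid_col].split():
            lookup[pid] = row[omid_col]
    return lookup
```

A call such as `build_pid_to_omid(iter_csv_rows("omid_map.zip"))` (the file name is a placeholder) would hold the whole map in memory, which may be impractical for a multi-gigabyte file; filtering rows inside the loop or loading them into a database is likely a better fit at full scale.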
Dump created on 2023-06-28. Compared to the previous dump, this one adds the metadata contained in the dump of NIH Open Citation Collection dated November 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2023-02-24. Compared to the previous dump, this one adds the metadata contained in the last dump of DataCite dated 22 October 2021. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2022-12-20, based on open references to works with DOIs within the Crossref dump dated December 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
The OpenCitations Index stores citations as OMID-to-OMID links, covering all the references gathered from several sources.
Dump created on 2023-10-25. This dump includes information on:
84,687,532 bibliographic resources
1,836,126,096 citations
Type and format | Archive | Size |
---|---|---|
Citation data (CSV) | ZIP | 157.3 GB (24.7 GB zipped) |
Citation data (N-Triple) | ZIP | 1.3 TB (57.6 GB zipped) |
Citation data (Scholix) | ZIP | 1.6 TB (34.1 GB zipped) |
Provenance data (CSV) | ZIP | 286 GB (12.8 GB zipped) |
Provenance data (N-Triple) | ZIP | 2.3 TB (73 GB zipped) |
In addition, an N-Triple dump is available with information on the source collection (e.g., COCI, DOCI, POCI) of all the citation data:
Type and format | Archive | Size |
---|---|---|
Citation data sources' info (N-Triple) | ZIP | 287 GB (16 GB zipped) |
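Several of the Index archives expand to hundreds of gigabytes or more, so checking free disk space before extraction can save a failed run. The sketch below uses the sizes listed in the tables above. It assumes 1 GB = 10^9 bytes (the page does not say whether sizes are decimal or binary) and that the archive is kept next to the extracted files unless stated otherwise; the function and variable names are illustrative.

```python
import shutil

# Sizes in GB copied from the tables above: (extracted, zipped).
INDEX_DUMP_SIZES_GB = {
    "Citation data (CSV)": (157.3, 24.7),
    "Citation data (N-Triple)": (1300.0, 57.6),   # 1.3 TB
    "Citation data (Scholix)": (1600.0, 34.1),    # 1.6 TB
    "Provenance data (CSV)": (286.0, 12.8),
    "Provenance data (N-Triple)": (2300.0, 73.0), # 2.3 TB
    "Citation data sources' info (N-Triple)": (287.0, 16.0),
}

BYTES_PER_GB = 10**9  # ASSUMPTION: decimal gigabytes


def has_room(extracted_gb, zipped_gb, free_bytes, keep_archive=True):
    """Return True if free_bytes covers the extracted data,
    plus the ZIP itself when the archive is kept on the same disk."""
    needed_gb = extracted_gb + (zipped_gb if keep_archive else 0)
    return free_bytes >= needed_gb * BYTES_PER_GB


# Check every dump against the free space of the current directory's disk.
free = shutil.disk_usage(".").free
for label, (extracted, zipped) in INDEX_DUMP_SIZES_GB.items():
    verdict = "fits" if has_room(extracted, zipped, free) else "does not fit"
    print(f"{label}: {verdict}")
```

The check is deliberately conservative: it ignores filesystem overhead, so leaving some headroom beyond the reported figures is advisable.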