This page provides details of, and links to, all the data dumps of OpenCitations Meta and the OpenCitations Index. They are made available online with the support of Figshare and the Internet Archive.
The OpenCitations Meta database stores and delivers bibliographic metadata for all publications involved in the OpenCitations Index.
Dump created on 2023-11-29. Compared to the previous dump, this one adds the metadata contained in the Japan Link Center (JaLC). This dump includes information on:
105,953,699 bibliographic entities
338,173,282 authors and 2,523,200 editors (counted by their roles, without disambiguating individuals)
691,262 publication venues
36,679 publishers
Type and format | Archive | Size |
---|---|---|
Metadata (CSV) | ZIP | 43 GB (10 GB zipped) on NTFS |
[TBA] Metadata and provenance (RDF) | ZIP | [TBA] |
In addition, a CSV dump is available that maps every bibliographic resource identified by an OMID (e.g., br/12345) to its corresponding PID(s) (e.g., DOI, PMID):
Type and format | Archive | Size |
---|---|---|
BR OMID map (CSV) | ZIP | 4.4 GB (1.7 GB zipped) |
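As an illustration of how an OMID-to-PID mapping of this kind might be consumed, here is a minimal Python sketch. It assumes, without confirmation from this page, that the CSV has two columns named `omid` and `id`, and that `id` holds space-separated identifiers prefixed by their scheme (such as `doi:` or `pmid:`). The sample rows and the helper `load_omid_map` are made up for the example; check the header of the actual file before relying on these names.

```python
import csv
import io

# Hypothetical sample rows. The column names "omid" and "id" and the
# "scheme:value" layout are assumptions, not taken from the dump itself.
SAMPLE = """omid,id
br/12345,doi:10.1000/example pmid:1234567
br/67890,doi:10.1000/other
"""

def load_omid_map(lines):
    """Return {omid: {scheme: [values]}} from an iterable of CSV lines."""
    mapping = {}
    for row in csv.DictReader(lines):
        pids = {}
        for token in row["id"].split():
            # Split only on the first colon, since values may contain colons.
            scheme, _, value = token.partition(":")
            pids.setdefault(scheme, []).append(value)
        mapping[row["omid"]] = pids
    return mapping

omid_map = load_omid_map(io.StringIO(SAMPLE))
print(omid_map["br/12345"])
# {'doi': ['10.1000/example'], 'pmid': ['1234567']}
```

For the full 4.4 GB file, an in-memory dictionary like this may be too large; writing the rows to an on-disk key-value store or a database table would be a more practical variant of the same idea.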
Dump created on 2023-10-24. Compared to the previous dump, this one adds the metadata contained in OpenAIRE and in the Crossref dump dated September 2023. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2023-06-28. Compared to the previous dump, this one adds the metadata contained in the dump of NIH Open Citation Collection dated November 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2023-02-24. Compared to the previous dump, this one adds the metadata contained in the last dump of DataCite dated 22 October 2021. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2022-12-20, based on open references to works with DOIs within the Crossref dump dated December 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
The OpenCitations Index stores OMID-to-OMID citations representing all the references gathered from several sources.
Dump created on 2023-11-29. Compared to the previous dump, this one adds the citation data contained in the Japan Link Center (JaLC) and in the Crossref dump dated September 2023. This dump includes information on:
89,920,081 bibliographic resources
1,975,552,846 citations
Type and format | Archive | Size |
---|---|---|
Citation data (CSV) | ZIP | 171 GB (26.8 GB zipped) |
Citation data (N-Triple) | ZIP | 1.4 TB (62.3 GB zipped) |
Citation data (Scholix) | ZIP | 1.7 TB (37 GB zipped) |
Provenance data (CSV) | ZIP | 312 GB (14 GB zipped) |
Provenance data (N-Triple) | ZIP | 2.5 TB (79 GB zipped) |
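Given the uncompressed sizes listed above, it can be convenient to read the CSV files directly from the ZIP archive, one row at a time, instead of extracting everything first. The following Python sketch shows one way to do that with the standard library. The column names `citing` and `cited`, the member file names, and the helper `iter_csv_rows` are assumptions for illustration; a tiny in-memory archive stands in for the real download.

```python
import csv
import io
import zipfile
from collections import Counter

def iter_csv_rows(zip_source):
    """Yield dict rows from every .csv member of a ZIP, one row at a time."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Decode the binary member stream lazily as UTF-8 text.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)

# Build a small in-memory archive standing in for the real dump.
# File names and the "citing"/"cited" columns are assumed, not confirmed.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("part_1.csv", "citing,cited\nbr/1,br/2\nbr/3,br/2\n")
    archive.writestr("part_2.csv", "citing,cited\nbr/1,br/3\n")
buffer.seek(0)

# Count incoming citations per cited resource while streaming.
incoming = Counter(row["cited"] for row in iter_csv_rows(buffer))
print(incoming.most_common(1))
# [('br/2', 2)]
```

With a real archive, `zip_source` would be the path to the downloaded ZIP file. A `Counter` over the whole Index would still be large, so this pattern is best suited to filtering or sampling passes.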
In addition:
Type and format | Archive | Size |
---|---|---|
Citation data sources' info (N-Triple): information regarding the data source collection (e.g., COCI, DOCI, POCI) of all the citation data | ZIP | 351 GB (19 GB zipped) |
Citation count data (CSV): the number of incoming citations to each bibliographic entity (identified by an OMID) in OpenCitations Index | ZIP | 1.7 GB (0.4 GB zipped) |
Reference count data (CSV): the number of references of each bibliographic entity (identified by an OMID) in OpenCitations Index | ZIP | 1.7 GB (0.35 GB zipped) |
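The citation count file lends itself to simple ranking queries. As a rough sketch, the Python snippet below picks the most-cited entities in a single pass, keeping only the current top `n` in memory. It assumes two columns named `omid` and `citations`, which this page does not confirm; the sample data and the helper `top_cited` are hypothetical.

```python
import csv
import heapq
import io

# Hypothetical sample. The column names "omid" and "citations" are assumptions.
SAMPLE = """omid,citations
br/1,12
br/2,250
br/3,7
br/4,98
"""

def top_cited(lines, n):
    """Return the n (count, omid) pairs with the highest citation counts."""
    rows = csv.DictReader(lines)
    # heapq.nlargest consumes the generator lazily and keeps only n items.
    return heapq.nlargest(n, ((int(r["citations"]), r["omid"]) for r in rows))

print(top_cited(io.StringIO(SAMPLE), 2))
# [(250, 'br/2'), (98, 'br/4')]
```

The same function could be pointed at the reference count file if it follows a similar two-column layout, with the count column renamed accordingly.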
Dump created on 2023-10-25. Dump available in CSV (citation data), N-Triple (citation data), Scholix (citation data), CSV (provenance data), and N-Triple (provenance data) formats. In addition, an N-Triple dump containing information regarding the data source collection is available.