This page describes and links to all the data dumps of OpenCitations Meta and the OpenCitations Index. The dumps are made available online with the support of Figshare and the Internet Archive.
The OpenCitations Meta database stores and delivers bibliographic metadata for all publications involved in the OpenCitations Index.
The latest dump of this dataset, released on 2024-06-20, extends the previous version with new data from the Crossref March 2024 dump. This dump includes information on:
116,705,111 bibliographic entities
348,815,548 authors and 2,561,336 editors (counted by their roles, without disambiguating individuals)
715,711 publication venues
245,783 publishers
Type and format | Archive | Size |
---|---|---|
Metadata (CSV) | tar.gz | 11 GB compressed (47 GB uncompressed) on ext4 |
Metadata and provenance (RDF) | tar.gz | 44 GB compressed (145 GB uncompressed) on ext4 |
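Because these archives are large, it can be convenient to read the CSV files straight from the tar.gz without unpacking it to disk. The following is a minimal Python sketch under stated assumptions: the archive is assumed to contain plain `.csv` members with a header row, and the function name `iter_meta_rows` is hypothetical, not part of any OpenCitations tooling.

```python
import csv
import io
import tarfile


def iter_meta_rows(archive, limit=None):
    """Yield CSV rows (as dicts) from every .csv member of a tar.gz archive.

    `archive` is a binary file object. Members are read in place, so
    nothing is extracted to disk. The member layout (plain .csv files
    with a header row) is an assumption, not confirmed by this page.
    """
    yielded = 0
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        for member in tar:
            if not (member.isfile() and member.name.endswith(".csv")):
                continue
            raw = tar.extractfile(member)
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                yield row
                yielded += 1
                if limit is not None and yielded >= limit:
                    return
```

A typical call would open the downloaded file in binary mode and pass it in, for example `iter_meta_rows(open("meta.tar.gz", "rb"), limit=5)`, where `meta.tar.gz` is a placeholder file name.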
In addition, a CSV dump provides a mapping between all the bibliographic resources identified by an OMID (e.g., br/12345) and their corresponding PID(s) (e.g., DOI, PMID):
Type and format | Archive | Size |
---|---|---|
BR OMID map (CSV) | ZIP | 4.4 GB (1.7 GB zipped) |
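To illustrate how such a mapping might be used, here is a small Python sketch that turns map rows into a lookup table from OMID to PIDs. The column names `omid` and `id`, and the convention that `id` holds space-separated `scheme:value` tokens, are assumptions made for this example; check the header of the actual file before relying on them. The function name `parse_omid_map` and the sample identifiers are hypothetical.

```python
import csv
import io


def parse_omid_map(lines):
    """Build {omid: {scheme: [values]}} from rows of an OMID-to-PID map.

    Assumed layout (not confirmed by this page): a header row `omid,id`,
    where `id` holds space-separated `scheme:value` tokens.
    """
    mapping = {}
    for row in csv.DictReader(lines):
        pids = {}
        for token in row["id"].split():
            # Split on the first colon only, since values may contain colons.
            scheme, _, value = token.partition(":")
            pids.setdefault(scheme, []).append(value)
        mapping[row["omid"]] = pids
    return mapping


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "omid,id\n"
    "br/12345,doi:10.1000/example pmid:123456\n"
    "br/67890,doi:10.1000/other\n"
)
omid_to_pids = parse_omid_map(sample)
print(omid_to_pids["br/12345"])
```

Keeping a list per scheme allows for a resource that carries more than one identifier of the same kind.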
Dump created on 2024-04-06. Compared to the previous dump, this one incorporates OpenAlex IDs, leveraging data from the OpenAlex dump. Dump available in CSV (metadata) and RDF (metadata and provenance) formats.
Dump created on 2023-11-29. Compared to the previous dump, this one adds the metadata contained in the Japan Link Center (JaLC). Dump available in CSV (metadata) format.
Dump created on 2023-10-24. Compared to the previous dump, this one adds the metadata contained in OpenAIRE and in the Crossref dump dated September 2023. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2023-06-28. Compared to the previous dump, this one adds the metadata contained in the dump of NIH Open Citation Collection dated November 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2023-02-24. Compared to the previous dump, this one adds the metadata contained in the last dump of DataCite dated 22 October 2021. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
Dump created on 2022-12-20, based on open references to works with DOIs within the Crossref dump dated December 2022. Dump available in CSV (metadata) and JSON-LD (metadata and provenance) formats.
The OpenCitations Index stores citations as OMID-to-OMID links, covering all the references gathered from several sources.
Dump created on 2024-07-01. Compared to the previous dump, this one adds the citation data contained in the Crossref dump dated March 2024. This dump includes information on:
2,012,939,079 citations
Type and format | Archive | Size |
---|---|---|
Citation data (CSV) | ZIP | 179 GB (28.1 GB zipped) |
Citation data (N-Triple) | ZIP | 1.5 TB (65.6 GB zipped) |
Citation data (Scholix) | ZIP | 1.8 TB (39 GB zipped) |
Provenance data (CSV) | ZIP | 329 GB (15 GB zipped) |
Provenance data (N-Triple) | ZIP | 2.6 TB (83 GB zipped) |
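Given the uncompressed sizes listed above, reading the citation CSV directly from the ZIP archive, one row at a time, may be preferable to extracting it. The sketch below shows one way to do this in Python. It assumes that the archive contains `.csv` files at the top level with `citing` and `cited` columns; both the layout and the column names are assumptions for this example, and `iter_citations` is a hypothetical helper name.

```python
import csv
import io
import zipfile


def iter_citations(zip_source):
    """Yield (citing, cited) pairs from every CSV file in a ZIP archive.

    `zip_source` is a path or a binary file object. The `citing` and
    `cited` column names are assumed, not confirmed by this page.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".csv"):
                continue
            # Decode the member lazily instead of loading it into memory.
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    yield row["citing"], row["cited"]
```

Since the function is a generator, memory use stays roughly constant regardless of the archive size.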
In addition:
Type and format | Archive | Size |
---|---|---|
Citation data sources' info (N-Triple): information regarding the data source collection (e.g., COCI, DOCI, POCI) of all the citation data | ZIP | 364 GB (20.4 GB zipped) |
Citation count data (CSV): the number of incoming citations to each bibliographic entity (identified by an OMID) in OpenCitations Index | ZIP | 1.1 GB (0.4 GB zipped) |
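As a rough illustration of what the citation count data represents, the following Python sketch derives incoming citation counts from a sequence of (citing, cited) pairs. It is not the procedure used to produce the dump; the helper name `count_incoming` and the sample OMIDs are made up for this example.

```python
from collections import Counter


def count_incoming(citation_pairs):
    """Return a Counter mapping each cited entity to its incoming citations."""
    counts = Counter()
    for _citing, cited in citation_pairs:
        counts[cited] += 1
    return counts


# Made-up OMID pairs: (citing, cited).
pairs = [
    ("br/1", "br/3"),
    ("br/2", "br/3"),
    ("br/3", "br/4"),
]
incoming = count_incoming(pairs)
print(incoming.most_common())
```

Entities that are never cited do not appear in the result; looking them up in the `Counter` returns 0.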
Dump created on 2023-11-29. Dump available in CSV (citation data), N-Triple (citation data), SCHOLIX (citation data), CSV (provenance data), and N-Triple (provenance data). In addition, an N-Triple dump contains information regarding the data source collection, and a citation count dump gives the number of incoming citations to each bibliographic entity (identified by an OMID).
Dump created on 2023-10-25. Dump available in CSV (citation data), N-Triple (citation data), SCHOLIX (citation data), CSV (provenance data), and N-Triple (provenance data). In addition, an N-Triple dump contains information regarding the data source collection.