This page is a legacy page (no longer linked from the official website) that describes COCI. Since October 2023, all the citation data previously collected in the different OpenCitations Indexes have been moved into, and deduplicated in, the new citation collection, i.e. the OpenCitations Index.
COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations, is an RDF dataset containing details of all the citations that are specified by the open references to DOI-identified works present in Crossref, as of the latest COCI update. COCI does not index Crossref references that are not open, nor Crossref open references to entities that lack DOIs. The citations available in COCI are treated as first-class data entities, with accompanying properties including the citation timespan, modelled according to the OpenCitations Data Model. For an in-depth description of COCI, see:
Ivan Heibi, Silvio Peroni, David Shotton (2019). Software review: COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. Scientometrics, 121 (2): 1213-1228. https://doi.org/10.1007/s11192-019-03217-6
Currently, COCI contains:
1,463,920,523 citations;
77,045,952 bibliographic resources.
COCI was first created and released on 4 June 2018.
Most recent update of COCI: 23 January 2023, based on open references to works with DOIs within the Crossref dump dated December 2022.
Each citation (i.e. an individual of the class cito:Citation) is identified by a URL structured as follows: https://w3id.org/oc/index/coci/ci/[[OCI]].
Each Open Citation Identifier [[OCI]] has a simple structure: the lower-case letters "oci" followed by a colon, followed by two numbers separated by a dash, in which the first number identifies the citing work and the second number identifies the cited work. For example, oci:02001010806360107050663080702026306630509-02001010806360107050663080702026305630301 identifies the citation at https://w3id.org/oc/index/coci/ci/02001010806360107050663080702026306630509-02001010806360107050663080702026305630301, where the two numbers appear in the URL without the "oci:" prefix.
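As a minimal sketch of the structure just described, the following Python function splits a COCI citation URL into its citing and cited numbers. The function name and the validation rules are illustrative assumptions, not part of any OpenCitations software.

```python
from typing import Tuple

# Base of the citation URLs, taken from the URL pattern given in the text.
COCI_CITATION_BASE = "https://w3id.org/oc/index/coci/ci/"


def split_citation_url(url: str) -> Tuple[str, str]:
    """Return (citing_number, cited_number) from a COCI citation URL.

    Hypothetical helper: it only checks the shape described in the text,
    i.e. two numbers separated by a dash after the base URL.
    """
    if not url.startswith(COCI_CITATION_BASE):
        raise ValueError("not a COCI citation URL: " + url)
    numeric_pair = url[len(COCI_CITATION_BASE):]
    citing, dash, cited = numeric_pair.partition("-")
    if not dash:
        raise ValueError("missing dash between citing and cited numbers")
    for part in (citing, cited):
        if not (part.isascii() and part.isdigit()):
            raise ValueError("expected a purely numeric string, got: " + part)
    return citing, cited


example_url = (
    "https://w3id.org/oc/index/coci/ci/"
    "02001010806360107050663080702026306630509-"
    "02001010806360107050663080702026305630301"
)
citing_number, cited_number = split_citation_url(example_url)
```

Here `citing_number` holds the first number (the citing work) and `cited_number` the second (the cited work).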
For citations in which the citing and cited works are identified by DOIs, which includes all the COCI citations, the OCI is created in the following manner, as explained more fully here. Each case-insensitive DOI is first normalized to lower-case letters. Then, after omitting the initial doi:10. prefix, the alphanumeric string of the DOI is converted reversibly to a pure numerical string using the simple two-numeral lookup table for numerals, lower-case letters and other characters presented at https://github.com/opencitations/oci/blob/master/lookup.csv. Finally, each converted number is prefixed with 020, which indicates that Crossref is the supplier of the original metadata of the citation (as indicated at http://opencitations.net/oci).
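The conversion steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the OpenCitations implementation: the function names are hypothetical, and `PARTIAL_LOOKUP` is a deliberately incomplete table whose entries (digits, "/" and "-") are inferred from the example OCI given earlier. For real use, the full table should be loaded from lookup.csv. The two sample DOIs are the ones obtained by reading that example OCI back through this partial table.

```python
# Partial, illustrative lookup table (assumption): only the entries that can
# be inferred from the example OCI in the text. Letters and other characters
# are intentionally omitted; load the full lookup.csv for real use.
PARTIAL_LOOKUP = {str(digit): "0" + str(digit) for digit in range(10)}
PARTIAL_LOOKUP.update({"/": "36", "-": "63"})

# Prefix indicating Crossref as the supplier of the citation metadata.
CROSSREF_PREFIX = "020"


def doi_to_oci_number(doi, lookup=PARTIAL_LOOKUP):
    """Convert one DOI into the numeric string used in an OCI."""
    normalized = doi.strip().lower()          # DOIs are case-insensitive
    if normalized.startswith("doi:"):
        normalized = normalized[len("doi:"):]
    if not normalized.startswith("10."):
        raise ValueError("not a DOI: " + doi)
    body = normalized[len("10."):]            # omit the initial "10."
    codes = []
    for char in body:
        if char not in lookup:
            raise ValueError("character %r is not in the lookup table" % char)
        codes.append(lookup[char])            # two numerals per character
    return CROSSREF_PREFIX + "".join(codes)


def build_oci(citing_doi, cited_doi, lookup=PARTIAL_LOOKUP):
    """Join the citing and cited numbers into an OCI."""
    return "oci:%s-%s" % (
        doi_to_oci_number(citing_doi, lookup),
        doi_to_oci_number(cited_doi, lookup),
    )


oci = build_oci("10.1186/1756-8722-6-59", "doi:10.1186/1756-8722-5-31")
# oci == "oci:02001010806360107050663080702026306630509-"
#        "02001010806360107050663080702026305630301"
```

Because every character maps to exactly two numerals, the conversion can be reversed by stripping the 020 prefix and reading the remainder two numerals at a time through the inverted table.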
OCIs can be resolved using the OpenCitations OCI Resolution Service.
All the data in COCI:
can be queried by means of the OpenCitations COCI SPARQL endpoint;
can be retrieved by using the COCI REST API;
can be searched by using the OpenCitations Indexes Search Interface;
are available as dumps on Figshare in CSV, N-Triples and Scholix.
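As a rough sketch of programmatic access through the COCI REST API, the Python snippet below builds a request URL for the citations of a given DOI and, optionally, fetches the JSON response. The base URL and the "citations" operation are assumptions that are not stated on this page, and since this is a legacy page the service may no longer respond at that address. The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the COCI REST API (version 1); not given in the text.
COCI_API_BASE = "https://opencitations.net/index/coci/api/v1"


def citations_url(doi):
    """Build the URL that asks for all citations pointing to the given DOI."""
    # Keep "/" unescaped, since DOIs contain a slash after the registrant code.
    return "%s/citations/%s" % (COCI_API_BASE, quote(doi.strip().lower(), safe="/"))


def fetch_citations(doi, timeout=30):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(citations_url(doi), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request_url = citations_url("10.1186/1756-8722-6-59")
```

Only `citations_url` runs here; `fetch_citations` is left uncalled so that the sketch does not depend on the service being reachable.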