OpenCitations Indexes

A citation index is a bibliographic index recording citations between publications, allowing the user to establish which later documents cite earlier documents. Several citation indexes are already available, some of which are freely accessible but not downloadable (e.g. Google Scholar), while others can be accessed only by paying significant access fees (e.g. Web of Science and Scopus).

OpenCitations, as a scholarly infrastructure organization dedicated to providing free, accessible and downloadable bibliographic metadata and citation links, is building several OpenCitations Indexes using the data available in particular bibliographic databases.

The first of these is COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations, a new RDF dataset of around 450 million citations which is available at

Other similar indexes will be published in the coming months.

These OpenCitations Indexes have the following characteristics in common:

  1. Citations are treated as first-class data entities, with accompanying properties – for a full explanation, see our introductory blog post and following posts.

  2. Each citation is identified by an Open Citation Identifier (OCI), which has a simple structure: the lower-case letters "oci" followed by a colon, followed by two numbers separated by a dash (e.g. oci:1-18). OCIs can be resolved using the OpenCitations OCI Resolution Service.

  3. The citation metadata within each OpenCitations Index are recorded in RDF.

  4. The RDF statements in each of the indexes are organised according to the data model shown in Figure 1.

    Figure 1. The Graffoo diagram of the data model adopted by all the OpenCitations Indexes.

    This model is based on the Citation Typing Ontology (CiTO) for describing the data, and the Provenance Ontology (PROV-O) for the provenance information.

  5. All the data in each OpenCitations Index are available for download in the following ways:

    • by querying a SPARQL endpoint;

    • by using a REST API, implemented by means of RAMOSE (the Restful API Manager Over SPARQL Endpoints), which returns the data in JSON and CSV formats;

    • as dumps of the full index on Figshare in CSV and N-Triples formats;

    • by dereferencing the HTTP URI of an individual citation, which returns that citation in different formats (HTML, RDF/XML, Turtle, and JSON-LD) via content negotiation.

  6. Additionally, the content of each OpenCitations Index can be searched and browsed using OSCAR and LUCINDA, the search and browse interfaces developed by OpenCitations.
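
As a rough illustration of points 2 and 5 above, the following Python sketch checks a string against the OCI structure described in point 2 and prepares (without sending) an HTTP request that asks for one of the formats listed in point 5 via content negotiation. The names parse_oci, build_citation_request and ACCEPT_TYPES are hypothetical helpers introduced here, not part of any OpenCitations software; the media types are the standard ones for the four formats, and the citation URI must be supplied by the caller, since none is assumed here. The two numbers are kept as strings, which is an assumption made so that any leading zeros survive.

```python
import re
from typing import NamedTuple, Optional
from urllib.request import Request

# Structure from point 2: "oci", a colon, then two numbers separated by a dash.
OCI_PATTERN = re.compile(r"oci:([0-9]+)-([0-9]+)")

# Standard media types for the formats named in point 5 (keys are hypothetical labels).
ACCEPT_TYPES = {
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
}


class OCI(NamedTuple):
    """The two numeric parts of an OCI, kept as strings (an assumption)."""
    first: str
    second: str


def parse_oci(value: str) -> Optional[OCI]:
    """Return the two numbers of a well-formed OCI, or None if malformed."""
    match = OCI_PATTERN.fullmatch(value)
    if match is None:
        return None
    return OCI(match.group(1), match.group(2))


def build_citation_request(citation_uri: str, fmt: str) -> Request:
    """Prepare a request whose Accept header asks for the given format.

    Nothing is sent over the network; the caller decides whether to open it.
    """
    return Request(citation_uri, headers={"Accept": ACCEPT_TYPES[fmt]})
```

With these definitions, parse_oci("oci:1-18") yields the pair ("1", "18"), while an upper-case prefix or a missing number yields None. A request built with build_citation_request can be passed to urllib.request.urlopen once the actual HTTP URI of a citation is known.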