OpenCitations provides the following datasets:
OpenCitations Indexes. The Indexes contain information about the citations themselves: rather than being considered simple links, citations are treated as first-class data entities in their own right. This permits each Index to endow each citation with descriptive properties, such as the date on which the citation was created, its timespan (i.e. the interval between the publication date of the cited entity and the publication date of the citing entity), and its type (e.g. whether or not it is a self-citation). The Indexes do not store metadata about the citing and cited bibliographic entities internally. Rather, these entities are identified in the Indexes by their unique identifiers (e.g. DOIs), so that bibliographic information can be retrieved on the fly, upon request, by means of external APIs (see the operation “metadata” at https://w3id.org/oc/index/api/v1 for additional information). The Indexes currently available are:
OpenCitations Corpus (OCC). The OCC is a database of open, downloadable bibliographic and citation data recorded in RDF and released under a Creative Commons CC0 public domain waiver. The current content of the OCC has been derived mainly from biomedical articles within the Open Access Subset of PubMed Central, harvested using the Europe PubMed Central REST API.
OpenCitations in Context Corpus (CCC). The CCC is an RDF dataset of bibliographic and citation data mined from the full text of articles, such as in-text references (e.g. (Daquino et al. 2020)), structural elements (sentences, paragraphs, footnotes, captions, tables, and sections), rhetorical elements (e.g. section "Methods"), and the sequential numbers of the structural elements that contain in-text reference pointers (e.g. sentence 5, paragraph 2, section 1 "Introduction"). Like the OCC, the CCC has been derived mainly from biomedical articles within the Open Access Subset of PubMed Central, harvested using the Europe PubMed Central REST API. The CCC is released under a Creative Commons CC0 public domain waiver.
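To illustrate what it means for the Indexes to treat a citation as a first-class data entity rather than a simple link, the following minimal Python sketch models a citation as a record with its own properties. It is an illustration only and does not reproduce the OpenCitations data model or API: the class name, the field names, the day-based timespan, and the example identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    """A citation modelled as a data entity with properties of its own.

    Hypothetical structure for illustration; not the OpenCitations schema.
    """
    citing: str             # identifier of the citing entity (e.g. a DOI)
    cited: str              # identifier of the cited entity (e.g. a DOI)
    citing_published: date  # publication date of the citing entity
    cited_published: date   # publication date of the cited entity
    self_citation: bool = False  # the citation's type, reduced to one flag

    @property
    def timespan_days(self) -> int:
        """Interval between the two publication dates, in days (assumed unit)."""
        return (self.citing_published - self.cited_published).days


# Made-up identifiers and dates, used only to exercise the sketch.
c = Citation(
    citing="10.1234/example.citing",
    cited="10.1234/example.cited",
    citing_published=date(2020, 3, 1),
    cited_published=date(2018, 3, 1),
)
print(c.citing, "cites", c.cited, "with a timespan of", c.timespan_days, "days")
```

Note that the record holds only the identifiers of the two bibliographic entities. Titles, authors, and other metadata stay outside it, mirroring the way the Indexes leave that information to be retrieved through external APIs.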