The metadata model used for the data stored in the OCC is briefly summarised in Figure 1 and described in:
Silvio Peroni, David Shotton (2016). Metadata for the OpenCitations Corpus. figshare. https://dx.doi.org/10.6084/m9.figshare.3443876
The model is explicitly aligned with the SPAR Ontologies and other well-known vocabularies. In particular:
the FRBR-aligned Bibliographic Ontology (FaBiO) is used to describe the metadata of citing/cited resources (conference papers, book chapters, journal articles, etc.), their related container resources (academic proceedings, books, journals, etc.), and the particular formats in which they have been embodied (digital vs. print, starting and ending pages, etc.);
the Publishing Roles Ontology (PRO) is used for describing the roles of agents (author, editor, publisher, etc.) related to bibliographic resources – while the order among roles, e.g. the list of authors of a paper, is handled by extending PRO with an additional property, i.e.
the Bibliographic Reference Ontology (BiRO) and the Citation Counting and Context Characterization Ontology (C4O) are used for describing the textual content of each reference in the reference list of a citing bibliographic resource;
finally, the DataCite Ontology is used for defining all the identifiers (e.g. DOI, PubMed ID, PubMed Central ID, ORCID, ISSN, etc.) for bibliographic resources and for all the agents involved, while the Friend Of A Friend (FOAF) ontology is used for defining additional data (e.g. names) about agents.
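As an illustration of how the vocabularies listed above might be combined, the following minimal sketch builds a handful of RDF statements about one citing resource and prints them as N-Triples. It is not the OCC's actual data or tooling: the http://example.org/ resource IRIs, the helper names expand, lit and to_ntriples, and the sample values are hypothetical, and the ontology terms (e.g. fabio:JournalArticle, pro:isDocumentContextFor, pro:withRole, pro:isHeldBy, datacite:usesIdentifierScheme, c4o:hasContent) are quoted from the public ontologies as recalled and should be checked against the metadata document cited above. The identifier's literal value is deliberately left out.

```python
# Namespace IRIs of the vocabularies named above.
PREFIXES = {
    "fabio": "http://purl.org/spar/fabio/",
    "pro": "http://purl.org/spar/pro/",
    "biro": "http://purl.org/spar/biro/",
    "c4o": "http://purl.org/spar/c4o/",
    "datacite": "http://purl.org/spar/datacite/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/",  # hypothetical base IRI, not an OCC IRI
}


def expand(curie):
    """Expand a prefixed name such as 'fabio:JournalArticle' to a full IRI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


def lit(text):
    """Mark a plain string as an RDF literal (no escaping; sample values only)."""
    return '"%s"' % text


# Hypothetical description of one citing resource (assumed term names).
TRIPLES = [
    # FaBiO: the type of the bibliographic resource.
    ("ex:br/1", "rdf:type", "fabio:JournalArticle"),
    # PRO: an agent holds the author role in the context of this resource.
    ("ex:br/1", "pro:isDocumentContextFor", "ex:ar/1"),
    ("ex:ar/1", "pro:withRole", "pro:author"),
    ("ex:ar/1", "pro:isHeldBy", "ex:ra/1"),
    # FOAF: additional data about the agent.
    ("ex:ra/1", "foaf:familyName", lit("Peroni")),
    # DataCite: the resource has an identifier that uses the DOI scheme.
    ("ex:br/1", "datacite:hasIdentifier", "ex:id/1"),
    ("ex:id/1", "datacite:usesIdentifierScheme", "datacite:doi"),
    # BiRO + C4O: one reference-list entry and its textual content.
    ("ex:be/1", "rdf:type", "biro:BibliographicReference"),
    ("ex:be/1", "c4o:hasContent", lit("Sample reference text")),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else "<%s>" % expand(o)
        lines.append("<%s> <%s> %s ." % (expand(s), expand(p), obj))
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(TRIPLES)))
```

The prefixed names keep the sketch readable; each one is expanded to a full IRI only at serialisation time, which mirrors how the terms from several ontologies sit side by side in a single description.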
All the terms from the aforementioned ontologies are collected within a new ontology called the OpenCitations Ontology (OCO). This is not yet another bibliographic ontology, but rather simply a place where existing complementary ontological entities from several other ontologies are grouped together for the purpose of providing descriptive metadata for the OCC.