REST API for the OpenCitations Corpus
Version: 1.0.0
API URL: https://w3id.org/oc/api/v1
Contact: contact@opencitations.net
License: This document is licensed with a Creative Commons Attribution 4.0 International License, while the REST API itself has been created using RAMOSE, the Restful API Manager Over SPARQL Endpoints created by Silvio Peroni, which is licensed with an ISC license.
This document describes the REST API for accessing the data stored in the OpenCitations Corpus hosted by OpenCitations. All the operations described in this document return either a JSON document (default) or a CSV document, according to the MIME type specified in the Accept header of the request.
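As an illustration of the content negotiation described above, the following minimal sketch prepares (without sending) a GET request with an explicit Accept header, using only the Python standard library. The helper name `build_request` is hypothetical, and `application/json` is assumed to be the MIME type that selects the default JSON output; the text itself only names `text/csv`.

```python
from urllib.request import Request

API_BASE = "https://w3id.org/oc/api/v1"


def build_request(operation_path: str, want_csv: bool = False) -> Request:
    """Prepare, but do not send, a GET request for one API operation."""
    # "application/json" is an assumption; "text/csv" is the type named in the text.
    accept = "text/csv" if want_csv else "application/json"
    return Request(API_BASE + operation_path, headers={"Accept": accept}, method="GET")


req = build_request("/metadata/10.1016/j.websem.2012.08.001", want_csv=True)
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the prepared object to `urllib.request.urlopen` would perform the actual call.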
If you would like to suggest an additional operation to be included in this API, please use the issue tracker of the OpenCitations Corpus API available on GitHub.
Parameters can be used to filter and control the results returned by the API. They are passed as normal HTTP parameters in the URL of the call. They are:
- require=<field_name>: all the rows that have an empty value in the <field_name> specified are removed from the result set. For example, require=given_name removes all the rows that do not have any string specified in the given_name field.
- filter=<field_name>:<operator><value>: only the rows compliant with <value> are kept in the result set. The <operator> is optional. If no <operator> is specified, <value> is interpreted as a regular expression; otherwise, the field is compared with <value> by means of the specified operator. Possible operators are "=", "<", and ">". For instance, filter=title:semantics? returns all the rows that contain the string "semantic" or "semantics" in the field title, while filter=date:>2016-05 returns all the rows that have a date later than May 2016.
- sort=<order>(<field_name>): sorts the rows in the result set in ascending (<order> set to "asc") or descending (<order> set to "desc") order according to the values in <field_name>. For instance, sort=desc(date) sorts all the rows according to the value specified in the field date in descending order.
- format=<format_type>: the final table is returned in the format specified in <format_type>, which can be either "csv" or "json". For example, format=csv returns the final table in CSV format. This parameter has higher priority than the type specified through the "Accept" header of the request. Thus, if the header of a request to the API specifies Accept: text/csv and the URL of that request includes format=json, the final table is returned in JSON.
- json=<operation_type>("<separator>",<field>,<new_field_1>,<new_field_2>,...): when a JSON format is requested, transforms each row of the final JSON table according to the rule specified. If <operation_type> is set to "array", the string value associated with the field name <field> is converted into an array by splitting it on <separator>. For instance, considering the JSON table [ { "names": "Doe, John; Doe, Jane" }, ... ], the execution of array("; ",names) returns [ { "names": [ "Doe, John", "Doe, Jane" ] }, ... ]. If, instead, <operation_type> is set to "dict", the string value associated with the field name <field> is converted into a dictionary by splitting it on <separator> and by associating the new fields <new_field_1>, <new_field_2>, etc., with the resulting parts. For instance, considering the JSON table [ { "name": "Doe, John" }, ... ], the execution of dict(", ",name,fname,gname) returns [ { "name": { "fname": "Doe", "gname": "John" } }, ... ].
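The two json transformations can be pictured with a small client-side sketch. It is not the API's implementation: the function names `json_array` and `json_dict` are hypothetical, and the handling of a mismatch between the number of parts and the number of new field names (here, extras are silently dropped) is an assumption, since the text does not cover that case.

```python
def json_array(rows: list[dict], separator: str, field: str) -> list[dict]:
    """Mimic json=array(...): split the string in `field` into a list."""
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if isinstance(new_row.get(field), str):
            new_row[field] = new_row[field].split(separator)
        out.append(new_row)
    return out


def json_dict(rows: list[dict], separator: str, field: str, new_fields: list[str]) -> list[dict]:
    """Mimic json=dict(...): split the string in `field` and name the parts."""
    out = []
    for row in rows:
        new_row = dict(row)
        if isinstance(new_row.get(field), str):
            parts = new_row[field].split(separator)
            # Assumption: zip() drops surplus parts or surplus field names.
            new_row[field] = dict(zip(new_fields, parts))
        out.append(new_row)
    return out


as_array = json_array([{"names": "Doe, John; Doe, Jane"}], "; ", "names")
as_dict = json_dict([{"name": "Doe, John"}], ", ", "name", ["fname", "gname"])
print(as_array)  # [{'names': ['Doe, John', 'Doe, Jane']}]
print(as_dict)   # [{'name': {'fname': 'Doe', 'gname': 'John'}}]
```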
It is possible to specify one or more operations of the same kind (e.g. require=given_name&require=family_name). In addition, these operations are applied in the order presented above: first all the require operations, then all the filter operations, followed by all the sort operations, and finally the format and the json operations (if applicable). Each operation works on the structure returned by the execution of the previous one.
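The ordering rule can be made concrete with a local sketch that applies require, filter and sort rules to an in-memory table, each step consuming the output of the previous one. This is only an approximation of the behaviour described above, not the server's code: all function names are hypothetical, the sample rows are invented, and comparing values as plain strings is an assumption.

```python
import operator
import re

# Operators named in the text. Values are compared as plain strings here,
# which is an assumption (it orders ISO-style dates correctly).
OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def apply_require(rows, field):
    """Drop rows whose value for `field` is empty or missing."""
    return [row for row in rows if row.get(field)]


def apply_filter(rows, spec):
    """Apply one '<field_name>:<operator><value>' rule."""
    field, _, rest = spec.partition(":")
    if rest[:1] in OPERATORS:
        compare, value = OPERATORS[rest[0]], rest[1:]
        return [row for row in rows if compare(row.get(field, ""), value)]
    pattern = re.compile(rest)  # no operator: the value is a regular expression
    return [row for row in rows if pattern.search(row.get(field, ""))]


def apply_sort(rows, spec):
    """Apply one 'asc(<field_name>)' or 'desc(<field_name>)' rule."""
    match = re.fullmatch(r"(asc|desc)\((.+)\)", spec)
    if match is None:
        raise ValueError(f"unrecognised sort rule: {spec!r}")
    order, field = match.groups()
    return sorted(rows, key=lambda row: row.get(field, ""), reverse=order == "desc")


def apply_parameters(rows, require=(), filters=(), sorts=()):
    """Run every require rule, then every filter rule, then every sort rule."""
    for field in require:
        rows = apply_require(rows, field)
    for spec in filters:
        rows = apply_filter(rows, spec)
    for spec in sorts:
        rows = apply_sort(rows, spec)
    return rows


# Invented sample rows, for illustration only.
rows = [
    {"doi": "10.1/a", "date": "2016-07", "title": "The semantics of citations"},
    {"doi": "", "date": "2017-01", "title": "No DOI here"},
    {"doi": "10.1/b", "date": "2014-02", "title": "An older article"},
    {"doi": "10.1/c", "date": "2018-03", "title": "A semantic web study"},
]
result = apply_parameters(rows, require=["doi"], filters=["date:>2015"], sorts=["desc(date)"])
print([row["doi"] for row in result])  # ['10.1/c', '10.1/a']
```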
Example: <api_operation_url>?require=doi&filter=date:>2015&sort=desc(date)
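A URL of this shape can be assembled programmatically. The sketch below uses `urllib.parse.urlencode` from the Python standard library; the helper `build_query_url` and its argument names are hypothetical. The characters `:`, `(`, `)`, `<` and `>` are deliberately left unescaped so that the result matches the example above; whether the API also accepts the percent-encoded forms is an assumption that has not been checked here.

```python
from urllib.parse import urlencode


def build_query_url(operation_url, require=(), filters=(), sort=None, fmt=None, json_rules=()):
    """Append the optional parameters to an operation URL, in the documented order."""
    params = [("require", field) for field in require]
    params += [("filter", rule) for rule in filters]
    if sort:
        params.append(("sort", sort))
    if fmt:
        params.append(("format", fmt))
    params += [("json", rule) for rule in json_rules]
    if not params:
        return operation_url
    # Keep the rule syntax readable instead of percent-encoding it.
    return operation_url + "?" + urlencode(params, safe=":()<>")


url = build_query_url(
    "https://w3id.org/oc/api/v1/metadata/10.1016/j.websem.2012.08.001",
    require=["doi"],
    filters=["date:>2015"],
    sort="desc(date)",
)
print(url)
```

Repeating a parameter, as in require=given_name&require=family_name, only requires passing several values in the corresponding list.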
The operations that this API implements are:
- /metadata/{dois}: This operation allows one to get the metadata of all the articles specified in input by means of their DOIs.
- /coauthorship/{dois}: This operation allows one to get the co-authorship matrix of all the articles specified in input by means of their DOIs.
/metadata/{dois}
This operation allows one to get the metadata of all the articles specified in input by means of their DOIs.
It is possible to specify one or more DOIs as input of this operation. Multiple DOIs should be separated with a double underscore ("__"), e.g. "10.1108/jd-12-2013-0166__10.1016/j.websem.2012.08.001__...". The fields returned by this operation are:
- occ_id: the OpenCitations Corpus local identifier of the citing bibliographic resource (e.g. "br/2384552");
- author: the semicolon-separated list of authors of the citing bibliographic resource;
- year: the year of publication of the citing bibliographic resource;
- title: the title of the citing bibliographic resource;
- source_title: the title of the venue where the citing bibliographic resource has been published;
- volume: the number of the volume in which the citing bibliographic resource has been published;
- issue: the number of the issue in which the citing bibliographic resource has been published;
- page: the starting and ending pages of the citing bibliographic resource in the context of the venue where it has been published;
- doi: the DOI of the citing bibliographic resource;
- occ_reference: the semicolon-separated OpenCitations Corpus local identifiers of all the bibliographic resources cited by the citing bibliographic resource in consideration;
- doi_reference: the semicolon-separated DOIs of all the cited bibliographic resources that have such an identifier associated;
- citation_count: the number of citations received by the citing bibliographic resource.
Accepted HTTP method(s): get
Parameter(s): dois: type str, regular expression shape \"?10\..+[^_\"]((__|\" \")10\..+[^_])*\"?
Result fields type: occ_id (str), author (str), year (datetime), title (str), source_title (str), volume (str), issue (str), page (str), doi (str), occ_reference (str), doi_reference (str), citation_count (int)
Example: /metadata/10.1108/jd-12-2013-0166__10.1016/j.websem.2012.08.001
Exemplar output (in JSON)
[
    {
        "doi_reference": "",
        "year": "2012",
        "citation_count": "1",
        "page": "33-43",
        "occ_id": "br/2384552",
        "title": "FaBiO and CiTO: Ontologies for describing bibliographic resources and citations",
        "source_title": "Web Semantics: Science, Services and Agents on the World Wide Web",
        "issue": "",
        "author": "Peroni, Silvio; Shotton, David",
        "occ_reference": "",
        "volume": "17",
        "doi": "10.1016/j.websem.2012.08.001"
    },
    {
        "doi_reference": "",
        "year": "2015",
        "citation_count": "1",
        "page": "253-277",
        "occ_id": "br/7295288",
        "title": "Setting our bibliographic references free: towards open citation data",
        "source_title": "Journal of Documentation",
        "issue": "2",
        "author": "Peroni, Silvio; Dutton, Alexander; Gray, Tanya; Shotton, David",
        "occ_reference": "",
        "volume": "71",
        "doi": "10.1108/jd-12-2013-0166"
    }
]
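To tie the pieces of this operation together, the following sketch builds the call URL from a list of DOIs and post-processes a response of the shape shown above. The helper names are hypothetical, and the sample is a trimmed copy of one record from the exemplar output rather than a live response. Splitting on ";" and stripping whitespace is an assumption about the separator used in the semicolon-separated fields.

```python
import json

API_BASE = "https://w3id.org/oc/api/v1"


def metadata_url(dois):
    """Join the DOIs with a double underscore and build the /metadata call URL."""
    return f"{API_BASE}/metadata/" + "__".join(dois)


def split_values(text):
    """Split a semicolon-separated string, dropping empty items."""
    return [part.strip() for part in text.split(";") if part.strip()]


def parse_metadata(json_text):
    """Turn the list-like string fields into lists and the count into an int."""
    records = []
    for row in json.loads(json_text):
        record = dict(row)
        for key in ("author", "occ_reference", "doi_reference"):
            record[key] = split_values(row[key])
        record["citation_count"] = int(row["citation_count"])
        records.append(record)
    return records


# Trimmed copy of one record from the exemplar output.
sample = """[{
    "doi_reference": "", "occ_reference": "", "citation_count": "1",
    "author": "Peroni, Silvio; Dutton, Alexander; Gray, Tanya; Shotton, David",
    "doi": "10.1108/jd-12-2013-0166"
}]"""

print(metadata_url(["10.1108/jd-12-2013-0166", "10.1016/j.websem.2012.08.001"]))
record = parse_metadata(sample)[0]
print(record["author"], record["citation_count"])
```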
/coauthorship/{dois}
This operation allows one to get the co-authorship matrix of all the articles specified in input by means of their DOIs.
It is possible to specify one or more DOIs as input of this operation. Multiple DOIs should be separated with a double underscore ("__"), e.g. "10.1108/jd-12-2013-0166__10.1016/j.websem.2012.08.001__...". The fields returned by this operation are:
- author1: an author of the articles specified as input by means of their DOIs;
- author2: another author of the articles specified as input by means of their DOIs;
- coauthorship_count: the number of articles (among the specified ones) that the aforementioned authors have written together.
Accepted HTTP method(s): get
Parameter(s): dois: type str, regular expression shape \"?10\..+[^_\"]((__|\" \")10\..+[^_])*\"?
Result fields type: author1 (str), author2 (str), coauthorship_count (int)
Example: /coauthorship/10.1108/jd-12-2013-0166__10.1016/j.websem.2012.08.001
Exemplar output (in JSON)
[
    {
        "author2": "Gray, Tanya",
        "author1": "Dutton, Alexander",
        "coauthorship_count": "1"
    },
    {
        "author2": "Peroni, Silvio",
        "author1": "Dutton, Alexander",
        "coauthorship_count": "1"
    },
    {
        "author2": "Shotton, David",
        "author1": "Dutton, Alexander",
        "coauthorship_count": "1"
    },
    {
        "author2": "Peroni, Silvio",
        "author1": "Gray, Tanya",
        "coauthorship_count": "1"
    },
    {
        "author2": "Shotton, David",
        "author1": "Gray, Tanya",
        "coauthorship_count": "1"
    },
    {
        "author2": "Shotton, David",
        "author1": "Peroni, Silvio",
        "coauthorship_count": "2"
    }
]
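Rows of this kind can be folded into a nested-dictionary matrix on the client side. The sketch below is one possible way to do so; the function name `coauthorship_matrix` is hypothetical, and it assumes that each pair of authors appears only once in the output (as in the exemplar), so the count is mirrored to make the matrix symmetric.

```python
from collections import defaultdict


def coauthorship_matrix(rows):
    """Build {author: {coauthor: count}} from author1/author2/coauthorship_count rows."""
    matrix = defaultdict(dict)
    for row in rows:
        first, second = row["author1"], row["author2"]
        count = int(row["coauthorship_count"])  # counts arrive as strings in the exemplar
        # Assumption: each pair is listed once, so store it in both directions.
        matrix[first][second] = count
        matrix[second][first] = count
    return {author: dict(partners) for author, partners in matrix.items()}


# Two rows copied from the exemplar output.
rows = [
    {"author2": "Peroni, Silvio", "author1": "Gray, Tanya", "coauthorship_count": "1"},
    {"author2": "Shotton, David", "author1": "Peroni, Silvio", "coauthorship_count": "2"},
]
matrix = coauthorship_matrix(rows)
print(matrix["Peroni, Silvio"])  # {'Gray, Tanya': 1, 'Shotton, David': 2}
```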