OpenCitations

Download

This page describes and links to all the data dumps of the OpenCitations Corpus (OCC). Dumps are created monthly and made available online with the support of Figshare.

Each dump is composed of several zip archives, each containing either the data or the provenance information for a particular sub-dataset within the OCC.

After unzipping an archive, use Disk ARchive (DAR), a multi-platform archive tool for managing huge amounts of data, to recreate the whole OCC structure.
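The two steps above (unzip, then restore with DAR) can be sketched in Python as follows. This is a minimal illustration and not an official OpenCitations script. It assumes that the `dar` executable is installed and on the PATH, and that each zip archive contains DAR slices named in DAR's usual `<basename>.<N>.dar` form. The function names and any file or directory names are hypothetical.

```python
import re
import shutil
import subprocess
import zipfile
from pathlib import Path

# DAR stores an archive as one or more "slices" named <basename>.<N>.dar.
SLICE_PATTERN = re.compile(r"^(?P<base>.+)\.\d+\.dar$")


def unzip_archive(zip_path, work_dir):
    """Unzip one downloaded archive into work_dir and return that directory."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(work_dir)
    return work_dir


def find_dar_basenames(work_dir):
    """Return the sorted, de-duplicated DAR basenames found under work_dir."""
    basenames = set()
    for path in Path(work_dir).rglob("*.dar"):
        match = SLICE_PATTERN.match(path.name)
        if match:
            # dar expects the basename, without the ".<N>.dar" suffix.
            basenames.add(str(path.parent / match.group("base")))
    return sorted(basenames)


def dar_extract_command(basename, target_dir):
    """Build the dar command line: -x extracts, -R sets the restore root."""
    return ["dar", "-x", str(basename), "-R", str(target_dir)]


def restore(zip_path, work_dir, target_dir):
    """Unzip one archive, then run dar for every DAR archive found inside."""
    if shutil.which("dar") is None:
        raise RuntimeError("dar executable not found on PATH")
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    unzip_archive(zip_path, work_dir)
    for basename in find_dar_basenames(work_dir):
        subprocess.run(dar_extract_command(basename, target_dir), check=True)
```

Calling `restore()` once per downloaded zip archive, with the same target directory each time, would place every sub-dataset under one common root. Whether that reproduces the exact OCC layout depends on how the paths were stored in the DAR archives, so check the result against a small archive first.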

Most recent OCC data dump - July 2017 OCC Dump

25 July 2017 Dump

Dump created on 2017-07-25. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance

June 2017 OCC Dump

Dump created on 2017-06-25. This dump includes information on:

Type                          Archive
agent roles (ar)              (data not available for technical reasons), provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance

May 2017 OCC Dump

Dump created on 2017-05-25. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance

April 2017 OCC Dump

Dump created on 2017-04-26. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance, data+provenance (single N-Quads file)

March 2017 OCC Dump

Dump not submitted for technical reasons.

February 2017 OCC Dump

Dump not submitted for technical reasons.

January 2017 OCC Dump

Dump not submitted for technical reasons.

December 2016 OCC Dump

Dump created on 2016-12-24. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance

November 2016 OCC Dump

Dump not submitted for technical reasons.

October 2016 OCC Dump

Dump created on 2016-10-24. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance

September 2016 OCC Dump

Dump created on 2016-09-24. This dump includes information on:

Type                          Archive
agent roles (ar)              data, provenance
bibliographic entries (be)    data, provenance
bibliographic resources (br)  data, provenance
identifiers (id)              data, provenance
responsible agents (ra)       data, provenance
resource embodiment (re)      data, provenance
corpus                        triplestore, provenance