EOL Content Partners: Contribute Using Archives

Introduction

EOL has started to harvest content prepared according to the GBIF Darwin Core Archive (DwC-A) format.


Create Your Archive

Archives are created with a core file (in our case representing Darwin Core Taxa) and various extensions which add information about the core in a one-to-many relationship. This means any one Taxon can have several Vernacular Names if you choose to use the Vernacular Name extension. EOL currently recognizes archives that have Taxon as their core, along with the Vernacular Names, Media, References and Agents extensions. EOL will index all attributes of Media, References and Agents as specified in their extension definitions, as well as a subset of the Taxon and Vernacular Name fields, described below.
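As a rough illustration of the one-to-many relationship between the core and an extension, the sketch below joins a tiny in-memory Taxon file to a Vernacular Name file. The column names (taxonID, scientificName, vernacularName, language) and the sample rows are assumptions chosen for the example; your archive's meta.xml determines the actual files and fields.

```python
import csv
import io
from collections import defaultdict

# Illustrative data only: one core row per taxon, several extension rows per taxon.
taxa_tsv = "taxonID\tscientificName\nt1\tPanthera leo\nt2\tUrsus arctos\n"
vernacular_tsv = (
    "taxonID\tvernacularName\tlanguage\n"
    "t1\tLion\ten\n"
    "t1\tLeón\tes\n"
    "t2\tBrown bear\ten\n"
)


def read_rows(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Group extension rows by the core identifier they point back to.
names_by_taxon = defaultdict(list)
for row in read_rows(vernacular_tsv):
    names_by_taxon[row["taxonID"]].append(row["vernacularName"])

# Each core row picks up zero or more extension rows.
for taxon in read_rows(taxa_tsv):
    print(taxon["scientificName"], names_by_taxon[taxon["taxonID"]])
```

Here the single core row for t1 is linked to two vernacular names, while t2 is linked to one.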

Archives can be prepared as described in GBIF's Darwin Core Text Guide, but EOL has some notable requirements beyond what is listed in that guide. The guide declares that "The extension itself does not have to have a unique ID", but EOL does require that, when an identifier field is present in the extension, an identifier must be included. This will help EOL keep track of revisions to the text and media provided, and help us preserve any annotations about this content added through the EOL site.
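One way to check the identifier requirement before submitting is sketched below. It is not EOL's own check; the function name, the default column name "identifier" and the sample rows are assumptions made for the example.

```python
import csv
import io


def rows_missing_identifier(tsv_text, id_column="identifier"):
    """Return the file line numbers of rows whose identifier is empty.

    If the extension file has no identifier column at all, nothing is
    reported, since the requirement only applies when the field is present.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if id_column not in (reader.fieldnames or []):
        return []
    # Data rows start on line 2 because line 1 is the header.
    return [
        line_no
        for line_no, row in enumerate(reader, start=2)
        if not (row[id_column] or "").strip()
    ]


# Hypothetical Media extension rows; the second data row lacks an identifier.
media_tsv = "identifier\ttaxonID\ttitle\nm1\tt1\tLion photo\n\tt1\tLion text\n"
print(rows_missing_identifier(media_tsv))
```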

EOL accepts text descriptions of species and captions of multimedia, both of which may contain tab or newline characters. These characters could break a tab-separated file, the format used by some Darwin Core Archives. Please remember to escape newline (\n becomes \\n), tab (\t becomes \\t) and carriage return (\r becomes \\r) characters (though carriage returns are not recommended anyway).
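The escaping described above can be sketched as follows. The function names and sample text are hypothetical, and the sketch handles only the three characters named here; it does not address backslashes already present in the text.

```python
def escape_field(value: str) -> str:
    """Replace literal newline, tab and carriage-return characters with
    two-character backslash sequences so the value stays on one row."""
    return (
        value.replace("\n", "\\n")
        .replace("\t", "\\t")
        .replace("\r", "\\r")
    )


def to_tsv_row(fields):
    """Escape each field, then join the fields with real tab separators."""
    return "\t".join(escape_field(field) for field in fields)


# A description containing a newline and a tab becomes one two-column row.
row = to_tsv_row(["t1", "A large cat.\nFound in Africa.\tSocial."])
print(row)
```

After escaping, the only real tab left in the row is the column separator, so the description cannot spill into another column or row.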

Extension Requirements

Taxon

The following are the Taxon fields that EOL will recognize:

Vernacular Name

The following are the Vernacular Name fields EOL will recognize:

Media

Text descriptions must include a subject that describes the main focus of the description as accurately as possible. We recommend that the subject be a URI from a controlled vocabulary such as the TDWG Species Profile Model, Plinian Core or one of the EOL-defined subjects.


Example

The exact structure of an archive will vary depending on the extensions used and fields populated. For reference we prepared a sample archive using all the extensions EOL recognizes, but remember your archive may look different. A compressed version of the sample archive contains the following files:


Validating

You can test your archive for structural or data problems by checking it with our Archive and Spreadsheet Validator. This tool will perform several tests:

  • ensure the archive is valid by decompressing it and confirming the presence of a meta.xml descriptor file
  • validate the meta.xml file according to the DwC-A meta file XML Schema
  • verify that required fields are included
  • validate certain field values such as data types and licenses
  • perform some data type-specific validation, such as verifying that text objects contain descriptions and images contain accessURIs

It will then return with a summary of the validation status. Any errors or warnings will be displayed, as well as a brief summary of the data included in the archive. Errors should be addressed before the archive is registered with EOL. Resources failing validation will not be harvested.
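Before uploading to the validator, you may want a quick local check of your own. The sketch below covers only the first test in the list (the archive opens and contains meta.xml) plus a well-formedness check of meta.xml. It is not EOL's validator and performs no schema or field validation. It assumes the archive is a zip file with meta.xml at the top level; the function names and sample file contents are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def precheck_archive(zip_bytes):
    """Return a list of problems found; an empty list means none were found."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile:
        return ["not a readable zip file"]
    problems = []
    with archive:
        if "meta.xml" not in archive.namelist():
            return ["meta.xml not found at the top level of the archive"]
        try:
            # Well-formedness only; this does not validate against the schema.
            ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError as err:
            problems.append(f"meta.xml is not well-formed XML: {err}")
    return problems


def make_zip(files):
    """Build an in-memory zip from a {name: content} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buffer.getvalue()


# Placeholder contents, not a real descriptor.
good = make_zip({"meta.xml": "<archive/>", "taxon.tab": "taxonID\n"})
print(precheck_archive(good))
```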


See Also