EOL Content Partners: Contribute Using EOL XML Transfer Schema

As of summer 2012, the extended Darwin Core Archive (DwC-A) format is the preferred method for sharing data with EOL. However, many content partners continue to export their data using the older EOL XML Transfer Schema. We will continue to support this schema, so if you already have data available in this format, there is no need to switch to DwC-A.

The basic structure of the EOL XML Transfer Schema is a series of taxon elements. Each taxon element contains attributes of the taxon as well as one or more dataObject elements providing information about your text descriptions, media files, references, etc. There should be only one taxon element per taxon: if you have multiple data objects for a particular species, they should all be contained within the same taxon element. You must give us the scientific name for each taxon (dwc:ScientificName), and you can also include higher categories, synonyms, and vernacular names. When you provide a hierarchy along with your taxa, we will display this hierarchy in the Names tab of the relevant EOL taxon pages. Hierarchical information is also useful for homonym resolution.

Each data object (text description, media file) must be associated with a particular taxon. You (or your programmer) create this association by placing the dataObject element inside the taxon element of that taxon. If a data object applies to more than one taxon, you can include the same dataObject element in the taxon elements of several taxa.

Note that all data objects need to be in a single data file. Certain elements and attributes can be repeated, but elements must appear exactly in the sequence indicated in the schema. If you put an element in the wrong place, your file will not validate when you upload it to the EOL content partner registry.
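To make the nesting described above concrete, here is a minimal sketch of a complete file with two taxa, each wrapping its own data objects, where one image applies to both taxa and is therefore repeated. Several details are assumptions rather than statements from this page: the response root element, the namespace URIs (recalled from version 0.3 of the schema), the position of commonName after dwc:ScientificName, and the second species name and the common name, which are invented for illustration. Check all of them against the schema before relying on this layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: root element and namespace URIs as recalled from version 0.3
     of the schema. Verify them against the schema you validate with. -->
<response
    xmlns="http://www.eol.org/transfer/content/0.3"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dwc="http://rs.tdwg.org/dwc/dwcore/">

  <!-- Exactly one taxon element per taxon -->
  <taxon>
    <dc:identifier>yourproject:taxon:1001</dc:identifier>
    <dc:source>http://yourproject/pages/1001</dc:source>
    <!-- Optional higher categories (the hierarchy) -->
    <dwc:Kingdom>Animalia</dwc:Kingdom>
    <dwc:Family>Critteridae</dwc:Family>
    <!-- The scientific name is required -->
    <dwc:ScientificName>Critterus longicornis</dwc:ScientificName>
    <!-- Hypothetical vernacular name; element position is an assumption -->
    <commonName xml:lang="en">long-horned critter</commonName>

    <!-- All data objects for this taxon go inside this one taxon element -->
    <dataObject>
      <dc:identifier>yourproject:yourid:123456</dc:identifier>
      <dataType>http://purl.org/dc/dcmitype/Text</dataType>
      <license>http://creativecommons.org/licenses/by-nc/3.0/</license>
      <subject>http://rs.tdwg.org/ontology/voc/SPMInfoItems#TrophicStrategy</subject>
      <dc:description xml:lang="en">Feeds on resin from a variety of trees.</dc:description>
    </dataObject>
    <dataObject>
      <dc:identifier>yourproject:yourid:28910</dc:identifier>
      <dataType>http://purl.org/dc/dcmitype/StillImage</dataType>
      <mimeType>image/jpeg</mimeType>
      <license>http://creativecommons.org/licenses/by-nc/3.0/</license>
      <mediaURL>http://yourproject/images/28910.jpg</mediaURL>
    </dataObject>
  </taxon>

  <!-- Hypothetical second taxon: the same image applies to it as well,
       so the same dataObject element is repeated inside this taxon element -->
  <taxon>
    <dc:identifier>yourproject:taxon:1002</dc:identifier>
    <dwc:Kingdom>Animalia</dwc:Kingdom>
    <dwc:Family>Critteridae</dwc:Family>
    <dwc:ScientificName>Critterus brevicornis</dwc:ScientificName>
    <dataObject>
      <dc:identifier>yourproject:yourid:28910</dc:identifier>
      <dataType>http://purl.org/dc/dcmitype/StillImage</dataType>
      <mimeType>image/jpeg</mimeType>
      <license>http://creativecommons.org/licenses/by-nc/3.0/</license>
      <mediaURL>http://yourproject/images/28910.jpg</mediaURL>
    </dataObject>
  </taxon>
</response>
```

The data objects here are deliberately pared down; the examples that follow show the fuller set of elements for an image object and a text object.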

Example of a well-formed, complete image object in EOL export format

<dataObject>
<dc:identifier>yourproject:yourid:28910</dc:identifier>
<dataType>http://purl.org/dc/dcmitype/StillImage</dataType>
<mimeType>image/jpeg</mimeType>
<agent role="photographer">Jim Beam</agent>
<dcterms:created>2010-05-07 15:12:23</dcterms:created>
<dcterms:modified>2010-07-01 14:14:53</dcterms:modified>
<license>http://creativecommons.org/licenses/by-nc/3.0/</license>
<dcterms:rightsHolder>Jim Beam</dcterms:rightsHolder>
<dc:source>http://yourproject/imagegallery/28910</dc:source>
<dc:description xml:lang="en">&lt;em&gt;Critterus longicornis&lt;/em&gt; male feeding on a rotting mushroom.</dc:description>
<mediaURL>http://yourproject/images/28910.jpg</mediaURL>
<location xml:lang="en">On the beach, Anse Source d'Argent, La Digue, Seychelles.</location>
</dataObject>

Example of a well-formed, complete text object in EOL export format

<dataObject>
<dc:identifier>yourproject:yourid:123456</dc:identifier>
<dataType>http://purl.org/dc/dcmitype/Text</dataType>
<agent role="author">Jane Doe</agent>
<agent role="author">Billy Green</agent>
<dcterms:created>2010-07-01 14:14:53</dcterms:created>
<dcterms:modified>2010-07-01 14:14:53</dcterms:modified>
<dc:title xml:lang="en">Feeding Ecology</dc:title>
<license>http://creativecommons.org/licenses/by-nc/3.0/</license>
<dcterms:rightsHolder>Jane Doe and Billy Green</dcterms:rightsHolder>
<dcterms:bibliographicCitation>Doe, Jane and Billy Green. 2010. Critterus longicornis. Feeding Ecology. Version 15 May 2010. http://yourproject/pages/123456 in Your Project, http://yourproject</dcterms:bibliographicCitation>
<audience>General public</audience>
<audience>Expert users</audience>
<dc:source>http://yourproject/pages/123456</dc:source>
<subject>http://rs.tdwg.org/ontology/voc/SPMInfoItems#TrophicStrategy</subject>
<dc:description xml:lang="en">&lt;em&gt;Critterus longicornis&lt;/em&gt; feeds on resin from a variety of trees.</dc:description>
<reference doi="10.2238/453449c">Humpty, G. and Dumpty F. 1983. Observations of Critterus longicornis in the Seychelles. Proceedings of the National Academy of Purzeldux 234(789):12-24.</reference>
<reference doi="10.3333/j.1077-3123.2001.00213.x">Yumyum, X. 2001. Species visiting tree sap flows in a coastal Mauritius forest. Mauritius Natural History 23(4):46-48.</reference>
</dataObject>

Subject types

Note that there are some EOL subject types that were not initially supported by the EOL XML Transfer Schema.  If you want to provide text objects for any of these subjects, you need to use a workaround in order to get your XML file to validate.

The following subjects require the workaround:

If you want to include any of these subjects in your XML file, you need to put a placeholder value that the schema accepts in the regular subject element (the SPMInfoItems Description subject in the example below) and then give the actual subject in an extra additionalInformation element. Here is an example:

<dataObject> 
<dc:identifier>eolspecies:nid:1:000001</dc:identifier>
<dataType>http://purl.org/dc/dcmitype/Text</dataType> 
<agent role="author">Jane Doe</agent>
<agent role="author">Billy Green</agent>
<dcterms:created>2010-07-01 14:14:53</dcterms:created>
<dcterms:modified>2010-07-01 14:14:53</dcterms:modified>
<license>http://creativecommons.org/licenses/by-nc/3.0/</license>
<dcterms:rightsHolder>Jane Doe and Billy Green</dcterms:rightsHolder>
<dcterms:bibliographicCitation>Doe, Jane and Billy Green. 2010. Critterus longicornis. Fossil History. Version 15 May 2010. http://yourproject/pages/123456 in Your Project, http://yourproject</dcterms:bibliographicCitation>
<audience>General public</audience>
<audience>Expert users</audience>
<dc:source>http://yourproject/pages/123456</dc:source>
<subject>http://rs.tdwg.org/ontology/voc/SPMInfoItems#Description</subject>
<dc:description xml:lang="en">Three specimens of &lt;em&gt;Critterus longicornis&lt;/em&gt; are known from Dominican amber.</dc:description>
<reference doi="10.2558/45889c">Flux, S.L. 1993. Critteridae from Dominican Amber. Paleontologia Critteridologica 34(7):88-101.</reference>
<additionalInformation> <subject>http://www.eol.org/voc/table_of_contents#FossilHistory</subject> </additionalInformation>
</dataObject>

For more information about the EOL XML Transfer Schema see:

Creating your XML file

There are various ways in which you can produce an XML file in the EOL Transfer Schema:

  • The Creating Content Connectors for EOL page describes common techniques in PHP and Ruby on Rails to parse your data into logical components, map each component into an element of the EOL schema, and generate an XML document conforming to the EOL Transfer Schema. For PHP developers we provide the eol_php_code code base which helps ensure that the XML conforms to the expected schema.
  • For other development languages you may be able to borrow from the above examples to create a connector to pull information directly from a database or from a collection of files and format it into an EOL Transfer Schema file. If you create such a system and are willing to share it, please let us know so we can add it to our examples.
  • If you have a data dump in XML format (many database management systems now provide this option), you can write an XSLT stylesheet that transforms this XML document into a document that complies with the EOL Transfer Schema. A basic tutorial in using XSLT (Extensible Stylesheet Language Transformations) is available here. If you create such a transformation and are willing to share it, please let us know so we can add it to our examples.
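
As an illustration of the stylesheet approach in the last item, the following XSLT 1.0 sketch turns a simple database dump into taxon and dataObject elements. The input structure is entirely hypothetical: a records root with one record per taxon, each holding id, name, license and text children. The response root element and the namespace URIs are assumptions recalled from version 0.3 of the schema. Treat it as a starting point under those assumptions, not as a transformation EOL provides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Hypothetical input:
     <records><record><id/><name/><license/><text/></record>...</records>
     Output namespace URIs are assumed from schema version 0.3. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.eol.org/transfer/content/0.3"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dwc="http://rs.tdwg.org/dwc/dwcore/">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- The root of the dump becomes the root of the EOL document -->
  <xsl:template match="/records">
    <response>
      <xsl:apply-templates select="record"/>
    </response>
  </xsl:template>

  <!-- Each record becomes one taxon element with one text data object -->
  <xsl:template match="record">
    <taxon>
      <dc:identifier>
        <xsl:value-of select="concat('yourproject:taxon:', id)"/>
      </dc:identifier>
      <dwc:ScientificName>
        <xsl:value-of select="name"/>
      </dwc:ScientificName>
      <dataObject>
        <dc:identifier>
          <xsl:value-of select="concat('yourproject:text:', id)"/>
        </dc:identifier>
        <dataType>http://purl.org/dc/dcmitype/Text</dataType>
        <license>
          <xsl:value-of select="license"/>
        </license>
        <subject>http://rs.tdwg.org/ontology/voc/SPMInfoItems#Description</subject>
        <dc:description xml:lang="en">
          <xsl:value-of select="text"/>
        </dc:description>
      </dataObject>
    </taxon>
  </xsl:template>

</xsl:stylesheet>
```

This sketch assumes the dump already has exactly one record per taxon. If your dump has one record per data object instead, the stylesheet would first need to group records by taxon so that each taxon still ends up with a single taxon element.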

Validating your XML

Be sure to run your file through the XML File Validator before submitting it to EOL. The validator checks XML documents against their defined XML schema and performs data integrity checks for conditional constraints that cannot be expressed in XSD. For example, it looks for data objects of type image or video that lack the mediaURL element, text objects with no dc:description, and elements that are in the wrong sequence. It prints a list of integrity failures and recommended changes, such as adding dc:identifiers.

For really big XML files you can use other free software. For example, the command line functions of AltovaXML will allow you to validate your EOL XML. You can download AltovaXML here. The command line argument is:

altovaxml.exe /v C:\xml_path\really_big_xml.xml

or

altovaxml.exe /v http://my_domain/xml_path/really_big_xml.xml