EOL export files can be created by any moderately skilled programmer. The required export files must follow particular data standards and they are usually made available to EOL on the content partner's server, where EOL will periodically harvest updates.
All your data objects (text descriptions, images, videos, etc.) need to be defined in an XML document conforming to the EOL Transfer Schema (see EOL Schema Documentation and Data Element Glossary). You will use this schema to create a resource file that describes all the attributes of taxa for which you have content as well as the attributes of your data objects for these taxa.
The basic structure of the schema consists of a series of taxon elements (letting EOL know for which taxa you have information), each of which contains one or more dataObject elements providing information about your text descriptions, media files, references, etc. In the taxon element, you need to give us the scientific name for each taxon (dwc:ScientificName). You can also include the different Linnean categories for each taxon as well as synonyms and vernacular names. For detailed instructions regarding the content of the dataObject element, see Exporting Taxon Descriptions and Exporting Images, Videos, Sounds, Maps.
Note that all data objects need to be in a single data file. Certain elements and attributes can be repeated, but elements must appear in exactly the sequence indicated in the schema. If you put an element in the wrong place, your file will not validate when you upload it to the EOL content partner registry.
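The taxon and dataObject nesting described above can be illustrated with a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a reference implementation: the root element name (response), the EOL and Darwin Core namespace URIs, and the dictionary keys (scientific_name, objects, description) are assumptions made for this example and should be checked against the EOL Schema Documentation. A real file will need further elements, in the order the schema prescribes.

```python
import xml.etree.ElementTree as ET

# The Dublin Core URI is the standard one. The EOL and Darwin Core URIs are
# assumptions for this sketch; confirm them in the schema documentation.
EOL_NS = "http://www.eol.org/transfer/content/0.3"
DC_NS = "http://purl.org/dc/elements/1.1/"
DWC_NS = "http://rs.tdwg.org/dwc/dwcore/"

ET.register_namespace("", EOL_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("dwc", DWC_NS)


def build_resource(taxa):
    """Build one resource document from a list of taxon dictionaries.

    Each dictionary has the hypothetical keys 'scientific_name' and
    'objects' (a list of dictionaries with a 'description' key).
    """
    # Root element name is an assumption; all taxa go into this single file.
    root = ET.Element(f"{{{EOL_NS}}}response")
    for taxon in taxa:
        taxon_el = ET.SubElement(root, f"{{{EOL_NS}}}taxon")
        name_el = ET.SubElement(taxon_el, f"{{{DWC_NS}}}ScientificName")
        name_el.text = taxon["scientific_name"]
        # One dataObject per text description, nested inside its taxon.
        for obj in taxon["objects"]:
            obj_el = ET.SubElement(taxon_el, f"{{{EOL_NS}}}dataObject")
            desc_el = ET.SubElement(obj_el, f"{{{DC_NS}}}description")
            desc_el.text = obj["description"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_resource([
        {
            "scientific_name": "Ursus maritimus",
            "objects": [{"description": "A large bear of the Arctic."}],
        }
    ]))
```

Building the document through an XML library rather than by string concatenation means that special characters in names and descriptions are escaped automatically.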
There are various ways in which you can produce your data file:
- The Creating Content Connectors for EOL page describes common techniques in PHP and Ruby on Rails to parse your data into logical components, map each component into an element of the EOL schema, and generate an XML document conforming to the EOL Transfer Schema. For PHP developers we provide the eol_php_code code base which helps ensure that the XML conforms to the expected schema.
- For other development languages you may be able to borrow from the above examples to create a connector to pull information directly from a database or from a collection of files and format it into an EOL Transfer Schema file. If you create such a system and are willing to share it, please let us know so we can add it to our examples.
- If you have a data dump in XML format (many database management systems now provide this option), you can write an XSLT stylesheet that transforms this XML document into one that complies with the EOL Transfer Schema. A basic tutorial in using XSLT (Extensible Stylesheet Language Transformations) is available here. If you create such a transformation and are willing to share it, please let us know so we can add it to our examples.
- We are working with GBIF to develop an EOL extension for the GBIF Integrated Publishing Toolkit (IPT). The IPT will connect to a local database or data files and will help map them into a standard format. The EOL extension allows you to map your content to the EOL Content Schema. Instructions are provided here. Let us know if you are interested in helping us develop this approach further.
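As an illustration of the connector approach for a language other than PHP or Ruby, the following Python sketch pulls rows from a database and groups them into taxon elements. Everything specific here is hypothetical: the descriptions table, its scientific_name and description columns, the response root element, and the EOL and Darwin Core namespace URIs are assumptions for the example, not part of any EOL code base.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import groupby

# Namespace URIs: Dublin Core is standard; the other two are assumptions.
EOL_NS = "http://www.eol.org/transfer/content/0.3"
DC_NS = "http://purl.org/dc/elements/1.1/"
DWC_NS = "http://rs.tdwg.org/dwc/dwcore/"

ET.register_namespace("", EOL_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("dwc", DWC_NS)


def export_from_db(conn):
    """Map rows of a hypothetical 'descriptions' table to an XML string."""
    # Ordering by name lets groupby collect all rows of a taxon together.
    rows = conn.execute(
        "SELECT scientific_name, description FROM descriptions "
        "ORDER BY scientific_name"
    )
    root = ET.Element(f"{{{EOL_NS}}}response")  # assumed root element name
    for name, group in groupby(rows, key=lambda row: row[0]):
        taxon_el = ET.SubElement(root, f"{{{EOL_NS}}}taxon")
        ET.SubElement(taxon_el, f"{{{DWC_NS}}}ScientificName").text = name
        for _, description in group:
            obj_el = ET.SubElement(taxon_el, f"{{{EOL_NS}}}dataObject")
            ET.SubElement(obj_el, f"{{{DC_NS}}}description").text = description
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Demo with an in-memory database standing in for a real one.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE descriptions (scientific_name TEXT, description TEXT)"
    )
    demo.executemany(
        "INSERT INTO descriptions VALUES (?, ?)",
        [("Ursus maritimus", "Habitat text."), ("Ursus maritimus", "Diet text.")],
    )
    print(export_from_db(demo))
```

Writing the result to a file on your server would then give EOL a location to harvest from periodically.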
The EOL schema is versioned and we are ensuring each version is backwards compatible with previous versions. This allows us to expand the schema by adding new attributes without affecting existing content partners or schema creation code.
Validating your XML
Be sure to run your file through the XML File Validator before submitting it to EOL. The validator checks XML documents against their declared XML schema. If the file is in the EOL Transfer Schema format, it also performs data integrity checks for conditional requirements that cannot be expressed in XSD. For example, it will look for data objects of type image or video that lack the mediaURL element, text objects with no dc:description, and elements that are in the wrong sequence. It will print out a list of integrity failures and recommended changes, such as adding dc:identifiers.
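To make the idea of a conditional integrity check concrete, here is a rough Python sketch of two of the checks mentioned above, which you could run locally before using the XML File Validator. It does not reproduce the validator and does not perform XSD validation or sequence checks. The dataType element name, the use of DCMI type URIs as its values, and the EOL namespace URI are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# EOL namespace URI is an assumption; the Dublin Core URI is standard.
EOL_NS = "http://www.eol.org/transfer/content/0.3"
DC_NS = "http://purl.org/dc/elements/1.1/"
NS = {"eol": EOL_NS, "dc": DC_NS}

# Assumed dataType values (DCMI Type vocabulary URIs).
MEDIA_TYPES = {
    "http://purl.org/dc/dcmitype/StillImage",
    "http://purl.org/dc/dcmitype/MovingImage",
}
TEXT_TYPE = "http://purl.org/dc/dcmitype/Text"


def check_integrity(xml_text):
    """Return a list of human-readable problems found in the document."""
    problems = []
    root = ET.fromstring(xml_text)
    data_objects = root.iter(f"{{{EOL_NS}}}dataObject")
    for index, obj in enumerate(data_objects, start=1):
        # findtext returns "" when the element is missing or has no text.
        data_type = obj.findtext("eol:dataType", default="", namespaces=NS).strip()
        media_url = obj.findtext("eol:mediaURL", default="", namespaces=NS).strip()
        description = obj.findtext("dc:description", default="", namespaces=NS).strip()
        if data_type in MEDIA_TYPES and not media_url:
            problems.append(f"dataObject {index}: image or video without mediaURL")
        if data_type == TEXT_TYPE and not description:
            problems.append(f"dataObject {index}: text without dc:description")
    return problems
```

Calling check_integrity with the contents of your export file returns an empty list when neither problem is found.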
For very large XML files you can use other free software. For example, the command line functions of AltovaXML will allow you to validate your EOL XML. You can download AltovaXML here. The command takes either a local path or a URL:
altovaxml.exe /v C:\xml_path\really_big_xml.xml
altovaxml.exe /v http://my_domain/xml_path/really_big_xml.xml
Providing Information in a Spreadsheet
See our Spreadsheet Partners page for discussion of providing data through a spreadsheet.
Creating a DarwinCore Export file for Taxonomic Information
Using the latest version of DarwinCore, EOL can import more complete taxonomic classification data than the EOL XML Schema currently supports. We are still developing documentation for this process. In the meantime, please contact us for assistance if you have more detailed classification data.