Herbarium Information Standards and Protocols for Interchange of Data
Version 3
Editor
Barry J. Conn
Internet URL
http://www.rbgsyd.gov.au/HISCOM
© Council of Heads of Australian Herbaria
Previous Versions of HISPID
Version 1:
Croft, J.R. (ed.) (1989). HISPID - Herbarium Information Standards and
Protocols for Interchange of Data (Australian National Botanic Gardens: Canberra).
Version 2:
Whalen, A. (ed.) (1993). HISPID - Herbarium Information Standards and
Protocols for Interchange of Data (National Herbarium of New South Wales: Sydney).
As with all previous versions of HISPID, the preparation of the HISPID3 interchange standard is being coordinated by a committee of representatives from all major Australian herbaria. Since 1995, the development of this standard has been coordinated by the 'Herbarium Information Systems Committee' (HISCOM) (refer Internet URL http://www.rbgsyd.gov.au/HISCOM).
INTRODUCTION
The 'Herbarium Information Standards and Protocols for Interchange of
Data' (HISPID) is a standard format for the interchange of electronic herbarium specimen
information. HISPID has been developed by a committee of representatives from all major
Australian herbaria. This interchange standard was first published in 1989, with a revised
version published in 1993.
HISPID3 is an accession-based interchange standard. Although many fields
refer to attributes of the taxon they should be construed as applying to the specimen
represented by the record, not to the taxon per se. The interchange of taxonomic,
nomenclatural, bibliographic, typification, rare and endangered plant conservation, and
other related information is not dealt with in this standard, unless it specifically
refers to a particular accession (record).
This data dictionary is concerned primarily with data interchange
standards but has considerable relevance to database structure since the task of preparing
interchange files is simplified if the data fields of the despatching and receiving
databases match, as far as possible, the interchange standard. If differences do exist
then, generally, it is easier to combine data fields than it is to dissect them in a
reliable manner. Fields that are concatenated are frequently heterogeneous in their nature
and may preclude the possibility of rearranging the data contained within such fields.
The fields discussed in this data dictionary cover most of the herbarium
and botanic gardens sphere of activity and have been arranged in groups of similar types
of information. In many cases these groups may coincide with separate well-defined tables
(or databases) of structurally similar records.
The challenge for herbarium data managers is to decide whether the data
are to be efficiently exchanged as discrete but related tables (databases) or as a larger
single flat file that may have to be appropriately dismembered by the receiving
institution. Some database packages are able to stack multiple values in a single field.
This useful data structure complicates the interchange format and will not be used at this
stage.
The 'Herbarium Information Systems Committee' (HISCOM) considered
several format options for HISPID3. It was agreed that the interchange format of HISPID3
would be a flat file. This flat-file format was chosen because it was relatively simple
and required minimal computer programming to enable the importing and exporting of data.
Furthermore, this format was in agreement with that chosen for the 'International Transfer
Format for Botanic Gardens Plant Records (Version 2.00)' (ITF2). Although it was recognised
that it was difficult to transfer relational (hierarchical) data in flat-file formats, it
was decided to proceed with the publication of this version of HISPID so that electronic
data interchange could be actively encouraged. It is hoped that future versions of HISPID
will include the capability of transferring data such that the relational structure is
maintained.
There have been several major changes incorporated into this version of
the HISPID transfer format, namely:
(1) HISPID3 allows for the interchange of variable length fields. It is no longer restricted to a fixed length format.
(2) HISPID3 allows missing data to be omitted from the transfer file.
(3) HISPID3 provides a protocol for interchanging (non-standard) data that are either not defined within this document or are in a form different to that defined here.
(4) Apart from a few exceptions, HISPID3 does not evaluate the relevance of interchanging any of the specific fields described in this document.
(5) The references to how data are stored in the major Australian herbarium databases have been deleted from this document.
(6) HISPID3 has been developed in conjunction with ITF2 (International
Transfer Format for Botanic Gardens Plant Records version 2.00) so that the two
interchange standards are as compatible as possible.
The transfer format of HISPID3 is based on 'Information technology -
Open Systems Interconnection - Specification of Abstract Syntax Notation One (ASN.1) '.
International Standard ISO/IEC 8824, 2nd ed. (1990)(ISO/IEC: Genève).
As far as practicable, raw data should be used. Interpretations or
corrections in free text fields should be enclosed in square brackets: '[' and ']'.
Omitted data should be represented by the ellipsis: '...'.
Since the printable ASCII (EBCDIC or UNICODE) character set does not
include italicised characters, these are not included in the interchange file.
If information is not known for a field, then the field need not be
included in the interchange file or else the field identifier may be interchanged
unfilled. However, if the value of the Collector's Identifier field is unknown,
then the default value should be 's.n.'.
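The rule above can be sketched in Python as follows. This is a minimal illustration rather than part of the standard: the function name `prepare_record` is hypothetical, the sketch takes the "omit the field" option rather than interchanging an unfilled identifier, and it assumes that `cid` is the transfer code for the Collector's Identifier field, as the example records in this document suggest.

```python
def prepare_record(record: dict) -> dict:
    """Drop fields with no known value and default the collector's identifier."""
    # Treat None and empty or whitespace-only strings as "not known".
    known = {
        code: value
        for code, value in record.items()
        if value is not None and str(value).strip() != ""
    }
    # Assumption: 'cid' is the Collector's Identifier transfer code.
    # It is the one field that receives a default instead of being omitted.
    known.setdefault("cid", "s.n.")
    return known


record = {"insid": "NSW", "accid": "390839", "alt": None, "cid": ""}
print(prepare_record(record))
```

A numeric value of 0 is treated as known and is kept, since only missing or blank values are dropped.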
In general, single character (flag) fields have not been included in
this standard because of the difficulty of detecting data entry errors.
As for the 'single character' fields (above), codes are mostly not
included in this standard because of the difficulty of detecting data entry errors.
The fields included in this interchange standard are a compilation from
the following sources:
ABIS Australian Biotaxonomic/Biogeographic Information System
(Australian Biological Resources Study ABRS)
ICBN International Code of Botanical Nomenclature
(International Association of Plant Taxonomists IAPT)
ITF International Transfer Format for Botanic Gardens Plant Records
ITFBGPR (Botanic Gardens Conservation International BGCI)
ITRF International Earth Rotation Service Terrestrial Reference Frame
MFN Minimal Functional Nomenclator, also known as:
DSTI Database Standards for Taxonomic Information
(Taxonomic Database Working Group TDWG)
PECS Plant Existence and Categorisation Scheme, also known as:
POSS Plant Occurrence and Status Scheme
(World Conservation Monitoring Centre WCMC Threatened Plants Unit - TPU)
SDTS Spatial Data Transfer Standard
TDWG Taxonomic Database Working Group
TLR Type and Lectotypification Registers
(Taxonomic Database Working Group TDWG)
WGSUB World Geographical System for Use in Botany
(Taxonomic Database Working Group TDWG)
XDF Language for the Definition and Exchange of Biological Data Sets
(Taxonomic Database Working Group TDWG)
a) Each field is prefaced by a unique identifier (this refers to the fields which describe the contents of the file, as well as to those which describe the information contained in each record);
b) Each unique identifier must begin with a lowercase letter (a-z) and cannot contain any spaces;
c) A transfer file begins with the file identifier 'startfile';
f) Variable length fields are allowed;
g) Fields can be omitted from the transfer file if there is no information available for that field;
h) Alphanumeric data are enclosed by double quotation marks (");
i) Numeric data are not enclosed by double quotation marks;
j) Each field and each item of file information occupies one line and is terminated by a comma (,);
k) Each transfer file ends with the file identifier 'endfile'.
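The rules listed above might be applied when exporting data roughly as follows. This Python sketch is illustrative only. The function names are hypothetical, and it assumes that an identifier and its value are separated by a single space, which the rules do not state. The use of '{' and '}' as record delimiters follows the file structure shown in this document. Values containing a double quotation mark are rejected because no escaping convention is given here.

```python
import re

# Rule b: an identifier begins with a lowercase letter and contains no spaces.
IDENTIFIER = re.compile(r"[a-z]\S*")


def format_value(value) -> str:
    """Rules h and i: quote alphanumeric data, leave numeric data bare."""
    if isinstance(value, bool):
        raise TypeError("boolean values are not covered by these rules")
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    # Assumption: no escaping convention is defined, so refuse these cases.
    if '"' in text or "\n" in text:
        raise ValueError("embedded quotation marks or newlines are not handled")
    return f'"{text}"'


def format_field(code: str, value) -> str:
    """Rules a, b and j: one 'identifier value,' line per field."""
    if not IDENTIFIER.fullmatch(code):
        raise ValueError(f"invalid identifier: {code!r}")
    return f"{code} {format_value(value)},"


def write_file(header: dict, records: list) -> str:
    """Assemble a transfer file from file information and a list of records."""
    lines = ["startfile"]  # rule c
    lines += [format_field(k, v) for k, v in header.items()]
    for record in records:
        lines.append("{")
        # Rule g: fields with no information (None here) are omitted.
        lines += [format_field(k, v) for k, v in record.items() if v is not None]
        lines.append("}")
    lines.append("endfile")  # rule k
    return "\n".join(lines) + "\n"


print(write_file(
    {"version": "HISPID3", "numrecords": 1},
    [{"insid": "NSW", "accid": "390839", "latdeg": 33, "alt": None}],
))
```

Because each field carries its own identifier, fields of any length can be written (rule f) and absent fields simply do not appear.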
| startfile | |
| version | HISPID version |
| numrecords | number of records in this file |
| datefile | date to which the file refers |
| institute | full name of institution supplying information |
| contact | contact name |
| address | postal address |
| phone | telephone number |
| fax | fax number |
| email | email address |
| nonstandard | optional field to describe any non-standard fields added to the HISPID3 transfer file |
| fileaction | descriptor flag indicating how records of file should be processed |
| filedescriptor | descriptor flag indicating the nature of the records included in file |
| content | contents of the file and other comments |
| { | start of a record |
| insid | the standard 'Index Herbariorum' code for the herbarium to which the plant record refers |
| accid | accession number |
| ... | |
| } | end of record |
| { | start of next record |
| insid | |
| accid | |
| ... | |
| } | end of next record |
| endfile | end of file |
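A receiving institution may wish to check this overall structure before importing any records. The following Python sketch shows one possible check. The function name `check_structure` is hypothetical, and the sketch assumes that 'startfile', 'endfile', '{' and '}' each occupy a line of their own in the plain-text file. It does not inspect the individual fields.

```python
def check_structure(text: str) -> list:
    """Return a list of structural problems found in a transfer file."""
    # Blank lines are ignored; line numbers below count non-blank lines only.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if not lines or lines[0] != "startfile":
        problems.append("file does not begin with 'startfile'")
    if not lines or lines[-1] != "endfile":
        problems.append("file does not end with 'endfile'")

    in_record = False
    for number, line in enumerate(lines, start=1):
        if line == "{":
            if in_record:
                problems.append(f"line {number}: '{{' inside an open record")
            in_record = True
        elif line == "}":
            if not in_record:
                problems.append(f"line {number}: '}}' without a matching '{{'")
            in_record = False
    if in_record:
        problems.append("last record is not closed with '}'")
    return problems


good = 'startfile\nversion "HISPID3",\n{\ninsid "NSW",\n}\nendfile\n'
print(check_structure(good))
```

An empty list means that the file begins and ends correctly and that every record is opened and closed exactly once.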
| startfile | |
| version | "HISPID3", |
| numrecords | 2, |
| datefile | 19951202, |
| institute | "National Herbarium of New South Wales (NSW)", |
| contact | "Gary Chapple", |
| address | "Royal Botanic Gardens, Mrs Macquaries Road, Sydney NSW 2000, Australia", |
| phone | 612 92318164, |
| fax | 612 92517231, |
| email | "gary@rbgsyd.gov.au", |
| fileaction | "insert", |
| filedescriptor | "exchange", |
| content | "Herbarium exchange data of various species from NSW to CANB", |
| { | |
| insid | "NSW", |
| accid | "390839", |
| fam | "Loranthaceae", |
| gen | "Amyema", |
| sp | "pendulum", |
| isprk | "subsp.", |
| isp | "longifolium", |
| vnam | "Wiecek, B.M.", |
| vdat | 1995, |
| prot | "Wild", |
| cou | "AUSTRALIA", |
| pru | "NSW", |
| sru | "Central W. Slopes", |
| loc | "Mount Bolton, Moura", |
| latdeg | 33, |
| latmin | 15, |
| latdir | "S", |
| londeg | 148, |
| lonmin | 24, |
| londir | "E", |
| cnam | "Baeuerlen, W.", |
| cdat | 190103, |
| hab | "On Eucalyptus macrorrhyncha.", |
| misc | "Donated by Museum of Applied Arts & Sciences, 1979.", |
| } | |
| { | |
| insid | "NSW", |
| accid | "248836", |
| fam | "Asclepiadaceae", |
| gen | "Cynanchum", |
| sp | "pedunculatum", |
| vnam | "Hill, K.D.", |
| vdat | 1992, |
| prot | "Wild", |
| cou | "AUSTRALIA", |
| pru | "WA", |
| sru | "Fortescue", |
| loc | "Mount Lois.", |
| alt | 800, |
| latdeg | 22, |
| latmin | 06, |
| latdir | "S", |
| londeg | 117, |
| lonmin | 44, |
| londir | "E", |
| geoacy | 0.05, |
| hab | "Summit of mountain. Red loam derived from iron-rich shale.", |
| cnam | "Wilson, Peter G.", |
| cid | "1031", |
| cnam2 | "Rowe, R.", |
| cdat | 19910911, |
| cnot | "Rare. Scrambler. Flowers white; fruit green.", |
| } | |
| endfile | |
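A file such as the example above could be read back into memory along the following lines. This Python sketch is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the plain-text file is assumed to hold one 'identifier value,' pair per line with whitespace between the two, and the input is assumed to be well formed. Unquoted values that are not valid numbers, such as the telephone number in the example, are kept as text.

```python
def parse_value(raw: str):
    """Convert the text after an identifier into a Python value."""
    raw = raw.strip()
    if raw == "":
        return None  # identifier interchanged unfilled
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]  # alphanumeric data: strip the quotation marks
    for convert in (int, float):
        try:
            return convert(raw)
        except ValueError:
            pass
    return raw  # unquoted but not numeric: keep as text


def parse_transfer_file(text: str):
    """Return (file information, list of records) as dictionaries."""
    header, records, current = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line in ("startfile", "endfile"):
            continue
        if line == "{":
            current = {}  # start of a record
        elif line == "}":
            records.append(current)  # end of a record
            current = None
        else:
            # Drop the terminating comma, then split identifier from value.
            parts = line.rstrip(",").split(None, 1)
            target = header if current is None else current
            target[parts[0]] = parse_value(parts[1] if len(parts) > 1 else "")
    return header, records


sample = '''startfile
version "HISPID3",
datefile 19951202,
{
insid "NSW",
accid "248836",
latmin 06,
geoacy 0.05,
}
endfile
'''
header, records = parse_transfer_file(sample)
print(header)
print(records)
```

Note that a value written with a leading zero, such as the latitude minutes in the second example record, becomes an ordinary integer under this conversion, so the leading zero is not preserved.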
The herbarium data fields for information interchange are listed below
in the following format:
The name of the discrete piece of information within the file or within
each record.
The standard codes used as file or field identifiers in the transfer
file.
A short, meaningful-sounding single-word name for the field, proposed
by TDWG.
A general elaboration of the field name.
The existence of this type of data in any other published or proposed
biological standards.
The type of data allowed in this field, the range of values, or
individual allowable values, and capitalisation.
Any other remarks on the use or application of these data and its
relationships to other data. Any conflicts or problems in the application of these data
types.
Additional information to that provided in Comments explaining
the rules applying to these data.
Additional comments to those provided in Comments and Rules.