isocat.org - Rational Reconstruction for TDG Metadata

Claus Zinn, version v0.1 -- timestamp: <2011-11-23 12:14:26 (Zinn)>
Christina Hoppermann, version v0.2 -- timestamp: <2011-12-16 17:15:10 (Hoppermann)>
Thorsten Trippel, version v0.2.2 -- timestamp: <2011-12-20T15:06 (Trippel)>

The Type Hierarchy

Classes start with a Capital letter, properties are written in camelCase with initial lowercase letter, and instances are entirely written in CAPITAL letters.
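
The naming convention can be checked mechanically. The following is a minimal sketch, assuming names consist of ASCII letters and digits only (plus single spaces inside instance names such as NORTH AMERICA); the function name classify and the regular expressions are illustrative and not part of ISOcat or schema.org.

```python
import re

# Illustrative patterns for the three naming conventions (assumed, not normative).
INSTANCE_RE = re.compile(r"[A-Z][A-Z0-9]*(?: [A-Z0-9]+)*")  # e.g. EUROPE, NORTH AMERICA
CLASS_RE = re.compile(r"[A-Z][a-z][A-Za-z0-9]*")            # e.g. Resource, CreativeWork
PROPERTY_RE = re.compile(r"[a-z][A-Za-z0-9]*")              # e.g. countryName


def classify(name: str) -> str:
    """Return 'instance', 'class', 'property', or 'unknown' for a name."""
    if INSTANCE_RE.fullmatch(name):
        return "instance"
    if CLASS_RE.fullmatch(name):
        return "class"
    if PROPERTY_RE.fullmatch(name):
        return "property"
    return "unknown"
```

For example, classify("Resource") yields "class", classify("countryName") yields "property", and classify("NORTH AMERICA") yields "instance". Note that an all-lowercase name such as characterencoding still passes as a property: a pattern cannot tell where word boundaries should have been capitalised.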


Thing: description, url
Media (Note: ISOcat names this "mediatype")
Communication (Note: ISOcat names this "eventStructure")
Intangible
Location:
Continent: continentName. Please use one of: ANTARCTICA, ASIA, AFRICA, AUSTRALIA, EUROPE, NORTH AMERICA, SOUTH AMERICA.
Country: countryName. Please provide the two-letter ISO 3166-1 alpha-2 country code.
Resource: accessProtocol, approach, availability, author, characterencoding, characterSet, creationdate, creationtool, completionyear, deliveryFormat, deploymentTool, derivationDate, derivationMode, derivationTool, derivationType, derivationWorkflow, dialect, distributionType, domain, dominantLanguage, endposition, geographiccoverage, geocoordinates, harvestingDate, languageid, languagename, languagescript, lastUpdate, license, licenseType, locationAddress, locationContinent, locationCountry, locationRegion, mainScript, mediaType, medium, metadataCreationDate, metadataCreator, metadataLanguage, mimetype, modalities, noLanguages, originalSource, positionType, price, publicationDate, region, resourceClass, relationType, resourceName, resourceTitle, size, sizePerLanguage, sizePerRepLevel, sizeUnit, socialFamilyContext, sourceLanguage, startPosition, startYear, structuralUnits, subStructureName, subStructureType, timecoverage, targetLanguage, task, temporalClassification, topic, updatefrequency, validation, validationLevel, validationMode, validationType, varietyName, version
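
To illustrate how such a hierarchy could be held in code, here is a small sketch that records each class with its own properties and reports which superclass every property comes from. The parent links (Thing > Intangible > Location > Country/Continent) are an assumption, since the listing above does not state the nesting explicitly; TypeNode and inherited_view are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TypeNode:
    """One class of the hierarchy with the properties it defines itself."""
    name: str
    properties: List[str] = field(default_factory=list)
    parent: Optional["TypeNode"] = None


def inherited_view(node: TypeNode) -> List[Tuple[str, str]]:
    """Return (property, defining class) pairs, most general class first."""
    chain = []
    current: Optional[TypeNode] = node
    while current is not None:  # walk up to the root
        chain.append(current)
        current = current.parent
    return [(prop, cls.name) for cls in reversed(chain) for prop in cls.properties]


# ASSUMPTION: these parent links are a guess at the intended nesting.
thing = TypeNode("Thing", ["description", "url"])
intangible = TypeNode("Intangible", parent=thing)
location = TypeNode("Location", parent=intangible)
continent = TypeNode("Continent", ["continentName"], parent=location)
country = TypeNode("Country", ["countryName"], parent=location)
```

With these definitions, inherited_view(country) lists description and url as coming from Thing, followed by countryName from Country. This is the kind of information a per-class table in the style of schema.org would display.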

Rationale and todo list

  • This is one possible structured representation of the (more or less) flat set of DC entries. It is based on the ISOcat test server registry at lux13.mpi.nl as of November 2011. The given hierarchy is in no way final and serves as a basis for discussion. The basic insight is that the ISOcat registry already contains concepts and relations, though implicitly rather than explicitly; the structure given on this page attempts to reflect this.
  • The structured representation will allow users to get a better and quicker overview of the ISOcat registry's content. N.B. Only the TDG "Metadata" is covered!
  • The structured representation will allow contributors to better define new entries, or curate existing ones.
  • The structured representation will allow maintainers to drive the standardization process.
  • Classes start with a Capital letter, relations are written in camelCase, instances are given in CAPITALS. Whenever possible, a link to the corresponding ISOcat entry is given. -- I have "borrowed" the design of schema.org.
  • Classes stemming from schema.org may not have an ISOcat link, such as Thing, Intangible, Enumeration, and most of the datatypes.
  • In some cases, I have not linked a class name to an ISOcat entry, usually when there is a "corresponding" relation pointing to an ISOcat entry. See Country and countryName, for instance.
  • I have created various sub-classes for tools (AnalysisTool, CreationTool, etc.), mostly because there are corresponding relations (analysisTool, creationTool, etc.). These classes may not be necessary unless we would like to attach more specialised relations to them that we do not want to attach to the class Tool.
  • Some relations attached to class Resource may be better attached to subclasses of Resource.
  • Similar to schema.org, we should give access to tabular views where a class and its attributes are described in full detail (showing which relations are inherited from which superclass), together with examples of their use.
  • There is good potential for adding new entries to ISOcat given this structured representation.
  • There is good potential for curating existing entries in ISOcat given this structured representation. Renaming DCs to make them more consistent is advisable.
  • There is good potential for deleting existing entries from ISOcat given this structured representation (see below).
  • And of course, we should be able to integrate our representation with that of schema.org. As a first step, we should identify the interfaces between the two.
  • For instance, is Resource a subclass of CreativeWork?

    For instance, can we get rid of the many superfluous DC entries by using schema.org classes such as PostalAddress, ContactPoint, ...?

    Also, more standards should be used. A good example is schema.org's reference to the ISO 8601 duration format; similarly, standards should be used for mimetypes, file formats, and their attributes.

  • There is little use of datatypes in the ISOcat registry (most complex/open DCs are of type string). The schema.org datatype hierarchy has been copied here, and provisionally extended a little.
  • An OWL representation will be created for the visualisation. It should interact with the existing schema.org/OWL representation.
  • More to come
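
As an illustration of the points on standards and datatypes above, the following sketch replaces two plain strings with checked values: an ISO 8601 duration and an ISO 3166-1 alpha-2 country code. It rests on stated assumptions: only the week, day, hour, minute, and second designators with whole numbers are handled (years and months have no fixed length and are rejected), and the country check tests only the two-uppercase-letter shape, not whether the code is actually assigned. The names parse_duration and looks_like_alpha2 are hypothetical.

```python
import re
from datetime import timedelta

# Subset of the ISO 8601 duration syntax: PnW, PnD, PTnHnMnS and combinations.
_DURATION_RE = re.compile(
    r"P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)
_ALPHA2_RE = re.compile(r"[A-Z]{2}")


def parse_duration(text: str) -> timedelta:
    """Convert a duration string such as 'PT1H30M' into a timedelta."""
    match = _DURATION_RE.fullmatch(text)
    if match is None or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {key: int(value) for key, value in match.groupdict().items() if value is not None}
    if not parts:  # a bare 'P' carries no information
        raise ValueError(f"empty duration: {text!r}")
    return timedelta(**parts)


def looks_like_alpha2(code: str) -> bool:
    """Shape check only: two uppercase ASCII letters."""
    return _ALPHA2_RE.fullmatch(code) is not None
```

Here parse_duration("PT1H30M") equals timedelta(minutes=90), while parse_duration("P1Y") raises ValueError because year designators fall outside the supported subset. A production check of country codes would compare against the published ISO 3166-1 list rather than the shape alone.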