
How to Build the Semantic Web with Dublin Core

Adriana Iordan
January 30, 2007


Adriana Iordan

Adriana Iordan is a Web Marketing Specialist at Avangate B.V. She has in-depth knowledge of internet marketing services and website analysis applied to the software industry and e-commerce development. Avangate is an eCommerce platform for electronic software distribution that incorporates an easy-to-use, secure online payment system plus additional marketing and sales tools.

 


Developed by the Dublin Core Metadata Initiative (DCMI), the Dublin Core Metadata Element Set is a set of 15 elements for describing resources. The DCMI is an organization dedicated to promoting the widespread adoption of interoperable metadata standards and to developing specialized metadata vocabularies for describing resources, with the aim of enabling more intelligent information discovery systems.

Concept Overview

Although the name may suggest Ireland, it actually comes from Dublin, Ohio, USA, where the standard emerged from a workshop in 1995. The second part of the name, "core", reflects the main characteristic of the standard: it is broad and generic, yet extensible and usable for describing a wide range of resources.

Dublin Core is more a standard than a meta-tag system. From the very beginning, the aim of the DCMI has been to keep the standard simple and flexible, so that authors can provide metadata themselves, the metadata can be used in the context of any Internet document, and the standard can be easily adapted to other languages.

The basic element set is intended to capture most of the fundamental descriptive categories necessary to facilitate the effective search and retrieval of information. Additional building blocks can be created to provide modular compilations of metadata that can be built into more complex descriptions for information resources.

Therefore, the main characteristics of Dublin Core are:

    * Simplicity (of creation and maintenance)
    * Interoperability (among collections and indexing systems)
    * International applicability
    * Extensibility
    * Modularity

The semantics of Dublin Core have been established by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship and practice.

Functionality of Dublin Core

When the Dublin Core standard was created, the DCMI had identified what was seen as a "crisis" in Web search and information retrieval. Given that web search engines covered only a small fraction of the Internet, the DCMI's solution was to develop a standardized vocabulary that could be used efficiently to describe Web pages. The Dublin Core Metadata Element Set is intended to facilitate the discovery of electronic resources.

It is generally accepted that the Dublin Core standard comprises two levels: Simple and Qualified. Simple Dublin Core includes 15 elements, whereas Qualified Dublin Core adds three further elements (Audience, Provenance and RightsHolder) and a group of element refinements (or qualifiers) that narrow the semantics of the elements in order to improve resource discovery.

The Simple Dublin Core Metadata Element Set is composed of the following 15 elements:

    * Title: a name given to the resource
    * Creator: an entity primarily responsible for making the resource
    * Subject: the topic of the resource
    * Description: an account of the resource
    * Publisher: an entity responsible for making the resource available
    * Contributor: an entity responsible for making contributions to the resource
    * Date: a point or period of time associated with an event in the lifecycle of a resource
    * Type: the nature or genre of the resource
    * Format: the file format, physical medium, or dimensions of the resource
    * Identifier: an unambiguous reference to the resource within a given context
    * Source: the resource from which the described resource is derived
    * Language: a language of the resource
    * Relation: a related resource
    * Coverage: the spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant
    * Rights: information about rights held in and over the resource

Each of the DC elements is optional and can be repeated if necessary, and there is no prescribed order for using or presenting them.
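
Because elements are optional, repeatable and unordered, a record can be modelled as a plain list of (element, value) pairs. The following is a minimal Python sketch under that assumption; it is not part of any DCMI specification, and the names `DC_ELEMENTS` and `validate_record` are hypothetical, introduced here only for illustration.

```python
# The 15 Simple Dublin Core element names, lowercased.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def validate_record(record):
    """Return the element names in `record` that are not Simple DC elements.

    `record` is a list of (element, value) pairs. A list (rather than a
    dict) is used so that elements may repeat and appear in any order.
    """
    return [name for name, _ in record if name.lower() not in DC_ELEMENTS]


# Hypothetical record: two creators, no publisher, arbitrary order.
record = [
    ("creator", "Adriana Iordan"),
    ("title", "How to Build the Semantic Web with Dublin Core"),
    ("creator", "Avangate"),
]
unknown = validate_record(record)  # empty list: every name is a DC element
```

A record with no entries at all is also valid under this model, since no element is mandatory.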

As stated above, the Qualified Dublin Core comprises a set of element refinements, whose main purpose is to make the meaning of an element more specific. This means that a refined element will share the meaning of the "unqualified" element while having a more specific scope.

The qualification of Dublin Core elements is governed by a guiding principle known as the Dumb-Down Principle: if an application does not understand a specific element refinement term, it should be able to ignore the qualifier and treat the value as if it belonged to the unqualified element.

Although this entails some loss of specificity, the value remains correct and useful for discovery under the general, broader term.
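
The Dumb-Down Principle can be sketched in a few lines of Python. The sketch assumes the dotted "DC.element.qualifier" naming convention used in HTML meta tags; the function names `dumb_down` and `read_metadata`, and the idea of a `known_names` set, are hypothetical and serve only to illustrate the fallback.

```python
def dumb_down(meta_name):
    """Strip any refinement from a dotted name such as 'DC.date.created'.

    Assumes the 'DC.element.qualifier' convention; names without a
    qualifier are returned unchanged.
    """
    parts = meta_name.split(".")
    return ".".join(parts[:2])


def read_metadata(pairs, known_names):
    """Keep names the application understands; dumb down the rest."""
    result = []
    for name, value in pairs:
        if name not in known_names:
            name = dumb_down(name)
        result.append((name, value))
    return result


# An application that only understands the unqualified date element.
pairs = [("DC.date.created", "2007-01-17"), ("DC.title", "Example")]
simple = read_metadata(pairs, known_names={"DC.date", "DC.title"})
```

The value itself is left untouched: only the name loses its refinement, which is the loss of specificity described above.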

Besides element refinements, the DC qualifiers also include Encoding Schemes. These qualifiers identify schemes that help in interpreting an element value. The schemes include controlled vocabularies and formal notations or parsing rules.

A value expressed using an encoding scheme will thus be either a token selected from a controlled vocabulary (e.g., a term from a classification system or set of subject headings) or a string formatted according to a formal notation (e.g., "2007-01-17" as the standard expression of a date).
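
To illustrate what a formal notation buys a consuming application, here is a hedged Python sketch that accepts only the complete-date form "YYYY-MM-DD" used in the example above. Date notations used with Dublin Core may permit other forms (such as a year alone), which this sketch deliberately does not handle; the function name `parse_dc_date` is hypothetical.

```python
import re
from datetime import date

# Four-digit year, two-digit month, two-digit day, separated by hyphens.
_DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_dc_date(value):
    """Parse a 'YYYY-MM-DD' string into a date.

    Returns None if the string does not match the notation or does not
    name a real calendar day.
    """
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Well-formed but impossible, e.g. February 30.
        return None


created = parse_dc_date("2007-01-17")
```

Because the notation is fixed, a string such as "17/01/2007" is rejected rather than guessed at, which is what makes the value reliably machine-readable.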

Examples

Dublin Core meta tags are placed within the HEAD section of a web page's HTML code. By convention, Dublin Core element names are preceded by the "DC" abbreviation. Embedding the metadata in meta tags means that it does not affect how the browser interprets and renders the page, or how the XHTML validates.

The following is an example of how the Dublin Core Metadata Elements could be used in your Web content:

<head>
<title>Shareware articles | Expert advice on how to sell software online</title>
<meta name="DC.title" content="How to Build the Semantic Web with Dublin Core">
<meta name="DC.creator" content="Avangate">

If you want to be even more specific, you can insert a qualifier, which could look like this:

<meta name="DC.creator.address" content="[email protected]">
<meta name="DC.subject" content="Dublin Core, Dublin Core metadata, Dublin Core element, concept">
<meta name="DC.description" content="emergence of the Dublin Core concept, Dublin Core levels, examples, pros and cons of Dublin Core">
<meta name="DC.date.created" content="2007-01-17">
<meta name="DC.format" content="text/html">
<meta name="DC.identifier" content="http://www.avangate.com/articles/">
<meta name="DC.language" content="en">
</head>
You can also check the Dublin Core metadata editor.
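
If the metadata lives in a database or content management system, tags like those above can be generated rather than typed by hand. The following Python sketch shows one way to do so; the function name `render_dc_meta` and the sample values are illustrative assumptions, and the only library call used is `html.escape` from the standard library.

```python
from html import escape


def render_dc_meta(pairs):
    """Render (name, content) pairs as HTML meta tags, one per line.

    Each name is prefixed with 'DC.'; both name and content are escaped
    so that quotes or ampersands cannot break the attribute syntax.
    """
    lines = []
    for name, content in pairs:
        lines.append(
            '<meta name="DC.{}" content="{}">'.format(
                escape(name, quote=True), escape(content, quote=True)
            )
        )
    return "\n".join(lines)


# Sample values taken from the example above.
pairs = [
    ("title", "How to Build the Semantic Web with Dublin Core"),
    ("creator", "Avangate"),
    ("date.created", "2007-01-17"),
    ("language", "en"),
]
print(render_dc_meta(pairs))
```

The output is meant to be pasted or templated into the HEAD section of the page; escaping matters mainly for free-text elements such as the description.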

Pros and Cons

Pros

    * It is a basic description mechanism that:
          o can be used in all domains
          o can be used for any type of resource
          o is simple, yet powerful
          o can be extended and can work with specific solutions
    * It makes it easier to find information wherever it is located (Internet/Intranets)
    * It is a successful standard on the Web
    * It is seeing growing use in specific communities with high quality requirements, such as:
          o Public Sector and Government Information
          o Corporate knowledge management
    * It may be seen as an essential building block for the Semantic Web(s)


Cons

    * Many of the DC meta names are not taken into account by major search engines, such as Google. DC meta names are mainly used by governments, libraries, museums, archives, publishers, environmental science repositories, and print and e-print archives.
    * Because Dublin Core elements are so versatile, there can be a tendency to over-use them. Many web developers and webmasters consider such over-use too "spammy", and it may result in a web site not getting the desired ranking from a particular search engine.

Conclusion

Dublin Core offers a standardized framework for describing resources, Web pages in particular. For the time being, the use of DC is not a trend in itself, since webmasters do not see any direct benefit in it. Nevertheless, it answers a constantly growing need for well-defined metadata that can be attached to resources within the global whirlpool of information. In the end, once the standard is widely used and the metadata is properly documented, a great step forward will have been taken towards the emergence of a Semantic Web.

