How to Build the Semantic Web with Dublin Core
Adriana Iordan January 30, 2007
Adriana Iordan is a Web Marketing Specialist at Avangate B.V. She has
in-depth knowledge of internet marketing services and website analysis
applied to the software industry and e-commerce development. Avangate
is an eCommerce platform for electronic software distribution
incorporating an easy-to-use and secure online payment system plus
additional marketing and sales tools.
Developed by the Dublin Core Metadata Initiative (DCMI), the Dublin Core
Metadata Element Set is a set of 15 elements that can be used for
resource description. The DCMI is an organization dedicated to promoting
the widespread adoption of interoperable metadata standards and to
developing specialized metadata vocabularies for describing resources,
with the goal of enabling more intelligent information discovery systems.
Concept Overview
Although its name may bring Ireland to mind, Dublin Core is actually
named after Dublin, Ohio, USA, where it originated at a workshop in
1995. The second part of its name, "core", reflects the main
characteristic of this standard: its elements are broad and generic,
yet expandable and usable for describing a wide range of resources.
Dublin Core is a standard rather than merely a meta-tag system. From the
very beginning, the aim of DCMI has been to keep this standard simple
and flexible, so that authors can provide metadata themselves, so that
it can be used within the context of any Internet document, and so that
it can be easily adapted to other languages.
The basic element
set is intended to capture most of the fundamental descriptive
categories necessary to facilitate the effective search and retrieval
of information. Additional building blocks can be created to provide
modular compilations of metadata that can be built into more complex
descriptions for information resources.
Therefore, the main characteristics of Dublin Core are:
* Simplicity (of creation and maintenance)
* Interoperability (among collections and indexing systems)
* International applicability
* Extensibility
* Modularity
The semantics of Dublin Core have been established by an international,
cross-disciplinary group of professionals from librarianship, computer
science, text encoding, the museum community, and other related fields
of scholarship and practice.
Functionality of Dublin Core
At the time the Dublin Core standard was created, the DCMI had
identified what it saw as a "crisis" in Web search and information
retrieval. Given that web search engines covered only a small fraction
of the Internet, the solution the DCMI proposed was to develop a
standardized vocabulary that could be used efficiently to describe Web
pages. The Dublin Core Metadata Element Set is intended to facilitate
the discovery of electronic resources.
It
is generally accepted that the Dublin Core standard comprises two
levels: Simple and Qualified. The Simple Dublin Core includes 15
elements, whereas the Qualified DC also includes three additional
elements (i.e. Audience, Provenance and RightsHolder) and a group of
element refinements (or qualifiers) that refine the semantics of the
elements in order to improve resource discovery.
The Simple Dublin Core Metadata Element Set is composed of the following 15 elements:
* Title: a name given to the resource
* Creator: an entity primarily responsible for making the resource
* Subject: the topic of the resource
* Description: an account of the resource
* Publisher: an entity responsible for making the resource available
* Contributor: an entity responsible for making contributions to the resource
* Date: a point or period of time associated with an event in the lifecycle of a resource
* Type: the nature or genre of the resource
* Format: the file format, physical medium, or dimensions of the resource
* Identifier: an unambiguous reference to the resource within a given context
* Source: the resource from which the described resource is derived
* Language: a language of the resource
* Relation: a related resource
* Coverage: the spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant
* Rights: information about rights held in and over the resource
It is important to know that each of the DC elements is optional, that
each can be repeated if necessary, and that there is no prescribed
order for using or presenting them.
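The three rules above (optional, repeatable, unordered) can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of any DCMI specification: the `SIMPLE_DC` set and the `make_record` helper are hypothetical names introduced here, and the sample values are taken from this article.

```python
# Hypothetical helper names; this is an illustration, not a DCMI API.
SIMPLE_DC = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def make_record(pairs):
    """Build a record from (element, value) pairs.

    Every element is optional, may repeat, and may arrive in any order,
    so each element present in the input maps to a list of values.
    """
    record = {}
    for element, value in pairs:
        name = element.lower()
        if name not in SIMPLE_DC:
            raise ValueError(f"not a Simple Dublin Core element: {element}")
        record.setdefault(name, []).append(value)
    return record


record = make_record([
    ("subject", "Dublin Core"),
    ("title", "How to Build the Semantic Web with Dublin Core"),
    ("subject", "Dublin Core metadata"),  # a repeated element
])
print(record["subject"])
```

Storing a list per element keeps repetition cheap, and leaving absent elements out of the dictionary reflects the fact that none of them is mandatory.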
As stated above, the Qualified Dublin
Core comprises a set of element refinements, whose main purpose is to
make the meaning of an element more specific. This means that a refined
element will share the meaning of the "unqualified" element while
having a more specific scope.
A guiding principle, called the Dumb-Down Principle, has been devised in
support of the qualification of the Dublin Core elements. It states
that, if an application fails to understand a specific element
refinement term, it should nevertheless be able to ignore the qualifier
and treat the value as if it belonged to the unqualified element.
Even if this amounts to a certain loss of specificity, the general,
broad term still remains, and it holds a value that is correct and
useful for discovery.
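As a rough sketch of the Dumb-Down Principle, the Python snippet below strips the refinement from a qualified name. It assumes the dotted "DC.element.qualifier" naming used in the HTML examples of this article; the function names `dumb_down` and `dumb_down_record` are hypothetical and chosen only for illustration.

```python
def dumb_down(name):
    """Reduce a qualified name such as "DC.date.created" to "DC.date".

    Assumes the dotted "prefix.element.qualifier" form; a name that is
    already unqualified is returned unchanged.
    """
    parts = name.split(".")
    # Keep the prefix and the element, drop any refinement after them.
    return ".".join(parts[:2])


def dumb_down_record(pairs):
    """Apply dumb_down to every (name, value) pair, keeping the values."""
    return [(dumb_down(name), value) for name, value in pairs]


qualified = [("DC.date.created", "2007-01-17"), ("DC.language", "en")]
print(dumb_down_record(qualified))
```

The value itself is left untouched: the principle only holds when that value is still correct for the broader, unqualified element.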
Besides the element refinements, the class of DC qualifiers also
includes the Encoding Schemes. These qualifiers identify schemes that
help in the interpretation of an element value. These schemes include
controlled vocabularies and formal notations or parsing rules.
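The two kinds of encoding scheme can be illustrated with a short Python sketch: one check against a controlled vocabulary and one against a formal date notation. The function names are hypothetical, and `RESOURCE_TYPES` is only a small subset of terms from the DCMI Type Vocabulary, used here as an assumed example of a controlled vocabulary.

```python
import re
from datetime import date

# A small subset of DCMI Type Vocabulary terms, for illustration only.
RESOURCE_TYPES = frozenset({"Text", "Image", "Sound", "Dataset", "Software"})

# Formal notation: four-digit year, two-digit month, two-digit day.
_DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def in_vocabulary(value, vocabulary=RESOURCE_TYPES):
    """Return True if the value is a term of the controlled vocabulary."""
    return value in vocabulary


def is_iso_date(value):
    """Return True if the value is a real calendar date written YYYY-MM-DD."""
    if not _DATE_PATTERN.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)  # rejects impossible dates such as Feb 30
    except ValueError:
        return False
    return True


print(in_vocabulary("Text"), is_iso_date("2007-01-17"))
```

The pattern check comes first because it pins the notation down to one exact shape before the calendar check is applied.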
A value expressed using an encoding scheme will thus be a token
selected from a controlled vocabulary (e.g., a term from a
classification system or set of subject headings) or a string formatted
according to a formal notation (e.g., "2007-01-17" as the standard
expression of a date).
Examples
Dublin Core meta tags can be placed within the HEAD section of the HTML
code of web pages. Normally, the Dublin Core element names are preceded
by the "DC" abbreviation. The best way to use Dublin Core elements in
your HTML code is to embed the metadata in the HEAD section, so that it
does not interfere with the way the browser interprets and renders the
HTML, or with the validation of the XHTML.
The following example shows how the Dublin Core Metadata Elements could be used in your Web content:
<head>
<title>Shareware articles | Expert advice on how to sell software online</title>
<meta name="DC.title" content="How to Build the Semantic Web with Dublin Core">
<meta name="DC.creator" content="Avangate">
If you want to be even more specific, you can insert a qualifier, which could look like this:
<meta name="DC.creator.address" content="[email protected]">
<meta name="DC.subject" content="Dublin Core, Dublin Core metadata, Dublin Core element, concept">
<meta name="DC.description" content="emergence of the Dublin Core concept, Dublin Core levels, examples, pro's and con's of Dublin Core">
<meta name="DC.date.created" content="2007-01-17">
<meta name="DC.format" content="text/html">
<meta name="DC.identifier" content="http://www.avangate.com/articles/">
<meta name="DC.language" content="en">
</head>
You can also check the Dublin Core metadata editor.
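Meta tags like the ones above are easy to mistype by hand, so they are often generated instead. The following Python sketch shows one possible way to do that; `render_meta_tags` is a hypothetical helper introduced here, and the "DC" prefix default simply follows the naming used in this article.

```python
from html import escape


def render_meta_tags(pairs, prefix="DC"):
    """Return one <meta> line per (element, value) pair.

    Names and values are HTML-escaped so that quotes or angle brackets
    inside a value cannot break the attribute.
    """
    lines = []
    for element, value in pairs:
        name = escape(f"{prefix}.{element}", quote=True)
        content = escape(value, quote=True)
        lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


print(render_meta_tags([
    ("title", "How to Build the Semantic Web with Dublin Core"),
    ("date.created", "2007-01-17"),
    ("language", "en"),
]))
```

Escaping matters most for free-text elements such as the description, where apostrophes and quotation marks are common.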
Pros and Cons
Pros
* It is a basic description mechanism that:
  o can be used in all domains
  o can be used for any type of resource
  o is simple, yet powerful
  o can be extended and can work with specific solutions
* It makes it easier to find information wherever it is located (Internet/intranets)
* It is a successful standard on the Web
* It is seeing growing use in specific communities with high quality requirements, such as:
  o Public Sector and Government Information
  o Corporate knowledge management
* It may be seen as an essential building block for the Semantic Web(s)
Cons
* Many of the DC meta names are not taken into account by major search engines, such as Google. DC meta names are mainly used by governments, libraries, museums, archives, publishers, environmental science repositories, print and e-print archives.
* Because Dublin Core elements are so versatile, there can be a tendency to over-use them. Many web developers and webmasters consider such over-use "spammy", and it may result in a web site not getting the desired ranking from a particular search engine.
Conclusion
Dublin Core offers a standardized framework for describing resources,
Web pages in particular. For the time being, the use of DC is not a
trend per se (webmasters do not see any direct benefit in using Dublin
Core); nevertheless, it answers a constantly growing need for
well-defined metadata amid the global whirlpool of information. In the
end, once the standard is used and the metadata is properly documented,
a great step forward will have been taken towards the emergence of a
Semantic Web.