Getting Started with Linked Data

Linked Data is a set of best practices for publishing and connecting structured data on the Web. Its main objective is to free data from silos framed by proprietary database schemas. It does so by following four rules, defined by Tim Berners-Lee in 2006:

  1. use URIs (Uniform Resource Identifiers) to identify resources uniquely;
  2. use HTTP URIs so that people can look up information about the resource;
  3. provide information about the resources using standard formats such as RDF/XML; and
  4. include links to other resources (URIs), strengthening the connections between resources distributed across the Web.

These principles are stated as rules, but in practice they are recommendations or best practices for the development of the Semantic Web. You can publish data that meets only the first three principles, but failing to implement the fourth makes the data less visible and therefore less reusable.
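
As a rough illustration of the second and third principles, the sketch below prepares an HTTP request that asks a server for the RDF/XML description of a resource. It only builds the request and does not send it. The resource URI is a hypothetical example.org address, and whether a given server honours the Accept header is an assumption that depends on how that server is configured.

```python
from urllib.request import Request

# Hypothetical resource URI, used only for illustration.
RESOURCE_URI = "http://example.org/resource/rice"


def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) an HTTP GET request that asks for RDF/XML."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


request = build_rdf_request(RESOURCE_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup against a real server.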

What is RDF?

RDF (Resource Description Framework) is a framework for metadata on the Web, developed by the W3C. It is based on the idea of describing resources using statements of the form subject-predicate-object. Such a statement is known as an RDF triple. An RDF triple has three components, typically identified by URIs:

  • Subject is the entity being described, for example a person or another resource, identified by a URI or represented as a node;
  • Predicate is the property or relationship being asserted about the subject;
  • Object is the value of the property, or another resource to which the subject is related.
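
The subject-predicate-object structure can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a full RDF implementation: the class names and the example.org subject are hypothetical, while the two predicate URIs come from the Dublin Core Terms and SKOS vocabularies. Each triple is printed in the line-based N-Triples style.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]  # another resource, or a plain value


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples-style line."""
    def term(t: Union[URI, Literal]) -> str:
        if isinstance(t, URI):
            return f"<{t.value}>"
        # Escape backslashes and double quotes inside literal values.
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


# Hypothetical subject and object URIs; the predicates are real vocabulary terms.
rice = URI("http://example.org/resource/rice")
triples = [
    Triple(rice, URI("http://purl.org/dc/terms/title"), Literal("Rice")),
    Triple(rice, URI("http://www.w3.org/2004/02/skos/core#broader"),
           URI("http://example.org/resource/cereals")),
]

for t in triples:
    print(to_ntriples(t))
```

The first triple has a literal value as its object, the second links to another resource, which is the kind of statement the fourth Linked Data principle asks for.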

By using URIs to link data, the Semantic Web becomes a kind of large database that allows people and machines to explore referenced and interconnected information. A Web based on Linked Open Data (LOD) is a breakthrough in content syndication, which uses external data sources to create new services.

What is Linked Open Data?

Linked Open Data (LOD) is Linked Data distributed under an open license that allows its reuse for free. In 2010, Tim Berners-Lee defined a 5-star rating scheme to encourage data providers to provide linked data under open licenses. The scheme uses gold stars to evaluate the availability of linked data as linked open data.
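
The cumulative nature of the scheme can be sketched as a small function. The five criteria below follow the commonly cited formulation of the 5-star scheme (open licence on the Web, machine-readable structure, non-proprietary format, URIs and RDF, links to other data). The Dataset class and its field names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags describing how a dataset is published.
    on_web_open_license: bool      # 1 star: on the Web under an open licence
    machine_readable: bool         # 2 stars: structured, machine-readable data
    non_proprietary_format: bool   # 3 stars: non-proprietary format
    uses_uris_and_rdf: bool        # 4 stars: URIs identify things, RDF describes them
    links_to_other_data: bool      # 5 stars: links to other people's data


def star_rating(dataset: Dataset) -> int:
    """Count stars cumulatively: each star requires all the previous ones."""
    criteria = [
        dataset.on_web_open_license,
        dataset.machine_readable,
        dataset.non_proprietary_format,
        dataset.uses_uris_and_rdf,
        dataset.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A dataset published as RDF with URIs but without outgoing links.
rdf_without_links = Dataset(True, True, True, True, False)
print(star_rating(rdf_without_links))
```

A dataset that is converted to RDF but never linked to anything else stops at four stars under this reading.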

How to facilitate the linking between resources?

Simply transforming database schemas into RDF does not create linked data. Publishers risk getting stuck at the fourth star of the 5-star rating scheme. It should be possible to create links between RDF triple stores on the Web automatically; otherwise there is a risk of creating RDF silos. The easiest way to facilitate automatic linking between datasets is to use standard vocabularies, both for describing data and metadata elements and for indicating values.
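
To see why shared value vocabularies make automatic linking easier, consider the sketch below. Two hypothetical catalogues describe their documents with the Dublin Core Terms subject property and take the subject values from the same concept scheme, so records about the same topic can be matched by simple URI equality. All example.org URIs are invented for illustration, and real linking tools are considerably more elaborate.

```python
from collections import defaultdict

DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"

# Hypothetical triples from two independent catalogues that share one vocabulary.
catalogue_a = [
    ("http://a.example.org/doc/1", DCTERMS_SUBJECT, "http://vocab.example.org/concept/rice"),
    ("http://a.example.org/doc/2", DCTERMS_SUBJECT, "http://vocab.example.org/concept/maize"),
]
catalogue_b = [
    ("http://b.example.org/report/7", DCTERMS_SUBJECT, "http://vocab.example.org/concept/rice"),
    ("http://b.example.org/report/8", DCTERMS_SUBJECT, "http://vocab.example.org/concept/forestry"),
]


def link_by_shared_concept(triples_a, triples_b):
    """Return (resource_a, concept, resource_b) for resources sharing a subject concept."""
    resources_by_concept = defaultdict(list)
    for subject, predicate, obj in triples_b:
        if predicate == DCTERMS_SUBJECT:
            resources_by_concept[obj].append(subject)

    links = []
    for subject, predicate, obj in triples_a:
        if predicate == DCTERMS_SUBJECT:
            for other in resources_by_concept.get(obj, []):
                links.append((subject, obj, other))
    return links


for link in link_by_shared_concept(catalogue_a, catalogue_b):
    print(link)
```

If each catalogue had used its own local keyword list instead, the same match would require string comparison or manual mapping, which is how RDF silos arise.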

In order to provide content providers with a set of recommendations that support the selection of appropriate encoding strategies for producing LOD-enabled data, the AIMS Team plans to prepare a series of LODE recommendations covering a wide range of resource types. These will include encoding strategies for producing LOD-enabled bibliographic data as well as the encoding of value vocabularies used to describe agents, places, and topics in bibliographic data.

Examples of Vocabularies using Linked Data

Resource | Topics | Concepts | Languages | Linked Data | Type of link
AGROVOC | Agriculture, food, fishery, forestry | 31 956 | EN, ES, FR + 19 more | Yes | skos:broader, skos:narrower, skos:related
EUROVOC | General EU | 6 779 | EN, ES, FR + 21 more | Yes | skos:exactMatch
GEMET | Environment | 5 298 | EN, ES, FR + 30 more | Yes | skos:exactMatch
Library of Congress Subject Headings (LCSH) | General | 30 784 | EN | Yes | skos:exactMatch
NAL Thesaurus | General | 30 298 | EN, ES | No | skos:exactMatch
RAMEAU (Répertoire d’autorité-matière encyclopédique et alphabétique unifié) | General | 16 407 | FR | Yes | skos:exactMatch
STW – Thesaurus for Economics | Economy | 1 165 | EN, DE | Yes | skos:exactMatch
TheSoz – Thesaurus for the Social Sciences | Social sciences | 7 750 | EN, DE | Yes | skos:exactMatch
Geopolitical Ontology | Country Names | 253 | AR, CH, EN, ES, FR, RU | Yes | skos:exactMatch
Dewey Decimal Classification | General | 409 | EN, ES, FR + 8 more | Yes | skos:exactMatch
DBpedia | General | 10 989 | EN, ES, FR + 8 more | Yes | skos:exactMatch, skos:closeMatch
SWD (Schlagwortnormdatei) | General | 6 245 | DE | Yes | skos:exactMatch, skos:closeMatch, skos:broadMatch, skos:narrowMatch
GeoNames | Geographical database | 212 | EN, ES, FR + 63 more | Yes | skos:exactMatch
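
Most of the vocabularies in the table are connected through skos:exactMatch. SKOS defines this property as symmetric and transitive, so a set of pairwise mappings can be grouped into clusters of concepts that are treated as equivalent across vocabularies. The sketch below shows one way to compute such clusters. The concept URIs are hypothetical placeholders, not real identifiers from the vocabularies listed.

```python
from collections import defaultdict


def exact_match_clusters(pairs):
    """Group concept URIs connected by skos:exactMatch into clusters."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        # exactMatch is symmetric, so record the link in both directions.
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in list(neighbours):
        if start in seen:
            continue
        # Walk the graph from this concept to collect everything reachable,
        # which applies transitivity across chains of mappings.
        cluster = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbours[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Hypothetical exactMatch statements between three vocabularies.
mappings = [
    ("http://vocab-a.example.org/c_rice", "http://vocab-b.example.org/rice"),
    ("http://vocab-b.example.org/rice", "http://vocab-c.example.org/1234"),
    ("http://vocab-a.example.org/c_maize", "http://vocab-b.example.org/maize"),
]

for cluster in exact_match_clusters(mappings):
    print(sorted(cluster))
```

The same grouping should not be applied to skos:closeMatch, skos:broadMatch, or skos:narrowMatch, which SKOS does not define as transitive.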

Other Vocabularies using Linked Data

CAB Thesaurus, CAT, AgroXML, GBIF, ASFA, FAO Biotechnology Glossary, UMBEL and the Thesaurus Ethics in the Life Sciences.
