RDF – Basic Building Blocks of Semantic Web

In the first post, we gave a general description of the Semantic Web and how it can be useful. In this post, we look at RDF, its basic building block. RDF stands for Resource Description Framework, which the W3C defined as a standard for encoding metadata in 1999. The idea behind the standard is to make metadata readable by machines.

The standard is domain agnostic. It is fair to think of RDF as being to the Semantic Web what HTML is to the Web. An RDF statement has 3 parts: a subject, a predicate and an object.

[Image: the three parts of an RDF statement]

A combination of these three parts, a triple, is called a statement. The subject and object are two things in the world and the predicate connects them. Each statement represents a fact, and a collection of facts forms an RDF graph. If you recall from the earlier blog post, these statements combine to form a graph like the one below.

[Image: example RDF graph]
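As a rough sketch of the idea in Python, each statement can be held as a (subject, predicate, object) tuple and a graph as a set of such tuples. The predicate and person names below are illustrative assumptions and are not taken from the original figure.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a subject connected to an object by a predicate."""
    subject: str
    predicate: str
    obj: str


# Illustrative facts; the predicate and resource names are assumptions.
graph = {
    Triple("MovieID:Gladiator", "directedBy", "PersonID:RidleyScott"),
    Triple("MovieID:Gladiator", "starring", "PersonID:RussellCrowe"),
    Triple("PersonID:RussellCrowe", "bornIn", "CityID:Wellington"),
}


def outgoing(graph, subject):
    """Return the (predicate, object) pairs that leave a given subject node."""
    return sorted((t.predicate, t.obj) for t in graph if t.subject == subject)


for predicate, obj in outgoing(graph, "MovieID:Gladiator"):
    print(f"MovieID:Gladiator --{predicate}--> {obj}")
```

Because the object of one statement can be the subject of another, the set of tuples naturally forms a graph rather than a flat table.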

The subject and the object can be concrete things such as people or cities, or abstract things like “resourcefulness”. The subject or object is called a resource, and a statement describes that resource. Hence the name Resource Description Framework (RDF).

Having a unique global name is important

If you notice, the subject and object are names of resources. These names create problems if they are not universal. Assume that we denote the movie Gladiator with “MovieID:Gladiator”, while someone else calls it “mid:GladiatorTheMovie”. In Semantic Web terms, these two subjects are different resources, even though they refer to the same movie. The opposite problem arises if someone uses “MovieID:Gladiator” to represent something totally unrelated to the movie Gladiator: we might then end up merging graphs which are unrelated. Hence, to remove this ambiguity, the name of a resource should be global and should be identified by a Uniform Resource Identifier (URI).
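The effect of naming on merging can be sketched in a few lines of Python. This is only an illustration under assumptions: graphs are modelled as sets of tuples, the predicates are made up, and the example.org URI is hypothetical.

```python
def merge(*graphs):
    """Merge RDF graphs by taking the union of their triples."""
    return set().union(*graphs)


def subjects(graph):
    """Return the distinct subject names used in a graph."""
    return {s for s, _, _ in graph}


# Two authors describe the same movie under different local names.
graph_a = {("MovieID:Gladiator", "releasedIn", "2000")}
graph_b = {("mid:GladiatorTheMovie", "directedBy", "Ridley Scott")}
local_merge = merge(graph_a, graph_b)

# The same facts, with both authors reusing one (hypothetical) URI.
GLADIATOR = "http://example.org/movie/Gladiator"
uri_a = {(GLADIATOR, "releasedIn", "2000")}
uri_b = {(GLADIATOR, "directedBy", "Ridley Scott")}
uri_merge = merge(uri_a, uri_b)

print(len(subjects(local_merge)))  # two unconnected nodes for one movie
print(len(subjects(uri_merge)))    # one node carrying both facts
```

With local names the merged graph ends up with two disconnected nodes for the same movie; with a shared URI both facts attach to a single node.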

Usually, these URIs are either a hash URI or a slash URI. For example,
http://www.knoldus.com/about/team/erik is a slash URI and
http://www.knoldus.com/about/team#erik is a hash URI. Earlier, slash URIs were expected to return a resource from the web and hash URIs were not, but this difference is blurring now.
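One way to tell the two forms apart programmatically is to check for a fragment. The sketch below uses `urldefrag` from Python's standard library; the `describe` helper is a hypothetical name introduced only for this illustration.

```python
from urllib.parse import urldefrag


def describe(uri):
    """Classify a URI as 'hash' or 'slash' and return the document part."""
    document, fragment = urldefrag(uri)
    kind = "hash" if fragment else "slash"
    return kind, document


for uri in (
    "http://www.knoldus.com/about/team/erik",
    "http://www.knoldus.com/about/team#erik",
):
    kind, document = describe(uri)
    print(f"{uri} is a {kind} URI; document part: {document}")
```

For the hash URI, the document part is the same for every team member, since the part after `#` is not sent to the server in an HTTP request.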

The idea is to re-use the URIs that already exist and create new ones only if we have to. URIs can be long, so it is usually best to represent a URI with its XML Qualified Name (QName). For example, we can map the prefix knoldus to http://www.knoldus.com/about/team# and hence http://www.knoldus.com/about/team#erik can be written as knoldus:erik
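A minimal sketch of how such a prefix mapping could work, assuming the prefix knoldus stands for http://www.knoldus.com/about/team# (inferred from the example above). The `expand` and `shorten` helpers are hypothetical names for this illustration.

```python
# Assumed prefix-to-namespace mapping, inferred from the knoldus:erik example.
PREFIXES = {"knoldus": "http://www.knoldus.com/about/team#"}


def expand(qname, prefixes=PREFIXES):
    """Turn a QName such as 'knoldus:erik' into the full URI."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local


def shorten(uri, prefixes=PREFIXES):
    """Turn a full URI into a QName if a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known prefix, so leave the URI as it is


print(expand("knoldus:erik"))
print(shorten("http://www.knoldus.com/about/team#erik"))
```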

Apart from the subject and the object, the predicate name must also be a URI, and a new one should only be created if a suitable one does not exist already. This makes it possible to create shared vocabularies on the web, and it lets us use a predicate as the subject or object of another statement when the situation demands it.

Thus, in the above example, if the predicate were represented as a URI as well, every part of the statement (a.k.a. fact, a.k.a. triple) would be a URI.
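As an illustration, a statement whose three parts are all URIs can be written out in the N-Triples style, with each URI in angle brackets and a closing dot. The example.org URIs and the `to_ntriple` helper below are assumptions made for this sketch; they are not the URIs from the original example.

```python
def to_ntriple(subject, predicate, obj):
    """Format three URIs as a single N-Triples style statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical URIs, used only to show the shape of a statement.
statement = to_ntriple(
    "http://example.org/movie/Gladiator",
    "http://example.org/vocab/directedBy",
    "http://example.org/person/RidleyScott",
)
print(statement)
```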

Literals and Blank Nodes

The object can be a URI or it can be a literal. A literal can be represented as a string, optionally with a language tag so that the machine reading the literal knows how to interpret it.

A literal that has no specific type assigned, with or without a language tag, is an untyped literal, whereas a literal that has a datatype associated with it is a typed literal.
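The three kinds of literal can be sketched as follows. This is not the original example from the post: the values are made up, the `Literal` class is a hypothetical helper, and the output follows the N-Triples convention of `@lang` for language tags and `^^<datatype>` for typed literals, with datatypes taken from the XML Schema namespace.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """A literal value with an optional language tag or datatype URI."""
    value: str
    lang: Optional[str] = None
    datatype: Optional[str] = None

    def __post_init__(self):
        if self.lang and self.datatype:
            raise ValueError("use either a language tag or a datatype, not both")

    @property
    def is_typed(self):
        return self.datatype is not None

    def n3(self):
        """Serialize the literal in N-Triples style."""
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        text = f'"{escaped}"'
        if self.lang:
            return f"{text}@{self.lang}"
        if self.datatype:
            return f"{text}^^<{self.datatype}>"
        return text


examples = [
    Literal("Gladiator"),                       # untyped
    Literal("Gladiator", lang="en"),            # untyped, with a language tag
    Literal("2000", datatype=XSD + "integer"),  # typed
]
for literal in examples:
    print(literal.n3(), "typed" if literal.is_typed else "untyped")
```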

Sometimes the subject or the object does not have a unique URI. Such a node is called a blank node. The blank node can have further predicates and objects associated with it, but by itself it cannot be identified. For example, in our movie RDF graph, let's add a statement for reviewedBy and represent it like this

[Image: the movie graph with a reviewedBy statement pointing to a blank node]

In this scenario, the movie represented by “MovieID:Gladiator” is reviewedBy someone we cannot identify, because their URI is unknown or does not exist. However, we do know that this blank node has a predicate called name with the literal “Vikas Hazrati” associated with it. In RDF graphs, it is common to give such a blank node a local identifier and work with that, so the two statements in our case would share that identifier: one as the object of reviewedBy and one as the subject of name.
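A small sketch of this, assuming the common `_:label` convention for blank node identifiers (as used in N-Triples and Turtle). The `new_blank_node` helper and the `b1` style labels are assumptions for illustration; the label is only meaningful inside this one graph.

```python
import itertools

_counter = itertools.count(1)


def new_blank_node():
    """Return a fresh graph-local identifier such as '_:b1'."""
    return f"_:b{next(_counter)}"


reviewer = new_blank_node()

# Both statements refer to the same unnamed reviewer through the local label.
statements = [
    ("MovieID:Gladiator", "reviewedBy", reviewer),
    (reviewer, "name", '"Vikas Hazrati"'),
]

for subject, predicate, obj in statements:
    print(subject, predicate, obj)
```

The label ties the two statements together without claiming anything about who the reviewer is outside this graph.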

Stay tuned.



Written by 

Vikas is the CEO and Co-Founder of Knoldus Inc. Knoldus does niche Reactive and Big Data product development on Scala, Spark, and Functional Java. Knoldus has a strong focus on software craftsmanship which ensures high-quality software development. It partners with the best in the industry like Lightbend (Scala Ecosystem), Databricks (Spark Ecosystem), Confluent (Kafka) and Datastax (Cassandra). Vikas has been working in the cutting edge tech industry for 20+ years. He was an ardent fan of Java with multiple high load enterprise systems to boast of till he met Scala. His current passions include utilizing the power of Scala, Akka and Play to make Reactive and Big Data systems for niche startups and enterprises who would like to change the way software is developed. To know more, send a mail to hello@knoldus.com or visit www.knoldus.com
