EXERCISES TUTORIAL SEMANTIC DAYS 2010

For each section there will be about 15 minutes to solve exercises. There is probably more work than 15 minutes allows for, but the exercises vary in difficulty, so choose the level you are comfortable with.

1 SPARQL

In these exercises we will write SPARQL queries and execute them in a SPARQL query interface located at http://sws.ifi.uio.no/d2rq/snorql/. The dataset being queried is an RDF representation of a traditional relational database containing facts about the countries, cities, continents and so on of the world [1], similar to the RDF we wrote in the previous exercise.

By browsing the RDF representation in a web browser, e.g., the page for Stavanger, you can view the dataset in a human-readable form and see, for example, which properties the different types of resources have. We will come back to this database system in later exercises.

For each of the exercises below write a SPARQL query which returns the desired result when executed on the endpoint.

1.1 Exercise, Getting started

First, to get you started, using a web browser go to the address http://sws.ifi.uio.no/d2rq/snorql/. In the text area on this page you should see the SPARQL query

SELECT DISTINCT * WHERE {
  ?s ?p ?o
}
LIMIT 10

Without hesitation, press the "Go!" button.

In less than a second you should see the results of the query execution. The query asks for any 10 distinct triples from the dataset. The result I got was the following table. Note that the results you get might very well not be the same!

s                              p           o
db:District/AFG/Kabol          rdfs:label  "Kabol"
db:District/AFG/Qandahar       rdfs:label  "Qandahar"
db:District/AFG/Herat          rdfs:label  "Herat"
db:District/AFG/Balkh          rdfs:label  "Balkh"
db:District/NLD/Noord-Holland  rdfs:label  "Noord-Holland"
db:District/NLD/Zuid-Holland   rdfs:label  "Zuid-Holland"
db:District/NLD/Utrecht        rdfs:label  "Utrecht"
db:District/NLD/Noord-Brabant  rdfs:label  "Noord-Brabant"
db:District/NLD/Groningen      rdfs:label  "Groningen"
db:District/NLD/Gelderland     rdfs:label  "Gelderland"

1.2 Exercise

List all continents.

Formulated in a more RDF-friendly way, this exercise would be "select everything which is of type world:Continent", or perhaps even more friendly: "select all the subjects of triples where rdf:type is the predicate and world:Continent is the object."

1.2.1 Solution

SELECT ?continent
WHERE {
  ?continent rdf:type world:Continent .
}

Click to run query. [2]

1.3 Exercise

What is the name of the capital of Albania?

The predicate world:hasCapital connects a country with its capital. The identifier for the resource Albania is

 <http://sws.ifi.uio.no/d2rq/resource/Country/ALB>

1.3.1 Solution

The exercise was to list the name of the capital (and not the identifier), so we first need to get the identifier for the capital of Albania, which will be bound to the variable ?capital. We then match the triple connecting the capital resource to its name, which we do with the predicate world:hasName.

SELECT ?capital_name
WHERE {
  <http://sws.ifi.uio.no/d2rq/resource/Country/ALB> world:hasCapital ?capital .
  ?capital world:hasName ?capital_name .
}

Click to run query.

Note that it is not possible to use the prefix

PREFIX worlddata: <http://sws.ifi.uio.no/d2rq/resource/>

to select the resource representing Albania, i.e.,

worlddata:Country/ALB

This is because forward slashes (/) are not allowed in local names, i.e., the part of the identifier following the prefix.
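
One way around this restriction is to let the prefix itself end after the last slash, so that the local name contains no slash at all. The following is a minimal sketch of that idea; the prefix name country: is made up for this illustration and is not defined by the endpoint, and world: is assumed to be predeclared as in the solution above.

```sparql
# Hypothetical prefix: the namespace ends after "Country/", so the
# local name "ALB" contains no forward slash.
PREFIX country: <http://sws.ifi.uio.no/d2rq/resource/Country/>

SELECT ?capital_name
WHERE {
  country:ALB world:hasCapital ?capital .
  ?capital world:hasName ?capital_name .
}
```

SPARQL 1.1 additionally allows a slash in a local name if it is escaped with a backslash, as in worlddata:Country\/ALB, but whether a given endpoint accepts this depends on the SPARQL version it implements.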

1.4 Exercise

List all the names of cities which have a population of more than 5,000,000.

The predicate connecting a city to its population is world:hasCityPopulation, and world:hasName connects it to its name.

1.4.1 Solution

The trick here is to use FILTER to restrict the output of a query.

SELECT ?city_name
WHERE {
  ?city a world:City ;
        world:hasCityPopulation ?pop ;
        world:hasName ?city_name .
  FILTER(?pop > 5000000)
}

Click to run query.

1.5 Exercise

List all the names of Chinese cities which have a population of more than 5,000,000.

The predicate connecting a city to its country is world:isCityInCountry. The identifier for China is

 <http://sws.ifi.uio.no/d2rq/resource/Country/CHN>

1.5.1 Solution

In this query we need to combine the lessons learnt from the two previous queries.

SELECT ?city_name
WHERE {
  ?city a world:City ;
        world:hasCityPopulation ?pop ;
        world:hasName ?city_name ;
        world:isCityInCountry <http://sws.ifi.uio.no/d2rq/resource/Country/CHN> .
  FILTER(?pop > 5000000)
}

Click to run query.

1.6 Exercise

List all unique government forms.

1.6.1 Solution

Use DISTINCT to only list the unique answers.

SELECT DISTINCT ?government_form
WHERE {
  ?x world:hasGovernmentForm ?government_form
}

Click to run query.

1.7 Exercise

List all the countries that lie in more than one continent.

The predicate connecting a country to its continents is world:isCountryInContinent.

1.7.1 Solution

A query that solves this exercise requires that each output country is connected to two continents and, importantly, that these two continents are not the same. This is done in the query by requiring that ?continent1 is different from (!=) ?continent2.

SELECT ?country
WHERE {
  ?country world:isCountryInContinent ?continent1, ?continent2 .
  FILTER(?continent1 != ?continent2)
}

Click to run query.

The result of the query should be empty: in this dataset no country is assigned to more than one continent.
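
On an endpoint that supports SPARQL 1.1 aggregates, the same question can be phrased by counting continents per country. This is only a sketch of an alternative, not the solution intended by the exercise; it assumes world: is predeclared as above and that the endpoint accepts GROUP BY with HAVING.

```sparql
# SPARQL 1.1 sketch: keep the countries that are linked to more than
# one distinct continent. Each country is listed at most once.
SELECT ?country
WHERE {
  ?country world:isCountryInContinent ?continent .
}
GROUP BY ?country
HAVING (COUNT(DISTINCT ?continent) > 1)
```

Unlike the pairwise comparison, this formulation does not return a country once for every ordered pair of its continents.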

1.8 Exercise

List all continents with the number of countries they contain.

Tip: Use the function count and GROUP BY. Aggregates such as count are not part of the SPARQL 1.0 standard (they were standardised in SPARQL 1.1), but they are often found as extensions in the various implementations of the standard.

1.8.1 Solution

SELECT ?continent count(?country)
WHERE {
  ?continent a world:Continent .
  ?country world:isCountryInContinent ?continent .
}
GROUP BY ?continent

Click to run query.
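
For comparison, here is how the same query might look in standard SPARQL 1.1 syntax, where an aggregate in the SELECT clause must be wrapped in parentheses and given a name with AS. The variable name ?country_count is chosen for this illustration, and whether the endpoint accepts this form depends on the SPARQL version it implements.

```sparql
# SPARQL 1.1 sketch: count the countries per continent and name the
# result column explicitly. Assumes world: is predeclared.
SELECT ?continent (COUNT(?country) AS ?country_count)
WHERE {
  ?continent a world:Continent .
  ?country world:isCountryInContinent ?continent .
}
GROUP BY ?continent
ORDER BY DESC(?country_count)
```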

1.9 Exercise

List all countries which are not independent, i.e., have no independence year (world:hasIndependenceYear).

1.9.1 Solution

This is a tricky one. Since NOT EXISTS is not part of the SPARQL 1.0 language, due to what is known as the open world assumption, this has no straightforward solution like an SQL solution to this question would. On the one hand, this question does not make sense in an open world, as there is an infinite number of possibilities to check before a positive answer can be definitive. On the other hand, we are querying a finite set of triples, and it is reasonable to ask whether there is something in the data that does not have some property, e.g., countries which do not have a year of independence.

The solution is to make the property optional and then keep only the resources for which this property is not bound.

SELECT ?country_name
WHERE {
  ?country a world:Country ;
           world:hasName ?country_name .
  OPTIONAL {
    ?country world:hasIndependenceYear ?year .
  }
  FILTER (!bound(?year))
}
ORDER BY ?country_name

Click to run query.
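
SPARQL 1.1 later added a negation construct, FILTER NOT EXISTS, which expresses the same idea more directly. The sketch below assumes an endpoint that supports SPARQL 1.1 and the predeclared world: prefix; it is meant as a comparison, not as a replacement for the OPTIONAL pattern.

```sparql
# SPARQL 1.1 sketch: keep a country only if no independence-year
# triple exists for it in the queried data.
SELECT ?country_name
WHERE {
  ?country a world:Country ;
           world:hasName ?country_name .
  FILTER NOT EXISTS { ?country world:hasIndependenceYear ?year }
}
ORDER BY ?country_name
```

Both formulations test only what is present in the finite set of triples being queried, so the open world caveat discussed above applies equally to each.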

1.10 Exercise

List all unique government forms with the country which has the maximum value of GNP for this government form. Order the output by the GNP value, the maximum on top.

1.10.1 Solution

This is similar to the NOT EXISTS solution above. There is no MAX in SPARQL 1.0, so we need to select the countries for which there is no other country with the same form of government and a greater GNP.

Sorting of the output is done with ORDER BY followed by the variables to sort by. Descending sorting order is achieved with DESC.

SELECT ?government ?country ?gnp
WHERE {
  ?country world:hasGovernmentForm ?government ;
           world:hasGNP ?gnp .
  OPTIONAL {
    ?other_country world:hasGovernmentForm ?government ;
                   world:hasGNP ?other_gnp .
    FILTER (?gnp < ?other_gnp)
  }
  FILTER (!bound(?other_country))
}
ORDER BY DESC(?gnp)

Click to run query.
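
With SPARQL 1.1 aggregates and subqueries, the maximum can instead be computed directly. The following sketch assumes an endpoint that supports both features and the predeclared world: prefix; the inner variable names ?c and ?any_gnp are chosen for this illustration.

```sparql
# SPARQL 1.1 sketch: the subquery finds the highest GNP for each form
# of government, and the outer pattern finds the countries that have
# exactly that GNP.
SELECT ?government ?country ?gnp
WHERE {
  {
    SELECT ?government (MAX(?any_gnp) AS ?gnp)
    WHERE {
      ?c world:hasGovernmentForm ?government ;
         world:hasGNP ?any_gnp .
    }
    GROUP BY ?government
  }
  ?country world:hasGovernmentForm ?government ;
           world:hasGNP ?gnp .
}
ORDER BY DESC(?gnp)
```

As with the OPTIONAL-based solution, several countries are returned for a form of government if they share the same maximum GNP.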

Footnotes:

[1] The database is the same one that MySQL provides to their users for experimentation, see http://dev.mysql.com/doc/world-setup/en/world-setup.html. The sample data used in the world database is Copyright Statistics Finland, http://www.stat.fi/worldinfigures.

[2] Sadly, this will only work if you're reading a digital version of this document.

Author: Martin G. Skjæveland <martige@ifi.uio.no>

Date: 2010-05-31 08:27:12 CEST
