Exercises tutorial Semantic Days 2012

1 Installing software

Your first exercise is to install the software we will be using on your local computer. We will try to have the software available on CDs or memory sticks.

1.1 Protégé

Install the latest version of Protégé 4.1. Go to Protégé's download site and select the correct version for your system.

1.2 D2R Server

Install a Java runtime environment.

Download D2R Server. We will go through the installation in later exercises.

2 RDF

In these exercises we will use the RDF serialisation format Turtle to write RDF.

2.1 Exercise

2.1.1 Getting started

Open a plain text editor of your choice, e.g., Notepad, TextPad or gedit, and start the file with the following prefix declarations:

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://www.example.org#> .
@prefix w: <http://sws.ifi.uio.no/ont/world.owl#> .

2.1.2 Triples

Continue by adding triples that capture the statements:

  • Norway is called "Norway", using the predicate rdfs:label,
  • Oslo is called "Oslo",
  • Oslo is the capital of Norway—use the predicate w:isCapitalOfCountry,
  • Stavanger is called "Stavanger", and
  • Stavanger is a city in Norway—use the predicate w:isCityInCountry.

Use the namespace prefix ex: for the resources Norway, Oslo and Stavanger, e.g., ex:Norway.
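
To make the shape of a Turtle statement concrete, here is a minimal Java sketch that prints two statements. It is only a toy string helper, not part of any RDF library, and the resources ex:Sweden and ex:Gothenburg are hypothetical names chosen for illustration. Each statement is a subject, a predicate and an object, terminated by " ."; names are written as strings in double quotes.

```java
// Toy helper for illustration only; it is not part of any RDF library.
class TurtleSketch {
    /** Joins subject, predicate and object into one Turtle statement ending in " ." */
    static String triple(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object + " .";
    }

    /** Wraps plain text in double quotes (escaping \ and ") to form a string literal. */
    static String literal(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // ex:Sweden and ex:Gothenburg are hypothetical resources.
        System.out.println(triple("ex:Sweden", "rdfs:label", literal("Sweden")));
        System.out.println(triple("ex:Gothenburg", "w:isCityInCountry", "ex:Sweden"));
    }
}
```

The first printed line has a literal as its object, the second a resource. The statements in this exercise follow the same two shapes.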

2.1.3 Validate

Validate your finished RDF file using the RDF Validator and Converter. Paste the contents of your RDF file in the text area on the website and set the input format drop-down menu to "Notation 3 (or N-Triples/Turtle)" and click "Validate!". Sort out any errors in your RDF "code" that the validator reports.

2.1.4 Visualise

When your RDF validates, the website will, in addition to giving you a thumbs up, return an RDF/XML rendering of your file. Copy this RDF/XML, open the W3C's RDF validator, and paste the RDF/XML into the text area. Under Display Result Options select "Triples and Graph" and click "Parse RDF".

2.2 Exercise

Add more triples expressing that

  • Stavanger is a City, and Rogaland is a Region (use the predicate rdf:type and the resources w:City and w:Region),
  • Rogaland is a region in Norway, and
  • Stavanger is a city in Rogaland.

Create predicates similar to those in the previous exercise, e.g., isCapitalOfCountry, to capture the last two bullet points.

Make sure your extended RDF file validates.

2.3 Exercise

Further extend your RDF file to contain that:

Norway is a country with a population of 4985870. The head of state is "King Harald V". Norway has two local names, one in the language "Norwegian bokmål" (language code @nb): "Norge", and one in the language "Norwegian nynorsk" (@nn): "Noreg".

Again, create new predicates for the relations between Norway and the information about Norway. It is natural to use literals for the RDF representation of the statements; try also to specify the datatype or language of the literals where appropriate.
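
As a reminder of the two literal forms Turtle offers, the following Java sketch formats a typed literal and a language-tagged literal. The helper names and the values "42" and "Sverige" (Swedish, language code sv) are assumptions made for illustration and are not part of the exercise data.

```java
// Toy helper for illustration only; it is not part of any RDF library.
class LiteralSketch {
    /** Typed literal: the quoted lexical form followed by ^^ and the datatype. */
    static String typed(String lexical, String datatype) {
        return "\"" + lexical + "\"^^" + datatype;
    }

    /** Language-tagged literal: the quoted text followed by @ and the language code. */
    static String tagged(String text, String languageCode) {
        return "\"" + text + "\"@" + languageCode;
    }

    public static void main(String[] args) {
        System.out.println(typed("42", "xsd:integer")); // a number with its datatype
        System.out.println(tagged("Sverige", "sv"));    // a name with its language
    }
}
```

A literal carries either a datatype or a language tag, never both.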

3 SPARQL

In these exercises we will write SPARQL queries and execute them in a SPARQL query interface located at http://sws.ifi.uio.no/snorql/world/. The dataset which is queried is an RDF representation of a traditional relational database containing facts about countries, cities, continents and so on in the world [1], similar to the RDF we wrote in the previous exercise.

By using the web browser interface to the RDF representation, e.g., Stavanger, you can look at the dataset in a human-readable way and see, e.g., what properties the different types of resources have. We will come back to this database system in later exercises.

For each of the exercises below write a SPARQL query which returns the desired result when executed on the endpoint.

3.1 Exercise, Getting started

First, to get you started, open the address http://sws.ifi.uio.no/snorql/world/ in a web browser. In the text area on this page you should see the SPARQL query

SELECT DISTINCT * WHERE {
  ?s ?p ?o
}
LIMIT 10

Press the "Go!" button to execute it.

In less than a second you should see the results of the query execution. The query asks for any 10 distinct triples from the dataset. The result I got was the following table. Note that the results you get might very well not be the same.

s                              p           o
db:District/AFG/Kabol          rdfs:label  "Kabol"
db:District/AFG/Qandahar       rdfs:label  "Qandahar"
db:District/AFG/Herat          rdfs:label  "Herat"
db:District/AFG/Balkh          rdfs:label  "Balkh"
db:District/NLD/Noord-Holland  rdfs:label  "Noord-Holland"
db:District/NLD/Zuid-Holland   rdfs:label  "Zuid-Holland"
db:District/NLD/Utrecht        rdfs:label  "Utrecht"
db:District/NLD/Noord-Brabant  rdfs:label  "Noord-Brabant"
db:District/NLD/Groningen      rdfs:label  "Groningen"
db:District/NLD/Gelderland     rdfs:label  "Gelderland"

3.2 Exercise

List all continents.

Formulated in a more RDF-friendly way, this exercise would be "select everything which is of type world:Continent", or perhaps even more friendly: "select all the subjects of triples where rdf:type is the predicate and world:Continent is the object."
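
The idea of "selecting the subjects of triples with a given predicate and object" can be sketched in a few lines of Java. This is a toy in-memory illustration of what a triple pattern such as ?s rdf:type ex:City matches, not how a SPARQL engine is implemented. The sample triples and the ex: names are hypothetical and are not taken from the world dataset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration of triple-pattern matching over an in-memory list of triples.
class PatternSketch {
    /** Returns the subjects of all triples whose predicate and object equal the given values. */
    static List<String> subjectsOf(List<String[]> triples, String predicate, String object) {
        List<String> subjects = new ArrayList<>();
        for (String[] t : triples) {
            // t[0] = subject, t[1] = predicate, t[2] = object
            if (t[1].equals(predicate) && t[2].equals(object)) {
                subjects.add(t[0]);
            }
        }
        return subjects;
    }

    /** Hypothetical sample data, used only for this illustration. */
    static List<String[]> sampleTriples() {
        return Arrays.asList(
            new String[] {"ex:Oslo", "rdf:type", "ex:City"},
            new String[] {"ex:Oslo", "rdfs:label", "\"Oslo\""},
            new String[] {"ex:Norway", "rdf:type", "ex:Country"},
            new String[] {"ex:Stavanger", "rdf:type", "ex:City"});
    }

    public static void main(String[] args) {
        // Corresponds to the triple pattern:  ?s rdf:type ex:City
        System.out.println(subjectsOf(sampleTriples(), "rdf:type", "ex:City"));
    }
}
```

In SPARQL the variable ?s plays the role of the slot that is left open, and the fixed predicate and object play the role of the two method arguments.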

3.3 Exercise

What is the name of the capital of Albania?

The predicate world:hasCapital connects a country with its capital. The identifier for the resource Albania is

 <http://sws.ifi.uio.no/d2rq/resource/Country/ALB>

3.4 Exercise

List all the names of cities which have a population of more than 5,000,000.

The predicate connecting a city to its population is world:hasCityPopulation, and world:hasName connects it to its name.

3.5 Exercise

List all the names of Chinese cities which have a population of more than 5,000,000.

The predicate connecting a city to its country is world:isCityInCountry. The identifier for China is

 <http://sws.ifi.uio.no/d2rq/resource/Country/CHN>

3.6 Exercise

List all unique government forms.

3.7 Exercise

List all the countries that lie in more than one continent.

The predicate connecting a country to its continents is world:isCountryInContinent.

3.8 Exercise

List all continents with the number of countries they contain.

Tip: Use the function count and GROUP BY.
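
If grouping and counting is unfamiliar, this Java sketch shows the underlying idea: the rows are partitioned by a key and each partition is counted. It is a plain in-memory analogy, not SPARQL, and the city/country pairs are hypothetical sample data rather than results from the endpoint.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy analogy for GROUP BY with count, using an in-memory array of rows.
class GroupCountSketch {
    /** Counts rows per group key; each row is {item, groupKey}. */
    static Map<String, Integer> countPerGroup(String[][] rows) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps the keys sorted
        for (String[] row : rows) {
            counts.merge(row[1], 1, Integer::sum); // add 1 to the tally of this key
        }
        return counts;
    }

    /** Hypothetical sample rows: {city, country}. */
    static String[][] sampleRows() {
        return new String[][] {
            {"ex:Oslo", "ex:Norway"},
            {"ex:Stavanger", "ex:Norway"},
            {"ex:Gothenburg", "ex:Sweden"}
        };
    }

    public static void main(String[] args) {
        System.out.println(countPerGroup(sampleRows()));
    }
}
```

In a query, the grouping key is the variable named after GROUP BY, and count is applied to the variable whose bindings should be tallied within each group.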

3.9 Exercise

List all countries which are not independent, i.e., have no independence year (world:hasIndependenceYear).

3.10 Exercise

List all unique government forms with the country which has the maximum value of GNP for this government form. Order the output by the GNP value, the maximum on top.

4 OWL

In these exercises we will create an ontology which defines some of the vocabulary used in the world database we queried in the previous exercise.

The first exercise consists of some simple modelling tasks in Protégé: making classes, subclasses, setting domain and range and so on.

4.1 Exercise

This exercise is a walk-through of how to get started with creating and editing ontologies in Protégé, showing the basic concepts. [2]

4.1.1 Getting started with Protégé

  1. Open Protégé and choose to "Create new OWL ontology".
  2. Set the Ontology IRI to http://sws.ifi.uio.no/ont/world.owl.
  3. Choose a location on your local computer to save your ontology, anywhere will do.
  4. Set the Ontology Format to RDF/XML.

4.1.2 Create classes, object properties and data properties

To create a class, select the "Classes" tab, select the class "Thing" and click the "Add subclass" button immediately above "Thing". Create new subclasses of Thing:

  • Country,
  • City and
  • Region.

Repeat the process for object properties and data properties:

Object properties:

  • isCityInCountry,
  • isCapitalOfCountry,
  • isCityInRegion and
  • isRegionInCountry.

Data properties:

  • hasPopulation,
  • hasHeadOfState and
  • hasLocalName.

4.1.3 Creating subclasses and subproperties

State that a capital of a country is always a city in the same country.

This can be done by making isCapitalOfCountry a subproperty of isCityInCountry. In Protégé it is done by selecting isCapitalOfCountry and adding isCityInCountry as a superproperty, or by dragging isCapitalOfCountry onto isCityInCountry in the Object property hierarchy frame.

The process of creating subclasses and subproperties for data properties is similar.
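
What a reasoner derives from a subproperty axiom can be illustrated with a small Java sketch: every triple that uses the subproperty also holds for the superproperty. This is a toy illustration of that single entailment under the assumption that triples are plain string arrays. It is not how Protégé or an OWL reasoner is implemented, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration of the entailment behind a subproperty axiom.
class SubPropertySketch {
    /** Returns the input triples plus, for every (s, subProp, o), the entailed (s, superProp, o). */
    static List<String[]> withInferred(List<String[]> triples, String subProp, String superProp) {
        List<String[]> result = new ArrayList<>(triples);
        for (String[] t : triples) {
            if (t[1].equals(subProp)) {
                result.add(new String[] {t[0], superProp, t[2]});
            }
        }
        return result;
    }

    /** Sample data: one asserted triple using the subproperty. */
    static List<String[]> sampleTriples() {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"ex:Oslo", "w:isCapitalOfCountry", "ex:Norway"});
        return triples;
    }

    public static void main(String[] args) {
        for (String[] t : withInferred(sampleTriples(), "w:isCapitalOfCountry", "w:isCityInCountry")) {
            System.out.println(String.join(" ", Arrays.asList(t)));
        }
    }
}
```

The second printed triple is not stated anywhere in the data; it follows only from the subproperty axiom.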

Create a new class "CityState", and make it a subclass of both Country and City. [3]

4.1.4 Set domain and range for property

Specify the correct domain and range for the object property isCityInCountry. The domain should be City and the range should be Country. In Protégé, select the property and add domain and range in the Description frame.

Specify also the correct domain and range for isRegionInCountry and hasHeadOfState.

4.1.5 Disjoint classes

State that a city is not a region, and vice versa. This is done by making the two classes disjoint. Disjoint classes cannot share any members. In Protégé select one of the classes and add the other class as a disjoint class.

4.1.6 Adding more restrictions

State that a city lies in exactly one country. Specify this by adding an anonymous superclass to City. Select City, and click to add a new superclass. In the box that appears, select the "Class expression editor" and write (names are case-sensitive):

isCityInCountry exactly 1 Country

State also that a city lies in not more than one region. Tip: use max.

4.2 Exercise

If you have skipped the previous exercise, you can get up to speed by downloading the ontology file world.1.owl.

Download and open the dump of the RDF world database in Protégé. Import the ontology you created in the previous exercise. Add more axioms to the ontology:

  1. Create a new object property hasCapital and state that it is the inverse of isCapitalOfCountry.
  2. Define the class Capital such that it contains all capitals. In the reasoner menu, select a reasoner, wait for the reasoner to calculate classifications and check if all capitals are inferred as members of Capital.
  3. Define the class Metropolis such that it contains all cities with a population of more than 1,000,000. Apply reasoning and see by the results of the classification if you have modelled correctly.
  4. State that isCountryInContinent is the property chain
    isCountryInRegion o isRegionInContinent
    
  5. Define a class DevelopingCountry as a country which has low life expectancy, e.g., 45 years, and a low GNP, e.g., 10000. Apply reasoning and see if it looks correct.
  6. Define a class DevelopedCountry to be the class of countries which are not a developing country. Again, use reasoning to check the results; are they what you expected?
  7. Set hasGNP and hasLifeExpectancy to be functional properties. Apply reasoning [4], check the members of the class DevelopedCountry and explain the effects.
  8. Define AmericanCity as a city which lies on the American continent.
  9. Can you add the necessary axiom(s) such that Singapore is inferred as a member of CityState?
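
The property chain in item 4 can be read as a join: whenever x is related to y by the first property and y is related to z by the second, x is related to z by the chained property. The Java sketch below illustrates only this reading on an in-memory list. It is a toy, not a reasoner, and the resources ex:Norway, ex:NordicCountries and ex:Europe are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration of a two-step property chain as a join over triples.
class ChainSketch {
    /** Returns pairs {x, z} such that (x first y) and (y second z) both occur in the triples. */
    static List<String[]> chain(List<String[]> triples, String first, String second) {
        List<String[]> pairs = new ArrayList<>();
        for (String[] a : triples) {
            if (!a[1].equals(first)) {
                continue;
            }
            for (String[] b : triples) {
                // The object of the first step must be the subject of the second step.
                if (b[1].equals(second) && b[0].equals(a[2])) {
                    pairs.add(new String[] {a[0], b[2]});
                }
            }
        }
        return pairs;
    }

    /** Hypothetical sample data, used only for this illustration. */
    static List<String[]> sampleTriples() {
        return Arrays.asList(
            new String[] {"ex:Norway", "isCountryInRegion", "ex:NordicCountries"},
            new String[] {"ex:NordicCountries", "isRegionInContinent", "ex:Europe"});
    }

    public static void main(String[] args) {
        for (String[] p : chain(sampleTriples(), "isCountryInRegion", "isRegionInContinent")) {
            System.out.println(p[0] + " isCountryInContinent " + p[1]);
        }
    }
}
```

With the chain axiom in place, a reasoner would add the corresponding isCountryInContinent statements to the inferred model without them being asserted.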

5 D2R

5.1 Exercise: Set up a D2R server

This is a walk-through of how to set up a D2R server with the world database. The steps are tested on both Windows Vista™ Ultimate and Ubuntu Linux.

  1. Download the D2R software: http://d2rq.org/
  2. In the meantime, read the guide at http://d2rq.org/getting-started. These are the instructions we will follow.
  3. Extract the downloaded archive into a suitable location.
  4. To be able to translate the data from a relational database format to RDF, the D2R server needs a mapping. Luckily, D2R is capable of generating a mapping based on the database schema. Change into the D2R Server directory and run:
    generate-mapping -o mapping.n3 -u testinf3580 -p testinf3580 jdbc:mysql://db4free.net/testinf3580
    

    The D2R server connects to the MySQL database and creates a mapping based on the database schema. This may take a few seconds because of the network communication with the database server.

  5. The command writes the generated mapping to the file mapping.n3.
  6. Start the D2R server with the command
    d2r-server mapping.n3
    
  7. Wait until you get the message
    [[[ Server started at http://localhost:2020/ ]]]
    
  8. Open http://localhost:2020 in your web browser.
  9. That's it!

Note that the D2R server you have just set up will be slower than the server at sws.ifi.uio.no. Your server needs to communicate over the Internet with the external database, while the D2R server at sws.ifi.uio.no communicates with a local database. It is quite easy to set up a MySQL database running on your local computer. A dump of the world database can be downloaded from sws.ifi.uio.no.

5.2 Exercise

You will notice that the data in the server you have set up differs from the data in the server running at http://sws.ifi.uio.no/d2rq/. This is because we have changed the generated mapping file slightly: extracting continents, districts and regions into their own classes, changing property names and adding datatypes to literals.

The mapping is changed by following the D2R Mapping Language specification.

The current mapping file for the world database at http://sws.ifi.uio.no/d2rq/ is sws_mapping.n3, which is available for download.

Download the mapping file and restart your D2R server with this mapping.

5.3 Exercise

Download the jar file D2RQueryEngine from the download catalogue on the tutorial homepage. The program reads a D2R server dataset and an ontology and applies reasoning to the combined knowledge base. Then, a query is sent to the dataset with the inferred triples and output is written to the console/standard out. The program reads a D2R mapping file, an ontology file and a query, and is executed like this:

java -jar D2RQueryEngine mapping.n3 http://sws.ifi.uio.no/ont/world.owl query.rq

Write a query which lists all capitals which are metropolises and run the query with the D2RQueryEngine as shown above.

5.3.1 D2RQueryEngine

The Java code for the D2RQueryEngine program is listed below. If you know a little Java you will see that it is quite simple to get started with programming of semantic technologies.

Import the necessary external classes. All but the last two are from Jena. The last two make it possible to connect to a D2R server and to use the Pellet reasoner, respectively.

import com.hp.hpl.jena.ontology.*;
import com.hp.hpl.jena.query.*;
import com.hp.hpl.jena.reasoner.*;
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.util.*;
import de.fuberlin.wiwiss.d2rq.ModelD2RQ;
import org.mindswap.pellet.jena.PelletReasonerFactory;

public class D2RQueryEngine {

A method which creates a Jena model, i.e., a representation of an RDF graph, by reading from file:

  public Model readModel(String file) {
    return FileManager.get().loadModel(file);
  }

A method which takes a query object and a model object, queries the model with the query according to the type of the query, SELECT, CONSTRUCT or ASK, and returns the results accordingly:

  protected void queryModel(Query query, Model model) {
    QueryExecution qexec = QueryExecutionFactory.create(query, model);

    if (query.isSelectType()) {
      // SELECT: print the result table
      ResultSet rs = qexec.execSelect();
      ResultSetFormatter.out(rs, query);
    } else if (query.isConstructType()) {
      // CONSTRUCT: write the resulting graph as Turtle
      Model result = qexec.execConstruct();
      result.write(System.out, "TTL");
    } else if (query.isAskType()) {
      // ASK: print true or false
      boolean result = qexec.execAsk();
      System.out.println(result);
    } else {
      System.err.println("Error!");
    }
    qexec.close();
  }

A method which runs the whole shebang:

  1. Reads input, i.e., the path to a D2R mapping file, an ontology and a query.
  2. Creates a model with an attached Pellet reasoner and the ontology.
  3. Sets the type of ontology model.
  4. Adds the D2R data, "automatically" causing reasoning and inferred triples to be added to the model.
  5. Queries the model.

  public void run(String d2r, String ont, String q) {

    // Read input
    Model d2rData = new ModelD2RQ(readModel(d2r), null);
    Model ontData = readModel(ont);
    Query query = QueryFactory.read(q);

    // Create ontology model
    Reasoner reasoner = PelletReasonerFactory.theInstance().create();
    InfModel infModel = ModelFactory.createInfModel(reasoner, ontData);
    OntModelSpec spec = new OntModelSpec(OntModelSpec.OWL_MEM);
    OntModel ontModel = ModelFactory.createOntologyModel(spec, infModel);

    // Add d2r data
    ontModel.add(d2rData);

    // Query model and write results to stdout
    queryModel(query, ontModel);
  }

main:

  public static void main(String[] args) {
    D2RQueryEngine dave = new D2RQueryEngine();
    dave.run(args[0], args[1], args[2]);
  }
} // end class

Footnotes:

1 The database is the same that MySQL provides to their users for experimentation, see http://dev.mysql.com/doc/world-setup/en/world-setup.html. The sample data used in the world database is Copyright Statistics Finland, http://www.stat.fi/worldinfigures.

2 An in-depth tutorial for Protégé is developed by the University of Manchester and is available online: http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/

3 A possible member of this class could be Singapore, although in the world database the city Singapore and the country Singapore are not the same individual.

4 In my experience, the reasoner FaCT++ tackled this task better than Pellet.

Author: Martin G. Skjæveland

Date: 2012-05-08 10:03:59 CEST
