FoafContradictions

(As announced in the 2003-07-13 FOAF weblog.)

Q: If I can say what I like in a FOAF file, even say nothing, and if I can use any semantic web vocabularies at all, all mixed together, how can we ever know if a FOAF file is 'wrong' (broken, in error)?

A: There are a few different answers here...

  * Recall that all FOAF files are RDF documents, and as such must obey the basic file-format structures imposed by the RDF/XML specification from W3C. This means we can use W3C's RDF Validator to do some basic checking of the format of an RDF document.
  * But what if we want to know whether the document is wrong about the world? In the general case, there is no solution. Machines can't generally know what state the world needs to be in to match the claims made in an RDF document. They don't know your name, they don't know your age, they don't know who your friends are.
  * What if we want to do consistency checking, looking for documents that contradict themselves? Can machines help us here? Yes!
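The first kind of check, at the file-format level, can be approximated locally. The sketch below only tests XML well-formedness using Python's standard library, which is a much weaker test than the one the W3C RDF Validator performs. The sample document and the `is_well_formed` helper name are illustrative assumptions, not part of FOAF or of any W3C tool.

```python
import xml.etree.ElementTree as ET

# A small FOAF document, wrapped in rdf:RDF with namespace declarations
# (an illustrative sample, not taken from any real FOAF file).
FOAF_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:nick>danbri</foaf:nick>
  </foaf:Person>
</rdf:RDF>"""


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML. RDF/XML rules are not checked."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed(FOAF_DOC))
    # Dropping the closing tag makes the document ill-formed.
    print(is_well_formed(FOAF_DOC.replace("</rdf:RDF>", "")))
```

A document that passes this test can still be invalid RDF/XML, so a check like this is only a first filter.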

FOAF consistency checking...

An important point about computers: they know nothing about the world, but if you give them some rules about descriptions of the world, they can help you detect when you have contradicted yourself.

(This example comes from an IRC conversation with Masaka.)

So, in FOAF we have the notion of a Person, and a Document. The FOAF specification says that foaf:Person and foaf:Document are 'disjoint with' each other. In other words, the definitions of foaf:Person and foaf:Document explain, in a machine-processable way, that there are no things in the world that are simultaneously documents and people. This captures some small part of the meaning of these concepts, and allows machines to check our usage of them.

Imagine we write the following RDF:

<foaf:Person>
  <foaf:nick>danbri</foaf:nick>
  <foaf:topic>
    <foaf:Person>
      <foaf:name>Elvis Presley</foaf:name>
    </foaf:Person>
  </foaf:topic>
</foaf:Person>

This says that there exists a Person with a foaf:nick of 'danbri', and that this Person has as a foaf:topic a thing that is a Person with a foaf:name of 'Elvis Presley'.
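One way to see exactly what the markup claims is to write it out as subject, property, value triples. The following is a hand-written sketch, not the output of an RDF parser: the labels `_:a` and `_:b` are made-up names for the two unnamed things, and literal values are represented as plain Python strings.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# "_:a" and "_:b" are arbitrary labels for the two unnamed things described.
triples = [
    ("_:a", RDF_TYPE, FOAF + "Person"),
    ("_:a", FOAF + "nick", "danbri"),
    ("_:a", FOAF + "topic", "_:b"),
    ("_:b", RDF_TYPE, FOAF + "Person"),
    ("_:b", FOAF + "name", "Elvis Presley"),
]

if __name__ == "__main__":
    for subject, prop, value in triples:
        print(subject, prop, value)
```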

Is this a sensible thing to say? Could there be an arrangement of the world in which this description was true?

To find out, we need to consult the Web to learn the meaning of the terms used. This is how the Semantic Web works: the Web itself serves as a dictionary which contains descriptions of the terms used to create Semantic Web data files such as FOAF.

So if we look up the definition of 'topic' in FOAF at http://xmlns.com/foaf/0.1/ we find:

<rdf:Property rdf:about="http://xmlns.com/foaf/0.1/topic"
  rdfs:label="topic"
  rdfs:comment="A topic of some page or document.">
  <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  <owl:inverseOf rdf:resource="http://xmlns.com/foaf/0.1/page"/>
  <rdfs:isDefinedBy rdf:resource="http://xmlns.com/foaf/0.1/"/>
</rdf:Property>

What does this tell us? It says "there is a thing that is a property, whose URI is http://xmlns.com/foaf/0.1/topic and which indicates the topic of some page or document". The bit that describes the 'range' and 'domain' is interesting. These are special RDF concepts that indicate something of the meaning of a property.
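A program can read the same definition. The sketch below pulls the domain and range out of the foaf:topic definition quoted above, using only Python's standard library. The rdf:RDF wrapper and its namespace declarations are added here so that the snippet parses on its own, and `domain_and_range` is a hypothetical helper name rather than part of any RDF toolkit.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# The foaf:topic definition, wrapped in rdf:RDF so the prefixes are declared.
SCHEMA = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Property rdf:about="http://xmlns.com/foaf/0.1/topic"
      rdfs:label="topic"
      rdfs:comment="A topic of some page or document.">
    <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Document"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
    <owl:inverseOf rdf:resource="http://xmlns.com/foaf/0.1/page"/>
    <rdfs:isDefinedBy rdf:resource="http://xmlns.com/foaf/0.1/"/>
  </rdf:Property>
</rdf:RDF>"""


def domain_and_range(schema_xml: str, property_uri: str):
    """Return the (domain, range) URIs declared for property_uri, or None where absent."""
    root = ET.fromstring(schema_xml)
    resource = f"{{{RDF}}}resource"
    for prop in root.iter(f"{{{RDF}}}Property"):
        if prop.get(f"{{{RDF}}}about") != property_uri:
            continue
        domain = prop.find(f"{{{RDFS}}}domain")
        range_ = prop.find(f"{{{RDFS}}}range")
        return (
            domain.get(resource) if domain is not None else None,
            range_.get(resource) if range_ is not None else None,
        )
    return (None, None)


if __name__ == "__main__":
    print(domain_and_range(SCHEMA, "http://xmlns.com/foaf/0.1/topic"))
```

This only handles definitions written in the exact shape shown; a real RDF parser would be needed for schemas that express the same facts with different markup.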

When we see that foaf:topic has a domain of foaf:Document, we know that anything which is claimed to have a foaf:topic has to be a foaf:Document. The only things in the world that have foaf:topics are foaf:Documents; anything with a foaf:topic is a foaf:Document.

Similarly, when we see that foaf:topic has a range of rdfs:Resource, we can conclude something (less interesting) about things that are the foaf:topic of documents. We can conclude that they must be a 'Resource'. Now since 'Resource' is the RDF word for 'thing', and since in RDF everything is a Resource, this is less interesting, but still informative.

We now know that foaf:topic is a relationship between documents and things; if foaf:topic truly relates two things together, we know that one of them is a document, and the other is a thing that the document is 'about', i.e. one of its topics.
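The domain and range reasoning above can be sketched as a small inference step: whenever a property with a known domain or range is used, add the implied class to the thing on the matching side. This is a minimal illustration, assuming triples are plain tuples and using prefixed names such as `foaf:topic` as ordinary strings; `infer_types` and the `_:page` and `_:elvis` labels are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Schema facts for foaf:topic, written with prefixed names for brevity.
DOMAIN = {"foaf:topic": "foaf:Document"}
RANGE = {"foaf:topic": "rdfs:Resource"}


def infer_types(triples):
    """Map each node to the classes stated for it or implied by domain and range."""
    types = {}
    for subject, prop, value in triples:
        if prop == RDF_TYPE:
            # An explicitly stated type.
            types.setdefault(subject, set()).add(value)
            continue
        if prop in DOMAIN:
            # Whatever has this property belongs to the property's domain.
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            # Whatever is the value of this property belongs to its range.
            types.setdefault(value, set()).add(RANGE[prop])
    return types


if __name__ == "__main__":
    # A hypothetical description: some page whose topic is some other thing.
    example = [("_:page", "foaf:topic", "_:elvis")]
    print(infer_types(example))
```

Nothing in the input says that `_:page` is a document; that conclusion comes entirely from the definition of foaf:topic.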

Typically we use this to say things like 'This is a document about Elvis Presley'.

So, revisiting our example, can we see the mistake yet?

<foaf:Person>
  <foaf:nick>danbri</foaf:nick>
  <foaf:topic>
    <foaf:Person>
      <foaf:name>Elvis Presley</foaf:name>
    </foaf:Person>
  </foaf:topic>
</foaf:Person>

This example tries to use foaf:topic as a relationship between two things of type foaf:Person.

Now that could be OK. As far as a (dumb) computer is concerned, something might be both a person and a document. After all, computers don't have common sense. To a human reader, the error is clear: nothing in the world can be both a person and a document. It doesn't make sense.

But how to help computers understand this tiny fragment of common sense, so they can help us detect mistakes in the FOAF files we write?

Here's how RDF and OWL do it:

The FOAF spec at http://xmlns.com/foaf/0.1/ contains the following markup when it defines foaf:Person:

<rdfs:Class rdf:about="http://xmlns.com/foaf/0.1/Person" 
  rdfs:label="Person"
  rdfs:comment="A person.">
   <rdfs:subClassOf rdf:resource="http://xmlns.com/wordnet/1.6/Person"/>
   <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/10/swap/pim/contact#Person"/>
   <rdfs:isDefinedBy rdf:resource="http://xmlns.com/foaf/0.1/"/>
   <owl:disjointWith rdf:resource="http://xmlns.com/foaf/0.1/Document" />
   <owl:disjointWith rdf:resource="http://xmlns.com/foaf/0.1/Organization" />
   <owl:disjointWith rdf:resource="http://xmlns.com/foaf/0.1/Project" />
</rdfs:Class>

The last few lines are the interesting bit. They say that the category ('class') of things that are people is 'owl:disjointWith' the class of things that are Documents, and likewise the classes of Organizations and Projects.

This means that there are no things in the world which can be in both categories at once.

Yet that is just what our FOAF example is claiming, implicitly.

By using the foaf:topic property to relate two foaf:Person descriptions together, we buy into the meaning of the foaf:topic property and the foaf:Person class. And when we look at the definitions FOAF provides for those terms, we realise that they can never be truly used together in those combinations.

There is simply no way in which our example could ever be true, given the definitions it relies on.
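The final step of this argument, spotting a thing that has been placed in two disjoint classes, might look like the sketch below. It assumes the types of each thing have already been collected into a dictionary (both the stated ones and those implied by property definitions), and it uses prefixed names as plain strings. `find_contradictions` and the `_:a` and `_:b` labels are hypothetical names, not part of any OWL tool.

```python
# Disjointness declared in the foaf:Person definition, using prefixed names.
DISJOINT = [
    ("foaf:Person", "foaf:Document"),
    ("foaf:Person", "foaf:Organization"),
    ("foaf:Person", "foaf:Project"),
]


def find_contradictions(types):
    """Return sorted (node, class_a, class_b) for each node in two disjoint classes."""
    found = []
    for node, classes in types.items():
        for class_a, class_b in DISJOINT:
            if class_a in classes and class_b in classes:
                found.append((node, class_a, class_b))
    return sorted(found)


if __name__ == "__main__":
    # "_:a" is stated to be a Person and, because it has a foaf:topic,
    # is also implied to be a Document. "_:b" is only a Person and a Resource.
    types = {
        "_:a": {"foaf:Person", "foaf:Document"},
        "_:b": {"foaf:Person", "rdfs:Resource"},
    }
    print(find_contradictions(types))
```

Only the thing with nick 'danbri' is reported: being the topic of something does not conflict with being a Person, but having a topic does.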

So this is how advanced RDF and OWL tools should be able to help us check the quality of some aspects of FOAF documents. Even though machines have no knowledge of the world, of people and documents and their relationships, machines can help check our claims against rules describing an RDF vocabulary. They can detect contradictions.

Right now, most RDF checking tools work purely at the file format level, helping us know whether we have arranged our XML tags correctly. As the Semantic Web effort picks up speed, we are seeing more tools that work at the 'semantic' level, helping check the simple claims that our documents make. But even then it is worth remembering that most of the meaning of these documents is beyond the reach of machines, and that human intellect is needed to comprehend in any deep way what the meaning of 'Person' or 'Document' amounts to. We don't seek to build artificial intelligence here, just to have machines help us by doing what they do best, so we can concentrate on more interesting things...