Here are two serialisations of (roughly) the same thing:
(1)
<foaf:Person>
<foaf:name>Damian</foaf:name>
<foaf:mbox rdf:resource="mailto:pldms@mac.com"/>
<foaf:knows>
<foaf:Person>
<foaf:name>Libby</foaf:name>
</foaf:Person>
</foaf:knows>
</foaf:Person>
(2)
<rdf:Description foaf:name="Damian">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<foaf:mbox rdf:resource="mailto:pldms@mac.com"/>
<foaf:knows rdf:resource="http://example.com/#libby"/>
</rdf:Description>
<foaf:Person rdf:about="http://example.com/#libby">
<foaf:name>Libby</foaf:name>
</foaf:Person>
The second is pretty perverse, but still valid. XML tools see a great difference between the two, whereas RDF tools see very little (namely that one node has gained a label). RDF tools take some getting used to, and TreeHugger is an attempt to ease the transition by letting XPath expressions navigate RDF graphs.
Suppose I want to get the name of the person that the person with
mailbox pldms@mac.com knows. In case (1) I might try:
/foaf:Person /foaf:mbox[@rdf:resource='mailto:pldms@mac.com'] /.. /foaf:knows /foaf:Person /foaf:name/text()
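In case (2) it's much harder: Libby is no longer nested inside foaf:knows, so the path has to match the rdf:resource value against an rdf:about elsewhere in the document.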
Here's another one: get the mailbox of the thing of type 'http://xmlns.com/foaf/0.1/Person' named 'Damian'. For case (2) we could try:
/rdf:Description[@foaf:name='Damian'] /rdf:type[@rdf:resource='http://xmlns.com/foaf/0.1/Person'] /.. /foaf:mbox /@rdf:resource
This won't work for case (1) of course.
Both paths, however, work for either serialisation in TreeHugger.
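To see why one path can serve both serialisations, it helps to look at the triples rather than the XML. The following is a minimal Python sketch, not TreeHugger's implementation. The triples are written out by hand, the blank node labels `_:d` and `_:l` are made up, and `walk` is a hypothetical helper that alternates class and property steps the way a striped path does.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "rdf:type"

def graph(libby):
    """Hand-written triples for the example; `libby` is the node for Libby."""
    d = "_:d"  # Damian is a blank node in both serialisations
    return {
        (d, RDF_TYPE, FOAF + "Person"),
        (d, FOAF + "name", "Damian"),
        (d, FOAF + "mbox", "mailto:pldms@mac.com"),
        (d, FOAF + "knows", libby),
        (libby, RDF_TYPE, FOAF + "Person"),
        (libby, FOAF + "name", "Libby"),
    }

case1 = graph("_:l")                        # (1): Libby is a blank node
case2 = graph("http://example.com/#libby")  # (2): Libby has gained a URI

def walk(triples, steps):
    """Follow alternating class/property steps (hypothetical helper).

    steps[0], steps[2], ... are classes that filter the current nodes;
    steps[1], steps[3], ... are properties that move to their objects.
    """
    nodes = {s for s, p, o in triples}  # start from every subject in the graph
    for i, step in enumerate(steps):
        if i % 2 == 0:
            nodes = {n for n in nodes if (n, RDF_TYPE, step) in triples}
        else:
            nodes = {o for s, p, o in triples if s in nodes and p == step}
    return nodes

# /foaf:Person/foaf:knows/foaf:Person/foaf:name
path = [FOAF + "Person", FOAF + "knows", FOAF + "Person", FOAF + "name"]
print(walk(case1, path), walk(case2, path))
```

Both calls give the same one-element set containing Libby's name, because the only difference between the two graphs is the label on Libby's node.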
TreeHugger is implemented as a Saxon extension function. The function returns a document root (about which more below). All of the following assumes that root.
| XPath | TreeHugger |
|---|---|
| / | A little complicated in the implementation, but consider it as representing the rdf graph |
| /<rdf_class> | All the subjects in the graph of type rdf class. Special cases: /rdf:Description and /rdfs:Resource, which return all subjects in the graph. Example: /rdf:Seq returns every sequence node in the graph. |
| ./<rdf_class>/ | The children of an rdf class node are all the rdf properties of that node. |
| ./<rdf_property>/ | The children of an rdf property node are all the objects such that <parent, property, child node> holds. Because of the RDF/XML striped syntax the full path will always be of the form /class/property/class/property... |
| ./<node>/@<attribute> | For an rdf class node this returns the objects of property <attribute>, e.g. ./foaf:Person/@foaf:name gives all the foaf:name values of the person. One special case is rdf:about, which gives the URI (if any) of the node, so ./rdfs:Resource[@rdf:about] selects non-blank nodes. For property nodes there is only one attribute: rdf:resource, which gives the resource objects of the property, e.g. ./rdf:type[@rdf:resource='http://xmlns.com/foaf/0.1/Person'] (used above), a verbose way of saying that a node has type foaf:Person. Warning: the implementation has a limitation when the node[@attribute='val'] form is used. If the node has more than one value for a given attribute then it is uncertain which value will be used for the match. For example, if a resource has more than one dc:title you will get unpredictable results finding it using rdf:Description[@dc:title='title']. |
| ./text() | A literal. This fits the xml usage nicely, but in rdf literals may have attributes: language and datatype. Needs work. |
| ./.. | Parent. This just backs up one level in the path, and is thus dependent on the path taken. RDF people may want to look at inv:property, described below. |
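The warning about multi-valued attributes can be made concrete with a small Python sketch. It is not TreeHugger's code; it only contrasts a match that looks at a single value with one that looks at every value. The `ex:doc` resource, its titles, and the helper names `first_value_match` and `any_value_match` are invented for illustration.

```python
DC_TITLE = "dc:title"

# A resource with two titles. RDF does not order them, so any order is possible.
values_a = [("ex:doc", DC_TITLE, "title"), ("ex:doc", DC_TITLE, "other title")]
values_b = list(reversed(values_a))

def first_value_match(triples, prop, wanted):
    """Compare only one value per subject: whichever is seen first."""
    seen = {}
    for s, p, o in triples:
        if p == prop:
            seen.setdefault(s, o)
    return {s for s, o in seen.items() if o == wanted}

def any_value_match(triples, prop, wanted):
    """Compare every value, so the order of the triples does not matter."""
    return {s for s, p, o in triples if p == prop and o == wanted}

print(first_value_match(values_a, DC_TITLE, "title"))  # finds ex:doc
print(first_value_match(values_b, DC_TITLE, "title"))  # finds nothing
print(any_value_match(values_b, DC_TITLE, "title"))    # finds ex:doc
```

The same data gives different answers to the single-value match depending on which title happens to come first, which is the kind of unpredictability the warning describes.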
In the following I'll assume the namespace declarations:
xmlns:th="http://rootdev.net/net.rootdev.treehugger.TreeHugger" xmlns:inv="http://rootdev.net/treehugger/inverse#"
TreeHugger currently adds three extension functions:
th:document(rdf document)
th:documentRDFS(rdf document, rdf schema)
th:documentOWL(rdf document, owl ontology)
TreeHugger also adds a 'pseudo' function, or (better) 'pseudo' axis:
.../inv:property/...
This follows the property in the reverse direction, from object to subject. For example, foaf:Person/inv:dc:creator takes you to things which this person created. It's a hack, but a very useful one.
A brief example:
th:documentRDFS('foo.rdf', 'http://xmlns.com/foaf/0.1/')/
foaf:Person/
inv:foaf:knows/
foaf:Person/
rdfs:label/text()
This path takes all the people in foo.rdf, goes to the people who know
them, and returns the label of each of those people. Since foaf:name
is a subproperty of rdfs:label, and the document was loaded with the
RDFS schema, this will return the names of people who know people.
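Here is a rough Python sketch of what this path is doing, under the assumption that RDFS support amounts to treating a triple as also holding for every super-property. The data, the `ex:` names, and the helpers `holds`, `forward`, `inverse` and `of_type` are hypothetical, and this is not how TreeHugger is implemented.

```python
triples = {
    ("ex:damian", "rdf:type", "foaf:Person"),
    ("ex:libby", "rdf:type", "foaf:Person"),
    ("ex:damian", "foaf:knows", "ex:libby"),
    ("ex:damian", "foaf:name", "Damian"),
    ("ex:libby", "foaf:name", "Libby"),
}
# Schema fact taken from the text: foaf:name is a subproperty of rdfs:label.
sub_property_of = {"foaf:name": "rdfs:label"}

def holds(prop, asserted):
    """True if a triple asserted with `asserted` also counts as `prop`."""
    while asserted is not None:
        if asserted == prop:
            return True
        asserted = sub_property_of.get(asserted)
    return False

def forward(nodes, prop):
    """./prop : objects of prop whose subject is in nodes."""
    return {o for s, p, o in triples if s in nodes and holds(prop, p)}

def inverse(nodes, prop):
    """./inv:prop : subjects of prop whose object is in nodes."""
    return {s for s, p, o in triples if o in nodes and holds(prop, p)}

def of_type(nodes, cls):
    """Class step: keep only the nodes typed as cls."""
    return {n for n in nodes if (n, "rdf:type", cls) in triples}

# foaf:Person / inv:foaf:knows / foaf:Person / rdfs:label / text()
people = of_type({s for s, p, o in triples}, "foaf:Person")
knowers = of_type(inverse(people, "foaf:knows"), "foaf:Person")
print(forward(knowers, "rdfs:label"))
```

Only Damian knows anyone in this data, so the result is his name, reached through rdfs:label even though the triple was asserted with foaf:name.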
Alt, Bag, and Seq look nasty in the rdf model. TreeHugger follows the XML serialisation to protect you:
<xsl:for-each
select="./rss:items/rdf:Seq/rdf:li/rdfs:Resource">
...
do something with each member of the sequence
...
</xsl:for-each>
The members of the sequence will be in the correct order, of course. rdf:parseType="collection" support will be added when I find out how to do it.
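In the rdf model the members of a container hang off numbered properties (rdf:_1, rdf:_2, ...), and the triples themselves carry no order. A minimal Python sketch of recovering the rdf:li order, assuming the members are stored that way; the item URIs and the helper `seq_members` are invented for illustration, and this is not TreeHugger's code.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MEMBER = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")

# Triples arrive in no particular order; note that _10 sorts before _2 as text.
triples = [
    ("_:seq", RDF + "_10", "http://example.com/item10"),
    ("_:seq", RDF + "_2", "http://example.com/item2"),
    ("_:seq", RDF + "type", RDF + "Seq"),
    ("_:seq", RDF + "_1", "http://example.com/item1"),
]

def seq_members(triples, container):
    """Return the container's members ordered by membership number."""
    numbered = []
    for s, p, o in triples:
        match = MEMBER.match(p)
        if s == container and match:
            numbered.append((int(match.group(1)), o))
    return [o for _, o in sorted(numbered)]

print(seq_members(triples, "_:seq"))
```

Sorting on the integer rather than on the property URI matters once a sequence has ten or more members.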
Take an example document like Dan's foaf file. Let's try to make an html file which lists all the people who know people, plus the people they know and a link to their mailboxes.
The TreeHugger style sheet looks like this:
<?xml version="1.0"?>
<html xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:th="http://foo.com/blah/net.rootdev.treehugger.TreeHugger"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xsl:version="1.0">
<!-- Load an rdf document -->
<xsl:variable name="doc"
select="th:document('http://rdfweb.org/people/danbri/rdfweb/danbri-foaf.rdf')"
/>
<head>
<title>Some people</title></head>
<body>
<h2>People who know people</h2>
<h5>(the happiest people)</h5>
<!-- find people who know people -->
<xsl:for-each select="$doc/foaf:Person[@foaf:knows]">
<!-- show their name -->
<h4><xsl:value-of select="./foaf:name/text()"/></h4>
<!-- get people they know who have a name and mailbox -->
<xsl:for-each select="./foaf:knows/foaf:Person[@foaf:name and @foaf:mbox]">
<p>knows:
<!-- show the name in a link to their mailbox -->
<a href="{./foaf:mbox/rdfs:Resource[last()]}">
<xsl:value-of select="./foaf:name/text()"/></a>
</p>
</xsl:for-each>
</xsl:for-each>
<p>
<b><i><u>
<!-- show name of person with mbox 'libby...' -->
<!-- here properties are used as attributes and elements -->
<xsl:value-of
select="$doc/foaf:Person[@foaf:mbox='mailto:libby.miller@bristol.ac.uk']/foaf:name/text()"
/>
</u></i></b>
</p>
</body>
</html>
The result is here. (Martin Poulter knows no one because the person he knows has no mbox, btw).
Although the TreeHugger implementation doesn't work like this, a more efficient implementation could take complete path expressions and translate them to RDF queries. This would be faster over RDB-backed models. (At some point I'll write a little perl or ruby script to do this).
The following are paths translated to Squish (minus namespace declarations). The node set becomes a result set with each row containing one variable binding: 'node'.
/foaf:Person/foaf:name/text()
becomes:
SELECT ?node WHERE (rdf:type ?a foaf:Person) (foaf:name ?a ?node)
/foaf:Person[@foaf:mbox="mailto:pldms@mac.com"]/foaf:name/text()
becomes:
SELECT ?node WHERE (rdf:type ?a foaf:Person) (foaf:mbox ?a mailto:pldms@mac.com) (foaf:name ?a ?node)
/foaf:Person/inv:foaf:knows/foaf:Person/foaf:mbox/rdfs:Resource
becomes:
SELECT ?node WHERE (rdf:type ?a foaf:Person) (foaf:knows ?b ?a) [note: inverted] (rdf:type ?b foaf:Person) (foaf:mbox ?b ?node)
/rdfs:Resource/foaf:knows/foaf:Person/../../foaf:mbox/rdfs:Resource
becomes:
SELECT ?node WHERE (foaf:knows ?a ?b) [note: 'a' untyped in path] (rdf:type ?b foaf:Person) (foaf:mbox ?a ?node) [note: ../../ takes us back to 'a']
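The translation for the simplest paths is mechanical enough to sketch in a few lines of Python. This is only an illustration of the idea, with a hypothetical function `to_squish`. It assumes prefixed names, strictly alternating class and property steps, and no predicates or `..` steps, so it covers the first and third examples above but not the other two.

```python
UNTYPED = {"rdfs:Resource", "rdf:Description"}  # match any node, so no clause

def to_squish(path):
    """Translate a simple /class/property/... path into a Squish query."""
    steps = path.strip("/").split("/")
    if steps[-1] == "text()":
        steps.pop()  # the literal itself is the last node reached
    # Even positions are class steps, odd positions are property steps.
    n_props = len(steps) // 2
    # One variable per node visited; the last node is the result.
    names = [f"?{chr(ord('a') + i)}" for i in range(n_props)] + ["?node"]
    clauses = []
    for i, step in enumerate(steps):
        here = names[i // 2]
        if i % 2 == 0:
            if step not in UNTYPED:
                clauses.append(f"(rdf:type {here} {step})")
        else:
            there = names[i // 2 + 1]
            if step.startswith("inv:"):
                # Inverse step: swap subject and object.
                clauses.append(f"({step[4:]} {there} {here})")
            else:
                clauses.append(f"({step} {here} {there})")
    return "SELECT ?node WHERE " + " ".join(clauses)

print(to_squish("/foaf:Person/foaf:name/text()"))
print(to_squish("/foaf:Person/inv:foaf:knows/foaf:Person/foaf:mbox/rdfs:Resource"))
```

Handling predicates would mean adding one clause per attribute test, and `..` would mean stepping back through the list of variables instead of always moving forward.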