We argued previously that there is a need for a system of identity for Semantic Web agents, particularly in the process of making judgements of trust.
Examining the requirements of a system of identity, we recognise that such a system cannot count on universal uptake among Semantic Web agents, and therefore it cannot require each agent to state an identity for itself. Additionally, even if universal uptake could be relied upon, we cannot count on the honest and benevolent behaviour of every Semantic Web agent. Thus, as we briefly mentioned at the end of our previous post, a system of identity for the Semantic Web must be built primarily around observable characteristics as a measure of identity.
As an analogy: when surfing the Web, you would not rely on a website's claim that it is your bank's online portal; you would rely on the factors you can observe (such as the domain name and the digital certificate) to inform your judgement. Digital certificates are especially important if you are connected to the Internet over an untrusted network connection.
Building on our earlier example of a rudimentary HTTP-based Semantic Web agent, suppose we request a URI from it, and receive some RDF in response. The data we collect about the identity of the agent may look something like the following:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://example.com/ont/>.
_:agent1
    rdf:type ex:HTTPAgent;
    ex:port 80;
    ex:host "agent.example.com";
    ex:ip "10.0.0.1";
    ex:time "2010-04-14T14:37:37Z"^^xsd:dateTime.
Suppose at some later date we again communicate with the agent at the domain agent.example.com, and in the process observe that the DNS entry has changed and the domain now refers to a new IP address. Do we then consider this to be the same agent of which we have previous experience? Further, is the information we have sufficient to make such a decision? Other attributes, such as software version numbers or digital certificates, may also influence the judgement of similarity if they significantly alter the behaviour of the agent.
Returning to our analogy, if your browser stored the credentials for your bank's online banking portal, you would specify very strict criteria, much like those described above, to dictate which websites are permitted to see this information.
Below follows a second observation record, for an interaction with the same agent at a different IP address.
_:agent2
    rdf:type ex:HTTPAgent;
    ex:port 80;
    ex:host "agent.example.com";
    ex:ip "10.0.0.2";
    ex:time "2010-04-18T10:24:12Z"^^xsd:dateTime.
It is possible to encode our criteria for equivalence using OWL (to some degree) such that a reasoner can identify that two agents are in fact the same entity. This involves declaring a class of all things which meet the criteria of being a particular agent such that those which meet the necessary and sufficient criteria may be considered the same.
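A minimal sketch of such a class definition follows, in the same Turtle notation as the records above. The class name ex:AgentAtExampleCom and the individual ex:theAgent are hypothetical, and the choice of host and port (but not IP address) as the identifying criteria is an assumption made purely for illustration.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix ex: <http://example.com/ont/>.

# Declarations for the terms used in the observation records.
ex:HTTPAgent rdf:type owl:Class.
ex:host rdf:type owl:DatatypeProperty.
ex:port rdf:type owl:DatatypeProperty.

# Hypothetical class: any HTTP agent observed at this host and port.
# The IP address is deliberately left out of the criteria (an assumption).
ex:AgentAtExampleCom
    rdf:type owl:Class;
    owl:equivalentClass [
        rdf:type owl:Class;
        owl:intersectionOf (
            ex:HTTPAgent
            [ rdf:type owl:Restriction;
              owl:onProperty ex:host;
              owl:hasValue "agent.example.com" ]
            [ rdf:type owl:Restriction;
              owl:onProperty ex:port;
              owl:hasValue 80 ]
        )
    ];
    # Assumption: the class has a single member, the hypothetical
    # individual ex:theAgent, so everything classified here is that agent.
    rdfs:subClassOf [
        rdf:type owl:Class;
        owl:oneOf ( ex:theAgent )
    ].
```

Under these assumptions, a reasoner that supports owl:hasValue and owl:oneOf would be expected to classify both _:agent1 and _:agent2 as members of ex:AgentAtExampleCom, and so to infer that each is owl:sameAs ex:theAgent and hence the other.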
Unfortunately, the equivalence afforded by OWL (owl:sameAs) effectively merges the identifiers, so that, as shown below, the metadata from the two different requests becomes inseparable: we can no longer tell which IP address was observed at which time.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
_:agent1
    owl:sameAs _:agent2;
    rdf:type ex:HTTPAgent;
    ex:port 80;
    ex:host "agent.example.com";
    ex:ip "10.0.0.1";
    ex:ip "10.0.0.2";
    ex:time "2010-04-18T10:24:12Z"^^xsd:dateTime;
    ex:time "2010-04-14T14:37:37Z"^^xsd:dateTime.
The problem with this approach is not the use of OWL classification (though it is somewhat ill-suited to this task); rather, it is the result of a simplistic ontology design. We acknowledge that this crude example ontology has many flaws (the assumption that an HTTP agent operates on a sole port and network address, for example). However, to fully satisfy our potential requirements we must adopt an event-based ontology design, as these observations are inherently temporal in nature.
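As a rough illustration of what an event-based design might look like, the two observations could be recorded as separate nodes that each point at the agent they concern. The terms ex:Observation and ex:observedAgent are hypothetical and are not part of the example ontology used so far; this is a sketch of the general idea, not a finished design.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://example.com/ont/>.

# Each observation is its own node, so its IP address and
# timestamp stay paired (ex:Observation is a hypothetical class).
_:obs1
    rdf:type ex:Observation;
    ex:observedAgent _:agent1;
    ex:port 80;
    ex:host "agent.example.com";
    ex:ip "10.0.0.1";
    ex:time "2010-04-14T14:37:37Z"^^xsd:dateTime.

_:obs2
    rdf:type ex:Observation;
    ex:observedAgent _:agent2;
    ex:port 80;
    ex:host "agent.example.com";
    ex:ip "10.0.0.2";
    ex:time "2010-04-18T10:24:12Z"^^xsd:dateTime.

# The agent nodes carry no time-dependent data of their own.
_:agent1 rdf:type ex:HTTPAgent.
_:agent2 rdf:type ex:HTTPAgent.
```

With the temporal data held on the observations, a later conclusion that _:agent1 and _:agent2 are the same agent would merge only the agent nodes. The two observations would remain distinct, so each IP address could still be traced to the time at which it was seen.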