The OA Community Group's February 2013 Open Annotation Data Model (this document) has been superseded by the following W3C Web Annotation Working Group Candidate Recommendations (July 2016):
Implementers are encouraged to begin using the newer specifications as soon as practicable.
In general terms, an Annotation expresses the relationship between two or more resources, and their metadata, using an RDF graph. The Open Annotation Core framework explains how to identify and describe the related resources, and how to provide information concerning the creation and intent of the Annotation.
Typically an Annotation has a single Body, which is the comment or other descriptive resource, and a single Target that the Body is somehow "about". The Body provides the information which is annotating the Target. This "aboutness" may be further clarified or extended to notions such as classifying or identifying, discussed in more detail in the section on Motivations.
The Body and Target may be of any media type, and contain any type of content. The Body and Target SHOULD be identified by HTTP URIs unless they are embedded within the Annotation.
All Annotations MUST be instances of the class oa:Annotation, and additional subclassing is ONLY RECOMMENDED in order to provide additional, community-specific constraints on the model.
The model does not define classes for Body and Target, as the Body of one Annotation may be the Target of another, and thus the classes would not convey any usable information. Instead, typing based on the resource content is RECOMMENDED.
The model defines two relationships, oa:hasBody and oa:hasTarget, to associate the Body and Target resources, respectively, with the Annotation.
Vocabulary Item | Type | Description
oa:Annotation | Class | The class for Annotations. The oa:Annotation class MUST be associated with an Annotation.
oa:hasBody | Relationship | The relationship between an Annotation and the Body of the Annotation. There SHOULD be 1 or more oa:hasBody relationships associated with an Annotation but there MAY be 0.
oa:hasTarget | Relationship | The relationship between an Annotation and the Target of the Annotation. There MUST be 1 or more oa:hasTarget relationships associated with an Annotation.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> .
SELECT ?anno WHERE { ?anno oa:hasTarget <target1> } => <anno1>
SELECT ?body WHERE { ?anno oa:hasBody ?body ; oa:hasTarget <target1> } => <body1>
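The cardinality rules above can be checked mechanically. The following is a minimal sketch in Python, assuming the graph is held as a set of (subject, predicate, object) string tuples; the function name `check_annotation`, the `rdf:type` shorthand, and the tuple representation are illustrative choices, not part of the Open Annotation model.

```python
# Illustrative only: triples are plain (subject, predicate, object) string tuples.
RDF_TYPE = "rdf:type"

def check_annotation(triples, anno):
    """Return a list of problems found for `anno` under the core constraints."""
    problems = []
    types = {o for s, p, o in triples if s == anno and p == RDF_TYPE}
    targets = [o for s, p, o in triples if s == anno and p == "oa:hasTarget"]
    bodies = [o for s, p, o in triples if s == anno and p == "oa:hasBody"]
    if "oa:Annotation" not in types:
        problems.append("MUST: missing class oa:Annotation")
    if not targets:
        problems.append("MUST: no oa:hasTarget")
    if not bodies:
        # Allowed (MAY be 0), but reported because 1 or more is the SHOULD case.
        problems.append("SHOULD: no oa:hasBody")
    return problems

# The same graph as the Turtle example: one Body, one Target.
graph = {
    ("anno1", RDF_TYPE, "oa:Annotation"),
    ("anno1", "oa:hasBody", "body1"),
    ("anno1", "oa:hasTarget", "target1"),
}
```

Calling `check_annotation(graph, "anno1")` returns an empty list for the example graph; a real implementation would more likely run equivalent checks over a triplestore.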
Information concerning the general content type (Text, Image, Audio, Video etc) of the Annotation's related resources is useful to applications. This is expressed using typing of the Body and Target resources, and thereby allows the client to easily determine if and how it can render the resource without maintaining a long list of media types. For example, an HTML5 based client can use the information that the Target resource is an image to generate an <img> element with the appropriate src attribute, rather than having to maintain a list of all of the image media types. The creator of the Annotation may also not know the exact media type of the Body or Target, but should at least be able to provide this general class.
The Dublin Core Types vocabulary is RECOMMENDED for expressing this information by means of RDF classes. The most common classes are listed in the table below, but other classes MAY also be used. The definitions are summarized from the DCMI types documentation. Please note that the advice of the DCMI to encode images of text as dctypes:Text is NOT RECOMMENDED within the context of Open Annotation, as it does not help consuming clients to interpret or render the resource.
There SHOULD be 1 or more content-based classes associated with the Body and Target resources of an Annotation.
Vocabulary Item | Type | Description
dctypes:Dataset | Class | The class for a resource which encodes data in a defined structure
dctypes:Image | Class | The class for image resources, primarily intended to be seen
dctypes:MovingImage | Class | The class for video resources, with or without audio
dctypes:Sound | Class | The class for a resource primarily intended to be heard
dctypes:Text | Class | The class for a resource primarily intended to be read
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> . <body1> a dctypes:Text . <target1> a dctypes:Image .
SELECT ?anno WHERE { ?anno oa:hasBody ?body . ?body a dctypes:Text } => <anno1>
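The rendering decision described above can be sketched as a lookup keyed on the content class rather than on the media type. This is a minimal Python sketch; the mapping `ELEMENT_FOR_CLASS`, the function `render`, and the fallback to a plain link are assumptions made for illustration and are not defined by the specification.

```python
from html import escape

# Hypothetical mapping from Dublin Core Types classes to HTML5 elements.
ELEMENT_FOR_CLASS = {
    "dctypes:Image": "img",
    "dctypes:MovingImage": "video",
    "dctypes:Sound": "audio",
}

def render(uri, classes):
    """Return an HTML snippet for the resource, falling back to a plain link."""
    safe = escape(uri, quote=True)
    for cls in classes:
        tag = ELEMENT_FOR_CLASS.get(cls)
        if tag == "img":
            return '<img src="%s">' % safe
        if tag:
            return '<%s src="%s" controls></%s>' % (tag, safe, tag)
    # No renderable class known (for example dctypes:Text): link to the resource.
    return '<a href="%s">%s</a>' % (safe, escape(uri))
```

The client never inspects a media type here; knowing that the Target is a dctypes:Image is enough to choose the element.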
Some previous annotation systems have used a property with a string literal to represent textual Bodies. However, the Open Annotation model employs the W3C's Content in RDF specification for consistency with non-textual Bodies and other types of resources. The Content in RDF specification introduces a resource with the class cnt:ContentAsText to represent the content, and a property cnt:chars to hold the content string itself.
If it is important that the Body have an identity, then it is RECOMMENDED that the cnt:ContentAsText resource be identified with a UUID URI, but other URIs MAY be used. If this is not considered to be important, for example for short personal notes, then an RDF blank node SHOULD be used instead. This pattern reduces the burden for minting and maintaining identifiers when it is not necessary to do so, but makes it impossible for further Annotations or other systems to refer to the Body without a Skolem IRI.
If known, the media type of the body SHOULD be given using the dc:format property, for example to distinguish between embedded comments in plain text versus those encoded in HTML.
As above, the dctypes:Text class MAY also be assigned along with the cnt:ContentAsText class, as there could be other uses of cnt:ContentAsText that encode resources with content other than plain text.
This model was chosen over having a literal as the Body directly for the following reasons:
Vocabulary Item | Type | Description
cnt:ContentAsText | Class | A class assigned to the Body for embedding textual resources within the Annotation. This class SHOULD be assigned to the Body, however it can be inferred from the presence of the mandatory cnt:chars property.
cnt:chars | Property | The character sequence of the content. There MUST be exactly 1 cnt:chars property associated with the ContentAsText resource.
dc:format | Property | The media type of the content. There SHOULD be exactly 1 dc:format property associated with the resource.
dc:language | Property | The language of the content, if known. There MAY be 0 or more dc:language properties. Each language SHOULD be expressed as a language tag, as defined by RFC 3066.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> . <body1> a cnt:ContentAsText, dctypes:Text ; cnt:chars "content" ; dc:format "text/plain" .
SELECT ?comment WHERE { ?anno oa:hasBody ?body . ?body a dctypes:Text ; cnt:chars ?comment } => "content"
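The choice between an identified Body and a blank node can be sketched as follows. This minimal Python sketch assumes the `urn:uuid:` form for the UUID URI and a tuple-based triple representation; the function name `embedded_text_body` and its parameters are hypothetical.

```python
import uuid

def embedded_text_body(chars, media_type="text/plain", needs_identity=False):
    """Return (subject, triples) describing a cnt:ContentAsText Body."""
    if needs_identity:
        # Assumption: a urn:uuid: URI is used as the UUID URI.
        subject = "<urn:uuid:%s>" % uuid.uuid4()
    else:
        # Blank node label, unique within the serialization being built.
        subject = "_:b%s" % uuid.uuid4().hex
    triples = [
        (subject, "a", "cnt:ContentAsText"),
        (subject, "a", "dctypes:Text"),
        (subject, "cnt:chars", chars),      # exactly one cnt:chars
        (subject, "dc:format", media_type), # media type, if known
    ]
    return subject, triples
```

With `needs_identity=False` the Body cannot be referred to by other Annotations, which matches the trade-off described above for short personal notes.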
Tagging a resource, either with a short text string or with a URI, is a common use case for Annotation. Tags are typically keywords or labels, and are used for organization, description or discovery of the resource being tagged. In the Semantic Web, URIs are used instead of strings to avoid the issue of polysemy where one word has multiple meanings. In this situation, one would use two different URIs to refer to the "bank" of a river versus a financial institution.
In the Open Annotation Core model, the tag is represented as the Body of the Annotation, and the resource being tagged is the Target.
For example, one might wish to associate the textual tag "paris" with an image of the capital of France to describe what is being depicted. Similarly, "capital", "city", "photo", "stunning" might all be used as tags for the same image. This situation is modeled using the same method as embedded textual bodies described in the previous section. The body resource SHOULD also have the oa:Tag class assigned to it, as applications render comments and tags in very different ways. Figure 2.1.3.1 below depicts this.
For semantic tags, where the tag is expressed as a URI, the Body is the URI of the tagging resource. The above example might instead use the URI http://dbpedia.org/resource/Paris as the Body; such a URI is typically a term from a controlled vocabulary intended to be widely reused. It is also necessary to know that this resource should not be retrieved and rendered for the user, and thus the oa:SemanticTag class MUST be associated with the tagging resource. Note well that semantic tags, and other URIs, may have a fragment component, where the purpose is NOT to select a segment of the document. As such, these resources MUST NOT be re-expressed using FragmentSelectors as that would mean they were selecting a segment of interest. The semantic tagging model is depicted in Figure 2.1.3.2.
As explained in the Motivation section, Annotations that tag resources, either with text or semantic tags, SHOULD also have the oa:tagging motivation to make the reason for the Annotation more clear to applications, and MAY have other motivations as well.
It is NOT RECOMMENDED to use the URI of a document as a Semantic Tag, as it might also be used as a regular Body in other Annotations which would inherit the oa:SemanticTag class assignment. Instead it is more appropriate to create a new URI and link it to the document, as demonstrated in Figure 2.1.3.3 using the foaf:page predicate.
Vocabulary Item | Type | Description
oa:Tag | Class | A class assigned to the Body when it is a tag, such as an embedded text string
oa:SemanticTag | Class | [subClass of oa:Tag] A class assigned to the Body when it is a semantic tagging resource; a URI that identifies a concept, rather than an embedded string, frequently a term from a controlled vocabulary
foaf:page | Relationship | The foaf:page relationship expresses the link between a Semantic Tag and the document that describes or somehow embodies the tagging concept.
<anno1> a oa:Annotation ; oa:motivatedBy oa:tagging ; oa:hasBody <tag1> ; oa:hasTarget <target1> . <tag1> a oa:Tag, cnt:ContentAsText ; cnt:chars "tag" . <target1> a dctypes:Image .
<anno1> a oa:Annotation ; oa:motivatedBy oa:tagging ; oa:hasBody <term1> ; oa:hasTarget <target1> . <term1> a oa:SemanticTag . <target1> a dctypes:Image .
<anno1> a oa:Annotation ; oa:motivatedBy oa:tagging ; oa:hasBody <term1> ; oa:hasTarget <target1> . <term1> a oa:SemanticTag ; foaf:page <document1> . <target1> a dctypes:Image .
SELECT ?anno WHERE { ?anno oa:hasBody ?body . ?body a oa:Tag } => <anno1>
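A consuming client has to treat the two kinds of tag differently: a textual tag is displayed from its cnt:chars value, while a semantic tag is a URI that should not be retrieved and rendered. The sketch below illustrates that branch in Python; the function `tag_label` and its arguments are hypothetical, and showing the bare URI for a semantic tag is only one possible presentation.

```python
def tag_label(body, classes, chars=None):
    """Return the text a client might display for a tag Body, or None."""
    if "oa:SemanticTag" in classes:
        # Semantic tag: use the URI itself; do not dereference and render it.
        return body
    if "oa:Tag" in classes and chars is not None:
        # Textual tag: the embedded string carried by cnt:chars.
        return chars
    # Not a tag; the caller should render it as an ordinary Body.
    return None
```

Because oa:SemanticTag is a subclass of oa:Tag, the semantic case is tested first.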
Many Annotations are about part of a resource, rather than its entirety. Resources can be arbitrarily large, and annotations arbitrarily precise as to their segment of interest. In the Architecture of the World Wide Web, segments of resources are identified using URIs with a Fragment component that at the same time both describes how to extract the segment of interest from the resource, and identifies the extracted content. For simple annotations, it is valuable to be able to use these fragments as either Body or Target.
It is important to be aware of the consequences of using a Fragment URI for the purpose of identifying parts of a resource, and the restrictions that using them places on implementations.
http://example.com/image.jpg#xywh=1,1,1,1 would not be discovered in a simple search for http://example.com/image.jpg, even though it is part of it.
For systems where these issues are not a concern, or would provide a significant burden to implementation, Fragment URIs MAY be used as either the Body or Target of an Annotation. It is otherwise RECOMMENDED to use the Selector mechanism described in the Specific Resources module, which includes a transition mechanism (oa:FragmentSelector) to ensure compatibility with existing and future fragment specifications.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <t1#xywh> . <body1> a dctypes:Text . <t1#xywh> a dctypes:Image .
SELECT ?anno WHERE { ?anno oa:hasTarget <t1#xywh> } => <anno1>
SELECT ?anno WHERE { ?anno oa:hasTarget ?x . FILTER ( regex (str(?x), "^t1")) } => <anno1>
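The discovery problem can be worked around on the client side by stripping the fragment before comparing URIs, which is what the regex query above approximates. A minimal Python sketch using the standard library's `urldefrag`; the function `annotations_on` and the `index` dictionary are illustrative assumptions.

```python
from urllib.parse import urldefrag

def annotations_on(resource, targets_by_anno):
    """Find annotations whose target is `resource` or a fragment of it."""
    hits = []
    for anno, targets in targets_by_anno.items():
        # urldefrag splits "uri#fragment" into (uri, fragment).
        if any(urldefrag(t).url == resource for t in targets):
            hits.append(anno)
    return sorted(hits)

# Hypothetical index of annotation identifiers to their target URIs.
index = {
    "anno1": ["http://example.com/image.jpg#xywh=1,1,1,1"],
    "anno2": ["http://example.com/other.jpg"],
}
```

An exact-match lookup for `http://example.com/image.jpg` finds nothing in `index`, whereas `annotations_on` finds `anno1`. The cost is that every target URI has to be parsed, which is part of the implementation burden that Fragment URIs impose.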
A special case exists when the Annotation does not have a Body resource. Examples of this sort of situation include bookmarking a particular resource, marking a point within a resource, and highlighting a section of a resource without making a comment about why it is highlighted. A Body may be added to these Annotations later, perhaps explaining the importance of the resource and thus why it was bookmarked.
No new relationships or classes are introduced for Annotations without a Body.
<anno1> a oa:Annotation ; oa:hasTarget <target1> .
SELECT ?anno WHERE { ?anno oa:hasTarget <target1> . FILTER(NOT EXISTS { ?anno oa:hasBody ?notbody }) } => <anno1>
Please note that this query uses features from SPARQL version 1.1.
Conversely to the previous section, it is also possible for an Annotation to have multiple Bodies and/or Targets. Each Body is considered to be equally related to each Target individually, rather than the complete set of Targets. This construction may be used so long as dropping any of the Bodies or Targets would not invalidate the Annotation's meaning. Thus in Figure 1.1.5 below, all of the following are individually true:
Example use cases include having multiple tags about a single target image, or a single comment that applies to several web pages.
For situations when the Annotation needs different semantics for multiple Bodies or Targets, such as when a comment is comparing or contrasting the Targets, and hence not about each equally and individually, it is necessary to use further constructions described in the Multiplicity module. It also allows the Bodies or Targets to be ordered, or for a choice to be made by the client on which one of the resources is most appropriate for the user.
The Open Annotation model does not define any new relationships to enable multiple Bodies or Targets. The oa:hasBody and oa:hasTarget relationships are used multiple times with the same Annotation as the subject.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasBody <body2> ; oa:hasTarget <target1> ; oa:hasTarget <target2> .
SELECT ?anno WHERE { ?anno oa:hasTarget ?t1 ; oa:hasTarget ?t2 . FILTER( ?t1 != ?t2 ) } => <anno1>
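Since each Body is considered equally related to each Target individually, an Annotation with several of each can be read as the set of all Body and Target pairs. The sketch below enumerates those pairs in Python; the function name `pairwise_statements` is hypothetical.

```python
from itertools import product

def pairwise_statements(bodies, targets):
    """List every (body, target) pair implied by the Annotation."""
    return list(product(bodies, targets))

# For the example above: two Bodies and two Targets imply four pairs.
pairs = pairwise_statements(["body1", "body2"], ["target1", "target2"])
```

If dropping any one of these pairs would change the intended meaning, this simple construction is not appropriate and the Multiplicity module applies instead.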
It is important for consuming clients and services to understand the context in which the Annotation was created. In particular, the person or machine responsible for the Annotation deserves credit for their contribution, and the time at which the Annotation was created is useful for filtering out old, potentially irrelevant annotations. The creator of the Annotation is also useful for determining the trustworthiness of the Annotation, potentially based on reputation models. Also, the software used to create and serialize the model, along with when that activity occurred, is useful for both advertising and debugging.
Provenance information can be attached to the Annotation, Body, Target or any other resource in the Annotation graph. Thus, the provenance information attached to an Annotation is not necessarily true for the Body or the Target resources. For instance, a PhD student in 2013 could be formalizing Charles Darwin's notebooks from 1836 as Annotations with textual comments, and so the student would be the author of the Annotation, while Darwin would be the author of the Body. Additional provenance information, such as Darwin as the creator of the Body, SHOULD be provided where possible, but it is considered out of scope for this specification to formalize further requirements. Existing vocabularies, such as Dublin Core Terms, SHOULD be used.
A complete mapping for the Annotation's provenance in the W3C PROV model is provided in Appendix A. Please note that the Annotation node primarily represents the concept of the Annotation, but for simplicity the model allows serialization level properties to be attached to it. If a more accurate model with distinct identifiers is required for particular use cases, then the model expressed in Appendix A is RECOMMENDED.
Vocabulary Item | Type | Description
oa:annotatedBy | Relationship | [subProperty of prov:wasAttributedTo] The object of the relationship is a resource that identifies the agent responsible for creating the Annotation. This may be either a human or software agent. There SHOULD be exactly 1 oa:annotatedBy relationship per Annotation, but MAY be 0 or more than 1, as the Annotation may be anonymous, or multiple agents may have worked together on it.
oa:annotatedAt | Property | The time at which the Annotation was created. There SHOULD be exactly 1 oa:annotatedAt property per Annotation, and MUST NOT be more than 1. The datetime MUST be expressed in the xsd:dateTime format, and SHOULD have a timezone specified.
oa:serializedBy | Relationship | [subProperty of prov:wasAttributedTo] The object of the relationship is the agent, likely software, responsible for generating the Annotation's serialization. There MAY be 0 or more oa:serializedBy relationships per Annotation.
oa:serializedAt | Property | The time at which the agent referenced by oa:serializedBy generated the first serialization of the Annotation, and any subsequent substantially different one. The annotation graph MUST have changed for this property to be updated, and as such represents the last modified datestamp for the Annotation. This might be used to determine if it should be re-imported into a triplestore when discovered. There MAY be exactly 1 oa:serializedAt property per Annotation, and MUST NOT be more than 1. The datetime MUST be expressed in the xsd:dateTime format, and SHOULD have a timezone specified.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> ; oa:annotatedBy <agent1> ; oa:annotatedAt "2013-01-28T12:00:00Z" ; oa:serializedBy <agent2> ; oa:serializedAt "2013-02-04T12:00:00Z" .
SELECT ?anno WHERE { ?anno oa:hasTarget <target1> ; oa:annotatedBy <agent1> } => <anno1>
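The timestamp values for oa:annotatedAt and oa:serializedAt can be produced from a timezone-aware datetime. This is a minimal Python sketch; the function name `xsd_datetime` is hypothetical, and rejecting naive datetimes is a stricter choice than the specification's SHOULD, made here so that a timezone is always present.

```python
from datetime import datetime, timezone

def xsd_datetime(dt):
    """Format an aware datetime as an xsd:dateTime lexical form in UTC."""
    if dt.tzinfo is None:
        # Assumption of this sketch: refuse values without a timezone.
        raise ValueError("timezone required")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# Reproduces the oa:annotatedAt value used in the example above.
annotated_at = xsd_datetime(datetime(2013, 1, 28, 12, 0, 0, tzinfo=timezone.utc))
```

Normalizing to UTC keeps values comparable as strings, which is convenient when filtering out old annotations.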
This section recommends best practices for recording information about the agents involved in the Annotation, in particular the annotator and serializer.
The terms listed below are RECOMMENDED for use in describing agents. Other terms from the FOAF vocabulary are also RECOMMENDED, but not presented explicitly. Other more specific vocabularies MAY also be used as required. The PROV class is used as FOAF does not define a SoftwareAgent class. prov:Agent and foaf:Agent are equivalent classes.
Vocabulary Item | Type | Description
foaf:Person | Class | The class for a human agent, typically used as the class of the object of the oa:annotatedBy relationship
prov:SoftwareAgent | Class | The class for a software agent, typically used as the class of the object of the oa:serializedBy relationship. It might also be used for the object of the oa:annotatedBy for machine generated annotations.
foaf:Organization | Class | The class for an organization, as opposed to an individual. This might be used as the class of the object of the oa:annotatedBy relationship, for example.
foaf:name | Property | The name of the agent. Each agent SHOULD have exactly 1 name property.
foaf:mbox | Relationship | The email address associated with the agent, using the mailto: URI scheme. Each agent MAY have 1 or more mailboxes.
foaf:openid | Relationship | The openId URI associated with the agent. Each agent MAY have 1 or more openIds.
foaf:homepage | Relationship | The home page for the agent. Each agent MAY have 1 or more home pages.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> ; oa:annotatedBy <agent1> ; oa:serializedBy <agent2> . <agent1> a foaf:Person ; foaf:openid <OpenId1> ; foaf:name "A. Person" . <agent2> a prov:SoftwareAgent ; foaf:homepage <HomePage1> ; foaf:name "Code v2.1" .
SELECT ?anno WHERE { ?anno oa:annotatedBy ?who . ?who a foaf:Person } => <anno1>
In many cases it is important to understand the reasons why the Annotation was created, not just the agents involved. Although previous systems have subclassed the core Annotation class to convey these motivations, it was considered that a richer and better description could be obtained by using a SKOS Concept hierarchy. Motivations are SKOS Concepts, and can be inter-related between communities with more meaningful distinctions than a simple class/subclass tree. This frees up the use of subclassing for situations when it is desirable to be more explicit and prescriptive about the form an Annotation takes.
Each Annotation SHOULD have at least one oa:motivatedBy relationship to an instance of oa:Motivation, which is a subClass of skos:Concept.
A list of high level Motivations is presented below. For more information about how these can be inter-related and new Motivations created, please see Appendix B.
Vocabulary Item | Type | Description
oa:Motivation | Class | [subClass of skos:Concept] The Motivation for an Annotation is a reason for its creation, and might include things like Replying to another annotation, Commenting on a resource, or Linking to a related resource.
oa:motivatedBy | Relationship | The relationship between an Annotation and a Motivation. There SHOULD be at least 1 Motivation for each Annotation, and MAY be more than 1.
Instances of oa:Motivation
oa:bookmarking | Instance | The motivation that represents the creation of a bookmark to the target resources or recorded point or points within one or more resources. For example, an Annotation that bookmarks the point in a text where the reader finished reading. Bookmark Annotations may or may not have a Body resource.
oa:classifying | Instance | The motivation that represents the assignment of a classification type, typically from a controlled vocabulary, to the target resource(s). For example to classify an Image resource as a Portrait.
oa:commenting | Instance | The motivation that represents a commentary about or review of the target resource(s). For example to provide a commentary about a particular PDF.
oa:describing | Instance | The motivation that represents a description of the target resource(s), as opposed to a comment about them. For example describing the above PDF's contents, rather than commenting on their accuracy.
oa:editing | Instance | The motivation that represents a request for a modification or edit to the target resource. For example, an Annotation that requests a typo to be corrected.
oa:highlighting | Instance | The motivation that represents a highlighted section of the target resource or segment. For example to draw attention to the selected text that the annotator disagrees with. A Highlight may or may not have a Body resource.
oa:identifying | Instance | The motivation that represents the assignment of an identity to the target resource(s). For example, annotating the name of a city in a string of text with the URI that identifies it.
oa:linking | Instance | The motivation that represents an untyped link to a resource related to the target.
oa:moderating | Instance | The motivation that represents an assignment of value or quality to the target resource(s). For example annotating an Annotation to moderate it up in a trust network or threaded discussion.
oa:questioning | Instance | The motivation that represents asking a question about the target resource(s). For example to ask for assistance with a particular section of text, or question its veracity.
oa:replying | Instance | The motivation that represents a reply to a previous statement, either an Annotation or another resource. For example providing the assistance requested in the above.
oa:tagging | Instance | The motivation that represents adding a Tag on the target resource(s). Please see the section on Tagging and Semantic Tags for more information.
<anno1> a oa:Annotation ; oa:hasBody <body1> ; oa:hasTarget <target1> ; oa:motivatedBy oa:editing .
SELECT ?anno WHERE { ?anno oa:hasTarget <target1> ; oa:motivatedBy oa:editing } => <anno1>