# RDF::RDFa reader/writer
[RDFa][RDFa 1.1 Core] parser for RDF.rb.
## DESCRIPTION
RDF::RDFa is an RDFa reader and writer for Ruby using the [RDF.rb][RDF.rb] library suite.
## FEATURES
RDF::RDFa parses [RDFa][RDFa 1.1 Core] into statements or triples.
* Fully compliant RDFa 1.1 parser.
* Template-based Writer to generate XHTML+RDFa.
* Writer uses user-replaceable [Haml][Haml]-based templates to generate RDFa.
* Uses Nokogiri for parsing HTML/SVG if available; falls back to REXML otherwise (and for JRuby).
* [RDFa tests][RDFa-test-suite] use SPARQL for most tests due to Rasqal limitations. Other tests compare directly against N-Triples.
Install with `gem install rdf-rdfa`.
### Important changes from previous versions
RDFa is an evolving standard, undergoing some substantial recent changes partly due to perceived competition
with Microdata. As a result, the RDF Webapps working group is currently looking at changes in the processing model for RDFa. These changes are now being tracked in {RDF::RDFa::Reader}:
#### RDFa 1.1 Lite
This version fully supports the limited syntax of [RDFa Lite 1.1][]. This includes the ability to use
@property exclusively.
#### Remove RDFa Profiles
RDFa Profiles were a mechanism added to allow groups of terms and prefixes to be defined in an external resource and loaded to affect the processing of an RDFa document. This introduced a problem for some implementations needing to perform a cross-origin GET in order to retrieve the profiles. The working group elected to drop support for user-defined RDFa Profiles (the default profiles defined by RDFa Core and host languages still apply) and replace it with an inference regime using vocabularies. Parsing of @profile has been removed from this version.
#### Vocabulary Expansion
One issue with vocabularies is that they discourage re-use of existing vocabularies when terms from several vocabularies are used at the same time. Because it is common (and encouraged) for RDF vocabularies to form sub-class and/or sub-property relationships with well-defined vocabularies, the RDFa vocabulary expansion mechanism takes advantage of this.
As an optional part of RDFa processing, an RDFa processor will perform limited
[OWL 2 RL Profile entailment](http://www.w3.org/TR/2009/REC-owl2-profiles-20091027/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules),
specifically rules prp-eqp1, prp-eqp2, cax-sco, cax-eqc1, and
cax-eqc2. This causes sub-classes and sub-properties of type and property IRIs to be added
to the output graph.
{RDF::RDFa::Reader} implements this using the `#expand` method, which looks for `rdfa:usesVocabulary` properties within the output graph and performs such expansion. See an example in the usage section.
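
The effect of one of these rules, cax-sco, can be illustrated without the library. The following is a minimal sketch, not the code behind `#expand`: triples are plain three-element arrays of strings, and the helper name `expand_types` and the sample terms are assumptions made for illustration.

```ruby
require 'set'

# Hypothetical helper illustrating the cax-sco rule:
#   (c1 rdfs:subClassOf c2) and (x rdf:type c1)  =>  (x rdf:type c2)
# Triples are [subject, predicate, object] arrays of strings.
def expand_types(data, vocabulary)
  triples = Set.new(data)
  sub_class = vocabulary.select { |_, predicate, _| predicate == "rdfs:subClassOf" }

  loop do
    inferred = Set.new
    triples.each do |subject, predicate, object|
      next unless predicate == "rdf:type"
      sub_class.each do |c1, _, c2|
        inferred << [subject, "rdf:type", c2] if object == c1
      end
    end
    # Stop once a pass adds nothing new (fixed point reached).
    break if inferred.subset?(triples)
    triples.merge(inferred)
  end

  triples
end

# Sample data and vocabulary (illustrative only).
data = [["ex:gregg", "rdf:type", "foaf:Person"]]
vocabulary = [
  ["foaf:Person", "rdfs:subClassOf", "foaf:Agent"],
  ["foaf:Agent",  "rdfs:subClassOf", "ex:Thing"]
]
expanded = expand_types(data, vocabulary)
expanded.each { |triple| puts triple.join(" ") }
```

Iterating to a fixed point means that chains of sub-class statements are followed transitively, and cyclic sub-class declarations still terminate because no new triples are produced.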
#### RDF Collections (lists)
One significant RDF feature missing from RDFa was support for ordered collections, or lists. RDF supports this with special properties `rdf:first`, `rdf:rest`, and `rdf:nil`, but other RDF languages have first-class support for this concept. For example, in [Turtle][Turtle], a list can be defined as follows:

    [ a schema:MusicPlayList;
      schema:name "Classic Rock Playlist";
      schema:numTracks 5;
      schema:tracks (
        [ a schema:MusicRecording; schema:name "Sweet Home Alabama"; schema:byArtist "Lynard Skynard"]
        [ a schema:MusicRecording; schema:name "Shook you all Night Long"; schema:byArtist "AC/DC"]
        [ a schema:MusicRecording; schema:name "Sharp Dressed Man"; schema:byArtist "ZZ Top"]
        [ a schema:MusicRecording; schema:name "Old Time Rock and Roll"; schema:byArtist "Bob Seger"]
        [ a schema:MusicRecording; schema:name "Hurt So Good"; schema:byArtist "John Cougar"]
      )
    ]

defines a playlist with an ordered set of tracks. RDFa adds the @inlist attribute, which is used to identify values (object or literal) that are to be placed in a list. The same playlist might be defined in RDFa as follows:
Classic Rock Playlist
1.Sweet Home Alabama -
Lynard Skynard
2.Shook you all Night Long -
AC/DC
3.Sharp Dressed Man -
ZZ Top
4.Old Time Rock and RollBob Seger
5.Hurt So GoodJohn Cougar
This basically does the same thing, but places each track in an rdf:List in the defined order.
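
To make the `rdf:first`/`rdf:rest`/`rdf:nil` structure concrete, the following minimal sketch expands an ordered array of values into the triples that make up an `rdf:List`. It does not use the library; the helper name `list_triples` and the blank node labels (`_:l0`, `_:l1`, ...) are assumptions chosen for illustration.

```ruby
# Hypothetical helper: turn an ordered array of values into rdf:List triples.
# Returns [head, triples], where head is the first list node
# (or "rdf:nil" for an empty list).
def list_triples(values)
  return ["rdf:nil", []] if values.empty?

  # One blank node per list cell; labels are illustrative.
  nodes = values.each_index.map { |i| "_:l#{i}" }
  triples = []
  values.each_with_index do |value, i|
    triples << [nodes[i], "rdf:first", value]
    # The last cell points at rdf:nil to terminate the list.
    triples << [nodes[i], "rdf:rest", nodes[i + 1] || "rdf:nil"]
  end
  [nodes.first, triples]
end

head, triples = list_triples(["Sweet Home Alabama", "Sharp Dressed Man"])
triples.each { |triple| puts triple.join(" ") }
```

Each value marked with @inlist would occupy one such cell, in document order.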
#### Property relations
The @property attribute has been updated to allow for creating URI references as well as object literals.
1. If an element contains @property but no @rel, @datatype or @content, and it contains a resource attribute (such as @href, @src, or @resource):
    1. Generate an IRI object. Furthermore, sub-elements do not chain, i.e., the subject in effect when the @property is processed is also in effect for sub-elements.
2. Otherwise, generate a literal as before.
For example:
NBA Eastern Conference ...
results in
<> schema:url ;
schema:title "NBA Eastern Conference".
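
The decision described by the @property rule can be sketched as a small predicate over an element's attributes. This is an illustration of the rule as stated in this section only, not the reader's implementation; the helper name `property_object_kind` and the use of a plain Hash for attributes are assumptions.

```ruby
# Hypothetical helper: decide what kind of object @property generates,
# following the rule as stated in this section.
# attrs is a Hash of attribute name => value for one element.
def property_object_kind(attrs)
  return nil unless attrs.key?("property")

  # Attributes that keep @property generating a literal.
  blocking = %w(rel datatype content).any? { |name| attrs.key?(name) }
  # Attributes that supply a resource.
  resource = %w(href src resource).any? { |name| attrs.key?(name) }

  (!blocking && resource) ? :iri : :literal
end

puts property_object_kind("property" => "url", "href" => "http://example.org/")
puts property_object_kind("property" => "title")
```

The first call prints `iri` and the second prints `literal`, matching the two branches of the rule.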
#### Magnetic @about/@typeof
The @typeof attribute has changed; previously, it always created a new subject, using a resource from @about, @resource and so forth. This has long been a source of errors for people using RDFa. Under the new rules, @typeof binds to the subject when used with @about; otherwise it binds to the object, whether used alone or in combination with some other resource attribute (such as @href, @src or @resource).
For example:
results in
a foaf:Person;
foaf:name "Gregg Kellogg";
foaf:knows .
a foaf:Person;
foaf:name "Manu Sporny" .
Note that if the explicit @href is not present, i.e.,
this results in
a foaf:Person;
foaf:name "Gregg Kellogg";
foaf:knows [
a foaf:Person;
foaf:name "Manu Sporny"
].
#### Property chaining
If used without @rel, but with @typeof and a resource attribute, @property will cause chaining to another object just like @rel. The effect of this and other changes is to allow pretty much all RDFa to be marked up using just @property; @rel/@rev is no longer required. However, @rel and @rev have useful features that @property does not, so they are worth keeping in your toolkit.
#### Support for HTML5 `time` element
The `time` element allows the creation of a datatyped-literal based on the lexical scope of either the `@datetime` attribute, or the element content. We parse it according to xsd:date, xsd:time, xsd:dateTime, xsd:gYear, xsd:gYearMonth, and xsd:duration. If it matches none of these, a plain literal is emitted.
The `time` element is described in the WHATWG version of the [HTML5 spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-time-element).
This is related to [RDFa ISSUE-97](http://www.w3.org/2010/02/rdfa/track/issues/97).
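
The datatype selection for `time` values can be sketched as pattern matching on the lexical form. The regular expressions below are simplified assumptions, not the patterns the reader uses and not full XML Schema grammars, and the helper name `time_datatype` is hypothetical.

```ruby
# Hypothetical helper: pick a datatype for a `time` value from its lexical
# form, or return nil to signal that a plain literal should be emitted.
# The patterns are deliberately simplified.
def time_datatype(value)
  zone = /(Z|[+-]\d{2}:\d{2})?/
  patterns = {
    "xsd:dateTime"   => /\A-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?#{zone}\z/,
    "xsd:date"       => /\A-?\d{4,}-\d{2}-\d{2}#{zone}\z/,
    "xsd:time"       => /\A\d{2}:\d{2}:\d{2}(\.\d+)?#{zone}\z/,
    "xsd:gYearMonth" => /\A-?\d{4,}-\d{2}\z/,
    "xsd:gYear"      => /\A-?\d{4,}\z/,
    "xsd:duration"   => /\A-?P(?=.)(\d+Y)?(\d+M)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?\z/
  }
  match = patterns.find { |_, pattern| pattern =~ value }
  match ? match.first : nil
end

["2012-05-08", "2012-05-08T10:00:00Z", "P1DT2H", "next Tuesday"].each do |value|
  puts "#{value} => #{time_datatype(value).inspect}"
end
```

A value that matches none of the patterns, such as `next Tuesday`, yields `nil`, which corresponds to the plain-literal fallback described above.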
#### Support for HTML5 `data` element
This is an alternate way of adding data using the `@value` attribute, similar to `meta`.
The `data` element is described in the WHATWG version of the [HTML5 spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-data-element).
This is related to [RDFa ISSUE-113](http://www.w3.org/2010/02/rdfa/track/issues/113).
### Support for embedded RDF/XML
If the document includes embedded RDF/XML, as is the case with many SVG documents, and the RDF::RDFXML gem is installed, the reader will add extracted triples to the default graph.
For example:
generates the following turtle:
@prefix dc: .
dc:title "Test 0304" ;
dc:description "A yellow rectangle with sharp corners." .
### Support for embedded N-Triples or Turtle
If the document includes a `<script>` element having an `@type` attribute whose value matches that of a loaded RDF reader (text/ntriples and text/turtle are loaded if they are available), the data will be extracted and added to the default graph.
Additionally, if the `<script>` element has an `@id` attribute, the triples will be placed into a graph named by appending the value of `@id` as a fragment of the base IRI of the input document. For example:
generates the following TriG:
@prefix gr: .
@prefix rdfs: .
{
a gr:BusinessEntity ;
rdfs:seeAlso ;
gr:hasLegalName "Hepp Industries Ltd." .
}
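
The graph naming rule can be illustrated with Ruby's standard `uri` library. This is a sketch of the naming convention described above, not the reader's code; the helper name `graph_name_for`, the base IRI and the `@id` value are hypothetical.

```ruby
require 'uri'

# Hypothetical helper: name the graph for a <script> element by using its
# @id as the fragment of the document's base IRI. Returns nil when there is
# no @id, in which case the triples belong to the default graph.
def graph_name_for(base_iri, script_id)
  return nil if script_id.nil? || script_id.empty?

  uri = URI.parse(base_iri)
  uri.fragment = script_id # replaces any fragment already on the base IRI
  uri.to_s
end

puts graph_name_for("http://example.org/doc.html", "company")
# prints "http://example.org/doc.html#company"
```

Note that `URI::Generic#fragment=` raises an error for values that are not valid in a fragment, so real input may need escaping first.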
## Usage
### Reading RDF data in the RDFa format
    graph = RDF::Graph.load("etc/doap.html", :format => :rdfa)
### Reading RDF data with vocabulary expansion
    graph = RDF::Graph.load("etc/doap.html", :format => :rdfa, :vocab_expansion => true)

or

    graph = RDF::RDFa::Reader.open("etc/doap.html").expand
### Reading Processor Graph
    graph = RDF::Graph.load("etc/doap.html", :format => :rdfa, :rdfagraph => :processor)
### Reading Both Processor and Output Graphs
    graph = RDF::Graph.load("etc/doap.html", :format => :rdfa, :rdfagraph => [:output, :processor])
### Writing RDF data using the XHTML+RDFa format
    require 'rdf/rdfa'

    RDF::RDFa::Writer.open("etc/doap.html") do |writer|
      writer << graph
    end

Note that prefixes may be chained between Reader and Writer, so that the Writer will
use the same prefix definitions found during parsing:

    prefixes = {}
    graph = RDF::Graph.load("etc/doap.html", :prefixes => prefixes)
    puts graph.dump(:rdfa, :prefixes => prefixes)
### Template-based Writer
The RDFa writer uses [Haml][Haml] templates for code generation. This allows fully
customizable RDFa output in a variety of host languages.
The [default template]({RDF::RDFa::Writer::DEFAULT_HAML}) generates human readable HTML5
output. A [minimal template]({RDF::RDFa::Writer::MIN_HAML}) generates HTML, which is not
intended for human consumption.
To specify an alternative Haml template, consider the following:

    require 'rdf/rdfa'

    RDF::RDFa::Writer.buffer(:haml => RDF::RDFa::Writer::MIN_HAML) << graph

The template hash defines four Haml templates:
* _doc_: Document Template, takes an ordered list of _subject_s and yields each one to be rendered. From {RDF::RDFa::Writer#render_document}:
{include:RDF::RDFa::Writer#render_document}
This template takes locals _lang_, _prefix_, _base_, _title_ in addition to _subjects_
to create output similar to the following:
Document Title
...
Options passed to the Writer are used to supply _lang_ and _base_ locals.
_prefix_ is generated based upon prefixes found from the default profiles, as well
as those provided by a previous Reader. _title_ is taken from the first top-level subject
having an appropriate title property (as defined by the _heading\_predicates_ option).
* _subject_: Subject Template, takes a _subject_ and an ordered list of _predicate_s and yields
each _predicate_ to be rendered. From {RDF::RDFa::Writer#render_subject}:
{include:RDF::RDFa::Writer#render_subject}
The template takes locals _rel_ and _typeof_ in addition to _predicates_ and _subject_ to
create output similar to the following:
...
Note that if _typeof_ is defined, in this template, it will generate a textual description.
* _property\_value_: Property Value Template, used for predicates having a single value; takes
a _predicate_, and a single-valued Array of _objects_. From {RDF::RDFa::Writer#render_property}:
{include:RDF::RDFa::Writer#render_property}
In addition to _predicate_ and _objects_, the template takes _inlist_ to indicate that the
property is part of an `rdf:List`.
Also, if the predicate is identified as a _heading predicate_ (via _:heading\_predicates_ option),
it will generate a heading element, and may use the value as the document title.
Each _object_ is yielded to the calling block, and the result is rendered, unless nil.
Otherwise, rendering depends on the type of _object_. This is useful for recursive document
descriptions.
Creates output similar to the following:
Note the use of methods defined in {RDF::RDFa::Writer} useful in rendering the output.
* _property\_values_: Similar to _property\_value_, but for predicates having more than one value.
Locals are identical to _property\_value_, but _objects_ is expected to have more than one value. Described further in {RDF::RDFa::Writer#render_property}.
In this case, an unordered list is used for output. Creates output similar to the following:
If _property\_values_ does not exist, repeated values will be replicated
using _property\_value_.
* Type-specific templates.
  To simplify generation of different output types, the
  template may contain elements indexed by a URI. When a subject with an rdf:type
  matching that URI is found, subsequent Haml definitions will be taken from
  the associated Hash. For example:

      {
        :document => "...",
        :subject => "...",
        :property_value => "...",
        :property_values => "...",
        RDF::URI("http://schema.org/Person") => {
          :subject => "...",
          :property_value => "...",
          :property_values => "...",
        }
      }
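
The lookup implied by type-specific templates can be sketched as follows. This is not the Writer's implementation: the helper name `find_template` is hypothetical, plain strings stand in for `RDF::URI` keys, and falling back to the top-level template when the type-specific Hash lacks an entry is an assumption.

```ruby
# Hypothetical helper: choose the Haml source for a named template, preferring
# a Hash registered under one of the subject's rdf:type values.
def find_template(templates, name, types)
  types.each do |type|
    specific = templates[type]
    return specific[name] if specific.is_a?(Hash) && specific.key?(name)
  end
  # Assumed fallback: the top-level template of the same name.
  templates[name]
end

# Strings stand in for RDF::URI keys and for real Haml sources.
templates = {
  :subject        => "default subject template",
  :property_value => "default property_value template",
  "http://schema.org/Person" => {
    :subject => "person subject template"
  }
}

puts find_template(templates, :subject, ["http://schema.org/Person"])
puts find_template(templates, :property_value, ["http://schema.org/Person"])
```

The first call selects the Person-specific subject template; the second falls back to the default because the Person Hash defines no `:property_value` entry.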
## Dependencies
* [Ruby](http://ruby-lang.org/) (>= 1.9) or (>= 1.8.1 with [Backports][])
* [RDF.rb](http://rubygems.org/gems/rdf) (>= 0.3.1)
* [Haml](https://rubygems.org/gems/haml) (>= 3.0.0)
* [HTMLEntities](https://rubygems.org/gems/htmlentities) (>= 4.3.0)
* Soft dependency on [Nokogiri](http://rubygems.org/gems/nokogiri) (>= 1.3.3)
## Documentation
Full documentation available on [Rubydoc.info][RDFa doc]
### Principal Classes
* {RDF::RDFa::Format}
  * {RDF::RDFa::HTML}
    Asserts :html format, text/html mime-type and .html file extension.
  * {RDF::RDFa::XHTML}
    Asserts :html format, application/xhtml+xml mime-type and .xhtml file extension.
  * {RDF::RDFa::SVG}
    Asserts :svg format, image/svg+xml mime-type and .svg file extension.
* {RDF::RDFa::Reader}
  * {RDF::RDFa::Reader::Nokogiri}
  * {RDF::RDFa::Reader::REXML}
* {RDF::RDFa::Context}
* {RDF::RDFa::Expansion}
* {RDF::RDFa::Writer}
### Additional vocabularies
* {RDF::PTR}
* {RDF::RDFA}
* {RDF::XHV}
* {RDF::XML}
* {RDF::XSI}
## TODO
* Add support for LibXML and REXML bindings, and use the best available
* Consider a SAX-based parser for improved performance
## Resources
* [RDF.rb][RDF.rb]
* [Distiller](http://rdf.greggkellogg.net/distiller)
* [Documentation][RDFa doc]
* [History]{file:History.markdown}
* [RDFa 1.1 Core][RDFa 1.1 Core]
* [XHTML+RDFa 1.1][XHTML+RDFa 1.1]
* [RDFa-test-suite](http://rdfa.info/test-suite/ "RDFa test suite")
## Author
* [Gregg Kellogg](http://github.com/gkellogg)
## Contributors
* [Nicholas Humfrey](http://github.com/njh)
## Contributing
* Do your best to adhere to the existing coding conventions and idioms.
* Don't use hard tabs, and don't leave trailing whitespace on any line.
* Do document every method you add using [YARD][] annotations. Read the
[tutorial][YARD-GS] or just look at the existing code for examples.
* Don't touch the `.gemspec`, `VERSION` or `AUTHORS` files. If you need to
change them, do so on your private branch only.
* Do feel free to add yourself to the `CREDITS` file and the corresponding
list in the `README`. Alphabetical order applies.
* Do note that in order for us to merge any non-trivial changes (as a rule
of thumb, additions larger than about 15 lines of code), we need an
explicit [public domain dedication][PDD] on record from you.
## License
This is free and unencumbered public domain software. For more information,
see the accompanying {file:UNLICENSE} file.
## FEEDBACK
* gregg@kellogg-assoc.com
[RDF.rb]: http://rubygems.org/gems/rdf
[YARD]: http://yardoc.org/
[YARD-GS]: http://rubydoc.info/docs/yard/file/docs/GettingStarted.md
[PDD]: http://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
[RDFa 1.1 Core]: http://www.w3.org/TR/2012/PR-rdfa-core-20120508/ "RDFa 1.1 Core"
[RDFa Lite 1.1]: http://www.w3.org/TR/2012/PR-rdfa-lite-20120508/ "RDFa Lite 1.1"
[XHTML+RDFa 1.1]: http://www.w3.org/TR/2012/PR-xhtml-rdfa-20120508/ "XHTML+RDFa 1.1"
[HTML+RDFa 1.1]: http://www.w3.org/TR/rdfa-in-html/ "HTML+RDFa 1.1"
[RDFa-test-suite]: http://rdfa.info/test-suite/ "RDFa test suite"
[RDFa doc]: http://rubydoc.info/github/ruby-rdf/rdf-rdfa/frames
[Haml]: http://haml-lang.com/
[Turtle]: http://www.w3.org/TR/2011/WD-turtle-20110809/