mofo

Get Started Immediately


$ sudo gem install mofo -y
... install mofo and hpricot dependency ...
$ irb -rubygems 
>> require 'mofo'
=> true
>> fireball = HCard.find 'http://flickr.com/people/gruber/'
=> #<HCard:0x6db898 ...>
>> fireball.properties
=> ["fn", "logo", "url", "n", "adr", "title", "nickname"]
>> fireball.nickname
=> "gruber"
>> fireball.url
=> "http://daringfireball.net/"
>> fireball.n.family_name
=> "Gruber"
>> fireball.title
=> "Raconteur"
>> fireball.adr.locality
=> "Philadelphia"
>> fireball.logo
=> "http://static.flickr.com/9/buddyicons/44621776@N00.jpg?1117572751"

Microwhozit?

Microformats are tiny little markup definitions built on top of, usually, HTML or XHTML.

You have a blog. You have recent posts on your blog's index page. You have an Atom feed. You have recent posts on your blog's Atom feed. See where I'm going with this?

The hAtom microformat (or uformat) can be embedded in your existing HTML by setting CSS classes with semantic meaning inside of your posts: one class to signify that a post is contained within this div, another to signify that the contents of this h3 are the post's title, another to signify that the contents of this span are the blog post's author, etc.

You can then use a microformat parser (like, say, mofo) to extract this information as you would from an Atom feed. Hell, you can even convert hAtom to Atom. It's an insta-feed! No extra code required!

You're already doing the work, you see. Microformats are everywhere. We just need to set them free.

Check it:

<div class="post">
<h3>Megadeth Show Last Night</h3>
<span class="subtitle">Posted by Chris on June 4th</span>
<div class="content">Went to a show last night. Megadeth. It was alright.</div>
</div>

Right? Normal. Here's the same post marked up with hAtom:

<div class="post hentry">
<h3 class="entry-title">Megadeth Show Last Night</h3>
<span class="subtitle">Posted by <span class="author vcard fn">Chris</span> on 
<abbr class="updated" title="2006-06-04T10:32:10Z">June 4th</abbr></span>
<div class="content entry-content">
Went to a show last night. Megadeth. It was alright.
</div>
</div>

All I did was add the hentry, entry-title, and entry-content classes to existing containers. Then I went ahead and wrapped the date in an <abbr> tag, giving it a title in the microformat-standard way. Finally, I put a span around Chris, signifying it as the author field of the hEntry and making it a valid hCard by including the vcard and fn classes. It's really not all that hard. Did I mess it up? Maybe, but I'm sure I got close. And I didn't even use a reference. Practice.
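
To see that those classes really are all a parser needs, here's a rough sketch that reads the marked-up post with nothing but Ruby's bundled REXML. This is not how mofo does it (mofo leans on hpricot). The first_with_class and read_hentry helpers are made up for this example, and it assumes the fragment is well-formed XML, which real-world HTML often isn't.

```ruby
require 'rexml/document'
require 'time'

# Hypothetical helper: first descendant of +root+ whose class attribute
# lists +name+ among its space-separated tokens, or nil if there is none.
def first_with_class(root, name)
  root.each_recursive do |el|
    next unless el.is_a?(REXML::Element)  # skip anything that is not an element
    return el if el.attributes['class'].to_s.split.include?(name)
  end
  nil
end

# Hypothetical helper: pull the hAtom fields used on this page out of one
# hentry fragment. Assumes well-formed XML with every field present.
def read_hentry(markup)
  entry = REXML::Document.new(markup).root
  {
    'entry_title'   => first_with_class(entry, 'entry-title').text.strip,
    'author'        => first_with_class(entry, 'author').text.strip,
    # The machine-readable date lives in the <abbr> title attribute.
    'updated'       => Time.iso8601(first_with_class(entry, 'updated').attributes['title']),
    'entry_content' => first_with_class(entry, 'entry-content').text.strip
  }
end

markup = <<~MARKUP
  <div class="post hentry">
  <h3 class="entry-title">Megadeth Show Last Night</h3>
  <span class="subtitle">Posted by <span class="author vcard fn">Chris</span> on
  <abbr class="updated" title="2006-06-04T10:32:10Z">June 4th</abbr></span>
  <div class="content entry-content">
  Went to a show last night. Megadeth. It was alright.
  </div>
  </div>
MARKUP

entry = read_hentry(markup)
puts entry['entry_title']
puts entry['updated'].class
```

The class names carry all the meaning here: the tag names (h3, span, div) never show up in the lookup, which is why you can bolt hAtom onto whatever markup you already have.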

How'd we parse this, tho?

$ irb -rubygems
>> require 'mofo'
=> true
>> post = HEntry.find 'http://milesofstyle.org/posts/351-megadeth-show-last-night'
=> #<HEntry:0x6db898 ... > 
>> post.entry_title
=> "Megadeth Show Last Night"
>> post.properties
=> ["entry_content", "updated", "author", "entry_title"]
>> post.updated
=> Sun Jun 04 10:32:10 UTC 2006
>> post.updated.class
=> Time
>> post.author
=> #<HCard:0x6e7b98 @properties=["fn"], @fn="Chris">
>> post.author.fn
=> "Chris"
>> post.entry_content
=> "Went to a show last night.  Megadeth.  It was alright."

That's, like, stupid easy. If HEntry.find gets back more than one hEntry, you'll get an array.
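
As for converting hAtom to Atom: once the fields are out, building an Atom entry is mostly assembly. A rough sketch, with a plain Hash standing in for the HEntry so it runs without mofo or a network. The to_atom_entry helper is made up for this example, not a mofo method, and a valid feed needs more than this (an id on each entry, plus feed-level metadata).

```ruby
require 'rexml/document'
require 'time'

# Hypothetical helper, not part of mofo: build an Atom <entry> element from
# a Hash keyed by the property names shown in the irb session above.
def to_atom_entry(props)
  entry = REXML::Element.new('entry')
  entry.add_element('title').text = props['entry_title']
  # Atom wants RFC 3339 timestamps; getutc leaves the original Time untouched.
  entry.add_element('updated').text = props['updated'].getutc.iso8601
  entry.add_element('author').add_element('name').text = props['author']
  entry.add_element('content', { 'type' => 'text' }).text = props['entry_content']
  entry
end

post = {
  'entry_title'   => 'Megadeth Show Last Night',
  'updated'       => Time.utc(2006, 6, 4, 10, 32, 10),
  'author'        => 'Chris',
  'entry_content' => 'Went to a show last night. Megadeth. It was alright.'
}

puts to_atom_entry(post).to_s
```

Building the element through REXML rather than string interpolation means awkward characters in a title or post body get escaped for you.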

Mofo#find

Everything revolves around the #find method. Sound familiar? Yeah.

>> Microformat.find "http://valid-url.com"
>> Microformat.find "/path/to/existing/file"
>> Microformat.find :text => "microformat text"

Also, #find can be told explicitly to find all (returning an empty array on failure) or only find the first (returning nil on failure).

>> Microformat.find :all => "/existing/file"
=> [ array of microformat objects ] 
>> Microformat.find :first => "/existing/file"
=> microformat object
>> Microformat.find "/existing/file"
=> either an array of objects or just one object

When parsing a string, :all and :first go outside of :text.

>> Microformat.find :all => { :text => 'mfin text' } 

That's it.
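
If the argument shapes feel slippery, here's one way to picture them. This is a guess at the logic, not mofo's actual internals: split_find_args and shape_results are hypothetical names, and what a bare find returns when nothing matches is an assumption (this sketch hands back an empty array).

```ruby
# Hypothetical: split a find argument into [mode, source], where mode is
# :all, :first, or :auto (a bare call with no explicit mode).
def split_find_args(arg)
  if arg.is_a?(Hash) && (arg.key?(:all) || arg.key?(:first))
    mode = arg.key?(:all) ? :all : :first
    [mode, arg[mode]]   # source may be a URL, a path, or { :text => '...' }
  else
    [:auto, arg]
  end
end

# Hypothetical: shape the list of parsed objects the way the text above
# describes each mode.
def shape_results(mode, found)
  case mode
  when :all   then found          # always an array, possibly empty
  when :first then found.first    # nil when nothing was found
  else found.size == 1 ? found.first : found  # one object, or an array
  end
end

mode, source = split_find_args(:all => { :text => 'mfin text' })
puts mode
puts source[:text]
```

This is why :all and :first sit outside of :text: the outer key picks the mode, and whatever it points at (URL, path, or a :text hash) is the thing to parse.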

Supported Microformats

hCard - http://microformats.org/wiki/hcard

>> messina = HCard.find 'http://www.flickr.com/people/factoryjoe/'
=> #<HCard:0x125eb5c ...>
>> messina.properties
=> ["fn", "note", "logo", "url", "n", "adr", "title", "nickname"]
>> messina.title
=> "Citizen Provocateur, Open Source Ambassador"
>> messina.logo
=> "http://farm1.static.flickr.com/1/buddyicons/25419820@N00.jpg?1167346106"
>> messina.n
=> #<OpenStruct given_name="Chris", family_name="Messina">
>> messina.fn
=> "Chris Messina"
>> messina.url
=> "http://factoryjoe.com/blog"
>> messina.nickname
=> "factoryjoe"

hCalendar - http://microformats.org/wiki/hcalendar

>> events = HCalendar.find 'http://upcoming.org'
=> [#<HCalendar:0x131d304 ...> ... ]
>> events.size
=> 17
>> events.first.properties
=> ["summary", "url", "location"]
>> events.first.location 
=> "Neumo&#39;s, Seattle, WA"
>> events.first.summary
=> "Ratatat + 120 Days"

hReview - http://microformats.org/wiki/hreview

>> wine = HReview.find 'http://corkd.com/wine/view/1772'
=> [#<HReview:0x156c3f8 ...> ...]
>> wine.size
=> 7
>> wine.first.properties
=> ["rating", "description", "item", "reviewer", "tags", "dtreviewed"]
>> wine.first.rating
=> 3
>> wine.first.tags
=> ["fresh", "lime", "pear"]
>> wine.first.dtreviewed
=> Fri Jun 02 00:00:00 -0700 2006

hEntry - http://microformats.org/wiki/hatom

>> post = HEntry.find 'http://errtheblog.com'
=> #<HEntry:0x169309c ...>
>> post.properties
=> ["published", "entry_title", "author", "entry_content", "bookmark", "tags"]
>> post.author.class
=> HCard
>> post.author.fn
=> "Chris"
>> post.published
=> Mon Mar 26 09:21:00 UTC 2007
>> post.entry_content.length
=> 4737

hResume - http://microformats.org/wiki/hresume

>> crunch = HResume.find 'http://www.linkedin.com/in/michaelarrington'
=> #<HResume:0x129d370 ...>
>> crunch.properties
=> ["summary", "education", "experience", "contact"]
>> crunch.experience.first.class
=> HCalendar
>> crunch.contact
=> #<HCard:0x36614 ...>
>> crunch.contact.title
=> "Editor - TechCrunch"

XOXO - http://microformats.org/wiki/xoxo

>> mofo = XOXO.find 'http://mofo.rubyforge.org', :class => true
=> [["Get Started", "Microwhozit?", "Mofo#find", ...], ...]
>> mofo.first
=> ["Get Started", "Microwhozit?", "Mofo#find", "Supported Microformats", ...]
>> mofo[1]
=> ["Me and uFormats", "Microformats HQ", "Microformatique", "Assaf Arkin", ...]
>> mofo[1].first
=> "Me and uFormats"
>> mofo[1].first.class
=> XOXO::Label
>> mofo[1].first.url
=> "http://errtheblog.com/post/37"

Geo - http://microformats.org/wiki/geo

>> somewhere = Geo.find 'http://www.geograph.org.uk/photo/1234'
=> #<Geo:0x12337a4 ...>
>> somewhere.latitude
=> 54.05836
>> somewhere.longitude
=> -2.14662

Adr - http://microformats.org/wiki/adr

...coming soon...

XFN - http://microformats.org/wiki/xfn

>> tons = XFN.find 'http://deliciouslymeta.com/projects/xfn/test_data.html'
=> #<XFN:0x157f200 ...>
>> tons.first
=> #<XFN::Link name="friend - contact", relation="contact", link="#contact">
>> tons.me_and_parent
=> #<XFN::Link name="me + parent", relation=["me", "parent"], link="#parent">
>> tons.me_and_parent.name
=> "me + parent"
>> tons.neighbor 
=> [#<XFN::Link ...> ...]
>> tons.neighbor.size
=> 5
>> tons.parent_and_kin.link
=> "#parent"

Ruby on Rails

mofo doubles as a Rails plugin. Just drop it into vendor/plugins and you are good to go, with all the available microformat parsers loaded into your application. mofo classes are YAML and Marshal approved, meaning you can cache them with memcached (or DRb) or store them in a session.
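
Here's roughly what "Marshal approved" buys you. The sketch below uses a stand-in class and a plain Hash as the cache so it runs without mofo, Rails, or memcached. CachedCard, cache_write, and cache_read are all made up for illustration.

```ruby
# Stand-in for a parsed microformat object so the sketch runs without mofo.
class CachedCard
  attr_reader :fn, :nickname

  def initialize(fn, nickname)
    @fn = fn
    @nickname = nickname
  end
end

# Hypothetical cache helpers: a Hash plays the role of memcached or a session.
def cache_write(store, key, object)
  store[key] = Marshal.dump(object)   # serialize to a binary String
end

def cache_read(store, key)
  data = store[key]
  data && Marshal.load(data)          # rebuild the object, or nil on a miss
end

store = {}
cache_write(store, 'hcard:factoryjoe', CachedCard.new('Chris Messina', 'factoryjoe'))
puts cache_read(store, 'hcard:factoryjoe').nickname
```

Parse the remote page once, stash the result, and skip the network round trip on later requests.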

Install with Piston:

$ piston import svn://errtheblog.com/svn/mofo/trunk vendor/plugins/mofo  

Install with SVN:

$ ./script/plugin install -x svn://errtheblog.com/svn/mofo/trunk  

Get in Touch