README.md in mediawiktory-0.0.3 vs README.md in mediawiktory-0.1.0

- old
+ new

@@ -1,103 +1,184 @@
 # MediaWiktory, The MediaWiki Client
-**MediaWiktory** is MediaWiki API client that doesn't suck. No, seriously.
+[![Gem Version](https://badge.fury.io/rb/mediawiktory.svg)](http://badge.fury.io/rb/mediawiktory)
+[![Build Status](https://travis-ci.org/molybdenum-99/mediawiktory.svg?branch=master)](https://travis-ci.org/molybdenum-99/mediawiktory)
-[MediaWiki API](https://www.mediawiki.org/wiki/API:Main_page)
-currently is very powerful and full-featured (thought not very easy to use).
-Things like "fetch first 50 pages from that category alongside with their
-revision history and interwiki links and mediafile stats" are typically done
+**MediaWiktory** is a MediaWiki (think Wikipedia, Wiktionary and others) API client. It is the only
+client that allows (almost) full access to MediaWiki API powers without losing Ruby powers.
+
+No, seriously.
+
+[MediaWiki API](https://www.mediawiki.org/wiki/API:Main_page) currently is very powerful and
+full-featured (though not very easy to use). Things like "fetch first 50 pages from that category
+alongside with their revision history and interwiki links and mediafile stats" are typically done
 with one carefully constructed request and return lots of useful information.
 Yes, there already are [several](https://www.mediawiki.org/wiki/API:Client_code#Ruby) API clients for Ruby, including ["official" one](https://github.com/wikimedia/mediawiki-ruby-api).
-Typical approach for all of them is thick wrapper around some functionality
-(like "login and edit pages" or "search and analyze pages"), and leave
-all the other cool things for generic `query` method (at best), or
-without any coverage at all.
+Typical approach for all of them is thick wrapper around some functionality (like "login and edit
+pages" or "search and analyze pages"), and leave all the other cool things for generic `action` method
+(at best), or without any coverage at all.
-MediaWiktory, on the contrary, does this:
-* wraps all the functionality of all the modules of modern MediaWiki,
-  as per [this documentation](https://en.wikipedia.org/w/api.php?action=help&modules=main);
-  * and, for the record, provides sparse documentation for all modules
-    and possibilities;
-* makes the wrapper reasonably thin, so you can always inspect raw responses
-  and raw request MediaWiktory will provide to the API;
-* makes the wrapper reasonably smart, so things like "fetch next page for
-  this request" or "login with a token" are done with ease (but without
-  unnecessary "auto-magic", like fetching 500 pages of data without any
-  chance to control this);
-* validates requests and simplifies parameters passing, while still
-  preserving abilitity to use any existing parameter and feature;
-* it manages cookies and tokens for you, but can expose underlying
-  Faraday web client so the middleware can be added (like caching).
+MediaWiktory, on the contrary, is:
-**Warning: Above is IDEAL state of the library. Current state, at
-version 0.0.1 is far less impressive (yet already useful).**
+* **thin** wrapper...
+* around **all** MediaWiki API features...
+* making access to them available through idiomatic Ruby code, easy to use and clearly documented.
-## Structure and usage
+## Examples
-MediaWiktory tries to resemble original API structure, but make its calls
-more Ruby-ish. So, [Arel](https://github.com/rails/arel)-alike "chainable
-query" approach is implemented.
+**Example 1.** Fetching page's text and metadata:
-Fast example:
+```ruby
+api = MediaWiktory::Wikipedia::Api.new
+response = api.query.      # "query" action is a basis for all pages/categories/meta receiving
+  titles('Argentina').     # query page titles: Argentina
+  prop(:info, :revisions). # query page properties: info, revisions
+  prop(:url, :content).    # query those properties' subproperties: full URL (from info) and content (from revisions)
+  response                 # perform query and parse it!
-**Goal**: fetch content for first 30 pages from category "Countries in
-South America".
+page = response['pages'].values.first
+puts page['title']
+# Prints:
+#   Argentina
+puts page['fullurl']
+# Prints:
+#   https://en.wikipedia.org/wiki/Argentina
+puts page['revisions'].first['*'].slice(0..200) # first 200 chars of page contents
+# Prints:
+#   {{other uses}}
+#   {{pp-semi|small=yes}}
+#   {{Use dmy dates|date=March 2017}}
+#   {{Coord|34|S|64|W|display=title}}
+#   {{Infobox country
+#   |coordinates = {{Coord|34|36|S|58|23|W|type:city}}
+#   |conventional_long_name = A
+```
-**Raw request** (split in lines for readability):
+Note that to use the MediaWiktory API wrapper you need to understand the underlying API. While previous
+experience might make you expect something like `api.page('Argentina').text`, in fact you should
+use the `query` action, request page title 'Argentina', its `:revisions` property, its `:content`
+subproperty, and voila: you have a _1-element list of revisions_ for the page, and the last revision's
+`'*'` key holds the page's text.
+The good news is all methods are documented at [RubyDoc.info](http://www.rubydoc.info/gems/mediawiktory).
+Most of the time, the documentation has enough details, so you don't need to refer to MediaWiki
+official docs.
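The response shape described in Example 1 can be tried out without network access. The hash below is a hand-written stand-in for the parsed response, not real API output: the page id key and the shortened wikitext are assumptions made for illustration, and only the key names (`pages`, `title`, `fullurl`, `revisions`, `'*'`) come from the example itself.

```ruby
# Hypothetical stand-in for the parsed response of Example 1.
# The page id ('12345') and the shortened wikitext are made up for illustration.
response = {
  'pages' => {
    '12345' => {
      'title' => 'Argentina',
      'fullurl' => 'https://en.wikipedia.org/wiki/Argentina',
      'revisions' => [
        { '*' => "{{other uses}}\n{{Infobox country" }
      ]
    }
  }
}

# 'pages' is keyed by page id, so take the first value to get the single page.
page = response['pages'].values.first

# 'revisions' is a one-element list; the wikitext lives under the '*' key.
text = page['revisions'].first['*']

puts page['title']
puts page['fullurl']
puts text.lines.first
```

Navigating a plain hash like this is a quick way to get used to the nesting before running real queries.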
+
+**Example 2:** Editing the page (we are editing [Sandbox](https://en.wikipedia.org/wiki/Wikipedia:Sandbox)
+here, which is safe, but be careful while experimenting, this code **really** replaces page's text!):
+
+```ruby
+token = api.query.meta(:tokens).response.dig('tokens', 'csrftoken')
+response = api.edit.title('Wikipedia:Sandbox').text("Test '''me''', MediaWiktory!").token(token).response
+response.to_h
+# => {"result"=>"Success", "pageid"=>16283969, "title"=>"Wikipedia:Sandbox", "contentmodel"=>"wikitext", "oldrevid"=>779502714, "newrevid"=>779502729, "newtimestamp"=>"2017-05-09T08:24:26Z"}
+
+# This, without token, will raise:
+api.edit.title('Wikipedia:Sandbox').text("Test '''me''', MediaWiktory without token!").response
+# MediaWiktory::Wikipedia::Response::Error: The "token" parameter must be set.
 ```
-https://en.wikipedia.org/w/api.php?
-  action=query&
-  generator=categorymembers&
-  gcmtitle=Category:Countries_in_South_America&
-  gcmlimit=30&
-  prop=revisions&
-  rvprop=content
-```
-**MediaWiktory request**—almost the same, but structured and validated
-on-the-fly, and without `gcm` and `rv` prefixes black magic:
+**Example 3:** Fetching all "main" page images for the pages of category:
 ```ruby
-client = MediaWiktory::Client.new('https://en.wikipedia.org/w/api.php')
+response = api.query.                  # "query" action again
+  generator(:categorymembers).         # instead of listing titles, we use "page list generator": all members of a category
+  title('Category:1960s_automobiles'). # ...of this category
+  prop(:pageimages).prop(:thumbnail).  # and fetch "pageimages" property, its "thumbnail" sub-property
+  limit('max').                        # limit to maximum number of pages available in one response
+  response
-response = client.
-  query.
-  generator(categorymembers: {title: 'Category:Countries_in_South_America', limit: 30}).
-  prop(revisions: {prop: :content}).
-  perform
+# You can fetch ALL of them with the loop below (it will be a lot):
+# response = response.continue while response.continue?
-# MediaWiktory handles "next page fetching" for you, if you want
-response.continue! while response.can_continue?
+response.to_h['pages'].values.each do |page|
+  puts "#{page['title']}: #{page.dig('thumbnail', 'source')}"
+end
+# AC Cobra: https://upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Shelby_AC_427_Cobra_vl_blue.jpg/50px-Shelby_AC_427_Cobra_vl_blue.jpg
+# Acadian (automobile):
+# Alfa Romeo 33 Stradale: https://upload.wikimedia.org/wikipedia/commons/thumb/e/eb/1968_Alfa_Romeo_Tipo_33_Stradale.jpg/50px-1968_Alfa_Romeo_Tipo_33_Stradale.jpg
+# Alfa Romeo 105/115 Series Coupés: https://upload.wikimedia.org/wikipedia/commons/thumb/8/81/Alfa_Romeo_GT_1300_Junior.jpg/50px-Alfa_Romeo_GT_1300_Junior.jpg
+# Alfa Romeo 1750 Berlina: https://upload.wikimedia.org/wikipedia/commons/thumb/2/20/Alfa_Romeo_1750_berlina_grey-front.JPG/50px-Alfa_Romeo_1750_berlina_grey-front.JPG
+# Alfa Romeo 2000: https://upload.wikimedia.org/wikipedia/commons/thumb/f/f6/Alfa_2000_touring_spider.JPG/50px-Alfa_2000_touring_spider.JPG
+# Alfa Romeo 2600: https://upload.wikimedia.org/wikipedia/commons/thumb/6/6b/Alfa-Romeo_2600-Spider-Touring.JPG/50px-Alfa-Romeo_2600-Spider-Touring.JPG
+# ...
+```
-# MediaWiktory parses response and provides smart shortcuts
-p response.pages.map(&:title)
+## Usage
+
 ```
+gem install mediawiktory
+```
-The same approach works for ANY `action` described in [docs](https://en.wikipedia.org/w/api.php),
-and all of its "modules" and "submodules".
+There are a lot of popular installations of MediaWiki besides Wikipedia. All of them have
+different versions installed with different features enabled and custom extensions turned on.
-**NB**: its 0.0.1, ok? Domain is complex and is still investigated. For
-example, above statement definitely **wants** to look like
+To cope with this multitude of features, MediaWiktory provides two ways of usage.
+### 1. Use default wrapper, generated from English Wikipedia:
+
 ```ruby
-response = client.
-  query.
-  generator(categorymembers: 'Category:Countries_in_South_America').
-  limit(30).
-  prop(revisions: :content).
-  perform
+require 'mediawiktory'
+api = MediaWiktory::Wikipedia::Api.new # => English Wikipedia
+# or
+api = MediaWiktory::Wikipedia::Api.new('http://some.site/w/api.php') # => any other MediaWiki
 ```
-...but it should be done without flexibility loss.
+...and wander through docs of [MediaWiktory::Wikipedia::Api](http://www.rubydoc.info/gems/mediawiktory/MediaWiktory/Wikipedia/Api)
+class to understand what you can do.
+### 2. Custom wrapper generation.
+```
+mediawiktory-gen -u http://some.site/w/api.php --path lib/path/to/wrapper --namespace My::Wrapper
+```
+This will generate `My::Wrapper::Api` class and a lot of other classes wrapping all actions and
+modules of target APIs. The generated code is **independent** of MediaWiktory (so you can exclude it
+from your runtime), and depends only on `addressable`, `faraday` and `faraday_middleware` gems.
+
+The usage of custom wrapper is basically the same:
+
+```ruby
+require 'path/to/wrapper/api'
+api = My::Wrapper::Api.new
+api.query # .and.so.on
+```
+
+You need custom wrapper if:
+
+* you want to have the exact list of features your site has: for example, with Wikia sites, most of
+  generic functionality (like query and edit) will work, but most of fancy modern Wikipedia actions
+  will fail with "unknown action";
+* your target site has some custom actions and modules: for example, most informative Wikidata actions
+  are custom ones, like [wbgetentities](https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities),
+  they are not present in default wrapper;
+* you want to catch up with some edge Wikipedia features; Wikipedia wrapper is generated on gem
+  release, but Wikipedia's API changes every day with new small and large experimental features.
+
+**Generator limitations:** Wrapper is generated from [HTML docs of API](https://en.wikipedia.org/w/api.php),
+but currently generator can't process old MediaWiki versions' ASCII docs format, which, unfortunately,
+is still in use on [Wikia](https://marvel.wikia.com/api.php), for example. It is subject to further
+development, as some "old" installations of MediaWiki provide pretty useful content and a lot of
+custom modules.
+
+If you integrate wrapper generated by MediaWiktory into some other library, you should note that:
+
+* All generated code is documented in YARD format, Markdown markup flavour;
+* If you use Rubocop, you will find some "good code" practices broken in generated code, because it
+  is hard to follow them in large code generation.
+
+## Roadmap
+
+* Expose underlying Faraday client for fine-tuning;
+* Handle cookies automatically (for logging in);
+* Handle file uploads (should be done as multipart, use appropriate Faraday middleware);
+* Add parser for outdated ASCII docs.
+
 ## Authors
 * [Victor Shepelev](https://zverok.github.io) [@zverok](https://github.com/zverok);
-* Sergey Mostovoy [@smostovoy](https://github.com/smostovoy).
+* Serhiy Mostovyi [@smostovoy](https://github.com/smostovoy).
 ## License
 [MIT](https://github.com/molybdenum-99/mediawiktory/blob/master/LICENSE.txt)