Class MediaWiki::Gateway
In: lib/media_wiki/gateway.rb
Parent: Object

Attributes

base_url  [R] 
log  [R] 

Public Class methods

Set up a MediaWiki::Gateway for a given MediaWiki installation

url
Path to API of target MediaWiki (e.g. "en.wikipedia.org/w/api.php")
options
Hash of options

Options:

:ignorewarnings
Log API warnings and invalid page titles, instead of aborting with an error.
:limit
Maximum number of results returned per search (see www.mediawiki.org/wiki/API:Query_-_Lists#Limits), defaults to the MediaWiki default of 500.
:loglevel
Log level to use, defaults to Logger::WARN. Set to Logger::DEBUG to dump every request and response to the log.
:maxlag
Maximum allowed server lag (see www.mediawiki.org/wiki/Manual:Maxlag_parameter), defaults to 5 seconds.
:retry_count
Number of times to try before giving up if MediaWiki returns 503 Service Unavailable, defaults to 3 (original request plus two retries).
:retry_delay
Seconds to wait before retry if MediaWiki returns 503 Service Unavailable, defaults to 10 seconds.
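
How caller-supplied options override the defaults can be sketched in isolation. This is a minimal illustration rather than the library's own code path: DEFAULT_OPTIONS and effective_options are hypothetical names, and the default values are copied from the constructor source shown in this section.

```ruby
require 'logger'

# Defaults copied from the constructor source; the constant name is hypothetical.
DEFAULT_OPTIONS = {
  :limit       => 500,
  :loglevel    => Logger::WARN,
  :maxlag      => 5,
  :retry_count => 3,
  :retry_delay => 10
}.freeze

# Hypothetical helper mirroring `default_options.merge(options)`:
# keys given by the caller win, every other key keeps its default.
def effective_options(options = {})
  DEFAULT_OPTIONS.merge(options)
end

opts = effective_options(:limit => 50, :ignorewarnings => true)
puts opts[:limit]       # value supplied by the caller
puts opts[:retry_count] # untouched default
```

Because merge returns a new hash, the defaults themselves are never modified by a caller's options.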

[Source]

    # File lib/media_wiki/gateway.rb, line 24
    def initialize(url, options={})
      default_options = {
        :limit => 500,
        :loglevel => Logger::WARN,
        :maxlag => 5,
        :retry_count => 3,
        :retry_delay => 10
      }
      @options = default_options.merge(options)
      @wiki_url = url
      @log = Logger.new(STDERR)
      @log.level = @options[:loglevel]
      @headers = { "User-Agent" => "MediaWiki::Gateway/#{MediaWiki::VERSION}" }
      @cookies = {}
    end

Public Instance methods

Get a list of pages that link to a target page

title
Link target page
filter
"all" links (default), "redirects" only, or "nonredirects" (plain links only)

Returns array of page titles (empty if no matches)
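
The continuation loop used here (request a batch, carry the continue value into the next request, stop when none comes back) can be sketched without a wiki. Everything in this snippet is an assumption made for illustration: FAKE_BATCHES stands in for the server, and fetch_batch and collect_all are hypothetical names.

```ruby
# Hypothetical stand-in for the wiki: each continue value maps to
# [titles in this batch, continue value for the next batch or nil].
FAKE_BATCHES = {
  nil  => [%w(A B), 'c1'],
  'c1' => [%w(C D), 'c2'],
  'c2' => [%w(E),   nil]
}

# Hypothetical replacement for the API request.
def fetch_batch(continue_value)
  FAKE_BATCHES.fetch(continue_value)
end

# Same shape as the loop in the source: always fetch once,
# then keep going while the server hands back a continue value.
def collect_all
  titles = []
  continue_value = nil
  begin
    batch, continue_value = fetch_batch(continue_value)
    titles += batch
  end while continue_value
  titles
end

puts collect_all.inspect
```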

[Source]

    # File lib/media_wiki/gateway.rb, line 196
    def backlinks(title, filter = "all")
      titles = []
      blcontinue = nil
      begin
        form_data =
          {'action' => 'query',
          'list' => 'backlinks',
          'bltitle' => title,
          'blfilterredir' => filter,
          'bllimit' => @options[:limit] }
        form_data['blcontinue'] = blcontinue if blcontinue
        res, blcontinue = make_api_request(form_data, '//query-continue/backlinks/@blcontinue')
        titles += REXML::XPath.match(res, "//bl").map { |x| x.attributes["title"] }
      end while blcontinue
      titles
    end

Create a new page, or overwrite an existing one

title
Page title to create or overwrite, string
content
Content for the page, string
options
Hash of additional options

Options:

  • [:overwrite] Allow overwriting existing pages
  • [:summary] Edit summary for history, string
  • [:token] Use this existing edit token instead of requesting a new one (useful for bulk loads)
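
The effect of these options on the request can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: build_edit_form is a hypothetical helper, and token_source is a hypothetical callable standing in for a token request to the wiki.

```ruby
# Hypothetical helper: builds the form data for an edit request.
# token_source is only consulted when no :token option is supplied.
def build_edit_form(title, content, options = {}, token_source = nil)
  token = options[:token] || token_source.call(title)
  form_data = {
    'action'  => 'edit',
    'title'   => title,
    'text'    => content,
    'summary' => options[:summary] || '',
    'token'   => token
  }
  # Without :overwrite the request is marked create-only.
  form_data['createonly'] = '' unless options[:overwrite]
  form_data
end

fresh_token = lambda { |title| "token-for-#{title}" }
form = build_edit_form('Sandbox', 'Hello', { :summary => 'test', :overwrite => true }, fresh_token)
puts form.inspect
```

Passing one token for many pages avoids a token request per page, which is why the option is described as useful for bulk loads.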

[Source]

    # File lib/media_wiki/gateway.rb, line 116
    def create(title, content, options={})
      # Reuse a caller-supplied edit token if given, otherwise request a new one
      token = options[:token] || get_token('edit', title)
      form_data = {'action' => 'edit', 'title' => title, 'text' => content, 'summary' => (options[:summary] || ""), 'token' => token}
      form_data['createonly'] = "" unless options[:overwrite]
      make_api_request(form_data)
    end

Delete one page. (MediaWiki API does not support deleting multiple pages at a time.)

title
Title of page to delete

[Source]

    # File lib/media_wiki/gateway.rb, line 146
    def delete(title)
      form_data = {'action' => 'delete', 'title' => title, 'token' => get_token('delete', title)}
      make_api_request(form_data)
    end

Download file_name. Returns the file contents, or nil if the file does not exist. All options are passed to image_info; however, options['iiprop'] is forced to 'url'. You can still set other options to control which file is downloaded.

[Source]

    # File lib/media_wiki/gateway.rb, line 381
    def download(file_name, options={})
      options['iiprop'] = 'url'

      attributes = image_info(file_name, options)
      if attributes
        RestClient.get attributes['url']
      else
        nil
      end
    end

Exports a page or set of pages

page_titles
String or array of page titles to fetch

Returns MediaWiki XML dump
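
Accepting either a single title or an array works because Array#join also joins nested arrays. A minimal sketch of that idiom, where titles_param is a hypothetical helper name:

```ruby
# Hypothetical helper reproducing the `[page_titles].join('|')` idiom:
# wrapping in an array makes a String and an Array behave the same way.
def titles_param(page_titles)
  [page_titles].join('|')
end

puts titles_param("Main Page")
puts titles_param(["Main Page", "Sandbox"])
```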

[Source]

    # File lib/media_wiki/gateway.rb, line 412
    def export(page_titles)
      form_data = {'action' => 'query', 'titles' => [page_titles].join('|'), 'export' => nil, 'exportnowrap' => nil}
      return make_api_request(form_data)
    end

Get a list of all installed (and registered) extensions

Returns hash of extensions (name => version)
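
The shape of the result can be sketched with REXML alone. The XML below is a made-up sample, not a captured API response, and extensions_from is a hypothetical helper; an extension without a version attribute maps to nil.

```ruby
require 'rexml/document'

# Made-up sample in the general shape of a siteinfo reply (an assumption).
SAMPLE_EXTENSIONS_XML = <<XML
<api><query><extensions>
  <ext name="Cite" version="1.0"/>
  <ext name="ParserFunctions"/>
</extensions></query></api>
XML

# Hypothetical helper: collect name => version pairs from every <ext> element.
def extensions_from(xml_string)
  root = REXML::Document.new(xml_string).root
  REXML::XPath.match(root, "//ext").inject({}) do |result, ext|
    result[ext.attributes["name"] || ""] = ext.attributes["version"]
    result
  end
end

puts extensions_from(SAMPLE_EXTENSIONS_XML).inspect
```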

[Source]

    # File lib/media_wiki/gateway.rb, line 433
    def extensions
      form_data = { 'action' => 'query', 'meta' => 'siteinfo', 'siprop' => 'extensions' }
      res = make_api_request(form_data)
      REXML::XPath.match(res, "//ext").inject(Hash.new) do |extensions, extension|
        name = extension.attributes["name"] || ""
        extensions[name] = extension.attributes["version"]
        extensions
      end
    end

Fetch MediaWiki page in MediaWiki format. Does not follow redirects.

page_title
Page title to fetch

Returns content of page as string, nil if the page does not exist.

[Source]

    # File lib/media_wiki/gateway.rb, line 61
    def get(page_title)
      form_data = {'action' => 'query', 'prop' => 'revisions', 'rvprop' => 'content', 'titles' => page_title}
      page = make_api_request(form_data).first.elements["query/pages/page"]
      if valid_page? page
        page.elements["revisions/rev"].text || ""
      end
    end

Requests image info from MediaWiki. Follows redirects.

file_name_or_page_id should be either:

  • a file name (String), without the File: prefix, or
  • a Fixnum page id of the file.

options is a Hash passed as query arguments. See www.mediawiki.org/wiki/API:Query_-_Properties#imageinfo_.2F_ii for more information.

options['iiprop'] should be either a String of properties joined by '|' or an Array (or, more precisely, anything that responds to join).

Returns a Hash-like object whose keys are image properties, or nil if the file does not exist.

Example:

  mw.image_info(
    "Trooper.jpg", 'iiprop' => ['timestamp', 'user']
  ).each do |key, value|
    puts "#{key.inspect} => #{value.inspect}"
  end

Output:

  "timestamp" => "2009-10-31T12:59:11Z"
  "user" => "Valdas"
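
The handling of 'iiprop' described above (join anything that responds to join, leave strings alone) can be sketched separately. normalize_iiprop is a hypothetical helper; unlike the method itself, this sketch works on a copy so the caller's hash is left untouched.

```ruby
# Hypothetical helper: returns a copy of options in which an Array-like
# 'iiprop' has been joined with '|'. A String value is left unchanged.
def normalize_iiprop(options)
  normalized = options.dup
  if normalized['iiprop'].respond_to?(:join)
    normalized['iiprop'] = normalized['iiprop'].join('|')
  end
  normalized
end

puts normalize_iiprop('iiprop' => ['timestamp', 'user']).inspect
puts normalize_iiprop('iiprop' => 'timestamp|user').inspect
```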

[Source]

    # File lib/media_wiki/gateway.rb, line 348
    def image_info(file_name_or_page_id, options={})
      options['iiprop'] = options['iiprop'].join('|') \
        if options['iiprop'].respond_to?(:join)
      form_data = options.merge(
        'action' => 'query',
        'prop' => 'imageinfo',
        'redirects' => true
      )

      case file_name_or_page_id
      when Fixnum
        form_data['pageids'] = file_name_or_page_id
      else
        form_data['titles'] = "File:#{file_name_or_page_id}"
      end

      xml, dummy = make_api_request(form_data)
      page = xml.elements["query/pages/page"]
      if valid_page? page
        if xml.elements["query/redirects/r"]
          # We're dealing with redirect here.
          image_info(page.attributes["pageid"].to_i, options)
        else
          page.elements["imageinfo/ii"].attributes
        end
      else
        nil
      end
    end

Imports a MediaWiki XML dump

xmlfile
Path to the MediaWiki XML dump file to import

Returns XML array <api><import><page/><page/>… where <page revisions="1"> (or more) means the page was successfully imported, and <page revisions="0"> means a duplicate that was not imported.

[Source]

    # File lib/media_wiki/gateway.rb, line 399
    def import(xmlfile)
      form_data = { "action"  => "import",
        "xml"     => File.new(xmlfile),
        "token"   => get_token('import', 'Main Page'), # NB: dummy page name
        "format"  => 'xml' }
      make_api_request(form_data)
    end

Get a list of matching page titles in a namespace

key
Search key, matched as a prefix (^key.*). May contain or equal a namespace, defaults to main (namespace 0) if none given.

Returns array of page titles (empty if no matches)
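
How a key is split into a title prefix and a namespace id can be sketched as follows. The NAMESPACES table is a hypothetical stand-in for the lookup the gateway performs against the wiki, and split_key is a hypothetical helper name.

```ruby
# Hypothetical namespace table; the real ids come from the wiki itself.
NAMESPACES = { "" => 0, "Talk" => 1, "User" => 2 }

# Hypothetical helper: split "Namespace:Prefix" at the first colon.
# Reversing puts the title prefix first, so a key without a colon
# yields a nil namespace, which falls back to 0 (main).
def split_key(key)
  prefix, namespace = key.split(":", 2).reverse
  [prefix, NAMESPACES[namespace] || 0]
end

puts split_key("Talk:Foo").inspect
puts split_key("Foo").inspect
puts split_key("User:").inspect
```

A key that equals a namespace, such as "User:", produces an empty prefix, so every page in that namespace matches.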

[Source]

    # File lib/media_wiki/gateway.rb, line 171
    def list(key)
      titles = []
      apfrom = nil
      key, namespace = key.split(":", 2).reverse
      namespace = namespaces_by_prefix[namespace] || 0
      begin
        form_data =
          {'action' => 'query',
          'list' => 'allpages',
          'apfrom' => apfrom,
          'apprefix' => key,
          'aplimit' => @options[:limit],
          'apnamespace' => namespace}
        res, apfrom = make_api_request(form_data, '//query-continue/allpages/@apfrom')
        titles += REXML::XPath.match(res, "//p").map { |x| x.attributes["title"] }
      end while apfrom
      titles
    end

Log in to MediaWiki

username
Username
password
Password
domain
Domain for authentication plugin logins (e.g. LDAP), optional; defaults to 'local' if not given

Raises an error if login fails

[Source]

    # File lib/media_wiki/gateway.rb, line 49
    def login(username, password, domain = 'local')
      form_data = {'action' => 'login', 'lgname' => username, 'lgpassword' => password, 'lgdomain' => domain}
      make_api_request(form_data)
      @password = password
      @username = username
    end

Move a page to a new title

from
Old page name
to
New page name
options
Hash of additional options

Options:

  • [:movesubpages] Move associated subpages
  • [:movetalk] Move associated talkpages
  • [:noredirect] Do not create a redirect page from old name. Requires the ‘suppressredirect’ user right, otherwise MW will silently ignore the option and create the redirect anyway.
  • [:reason] Reason for move
  • [:watch] Add page and any redirect to watchlist
  • [:unwatch] Remove page and any redirect from watchlist
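
Unknown options are rejected before any request is sent. A minimal sketch of that check, assuming the option names listed above; check_options is a hypothetical helper name. Keys are compared by their string form, so symbols and strings are both accepted.

```ruby
VALID_MOVE_OPTIONS = %w(movesubpages movetalk noredirect reason watch unwatch)

# Hypothetical helper: raise ArgumentError for any key that is not in
# the list of valid names, comparing keys by their string form.
def check_options(options, valid_options)
  options.keys.each do |opt|
    raise ArgumentError, "Unknown option '#{opt}'" unless valid_options.include?(opt.to_s)
  end
  true
end

check_options({ :reason => 'typo in title', 'movetalk' => true }, VALID_MOVE_OPTIONS)

begin
  check_options({ :bogus => 1 }, VALID_MOVE_OPTIONS)
rescue ArgumentError => e
  puts e.message
end
```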

[Source]

    # File lib/media_wiki/gateway.rb, line 135
    def move(from, to, options={})
      valid_options = %w(movesubpages movetalk noredirect reason watch unwatch)
      options.keys.each{|opt| raise ArgumentError.new("Unknown option '#{opt}'") unless valid_options.include?(opt.to_s)}

      form_data = options.merge({'action' => 'move', 'from' => from, 'to' => to, 'token' => get_token('move', from)})
      make_api_request(form_data)
    end

Get a list of all known namespaces

Returns hash of namespaces (name => id)

[Source]

    # File lib/media_wiki/gateway.rb, line 420
    def namespaces_by_prefix
      form_data = { 'action' => 'query', 'meta' => 'siteinfo', 'siprop' => 'namespaces' }
      res = make_api_request(form_data)
      REXML::XPath.match(res, "//ns").inject(Hash.new) do |namespaces, namespace|
        prefix = namespace.attributes["canonical"] || ""
        namespaces[prefix] = namespace.attributes["id"].to_i
        namespaces
      end
    end

Checks if page is a redirect.

page_title
Page title to check

Returns true if the page is a redirect, false if it is not or the page does not exist.

[Source]

    # File lib/media_wiki/gateway.rb, line 316
    def redirect?(page_title)
      form_data = {'action' => 'query', 'prop' => 'info', 'titles' => page_title}
      page = make_api_request(form_data).first.elements["query/pages/page"]
      !!(valid_page?(page) and page.attributes["redirect"])
    end

Render a MediaWiki page as HTML

page_title
Page title to fetch
options
Hash of additional options

Options:

  • [:linkbase] supply a String to prefix all internal (relative) links with. ’/wiki/’ is assumed to be the base of a relative link
  • [:noeditsections] strips all edit-links if set to true
  • [:noimages] strips all img tags from the rendered text if set to true

Returns rendered page as string, or nil if the page does not exist
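
The post-processing these options trigger can be tried on a plain string. The sketch below reuses the regular expressions from the source listing, but postprocess is a hypothetical helper and the HTML is a made-up sample rather than real parser output.

```ruby
# Hypothetical helper applying the same substitutions to an HTML string.
def postprocess(html, options = {})
  # HTML comments are always stripped.
  html = html.gsub(/<!--(.|\s)*?-->/, '')
  if options[:linkbase]
    # Prefix internal links, which are assumed to start with /wiki/.
    html = html.gsub(/\shref="\/wiki\/([\w\(\)_\-\.%\d:,]*)"/,
                     ' href="' + options[:linkbase] + '/wiki/\1"')
  end
  # Greedy match: everything from the first "<img" to the last "/>" on a line goes.
  html = html.gsub(/<img.*\/>/, '') if options[:noimages]
  html
end

SAMPLE_HTML = '<p><!-- cache --><a href="/wiki/Foo" title="Foo">Foo</a> <img src="/x.png" /></p>'
puts postprocess(SAMPLE_HTML, :linkbase => 'http://example.org', :noimages => true)
```

Because the image pattern is greedy, text that sits between two img tags on the same line is removed along with them.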

[Source]

    # File lib/media_wiki/gateway.rb, line 80
    def render(page_title, options = {})
      form_data = {'action' => 'parse', 'page' => page_title}

      valid_options = %w(linkbase noeditsections noimages)
      # Check options
      options.keys.each{|opt| raise ArgumentError.new("Unknown option '#{opt}'") unless valid_options.include?(opt.to_s)}

      rendered = nil
      parsed = make_api_request(form_data).first.elements["parse"]
      if parsed.attributes["revid"] != '0'
        rendered = parsed.elements["text"].text.gsub(/<!--(.|\s)*?-->/, '')
        # OPTIMIZE: unify the keys in +options+ like symbolize_keys! but w/o
        if options["linkbase"] or options[:linkbase]
          linkbase = options["linkbase"] || options[:linkbase]
          rendered = rendered.gsub(/\shref="\/wiki\/([\w\(\)_\-\.%\d:,]*)"/, ' href="' + linkbase + '/wiki/\1"')
        end
        if options["noeditsections"] or options[:noeditsections]
          rendered = rendered.gsub(/<span class="editsection">\[.+\]<\/span>/, '')
        end
        if options["noimages"] or options[:noimages]
          rendered = rendered.gsub(/<img.*\/>/, '')
        end
      end
      rendered
    end

Get a list of pages with matching content in given namespaces

key
Search key
namespaces
Array of namespace names to search (defaults to main only)
limit
Maximum number of hits to ask for (defaults to 500; note that Wikimedia Foundation wikis allow only 50 for normal users)

Returns array of page titles (empty if no matches)

[Source]

    # File lib/media_wiki/gateway.rb, line 220
    def search(key, namespaces=nil, limit=@options[:limit])
      titles = []
      offset = nil
      in_progress = true

      form_data = { 'action' => 'query',
        'list' => 'search',
        'srwhat' => 'text',
        'srsearch' => key,
        'srlimit' => limit
      }
      if namespaces
        namespaces = [ namespaces ] unless namespaces.kind_of? Array
        form_data['srnamespace'] = namespaces.map! do |ns| namespaces_by_prefix[ns] end.join('|')
      end
      begin
        form_data['sroffset'] = offset if offset
        res, offset = make_api_request(form_data, '//query-continue/search/@sroffset')
        titles += REXML::XPath.match(res, "//p").map { |x| x.attributes["title"] }
      end while offset
      titles
    end

Execute Semantic MediaWiki query

query
Semantic MediaWiki query
params
Array of additional parameters or options, e.g. mainlabel=Foo or ?Place (optional)

Returns result as an HTML string
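
The wikitext sent to the parser is an #ask call assembled from the query and the parameters, with format=list appended. A minimal sketch of that assembly, where ask_wikitext is a hypothetical helper; unlike the method itself, it does not modify the params array passed in.

```ruby
# Hypothetical helper: builds the {{#ask:...}} wikitext without
# mutating the caller's params array.
def ask_wikitext(query, params = [])
  all_params = params + ["format=list"]
  "{{#ask:#{query}|#{all_params.join('|')}}}"
end

puts ask_wikitext("[[Category:City]]", ["?Population"])
```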

[Source]

    # File lib/media_wiki/gateway.rb, line 449
    def semantic_query(query, params = [])
      params << "format=list"
      form_data = { 'action' => 'parse', 'prop' => 'text', 'text' => "{{#ask:#{query}|#{params.join('|')}}}" }
      xml, dummy = make_api_request(form_data)
      return xml.elements["parse/text"].text
    end

Undelete all revisions of one page.

title
Title of page to undelete

Returns number of revisions undeleted, or zero if nothing to undelete

[Source]

    # File lib/media_wiki/gateway.rb, line 156
    def undelete(title)
      token = get_undelete_token(title)
      if token
        form_data = {'action' => 'undelete', 'title' => title, 'token' => token }
        make_api_request(form_data).first.elements["undelete"].attributes["revisions"].to_i
      else
        0 # No revisions to undelete
      end
    end

Upload a file, or get the status of pending uploads. Several methods are available:

  • Upload file contents directly.
  • Have the MediaWiki server fetch a file from a URL, using the "url" parameter

Requires MediaWiki 1.16+

Arguments:

  • [path] Path to file to upload. Set to nil if uploading from URL.
  • [options] Hash of additional options

Note that queries using session keys must be done in the same login session as the query that originally returned the key (i.e. do not log out and then log back in).

Options:

  • ‘filename’ - Target filename (defaults to local name if not given), options[:target] is alias for this.
  • ‘comment’ - Upload comment. Also used as the initial page text for new files if "text" is not specified.
  • ‘text’ - Initial page text for new files
  • ‘watch’ - Watch the page
  • ‘ignorewarnings’ - Ignore any warnings
  • ‘url’ - URL to fetch the file from. Set path to nil if you want to use this.

Deprecated but still supported options:

  • :description - Description of this file. Used as ‘text’.
  • :target - Target filename, same as ‘filename’.
  • :summary - Edit summary for history. Used as ‘comment’. Also used as ‘text’ if :description is not specified.

Examples:

  mw.upload('/path/to/local/file.jpg', 'filename' => "RemoteFile.jpg")
  mw.upload(nil, 'filename' => "RemoteFile2.jpg", 'url' => 'http://remote.com/server/file.jpg')
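
How the deprecated symbol options map onto the current string options can be sketched on its own. normalize_upload_options is a hypothetical helper that follows the same steps as the source listing but works on a copy of the hash.

```ruby
# Hypothetical helper: translate deprecated options into current ones.
def normalize_upload_options(options)
  opts = options.dup
  if opts[:description]
    opts['text'] = opts.delete(:description)
  end
  if opts[:target]
    opts['filename'] = opts.delete(:target)
  end
  if opts[:summary]
    summary = opts.delete(:summary)
    opts['text'] ||= summary    # only used as text when none was set
    opts['comment'] = summary
  end
  opts['comment'] ||= "Uploaded by MediaWiki::Gateway"
  opts
end

puts normalize_upload_options(:target => "Remote.jpg", :summary => "First upload").inspect
```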

[Source]

    # File lib/media_wiki/gateway.rb, line 277
    def upload(path, options={})
      if options[:description]
        options['text'] = options[:description]
        options.delete(:description)
      end

      if options[:target]
        options['filename'] = options[:target]
        options.delete(:target)
      end

      if options[:summary]
        options['text'] ||= options[:summary]
        options['comment'] = options[:summary]
        options.delete(:summary)
      end

      options['comment'] ||= "Uploaded by MediaWiki::Gateway"
      options['file'] = File.new(path) if path
      full_name = path || options['url']
      options['filename'] ||= File.basename(full_name) if full_name

      raise ArgumentError.new(
        "One of the 'file', 'url' or 'sessionkey' options must be specified!"
      ) unless options['file'] || options['url'] || options['sessionkey']

      form_data = options.merge(
        'action' => 'upload',
        'token' => get_token('edit', options['filename'])
      )

      make_api_request(form_data)
    end

Private Instance methods

Get API XML response. If there are errors or warnings, raise an exception; otherwise return the XML root.
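
The error check can be tried against hand-written XML. This is a simplified sketch, not the method itself: check_api_response is a hypothetical helper, the XML strings are made up, and warning handling is left out.

```ruby
require 'rexml/document'

# Hypothetical, simplified check: parse the body, make sure the root
# element looks like an API reply, and raise if it carries an <error>.
def check_api_response(body)
  begin
    root = REXML::Document.new(body).root
  rescue REXML::ParseException
    raise "Response is not XML"
  end
  unless root && %w(api mediawiki).include?(root.name)
    raise "Response does not contain MediaWiki API XML"
  end
  error = root.elements["error"]
  if error
    raise "API error: code '#{error.attributes["code"]}', info '#{error.attributes["info"]}'"
  end
  root
end

puts check_api_response('<api><query/></api>').name

begin
  check_api_response('<api><error code="badtoken" info="Invalid token"/></api>')
rescue RuntimeError => e
  puts e.message
end
```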

[Source]

    # File lib/media_wiki/gateway.rb, line 518
    def get_response(res)
      begin
        doc = REXML::Document.new(res).root
      rescue REXML::ParseException => e
        raise "Response is not XML.  Are you sure you are pointing to api.php?"
      end
      log.debug("RES: #{doc}")
      raise "Response does not contain Mediawiki API XML: #{res}" unless [ "api", "mediawiki" ].include? doc.name
      if doc.elements["error"]
        code = doc.elements["error"].attributes["code"]
        info = doc.elements["error"].attributes["info"]
        raise "API error: code '#{code}', info '#{info}'"
      end
      if doc.elements["warnings"]
        warning("API warning: #{doc.elements["warnings"].children.map {|e| e.text}.join(", ")}")
      end
      doc
    end

Fetch token (type ‘delete’, ‘edit’, ‘import’, ‘move’)

[Source]

    # File lib/media_wiki/gateway.rb, line 459
    def get_token(type, page_titles)
      form_data = {'action' => 'query', 'prop' => 'info', 'intoken' => type, 'titles' => page_titles}
      res, dummy = make_api_request(form_data)
      token = res.elements["query/pages/page"].attributes[type + "token"]
      raise "User is not permitted to perform this operation: #{type}" if token.nil?
      token
    end

[Source]

    # File lib/media_wiki/gateway.rb, line 467
    def get_undelete_token(page_titles)
      form_data = {'action' => 'query', 'list' => 'deletedrevs', 'prop' => 'info', 'drprop' => 'token', 'titles' => page_titles}
      res, dummy = make_api_request(form_data)
      if res.elements["query/deletedrevs/page"]
        token = res.elements["query/deletedrevs/page"].attributes["token"]
        # No local variable named type exists in this method, so name the operation directly
        raise "User is not permitted to perform this operation: undelete" if token.nil?
        token
      else
        nil
      end
    end

Make generic request to API

form_data
hash or string of attributes to post
continue_xpath
XPath selector for query continue parameter
retry_count
Counter for retries

Returns array of [XML document, continue value]; the continue value is nil when there are no further results
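
The retry behaviour governed by :retry_count and :retry_delay can be sketched without HTTP. In this illustration request_with_retries is a hypothetical helper, and the block stands in for the HTTP call by returning a [status code, body] pair for a given attempt number.

```ruby
# Hypothetical helper: call the block, retry on 503 until max_attempts
# is reached, and raise for any other response outside the 2xx range.
def request_with_retries(max_attempts, delay, attempt = 1, &request)
  code, body = request.call(attempt)
  if code == 503 && attempt < max_attempts
    sleep delay
    # Return the retried result rather than falling through to the check below.
    return request_with_retries(max_attempts, delay, attempt + 1, &request)
  end
  raise "API error, bad response: #{code}" unless code >= 200 && code < 300
  body
end

# Simulated server: busy twice, then successful.
RESPONSES = { 1 => [503, "busy"], 2 => [503, "busy"], 3 => [200, "<api/>"] }
puts request_with_retries(3, 0) { |attempt| RESPONSES[attempt] }
```

With max_attempts set to 3 the original request is followed by at most two retries, which matches the description of :retry_count above.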

[Source]

    # File lib/media_wiki/gateway.rb, line 486
    def make_api_request(form_data, continue_xpath=nil, retry_count=1)
      if form_data.kind_of? Hash
        form_data['format'] = 'xml'
        form_data['maxlag'] = @options[:maxlag]
      end
      log.debug("REQ: #{form_data.inspect}, #{@cookies.inspect}")
      RestClient.post(@wiki_url, form_data, @headers.merge({:cookies => @cookies})) do |response, &block|
        if response.code == 503 and retry_count < @options[:retry_count]
          log.warn("503 Service Unavailable: #{response.body}.  Retry in #{@options[:retry_delay]} seconds.")
          sleep @options[:retry_delay]
          # Return the retried result; otherwise the 503 response falls through to the error check below
          return make_api_request(form_data, continue_xpath, retry_count + 1)
        end
        # Check response for errors and return XML
        raise "API error, bad response: #{response}" unless response.code >= 200 and response.code < 300
        doc = get_response(response.dup)
        if(form_data['action'] == 'login')
          login_result = doc.elements["login"].attributes['result']
          @cookies.merge!(response.cookies)
          case login_result
            when "Success" then # do nothing
            when "NeedToken" then make_api_request(form_data.merge('lgtoken' => doc.elements["login"].attributes["token"]))
            else raise "Login failed: " + login_result
          end
        end
        continue = (continue_xpath and doc.elements['query-continue']) ? REXML::XPath.first(doc, continue_xpath).value : nil
        return [doc, continue]
      end
    end

[Source]

    # File lib/media_wiki/gateway.rb, line 537
    def valid_page?(page)
      return false unless page
      return false if page.attributes["missing"]
      if page.attributes["invalid"]
        warning("Invalid title '#{page.attributes["title"]}'")
      else
        true
      end
    end

[Source]

    # File lib/media_wiki/gateway.rb, line 547
    def warning(msg)
      if @options[:ignorewarnings]
        log.warn(msg)
        return false
      else
        raise msg
      end
    end
