# patentscope gem

[![Gem Version](https://badge.fury.io/rb/patentscope.png)](http://badge.fury.io/rb/patentscope)
[![Code Climate](https://codeclimate.com/github/cantab/patentscope.png)](https://codeclimate.com/github/cantab/patentscope)

Gem to allow easy access to data from the WIPO PATENTSCOPE Web Service.

## Introduction

The Patentscope gem allows easy access, with Ruby, to data provided by the PATENTSCOPE Web Service of the World Intellectual Property Organization (WIPO).

As provided by WIPO, the PATENTSCOPE Web Service is available through a SOAP interface. The documentation provided by WIPO uses Java.

The Patentscope gem, on the other hand, provides a simple Ruby interface to the PATENTSCOPE Web Service. The gem gives access to each of the functions available from the SOAP interface.

## About the WIPO PATENTSCOPE Web Service

From the [PATENTSCOPE Web Service](http://www.wipo.int/patentscope/en/data/products.html) site:

"Includes:

* Bibliographic data for all published international applications (XML format);
* Images for all published international applications (TIFF format);
* Full-text description and claims (OCR output) for all international applications published in English, French, German, Spanish and Russian, as well as Japanese and Korean (soon available) (PDF format).

Available on the Internet on the day of publication.

Programmatic access ... to the documents available in the document tab of the PATENTSCOPE search engine ([example](http://www.wipo.int/pctdb/en/wo.jsp?WO=2009120859&IA=US2009038389&DISPLAY=DOCS)). This set makes it possible to integrate access to PATENTSCOPE in an IT architecture, to retrieve the International Application Status Report (IASR) and to parse it on the fly and to download, within the framework of the [authorized uses policy](http://www.wipo.int/patentscope/en/data/terms.html); documents by batch. The formats of the documents are the same as the formats of the documents available via the web site, i.e.
TIFF, XML for all documents and a text-based PDF OCR for most pamphlets."

The PATENTSCOPE Web Service is available from the World Intellectual Property Organization (WIPO) through a [paid subscription](http://www.wipo.int/patentscope/en/data/forms/web_service.jsp). The current cost of a subscription is 600 Swiss francs per calendar year.

If you [ask nicely](mailto:patentscope@wipo.int?subject=Request%20for%20Trial%20Trial%20to%20PATENTSCOPE%20Web%20Service), the folks at WIPO might give you a trial account.

## Installation

Add this line to your application's Gemfile:

    gem 'patentscope'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install patentscope

## Usage

### Configuration

Run the configuration block first to set the credentials for the PATENTSCOPE Web Service.

    Patentscope.configure do |config|
      config.username = 'username'
      config.password = 'password'
    end

### Configuring from Environment Variables

It is most convenient to store the PATENTSCOPE Web Service username and password as environment variables. If these are stored as `PATENTSCOPE_WEBSERVICE_USERNAME` and `PATENTSCOPE_WEBSERVICE_PASSWORD` respectively, you can simply use

    Patentscope.configure_from_env

to load the credentials into the configuration in a single step.

### Querying and Resetting Configuration

The `configured?` class method returns a boolean indicating whether the configuration has been set. This doesn't necessarily mean that the credentials are valid, only that they have been set.

    Patentscope.configured? #=> true

The `username` method of the `configuration` object returns the username set in the configuration.

    Patentscope.configuration.username #=> 'username'

The `password` method of the `configuration` object returns the password set in the configuration.

    Patentscope.configuration.password #=> 'password'

The `reset_configuration` class method resets the configuration.

    Patentscope.reset_configuration
    Patentscope.configuration #=> nil

### List of Available Methods

* `get_iasr`
* `get_available_documents`
* `get_document_content`
* `get_document_ocr_content`
* `get_document_table_of_contents`
* `get_document_content_page`
* `wsdl`

### Getting the International Application Status Report (`get_iasr`)

This is possibly the most useful of all the functions provided by this gem.

The `get_iasr` class method returns an International Application Status Report (IASR) in XML format for the specified application number. The IASR is essentially a bibliographic summary of the PCT application.

The `get_iasr` method takes an International Application number, with or without the PCT prefix and with or without slashes.

    Patentscope.get_iasr('SG2003000062')
    Patentscope.get_iasr('SG2003/000062')
    Patentscope.get_iasr('PCTSG2003000062')
    Patentscope.get_iasr('PCT/SG2003/000062')

Example output for SG2003000062:

    Patentscope.get_iasr('SG2003000062') #=> WO 2009/105044 A1 20090827 ...

The PATENTSCOPE Web Service doesn't allow documents to be accessed using WO publication numbers. Calling `Patentscope.get_iasr('WO2003/080231')`, for example, will fail.

### Getting a List of Available Documents (`get_available_documents`)

The `get_available_documents` class method returns the list of available documents for the specified application number.

    Patentscope.get_available_documents('SG2009000062') #=>

### Getting the Binary Content of a Document (`get_document_content`)

The `get_document_content` class method returns the binary content of the document for the specified document id.

    Patentscope.get_document_content('090063618004ca88') #=> UEsDBBQACAAIAIyMOy0AAAAAAAAAAAAAAAAKAAAAMDAwMDAxLnRpZsy7ezxU2 ...

### Getting the Text of a Document in PDF Format (`get_document_ocr_content`)

The `get_document_ocr_content` class method returns the binary content of the document for the specified document id, in text-based PDF format (high quality OCR).

    Patentscope.get_document_ocr_content('id00000015801579') #=> JVBERi0xLjQKJeLjz9MKOCAwIG9iago8PC9EZWNvZGVQYXJtcyA8PC9CbG ...

### Getting a List of Page IDs for a Document (`get_document_table_of_contents`)

The `get_document_table_of_contents` class method returns the list of page ids for the specified document id.

    Patentscope.get_document_table_of_contents('090063618004ca88') #=> 000001.tif

### Getting the Binary Content for a Document and Page (`get_document_content_page`)

The `get_document_content_page` class method returns the binary content for the specified document and page ids.

    Patentscope.get_document_content_page('090063618004ca88', '000001.tif') #=> SUkqAAgAAAASAP4ABAABAAAAAAAAAAABAwABAAAA

### Getting a WSDL Document for the Web Service (`wsdl`)

The `wsdl` method returns a WSDL document for the PATENTSCOPE Web Service.

    Patentscope.wsdl #=> ...
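
### Example: Normalizing Application Numbers Before Calling `get_iasr`

Since `get_iasr` accepts an application number with or without the PCT prefix and slashes, but not a WO publication number, it can be handy to reduce user input to one canonical form first, for example when using the number as a cache key. The following is only a minimal sketch: `normalize_application_number` is a hypothetical helper that is not part of the gem, and it assumes an application number is a two-letter office code followed by digits, as in the examples in this README.

```ruby
# Hypothetical helper for illustration; it is NOT part of the patentscope gem.
# Assumption: an application number is a two-letter office code followed by
# digits, as in the README examples (e.g. SG2003000062).
def normalize_application_number(input)
  # Drop whitespace and slashes, then a leading "PCT" prefix if present.
  cleaned = input.to_s.upcase.gsub(%r{[\s/]}, '').sub(/\APCT/, '')

  # WO publication numbers are not accepted by the web service, so reject
  # them early instead of making a request that will fail.
  if cleaned.start_with?('WO')
    raise ArgumentError, "WO publication numbers are not supported: #{input.inspect}"
  end

  unless cleaned.match?(/\A[A-Z]{2}\d+\z/)
    raise ArgumentError, "unrecognised application number: #{input.inspect}"
  end

  cleaned
end

# The four input styles shown for get_iasr collapse to a single form.
inputs = ['SG2003000062', 'SG2003/000062', 'PCTSG2003000062', 'PCT/SG2003/000062']
puts inputs.map { |number| normalize_application_number(number) }.uniq.inspect
```

With the gem configured, the normalized value could then be passed straight on, for example `Patentscope.get_iasr(normalize_application_number('PCT/SG2003/000062'))`.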