README.md in disco-0.1.1 vs README.md in disco-0.1.2

- old
+ new

@@ -2,11 +2,11 @@
 :fire: Collaborative filtering for Ruby

 - Supports user-based and item-based recommendations
 - Works with explicit and implicit feedback
-- Uses matrix factorization
+- Uses high-performance matrix factorization

 [![Build Status](https://travis-ci.org/ankane/disco.svg?branch=master)](https://travis-ci.org/ankane/disco)

 ## Installation

@@ -200,11 +200,11 @@
 recommender = Marshal.load(bin)
 ```

 ## Algorithms

-Disco uses matrix factorization.
+Disco uses high-performance matrix factorization.

 - For explicit feedback, it uses [stochastic gradient descent](https://www.csie.ntu.edu.tw/~cjlin/papers/libmf/libmf_journal.pdf)
 - For implicit feedback, it uses [coordinate descent](https://www.csie.ntu.edu.tw/~cjlin/papers/one-class-mf/biased-mf-sdm-with-supp.pdf)

 Specify the number of factors and epochs

@@ -234,17 +234,46 @@
 There are a number of ways to deal with this, but here are some common ones:

 - For user-based recommendations, show new users the most popular items.
 - For item-based recommendations, make content-based recommendations with a gem like [tf-idf-similarity](https://github.com/jpmckinney/tf-idf-similarity).

-## Daru
+## Data

-Disco works with Daru data frames
+Data can be an array of hashes

 ```ruby
-data = Daru::DataFrame.from_csv("ratings.csv")
-recommender.fit(data)
+[{user_id: 1, item_id: 1, rating: 5}, {user_id: 2, item_id: 1, rating: 3}]
 ```
+
+Or a Daru data frame
+
+```ruby
+Daru::DataFrame.from_csv("ratings.csv")
+```
+
+## Faster Similarity [experimental]
+
+If you have a large number of users/items, you can use an approximate nearest neighbors library like [NGT](https://github.com/ankane/ngt) to speed up item-based recommendations and similar users.
+
+Add this line to your application’s Gemfile:
+
+```ruby
+gem 'ngt', '>= 0.2.3'
+```
+
+Speed up item-based recommendations with:
+
+```ruby
+model.optimize_item_recs
+```
+
+Speed up similar users with:
+
+```ruby
+model.optimize_similar_users
+```
+
+This should be called after fitting or loading the model.

 ## Reference

 Get the global mean
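
Note on the new Data section: the 0.1.2 README shows ratings as an array of hashes with `user_id`, `item_id`, and `rating` keys. The following is a minimal sketch of building that shape from plain comma-separated text using only core Ruby. `rows_to_ratings` is a hypothetical helper, not part of Disco, and it assumes a header row and integer values in every column.

```ruby
# Hypothetical helper (not part of Disco): convert comma-separated text with a
# header row into the array-of-hashes shape shown in the new Data section.
# Assumption: every value is an integer; Integer() raises on anything else.
def rows_to_ratings(text)
  header, *rows = text.lines.map(&:strip).reject(&:empty?)
  keys = header.split(",").map { |k| k.strip.to_sym }
  rows.map do |row|
    values = row.split(",").map { |v| Integer(v.strip) }
    keys.zip(values).to_h
  end
end

ratings = rows_to_ratings(<<~CSV)
  user_id,item_id,rating
  1,1,5
  2,1,3
CSV
```

The resulting array matches the example literal in the diff, so it could presumably be passed to `recommender.fit` in the same way the removed Daru example passed a data frame.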