README.md in xlearn-0.1.0 vs README.md in xlearn-0.1.1

- old
+ new

@@ -8,10 +8,12 @@
 - Linear models
 - Factorization machines
 - Field-aware factorization machines

+[![Build Status](https://travis-ci.org/ankane/xlearn.svg?branch=master)](https://travis-ci.org/ankane/xlearn)
+
 ## Installation

 First, [install xLearn](https://xlearn-doc.readthedocs.io/en/latest/install/index.html). On Mac, copy `build/lib/libxlearn_api.dylib` to `/usr/local/lib`.

 Add this line to your application’s Gemfile:

@@ -20,12 +22,10 @@
 gem 'xlearn'
 ```

 ## Getting Started

-This library is modeled after the [Python Scikit-learn API](https://xlearn-doc.readthedocs.io/en/latest/python_api/index.html). Some methods are missing at the moment. PRs welcome!
-
 Prep your data

 ```ruby
 x = [[1, 2], [3, 4], [5, 6], [7, 8]]
 y = [1, 2, 3, 4]
@@ -56,43 +56,138 @@
 ```ruby
 model.load_model("model.bin")
 ```

+Save a text version of the model
+
+```ruby
+model.save_txt("model.txt")
+```
+
+Pass a validation set
+
+```ruby
+model.fit(x_train, y_train, eval_set: [x_val, y_val])
+```
+
+Train online
+
+```ruby
+model.partial_fit(x_train, y_train)
+```
+
+Get the bias term, linear term, and latent factors
+
+```ruby
+model.bias_term
+model.linear_term
+model.latent_factors # fm and ffm only
+```
+
 ## Parameters

 Specify parameters

 ```ruby
-model = XLearn::FM.new(k: 20, epoch: 50)
+model = XLearn::Linear.new(k: 20, epoch: 50)
 ```

 Supports the same parameters as [Python](https://xlearn-doc.readthedocs.io/en/latest/all_api/index.html)

-## Validation
+## Cross-Validation

-Pass a validation set when fitting
+Cross-validation

 ```ruby
-model.fit(x_train, y_train, eval_set: [x_val, y_val])
+model.cv(x, y)
 ```

+Specify the number of folds
+
+```ruby
+model.cv(x, y, folds: 5)
+```
+
+## Data
+
+Data can be an array of arrays
+
+```ruby
+[[1, 2, 3], [4, 5, 6]]
+```
+
+Or a Daru data frame
+
+```ruby
+Daru::DataFrame.from_csv("houses.csv")
+```
+
+Or a Numo NArray
+
+```ruby
+Numo::DFloat.new(3, 2).seq
+```
+
 ## Performance

-For performance, you can read data directly from files
+For large datasets, read data directly from files

 ```ruby
 model.fit("train.txt", eval_set: "validate.txt")
 model.predict("test.txt")
+model.cv("train.txt")
 ```

-[These formats](https://xlearn-doc.readthedocs.io/en/latest/python_api/index.html#choose-machine-learning-algorithm) are supported
+For linear models and factorization machines, use CSV:

+```txt
+label,value_1,value_2,...,value_n
+```
+
+Or the `libsvm` format (better for sparse data):
+
+```txt
+label index_1:value_1 index_2:value_2 ... index_n:value_n
+```
+
+> You can also use commas instead of spaces for separators
+
+For field-aware factorization machines, use the `libffm` format:
+
+```txt
+label field_1:index_1:value_1 field_2:index_2:value_2 ...
+```
+
+> You can also use commas instead of spaces for separators
+
 You can also write predictions directly to a file

 ```ruby
-model.predict("test.txt", out_file: "predictions.txt")
+model.predict("test.txt", out_path: "predictions.txt")
 ```
+
+## xLearn Installation
+
+There’s an experimental branch that includes xLearn with the gem for easiest installation.
+
+```ruby
+gem 'xlearn', github: 'ankane/xlearn', branch: 'vendor', submodules: true
+```
+
+Please file an issue if it doesn’t work for you.
+
+You can also specify the path to xLearn in an initializer:
+
+```ruby
+XLearn.ffi_lib << "/path/to/xlearn/lib/libxlearn_api.so"
+```
+
+> Use `libxlearn_api.dylib` for Mac and `xlearn_api.dll` for Windows
+
+## Credits
+
+This library is modeled after xLearn’s [Scikit-learn API](https://xlearn-doc.readthedocs.io/en/latest/python_api/index.html).

 ## History

 View the [changelog](https://github.com/ankane/xlearn/blob/master/CHANGELOG.md)
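
Note on the `libsvm` format added in the Performance section: lines of that shape can be produced from the array-of-arrays data the README uses elsewhere. The following is a minimal sketch, not part of the gem. `to_libsvm` and its `index_base` keyword are hypothetical names introduced here for illustration, and the starting index (0 here) is an assumption that should be checked against the xLearn documentation.

```ruby
# Hypothetical helper (not provided by the xlearn gem): turns dense rows and
# their labels into libsvm-style lines of the form
#   label index_1:value_1 index_2:value_2 ...
# Zero values are skipped, which is what makes the format compact for sparse data.
def to_libsvm(x, y, index_base: 0)
  x.zip(y).map do |row, label|
    pairs = row.each_with_index
               .reject { |value, _index| value.zero? }
               .map { |value, index| "#{index + index_base}:#{value}" }
    ([label] + pairs).join(" ")
  end
end

lines = to_libsvm([[1, 0, 3], [0, 5, 6]], [1, 0])
puts lines
```

Writing the resulting lines to a file (for example with `File.write("train.txt", lines.join("\n"))`) would give input of the shape the diff describes for `model.fit("train.txt")`.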