Matrix Factorization-based algorithms

class surprise.prediction_algorithms.matrix_factorization.SVD

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

The famous SVD algorithm, as popularized by Simon Funk during the Netflix Prize.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^Tp_u\]

If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\) and \(q_i\).

For details, see eq. 5 from Matrix Factorization Techniques For Recommender Systems by Koren, Bell and Volinsky. See also the Recommender Systems Handbook, section 5.3.1.

To estimate all the unknowns, we minimize the following regularized squared error:

\[\sum_{r_{ui} \in R_{train}} \left(r_{ui} - \hat{r}_{ui} \right)^2 + \lambda\left(b_i^2 + b_u^2 + ||q_i||^2 + ||p_u||^2\right)\]

The minimization is performed by a very straightforward stochastic gradient descent:

\[\begin{split}b_u &\rightarrow b_u &+ \gamma (e_{ui} - \lambda b_u)\\ b_i &\rightarrow b_i &+ \gamma (e_{ui} - \lambda b_i)\\ p_u &\rightarrow p_u &+ \gamma (e_{ui} q_i - \lambda p_u)\\ q_i &\rightarrow q_i &+ \gamma (e_{ui} p_u - \lambda q_i)\end{split}\]

where \(e_{ui} = r_{ui} - \hat{r}_{ui}\). These steps are performed over all the ratings of the trainset and repeated n_epochs times. Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.
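The update rules above can be sketched in plain NumPy. This is a minimal illustration under assumed toy data, not the library's optimized implementation; the function name `sgd_epoch` and the tiny rating set are made up for the example:

```python
import numpy as np

def sgd_epoch(ratings, mu, bu, bi, p, q, lr=0.005, reg=0.02):
    """Run one SGD epoch over (user, item, rating) triples, updating in place."""
    for u, i, r in ratings:
        # e_ui = r_ui - r_hat_ui
        err = r - (mu + bu[u] + bi[i] + q[i] @ p[u])
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        pu_old = p[u].copy()  # use the pre-update p_u in q_i's step
        p[u] += lr * (err * q[i] - reg * p[u])
        q[i] += lr * (err * pu_old - reg * q[i])

# Toy setup: 2 users, 2 items, 3 factors
ratings = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0)]
mu = np.mean([r for _, _, r in ratings])
bu, bi = np.zeros(2), np.zeros(2)      # baselines initialized to 0
p = np.full((2, 3), 0.1)               # factors initialized to 0.1
q = np.full((2, 3), 0.1)
for _ in range(20):                     # n_epochs
    sgd_epoch(ratings, mu, bu, bi, p, q)
```

Note that `p[u]` is copied before its own update so that the step for \(q_i\) uses the value of \(p_u\) from the start of the iteration, matching the simultaneous updates written above.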

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.005 and regularization terms are set to 0.02.

Parameters:
  • n_factors – The number of factors. Default is 100.
  • n_epochs – The number of iterations of the SGD procedure. Default is 20.
  • lr_all – The learning rate for all parameters. Default is 0.005.
  • reg_all – The regularization term for all parameters. Default is 0.02.
  • lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
  • lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
  • lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
  • lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
  • reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
  • reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
  • reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
  • reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
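The precedence between the per-parameter and global settings can be illustrated with a small helper. This is a hypothetical sketch; `resolve_learning_rates` is not part of the library's API:

```python
def resolve_learning_rates(lr_all=0.005, lr_bu=None, lr_bi=None,
                           lr_pu=None, lr_qi=None):
    """Resolve learning rates: a per-parameter value, if set,
    takes precedence over lr_all; None means 'fall back'."""
    pick = lambda specific: lr_all if specific is None else specific
    return {"bu": pick(lr_bu), "bi": pick(lr_bi),
            "pu": pick(lr_pu), "qi": pick(lr_qi)}

# Only lr_bu is set explicitly; the others fall back to lr_all
lrs = resolve_learning_rates(lr_bu=0.002)
```

The regularization terms (`reg_bu`, `reg_bi`, `reg_pu`, `reg_qi` vs. `reg_all`) follow the same fallback scheme.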

class surprise.prediction_algorithms.matrix_factorization.SVDpp

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

The SVD++ algorithm, an extension of SVD taking into account implicit ratings.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\left(p_u + |I_u|^{-\frac{1}{2}} \sum_{j \in I_u}y_j\right)\]

where the \(y_j\) terms are a new set of item factors that capture implicit ratings.

If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\), \(q_i\) and \(y_i\).

For details, see eq. 15 from Factorization Meets The Neighborhood by Yehuda Koren. See also the Recommender Systems Handbook, section 5.3.1.
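The prediction rule can be sketched in NumPy as follows. This is a minimal illustration with made-up toy factors; `predict_svdpp` and the numbers are assumptions for the example, not the library's implementation:

```python
import numpy as np

def predict_svdpp(mu, b_u, b_i, q_i, p_u, Y, I_u):
    """r_hat_ui = mu + b_u + b_i + q_i^T (p_u + |I_u|^{-1/2} * sum_{j in I_u} y_j)."""
    implicit = Y[I_u].sum(axis=0) / np.sqrt(len(I_u))
    return mu + b_u + b_i + q_i @ (p_u + implicit)

n_factors = 3
p_u = np.full(n_factors, 0.1)
q_i = np.full(n_factors, 0.1)
Y = np.full((4, n_factors), 0.1)   # implicit item factors y_j
I_u = [0, 2, 3]                    # items rated by user u
r_hat = predict_svdpp(mu=3.5, b_u=0.1, b_i=-0.2, q_i=q_i, p_u=p_u, Y=Y, I_u=I_u)
```

The \(|I_u|^{-1/2}\) normalization keeps the implicit term on a comparable scale whether a user has rated three items or three thousand.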

Just as for SVD, the parameters are learned using SGD on the regularized squared error objective.

Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.007 and regularization terms are set to 0.02.

Parameters:
  • n_factors – The number of factors. Default is 20.
  • n_epochs – The number of iterations of the SGD procedure. Default is 20.
  • lr_all – The learning rate for all parameters. Default is 0.007.
  • reg_all – The regularization term for all parameters. Default is 0.02.
  • lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
  • lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
  • lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
  • lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
  • lr_yj – The learning rate for \(y_j\). Takes precedence over lr_all if set. Default is None.
  • reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
  • reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
  • reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
  • reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
  • reg_yj – The regularization term for \(y_j\). Takes precedence over reg_all if set. Default is None.