Matrix Factorization-based algorithms
class surprise.prediction_algorithms.matrix_factorization.SVD

Bases: surprise.prediction_algorithms.algo_base.AlgoBase
The famous SVD algorithm, as popularized by Simon Funk during the Netflix Prize.
The prediction \(\hat{r}_{ui}\) is set as:
\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^Tp_u\]

If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\) and \(q_i\).
For details, see eq. 5 from Matrix Factorization Techniques For Recommender Systems by Koren, Bell and Volinsky. See also The Recommender System Handbook, section 5.3.1.
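The prediction rule above, including the zero fallback for unknown users and items, can be sketched as follows. This is an illustrative NumPy sketch with made-up toy values, not Surprise's internal code:

```python
import numpy as np

# Toy model state: mu is the global mean, bu/bi are bias vectors,
# pu/qi are user and item factor matrices (one row per user/item).
rng = np.random.default_rng(0)
n_users, n_items, n_factors = 3, 4, 2
mu = 3.5
bu = rng.normal(0, 0.1, n_users)
bi = rng.normal(0, 0.1, n_items)
pu = np.full((n_users, n_factors), 0.1)
qi = np.full((n_items, n_factors), 0.1)

def predict(u, i):
    """Return mu + b_u + b_i + q_i . p_u, using zeros for unknown ids."""
    known_u = u is not None and 0 <= u < n_users
    known_i = i is not None and 0 <= i < n_items
    b_u = bu[u] if known_u else 0.0
    b_i = bi[i] if known_i else 0.0
    p_u = pu[u] if known_u else np.zeros(n_factors)
    q_i = qi[i] if known_i else np.zeros(n_factors)
    return mu + b_u + b_i + q_i @ p_u
```

When both the user and the item are unknown, every term except \(\mu\) vanishes and the prediction falls back to the global mean.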
To estimate all the unknowns, we minimize the following regularized squared error:
\[\sum_{r_{ui} \in R_{train}} \left(r_{ui} - \hat{r}_{ui} \right)^2 + \lambda\left(b_i^2 + b_u^2 + ||q_i||^2 + ||p_u||^2\right)\]

The minimization is performed by a very straightforward stochastic gradient descent:
\[\begin{split}b_u &\rightarrow b_u &+ \gamma (e_{ui} - \lambda b_u)\\ b_i &\rightarrow b_i &+ \gamma (e_{ui} - \lambda b_i)\\ p_u &\rightarrow p_u &+ \gamma (e_{ui} q_i - \lambda p_u)\\ q_i &\rightarrow q_i &+ \gamma (e_{ui} p_u - \lambda q_i)\end{split}\]

where \(e_{ui} = r_{ui} - \hat{r}_{ui}\). These steps are performed over all the ratings of the trainset and repeated n_epochs times. Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.005 and regularization terms are set to 0.02.
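One pass of these update rules can be sketched in plain NumPy. This is a minimal illustration of the math above with toy ratings, not Surprise's actual (Cython) implementation; note that \(p_u\) and \(q_i\) are updated from their old values simultaneously:

```python
import numpy as np

gamma, lam = 0.005, 0.02                   # learning rate and regularization
n_users, n_items, n_factors = 2, 2, 2
mu = 3.0
bu = np.zeros(n_users)                     # baselines initialized to 0
bi = np.zeros(n_items)
pu = np.full((n_users, n_factors), 0.1)    # factors initialized to 0.1
qi = np.full((n_items, n_factors), 0.1)

ratings = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0)]   # toy (u, i, r_ui) triples

for u, i, r in ratings:                    # one epoch over the trainset
    err = r - (mu + bu[u] + bi[i] + qi[i] @ pu[u])   # e_ui
    bu[u] += gamma * (err - lam * bu[u])
    bi[i] += gamma * (err - lam * bi[i])
    # update p_u and q_i together so each uses the other's old value
    pu[u], qi[i] = (pu[u] + gamma * (err * qi[i] - lam * pu[u]),
                    qi[i] + gamma * (err * pu[u] - lam * qi[i]))
```

After the pass, biases have moved toward each rating's deviation from the mean: items rated above \(\mu\) gain a positive \(b_i\), items rated below it a negative one.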
Parameters:

- n_factors – The number of factors. Default is 100.
- n_epochs – The number of iterations of the SGD procedure. Default is 20.
- lr_all – The learning rate for all parameters. Default is 0.005.
- reg_all – The regularization term for all parameters. Default is 0.02.
- lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
- lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
- lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
- lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
- reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
- reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
- reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
- reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
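The precedence behaviour of the per-parameter settings can be expressed in a couple of lines. The `resolve` helper below is hypothetical, introduced only to illustrate the rule stated in the parameter descriptions (a specific value overrides the `*_all` default when it is not None); it is not part of Surprise's API:

```python
def resolve(specific, all_value):
    """A per-parameter value takes precedence over the *_all value if set."""
    return all_value if specific is None else specific

# Example: lr_bu is set explicitly, lr_pu is left at its default of None.
lr_all, lr_bu, lr_pu = 0.005, 0.01, None
lr_for_bu = resolve(lr_bu, lr_all)   # the explicit lr_bu wins
lr_for_pu = resolve(lr_pu, lr_all)   # falls back to lr_all
```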
class surprise.prediction_algorithms.matrix_factorization.SVDpp

Bases: surprise.prediction_algorithms.algo_base.AlgoBase
The SVD++ algorithm, an extension of SVD taking into account implicit ratings.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\left(p_u + |I_u|^{-\frac{1}{2}} \sum_{j \in I_u}y_j\right)\]

where the \(y_j\) terms are a new set of item factors that capture implicit ratings.
If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\), \(q_i\) and \(y_i\).
For details, see eq. 15 from Factorization Meets The Neighborhood by Yehuda Koren. See also The Recommender System Handbook, section 5.3.1.
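The extra implicit term in the prediction rule can be sketched directly from the formula. The values below are arbitrary toy numbers, and the sketch is illustrative only, not Surprise's internal code:

```python
import numpy as np

# The user factor p_u is augmented with |I_u|^(-1/2) * sum of the implicit
# item factors y_j over the items the user has rated.
n_items, n_factors = 4, 2
mu, b_u, b_i = 3.5, 0.1, -0.2
p_u = np.full(n_factors, 0.1)
q_i = np.full(n_factors, 0.1)
y = np.full((n_items, n_factors), 0.05)    # implicit item factors y_j
I_u = [0, 2, 3]                            # items rated by user u

implicit = y[I_u].sum(axis=0) / np.sqrt(len(I_u))
r_hat = mu + b_u + b_i + q_i @ (p_u + implicit)
```

The normalization by \(|I_u|^{-1/2}\) keeps the implicit contribution comparable in scale across users with very different numbers of rated items.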
Just as for SVD, the parameters are learned using SGD on the regularized squared error objective. Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.007 and regularization terms are set to 0.02.
Parameters:

- n_factors – The number of factors. Default is 20.
- n_epochs – The number of iterations of the SGD procedure. Default is 20.
- lr_all – The learning rate for all parameters. Default is 0.007.
- reg_all – The regularization term for all parameters. Default is 0.02.
- lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
- lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
- lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
- lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
- lr_yj – The learning rate for \(y_j\). Takes precedence over lr_all if set. Default is None.
- reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
- reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
- reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
- reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
- reg_yj – The regularization term for \(y_j\). Takes precedence over reg_all if set. Default is None.
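Since lr_yj and reg_yj control the training of the implicit factors, it may help to see where \(y_j\) enters a single SGD step. The text above does not spell out the SVD++ update rules, so the \(y_j\) update below follows Koren's paper and is an assumption about the general shape of the step, not Surprise's verbatim implementation:

```python
import numpy as np

gamma, lam = 0.007, 0.02             # the SVD++ defaults documented above
n_items, n_factors = 3, 2
mu, b_u, b_i = 3.0, 0.0, 0.0
p_u = np.full(n_factors, 0.1)
q_i = np.full(n_factors, 0.1)
y = np.full((n_items, n_factors), 0.1)
I_u = [0, 1]                         # items implicitly rated by user u
r_ui = 4.0                           # one observed rating for (u, i)

sqrt_Iu = np.sqrt(len(I_u))
u_impl = y[I_u].sum(axis=0) / sqrt_Iu            # |I_u|^{-1/2} sum_j y_j
err = r_ui - (mu + b_u + b_i + q_i @ (p_u + u_impl))   # e_ui

b_u += gamma * (err - lam * b_u)
b_i += gamma * (err - lam * b_i)
p_u_new = p_u + gamma * (err * q_i - lam * p_u)
q_i_new = q_i + gamma * (err * (p_u + u_impl) - lam * q_i)
for j in I_u:                        # only the user's implicit items move
    y[j] += gamma * (err * q_i / sqrt_Iu - lam * y[j])
p_u, q_i = p_u_new, q_i_new
```

Only the \(y_j\) of items in \(I_u\) are touched by a given rating; factors of items the user never interacted with are left unchanged.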