Co-clustering
- class surprise.prediction_algorithms.co_clustering.CoClustering(n_cltr_u=3, n_cltr_i=3, n_epochs=20, random_state=None, verbose=False)
Bases: AlgoBase
A collaborative filtering algorithm based on co-clustering.
This is a straightforward implementation of [GM05].
Each user \(u\) is assigned to a user cluster \(C_u\) and each item \(i\) to an item cluster \(C_i\). Each pair of a user cluster and an item cluster defines a co-cluster \(C_{ui}\).
The prediction \(\hat{r}_{ui}\) is set as:
\[\hat{r}_{ui} = \overline{C_{ui}} + (\mu_u - \overline{C_u}) + (\mu_i - \overline{C_i}),\]
where \(\overline{C_{ui}}\) is the average rating of co-cluster \(C_{ui}\), \(\overline{C_u}\) is the average rating of \(u\)’s cluster, and \(\overline{C_i}\) is the average rating of \(i\)’s cluster. Here \(\mu_u\) is the mean of the ratings given by user \(u\), \(\mu_i\) is the mean of the ratings given to item \(i\), and \(\mu\) is the mean of all ratings.
If the user is unknown, the prediction is \(\hat{r}_{ui} = \mu_i\). If the item is unknown, the prediction is \(\hat{r}_{ui} = \mu_u\). If both the user and the item are unknown, the prediction is \(\hat{r}_{ui} = \mu\).
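The prediction rule and its fallbacks can be sketched in plain Python. This is an illustrative sketch, not the library's code: the function `estimate_rating` and its argument names are hypothetical, and an unknown user or item is represented here by passing `None` for the corresponding mean.

```python
from typing import Optional


def estimate_rating(
    global_mean: float,
    user_mean: Optional[float] = None,
    item_mean: Optional[float] = None,
    avg_cocluster: Optional[float] = None,
    avg_user_cluster: Optional[float] = None,
    avg_item_cluster: Optional[float] = None,
) -> float:
    """Hypothetical helper mirroring the co-clustering prediction rule.

    user_mean=None stands for an unknown user, item_mean=None for an
    unknown item. The three cluster averages are only needed when both
    the user and the item are known.
    """
    user_known = user_mean is not None
    item_known = item_mean is not None

    if user_known and item_known:
        # Co-cluster average, corrected by how the user and the item
        # deviate from the averages of their own clusters.
        return (
            avg_cocluster
            + (user_mean - avg_user_cluster)
            + (item_mean - avg_item_cluster)
        )
    if item_known:
        return item_mean  # unknown user: fall back to the item mean
    if user_known:
        return user_mean  # unknown item: fall back to the user mean
    return global_mean  # both unknown: fall back to the global mean


# Made-up numbers: 3.5 + (4.5 - 4.0) + (3.0 - 3.25) = 3.75
full_estimate = estimate_rating(
    global_mean=3.4,
    user_mean=4.5,
    item_mean=3.0,
    avg_cocluster=3.5,
    avg_user_cluster=4.0,
    avg_item_cluster=3.25,
)
print(full_estimate)
```

With these made-up numbers, the user rates half a point above their cluster average and the item sits a quarter point below its cluster average, so the estimate moves from the co-cluster average of 3.5 to 3.75.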
Clusters are assigned using a straightforward optimization method, much like k-means.
- Parameters:
  - n_cltr_u (int) – Number of user clusters. Default is 3.
  - n_cltr_i (int) – Number of item clusters. Default is 3.
  - n_epochs (int) – Number of iterations of the optimization loop. Default is 20.
  - random_state (int, RandomState instance from numpy, or None) – Determines the RNG that will be used for initialization. If int, random_state will be used as a seed for a new RNG. This is useful to get the same initialization over multiple calls to fit(). If RandomState instance, this same instance is used as RNG. If None, the current RNG from numpy is used. Default is None.
  - verbose (bool) – If True, the current epoch will be printed. Default is False.