Welcome to Surprise’ documentation!

Surprise is an open source Python library that provides tools to build recommender system prediction algorithms and evaluate their performance. Its goal is to make life easy(-ier) for researchers, teachers and students who want to play around with new recommender algorithm ideas and teach/learn more about recommender systems.

If you’re new to Surprise, we invite you to take a look at the Getting Started guide, where you’ll find a series of tutorials illustrating all you can do with Surprise.

Any kind of feedback/criticism would be greatly appreciated (software design, documentation, improvement ideas, spelling mistakes, etc...). Please feel free to contribute and send pull requests (see GitHub page)!

Getting Started

Basic usage

Surprise has a set of built-in algorithms and datasets for you to play with. In its simplest form, it takes about four lines of code to evaluate the performance of an algorithm:

From file examples/basic_usage.py
from surprise import SVD
from surprise import Dataset
from surprise import evaluate


# Load the movielens-100k dataset (download it if needed),
# and split it into 3 folds for cross-validation.
data = Dataset.load_builtin('ml-100k')
data.split(n_folds=3)

# We'll use the famous SVD algorithm.
algo = SVD()

# Evaluate performances of our algorithm on the dataset.
perf = evaluate(algo, data, measures=['RMSE', 'MAE'])

print(perf)

If Surprise cannot find the movielens-100k dataset, it will offer to download it and will store it under the .surprise_data folder in your home directory. The split() method automatically splits the dataset into 3 folds, and the evaluate() function runs the cross-validation procedure and computes some accuracy measures.

Load a custom dataset

You can of course use a custom dataset. Surprise offers two ways of loading a custom dataset:

  • you can either specify a single file with all the ratings and use the split() method to perform cross-validation;
  • or if your dataset is already split into predefined folds, you can specify a list of files for training and testing.

Either way, you will need to define a Reader object for Surprise to be able to parse the file(s).

We’ll see how to handle both cases with the movielens-100k dataset. Of course this is a built-in dataset, but we will act as if it were not.

Load an entire dataset

From file examples/load_custom_dataset.py
from surprise import Dataset
from surprise import Reader

# path to dataset file
file_path = '/home/nico/.surprise_data/ml-100k/ml-100k/u.data'  # change this

# As we're loading a custom dataset, we need to define a reader. In the
# movielens-100k dataset, each line has the following format:
# 'user item rating timestamp', separated by '\t' characters.
reader = Reader(line_format='user item rating timestamp', sep='\t')

data = Dataset.load_from_file(file_path, reader=reader)
data.split(n_folds=5)

Note

Actually, as the movielens-100k dataset is built-in, Surprise provides a proper reader, so in this case we could have just created the reader like this:

reader = Reader('ml-100k')

For more details about readers and how to use them, see the Reader class documentation.

Load a dataset with predefined folds

From file examples/load_custom_dataset_predefined_folds.py
import os

from surprise import Dataset
from surprise import Reader

# path to dataset folder
files_dir = os.path.expanduser('~/.surprise_data/ml-100k/ml-100k/')

# This time, we'll use the built-in reader.
reader = Reader('ml-100k')

# folds_files is a list of tuples containing file paths:
# [(u1.base, u1.test), (u2.base, u2.test), ... (u5.base, u5.test)]
train_file = files_dir + 'u%d.base'
test_file = files_dir + 'u%d.test'
folds_files = [(train_file % i, test_file % i) for i in (1, 2, 3, 4, 5)]

data = Dataset.load_from_folds(folds_files, reader=reader)

Of course, nothing prevents you from only loading a single file for training and a single file for testing. However, the folds_files parameter still needs to be a list.
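
For instance, here is a minimal sketch of the single train/test pair case, reusing the first movielens-100k fold from above:

import os

from surprise import BaselineOnly
from surprise import Dataset
from surprise import Reader
from surprise import evaluate

# A single (train, test) pair still goes into a list: this amounts to
# 1-fold cross-validation.
files_dir = os.path.expanduser('~/.surprise_data/ml-100k/ml-100k/')
folds_files = [(files_dir + 'u1.base', files_dir + 'u1.test')]

reader = Reader('ml-100k')
data = Dataset.load_from_folds(folds_files, reader=reader)

algo = BaselineOnly()
evaluate(algo, data)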

Advanced usage

Here, we will dig a little deeper into what Surprise can do for you.

Manually iterate over folds

We have so far used the evaluate() function that does all the hard work for us. If you want better control over your experiments, you can use the folds() generator of your dataset, and then call the train() and test() methods of your algorithm on each of the folds:

From file examples/iterate_over_folds.py
from surprise import accuracy
from surprise import BaselineOnly
from surprise import Dataset

data = Dataset.load_builtin('ml-100k')
data.split(n_folds=3)

algo = BaselineOnly()

for trainset, testset in data.folds():

    # train and test algorithm.
    algo.train(trainset)
    predictions = algo.test(testset)

    # Compute and print Root Mean Squared Error
    rmse = accuracy.rmse(predictions, verbose=True)

Train on a whole trainset and specifically query for predictions

We will now review how to get a prediction for specific users and items. We will also review how to train on a whole dataset, without performing cross-validation (i.e. there is no test set).

The latter is pretty straightforward: all you need is to load a dataset, use the build_full_trainset() method to build the trainset, and train your algorithm:

From file examples/query_for_predictions.py
from surprise import Dataset
from surprise import KNNBasic

data = Dataset.load_builtin('ml-100k')

# Retrieve the trainset.
trainset = data.build_full_trainset()

# Build an algorithm, and train it.
algo = KNNBasic()
algo.train(trainset)

Now, there’s no way we could call the test() method, because we have no testset. But you can still get predictions for the users and items you want.

Let’s say you’re interested in user 196 and item 302 (make sure they’re in the trainset!), and you know that the true rating \(r_{ui} = 4\). All you need is to call the predict() method:

From file examples/query_for_predictions.py
uid = str(196)  # raw user id (as in the ratings file). They are **strings**!
iid = str(302)  # raw item id (as in the ratings file). They are **strings**!

# get a prediction for specific users and items.
pred = algo.predict(uid, iid, r=4, verbose=True)

If the predict() method is called with user or item ids that were not part of the trainset, it’s up to the algorithm to decide whether it can still make a prediction. If it can’t, the prediction is set to the mean of all ratings \(\mu\).

Note

Raw ids are ids as defined in a rating file. They can be strings or whatever. On trainset creation, each raw id is mapped to a (unique) integer called inner id, which is a lot more suitable for Surprise to manipulate. To convert a raw id to an inner id, you can use the to_inner_uid() and to_inner_iid() methods of the trainset.
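
As a minimal sketch (assuming user '196' and item '302' are in the trainset):

from surprise import Dataset

data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()

# Convert raw ids (strings, as in the ratings file) to inner ids (integers).
inner_uid = trainset.to_inner_uid(str(196))
inner_iid = trainset.to_inner_iid(str(302))
print(inner_uid, inner_iid)

# to_inner_uid() and to_inner_iid() raise a ValueError if the user or item
# is not part of the trainset.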

Obviously, it is perfectly fine to use the predict() method directly during a cross-validation process. It’s then up to you to ensure that the user and item ids are present in the trainset though.

Dump the predictions for later analysis

You may want to save your algorithm predictions along with all the useful information about the algorithm. This way, you can run your algorithm once, save the results, and come back to them whenever you want to inspect each prediction in greater detail and get a good insight into why your algorithm performs well (or badly!). Surprise provides some tools to do that.

You can dump your algorithm predictions either using the evaluate() function, or do it manually with the dump() function. Either way, an example is worth a thousand words, so here are a few Jupyter notebooks.
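
As a quick sketch of the manual approach with the dump() function (the per-fold dump file names below are just an illustration):

from surprise import BaselineOnly
from surprise import Dataset
from surprise.dump import dump

data = Dataset.load_builtin('ml-100k')
data.split(n_folds=3)

algo = BaselineOnly()

for fold_i, (trainset, testset) in enumerate(data.folds()):
    algo.train(trainset)
    predictions = algo.test(testset)

    # Dump the predictions, the trainset and the algorithm of this fold,
    # so that they can be inspected later.
    dump('./dump_fold_{0}'.format(fold_i), predictions, trainset=trainset,
         algo=algo)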

Notation standards

In the documentation, you will find the following notation:

  • \(R\) : the set of all ratings.
  • \(R_{train}\), \(R_{test}\) and \(\hat{R}\) denote the training set, the test set, and the set of predicted ratings.
  • \(U\) : the set of all users. \(u\) and \(v\) denote users.
  • \(I\) : the set of all items. \(i\) and \(j\) denote items.
  • \(U_i\) : the set of all users that have rated item \(i\).
  • \(U_{ij}\) : the set of all users that have rated both items \(i\) and \(j\).
  • \(I_u\) : the set of all items rated by user \(u\).
  • \(I_{uv}\) : the set of all items rated by both users \(u\) and \(v\).
  • \(r_{ui}\) : the true rating of user \(u\) for item \(i\).
  • \(\hat{r}_{ui}\) : the estimated rating of user \(u\) for item \(i\).
  • \(b_{ui}\) : the baseline rating of user \(u\) for item \(i\).
  • \(\mu\) : the mean of all ratings.
  • \(\mu_u\) : the mean of all ratings given by user \(u\).
  • \(\mu_i\) : the mean of all ratings given to item \(i\).
  • \(N_i^k(u)\) : the \(k\) nearest neighbors of user \(u\) that have rated item \(i\). This set is computed using a similarity metric.
  • \(N_u^k(i)\) : the \(k\) nearest neighbors of item \(i\) that are rated by user \(u\). This set is computed using a similarity metric.

Prediction algorithms

Surprise provides a number of built-in algorithms. You can find the details of each of these in the surprise.prediction_algorithms package documentation.

Every algorithm is part of the global Surprise namespace, so you only need to import their names from the Surprise package, for example:

from surprise import KNNBasic
algo = KNNBasic()

Some of these algorithms may use baseline estimates, some may use a similarity measure. We will here review how to configure the way baselines and similarities are computed.

Baselines estimates configuration

Note

This section only applies to algorithms (or similarity measures) that try to minimize the following regularized squared error (or equivalent):

\[\sum_{r_{ui} \in R_{train}} \left(r_{ui} - (\mu + b_u + b_i)\right)^2 + \lambda \left(b_u^2 + b_i^2 \right).\]

For algorithms using baselines in another objective function (e.g. the SVD algorithm), the baseline configuration is done differently and is specific to each algorithm. Please refer to their own documentation.

First of all, if you do not want to configure the way baselines are computed, you don’t have to: the default parameters will do just fine. If you do want to, well... this section is for you.

You may want to read section 2.1 of Factor in the Neighbors: Scalable and Accurate Collaborative Filtering by Yehuda Koren to get a good idea of what baseline estimates are.

Baselines can be estimated in two different ways:

  • Using Stochastic Gradient Descent (SGD).
  • Using Alternating Least Squares (ALS).

You can configure the way baselines are computed using the bsl_options parameter passed at the creation of an algorithm. This parameter is a dictionary for which the key 'method' indicates the method to use. Accepted values are 'als' (default) and 'sgd'. Depending on its value, other options may be set. For ALS:

  • 'reg_i': The regularization parameter for items. Corresponding to \(\lambda_2\) in the paper. Default is 10.
  • 'reg_u': The regularization parameter for users, corresponding to \(\lambda_3\) in the paper. Default is 15.
  • 'n_epochs': The number of iterations of the ALS procedure. Default is 10. Note that in the paper, what is described is a single iteration ALS process.

And for SGD:

  • 'reg': The regularization parameter of the cost function that is optimized, corresponding to \(\lambda_1\) and then \(\lambda_5\) in the paper. Default is 0.02.
  • 'learning_rate': The learning rate of SGD, corresponding to \(\gamma\) in the paper. Default is 0.005.
  • 'n_epochs': The number of iterations of the SGD procedure. Default is 20.

Note

For both procedures (ALS and SGD), user and item biases (\(b_u\) and \(b_i\)) are initialized to zero.

Usage examples:

From file examples/baselines_conf.py
print('Using ALS')
bsl_options = {'method': 'als',
               'n_epochs': 5,
               'reg_u': 12,
               'reg_i': 5
               }
algo = BaselineOnly(bsl_options=bsl_options)
From file examples/baselines_conf.py
print('Using SGD')
bsl_options = {'method': 'sgd',
               'learning_rate': .00005,
               }
algo = BaselineOnly(bsl_options=bsl_options)

Note that some similarity measures may use baselines, such as the pearson_baseline similarity. Configuration works just the same, whether the baselines are used in the actual prediction \(\hat{r}_{ui}\) or not:

From file examples/baselines_conf.py
bsl_options = {'method': 'als',
               'n_epochs': 20,
               }
sim_options = {'name': 'pearson_baseline'}
algo = KNNBasic(bsl_options=bsl_options, sim_options=sim_options)

This leads us to similarity measure configuration, which we will review right now.

Similarity measure configuration

Many algorithms use a similarity measure to estimate a rating. They are configured in a similar fashion to the baseline ratings: you just need to pass a sim_options argument at the creation of an algorithm. This argument is a dictionary with the following (all optional) keys:

  • 'name': The name of the similarity to use, as defined in the similarities module. Default is 'MSD'.
  • 'user_based': Whether similarities will be computed between users or between items. This has a huge impact on the performance of a prediction algorithm. Default is True.
  • 'min_support': The minimum number of common items (when 'user_based' is 'True') or minimum number of common users (when 'user_based' is 'False') for the similarity not to be zero. Simply put, if \(|I_{uv}| < \text{min_support}\) then \(\text{sim}(u, v) = 0\). The same goes for items.
  • 'shrinkage': Shrinkage parameter to apply (only relevant for pearson_baseline similarity). Default is 100.

Usage examples:

From file examples/similarity_conf.py
sim_options = {'name': 'cosine',
               'user_based': False  # compute  similarities between items
               }
algo = KNNBasic(sim_options=sim_options)
From file examples/similarity_conf.py
sim_options = {'name': 'pearson_baseline',
               'shrinkage': 0  # no shrinkage
               }
algo = KNNBasic(sim_options=sim_options)
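
As another sketch (the value 5 is arbitrary, chosen only for illustration), min_support can be combined with the other keys:

sim_options = {'name': 'pearson',
               'user_based': True,  # compute similarities between users
               'min_support': 5     # at least 5 common items, else sim is 0
               }
algo = KNNBasic(sim_options=sim_options)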

See also

The similarities module.

How to build your own prediction algorithm

This page describes how to build a custom prediction algorithm using Surprise.

The basics

Want to get your hands dirty? Cool.

Creating your own prediction algorithm is pretty simple: an algorithm is nothing but a class derived from AlgoBase that has an estimate method. This is the method that is called by the predict() method. It takes in an inner user id, an inner item id (see this note), and returns the estimated rating \(\hat{r}_{ui}\):

From file examples/building_custom_algorithms/most_basic_algorithm.py
from surprise import AlgoBase
from surprise import Dataset
from surprise import evaluate


class MyOwnAlgorithm(AlgoBase):

    def __init__(self):

        # Always call base method before doing anything.
        AlgoBase.__init__(self)

    def estimate(self, u, i):

        return 3

data = Dataset.load_builtin('ml-100k')
algo = MyOwnAlgorithm()

evaluate(algo, data)

This algorithm is the dumbest we could have thought of: it just predicts a rating of 3, regardless of users and items.

If you want to store additional information about the prediction, you can also return a dictionary with the desired details:

def estimate(self, u, i):

    details = {'info1' : 'That was',
               'info2' : 'easy stuff :)'}
    return 3, details

This dictionary will be stored in the prediction as the details field and can be used for later analysis.
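
For instance, here is a minimal sketch of retrieving these details from the returned Prediction object, reusing the estimate method above:

from surprise import AlgoBase
from surprise import Dataset


class MyOwnAlgorithm(AlgoBase):

    def __init__(self):

        AlgoBase.__init__(self)

    def estimate(self, u, i):

        details = {'info1': 'That was',
                   'info2': 'easy stuff :)'}
        return 3, details


data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()

algo = MyOwnAlgorithm()
algo.train(trainset)

pred = algo.predict(str(196), str(302))
print(pred.est)               # 3
print(pred.details['info1'])  # 'That was'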

The train method

Now, let’s make a slightly cleverer algorithm that predicts the average of all the ratings of the trainset. As this is a constant value that does not depend on current user or item, we would rather compute it once and for all. This can be done by defining the train method:

From file examples/building_custom_algorithms/most_basic_algorithm2.py
import numpy as np

from surprise import AlgoBase

class MyOwnAlgorithm(AlgoBase):

    def __init__(self):

        # Always call base method before doing anything.
        AlgoBase.__init__(self)

    def train(self, trainset):

        # Here again: call base method before doing anything.
        AlgoBase.train(self, trainset)

        # Compute the average rating. We might as well use the
        # trainset.global_mean attribute ;)
        self.the_mean = np.mean(
                        [r for (_, _, r) in self.trainset.all_ratings()])

    def estimate(self, u, i):

        return self.the_mean

The train method is called by the evaluate function at each fold of a cross-validation process (but you can also call it yourself). Before doing anything, you should call the base class train() method.

The trainset attribute

Once the base class train() method has returned, all the info you need about the current training set (rating values, etc...) is stored in the self.trainset attribute. This is a Trainset object that has many attributes and methods of interest for prediction.

To illustrate its usage, let’s make an algorithm that predicts an average between the mean of all ratings, the mean rating of the user and the mean rating for the item:

From file examples/building_custom_algorithms/mean_rating_user_item.py

    def estimate(self, u, i):

        sum_means = self.trainset.global_mean
        div = 1

        if self.trainset.knows_user(u):
            sum_means += np.mean([r for (_, r) in self.trainset.ur[u]])
            div += 1
        if self.trainset.knows_item(i):
            sum_means += np.mean([r for (_, r) in self.trainset.ir[i]])
            div += 1

        return sum_means / div

The mean rating for an item is computed in the same fashion, using the ir field. Note that it would have been a better idea to compute all the user (and item) means in the train method, thus avoiding the same computations multiple times (see the sketch below).
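
A sketch of that optimization (the class name is made up for this example):

import numpy as np

from surprise import AlgoBase


class UserItemMeanAlgorithm(AlgoBase):  # hypothetical name, for illustration

    def __init__(self):

        AlgoBase.__init__(self)

    def train(self, trainset):

        AlgoBase.train(self, trainset)

        # Precompute the user and item means once, instead of recomputing
        # them in every call to estimate().
        self.user_means = {u: np.mean([r for (_, r) in ratings])
                           for u, ratings in self.trainset.ur.items()}
        self.item_means = {i: np.mean([r for (_, r) in ratings])
                           for i, ratings in self.trainset.ir.items()}

    def estimate(self, u, i):

        sum_means = self.trainset.global_mean
        div = 1

        if self.trainset.knows_user(u):
            sum_means += self.user_means[u]
            div += 1
        if self.trainset.knows_item(i):
            sum_means += self.item_means[i]
            div += 1

        return sum_means / div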

When the prediction is impossible

It’s up to your algorithm to decide if it can or cannot yield a prediction. If the prediction is impossible, then you can raise the PredictionImpossible exception. You’ll need to import it first:

from surprise import PredictionImpossible

This exception will be caught by the predict() method, and the estimation \(\hat{r}_{ui}\) will be set to the global mean of all ratings \(\mu\).

Using similarities and baselines

Should your algorithm use a similarity measure or baseline estimates, you’ll need to accept bsl_options and sim_options as parameters to the __init__ method, and pass them along to the base class. See how to use these parameters in the Prediction algorithms section.

Methods compute_baselines() and compute_similarities() can be called in the train method (or anywhere else).

From file examples/building_custom_algorithms/with_baselines_or_sim.py
from surprise import AlgoBase
from surprise import PredictionImpossible


class MyOwnAlgorithm(AlgoBase):

    def __init__(self, sim_options={}, bsl_options={}):

        AlgoBase.__init__(self, sim_options=sim_options,
                          bsl_options=bsl_options)

    def train(self, trainset):

        AlgoBase.train(self, trainset)

        # Compute baselines and similarities
        self.bu, self.bi = self.compute_baselines()
        self.sim = self.compute_similarities()

    def estimate(self, u, i):

        if not (self.trainset.knows_user(u) and self.trainset.knows_item(i)):
            raise PredictionImpossible('User and/or item is unknown.')

        # Compute similarities between u and v, where v describes all other
        # users that have also rated item i.
        neighbors = [(v, self.sim[u, v]) for (v, r) in self.trainset.ir[i]]
        # Sort these neighbors by similarity
        neighbors = sorted(neighbors, key=lambda x: x[1], reverse=True)

        print('The 3 nearest neighbors of user', str(u), 'are:')
        for v, sim_uv in neighbors[:3]:
            print('user {0:} with sim {1:1.2f}'.format(v, sim_uv))

        # ... Aaaaand return the baseline estimate anyway ;)
        bsl = self.trainset.global_mean + self.bu[u] + self.bi[i]
        return bsl

Feel free to explore the prediction_algorithms package source to get an idea of what can be done.

prediction_algorithms package

The prediction_algorithms package includes the prediction algorithms available for recommendation.

The available prediction algorithms are:

random_pred.NormalPredictor Algorithm predicting a random rating based on the distribution of the training set, which is assumed to be normal.
baseline_only.BaselineOnly Algorithm predicting the baseline estimate for given user and item.
knns.KNNBasic A basic collaborative filtering algorithm.
knns.KNNWithMeans A basic collaborative filtering algorithm, taking into account the mean ratings of each user.
knns.KNNBaseline A basic collaborative filtering algorithm taking into account a baseline rating.
matrix_factorization.SVD The famous SVD algorithm, as popularized by Simon Funk during the Netflix Prize.
matrix_factorization.SVDpp The SVD++ algorithm, an extension of SVD taking into account implicit ratings.

You may want to check the Notation standards before diving into the formulas.

The algorithm base class

The surprise.prediction_algorithms.bases module defines the base class AlgoBase from which every single prediction algorithm has to inherit.

class surprise.prediction_algorithms.algo_base.AlgoBase(**kwargs)[source]

Abstract class defining the basic behaviour of a prediction algorithm.

Keyword Arguments:
 baseline_options (dict, optional) – If the algorithm needs to compute a baseline estimate, the baseline_options parameter is used to configure how they are computed. See Baselines estimates configuration for usage.
compute_baselines()[source]

Compute users and items baselines.

The way baselines are computed depends on the bsl_options parameter passed at the creation of the algorithm (see Baselines estimates configuration).

Returns:A tuple (bu, bi), which are users and items baselines.
compute_similarities()[source]

Build the similarity matrix.

The way the similarity matrix is computed depends on the sim_options parameter passed at the creation of the algorithm (see Similarity measure configuration).

Returns:The similarity matrix.
predict(uid, iid, r=0, verbose=False)[source]

Compute the rating prediction for given user and item.

The predict method converts raw ids to inner ids and then calls the estimate method which is defined in every derived class. If the prediction is impossible (for whatever reason), the prediction is set to the global mean of all ratings. Also, if \(\hat{r}_{ui}\) is outside the bounds of the rating scale, (e.g. \(\hat{r}_{ui} = 6\) for a rating scale of \([1, 5]\)), then it is capped.

Parameters:
  • uid – (Raw) id of the user. See this note.
  • iid – (Raw) id of the item. See this note.
  • r – The true rating \(r_{ui}\).
  • verbose (bool) – Whether to print details of the prediction. Default is False.
Returns:

A Prediction object.

test(testset, verbose=False)[source]

Test the algorithm on given testset.

Parameters:
  • testset – A test set, as returned by the folds method.
  • verbose (bool) – Whether to print details for each prediction. Default is False.
Returns:

A list of Prediction objects.

train(trainset)[source]

Train an algorithm on a given training set.

This method is called by every derived class as the first basic step for training an algorithm. It basically just initializes some internal structures and sets the self.trainset attribute.

Parameters:trainset (Trainset) – A training set, as returned by the folds method.

The predictions module

The surprise.prediction_algorithms.predictions module defines the Prediction named tuple and the PredictionImpossible exception.

class surprise.prediction_algorithms.predictions.Prediction[source]

A named tuple for storing the results of a prediction.

It’s wrapped in a class, but only for documentation and printing purposes.

Parameters:
  • uid – The (inner) user id. See this note.
  • iid – The (inner) item id. See this note.
  • r0 – The true rating \(r_{ui}\).
  • est – The estimated rating \(\hat{r}_{ui}\).
  • details (dict) – Stores additional details about the prediction that might be useful for later analysis.
exception surprise.prediction_algorithms.predictions.PredictionImpossible[source]

Exception raised when a prediction is impossible.

When raised, the estimation \(\hat{r}_{ui}\) is set to the global mean of all ratings \(\mu\).

Basic algorithms

These are basic algorithms that do not do much work but are still useful for comparing accuracies.

class surprise.prediction_algorithms.random_pred.NormalPredictor[source]

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

Algorithm predicting a random rating based on the distribution of the training set, which is assumed to be normal.

The prediction \(\hat{r}_{ui}\) is generated from a normal distribution \(\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)\) where \(\hat{\mu}\) and \(\hat{\sigma}\) are estimated from the training data using Maximum Likelihood Estimation:

\[\begin{split}\hat{\mu} &= \frac{1}{|R_{train}|} \sum_{r_{ui} \in R_{train}} r_{ui}\\\\ \hat{\sigma} &= \sqrt{\sum_{r_{ui} \in R_{train}} \frac{(r_{ui} - \hat{\mu})^2}{|R_{train}|}}\end{split}\]
class surprise.prediction_algorithms.baseline_only.BaselineOnly(bsl_options={})[source]

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

Algorithm predicting the baseline estimate for given user and item.

\(\hat{r}_{ui} = b_{ui} = \mu + b_u + b_i\)

If user \(u\) is unknown, then the bias \(b_u\) is assumed to be zero. The same applies for item \(i\) with \(b_i\).

See paper Factor in the Neighbors: Scalable and Accurate Collaborative Filtering by Yehuda Koren for details.

Parameters:bsl_options (dict) – A dictionary of options for the baseline estimates computation. See Baselines estimates configuration for accepted options.

k-NN inspired algorithms

These are algorithms that are directly derived from a basic nearest neighbors approach.

Note

For each of these algorithms, the actual number of neighbors that are aggregated to compute an estimation is necessarily less than or equal to \(k\). First, there might just not exist enough neighbors and second, the sets \(N_i^k(u)\) and \(N_u^k(i)\) only include neighbors for which the similarity measure is positive. It would make no sense to aggregate ratings from users (or items) that are negatively correlated. For a given prediction, the actual number of neighbors can be retrieved in the 'actual_k' field of the details dictionary of the prediction.

You may want to read the User Guide on how to configure the sim_options parameter.
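
A minimal sketch of retrieving the actual number of neighbors (assuming user '196' and item '302' are in the trainset):

from surprise import Dataset
from surprise import KNNBasic

data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()

algo = KNNBasic(k=40, min_k=1)
algo.train(trainset)

pred = algo.predict(str(196), str(302))
# Number of neighbors actually aggregated for this prediction (<= k).
print(pred.details['actual_k'])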

class surprise.prediction_algorithms.knns.KNNBasic(k=40, min_k=1, sim_options={}, **kwargs)[source]

Bases: surprise.prediction_algorithms.knns.SymmetricAlgo

A basic collaborative filtering algorithm.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \frac{ \sum\limits_{v \in N^k_i(u)} \text{sim}(u, v) \cdot r_{vi}} {\sum\limits_{v \in N^k_i(u)} \text{sim}(u, v)}\]

or

\[\hat{r}_{ui} = \frac{ \sum\limits_{j \in N^k_u(i)} \text{sim}(i, j) \cdot r_{uj}} {\sum\limits_{j \in N^k_u(i)} \text{sim}(i, j)}\]

depending on the user_based field of the sim_options parameter.

Parameters:
  • k (int) – The (max) number of neighbors to take into account for aggregation (see this note). Default is 40.
  • min_k (int) – The minimum number of neighbors to take into account for aggregation. If there are not enough neighbors, the prediction is set to the global mean of all ratings. Default is 1.
  • sim_options (dict) – A dictionary of options for the similarity measure. See Similarity measure configuration for accepted options.
class surprise.prediction_algorithms.knns.KNNWithMeans(k=40, min_k=1, sim_options={}, **kwargs)[source]

Bases: surprise.prediction_algorithms.knns.SymmetricAlgo

A basic collaborative filtering algorithm, taking into account the mean ratings of each user.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu_u + \frac{ \sum\limits_{v \in N^k_i(u)} \text{sim}(u, v) \cdot (r_{vi} - \mu_v)} {\sum\limits_{v \in N^k_i(u)} \text{sim}(u, v)}\]

or

\[\hat{r}_{ui} = \mu_i + \frac{ \sum\limits_{j \in N^k_u(i)} \text{sim}(i, j) \cdot (r_{uj} - \mu_j)} {\sum\limits_{j \in N^k_u(i)} \text{sim}(i, j)}\]

depending on the user_based field of the sim_options parameter.

Parameters:
  • k (int) – The (max) number of neighbors to take into account for aggregation (see this note). Default is 40.
  • min_k (int) – The minimum number of neighbors to take into account for aggregation. If there are not enough neighbors, the neighbor aggregation is set to zero (so the prediction ends up being equivalent to the mean \(\mu_u\) or \(\mu_i\)). Default is 1.
  • sim_options (dict) – A dictionary of options for the similarity measure. See Similarity measure configuration for accepted options.
class surprise.prediction_algorithms.knns.KNNBaseline(k=40, min_k=1, sim_options={}, bsl_options={})[source]

Bases: surprise.prediction_algorithms.knns.SymmetricAlgo

A basic collaborative filtering algorithm taking into account a baseline rating.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = b_{ui} + \frac{ \sum\limits_{v \in N^k_i(u)} \text{sim}(u, v) \cdot (r_{vi} - b_{vi})} {\sum\limits_{v \in N^k_i(u)} \text{sim}(u, v)}\]

or

\[\hat{r}_{ui} = b_{ui} + \frac{ \sum\limits_{j \in N^k_u(i)} \text{sim}(i, j) \cdot (r_{uj} - b_{uj})} {\sum\limits_{j \in N^k_u(i)} \text{sim}(i, j)}\]

depending on the user_based field of the sim_options parameter.

For details, see paper Factor in the Neighbors: Scalable and Accurate Collaborative Filtering by Yehuda Koren.

Parameters:
  • k (int) – The (max) number of neighbors to take into account for aggregation (see this note). Default is 40.
  • min_k (int) – The minimum number of neighbors to take into account for aggregation. If there are not enough neighbors, the neighbor aggregation is set to zero (so the prediction ends up being equivalent to the baseline). Default is 1.
  • sim_options (dict) – A dictionary of options for the similarity measure. See Similarity measure configuration for accepted options.
  • bsl_options (dict) – A dictionary of options for the baseline estimates computation. See Baselines estimates configuration for accepted options.

Matrix Factorization-based algorithms

class surprise.prediction_algorithms.matrix_factorization.SVD

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

The famous SVD algorithm, as popularized by Simon Funk during the Netflix Prize.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^Tp_u\]

If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\) and \(q_i\).

For details, see eq. 5 from Matrix Factorization Techniques For Recommender Systems by Koren, Bell and Volinsky. See also The Recommender System Handbook, section 5.3.1.

To estimate all the unknown parameters, we minimize the following regularized squared error:

\[\sum_{r_{ui} \in R_{train}} \left(r_{ui} - \hat{r}_{ui} \right)^2 + \lambda\left(b_i^2 + b_u^2 + ||q_i||^2 + ||p_u||^2\right)\]

The minimization is performed by a very straightforward stochastic gradient descent:

\[\begin{split}b_u &\rightarrow b_u &+ \gamma (e_{ui} - \lambda b_u)\\ b_i &\rightarrow b_i &+ \gamma (e_{ui} - \lambda b_i)\\ p_u &\rightarrow p_u &+ \gamma (e_{ui} q_i - \lambda p_u)\\ q_i &\rightarrow q_i &+ \gamma (e_{ui} p_u - \lambda q_i)\end{split}\]

where \(e_{ui} = r_{ui} - \hat{r}_{ui}\). These steps are performed over all the ratings of the trainset and repeated n_epochs times. Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.005 and regularization terms are set to 0.02.
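
To make the update rules concrete, here is a toy numpy sketch of a single epoch over a handful of ratings; it is not the library's implementation, just the equations above written out with a single learning rate and regularization term:

import numpy as np

# Toy ratings as (inner user id, inner item id, rating) triplets.
ratings = [(0, 0, 4.0), (0, 1, 3.0), (1, 0, 5.0), (1, 2, 1.0)]
n_users, n_items, n_factors = 2, 3, 5
lr, reg = 0.005, 0.02  # gamma and lambda

mu = np.mean([r for (_, _, r) in ratings])
bu = np.zeros(n_users)                   # biases initialized to 0
bi = np.zeros(n_items)
pu = np.full((n_users, n_factors), 0.1)  # factors initialized to 0.1
qi = np.full((n_items, n_factors), 0.1)

# One epoch: one pass over all the ratings.
for (u, i, r) in ratings:
    err = r - (mu + bu[u] + bi[i] + np.dot(qi[i], pu[u]))
    bu[u] += lr * (err - reg * bu[u])
    bi[i] += lr * (err - reg * bi[i])
    pu_old = pu[u].copy()                # use the old p_u in the q_i update
    pu[u] += lr * (err * qi[i] - reg * pu_old)
    qi[i] += lr * (err * pu_old - reg * qi[i])

print(mu + bu[0] + bi[0] + np.dot(qi[0], pu[0]))  # estimate for (u=0, i=0)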

Parameters:
  • n_factors – The number of factors. Default is 100.
  • n_epochs – The number of iterations of the SGD procedure. Default is 20.
  • lr_all – The learning rate for all parameters. Default is 0.005.
  • reg_all – The regularization term for all parameters. Default is 0.02.
  • lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
  • lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
  • lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
  • lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
  • reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
  • reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
  • reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
  • reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
class surprise.prediction_algorithms.matrix_factorization.SVDpp

Bases: surprise.prediction_algorithms.algo_base.AlgoBase

The SVD++ algorithm, an extension of SVD taking into account implicit ratings.

The prediction \(\hat{r}_{ui}\) is set as:

\[\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\left(p_u + |I_u|^{-\frac{1}{2}} \sum_{j \in I_u}y_j\right)\]

Where the \(y_j\) terms are a new set of item factors that capture implicit ratings.

If user \(u\) is unknown, then the bias \(b_u\) and the factors \(p_u\) are assumed to be zero. The same applies for item \(i\) with \(b_i\), \(q_i\) and \(y_i\).

For details, see eq. 15 from Factorization Meets The Neighborhood by Yehuda Koren. See also The Recommender System Handbook, section 5.3.1.

Just as for SVD, the parameters are learned using SGD on the regularized squared error objective.

Baselines are initialized to 0. User and item factors are initialized to 0.1, as recommended by Funk.

You have control over the learning rate \(\gamma\) and the regularization term \(\lambda\). Both can be different for each kind of parameter (see below). By default, learning rates are set to 0.005 and regularization terms are set to 0.02.

Parameters:
  • n_factors – The number of factors. Default is 20.
  • n_epochs – The number of iterations of the SGD procedure. Default is 20.
  • lr_all – The learning rate for all parameters. Default is 0.007.
  • reg_all – The regularization term for all parameters. Default is 0.02.
  • lr_bu – The learning rate for \(b_u\). Takes precedence over lr_all if set. Default is None.
  • lr_bi – The learning rate for \(b_i\). Takes precedence over lr_all if set. Default is None.
  • lr_pu – The learning rate for \(p_u\). Takes precedence over lr_all if set. Default is None.
  • lr_qi – The learning rate for \(q_i\). Takes precedence over lr_all if set. Default is None.
  • lr_yj – The learning rate for \(y_j\). Takes precedence over lr_all if set. Default is None.
  • reg_bu – The regularization term for \(b_u\). Takes precedence over reg_all if set. Default is None.
  • reg_bi – The regularization term for \(b_i\). Takes precedence over reg_all if set. Default is None.
  • reg_pu – The regularization term for \(p_u\). Takes precedence over reg_all if set. Default is None.
  • reg_qi – The regularization term for \(q_i\). Takes precedence over reg_all if set. Default is None.
  • reg_yj – The regularization term for \(y_j\). Takes precedence over reg_all if set. Default is None.

similarities module

The similarities module includes tools to compute similarity metrics between users or items. You may need to refer to the Notation standards page. See also the Similarity measure configuration section of the User Guide.

Available similarity measures:

cosine Compute the cosine similarity between all pairs of users (or items).
msd Compute the Mean Squared Difference similarity between all pairs of users (or items).
pearson Compute the Pearson correlation coefficient between all pairs of users (or items).
pearson_baseline Compute the (shrunk) Pearson correlation coefficient between all pairs of users (or items) using baselines for centering instead of means.
surprise.similarities.cosine()

Compute the cosine similarity between all pairs of users (or items).

Only common users (or items) are taken into account. The cosine similarity is defined as:

\[\text{cosine_sim}(u, v) = \frac{ \sum\limits_{i \in I_{uv}} r_{ui} \cdot r_{vi}} {\sqrt{\sum\limits_{i \in I_{uv}} r_{ui}^2} \cdot \sqrt{\sum\limits_{i \in I_{uv}} r_{vi}^2} }\]

or

\[\text{cosine_sim}(i, j) = \frac{ \sum\limits_{u \in U_{ij}} r_{ui} \cdot r_{uj}} {\sqrt{\sum\limits_{u \in U_{ij}} r_{ui}^2} \cdot \sqrt{\sum\limits_{u \in U_{ij}} r_{uj}^2} }\]

depending on the user_based field of sim_options (see Similarity measure configuration).

For details on cosine similarity, see on Wikipedia.

surprise.similarities.msd()

Compute the Mean Squared Difference similarity between all pairs of users (or items).

Only common users (or items) are taken into account. The Mean Squared Difference is defined as:

\[\text{msd}(u, v) = \frac{1}{|I_{uv}|} \cdot \sum\limits_{i \in I_{uv}} (r_{ui} - r_{vi})^2\]

or

\[\text{msd}(i, j) = \frac{1}{|U_{ij}|} \cdot \sum\limits_{u \in U_{ij}} (r_{ui} - r_{uj})^2\]

depending on the user_based field of sim_options (see Similarity measure configuration).

The MSD-similarity is then defined as:

\[\begin{split}\text{msd_sim}(u, v) &= \frac{1}{\text{msd}(u, v) + 1}\\ \text{msd_sim}(i, j) &= \frac{1}{\text{msd}(i, j) + 1}\end{split}\]

The \(+ 1\) term is just here to avoid dividing by zero.

For details on MSD, see third definition on Wikipedia.
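
As a small worked example (toy ratings, not library code), for two users with four common items:

# Ratings of users u and v over their common items I_uv.
r_u = [4, 3, 5, 2]
r_v = [5, 3, 4, 1]

msd = sum((rui - rvi) ** 2 for rui, rvi in zip(r_u, r_v)) / float(len(r_u))
msd_sim = 1 / (msd + 1)
print(msd, msd_sim)  # 0.75 and 1 / 1.75, i.e. about 0.57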

surprise.similarities.pearson()

Compute the Pearson correlation coefficient between all pairs of users (or items).

Only common users (or items) are taken into account. The Pearson correlation coefficient can be seen as a mean-centered cosine similarity, and is defined as:

\[\text{pearson_sim}(u, v) = \frac{ \sum\limits_{i \in I_{uv}} (r_{ui} - \mu_u) \cdot (r_{vi} - \mu_{v})} {\sqrt{\sum\limits_{i \in I_{uv}} (r_{ui} - \mu_u)^2} \cdot \sqrt{\sum\limits_{i \in I_{uv}} (r_{vi} - \mu_{v})^2} }\]

or

\[\text{pearson_sim}(i, j) = \frac{ \sum\limits_{u \in U_{ij}} (r_{ui} - \mu_i) \cdot (r_{uj} - \mu_{j})} {\sqrt{\sum\limits_{u \in U_{ij}} (r_{ui} - \mu_i)^2} \cdot \sqrt{\sum\limits_{u \in U_{ij}} (r_{uj} - \mu_{j})^2} }\]

depending on the user_based field of sim_options (see Similarity measure configuration).

Note: if there are no common users or items, similarity will be 0 (and not -1).

For details on Pearson coefficient, see Wikipedia.

surprise.similarities.pearson_baseline()

Compute the (shrunk) Pearson correlation coefficient between all pairs of users (or items) using baselines for centering instead of means.

The shrinkage parameter helps to avoid overfitting when only a few ratings are available (see Similarity measure configuration).

The Pearson-baseline correlation coefficient is defined as:

\[\text{pearson_baseline_sim}(u, v) = \hat{\rho}_{uv} = \frac{ \sum\limits_{i \in I_{uv}} (r_{ui} - b_{ui}) \cdot (r_{vi} - b_{vi})} {\sqrt{\sum\limits_{i \in I_{uv}} (r_{ui} - b_{ui})^2} \cdot \sqrt{\sum\limits_{i \in I_{uv}} (r_{vi} - b_{vi})^2}}\]

or

\[\text{pearson_baseline_sim}(i, j) = \hat{\rho}_{ij} = \frac{ \sum\limits_{u \in U_{ij}} (r_{ui} - b_{ui}) \cdot (r_{uj} - b_{uj})} {\sqrt{\sum\limits_{u \in U_{ij}} (r_{ui} - b_{ui})^2} \cdot \sqrt{\sum\limits_{u \in U_{ij}} (r_{uj} - b_{uj})^2}}\]

The shrunk Pearson-baseline correlation coefficient is then defined as:

\[\begin{split}\text{pearson_baseline_shrunk_sim}(u, v) &= \frac{|I_{uv}| - 1} {|I_{uv}| - 1 + \text{shrinkage}} \cdot \hat{\rho}_{uv}\end{split}\]\[\begin{split}\text{pearson_baseline_shrunk_sim}(i, j) &= \frac{|U_{ij}| - 1} {|U_{ij}| - 1 + \text{shrinkage}} \cdot \hat{\rho}_{ij}\end{split}\]

Obviously, a shrinkage parameter of 0 amounts to no shrinkage at all.
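
A quick numeric illustration of the shrinkage factor above, with the default shrinkage of 100:

shrinkage = 100
for n_common in (2, 11, 101, 1001):
    factor = (n_common - 1) / float(n_common - 1 + shrinkage)
    print(n_common, round(factor, 3))
# 2 common ratings   -> factor ~ 0.01 (similarity almost wiped out)
# 101 common ratings -> factor = 0.5
# the factor tends to 1 as the support grows (no shrinkage)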

Note: here again, if there are no common users/items, similarity will be 0 (and not -1).

Motivations for such a similarity measure can be found in the Recommender System Handbook, section 5.4.1.

accuracy module

The surprise.accuracy module provides tools for computing accuracy metrics on a set of predictions.

Available accuracy metrics:

rmse Compute RMSE (Root Mean Squared Error).
mae Compute MAE (Mean Absolute Error).
fcp Compute FCP (Fraction of Concordant Pairs).
surprise.accuracy.fcp(predictions, verbose=True)[source]

Compute FCP (Fraction of Concordant Pairs).

Computed as described in paper Collaborative Filtering on Ordinal User Feedback by Koren and Sill, section 5.2.

Parameters:
  • predictions (list of Prediction) – A list of predictions, as returned by the test method.
  • verbose – If True, will print computed value. Default is True.
Returns:

The Fraction of Concordant Pairs.

Raises:

ValueError – When predictions is empty.

surprise.accuracy.mae(predictions, verbose=True)[source]

Compute MAE (Mean Absolute Error).

\[\text{MAE} = \frac{1}{|\hat{R}|} \sum_{\hat{r}_{ui} \in \hat{R}}|r_{ui} - \hat{r}_{ui}|\]
Parameters:
  • predictions (list of Prediction) – A list of predictions, as returned by the test method.
  • verbose – If True, will print computed value. Default is True.
Returns:

The Mean Absolute Error of predictions.

Raises:

ValueError – When predictions is empty.

surprise.accuracy.rmse(predictions, verbose=True)[source]

Compute RMSE (Root Mean Squared Error).

\[\text{RMSE} = \sqrt{\frac{1}{|\hat{R}|} \sum_{\hat{r}_{ui} \in \hat{R}}(r_{ui} - \hat{r}_{ui})^2}.\]
Parameters:
  • predictions (list of Prediction) – A list of predictions, as returned by the test method.
  • verbose – If True, will print computed value. Default is True.
Returns:

The Root Mean Squared Error of predictions.

Raises:

ValueError – When predictions is empty.

dataset module

The dataset module defines some tools for managing datasets.

Users may use both built-in and user-defined datasets (see the Getting Started page for examples). Right now, three built-in datasets are available: the movielens-100k, movielens-1m, and Jester datasets.

Built-in datasets can all be loaded (or downloaded if you haven’t already) using the Dataset.load_builtin() method. For each built-in dataset, Surprise also provides predefined readers, which are useful if you want to use a custom dataset that has the same format as a built-in one.

Summary:

Dataset.load_builtin Load a built-in dataset.
Dataset.load_from_file Load a dataset from a (custom) file.
Dataset.load_from_folds Load a dataset where folds (for cross-validation) are predefined by some files.
Dataset.folds Generator function to iterate over the folds of the Dataset.
DatasetAutoFolds.split Split the dataset into folds for future cross-validation.
Reader The Reader class is used to parse a file containing ratings.
Trainset A trainset contains all useful data that constitutes a training set.
class surprise.dataset.Dataset(reader)[source]

Base class for loading datasets.

Note that you should never instantiate the Dataset class directly (same goes for its derived classes), but instead use one of the three available methods for loading datasets.

folds()[source]

Generator function to iterate over the folds of the Dataset.

See User Guide for usage.

Yields:A tuple (trainset, testset) for the current fold.
classmethod load_builtin(name=u'ml-100k')[source]

Load a built-in dataset.

If the dataset has not already been loaded, it will be downloaded and saved. You will have to split your dataset using the split method. See an example in the User Guide.

Parameters:name (string) – The name of the built-in dataset to load. Accepted values are ‘ml-100k’, ‘ml-1m’, and ‘jester’. Default is ‘ml-100k’.
Returns:A Dataset object.
Raises:ValueError – If the name parameter is incorrect.
classmethod load_from_file(file_path, reader)[source]

Load a dataset from a (custom) file.

Use this if you want to use a custom dataset and all of the ratings are stored in one file. You will have to split your dataset using the split method. See an example in the User Guide.

Parameters:
  • file_path (string) – The path to the file containing ratings.
  • reader (Reader) – A reader to read the file.
classmethod load_from_folds(folds_files, reader)[source]

Load a dataset where folds (for cross-validation) are predefined by some files.

The purpose of this method is to cover a common use case where a dataset is already split into predefined folds, such as the movielens-100k dataset which defines files u1.base, u1.test, u2.base, u2.test, etc... It can also be used when you don’t want to perform cross-validation but still want to specify your training and testing data (which comes down to 1-fold cross-validation anyway). See an example in the User Guide.

Parameters:
  • folds_files (iterable of tuples) – The list of the folds. A fold is a tuple of the form (path_to_train_file, path_to_test_file).
  • reader (Reader) – A reader to read the files.
class surprise.dataset.DatasetAutoFolds(ratings_file=None, reader=None)[source]

A derived class from Dataset for which folds (for cross-validation) are not predefined (or for when there are no folds at all).

build_full_trainset()[source]

Do not split the dataset into folds and just return a trainset as is, built from the whole dataset.

User can then query for predictions, as shown in the User Guide.

Returns:The Trainset.
split(n_folds=5, shuffle=True)[source]

Split the dataset into folds for future cross-validation.

If you forget to call split(), the dataset will be automatically shuffled and split for 5-folds cross-validation.

You can obtain repeatable splits over all your experiments by seeding the RNG:

import random
random.seed(my_seed)  # call this before you call split!
Parameters:
  • n_folds (int) – The number of folds.
  • shuffle (bool) – Whether to shuffle ratings before splitting. If False, folds will always be the same each time the experiment is run. Default is True.
class surprise.dataset.Reader(name=None, line_format=None, sep=None, rating_scale=(1, 5), skip_lines=0)[source]

The Reader class is used to parse a file containing ratings.

Such a file is assumed to specify only one rating per line, and each line needs to respect the following structure:

user ; item ; rating ; [timestamp]

where the order of the fields and the separator (here ‘;’) may be arbitrarily defined (see below). Brackets indicate that the timestamp field is optional.

Parameters:
  • name (string, optional) – If specified, a Reader for one of the built-in datasets is returned and any other parameter is ignored. Accepted values are ‘ml-100k’, ‘ml-1m’, and ‘jester’. Default is None.
  • line_format (string) – The field names, in the order in which they are encountered on a line. Example: 'item user rating'.
  • sep (char) – the separator between fields. Example : ';'.
  • rating_scale (tuple, optional) – The rating scale used for every rating. Default is (1, 5).
  • skip_lines (int, optional) – Number of lines to skip at the beginning of the file. Default is 0.
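
For example, a minimal sketch for a hypothetical comma-separated file with one header line and ratings given on a 1-10 scale (the file path is a placeholder):

from surprise import Dataset
from surprise import Reader

# Each line of the (hypothetical) file looks like: 'itemID,userID,rating'.
reader = Reader(line_format='item user rating', sep=',',
                rating_scale=(1, 10), skip_lines=1)

data = Dataset.load_from_file('./my_ratings.csv', reader=reader)
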
class surprise.dataset.Trainset(rm, ur, ir, n_users, n_items, r_min, r_max, raw2inner_id_users, raw2inner_id_items)[source]

A trainset contains all useful data that constitutes a training set.

It is used by the train() method of every prediction algorithm. You should not try to build such an object on your own but rather use the Dataset.folds() method or the DatasetAutoFolds.build_full_trainset() method.

rm

defaultdict of int

A dictionary containing all known ratings. Keys are tuples (user_inner_id, item_inner_id), values are ratings. rm stands for ratings matrix, even though it’s not a proper matrix object.

ur

defaultdict of list

A dictionary containing lists of tuples of the form (item_inner_id, rating). Keys are user inner ids. ur stands for user ratings.

ir

defaultdict of list

A dictionary containing lists of tuples of the form (user_inner_id, rating). Keys are item inner ids. ir stands for item ratings.

n_users

Total number of users \(|U|\).

n_items

Total number of items \(|I|\).

n_ratings

Total number of ratings \(|R_{train}|\).

r_min

Minimum value of the rating scale.

r_max

Maximum value of the rating scale.

global_mean

The mean of all ratings \(\mu\).

all_items()[source]

Generator function to iterate over all items.

Yields:Inner id of items.
all_ratings()[source]

Generator function to iterate over all ratings.

Yields:A tuple (uid, iid, rating) where ids are inner ids.
all_users()[source]

Generator function to iterate over all users.

Yields:Inner id of users.
global_mean

Return the mean of all ratings.

It’s only computed once.

knows_item(iid)[source]

Indicate if the item is part of the trainset.

An item is part of the trainset if the item was rated at least once.

Parameters:iid – The (inner) item id. See this note.
Returns:True if item is part of the trainset, else False.
knows_user(uid)[source]

Indicate if the user is part of the trainset.

A user is part of the trainset if the user has at least one rating.

Parameters:uid – The (inner) user id. See this note.
Returns:True if user is part of the trainset, else False.
to_inner_iid(riid)[source]

Convert a raw item id to an inner id.

See this note.

Parameters:riid – The item raw id.
Returns:The item inner id.
Raises:ValueError – When item is not part of the trainset.
to_inner_uid(ruid)[source]

Convert a raw user id to an inner id.

See this note.

Parameters:ruid – The user raw id.
Returns:The user inner id.
Raises:ValueError – When user is not part of the trainset.

evaluate module

The evaluate module defines the evaluate() function.

surprise.evaluate.evaluate(algo, data, measures=[u'rmse', u'mae'], with_dump=False, dump_dir=None, verbose=1)[source]

Evaluate the performance of the algorithm on given data.

Depending on the nature of the data parameter, it may or may not perform cross validation.

Parameters:
  • algo (AlgoBase) – The algorithm to evaluate.
  • data (Dataset) – The dataset on which to evaluate the algorithm.
  • measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
  • with_dump (bool) – If True, the predictions, the trainsets and the algorithm parameters will be dumped for later further analysis at each fold (see User Guide). The file names will be set as: '<date>-<algorithm name>-<fold number>'. Default is False.
  • dump_dir (str) – The directory where to dump the files. Default is '~/.surprise_data/dumps/'.
  • verbose (int) – Level of verbosity. If 0, nothing is printed. If 1 (default), accuracy measures for each folds are printed, with a final summary. If 2, every prediction is printed.
Returns:

A dictionary containing measures as keys and lists as values. Each list contains one entry per fold.
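
A minimal usage sketch (assuming, as in the examples above, that the returned dictionary can be indexed with the measure names passed in):

from surprise import BaselineOnly
from surprise import Dataset
from surprise import evaluate

data = Dataset.load_builtin('ml-100k')
data.split(n_folds=3)

perf = evaluate(BaselineOnly(), data, measures=['RMSE', 'MAE'])

print(perf['RMSE'])                           # one value per fold
print(sum(perf['RMSE']) / len(perf['RMSE']))  # mean RMSE over the folds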

dump module

The dump module defines the dump() function.

surprise.dump.dump(file_name, predictions, trainset=None, algo=None)[source]

Dump a list of predictions for future analysis, using Pickle.

If needed, the trainset object and the algorithm can also be dumped. What is dumped is a dictionary with keys 'predictions', 'trainset', and 'algo'.

The dumped algorithm won’t be a proper algorithm object but simply a dictionary with the algorithm attributes as key-value pairs (technically, the algo.__dict__ attribute).

See User Guide for usage.

Parameters:
  • file_name (str) – The name (with full path) specifying where to dump the predictions.
  • predictions (list of Prediction) – The predictions to dump.
  • trainset (Trainset, optional) – The trainset to dump.
  • algo (Algorithm, optional) – The algorithm to dump.