The algorithm base class

The surprise.prediction_algorithms.algo_base module defines the base class AlgoBase, from which every prediction algorithm must inherit.

class surprise.prediction_algorithms.algo_base.AlgoBase(**kwargs)

Abstract class that defines the basic behavior of a prediction algorithm.

Keyword Arguments: bsl_options (dict, optional) – If the algorithm needs to compute baseline estimates, the bsl_options parameter configures how they are computed. See Baselines estimates configuration for usage.

compute_baselines()

Compute user and item baselines.

The way baselines are computed depends on the bsl_options parameter passed at the creation of the algorithm (see Baselines estimates configuration).

This method is only relevant for algorithms using Pearson baseline similarity or the BaselineOnly algorithm.

Returns: A tuple (bu, bi) containing the user baselines and the item baselines.

compute_similarities()

Build the similarity matrix.

The way the similarity matrix is computed depends on the sim_options parameter passed at the creation of the algorithm (see Similarity measure configuration).

This method is only relevant for algorithms using a similarity measure, such as the k-NN algorithms.

Returns: The similarity matrix.

default_prediction()

Used when the PredictionImpossible exception is raised during a call to predict(). By default, return the global mean of all ratings (can be overridden in child classes).

Returns: The mean of all ratings in the trainset.
Return type: float

fit(trainset)

Train an algorithm on a given training set.

This method is called by every derived class as the first step of training an algorithm. It initializes some internal structures and sets the self.trainset attribute.

Parameters: trainset (Trainset) – A training set, as returned by the folds method.
Returns: self

get_neighbors(iid, k)

Return the k nearest neighbors of iid, which is the inner id of a user or an item, depending on the user_based field of sim_options (see Similarity measure configuration).

Because neighbors are ranked using the similarity matrix, this method is only relevant for algorithms using a similarity measure, such as the k-NN algorithms.

For a usage example, see the FAQ.

Parameters:

iid (int) – The (inner) id of the user (or item) for which we want the nearest neighbors. See this note.
k (int) – The number of neighbors to retrieve.

Returns:

The list of the k (inner) ids of the closest users (or items) to iid.

predict(uid, iid, r_ui=None, clip=True, verbose=False)

Compute the rating prediction for a given user and item.

The predict method converts raw ids to inner ids and then calls the estimate method, which is defined in every derived class. If the prediction is impossible (e.g. because the user and/or the item is unknown), the prediction is set according to default_prediction().

Parameters:

uid – (Raw) id of the user. See this note.
iid – (Raw) id of the item. See this note.
r_ui (float) – The true rating \(r_{ui}\). Optional, default is None.
clip (bool) – Whether to clip the estimate to the rating scale. For example, if \(\hat{r}_{ui}\) is \(5.5\) while the rating scale is \([1, 5]\), then \(\hat{r}_{ui}\) is set to \(5\). Likewise, if \(\hat{r}_{ui} < 1\), it is set to \(1\). Default is True.
verbose (bool) – Whether to print details of the prediction. Default is False.

Returns:

A Prediction object containing:

The (raw) user id uid.
The (raw) item id iid.
The true rating r_ui (\(r_{ui}\)).
The estimated rating (\(\hat{r}_{ui}\)).
Some additional details about the prediction that might be useful for later analysis.

test(testset, verbose=False)

Test the algorithm on a given testset, i.e. estimate all the ratings in the given testset.

Parameters:

testset – A test set, as returned by a cross-validation iterator or by the build_testset() method.
verbose (bool) – Whether to print details for each prediction. Default is False.

Returns:

A list of Prediction objects that contains all the estimated ratings.