evaluate module
The evaluate module defines the evaluate() function and the GridSearch class.

class surprise.evaluate.GridSearch(algo_class, param_grid, measures=[u'rmse', u'mae'], verbose=1)
The GridSearch class evaluates the performance of an algorithm on various combinations of parameters and extracts the best combination. It is analogous to GridSearchCV from scikit-learn. See the User Guide for usage.
Parameters:
algo_class (AlgoBase) – The class object of the algorithm to evaluate.
param_grid (dict) – A dictionary with algo_class parameter names as keys (strings) and lists of values to try as values. All combinations are evaluated with the given algorithm.
measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
verbose (int) – Level of verbosity. If 0, nothing is printed. If 1, accuracy measures for each parameter combination are printed, along with the combination values. If 2, per-fold accuracy values are also printed. Default is 1.
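The statement that all combinations in param_grid are evaluated can be illustrated with a small standalone sketch. This is not the library's implementation; expand_param_grid is a hypothetical helper, and the parameter names in the example grid are only illustrative.

```python
from itertools import product


def expand_param_grid(param_grid):
    """Return one dict per combination of the values in param_grid.

    Hypothetical helper, written only to illustrate what "all combinations"
    means for a grid given as {parameter name: list of values}.
    """
    names = list(param_grid)
    value_lists = [param_grid[name] for name in names]
    # product() yields the Cartesian product of the value lists.
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Illustrative grid: two parameters with two candidate values each.
param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}
combinations = expand_param_grid(param_grid)

print(len(combinations))  # 4
print(combinations[0])    # {'n_epochs': 5, 'lr_all': 0.002}
```

The number of combinations is the product of the list lengths, so a grid grows quickly as parameters are added.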

cv_results
dict of arrays – A dict that contains all parameters and accuracy information for each combination. Can be imported into a pandas DataFrame.

best_estimator
dict of AlgoBase – Using an accuracy measure as key, get the estimator that gave the best accuracy results for the chosen measure.

best_score
dict of floats – Using an accuracy measure as key, get the best score achieved for that measure.

best_params
dict of dicts – Using an accuracy measure as key, get the parameters combination that gave the best accuracy results for the chosen measure.

best_index
dict of ints – Using an accuracy measure as key, get the index into cv_results of the combination that achieved the best accuracy for that measure.
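How the per-measure attributes relate to each other can be sketched as follows. This is a minimal illustration, not the library's code: best_per_measure is a hypothetical function, the scores are placeholder values invented for the example, and it assumes every measure is an error (lower is better), as is the case for RMSE and MAE.

```python
def best_per_measure(params, scores):
    """For each measure, pick the combination with the lowest mean error.

    params: list of parameter dicts, one per combination.
    scores: dict mapping a measure name to a list of mean scores,
            aligned index by index with params.
    Assumption: lower is better for every measure.
    """
    best_index, best_score, best_params = {}, {}, {}
    for measure, values in scores.items():
        # Index of the smallest score for this measure.
        idx = min(range(len(values)), key=values.__getitem__)
        best_index[measure] = idx
        best_score[measure] = values[idx]
        best_params[measure] = params[idx]
    return best_index, best_score, best_params


# Placeholder values, invented for illustration only.
params = [{'n_epochs': 5}, {'n_epochs': 10}, {'n_epochs': 20}]
scores = {'rmse': [0.98, 0.95, 0.96], 'mae': [0.77, 0.76, 0.75]}

best_index, best_score, best_params = best_per_measure(params, scores)
print(best_index)           # {'rmse': 1, 'mae': 2}
print(best_params['rmse'])  # {'n_epochs': 10}
```

Each result is keyed by measure because different measures can favor different combinations, as the placeholder scores show.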

surprise.evaluate.evaluate(algo, data, measures=[u'rmse', u'mae'], with_dump=False, dump_dir=None, verbose=1)
Evaluate the performance of the algorithm on given data.
Depending on the nature of the data parameter, it may or may not perform cross validation.
Parameters:
algo (AlgoBase) – The algorithm to evaluate.
data (Dataset) – The dataset on which to evaluate the algorithm.
measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
with_dump (bool) – If True, the predictions, the trainsets and the algorithm parameters are dumped at each fold for later analysis (see User Guide). The file names are set as '<date><algorithm name><fold number>'. Default is False.
dump_dir (str) – The directory in which to write the dump files. Default is '~/.surprise_data/dumps/'.
verbose (int) – Level of verbosity. If 0, nothing is printed. If 1 (default), accuracy measures for each fold are printed, with a final summary. If 2, every prediction is printed.
Returns: A dictionary containing measures as keys and lists as values. Each list contains one entry per fold.
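Since the returned dictionary holds one list of per-fold values for each measure, a summary can be computed by averaging each list. The sketch below works on a plain dict with the same shape. The numbers are placeholders invented for the example, and summarize is a hypothetical helper rather than part of the module.

```python
from statistics import mean


def summarize(results):
    """Average the per-fold values of each measure.

    results: dict mapping a measure name to a list with one entry per fold,
             i.e. the shape described for the return value of evaluate().
    """
    return {measure: mean(values) for measure, values in results.items()}


# Placeholder per-fold values (three folds), invented for illustration only.
results = {'rmse': [0.5, 1.0, 1.5], 'mae': [0.25, 0.5, 0.75]}

summary = summarize(results)
print(summary)  # {'rmse': 1.0, 'mae': 0.5}
```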