evaluate module
The evaluate module defines the evaluate() function and the GridSearch class.
-
class surprise.evaluate.GridSearch(algo_class, param_grid, measures=[u'rmse', u'mae'], verbose=1)
The GridSearch class evaluates the performance of an algorithm on various combinations of parameters and extracts the best combination. It is analogous to GridSearchCV from scikit-learn. See User Guide for usage.
Parameters:
- algo_class (AlgoBase) – The class of the algorithm to evaluate.
- param_grid (dict) – A dictionary with algo_class parameter names (strings) as keys and lists of values to try as values. All combinations will be evaluated with the given algorithm.
- measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
- verbose (int) – Level of verbosity. If 0, nothing is printed. If 1, accuracy measures for each parameter combination are printed, along with the combination values. If 2, per-fold accuracy values are also printed. Default is 1.
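As a rough illustration of what "all combinations" means for param_grid, the sketch below expands a grid dictionary into individual parameter dictionaries with itertools.product. It is not Surprise's own implementation; the helper name expand_grid and the parameter names in the example grid are assumptions chosen for illustration.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per combination of the values in param_grid.

    Hypothetical helper for illustration; not part of Surprise.
    """
    names = list(param_grid)  # parameter names, in insertion order
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# Example grid; the parameter names are illustrative assumptions.
param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}
combinations = list(expand_grid(param_grid))
print(len(combinations))  # 2 values x 2 values = 4 combinations
```

The number of combinations is the product of the list lengths, so the grid grows quickly as parameters are added.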
-
cv_results
dict of arrays – A dict that contains all parameters and accuracy information for each combination. Can be imported into a pandas DataFrame.
-
best_estimator
dict of AlgoBase – Using an accuracy measure as key, get the estimator that gave the best accuracy results for the chosen measure.
-
best_score
dict of floats – Using an accuracy measure as key, get the best score achieved for that measure.
-
best_params
dict of dicts – Using an accuracy measure as key, get the parameter combination that gave the best accuracy results for the chosen measure.
-
best_index
dict of ints – Using an accuracy measure as key, get the index into cv_results of the combination that achieved the best accuracy for that measure.
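The best_* attributes above are all keyed by measure name. The sketch below mimics that layout on made-up scores to show how a best index relates to the best score and the best parameters. The helper best_by_measure, the scores, and the assumption that lower is better (which holds for error measures such as RMSE and MAE) are illustrative only; this is not Surprise's code.

```python
def best_by_measure(params, scores):
    """Return (best_index, best_score, best_params) dicts keyed by measure.

    Hypothetical helper: `params` is a list of parameter dicts and `scores`
    maps each measure name to a list of scores, one per combination.
    Assumes lower is better, as for error measures such as RMSE and MAE.
    """
    best_index, best_score, best_params = {}, {}, {}
    for measure, values in scores.items():
        # Index of the smallest score for this measure.
        idx = min(range(len(values)), key=values.__getitem__)
        best_index[measure] = idx
        best_score[measure] = values[idx]
        best_params[measure] = params[idx]
    return best_index, best_score, best_params


# Made-up results for three parameter combinations.
params = [{'n_epochs': 5}, {'n_epochs': 10}, {'n_epochs': 20}]
scores = {'rmse': [0.98, 0.95, 0.96], 'mae': [0.78, 0.76, 0.75]}

best_index, best_score, best_params = best_by_measure(params, scores)
print(best_index['rmse'], best_params['rmse'])  # 1 {'n_epochs': 10}
```

Note that different measures can select different combinations, which is why each attribute is a dict keyed by measure and not a single value.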
-
surprise.evaluate.evaluate(algo, data, measures=[u'rmse', u'mae'], with_dump=False, dump_dir=None, verbose=1)
Evaluate the performance of the algorithm on given data.
Depending on the nature of the data parameter, it may or may not perform cross validation.
Parameters:
- algo (AlgoBase) – The algorithm to evaluate.
- data (Dataset) – The dataset on which to evaluate the algorithm.
- measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
- with_dump (bool) – If True, the predictions and the algorithm will be dumped at each fold for later analysis (see FAQ). The file names will be set as: '<date>-<algorithm name>-<fold number>'. Default is False.
- dump_dir (str) – The directory in which to dump the files. Default is '~/.surprise_data/dumps/'.
- verbose (int) – Level of verbosity. If 0, nothing is printed. If 1 (default), accuracy measures for each fold are printed, with a final summary. If 2, every prediction is printed.
Returns: A dictionary containing measures as keys and lists as values. Each list contains one entry per fold.
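Because the returned dictionary holds one entry per fold for each measure, averaging over the folds is a natural next step. A minimal sketch, using a hand-written dictionary in place of a real evaluate() result (the numbers are made up, and the lowercase key spelling is an assumption):

```python
from statistics import mean

# Stand-in for the value returned by evaluate(); the numbers are made up
# and the key spelling is an assumption.
perf = {'rmse': [0.94, 0.96, 0.95], 'mae': [0.74, 0.76, 0.75]}

# Average each measure over the folds.
summary = {measure: mean(values) for measure, values in perf.items()}

for measure, value in summary.items():
    print('{}: {:.2f}'.format(measure, value))
```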