Reader(name=None, line_format=u'user item rating', sep=None, rating_scale=(1, 5), skip_lines=0)
The Reader class is used to parse a file containing ratings.
Such a file is assumed to specify only one rating per line, and each line needs to respect the following structure:
user ; item ; rating ; [timestamp]
where the order of the fields and the separator (here ‘;’) may be arbitrarily defined (see below). Brackets indicate that the timestamp field is optional.
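The line structure described above can be illustrated with a minimal sketch. It uses only the standard library, and the helper name parse_rating_line is hypothetical: it is not part of Surprise and is not meant to reflect how Reader is implemented internally. It only shows how a space-separated list of field names and a separator can together describe one line.

```python
def parse_rating_line(line, line_format="user item rating", sep=None):
    """Hypothetical helper (not part of Surprise): name the fields of one line."""
    names = line_format.split()  # field names are always space-separated
    # sep=None makes str.split() split on any run of whitespace
    values = [value.strip() for value in line.split(sep)]
    if len(values) != len(names):
        raise ValueError("expected %d fields, got %d" % (len(names), len(values)))
    record = dict(zip(names, values))
    record["rating"] = float(record["rating"])  # ratings are numeric
    return record


# Same fields, different order and separator
print(parse_rating_line("196 ; 242 ; 3 ; 881250949",
                        "user item rating timestamp", sep=";"))
print(parse_rating_line("242,196,3", "item user rating", sep=","))
```

In both calls the field names come from the format string and the separator only describes how the line itself is split.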
For each built-in dataset, Surprise also provides predefined readers which are useful if you want to use a custom dataset that has the same format as a built-in one (see the
- name (string, optional) – If specified, a Reader for one of the built-in datasets is returned and any other parameter is ignored. Accepted values are ‘ml-100k’, ‘ml-1m’, and ‘jester’. Default is None.
- line_format (string) – The field names, in the order in which they are encountered on a line. Please note that line_format is always space-separated (to set the separator used in the file, use the sep parameter). Default is 'user item rating'.
- sep (char) – The separator between fields, for example ‘;’ as in the line structure shown above. Default is None.
- rating_scale (tuple, optional) – The rating scale used for every rating. Default is (1, 5).
- skip_lines (int, optional) – Number of lines to skip at the beginning of the file. Default is 0.
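As a rough illustration of how the parameters above combine, the following standard-library sketch reads a small in-memory file that starts with a header line. The function read_ratings and the sample data are hypothetical and are not part of Surprise; the sketch only mimics the documented meaning of line_format, sep and skip_lines, and it does not model rating_scale.

```python
import io
from itertools import islice


def read_ratings(fileobj, line_format="user item rating", sep=None, skip_lines=0):
    """Hypothetical loop (not Surprise code): yield (user, item, rating) tuples."""
    names = line_format.split()
    # Skip the first `skip_lines` lines, e.g. a header row
    for line in islice(fileobj, skip_lines, None):
        values = (value.strip() for value in line.split(sep))
        fields = dict(zip(names, values))
        yield fields["user"], fields["item"], float(fields["rating"])


# Made-up sample: one header line followed by two ratings
sample = io.StringIO("userId,movieId,rating\n1,31,2.5\n1,1029,3.0\n")
ratings = list(read_ratings(sample, sep=",", skip_lines=1))
print(ratings)
```

Without skip_lines=1 the header row would be treated as a rating line and the conversion of its rating field to a number would fail.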