Prepare a dataset for analysis

Transform a dataset with named columns into a list with features (x) and response (y) elements.

dataset_prepare(dataset, x, y = NULL, named = TRUE,
  named_features = FALSE, parallel_records = NULL)

Arguments

dataset

A dataset with named columns.

x

Features to include. When named_features is FALSE, all features are stacked into a single tensor, so they must share the same data type.

y

(Optional) Response variable.

named

TRUE to name the dataset elements "x" and "y"; FALSE to leave them unnamed.

named_features

TRUE to yield features as a named list; FALSE (the default) to stack features into a single 2D tensor. When features are stacked, they must all have the same underlying data type.

parallel_records

(Optional) An integer, representing the number of records to decode in parallel. If not specified, records will be processed sequentially.

Value

A dataset. Its elements will have one of the following structures:

  • When named_features is TRUE: list(x = list(feature_name = feature_values, ...), y = response_values)

  • When named_features is FALSE: list(x = features_array, y = response_values), where features_array is a rank-2 array of shape (batch_size, num_features).

Note that the y element will be omitted when y is NULL.
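The structures above can be illustrated with plain R objects. The sketch below is only an analogue: it uses base R lists and a matrix in place of TensorFlow tensors, takes the first rows of the built-in mtcars data as a stand-in batch, and does not call dataset_prepare() itself. The variable names are hypothetical.

```r
# A stand-in batch of four records; mtcars ships with base R.
batch <- mtcars[1:4, c("mpg", "disp", "cyl")]
feature_cols <- c("mpg", "disp")

# Analogue of named_features = TRUE: one named entry per feature.
named_form <- list(
  x = as.list(batch[feature_cols]),
  y = batch$cyl
)

# Analogue of named_features = FALSE: features stacked into a
# rank-2 array of shape (batch_size, num_features).
stacked_form <- list(
  x = unname(as.matrix(batch[feature_cols])),
  y = batch$cyl
)

# Analogue of y = NULL: the y element is omitted entirely.
features_only <- list(x = unname(as.matrix(batch[feature_cols])))

str(named_form)
dim(stacked_form$x)   # 4 2
names(features_only)  # "x"
```

The stacked form also shows why the features must share a data type: like a tensor, an R matrix holds a single element type.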

See also

input_fn() for use with tfestimators.
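
Examples

A minimal usage sketch, assuming the tfdatasets package and a working TensorFlow installation are available. The temporary CSV file, the choice of mpg and disp as features, cyl as the response, and the batch size are illustrative assumptions. csv_record_spec(), text_line_dataset(), and dataset_batch() are other tfdatasets functions that are not described on this page.

```r
library(tfdatasets)

# Write the built-in mtcars data to a temporary CSV so that the
# example does not depend on any external file.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Read the file as a dataset with named columns.
spec <- csv_record_spec(csv_path)
dataset <- text_line_dataset(csv_path, record_spec = spec)

# Default: stack the selected features into a single tensor "x"
# and use cyl as the response "y". Both features are doubles, so
# they satisfy the identical-type requirement.
prepared <- dataset_prepare(dataset, x = c(mpg, disp), y = cyl)

# Alternative: keep the features as a named list and decode two
# records in parallel.
prepared_named <- dataset_prepare(
  dataset,
  x = c(mpg, disp),
  y = cyl,
  named_features = TRUE,
  parallel_records = 2
)

# Batch the prepared records (batch size chosen arbitrarily).
batched <- dataset_batch(prepared, 8)
```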