Represents Sparse Feature where IDs are set by Hashing

Use this when your sparse features are in string or integer format and you want to distribute your inputs into a finite number of buckets by hashing: output_id = Hash(input_feature_string) mod hash_bucket_size.

For the input features, features$key is either a tensor or a sparse tensor object. If it is a tensor object, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.
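The bucketing rule above can be illustrated with a minimal base R sketch. The helpers toy_hash() and to_bucket() are hypothetical names introduced only for this illustration; the polynomial hash is a stand-in and is not the fingerprint function TensorFlow uses, so the bucket ids it produces will differ from those of the real column. Converting values with as.character() before hashing is likewise an assumption of this sketch.

```r
# Toy polynomial string hash (illustration only, NOT TensorFlow's hash).
toy_hash <- function(x) {
  codes <- utf8ToInt(x)
  h <- 0
  for (code in codes) {
    # Reduce at every step so the double arithmetic stays exact.
    h <- (h * 31 + code) %% 2147483647
  }
  h
}

# Map each feature value to a bucket id in [0, hash_bucket_size).
to_bucket <- function(values, hash_bucket_size) {
  stopifnot(hash_bucket_size > 1)
  keys <- as.character(values)
  vapply(
    keys,
    function(k) toy_hash(k) %% hash_bucket_size,
    numeric(1),
    USE.NAMES = FALSE
  )
}

# Equal inputs always land in the same bucket; distinct inputs may collide.
ids <- to_bucket(c("cat", "dog", "cat", "bird"), hash_bucket_size = 10)
```

Because distinct values can share a bucket, a larger hash_bucket_size reduces collisions at the cost of a larger id space.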

column_categorical_with_hash_bucket(..., hash_bucket_size, dtype = tf$string)

Arguments

...

Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.

hash_bucket_size

An int > 1. The number of buckets.

dtype

The type of features. Only string and integer types are supported.

Value

A _HashedCategoricalColumn.

Raises

  • ValueError: hash_bucket_size is not greater than 1.

  • ValueError: dtype is neither string nor integer.

See also

Other feature column constructors: column_bucketized, column_categorical_weighted, column_categorical_with_identity, column_categorical_with_vocabulary_file, column_categorical_with_vocabulary_list, column_crossed, column_embedding, column_numeric, input_layer