Creates a dataset that includes only 1 / num_shards of the elements of this dataset.

This dataset operator is useful for distributed training, as it allows each worker to read a unique subset of the data.

dataset_shard(dataset, num_shards, index)

Arguments

dataset

A dataset

num_shards

An integer representing the number of shards operating in parallel.

index

An integer representing the worker index.

Value

A dataset containing only the elements assigned to the given shard.
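
The sketch below illustrates how the elements might be divided among workers. It uses base R only and assumes the selection rule of the underlying tf.data shard operation: with zero-based element positions, a worker keeps the elements whose position modulo num_shards equals its index, so index is assumed to range from 0 to num_shards - 1. The helper shard_elements is hypothetical and not part of tfdatasets.

```r
# Hypothetical helper (not part of tfdatasets): emulates the assumed
# sharding rule on a plain R vector instead of a TensorFlow dataset.
shard_elements <- function(x, num_shards, index) {
  # Assumption: index is zero-based and must be smaller than num_shards.
  stopifnot(num_shards >= 1, index >= 0, index < num_shards)
  positions <- seq_along(x) - 1L  # zero-based element positions
  x[positions %% num_shards == index]
}

elements <- 0:9

# Three workers, each reading a disjoint subset of the same ten elements.
worker_0 <- shard_elements(elements, num_shards = 3, index = 0)
worker_1 <- shard_elements(elements, num_shards = 3, index = 1)
worker_2 <- shard_elements(elements, num_shards = 3, index = 2)
```

Under the same assumption, the first worker in a real pipeline would call dataset_shard(dataset, num_shards = 3, index = 0), the second would pass index = 1, and so on. Together the shards cover every element exactly once.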