What Does buffer_size Do in TensorFlow Dataset Shuffling?
Solution 1:
It's used as the buffer_size argument in tf.data.Dataset.shuffle. Have you read the docs?
This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.
For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements in the buffer. Once an element is selected, its space in the buffer is replaced by the next (i.e. 1,001-st) element, maintaining the 1,000 element buffer.
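The buffer mechanics described above can be sketched in plain Python. This is only a model of the documented behaviour, not TensorFlow's actual implementation; the function name buffered_shuffle and its seed parameter are made up for illustration.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Pure-Python model of a shuffle buffer (hypothetical helper, not TensorFlow code)."""
    rng = random.Random(seed)
    source = iter(items)

    # Fill the buffer with the first buffer_size elements.
    buffer = []
    for item in source:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            break

    output = []
    while buffer:
        # Pick a random slot in the buffer and emit its element.
        i = rng.randrange(len(buffer))
        output.append(buffer[i])
        try:
            # Refill the vacated slot with the next unread element.
            buffer[i] = next(source)
        except StopIteration:
            # Input exhausted: shrink the buffer instead of refilling.
            buffer[i] = buffer[-1]
            buffer.pop()
    return output


shuffled = buffered_shuffle(range(10_000), buffer_size=1_000, seed=0)
print(shuffled[0] < 1_000)                    # first pick comes from the first 1,000 elements
print(buffered_shuffle(range(5), buffer_size=1))  # a buffer of 1 leaves the order unchanged
```

In this model, a buffer_size of 1 does no shuffling at all, while a buffer_size at least as large as the dataset draws every element from the full set of remaining elements, which is why the docs call that case perfect shuffling.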
Solution 2:
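According to the TensorFlow documentation, buffer_size sets how many elements shuffle draws from at each step. The first output element is picked at random from the first buffer_size elements of the dataset. The slot it vacates is refilled with the next unread element, and every later output is again picked at random from the whole buffer, so the output does not simply continue in order from the first pick.
samples = 1000
buffer_size = 100
The first element is chosen at random from indices 0 to 99, say 37. Element 100 then takes its place in the buffer, and the second element is chosen at random from indices 0 to 100, excluding 37.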
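Going by the documentation's description quoted in Solution 1 (fill the buffer, pick at random, refill the vacated slot), the arithmetic for 1000 samples and a buffer of 100 can be worked out as follows. The two helper names are hypothetical and exist only for this illustration; positions and indices are 0-based.

```python
def max_source_index(position, buffer_size, num_samples):
    """Largest original index that can show up at this output position.

    By output step `position`, buffer_size + position elements have been read.
    """
    return min(num_samples, position + buffer_size) - 1


def earliest_position(index, buffer_size):
    """First output position at which the element at `index` can appear."""
    return max(0, index - buffer_size + 1)


samples, buffer_size = 1000, 100
print(max_source_index(0, buffer_size, samples))  # 99: first output comes from indices 0..99
print(max_source_index(1, buffer_size, samples))  # 100: second output can also be index 100
print(earliest_position(999, buffer_size))        # 900: the last element cannot appear earlier
```

So an element can be delayed arbitrarily long, but it can never move forward by more than buffer_size - 1 positions.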