Code:
(123, 3)
(100, 3)
(134, 3)
...
Code:
import tensorflow as tf

lengths = tf.random.uniform(shape=(10,), minval=5, maxval=10, dtype=tf.int32)
# this would be a dataset of filenames in my code
dataset = tf.data.Dataset.from_tensor_slices(lengths)

def fake_read_file(tensor):
    # this function would read and preprocess data from a file in my code;
    # for demonstration purposes it just returns dummy data with a varying length in the first dimension
    dummy_data = tf.convert_to_tensor([[0.2, 0.2, 0.2]])
    return tf.repeat(dummy_data, tensor, axis=0)

# I wrap the function in tf.py_function because the file read can only be done in eager mode
dataset = dataset.map(lambda x: tf.py_function(fake_read_file, inp=[x], Tout=tf.float32))
For now, I have come up with the following solution:
Code:
dataset = dataset.map(lambda x: tf.RaggedTensor.from_tensor(x))
dataset = dataset.batch(batch_size=2, drop_remainder=True)
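A possible alternative, offered only as a sketch and not as a confirmed answer to the question: tf.py_function drops the static shape of its output, so the shape (None, 3) can be declared again with tf.ensure_shape, and the elements can then be batched as ragged along the first dimension with tf.data.experimental.dense_to_ragged_batch. The fixed lengths, the helper name load, and the batch size of 2 are assumptions made for illustration; newer TensorFlow releases also appear to expose the same transformation as Dataset.ragged_batch.

```python
import tensorflow as tf

# Fixed lengths instead of random ones so the result is reproducible (illustrative values).
lengths = tf.constant([5, 7, 6, 9], dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices(lengths)

def fake_read_file(length):
    # Stand-in for reading a file: returns `length` rows with 3 features each.
    dummy_data = tf.convert_to_tensor([[0.2, 0.2, 0.2]])
    return tf.repeat(dummy_data, length, axis=0)

def load(length):
    # Hypothetical helper: runs the eager-only reader inside the tf.data pipeline.
    data = tf.py_function(fake_read_file, inp=[length], Tout=tf.float32)
    # tf.py_function loses the static shape, so declare it again: (variable, 3).
    return tf.ensure_shape(data, [None, 3])

dataset = dataset.map(load)
# Batch dense elements of different lengths into one RaggedTensor per batch.
dataset = dataset.apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=2, drop_remainder=True)
)

batches = list(dataset)
first_batch_row_lengths = batches[0].row_lengths().numpy().tolist()
```

With this layout the raggedness sits in the first (length) dimension of each example while the feature dimension of size 3 stays uniform.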
More details here: https://stackoverflow.com/questions/789 ... n-but-cons