Change log


This change log covers all releases, listing the new features and breaking changes in each. Best viewed here.

New features:
- Adds a reseed_each_epoch option to MapDataset.repeat. Setting it to False replays the first epoch exactly (the default is True).
- Introduces grain.experimental.RebatchIterDataset for efficient rebatching.
- Migrates data loader to use dataset API under the hood.
- Improves first-fit packing speed by up to 12x.
- Adds best-fit packing implementation which reduces padding in benchmarks by over 27% compared to first-fit.
- Adds max_sequences_per_bin to packing transformations to limit the number of sequences packed into a single bin.
- Introduces grain.experimental.RepeatIterDataset.
- Adds custom batching function support to grain.DataLoader.
- Adds grain.experimental.FlatMapTransform support to grain.DataLoader.
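To illustrate the difference between the first-fit and best-fit packing strategies mentioned above, here is a minimal, library-independent sketch. It is not Grain's implementation: the `pack` and `padding` helpers and the `strategy` argument are hypothetical names chosen for this example, and only `max_sequences_per_bin` mirrors an option named in the notes. First-fit places each sequence in the first bin with enough room, while best-fit chooses the bin that would be left with the least free space.

```python
from typing import List, Optional


def pack(
    lengths: List[int],
    capacity: int,
    strategy: str = "first_fit",
    max_sequences_per_bin: Optional[int] = None,
) -> List[List[int]]:
    """Greedily packs sequence lengths into bins of `capacity` tokens.

    Hypothetical helper for illustration only; not part of the Grain API.
    """
    bins: List[List[int]] = []  # sequence lengths placed in each bin
    free: List[int] = []        # remaining capacity of each bin

    for length in lengths:
        if length > capacity:
            raise ValueError(f"sequence of length {length} exceeds capacity {capacity}")

        # Bins that still have room and have not hit the per-bin sequence limit.
        candidates = [
            i
            for i in range(len(bins))
            if free[i] >= length
            and (max_sequences_per_bin is None or len(bins[i]) < max_sequences_per_bin)
        ]

        if not candidates:
            # No existing bin can take the sequence, so open a new one.
            bins.append([length])
            free.append(capacity - length)
            continue

        if strategy == "first_fit":
            target = candidates[0]  # earliest bin with enough room
        elif strategy == "best_fit":
            # Bin that would have the least free space left afterwards.
            target = min(candidates, key=lambda i: free[i] - length)
        else:
            raise ValueError(f"unknown strategy: {strategy}")

        bins[target].append(length)
        free[target] -= length

    return bins


def padding(bins: List[List[int]], capacity: int) -> int:
    """Total number of unused (padded) slots across all bins."""
    return sum(capacity - sum(b) for b in bins)


lengths = [4, 7, 3, 6]
first_fit_bins = pack(lengths, capacity=10, strategy="first_fit")
best_fit_bins = pack(lengths, capacity=10, strategy="best_fit")
```

On this toy input, first-fit puts the length-3 sequence next to the length-4 one and then needs a third bin for the length-6 sequence, whereas best-fit fills the bin holding the length-7 sequence and finishes with two full bins. Actual savings depend on the length distribution; the benchmark figure quoted above is not reproduced by this sketch.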
Breaking changes:
- SliceMapDataset now uses the full index relative to the parent dataset instead of index % len(self).
Deprecations:
- Graduates grain.experimental.apply_transformations to grain.{MapDataset|IterDataset}.apply. The experimental API will soon be deprecated.
Bug fixes:
- Fixes memory leak on ThreadPrefetchDatasetIterator deletion.

New features:
- Automatic publishing of releases to PyPI via GitHub Actions.
- Nightly builds.
- Introduced changelog.