Time Series Dataset

Torch Dataset for Time Series

source

TimeSeriesLoader

 TimeSeriesLoader (dataset, **kwargs)

TimeSeriesLoader DataLoader.

A small modification of PyTorch's DataLoader. Combines a dataset and a sampler, and provides an iterable over the given dataset.

The class torch.utils.data.DataLoader supports both map-style and iterable-style datasets with single- or multi-process loading, customizable loading order, and optional automatic batching (collation) and memory pinning.

Parameters:
batch_size (int, optional): how many samples per batch to load (default: 1).
shuffle (bool, optional): set to True to have the data reshuffled at every epoch (default: False).
sampler (Sampler or Iterable, optional): defines the strategy to draw samples from the dataset.
Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.
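
A minimal usage sketch, assuming an already-built TimeSeriesDataset named dataset and that the class is importable from neuralforecast.tsdataset (the module path is an assumption); the keyword arguments are forwarded to PyTorch's DataLoader:

    from neuralforecast.tsdataset import TimeSeriesLoader  # assumed module path

    # dataset: a previously constructed TimeSeriesDataset (see below)
    loader = TimeSeriesLoader(dataset, batch_size=32, shuffle=True)

    for batch in loader:
        # each batch collates samples drawn from the dataset
        ...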


source

TimeSeriesDataset

 TimeSeriesDataset (temporal, temporal_cols, indptr, max_size,
                    static=None, static_cols=None)

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many Sampler implementations and the default options of DataLoader.

Note: DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
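
To illustrate the map-style contract described above, a minimal subclass with integer keys might look like the following; this is a generic, hypothetical sketch (class and attribute names are invented), not the TimeSeriesDataset implementation:

    import torch
    from torch.utils.data import Dataset

    class WindowsDataset(Dataset):
        # Hypothetical map-style dataset: integer keys -> fixed-size windows.
        def __init__(self, series, window_size):
            self.series = series            # 1-D tensor holding one long series
            self.window_size = window_size

        def __len__(self):
            # number of windows that fit in the series
            return len(self.series) - self.window_size + 1

        def __getitem__(self, idx):
            # fetch the data sample for a given integer key
            return self.series[idx: idx + self.window_size]

    ds = WindowsDataset(torch.arange(10.0), window_size=4)
    len(ds), ds[0]  # 7, tensor([0., 1., 2., 3.])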


source

TimeSeriesDataModule

 TimeSeriesDataModule (dataset:TimeSeriesDataset, batch_size=32,
                       num_workers=0, shuffle=False, drop_last=False)

A DataModule standardizes the training, validation, and test splits, data preparation, and transforms. The main advantage is consistent data splits, data preparation, and transforms across models.

Example::

from pytorch_lightning import LightningDataModule
from torch.utils.data import DataLoader, Dataset

class MyDataModule(LightningDataModule):
    def __init__(self):
        super().__init__()

    def prepare_data(self):
        # download, split, etc...
        # only called on 1 GPU/TPU in distributed
        ...

    def setup(self, stage):
        # make assignments here (val/train/test split)
        # called on every process in DDP
        ...

    def train_dataloader(self):
        train_split = Dataset(...)
        return DataLoader(train_split)

    def val_dataloader(self):
        val_split = Dataset(...)
        return DataLoader(val_split)

    def test_dataloader(self):
        test_split = Dataset(...)
        return DataLoader(test_split)

    def teardown(self, stage):
        # clean up after fit or test
        # called on every process in DDP
        ...
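
Putting it together, a hedged usage sketch for TimeSeriesDataModule (assuming dataset is a TimeSeriesDataset, that the class is importable from neuralforecast.tsdataset, and that it exposes the standard train_dataloader hook from the LightningDataModule contract above):

    from neuralforecast.tsdataset import TimeSeriesDataModule  # assumed module path

    # dataset: a previously constructed TimeSeriesDataset
    datamodule = TimeSeriesDataModule(
        dataset,
        batch_size=64,
        num_workers=2,
        shuffle=True,
        drop_last=False,
    )

    # train_dataloader() is assumed from the LightningDataModule contract; the returned
    # loader can be iterated directly or the datamodule passed to Trainer.fit(model, datamodule=datamodule)
    train_loader = datamodule.train_dataloader()
    for batch in train_loader:
        ...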