Time Series Dataset
TimeSeriesLoader
TimeSeriesLoader (dataset, **kwargs)
TimeSeriesLoader DataLoader.
A small change to PyTorch's DataLoader: it combines a dataset and a sampler, and provides an iterable over the given dataset. The class torch.utils.data.DataLoader supports both map-style and iterable-style datasets, with single- or multi-process loading, customizable loading order, and optional automatic batching (collation) and memory pinning.
Parameters:
- batch_size (int, optional): how many samples per batch to load (default: 1).
- shuffle (bool, optional): set to True to have the data reshuffled at every epoch (default: False).
- sampler (Sampler or Iterable, optional): defines the strategy to draw samples from the dataset. Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.
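Below is a minimal usage sketch. The tensor layout assumed here (all series stacked row-wise in temporal, with indptr holding CSR-style offsets delimiting each series) is inferred from the TimeSeriesDataset signature documented in the next section, not a stated contract, and the import path is an assumption.

```python
# A minimal sketch; the series layout and import path are assumptions.
import numpy as np
import pandas as pd
import torch
from neuralforecast.tsdataset import TimeSeriesDataset, TimeSeriesLoader

# Two univariate series of lengths 3 and 5, stacked into one tensor.
temporal = torch.arange(8, dtype=torch.float32).reshape(-1, 1)
indptr = np.array([0, 3, 8])  # series i is temporal[indptr[i]:indptr[i+1]]

dataset = TimeSeriesDataset(
    temporal=temporal,
    temporal_cols=pd.Index(["y"]),
    indptr=indptr,
    max_size=5,  # length of the longest series
)

# TimeSeriesLoader(dataset, **kwargs) forwards batch_size, shuffle,
# sampler, etc. to torch.utils.data.DataLoader.
loader = TimeSeriesLoader(dataset, batch_size=2, shuffle=False)
batch = next(iter(loader))
```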
TimeSeriesDataset
TimeSeriesDataset (temporal, temporal_cols, indptr, max_size, static=None, static_cols=None)
An abstract class representing a Dataset.
All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many Sampler implementations and the default options of torch.utils.data.DataLoader.
Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
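To illustrate the note above with plain PyTorch (not this module's internals), here is a map-style dataset keyed by strings together with the custom sampler such non-integral keys require; all class and variable names are hypothetical:

```python
import torch
from torch.utils.data import DataLoader, Dataset, Sampler

class NamedSeriesDataset(Dataset):
    """Map-style dataset whose keys are series names, not integers."""
    def __init__(self, series: dict):
        self.series = series      # name -> 1D tensor
        self.keys = list(series)

    def __getitem__(self, key):
        return self.series[key]   # fetch one sample for a given key

    def __len__(self):
        return len(self.series)   # used by samplers and DataLoader defaults

class KeySampler(Sampler):
    """Custom sampler yielding non-integral keys, as the note requires."""
    def __init__(self, keys):
        self.keys = keys

    def __iter__(self):
        return iter(self.keys)

    def __len__(self):
        return len(self.keys)

data = {"s1": torch.randn(4), "s2": torch.randn(4)}
ds = NamedSeriesDataset(data)
loader = DataLoader(ds, batch_size=2, sampler=KeySampler(ds.keys))
for batch in loader:
    print(batch.shape)  # torch.Size([2, 4])
```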
TimeSeriesDataModule
TimeSeriesDataModule (dataset:TimeSeriesDataset, batch_size=32, num_workers=0, shuffle=False, drop_last=False)
A DataModule standardizes the training, validation, and test splits, data preparation, and transforms. The main advantage is consistent data splits, data preparation, and transforms across models.
Example:

```python
from pytorch_lightning import LightningDataModule
from torch.utils.data import DataLoader, Dataset

class MyDataModule(LightningDataModule):
    def __init__(self):
        super().__init__()

    def prepare_data(self):
        # download, split, etc...
        # only called on 1 GPU/TPU in distributed
        ...

    def setup(self, stage):
        # make assignments here (val/train/test split)
        # called on every process in DDP
        ...

    def train_dataloader(self):
        train_split = Dataset(...)
        return DataLoader(train_split)

    def val_dataloader(self):
        val_split = Dataset(...)
        return DataLoader(val_split)

    def test_dataloader(self):
        test_split = Dataset(...)
        return DataLoader(test_split)

    def teardown(self, stage):
        # clean up after fit or test
        # called on every process in DDP
        ...
```
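For this module specifically, a minimal sketch of wiring a TimeSeriesDataset into the data module; it assumes dataset was built as in the TimeSeriesLoader example above, that the import path matches, and that the standard LightningDataModule dataloader hooks are implemented:

```python
# A minimal sketch; the import path and the train_dataloader() hook are
# assumptions based on the LightningDataModule contract described above.
from neuralforecast.tsdataset import TimeSeriesDataModule

datamodule = TimeSeriesDataModule(
    dataset,         # a TimeSeriesDataset, as built above
    batch_size=32,
    num_workers=0,
    shuffle=False,   # presumably passed through to the training loader
    drop_last=False,
)
train_loader = datamodule.train_dataloader()
```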