Transformer

Defined in fynance.models.transformer

class Transformer(X, y, d_model=32, num_heads=4, num_layers=2, dim_ff=64, drop=0., x_type=None, y_type=None)

Bases: BaseNeuralNet

Causal Transformer encoder for sequential financial data.

Projects the N input features to d_model, adds sinusoidal positional encoding, applies num_layers causal self-attention blocks, and reads out to M outputs. A lower-triangular mask makes every block strictly causal (no lookahead): the output at t depends only on inputs up to t.
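The causality property can be illustrated with a minimal sketch. This is not the fynance implementation: it is single-head attention in plain PyTorch with no learned projections, and the helper name causal_self_attention is hypothetical. It applies a lower-triangular mask and then checks that perturbing later time steps leaves the earlier outputs unchanged.

```python
import math

import torch


def causal_self_attention(x):
    """Single-head self-attention with a lower-triangular (causal) mask.

    x has shape (L, d). Queries, keys and values are all x itself,
    a simplification made for illustration only.
    """
    L, d = x.shape
    scores = x @ x.T / math.sqrt(d)                          # (L, L) similarity scores
    allowed = torch.tril(torch.ones(L, L, dtype=torch.bool)) # True where j <= t
    scores = scores.masked_fill(~allowed, float("-inf"))     # hide future positions
    weights = torch.softmax(scores, dim=-1)                  # each row sums to 1
    return weights @ x


torch.manual_seed(0)
x = torch.randn(6, 4)
out = causal_self_attention(x)

# Perturb only the last two time steps and recompute.
x_future = x.clone()
x_future[4:] += 10.0
out_future = causal_self_attention(x_future)

# Outputs at steps 0..3 must not depend on steps 4..5.
unchanged = torch.allclose(out[:4], out_future[:4])
```

Here unchanged is True because each masked row assigns zero weight to every later position.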

Configure the optimizer with BaseNeuralNet.set_optimizer (e.g. with fynance.models.loss.SharpeLoss).

Parameters:
X, y : array-like or int
  • If array-like, respectively the input and output data.

  • If an integer, respectively the input and output dimension.

d_model : int, optional

Embedding / model dimension; must be divisible by num_heads. Default 32.

num_heads : int, optional

Number of attention heads. Default 4.

num_layers : int, optional

Number of stacked encoder blocks. Default 2.

dim_ff : int, optional

Hidden size of the position-wise feed-forward sublayer. Default 64.

drop : float, optional

Dropout probability. Default 0.
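The divisibility requirement on d_model follows from splitting the model dimension evenly across heads. The sketch below uses a hypothetical helper, head_dim, to show the arithmetic. Whether Transformer itself raises ValueError for an invalid pair is an assumption, not something stated above.

```python
def head_dim(d_model: int, num_heads: int) -> int:
    """Return the per-head width, or raise if d_model cannot be split evenly."""
    if d_model % num_heads != 0:
        raise ValueError(
            f"d_model={d_model} is not divisible by num_heads={num_heads}"
        )
    return d_model // num_heads


default_width = head_dim(32, 4)   # class defaults
example_width = head_dim(16, 2)   # values used in the Examples section
```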

See also

fynance.models.attention.MultiHeadAttention, fynance.models.tcn.TemporalConvNet

Examples

>>> import torch
>>> from fynance.models.transformer import Transformer
>>> _ = torch.manual_seed(0)
>>> X = torch.randn(40, 3)
>>> y = torch.randn(40, 1)
>>> model = Transformer(X, y, d_model=16, num_heads=2, num_layers=2)
>>> model(X).shape
torch.Size([40, 1])

forward(x)

Forward pass.

Parameters:
x : torch.Tensor

Input window, shape (L, N).

Returns:
torch.Tensor

Per-step output, shape (L, M).

load_model(path, load_optimizer=False)

Load the model with its weights and parameters.

Parameters:
path : str or os.PathLike object

Path from which to load the model.

load_optimizer : bool, optional

If True, also load the optimizer. Default is False.

predict(X)

Predicts outputs of neural network model.

Runs self.forward(X) under torch.no_grad, so no autograd graph is built. The returned tensor is detached and lives on the same device as the model parameters.

Parameters:
X : torch.Tensor

Inputs to compute prediction. Same shape and dtype contract as train_on.

Returns:
torch.Tensor

Predicted outputs (detached, gradient-free).
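The effect of torch.no_grad can be seen in plain PyTorch. The sketch below uses nn.Linear as a stand-in module rather than a fynance model. It compares a forward pass inside the context manager with an ordinary one.

```python
import torch
from torch import nn

torch.manual_seed(0)
net = nn.Linear(3, 1)        # stand-in for any torch module
X = torch.randn(40, 3)

with torch.no_grad():        # no autograd graph is recorded inside this block
    pred = net(X)

with_grad = net(X)           # same forward pass, but tracked by autograd
```

Both calls return the same values. Only with_grad carries a grad_fn and can be backpropagated through.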

save_model(path, save_optimizer=False)

Save the model with its weights and parameters.

Parameters:
path : str or os.PathLike object

Path to which the model is saved.

save_optimizer : bool, optional

If True, also save the optimizer. Default is False.
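A save / load round trip can be sketched with the standard PyTorch state-dict calls. That save_model and load_model work this way internally is an assumption. The stand-in modules and the file name model.pt are hypothetical.

```python
import os
import tempfile

import torch
from torch import nn

torch.manual_seed(0)
source = nn.Linear(3, 1)     # stand-in module holding the weights to save
target = nn.Linear(3, 1)     # freshly initialised, so its weights differ

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")      # hypothetical file name
    torch.save(source.state_dict(), path)     # write the weights
    state = torch.load(path)                  # read them back

target.load_state_dict(state)                 # copy the weights into `target`
same_weights = torch.equal(source.weight, target.weight)
```

An optimizer exposes the same state_dict / load_state_dict pair, so its state could be stored alongside the weights in the same way.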

set_data(X, y, x_type=None, y_type=None)

Set data inputs and outputs.

Coerces X and y to torch.Tensor and caches them as self.X / self.y. After the call the attributes self.T (number of observations), self.N (input columns) and self.M (output columns) are set.

Parameters:
X, y : array-like

Respectively input and output data. Accepted types: numpy.ndarray, torch.Tensor, polars.DataFrame. Shapes must be (T, N) and (T, M) respectively.

x_type, y_type : torch.dtype, optional

Target dtypes for the resulting tensors. Default is None, which preserves the input dtype.

Returns:
BaseNeuralNet

self, to allow chaining.

Raises:
ValueError

If self.N / self.M were already set and X / y do not match, or if X and y have different lengths.
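The coercion and shape bookkeeping described above can be sketched as follows. The helper coerce_pair is hypothetical and handles only NumPy arrays and tensors. The real method also accepts polars.DataFrame and checks against previously set dimensions, neither of which is reproduced here.

```python
import numpy as np
import torch


def coerce_pair(X, y, x_type=None, y_type=None):
    """Convert X and y to tensors and return them with (T, N, M)."""
    X = torch.as_tensor(X, dtype=x_type)   # dtype=None keeps the input dtype
    y = torch.as_tensor(y, dtype=y_type)
    if X.ndim != 2 or y.ndim != 2:
        raise ValueError("X and y must be 2-D, shaped (T, N) and (T, M)")
    if X.shape[0] != y.shape[0]:
        raise ValueError(f"length mismatch: {X.shape[0]} vs {y.shape[0]}")
    T, N = X.shape
    M = y.shape[1]
    return X, y, (T, N, M)


X, y, dims = coerce_pair(np.zeros((40, 3)), np.zeros((40, 1)))
X32, _, _ = coerce_pair(np.zeros((40, 3)), np.zeros((40, 1)), x_type=torch.float32)
```

With the default x_type=None the float64 dtype of the NumPy input is preserved. Passing torch.float32 casts it.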

set_lr_scheduler(lr_scheduler, **kwargs)

Set a learning-rate scheduler.

Parameters:
lr_scheduler : torch.optim.lr_scheduler._LRScheduler

Scheduler class from torch.optim.lr_scheduler used to wrap self.optimizer; cf. the torch.optim.lr_scheduler module in the PyTorch documentation [2].

**kwargs

Keyword arguments to pass to the learning rate scheduler.
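The wrapping step can be sketched in plain PyTorch. The helper wrap_scheduler is hypothetical, and the assumption is that the scheduler class is called as lr_scheduler(optimizer, **kwargs). StepLR and its step_size and gamma arguments are standard PyTorch.

```python
import torch
from torch import nn

net = nn.Linear(3, 1)                                   # stand-in module
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)


def wrap_scheduler(optimizer, lr_scheduler, **kwargs):
    """Instantiate a scheduler class around an existing optimizer."""
    return lr_scheduler(optimizer, **kwargs)


scheduler = wrap_scheduler(
    optimizer, torch.optim.lr_scheduler.StepLR, step_size=2, gamma=0.5
)

lrs = []
for _ in range(4):
    optimizer.step()        # no gradients here, so parameters do not move
    scheduler.step()        # advance the schedule by one step
    lrs.append(optimizer.param_groups[0]["lr"])
```

With step_size=2 and gamma=0.5 the learning rate is halved after every second scheduler step.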


set_optimizer(criterion, optimizer, params=None, **kwargs)

Set the optimizer object.

The specified criterion is used as the loss function, and any kwargs are passed to the optimizer as optional parameters.

Parameters:
criterion : Callable or torch.nn.modules.loss

A loss function.

optimizer : torch.optim.Optimizer

An optimizer algorithm.

params : object or iterable object

Layer of parameters to optimize or dicts defining parameter groups. If set to None then all parameters of model will be optimized. Default is None.

**kwargs

Keyword arguments of the optimizer, cf. the PyTorch documentation [1].

Returns:
BaseNeuralNet

self, to allow chaining.
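In plain PyTorch terms, the parameters above map onto an optimizer constructor call. The helper build_optimizer below is a hypothetical sketch of that mapping, assuming optimizer is passed as a class and instantiated with params and kwargs. It is not the fynance source.

```python
import torch
from torch import nn

net = nn.Linear(3, 1)                     # stand-in module


def build_optimizer(model, criterion, optimizer, params=None, **kwargs):
    """Pair a loss function with an optimizer built over `params`."""
    if params is None:
        params = model.parameters()       # default: optimise every parameter
    return criterion, optimizer(params, **kwargs)


# All parameters, Adam with a custom learning rate.
criterion, opt = build_optimizer(net, nn.MSELoss(), torch.optim.Adam, lr=1e-3)

# Only the bias, SGD.
_, bias_only = build_optimizer(
    net, nn.MSELoss(), torch.optim.SGD, params=[net.bias], lr=0.1
)
```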


set_seed(seed_torch=None, seed_numpy=None)

Set seed for PyTorch and NumPy random number generator.

Parameters:
seed_torch, seed_numpy : bool or int, optional

If a seed is an int with \(0 < seed < 2^{32}\), it is used to seed PyTorch and NumPy, respectively. If it is True, a random seed is chosen; otherwise no seed is set.
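The effect of fixing both seeds can be sketched with the standard seeding calls, torch.manual_seed and numpy.random.seed. That set_seed uses exactly these calls is an assumption, and the helper draw is hypothetical.

```python
import numpy as np
import torch


def draw(seed):
    """Seed both generators, then draw a few numbers from each."""
    torch.manual_seed(seed)
    np.random.seed(seed)
    return torch.rand(3), np.random.rand(3)


t1, n1 = draw(42)
t2, n2 = draw(42)     # same seed: identical draws
t3, n3 = draw(7)      # different seed: different draws
```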

train_on(X, y)

Trains the neural network model on a single batch.

Runs one forward / backward / optimizer-step cycle on the batch (X, y). As a side effect, gradients of all parameters are zeroed before the forward pass and the optimizer state is advanced afterwards. If a learning-rate scheduler has been registered via set_lr_scheduler, its step is also called.

Parameters:
X, y : torch.Tensor

Respectively inputs and outputs to train model. Shapes must match what self.forward expects (see the class-level “Public API contract” section).

Returns:
torch.Tensor

The loss tensor produced by self.criterion(self(X), y), with gradient already consumed by loss.backward().

Raises:
AttributeError

If set_optimizer has not been called yet.
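The forward / backward / optimizer-step cycle described for train_on corresponds to the usual PyTorch training step. The sketch below uses a stand-in nn.Linear module and a hypothetical train_step function. It omits the optional scheduler step and is not the fynance source.

```python
import torch
from torch import nn

torch.manual_seed(0)
net = nn.Linear(3, 1)                                  # stand-in module
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)


def train_step(X, y):
    """One forward / backward / optimizer-step cycle on a single batch."""
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = criterion(net(X), y)      # forward pass and loss
    loss.backward()                  # backpropagate
    optimizer.step()                 # update parameters
    return loss


X, y = torch.randn(40, 3), torch.randn(40, 1)
losses = [train_step(X, y).item() for _ in range(20)]
```

Calling the step repeatedly on the same batch should drive the loss down, which is a quick sanity check that the optimizer is wired to the model parameters.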