Transformer

Defined in fynance.models.transformer

class Transformer(X, y, d_model=32, num_heads=4, num_layers=2, dim_ff=64, drop=0., x_type=None, y_type=None)

Bases: BaseNeuralNet

Causal Transformer encoder for sequential financial data.

Projects the N input features to d_model, adds sinusoidal positional encoding, applies num_layers causal self-attention blocks, and reads out to M outputs. A lower-triangular mask makes every block strictly causal (no lookahead): the output at t depends only on inputs up to t.
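The two ingredients named above can be illustrated with a minimal sketch in plain PyTorch. It is not the library's implementation: `sinusoidal_encoding` and `causal_self_attention` are hypothetical helpers, the encoding uses the standard sin/cos formulation (assumed here), and the attention step omits the learned query/key/value projections so that only the masking logic remains.

```python
import math

import torch


def sinusoidal_encoding(length: int, d_model: int) -> torch.Tensor:
    """Sin/cos positional encoding of shape (length, d_model); d_model must be even."""
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)   # (L, 1)
    freq = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )                                                              # (d_model / 2,)
    pe = torch.zeros(length, d_model)
    pe[:, 0::2] = torch.sin(pos * freq)  # even columns
    pe[:, 1::2] = torch.cos(pos * freq)  # odd columns
    return pe


def causal_self_attention(h: torch.Tensor) -> torch.Tensor:
    """Single-head self-attention on h of shape (L, d) with a lower-triangular mask."""
    length, d = h.shape
    scores = h @ h.T / math.sqrt(d)                        # (L, L) similarities
    # Row t may only attend to columns <= t; future positions are masked out.
    allowed = torch.tril(torch.ones(length, length)).bool()
    scores = scores.masked_fill(~allowed, float("-inf"))
    return torch.softmax(scores, dim=-1) @ h               # masked weights are 0


torch.manual_seed(0)
x = torch.randn(6, 4)
pe = sinusoidal_encoding(6, 4)
out = causal_self_attention(x + pe)

# Perturb only the last two time steps and recompute.
x_future = x.clone()
x_future[4:] += 10.0
out_future = causal_self_attention(x_future + pe)

past_unchanged = torch.allclose(out[:4], out_future[:4])
future_changed = not torch.allclose(out[4:], out_future[4:])
```

Changing the inputs at steps 4 and 5 leaves the outputs at steps 0 to 3 untouched, which is the no-lookahead property described above.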

Configure the optimizer with BaseNeuralNet.set_optimizer (e.g. with fynance.models.loss.SharpeLoss).

Parameters:
X, y : array-like or int
  • If array-like, the input and output data, respectively.

  • If an integer, the input and output dimensions, respectively.

d_model : int, optional

Embedding / model dimension; must be divisible by num_heads. Default 32.

num_heads : int, optional

Number of attention heads. Default 4.

num_layers : int, optional

Number of stacked encoder blocks. Default 2.

dim_ff : int, optional

Hidden size of the position-wise feed-forward sublayer. Default 64.

drop : float, optional

Dropout probability. Default 0.

See also

fynance.models.attention.MultiHeadAttention, fynance.models.tcn.TemporalConvNet

Examples

>>> import torch
>>> from fynance.models.transformer import Transformer
>>> _ = torch.manual_seed(0)
>>> X = torch.randn(40, 3)
>>> y = torch.randn(40, 1)
>>> model = Transformer(X, y, d_model=16, num_heads=2, num_layers=2)
>>> model(X).shape
torch.Size([40, 1])

fit(X, y, epochs=1, x_type=None, y_type=None)

Fit the model on (X, y) for epochs full-batch steps.

Convenience wrapper that makes the network conform to the SignalModel protocol: it coerces the data via set_data and runs train_on epochs times. An optimizer must have been registered with set_optimizer.

Parameters:
X, y : array-like

Input and output data (numpy / torch / polars), shapes (T, N) and (T, M).

epochs : int, optional

Number of full-batch training steps. Default 1.

x_type, y_type : torch.dtype, optional

Target dtypes forwarded to set_data.

Returns:
BaseNeuralNet

self, to allow chaining.

forward(x)

Forward pass.

Parameters:
x : torch.Tensor

Input window, shape (L, N).

Returns:
torch.Tensor

Per-step output, shape (L, M).

load_model(path, load_optimizer=False)

Load the model weights and parameters from a file.

Parameters:
path : str or os.PathLike object

Path of the file to load the model from.

load_optimizer : bool, optional

If True, also load the optimizer. Default False.

predict(X)

Predict the outputs of the neural network model.

Runs self.forward(X) under torch.no_grad with the module switched to evaluation mode, so no autograd graph is built and stochastic layers (dropout, batch-norm) behave deterministically. The previous training/eval mode is restored on exit. The returned tensor is detached and lives on the same device as the model parameters; the coerced input is moved to that device too. Array-like inputs (numpy / polars) are coerced to a tensor first, so the method also satisfies the SignalModel contract.

Parameters:
X : array-like

Inputs for which to compute predictions. Same shape and dtype contract as train_on.

Returns:
torch.Tensor

Predicted outputs (detached, gradient-free).
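The behaviour described for predict can be sketched with plain PyTorch as follows. This is an illustration under assumptions, not the library's code: the free function `predict` and the stand-in network `net` are hypothetical, and only list / numpy / tensor inputs are handled (polars is left out).

```python
import torch


def predict(model: torch.nn.Module, X) -> torch.Tensor:
    """Gradient-free forward pass that restores the previous train/eval mode."""
    was_training = model.training
    device = next(model.parameters()).device
    # Coerce array-like input to a tensor on the model's device.
    X = torch.as_tensor(X, dtype=torch.get_default_dtype()).to(device)
    model.eval()                      # dropout / batch-norm become deterministic
    try:
        with torch.no_grad():         # no autograd graph is built
            return model(X).detach()
    finally:
        model.train(was_training)     # restore the mode found on entry


# Stand-in network with a stochastic layer.
net = torch.nn.Sequential(
    torch.nn.Linear(3, 8), torch.nn.Dropout(0.5), torch.nn.Linear(8, 1)
)
net.train()
X = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
p1 = predict(net, X)
p2 = predict(net, X)
```

Because dropout is disabled in evaluation mode, the two calls return identical tensors, and `net` is back in training mode afterwards.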

save_model(path, save_optimizer=False)

Save the model with its weights and parameters.

Parameters:
path : str or os.PathLike object

Path of the file to save the model to.

save_optimizer : bool, optional

If True, also save the optimizer. Default False.
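A save / load round trip of this kind can be sketched with `torch.save` and `torch.load`. The on-disk format used by save_model and load_model is not documented here, so the checkpoint layout below (a dict with "model" and "optimizer" keys) and the helpers `save_checkpoint` / `load_checkpoint` are assumptions made for illustration only.

```python
import os
import tempfile

import torch


def save_checkpoint(path, model, optimizer=None):
    """Write the model weights, and optionally the optimizer state, to path."""
    checkpoint = {"model": model.state_dict()}
    if optimizer is not None:
        checkpoint["optimizer"] = optimizer.state_dict()
    torch.save(checkpoint, path)


def load_checkpoint(path, model, optimizer=None):
    """Restore the model weights, and the optimizer state if both are available."""
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint["model"])
    if optimizer is not None and "optimizer" in checkpoint:
        optimizer.load_state_dict(checkpoint["optimizer"])


source = torch.nn.Linear(3, 1)
source_opt = torch.optim.SGD(source.parameters(), lr=0.05)
target = torch.nn.Linear(3, 1)  # different random initial weights
target_opt = torch.optim.SGD(target.parameters(), lr=0.5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.pt")
    save_checkpoint(path, source, source_opt)
    load_checkpoint(path, target, target_opt)

same_weights = torch.equal(source.weight, target.weight)
restored_lr = target_opt.param_groups[0]["lr"]
```

After loading, `target` carries the weights of `source`, and the optimizer's hyperparameters (here the learning rate) come from the checkpoint rather than from the constructor call.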

set_data(X, y, x_type=None, y_type=None)

Set data inputs and outputs.

Coerces X and y to torch.Tensor and caches them as self.X / self.y. After the call the attributes self.T (number of observations), self.N (input columns) and self.M (output columns) are set.

Parameters:
X, y : array-like

Input and output data, respectively. Accepted types: numpy.ndarray, torch.Tensor, polars.DataFrame. Shapes must be (T, N) and (T, M) respectively.

x_type, y_type : torch.dtype, optional

Target dtypes for the resulting tensors. Default is None, which casts floating-point inputs to torch.get_default_dtype() (float32 by default) and leaves integer inputs unchanged. See _set_data.

Returns:
BaseNeuralNet

self, to allow chaining.

Raises:
ValueError

If self.N / self.M were already set and X / y do not match, or if X and y have different lengths.
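The dtype rule and the length check described for set_data can be sketched as below. The helpers `coerce` and `check_shapes` are hypothetical stand-ins, not the library's `_set_data`; polars input is left out to keep the example dependent on NumPy and PyTorch only.

```python
import numpy as np
import torch


def coerce(a, dtype=None) -> torch.Tensor:
    """Convert array-like data to a tensor.

    Floating-point data is cast to the default dtype unless dtype is given;
    integer data is left unchanged.
    """
    t = torch.as_tensor(a)
    if dtype is not None:
        return t.to(dtype)
    if t.is_floating_point():
        return t.to(torch.get_default_dtype())
    return t


def check_shapes(X: torch.Tensor, y: torch.Tensor):
    """Return (T, N, M) or raise ValueError if X and y differ in length."""
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"X and y have different lengths: {X.shape[0]} != {y.shape[0]}"
        )
    return X.shape[0], X.shape[1], y.shape[1]


X = coerce(np.zeros((5, 3), dtype=np.float64))        # float64 -> default dtype
y = coerce(np.arange(5, dtype=np.int64).reshape(5, 1))  # integers stay integers
y_float = coerce(y, dtype=torch.float64)              # explicit dtype wins
T, N, M = check_shapes(X, y)
```

With PyTorch's default settings, `X` ends up as float32 while `y` keeps its int64 dtype, and `(T, N, M)` is `(5, 3, 1)`.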

set_lr_scheduler(lr_scheduler, **kwargs)

Set a dynamic learning rate schedule.

Parameters:
lr_scheduler : torch.optim.lr_scheduler._LRScheduler

Scheduler from torch.optim.lr_scheduler that wraps self.optimizer; see the torch.optim.lr_scheduler module in the PyTorch documentation [2].

**kwargs

Keyword arguments to pass to the learning rate scheduler.

References

set_optimizer(criterion, optimizer, params=None, **kwargs)

Set the optimizer object.

Registers the optimizer together with the criterion used as the loss function; any extra keyword arguments are passed to the optimizer.

Parameters:
criterion : Callable, torch.nn.modules.loss

A loss function.

optimizer : torch.optim.Optimizer

An optimizer algorithm.

params : object or iterable object, optional

Parameters to optimize or dicts defining parameter groups. If None, all parameters of the model are optimized. Default is None.

**kwargs

Keyword arguments of the optimizer; see the PyTorch documentation [1].

Returns:
BaseNeuralNet

self, to allow chaining.
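How params and **kwargs reach the optimizer can be sketched as follows. The sketch assumes that the optimizer is passed as a class (for example `torch.optim.SGD`) and instantiated internally, which the signature suggests but does not state; `build_optimizer` is a hypothetical helper, not part of the library.

```python
import torch


def build_optimizer(model, optimizer_cls, params=None, **kwargs):
    """Instantiate optimizer_cls on params, or on all model parameters if None."""
    if params is None:
        params = model.parameters()
    return optimizer_cls(params, **kwargs)


net = torch.nn.Linear(3, 1)

# Default: every parameter of the model in a single group.
opt_all = build_optimizer(net, torch.optim.SGD, lr=0.1, momentum=0.9)

# Parameter groups: per-group options override the keyword defaults.
groups = [
    {"params": [net.weight], "lr": 0.01},
    {"params": [net.bias]},
]
opt_groups = build_optimizer(net, torch.optim.SGD, params=groups, lr=0.1)

group_lrs = [g["lr"] for g in opt_groups.param_groups]
```

In the second call the weight group keeps its own learning rate of 0.01 while the bias group falls back to the keyword default of 0.1.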

References

set_seed(seed_torch=None, seed_numpy=None)

Set the seeds of the PyTorch and NumPy random number generators.

Each generator is only (re)seeded when its argument is provided: passing seed_torch alone leaves the global NumPy RNG untouched, and vice versa.

Parameters:
seed_torch, seed_numpy : bool or int, optional

If an int \(0 \leq seed < 2^{32}\), seed the PyTorch and NumPy generators, respectively, with that number. If True, draw a random seed. If None (default), leave that generator untouched.
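The three cases (int, True, None) can be sketched with the global seeding functions of both libraries. `resolve_seed` and `set_seeds` are hypothetical helpers written for this illustration; how the library draws its random seed when True is passed is not documented here, so `random.randrange` is an assumption.

```python
import random

import numpy as np
import torch


def resolve_seed(seed):
    """None -> None (leave untouched); True -> random seed; int -> that seed."""
    if seed is None:
        return None
    if seed is True:
        return random.randrange(2**32)  # assumed way of drawing a seed
    return int(seed)


def set_seeds(seed_torch=None, seed_numpy=None):
    """Seed each generator only when its argument is provided."""
    seed_torch = resolve_seed(seed_torch)
    seed_numpy = resolve_seed(seed_numpy)
    if seed_torch is not None:
        torch.manual_seed(seed_torch)
    if seed_numpy is not None:
        np.random.seed(seed_numpy)
    return seed_torch, seed_numpy


# Seeding PyTorch alone must not disturb the NumPy stream.
np.random.seed(123)
reference = np.random.rand()

np.random.seed(123)
seeds = set_seeds(seed_torch=42)
after_torch_only = np.random.rand()

# Re-seeding PyTorch with the same value reproduces the same draw.
a = torch.rand(3)
set_seeds(seed_torch=42)
b = torch.rand(3)
```

`after_torch_only` equals `reference` because the NumPy generator was never reseeded, while `a` and `b` match because both follow a PyTorch seed of 42.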

Examples

>>> from fynance.models.mlp import MultiLayerPerceptron
>>> model = MultiLayerPerceptron(3, 1, layers=[4])
>>> model.set_seed(seed_torch=42)
>>> model.seed_torch
42
>>> model.seed_numpy is None
True

train_on(X, y)

Train the neural network model on a single batch.

Runs one forward / backward / optimizer-step cycle on the batch (X, y). The module is switched to training mode (so dropout and batch-norm behave as expected) before the forward pass. As a side effect, gradients of all parameters are zeroed before the forward pass and the optimizer state is advanced afterwards. If a learning-rate scheduler has been registered via set_lr_scheduler, its step is also called.

Parameters:
X, y : torch.Tensor

Inputs and outputs used to train the model, respectively. Shapes must match what self.forward expects (see the class-level “Public API contract” section).

Returns:
torch.Tensor

The loss tensor produced by self.criterion(self(X), y), with gradient already consumed by loss.backward().

Raises:
AttributeError

If set_optimizer has not been called yet.
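The cycle described for train_on can be sketched in plain PyTorch as below. `train_step` is a hypothetical free function and `net` a stand-in linear model, so this shows the order of operations rather than the library's code. Calling it repeatedly on the same batch corresponds to the full-batch loop that fit is described as running.

```python
import torch


def train_step(model, criterion, optimizer, X, y, lr_scheduler=None):
    """One forward / backward / optimizer-step cycle on the batch (X, y)."""
    model.train()                  # training mode for dropout / batch-norm
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()
    if lr_scheduler is not None:
        lr_scheduler.step()        # advance the learning-rate schedule
    return loss


torch.manual_seed(0)
X = torch.randn(40, 3)
y = X @ torch.tensor([[1.0], [-2.0], [0.5]])  # noiseless linear target

net = torch.nn.Linear(3, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

# Twenty full-batch steps on the same data.
losses = [
    train_step(net, criterion, optimizer, X, y, scheduler).item()
    for _ in range(20)
]
final_lr = optimizer.param_groups[0]["lr"]
```

The loss falls over the twenty steps, and the scheduler halves the learning rate every ten steps, from 0.1 to 0.025.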