Transformer

Defined in fynance.models.transformer

class Transformer(X, y, d_model=32, num_heads=4, num_layers=2, dim_ff=64, drop=0., x_type=None, y_type=None)

Bases: BaseNeuralNet

Causal Transformer encoder for sequential financial data.

Projects the N input features to d_model, adds sinusoidal positional encoding, applies num_layers causal self-attention blocks, and reads out to M outputs. A lower-triangular mask makes every block strictly causal (no lookahead): the output at t depends only on inputs up to t.
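The two ingredients named above, sinusoidal positional encoding and a lower-triangular attention mask, can be sketched in plain Python. This is an illustrative sketch only: it assumes the standard sine/cosine encoding with base 10000, and the helper names (sinusoidal_encoding, causal_mask, masked_softmax) are hypothetical and not part of fynance.

```python
import math


def sinusoidal_encoding(length, d_model):
    """Return a length x d_model table of sinusoidal position codes.

    Assumes the standard formulation (sine on even columns, cosine on odd
    columns, base 10000); the exact variant used by fynance may differ.
    """
    table = [[0.0] * d_model for _ in range(length)]
    for pos in range(length):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


def causal_mask(length):
    """Lower-triangular mask: entry [t][s] is True when s <= t."""
    return [[s <= t for s in range(length)] for t in range(length)]


def masked_softmax(scores, mask):
    """Row-wise softmax in which masked-out positions get zero weight."""
    weights = []
    for row, allowed in zip(scores, mask):
        exps = [math.exp(v) if ok else 0.0 for v, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


pe = sinusoidal_encoding(length=4, d_model=6)
mask = causal_mask(4)
# Toy attention scores; row t holds the scores of step t against every step s.
scores = [[0.1 * (t + s) for s in range(4)] for t in range(4)]
weights = masked_softmax(scores, mask)
```

Row t of weights is zero for every column s > t, which is the "no lookahead" property: step t can only attend to steps up to and including t.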

Configure the optimizer with BaseNeuralNet.set_optimizer (e.g. with fynance.models.loss.SharpeLoss).

Parameters:

X, y : array-like or int

If array-like, the input and output data, respectively.
If an integer, the input and output dimensions, respectively.

d_model : int, optional

Embedding (model) dimension; must be divisible by num_heads. Default 32.

num_heads : int, optional

Number of attention heads. Default 4.

num_layers : int, optional

Number of stacked encoder blocks. Default 2.

dim_ff : int, optional

Hidden size of the position-wise feed-forward sublayer. Default 64.

drop : float, optional

Dropout probability. Default 0.
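The divisibility requirement on d_model follows from multi-head attention splitting the model dimension evenly across heads. A minimal sketch of that arithmetic is shown here; the helper head_dim is hypothetical, and whether Transformer itself raises ValueError on a bad combination is an assumption.

```python
def head_dim(d_model, num_heads):
    """Per-head dimension when d_model is split evenly across heads.

    Hypothetical helper: the error type raised by fynance is an assumption.
    """
    if d_model % num_heads != 0:
        raise ValueError(
            f"d_model={d_model} is not divisible by num_heads={num_heads}"
        )
    return d_model // num_heads


default_dim = head_dim(32, 4)   # class defaults
example_dim = head_dim(16, 2)   # values used in the Examples section
```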

See also

fynance.models.attention.MultiHeadAttention, fynance.models.tcn.TemporalConvNet

Examples

>>> import torch
>>> from fynance.models.transformer import Transformer
>>> _ = torch.manual_seed(0)
>>> X = torch.randn(40, 3)
>>> y = torch.randn(40, 1)
>>> model = Transformer(X, y, d_model=16, num_heads=2, num_layers=2)
>>> model(X).shape
torch.Size([40, 1])

fit(X, y, epochs=1, x_type=None, y_type=None)

Fit the model on (X, y) for epochs full-batch steps.

Convenience wrapper that makes the network conform to the SignalModel protocol: it coerces the data via set_data and runs train_on epochs times. An optimizer must have been registered with set_optimizer.

Parameters:

X, y (array-like): Input and output data (numpy / torch / polars), shapes (T, N) and (T, M).
epochs (int): Number of full-batch training steps.
x_type, y_type (torch.dtype, optional): Target dtypes forwarded to set_data.

Returns:

BaseNeuralNet: self, to allow chaining.
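The control flow described for fit (coerce once with set_data, then call train_on epochs times, then return self) can be sketched with a stand-in class. TinyModel is hypothetical and only counts training steps; it does not reproduce the fynance implementation.

```python
class TinyModel:
    """Hypothetical stand-in exposing set_data, train_on and fit."""

    def __init__(self):
        self.steps = 0

    def set_data(self, X, y, x_type=None, y_type=None):
        # The real method also coerces to tensors; here we only cache the data.
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        self.X, self.y = X, y
        return self

    def train_on(self, X, y):
        # Placeholder for one forward / backward / optimizer-step cycle.
        self.steps += 1

    def fit(self, X, y, epochs=1, x_type=None, y_type=None):
        self.set_data(X, y, x_type=x_type, y_type=y_type)
        for _ in range(epochs):
            self.train_on(self.X, self.y)
        return self


model = TinyModel()
returned = model.fit([[1.0], [2.0]], [[0.5], [1.5]], epochs=3)
```

Because fit returns self, calls can be chained, for example fit followed directly by predict on the returned object.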

forward(x)

Forward pass.

Parameters:

x (torch.Tensor): Input window, shape (L, N).

Returns:

torch.Tensor: Per-step output, shape (L, M).

load_model(path, load_optimizer=False)

Load the model weights and parameters from a file.

Parameters:

path (str or os.PathLike): Path of the file to load the model from.
load_optimizer (bool, optional): If True, also load the optimizer. Default False.

predict(X)

Predict the outputs of the neural network model.

Runs self.forward(X) under torch.no_grad with the module switched to evaluation mode, so no autograd graph is built and stochastic layers (dropout, batch-norm) behave deterministically. The previous training/eval mode is restored on exit. The returned tensor is detached and lives on the same device as the model parameters; the coerced input is moved to that device too. Array-like inputs (numpy / polars) are coerced to a tensor first, so the method also satisfies the SignalModel contract.

Parameters:

X (array-like): Inputs from which to compute predictions. Same shape and dtype contract as train_on.

Returns:

torch.Tensor: Predicted outputs (detached, gradient-free).
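The behaviour described for predict (evaluation mode, no autograd graph, previous mode restored, input moved to the model's device) can be sketched with plain PyTorch. The function predict_sketch and the small network are hypothetical and stand in for the fynance implementation, which may differ in detail.

```python
import torch
from torch import nn


def predict_sketch(model, X):
    """Gradient-free forward pass that restores the previous train/eval mode."""
    was_training = model.training
    device = next(model.parameters()).device
    X = torch.as_tensor(X).to(device)  # coerce input and move it to the model's device
    model.eval()                       # dropout becomes a no-op
    try:
        with torch.no_grad():
            out = model(X)
    finally:
        model.train(was_training)      # restore whatever mode was active before
    return out.detach()


torch.manual_seed(0)
net = nn.Sequential(nn.Linear(3, 4), nn.Dropout(0.5), nn.Linear(4, 1))
net.train()                            # start in training mode on purpose
X = torch.randn(5, 3)
out_a = predict_sketch(net, X)
out_b = predict_sketch(net, X)
```

Both calls return the same tensor even though the network contains dropout, and net is still in training mode afterwards.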

save_model(path, save_optimizer=False)

Save the model with its weights and parameters.

Parameters:

path (str or os.PathLike): Path of the file to save the model to.
save_optimizer (bool, optional): If True, also save the optimizer. Default False.

set_data(X, y, x_type=None, y_type=None)

Set data inputs and outputs.

Coerces X and y to torch.Tensor and caches them as self.X / self.y. After the call the attributes self.T (number of observations), self.N (input columns) and self.M (output columns) are set.

Parameters:

X, y (array-like): Respectively input and output data. Accepted types: numpy.ndarray, torch.Tensor, polars.DataFrame. Shapes must be (T, N) and (T, M) respectively.
x_type, y_type (torch.dtype, optional): Target dtypes for the resulting tensors. Default is None, which casts floating-point inputs to torch.get_default_dtype() (float32 by default) and leaves integer inputs unchanged. See _set_data.

Returns:

BaseNeuralNet: self, to allow chaining.

Raises:

ValueError: If self.N / self.M were already set and X / y do not match, or if X and y have different lengths.
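The default dtype rule and the length check can be illustrated with plain PyTorch. This is a sketch under assumptions: coerce and check_shapes are hypothetical helpers, the sketch handles only inputs that torch.as_tensor accepts (not polars frames), and it omits the comparison against a previously set self.N / self.M.

```python
import torch


def coerce(data, dtype=None):
    """Convert to a tensor; cast floats to the default dtype, keep integers."""
    tensor = torch.as_tensor(data)
    if dtype is not None:
        return tensor.to(dtype)
    if tensor.is_floating_point():
        return tensor.to(torch.get_default_dtype())
    return tensor


def check_shapes(X, y):
    """Return (T, N, M) for 2-D inputs, raising if the lengths differ."""
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have the same number of observations")
    T, N = X.shape
    M = y.shape[1]
    return T, N, M


X = coerce(torch.zeros(40, 3, dtype=torch.float64))  # float64 -> default dtype
y = coerce(torch.ones(40, 1, dtype=torch.int64))     # integers are left as-is
T, N, M = check_shapes(X, y)
```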

set_lr_scheduler(lr_scheduler, **kwargs)

Set dynamic learning rate.

Parameters:

lr_scheduler (torch.optim.lr_scheduler._LRScheduler): Scheduler from torch.optim.lr_scheduler used to wrap self.optimizer; see the torch.optim.lr_scheduler module in the PyTorch documentation [2].
**kwargs: Keyword arguments to pass to the learning rate scheduler.

References

[2] https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

set_optimizer(criterion, optimizer, params=None, **kwargs)

Set the optimizer object.

Register the optimizer, using criterion as the loss function and forwarding any kwargs to the optimizer as optional parameters.

Parameters:

criterion (Callable, torch.nn.modules.loss): A loss function.
optimizer (torch.optim.Optimizer): An optimizer algorithm.
params (object or iterable object, optional): Layer of parameters to optimize, or dicts defining parameter groups. If None (default), all parameters of the model are optimized.
**kwargs: Keyword arguments passed to the optimizer; see the PyTorch documentation [1].

Returns:

BaseNeuralNet: self, to allow chaining.

References

[1] https://pytorch.org/docs/stable/optim.html

set_seed(seed_torch=None, seed_numpy=None)

Set the seeds of the PyTorch and NumPy random number generators.

Each generator is only (re)seeded when its argument is provided: passing seed_torch alone leaves the global NumPy RNG untouched, and vice versa.

Parameters:

seed_torch, seed_numpy (bool or int, optional): If an int \(0 \leq seed < 2^{32}\), seed respectively the PyTorch and NumPy generator with that number. If True, draw a random seed. If None (default), leave that generator untouched.

Examples

>>> from fynance.models.mlp import MultiLayerPerceptron
>>> model = MultiLayerPerceptron(3, 1, layers=[4])
>>> model.set_seed(seed_torch=42)
>>> model.seed_torch
42
>>> model.seed_numpy is None
True
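The three accepted argument forms (int, True, None) can be resolved as in the sketch below. resolve_seed is a hypothetical helper written with the standard library only; how fynance draws a random seed, and how it treats False or out-of-range values, are assumptions here.

```python
import random


def resolve_seed(seed):
    """Map a set_seed-style argument to a concrete seed, or None to skip."""
    if seed is None:
        return None                      # leave this generator untouched
    if isinstance(seed, bool):           # test bool first: True is also an int
        # Treating False like None is an assumption of this sketch.
        return random.randrange(2 ** 32) if seed else None
    if not 0 <= seed < 2 ** 32:
        raise ValueError("seed must satisfy 0 <= seed < 2**32")
    return seed


fixed = resolve_seed(42)
drawn = resolve_seed(True)
skipped = resolve_seed(None)
```

The bool check comes before the range check because Python treats True as the integer 1, which would otherwise be taken as a literal seed.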

train_on(X, y)

Train the neural network model on a single batch.

Runs one forward / backward / optimizer-step cycle on the batch (X, y). The module is switched to training mode (so dropout and batch-norm behave as expected) before the forward pass. As a side effect, gradients of all parameters are zeroed before the forward pass and the optimizer state is advanced afterwards. If a learning-rate scheduler has been registered via set_lr_scheduler, its step is also called.

Parameters:

X, y (torch.Tensor): Respectively inputs and outputs to train model. Shapes must match what self.forward expects (see the class-level “Public API contract” section).

Returns:

torch.Tensor: The loss tensor produced by self.criterion(self(X), y), with gradient already consumed by loss.backward().

Raises:

AttributeError: If set_optimizer has not been called yet.
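The single-batch cycle described for train_on can be sketched with plain PyTorch objects. The function train_step, the linear model, and the choice of SGD with a StepLR scheduler are illustrative assumptions; in fynance the criterion, optimizer and scheduler are those registered through set_optimizer and set_lr_scheduler.

```python
import torch
from torch import nn

torch.manual_seed(0)
net = nn.Linear(3, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
# Halve the learning rate after every step (illustrative schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)


def train_step(X, y):
    """One forward / backward / optimizer-step cycle on a single batch."""
    net.train()                    # training mode for dropout / batch-norm
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(net(X), y)    # forward pass and loss
    loss.backward()                # back-propagate
    optimizer.step()               # update the parameters
    scheduler.step()               # advance the learning-rate schedule
    return loss


X = torch.randn(8, 3)
y = torch.randn(8, 1)
first = train_step(X, y).item()
second = train_step(X, y).item()
```

Each call returns the loss computed before that call's parameter update, so the effect of a step only shows up in the loss returned by the next call.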