zensols.deeplearn.layer package¶
Submodules¶
zensols.deeplearn.layer.conv module¶
Convolution network creation utilities.
- class zensols.deeplearn.layer.conv.ConvolutionLayerFactory(width=1, height=1, depth=1, n_filters=1, kernel_filter=(2, 2), stride=1, padding=0)[source]¶
Bases:
object
Create convolution layers. Each attribute maps to a corresponding attribute of Im2DimCalculator, which is documented in parentheses in the parameter documentation below.
- Parameters:
width (int) – the width of the image/data (W)
height (int) – the height of the image/data (H)
depth (int) – the volume, which is usually the same as n_filters (D)
n_filters (int) – the number of filters, aka the filter depth/volume (K)
kernel_filter (Tuple[int, int]) – the kernel filter dimension in width X height (F)
stride (int) – the stride, which is the number of cells to skip for each convolution (S)
padding (int) – the number of zeroed cells on the ends of the image/data (P)
- __init__(width=1, height=1, depth=1, n_filters=1, kernel_filter=(2, 2), stride=1, padding=0)¶
- property calc: Im2DimCalculator¶
- class zensols.deeplearn.layer.conv.Flattenable[source]¶
Bases:
object
A class with flatten_dim and out_shape properties.
- class zensols.deeplearn.layer.conv.Im2DimCalculator(W, H, D=1, K=1, F=(2, 2), S=1, P=0)[source]¶
Bases:
Flattenable
Convolution matrix dimension calculation utility.
Implementation as Matrix Multiplication section.
- Example (im2col)::
W_in = H_in = 227
Ch_in = D_in = 3
Ch_out = D_out = 3
K = 96
F = (11, 11)
S = 4
P = 0
W_out = H_out = (227 - 11 + (2 * 0)) / 4 + 1 = 55 output locations
X_col = Fw^2 * D_out x W_out * H_out = 11^2 * 3 x 55 * 55 = 363 x 3025
- Example (im2row)::
W_row = 96 filters of size 11 x 11 x 3 => K x 11 * 11 * 3 = 96 x 363
Result of convolution: W_row dot X_col, a 96 x 3025 matrix that must be reshaped back to 55 x 55 x 96.
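The arithmetic in the im2col and im2row examples can be checked with a short plain-Python sketch. This is not the Im2DimCalculator implementation: the function name `conv_dims` and its return layout are hypothetical, and it assumes the standard output size formula (W - F + 2P) / S + 1. With the shapes listed, W_row (96 x 363) multiplies X_col (363 x 3025) directly.

```python
from typing import Dict, Tuple


def conv_dims(W: int, H: int, D: int, K: int,
              F: Tuple[int, int], S: int, P: int) -> Dict[str, Tuple[int, ...]]:
    """Compute convolution output and im2col matrix shapes.

    Hypothetical helper for illustration only; not part of zensols.deeplearn.
    """
    Fw, Fh = F
    # number of positions the filter fits along each axis
    W_out = (W - Fw + 2 * P) // S + 1
    H_out = (H - Fh + 2 * P) // S + 1
    # each column of X_col is one flattened receptive field
    X_col = (Fw * Fh * D, W_out * H_out)
    # each row of W_row is one flattened filter
    W_row = (K, Fw * Fh * D)
    return {
        'out': (W_out, H_out, K),
        'X_col': X_col,
        'W_row': W_row,
        # shape of W_row (K x n) multiplied by X_col (n x m)
        'product': (W_row[0], X_col[1]),
    }


dims = conv_dims(W=227, H=227, D=3, K=96, F=(11, 11), S=4, P=0)
for name, shape in dims.items():
    print(name, shape)
```

The `product` entry has 96 x 3025 elements, which is the same count as the 55 x 55 x 96 output volume it is reshaped into.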
- property H_out¶
- property W_out¶
- property W_row¶
- property X_col¶
- property out_shape¶
Return the shape of the layer after it is flattened into one dimension.
- class zensols.deeplearn.layer.conv.MaxPool1dFactory(layer_factory=None, stride=1, padding=0, kernel_filter=2)[source]¶
Bases:
PoolFactory
Create a 1D max pool and output its shape.
- __init__(layer_factory=None, stride=1, padding=0, kernel_filter=2)¶
- class zensols.deeplearn.layer.conv.MaxPool2dFactory(layer_factory=None, stride=1, padding=0, kernel_filter=(2, 2))[source]¶
Bases:
PoolFactory
Create a 2D max pool and output its shape.
- __init__(layer_factory=None, stride=1, padding=0, kernel_filter=(2, 2))¶
- class zensols.deeplearn.layer.conv.PoolFactory(layer_factory=None, stride=1, padding=0)[source]¶
Bases:
Flattenable
Create a 2D max pool and output its shape.
- __init__(layer_factory=None, stride=1, padding=0)¶
- layer_factory: ConvolutionLayerFactory = None¶
zensols.deeplearn.layer.crf module¶
Conditional random field PyTorch module forked from Kemal Kurniawan’s
pytorch_crf
GitHub repository. See the Torch CRF
section of the
README.md
module documentation for more information.
- class zensols.deeplearn.layer.crf.CRF(num_tags, batch_first=False, score_reduction='skip')[source]¶
Bases:
Module
Conditional random field.
This module implements a conditional random field [LMP01]. The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor. This class also has a decode() method, which finds the best tag sequence given an emission score tensor using the Viterbi algorithm.
- Parameters:
num_tags – Number of tags.
batch_first – Whether the first dimension corresponds to the size of a minibatch.
score_reduction – how the score output is reduced over batches, and then tags, when it is returned by decode(). The output has shape (batch size, number of tags), with the exception of tags, which has shape (batch size, sequence length, number of tags). One of:
skip: do not return scores, only the decoded output (default)
none: return the scores unaltered, then divide by the batch count
tags: all scores
sum: sum the max over batches, then divide by the batch count
max: max over each batch max, then divide by the batch count
min: min over each batch max, then divide by the batch count
mean: average the max over batches, then divide by the batch count
- start_transitions¶
Start transition score tensor of size (num_tags,).
- Type:
torch.nn.Parameter
- end_transitions¶
End transition score tensor of size (num_tags,).
- Type:
torch.nn.Parameter
- transitions¶
Transition score tensor of size (num_tags, num_tags).
- Type:
torch.nn.Parameter
[LMP01]Lafferty, J., McCallum, A., Pereira, F. (2001). “Conditional random fields: Probabilistic models for segmenting and labeling sequence data”. Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.
- __init__(num_tags, batch_first=False, score_reduction='skip')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- decode(emissions, mask=None)[source]¶
Find the most likely tag sequence using the Viterbi algorithm.
- Return type:
- Parameters:
emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
- Returns:
List of lists containing the best tag sequence for each batch and optionally the scores based on the score_reduction parameter in __init__().
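As an illustration of what Viterbi decoding computes, the following plain-Python sketch finds the best tag path for a single, unmasked sequence. It is not the CRF.decode() implementation, which operates on batched tensors with a mask; the function name `viterbi` and the list-based inputs are hypothetical, while the start, end and transition scores mirror the start_transitions, end_transitions and transitions attributes.

```python
from typing import List, Sequence, Tuple


def viterbi(emissions: Sequence[Sequence[float]],
            start: Sequence[float],
            end: Sequence[float],
            trans: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Return the best tag path and its score for one unmasked sequence.

    Hypothetical helper: emissions is (seq_length, num_tags) and
    trans[i][j] is the score of moving from tag i to tag j.
    """
    n_tags = len(start)
    # score[j]: best score of any path ending in tag j at the current step
    score = [start[j] + emissions[0][j] for j in range(n_tags)]
    history: List[List[int]] = []
    for emit in emissions[1:]:
        prev_best: List[int] = []
        next_score: List[float] = []
        for j in range(n_tags):
            # best previous tag for a transition into tag j
            i = max(range(n_tags), key=lambda p: score[p] + trans[p][j])
            prev_best.append(i)
            next_score.append(score[i] + trans[i][j] + emit[j])
        history.append(prev_best)
        score = next_score
    # add the end transition and pick the best final tag
    final = [score[j] + end[j] for j in range(n_tags)]
    last = max(range(n_tags), key=lambda j: final[j])
    path = [last]
    # follow the back pointers from the last step to the first
    for prev_best in reversed(history):
        path.append(prev_best[path[-1]])
    path.reverse()
    return path, final[last]


emissions = [[1.0, 0.0], [0.0, 0.8], [1.0, 0.0]]
start = [0.0, 0.0]
end = [0.0, 0.0]
trans = [[0.5, -1.0], [-1.0, 0.5]]  # staying on a tag is favored
path, best = viterbi(emissions, start, end, trans)
print(path, best)  # [0, 0, 0] 3.0
```

Although the middle step's emission favors tag 1, the transition scores make staying on tag 0 the better overall path.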
- forward(emissions, tags, mask=None, reduction='sum')[source]¶
Compute the conditional log likelihood of a sequence of tags given emission scores.
- Return type:
Tensor
- Parameters:
emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
tags (torch.LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
reduction – Specifies the reduction to apply to the output: none|sum|mean|token_mean.
none: no reduction will be applied.
sum: the output will be summed over batches.
mean: the output will be averaged over batches.
token_mean: the output will be averaged over tokens.
- Returns:
The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
- Return type:
torch.Tensor
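The conditional log likelihood of one tag sequence is its path score minus the log of the summed exponentiated scores of all possible paths. The sketch below computes this for a single, unmasked sequence in plain Python using the forward algorithm. It is an illustration under those assumptions, not the CRF.forward() implementation; the helper names and list-based inputs are hypothetical, and no batch reduction is applied.

```python
import math
from typing import Sequence


def log_sum_exp(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def path_score(emissions, tags, start, end, trans) -> float:
    """Unnormalized score of one tag sequence."""
    s = start[tags[0]] + emissions[0][tags[0]]
    for t in range(1, len(tags)):
        s += trans[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return s + end[tags[-1]]


def log_partition(emissions, start, end, trans) -> float:
    """Log of the summed exponentiated scores of every path (forward algorithm)."""
    n_tags = len(start)
    # alpha[j]: log-sum of all partial paths ending in tag j
    alpha = [start[j] + emissions[0][j] for j in range(n_tags)]
    for emit in emissions[1:]:
        alpha = [
            log_sum_exp([alpha[i] + trans[i][j] for i in range(n_tags)]) + emit[j]
            for j in range(n_tags)
        ]
    return log_sum_exp([alpha[j] + end[j] for j in range(n_tags)])


def log_likelihood(emissions, tags, start, end, trans) -> float:
    """Conditional log likelihood of tags given the emission scores."""
    return (path_score(emissions, tags, start, end, trans)
            - log_partition(emissions, start, end, trans))


emissions = [[1.0, 0.0], [0.0, 0.8], [1.0, 0.0]]
start = [0.0, 0.0]
end = [0.0, 0.0]
trans = [[0.5, -1.0], [-1.0, 0.5]]
ll = log_likelihood(emissions, [0, 0, 0], start, end, trans)
print(ll)
```

Exponentiating the log likelihood of every possible tag sequence and summing the results gives 1, which is a useful sanity check for a sketch like this.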
zensols.deeplearn.layer.linear module¶
Convenience classes for linear layers.
- class zensols.deeplearn.layer.linear.DeepLinear(net_settings, sub_logger=None)[source]¶
Bases:
BaseNetworkModule
A layer that contains one or more nested layers, including batch normalization and activation. The input and output layer shapes are given, and zero or more optional middle layers are given as percent changes in size or exact numbers.
If the network settings are configured to have batch normalization, batch normalization layers are added after each linear layer.
The drop out and activation function (if any) are applied in between each layer allowing other drop outs and activation functions to be applied before and after. Note that the activation is implemented as a function, and not a layer.
For example, if batch normalization and an activation function are configured and two layers are configured, the network is configured as:
1. linear
2. batch normalization
3. activation
4. dropout
5. linear
6. batch normalization
7. activation
8. dropout
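The ordering in the list above can be sketched as a small helper that only lists operation names; it does not build PyTorch modules and is not the DeepLinear implementation. The function `layer_order` and its boolean flags are hypothetical.

```python
from typing import List


def layer_order(n_layers: int, batch_norm: bool,
                activation: bool, dropout: bool) -> List[str]:
    """List the operations applied for each linear layer, in order.

    Hypothetical helper for illustration only.
    """
    ops: List[str] = []
    for _ in range(n_layers):
        ops.append('linear')
        # optional operations follow each linear layer in a fixed order
        if batch_norm:
            ops.append('batch normalization')
        if activation:
            ops.append('activation')
        if dropout:
            ops.append('dropout')
    return ops


for i, op in enumerate(layer_order(2, True, True, True), 1):
    print(f'{i}. {op}')
```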
The module also provides the output features of each layer with n_features_after_layer() and the ability to forward through only the first given set of layers with forward_n_layers().
- MODULE_NAME: ClassVar[str] = 'linear'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None)[source]¶
Initialize the deep linear layer.
- Parameters:
net_settings (DeepLinearNetworkSettings) – the deep linear layer configuration
sub_logger (Logger) – the logger to use for the forward process in this layer
- forward_n_layers(x, n_layers, full_forward=False)[source]¶
Forward through the first N layers, where N is 0-index based.
- class zensols.deeplearn.layer.linear.DeepLinearNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)[source]¶
Bases:
ActivationNetworkSettings, DropoutNetworkSettings, BatchNormNetworkSettings
Settings for a deep fully connected network using DeepLinear.
- __init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)¶
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.
Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
- middle_features: Tuple[Any]¶
The number of features in the middle layers; if proportions is True, then each number is how much to grow or shrink as a percentage of the last layer, otherwise, it’s the number of features.
- proportions: bool¶
Whether or not to interpret middle_features as a proportion of the previous layer or use directly as the size of the middle layer.
- repeats: int¶
The number of repeats of the middle_features configuration.
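The difference between the two interpretations of middle_features can be illustrated with a short sketch. It rests on assumptions that the documentation above does not confirm: a proportion is treated as a multiplier of the previous layer's size (0.5 halves it), results are truncated to integers, and repeats is not modeled. The function `layer_sizes` is hypothetical and is not how DeepLinear is implemented.

```python
from typing import List, Sequence


def layer_sizes(in_features: int, out_features: int,
                middle_features: Sequence[float],
                proportions: bool) -> List[int]:
    """Return the feature count at each layer boundary.

    Hypothetical helper; `repeats` is intentionally left out.
    """
    sizes = [in_features]
    for mf in middle_features:
        if proportions:
            # assumption: the value scales the previous layer's size
            sizes.append(int(sizes[-1] * mf))
        else:
            # the value is used directly as the layer size
            sizes.append(int(mf))
    sizes.append(out_features)
    return sizes


print(layer_sizes(100, 2, (0.5, 0.5), proportions=True))   # [100, 50, 25, 2]
print(layer_sizes(100, 2, (64, 16), proportions=False))    # [100, 64, 16, 2]
```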
zensols.deeplearn.layer.recur module¶
This file contains a convenience wrapper around RNN, GRU and LSTM modules in PyTorch.
- class zensols.deeplearn.layer.recur.RecurrentAggregation(net_settings, sub_logger=None)[source]¶
Bases:
BaseNetworkModule
A recurrent neural network model with an output aggregation. This includes RNNs, LSTMs and GRUs.
- MODULE_NAME: ClassVar[str] = 'recur'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None)[source]¶
Initialize the recurrent layer.
- Parameters:
net_settings (RecurrentAggregationNetworkSettings) – the recurrent layer configuration
sub_logger (Logger) – the logger to use for the forward process in this layer
- class zensols.deeplearn.layer.recur.RecurrentAggregationNetworkSettings(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)[source]¶
Bases:
DropoutNetworkSettings
Settings for a recurrent neural network. This configures a RecurrentAggregation layer.
- __init__(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)¶
- aggregation: str¶
A convenience operation to aggregate the parameters; this is one of:
max: return the max of the output states
ave: return the average of the output states
last: return the last output state
none: do not apply an aggregation function.
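The four aggregation options can be illustrated on a list of per-step output state vectors. This plain-Python sketch assumes that max and ave are taken element-wise across the sequence dimension, which the documentation above does not spell out; the function `aggregate` is hypothetical and is not the RecurrentAggregation implementation.

```python
from typing import List, Sequence, Union

Vector = List[float]


def aggregate(states: Sequence[Vector],
              aggregation: str) -> Union[Vector, List[Vector]]:
    """Aggregate per-step output states of shape (seq_length, hidden_size).

    Hypothetical helper for illustration only.
    """
    if aggregation == 'max':
        # element-wise maximum across the sequence
        return [max(col) for col in zip(*states)]
    if aggregation == 'ave':
        # element-wise average across the sequence
        return [sum(col) / len(states) for col in zip(*states)]
    if aggregation == 'last':
        return list(states[-1])
    if aggregation == 'none':
        # no aggregation: every output state is returned
        return [list(s) for s in states]
    raise ValueError(f'unknown aggregation: {aggregation}')


states = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]
for name in ('max', 'ave', 'last', 'none'):
    print(name, aggregate(states, name))
```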
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.
Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
- hidden_size¶
The size of the hidden states of the network.
zensols.deeplearn.layer.recurcrf module¶
Contains an implementation of a recurrent network with a conditional random field layer. This is usually configured as a BiLSTM CRF.
- class zensols.deeplearn.layer.recurcrf.RecurrentCRF(net_settings, sub_logger=None, use_crf=True)[source]¶
Bases:
BaseNetworkModule
Adapt the CRF module using the framework-based BaseNetworkModule class. This provides the methods forward_recur_decode() and decode(), which decodes the input.
This adds a recurrent neural network and a fully connected feed forward decoder layer before the CRF layer.
- MODULE_NAME: ClassVar[str] = 'recur crf'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None, use_crf=True)[source]¶
Initialize the recurrent CRF layer.
- Parameters:
net_settings (RecurrentCRFNetworkSettings) – the recurrent layer configuration
sub_logger (Logger) – the logger to use for the forward process in this layer
- decode(x, mask)[source]¶
Forward the input through the recurrent network, decoder, and then the CRF.
- Parameters:
x (Tensor) – the input
mask (Tensor) – the mask used to block the last N states not provided
- Return type:
Tuple[Tensor, Tensor]
- Returns:
the CRF sequence output and the score provided by the CRF’s Viterbi algorithm as a tuple
- forward_recur_decode(x)[source]¶
Forward the input through the recurrent network (i.e. LSTM), batch normalization and activation (if configured), and decoder output.
Note: this layer forwards batch normalization, activation and drop out (for those configured) after the recurrent layer is forwarded. However, the subordinate recurrent layer can also be configured with a dropout when having more than one stacked layer.
- Parameters:
x (Tensor) – the network input
- Return type:
Tensor
- Returns:
the fully connected linear feed forward decoded output
- to(*args, **kwargs)[source]¶
Moves and/or casts the parameters and buffers.
This can be called as
- to(device=None, dtype=None, non_blocking=False)[source]
- to(dtype, non_blocking=False)[source]
- to(tensor, non_blocking=False)[source]
- to(memory_format=torch.channels_last)[source]
Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.
See below for examples.
Note
This method modifies the module in-place.
- Parameters:
device (torch.device) – the desired device of the parameters and buffers in this module
dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module
tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module
memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)
- Returns:
self
- Return type:
Module
Examples:
>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)
>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
- class zensols.deeplearn.layer.recurcrf.RecurrentCRFNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)[source]¶
Bases:
ActivationNetworkSettings, DropoutNetworkSettings, BatchNormNetworkSettings
Settings for a recurrent neural network using RecurrentCRF.
- __init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)¶
- decoder_settings: DeepLinearNetworkSettings¶
The decoder feed forward network.
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.
Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
- hidden_size¶
The size of the hidden states of the network.
Module contents¶
Provides neural network layer implementations, which are all subclasses of
torch.nn.Module
.
- exception zensols.deeplearn.layer.LayerError[source]¶
Bases:
ModelError
Thrown for all deep learning layer errors.
- __annotations__ = {}¶
- __module__ = 'zensols.deeplearn.layer'¶