zensols.deeplearn.layer package¶
Submodules¶
zensols.deeplearn.layer.conv module¶
Convolution network creation utilities.
- class zensols.deeplearn.layer.conv.Convolution1DLayerFactory(stride=1, padding=0, pool_stride=1, pool_padding=0, in_channels=1, out_channels=1, kernel_filter=2, pool_kernel_filter=2)[source]¶
Bases:
ConvolutionLayerFactory
One dimensional convolution and output shape factory.
- __init__(stride=1, padding=0, pool_stride=1, pool_padding=0, in_channels=1, out_channels=1, kernel_filter=2, pool_kernel_filter=2)¶
- create_batch_norm_layer()[source]¶
Create the batch norm layer that follows the pool layer.
- Return type:
Module
- create_conv_layer()[source]¶
Create the convolution layer for this layer in the stack.
- Return type:
Module
- class zensols.deeplearn.layer.conv.Convolution2DLayerFactory(stride=1, padding=0, pool_stride=1, pool_padding=0, width=1, height=1, depth=1, kernel_filter=(2, 2), n_filters=1, pool_kernel_filter=(2, 2))[source]¶
Bases:
ConvolutionLayerFactory
Two dimensional convolution and output shape factory. The implementation as matrix multiplication is taken from the `Stanford CNN`_ class.
- Example (im2col)::
  W_in = H_in = 227
  Ch_in = D_in = 3
  Ch_out = D_out = 3
  K = 96
  F = (11, 11)
  S = 4
  P = 0
  W_out = H_out = (227 - 11 + 2 * 0) / 4 + 1 = 55 output locations
  X_col = Fw^2 * D_out x W_out * H_out = 11^2 * 3 x 55 * 55 = 363 x 3025
- Example (im2row)::
  W_row = 96 filters of size 11 x 11 x 3 => K x 11 * 11 * 3 = 96 x 363
  Result of convolution: transpose(W_row) dot X_col. Must reshape back to 55 x 55 x 96
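The numbers above can be reproduced with the standard convolution output size formula. The following plain Python sketch is illustrative only and does not use the factory API:

```python
# check of the im2col example above: output size is (W_in - F + 2P) / S + 1
W_in, F, S, P, depth, n_filters = 227, 11, 4, 0, 3, 96
W_out = (W_in - F + 2 * P) // S + 1           # 55 output locations per side
X_col_shape = (F * F * depth, W_out * W_out)  # (363, 3025)
W_row_shape = (n_filters, F * F * depth)      # (96, 363)
# W_row dot X_col gives (96, 3025), reshaped back to 55 x 55 x 96
print(W_out, X_col_shape, W_row_shape)
```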
- property H_out¶
- property W_out¶
- property W_row¶
- property X_col¶
- __init__(stride=1, padding=0, pool_stride=1, pool_padding=0, width=1, height=1, depth=1, kernel_filter=(2, 2), n_filters=1, pool_kernel_filter=(2, 2))¶
- create_batch_norm_layer()[source]¶
Create the batch norm layer that follows the pool layer.
- Return type:
Module
- create_conv_layer()[source]¶
Create the convolution layer for this layer in the stack.
- Return type:
Module
- class zensols.deeplearn.layer.conv.ConvolutionLayerFactory(stride=1, padding=0, pool_stride=1, pool_padding=0)[source]¶
Bases:
Dictable
Create convolution layers and output shape calculator.
- __init__(stride=1, padding=0, pool_stride=1, pool_padding=0)¶
- abstract create_batch_norm_layer()[source]¶
Create the batch norm layer that follows the pool layer.
- Return type:
Module
- abstract create_conv_layer()[source]¶
Create the convolution layer for this layer in the stack.
- Return type:
Module
- abstract create_pool_layer()[source]¶
Create the pool layer that follows the convolutional layer.
- Return type:
Module
- iter_layers(use_pool=True)[source]¶
Iterate over the subsequent convolution and pooled stacked networks. Use with itertools.islice to limit the output (see the sketch following this entry).
- Return type:
- Returns:
subsequent layers after the current instance for all valid layers
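A minimal usage sketch; the constructor arguments are illustrative values, not defaults taken from this documentation:

```python
from itertools import islice
from zensols.deeplearn.layer.conv import Convolution2DLayerFactory

# walk only the first two subsequent layer factories in the stack,
# limiting the iterator with itertools.islice as suggested above
factory = Convolution2DLayerFactory(
    width=227, height=227, depth=3,
    kernel_filter=(11, 11), n_filters=96, stride=4)
for layer_factory in islice(factory.iter_layers(), 2):
    print(layer_factory)
```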
- next_layer(use_pool=True)[source]¶
Get a new factory that represents the next layer of the convolution stack.
- Parameters:
use_pool (
bool
) – whether to use the output shape of the pool for the next layer’s input and output channel settings
- Return type:
- property out_conv_shape: Tuple[int, ...]¶
The convolution layer shape before being flattened into one dimension.
- property out_pool_shape: Tuple[int, ...]¶
The pooling layer shape before being flattened into one dimension.
- validate(raise_error=True)[source]¶
Validate the parameters of the factory.
- Parameters:
raise_error (
bool
) – if True, raises an error when invalid
- Raises:
LayerError – if invalid and raise_error is True
- Return type:
zensols.deeplearn.layer.crf module¶
Conditional random field PyTorch module forked from Kemal Kurniawan’s pytorch_crf GitHub repository. See the Torch CRF section of the README.md module documentation for more information.
- class zensols.deeplearn.layer.crf.CRF(num_tags, batch_first=False, score_reduction='skip')[source]¶
Bases:
Module
Conditional random field.
This module implements a conditional random field [LMP01]. The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor. This class also has a decode() method, which finds the best tag sequence given an emission score tensor using the Viterbi algorithm (a usage sketch follows the forward() method below).
- Parameters:
num_tags – Number of tags.
batch_first – Whether the first dimension corresponds to the size of a minibatch.
score_reduction – how the score output is reduced over batches and then tags; the reduced output has shape (batch size, number of tags), with the exception of tags, which has shape (batch_size, sequence length, number of tags). How output is returned in decode() is given by:
skip: do not return scores, only the decoded output (default)
none: return the scores unaltered, then divide by the batch count
tags: all scores
sum: sum the max over batches, then divide by the batch count
max: max over each batch max, then divide by the batch count
min: min over each batch max, then divide by the batch count
mean: average the max over batches, then divide by the batch count
- start_transitions¶
Start transition score tensor of size (num_tags,).
- Type:
torch.nn.Parameter
- end_transitions¶
End transition score tensor of size (num_tags,).
- Type:
torch.nn.Parameter
- transitions¶
Transition score tensor of size (num_tags, num_tags).
- Type:
torch.nn.Parameter
[LMP01] Lafferty, J., McCallum, A., Pereira, F. (2001). “Conditional random fields: Probabilistic models for segmenting and labeling sequence data”. Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.
- __init__(num_tags, batch_first=False, score_reduction='skip')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- decode(emissions, mask=None)[source]¶
Find the most likely tag sequence using the Viterbi algorithm.
- Return type:
- Parameters:
emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
- Returns:
List of lists containing the best tag sequence for each batch and optionally the scores based on the score_reduction parameter in __init__().
- forward(emissions, tags, mask=None, reduction='sum')[source]¶
Compute the conditional log likelihood of a sequence of tags given emission scores.
- Return type:
Tensor
- Parameters:
emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
tags (torch.LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
reduction – Specifies the reduction to apply to the output: none|sum|mean|token_mean. none: no reduction will be applied. sum: the output will be summed over batches. mean: the output will be averaged over batches. token_mean: the output will be averaged over tokens.
- Returns:
The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
- Return type:
torch.Tensor
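A minimal usage sketch based on the signatures documented above; the tensor sizes are illustrative:

```python
import torch
from zensols.deeplearn.layer.crf import CRF

# shapes follow the (seq_length, batch_size, ...) convention since
# batch_first defaults to False
seq_length, batch_size, num_tags = 3, 2, 5
crf = CRF(num_tags)
emissions = torch.randn(seq_length, batch_size, num_tags)
tags = torch.randint(num_tags, (seq_length, batch_size), dtype=torch.long)
mask = torch.ones(seq_length, batch_size, dtype=torch.uint8)

# forward(): conditional log likelihood, summed over batches by default
log_likelihood = crf(emissions, tags, mask=mask)

# decode(): best tag sequence per batch via the Viterbi algorithm
best_tags = crf.decode(emissions, mask=mask)
```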
zensols.deeplearn.layer.linear module¶
Convenience classes for linear layers.
- class zensols.deeplearn.layer.linear.DeepLinear(net_settings, sub_logger=None)[source]¶
Bases:
BaseNetworkModule
A layer that contains one or more nested layers, including batch normalization and activation. The input and output layer shapes are given, and zero or more optional middle layers are given as percent changes in size or exact numbers.
If the network settings are configured to have batch normalization, batch normalization layers are added after each linear layer.
The dropout and activation function (if any) are applied between each layer, allowing other dropouts and activation functions to be applied before and after. Note that the activation is implemented as a function, and not a layer.
For example, if batch normalization and an activation function are configured and two layers are configured, the network is configured as:
1. linear
2. batch normalization
3. activation
4. dropout
5. linear
6. batch normalization
7. activation
8. dropout
The module also provides the output features of each layer with n_features_after_layer() and the ability to forward through only the first given set of layers with forward_n_layers().
- MODULE_NAME: ClassVar[str] = 'linear'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None)[source]¶
Initialize the deep linear layer.
- Parameters:
net_settings (
DeepLinearNetworkSettings
) – the deep linear layer configuration
sub_logger (
Logger
) – the logger to use for the forward process in this layer
- forward_n_layers(x, n_layers, full_forward=False)[source]¶
Forward through the first N (zero index based) layers.
- class zensols.deeplearn.layer.linear.DeepLinearNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)[source]¶
Bases:
ActivationNetworkSettings
,DropoutNetworkSettings
,BatchNormNetworkSettings
Settings for a deep fully connected network using
DeepLinear
.- __init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)¶
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by
ModelManager
. This module takes as the first parameter an instance of this class. Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
str
- middle_features: Tuple[Union[int, float, Dict[str, Any]], ...]¶
The number of features in the middle layers; if proportions is True, then each number is how much to grow or shrink as a percentage of the last layer; otherwise, it is the number of features (a short illustration follows the attribute descriptions below).
If any element is a dictionary, then it interprets the keys as:
value: the value as if the entry was a number, and defaults to 1
apply: a sequence of strings indicating the order of the layers to apply, with default linear, bnorm, activation, dropout; if a layer is omitted it won’t be applied
batch_norm_features: the number of features to use in a batch, which might change based on ordering, or last to use the last number of parameters computed in the deep linear network; otherwise it is computed as the size of the current linear input
- out_features: Union[int, Dict[str, Any]]¶
The number of features as output from the last layer. If a dictionary, it follows the same rules as middle_features.
- proportions: bool¶
Whether or not to interpret middle_features as a proportion of the previous layer or use directly as the size of the middle layer.
- repeats: int¶
The number of repeats of the middle_features configuration.
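As a short illustration of the proportions interpretation of middle_features, the following plain Python sketch (not package code; all names are local to the example) computes the middle layer sizes:

```python
# each entry of middle_features scales the size of the previous layer
in_features = 100
middle_features = (0.5, 0.5)
sizes, last = [], in_features
for prop in middle_features:
    last = int(last * prop)   # 100 -> 50 -> 25
    sizes.append(last)
print(sizes)                   # [50, 25]
```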
zensols.deeplearn.layer.recur module¶
This file contains a convenience wrapper around RNN, GRU and LSTM modules in PyTorch.
- class zensols.deeplearn.layer.recur.RecurrentAggregation(net_settings, sub_logger=None)[source]¶
Bases:
BaseNetworkModule
A recurrent neural network model with an output aggregation. This includes RNNs, LSTMs and GRUs.
- MODULE_NAME: ClassVar[str] = 'recur'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None)[source]¶
Initialize the recurrent layer.
- Parameters:
net_settings (
RecurrentAggregationNetworkSettings
) – the recurrent layer configuration
sub_logger (
Logger
) – the logger to use for the forward process in this layer
- class zensols.deeplearn.layer.recur.RecurrentAggregationNetworkSettings(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)[source]¶
Bases:
DropoutNetworkSettings
Settings for a recurrent neural network. This configures a
RecurrentAggregation
layer.- __init__(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)¶
- aggregation: str¶
A convenience operation to aggregate the parameters (see the sketch after this list); this is one of:
max: return the max of the output states
ave: return the average of the output states
last: return the last output state
none: do not apply an aggregation function.
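As a rough illustration (not package code), the aggregation options correspond to the following reductions over a hypothetical RNN output of shape (batch, sequence, hidden):

```python
import torch

out = torch.randn(8, 20, 64)   # hypothetical RNN output: (batch, seq, hidden)
agg_max = out.max(dim=1)[0]    # 'max':  max of the output states
agg_ave = out.mean(dim=1)      # 'ave':  average of the output states
agg_last = out[:, -1, :]       # 'last': last output state
agg_none = out                 # 'none': no aggregation applied
```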
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by
ModelManager
. This module takes as the first parameter an instance of this class. Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
str
- hidden_size: int¶
The size of the hidden states of the network.
zensols.deeplearn.layer.recurcrf module¶
Contains an implementation of a recurrent neural network with a conditional random field layer. This is usually configured as a BiLSTM-CRF.
- class zensols.deeplearn.layer.recurcrf.RecurrentCRF(net_settings, sub_logger=None, use_crf=True)[source]¶
Bases:
BaseNetworkModule
Adapt the CRF module using the framework based BaseNetworkModule class. This provides the methods forward_recur_decode() and decode(), which decode the input.
This adds a recurrent neural network and a fully connected feed forward decoder layer before the CRF layer.
- MODULE_NAME: ClassVar[str] = 'recur crf'¶
The module name used in the logging message. This is set in each inherited class.
- __init__(net_settings, sub_logger=None, use_crf=True)[source]¶
Initialize the recurrent CRF layer.
- Parameters:
net_settings (
RecurrentCRFNetworkSettings
) – the recurrent layer configuration
sub_logger (
Logger
) – the logger to use for the forward process in this layer
- decode(x, mask)[source]¶
Forward the input through the recurrent network, decoder, and then the CRF.
- Parameters:
x (
Tensor
) – the input
mask (
Tensor
) – the mask used to block the last N states not provided
- Return type:
Tuple
[Tensor
,Tensor
]- Returns:
the CRF sequence output and the score provided by the CRF’s Viterbi algorithm as a tuple
- forward_recur_decode(x)[source]¶
Forward the input through the recurrent network (i.e. LSTM), batch normalization and activation (if configured), and decoder output.
Note: this layer forwards batch normalization, activation and drop out (for those configured) after the recurrent layer is forwarded. However, the subordinate recurrent layer can also be configured with a dropout when having more than one stacked layer.
- Parameters:
x (
Tensor
) – the network input
- Return type:
Tensor
- Returns:
the fully connected linear feed forward decoded output
- to(*args, **kwargs)[source]¶
Moves and/or casts the parameters and buffers.
This can be called as
- to(device=None, dtype=None, non_blocking=False)
- to(dtype, non_blocking=False)
- to(tensor, non_blocking=False)
- to(memory_format=torch.channels_last)
Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.
See below for examples.
Note
This method modifies the module in-place.
- Parameters:
device (
torch.device
) – the desired device of the parameters and buffers in this module
dtype (
torch.dtype
) – the desired floating point or complex dtype of the parameters and buffers in this module
tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module
memory_format (
torch.memory_format
) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)
- Returns:
self
- Return type:
Module
Examples:
>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)
>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
- class zensols.deeplearn.layer.recurcrf.RecurrentCRFNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)[source]¶
Bases:
ActivationNetworkSettings
,DropoutNetworkSettings
,BatchNormNetworkSettings
Settings for a recurrent neural network using
RecurrentCRF
.- __init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)¶
- decoder_settings: DeepLinearNetworkSettings¶
The decoder feed forward network.
- get_module_class_name()[source]¶
Returns the fully qualified class name of the module to create by
ModelManager
. This module takes as the first parameter an instance of this class. Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.
- Return type:
str
- hidden_size: int¶
The size of the hidden states of the network.
Module contents¶
Provides neural network layer implementations, which are all subclasses of
torch.nn.Module
.
- exception zensols.deeplearn.layer.LayerError[source]¶
Bases:
ModelError
Thrown for all deep learning layer errors.
- __annotations__ = {}¶
- __module__ = 'zensols.deeplearn.layer'¶