zensols.deeplearn.layer package

Submodules

zensols.deeplearn.layer.conv module

Convolution network creation utilities.

class zensols.deeplearn.layer.conv.ConvolutionLayerFactory(width=1, height=1, depth=1, n_filters=1, kernel_filter=(2, 2), stride=1, padding=0)[source]

Bases: object

Create convolution layers. Each attribute maps to a corresponding instance variable in Im2DimCalculator, which is given in parentheses in the parameter documentation below.

Parameters:
  • width (int) – the width of the image/data (W)

  • height (int) – the height of the image/data (H)

  • depth (int) – the volume, which is usually the same as n_filters (D)

  • n_filters (int) – the number of filters, aka the filter depth/volume (K)

  • kernel_filter (Tuple[int, int]) – the kernel filter dimension in width X height (F)

  • stride (int) – the stride, which is the number of cells to skip for each convolution (S)

  • padding (int) – the number of zero-padded cells on the ends of the image/data (P)

See:

Stanford

__init__(width=1, height=1, depth=1, n_filters=1, kernel_filter=(2, 2), stride=1, padding=0)
batch_norm2d()[source]

Return a 2D batch normalization layer.

Return type:

BatchNorm2d

property calc: Im2DimCalculator
clone(**kwargs)[source]

Return a clone of this factory instance.

Return type:

Any

conv1d()[source]

Return a convolution layer in one dimension.

Return type:

Conv1d

conv2d()[source]

Return a convolution layer in two dimensions.

Return type:

Conv2d

copy_calc(calc)[source]
depth: int = 1
flatten()[source]

Return a new flattened instance of this class.

Return type:

Any

property flatten_dim: int

Return the dimension of a flattened array of the convolution layer represented by this instance.

height: int = 1
kernel_filter: Tuple[int, int] = (2, 2)
n_filters: int = 1
padding: int = 0
stride: int = 1
width: int = 1
class zensols.deeplearn.layer.conv.Flattenable[source]

Bases: object

A class with flatten_dim and out_shape properties.

property flatten_dim: int

Return the number of neurons of the layer after flattening into one dimension.

property out_shape: Tuple[int]

Return the shape of the layer after it is flattened into one dimension.

class zensols.deeplearn.layer.conv.Im2DimCalculator(W, H, D=1, K=1, F=(2, 2), S=1, P=0)[source]

Bases: Flattenable

Convolution matrix dimension calculation utility.

See the Implementation as Matrix Multiplication section of the Stanford reference.

Example (im2col)::

    W_in = H_in = 227
    Ch_in = D_in = 3
    Ch_out = D_out = 3
    K = 96
    F = (11, 11)
    S = 4
    P = 0
    W_out = H_out = (227 - 11 + (2 * 0)) / 4 + 1 = 55 output locations
    X_col = Fw^2 * D_out x W_out * H_out = 11^2 * 3 x 55 * 55 = 363 x 3025

Example (im2row)::

    W_row = 96 filters of size 11 x 11 x 3 => K x 11 * 11 * 3 = 96 x 363

Result of convolution: W_row dot X_col, which yields a 96 x 3025 matrix that must be reshaped back to 55 x 55 x 96.
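The shape arithmetic in the example above can be checked with a small sketch. This is plain Python, not the library's implementation; the function names conv_output_size and im2col_shapes are hypothetical and only the formulas come from the example.

```python
from typing import Tuple


def conv_output_size(size: int, f: int, s: int, p: int) -> int:
    """Output length along one axis: (size - f + 2p) / s + 1."""
    span = size - f + 2 * p
    if span % s != 0:
        raise ValueError(
            f'filter {f}, stride {s}, padding {p} do not tile size {size}')
    return span // s + 1


def im2col_shapes(W: int, H: int, D: int, K: int, F: Tuple[int, int],
                  S: int, P: int) -> Tuple[Tuple[int, int], ...]:
    """Return the shapes of X_col, W_row and their matrix product."""
    w_out = conv_output_size(W, F[0], S, P)
    h_out = conv_output_size(H, F[1], S, P)
    patch = F[0] * F[1] * D            # values under one filter position
    x_col = (patch, w_out * h_out)     # one column per output location
    w_row = (K, patch)                 # one row per filter
    result = (w_row[0], x_col[1])      # shape of W_row dot X_col
    return x_col, w_row, result


x_col, w_row, result = im2col_shapes(227, 227, 3, 96, (11, 11), 4, 0)
print(x_col, w_row, result)  # (363, 3025) (96, 363) (96, 3025)
```

The 96 x 3025 product holds one row per filter and one column per output location, which is why it reshapes to 55 x 55 x 96.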

See:

Stanford

property H_out
property W_out
property W_row
property X_col
__init__(W, H, D=1, K=1, F=(2, 2), S=1, P=0)[source]

Initialize.

Parameters:
  • W (int) – width

  • H (int) – height

  • D (int) – depth [of volume] (usually same as K)

  • K (int) – number of filters

  • F (Tuple[int, int]) – tuple of kernel/filter (width, height)

  • S (int) – stride

  • P (int) – padding

flatten(axis=1)[source]
property out_shape

Return the shape of the layer after it is flattened into one dimension.

validate()[source]
class zensols.deeplearn.layer.conv.MaxPool1dFactory(layer_factory=None, stride=1, padding=0, kernel_filter=2)[source]

Bases: PoolFactory

Create a 1D max pool and output its shape.

__init__(layer_factory=None, stride=1, padding=0, kernel_filter=2)
create_pool()[source]
Return type:

Module

kernel_filter: Tuple[int] = 2

The filter used for max pooling.

class zensols.deeplearn.layer.conv.MaxPool2dFactory(layer_factory=None, stride=1, padding=0, kernel_filter=(2, 2))[source]

Bases: PoolFactory

Create a 2D max pool and output its shape.

__init__(layer_factory=None, stride=1, padding=0, kernel_filter=(2, 2))
create_pool()[source]
Return type:

Module

kernel_filter: Tuple[int, int] = (2, 2)

The filter used for max pooling.

class zensols.deeplearn.layer.conv.PoolFactory(layer_factory=None, stride=1, padding=0)[source]

Bases: Flattenable

Create a 2D max pool and output its shape.

See:

Stanford

__init__(layer_factory=None, stride=1, padding=0)
abstract create_pool()[source]
Return type:

Module

layer_factory: ConvolutionLayerFactory = None
property out_shape: Tuple[int]

Calculates the dimensions for a max pooling filter and creates a layer.

Parameters:
  • F – the spatial extent (kernel filter)

  • S – the stride
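As a rough illustration of the dimension calculation, the sketch below applies the usual pooling formula, floor((size + 2P - F) / S) + 1, to each spatial axis and leaves the depth unchanged. The function name pool_out_shape and the (depth, width, height) ordering are assumptions made for this example, not the library's API.

```python
from typing import Tuple


def pool_out_shape(in_shape: Tuple[int, int, int],
                   F: Tuple[int, int] = (2, 2),
                   S: int = 1, P: int = 0) -> Tuple[int, int, int]:
    """Shape after 2D max pooling; pooling does not change the depth."""
    depth, width, height = in_shape
    w_out = (width + 2 * P - F[0]) // S + 1
    h_out = (height + 2 * P - F[1]) // S + 1
    return (depth, w_out, h_out)


# a 3 x 3 filter with stride 2 over a 96 x 55 x 55 convolution output
print(pool_out_shape((96, 55, 55), F=(3, 3), S=2))  # (96, 27, 27)
```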

padding: int = 0
stride: int = 1

zensols.deeplearn.layer.crf module

Conditional random field PyTorch module forked from Kemal Kurniawan’s pytorch_crf GitHub repository. See the Torch CRF section of the README.md module documentation for more information.

see:

pytorch_crf

see:

Torch CRF Readme

class zensols.deeplearn.layer.crf.CRF(num_tags, batch_first=False, score_reduction='skip')[source]

Bases: Module

Conditional random field.

This module implements a conditional random field [LMP01]. The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor. This class also has a decode() method, which finds the best tag sequence given an emission score tensor using the Viterbi algorithm.

Parameters:
  • num_tags – Number of tags.

  • batch_first – Whether the first dimension corresponds to the size of a minibatch.

  • score_reduction

    how the scores are reduced over batches and then tags. The reduced score has shape (batch size, number of tags), with the exception of tags, which has shape (batch_size, sequence length, number of tags). This determines how the output is returned by decode():

    • skip: do not return scores, only the decoded output (default)

    • none: return the scores unaltered, then divide by the batch count

    • tags: all scores

    • sum: sum the max over batches, then divide by the batch count

    • max: max over each batch max, then divide by the batch count

    • min: min over each batch max, then divide by the batch count

    • mean: average the max over batches, then divide by the batch count

start_transitions

Start transition score tensor of size (num_tags,).

Type:

torch.nn.Parameter

end_transitions

End transition score tensor of size (num_tags,).

Type:

torch.nn.Parameter

transitions

Transition score tensor of size (num_tags, num_tags).

Type:

torch.nn.Parameter

[LMP01]

Lafferty, J., McCallum, A., Pereira, F. (2001). “Conditional random fields: Probabilistic models for segmenting and labeling sequence data”. Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.

__init__(num_tags, batch_first=False, score_reduction='skip')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

decode(emissions, mask=None)[source]

Find the most likely tag sequence using the Viterbi algorithm.

Return type:

Union[List[List[int]], Tuple[List[List[int]], Tensor]]

Parameters:
  • emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.

  • mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.

Returns:

A list of lists containing the best tag sequence for each batch and, optionally, the scores based on the score_reduction parameter in __init__().
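For readers unfamiliar with Viterbi decoding, the following is a minimal single-sequence sketch in plain Python. It is not the module's tensor implementation: it handles neither batching nor masks, and the function name viterbi and its list-based arguments are assumptions chosen to mirror the start_transitions, end_transitions and transitions attributes.

```python
from typing import List, Sequence, Tuple


def viterbi(emissions: Sequence[Sequence[float]],
            start: Sequence[float], end: Sequence[float],
            trans: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Return the best tag path and its score for one sequence.

    emissions[t][j] is the score of tag j at step t and trans[i][j] is
    the score of moving from tag i to tag j.
    """
    n_tags = len(start)
    tags = range(n_tags)
    # best score of any path ending in each tag after the first step
    score = [start[j] + emissions[0][j] for j in tags]
    history = []  # back pointers, one list per later step
    for emit in emissions[1:]:
        best_prev, next_score = [], []
        for j in tags:
            cand = [score[i] + trans[i][j] for i in tags]
            i_best = max(tags, key=cand.__getitem__)
            best_prev.append(i_best)
            next_score.append(cand[i_best] + emit[j])
        history.append(best_prev)
        score = next_score
    final = [score[j] + end[j] for j in tags]
    last = max(tags, key=final.__getitem__)
    # follow the back pointers from the best final tag
    path = [last]
    for best_prev in reversed(history):
        path.append(best_prev[path[-1]])
    path.reverse()
    return path, final[last]


emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
zeros = [0.0, 0.0]
print(viterbi(emissions, zeros, zeros, [zeros, zeros]))  # ([0, 1, 0], 3.0)
# penalizing tag changes keeps the path on a single tag
print(viterbi(emissions, zeros, zeros, [[0.0, -5.0], [-5.0, 0.0]]))  # ([0, 0, 0], 2.0)
```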

forward(emissions, tags, mask=None, reduction='sum')[source]

Compute the conditional log likelihood of a sequence of tags given emission scores.

Return type:

Tensor

Parameters:
  • emissions (torch.Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.

  • tags (torch.LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.

  • mask (torch.ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.

  • reduction – Specifies the reduction to apply to the output: none|sum|mean|token_mean. none: no reduction will be applied. sum: the output will be summed over batches. mean: the output will be averaged over batches. token_mean: the output will be averaged over tokens.

Returns:

The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
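The four reduction modes can be illustrated on per-sequence log likelihoods that have already been computed. This sketch uses plain lists in place of tensors; the function name reduce_llh and the batch-first mask layout are assumptions for the example.

```python
from typing import List, Union


def reduce_llh(llh: List[float], mask: List[List[int]],
               reduction: str = 'sum') -> Union[List[float], float]:
    """Reduce per-sequence log likelihoods.

    llh has one value per sequence; mask[b][t] is 1 for a real token
    and 0 for padding.
    """
    if reduction == 'none':
        return list(llh)                      # size (batch_size,)
    total = sum(llh)
    if reduction == 'sum':
        return total
    if reduction == 'mean':
        return total / len(llh)               # average over sequences
    if reduction == 'token_mean':
        n_tokens = sum(sum(row) for row in mask)
        return total / n_tokens               # average over real tokens
    raise ValueError(f'invalid reduction: {reduction}')


llh = [-2.0, -4.0]
mask = [[1, 1, 1], [1, 0, 0]]
print(reduce_llh(llh, mask, 'sum'))         # -6.0
print(reduce_llh(llh, mask, 'mean'))        # -3.0
print(reduce_llh(llh, mask, 'token_mean'))  # -1.5
```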


reset_parameters()[source]

Initialize the transition parameters.

The parameters will be initialized randomly from a uniform distribution between -0.1 and 0.1.

Return type:

None

zensols.deeplearn.layer.linear module

Convenience classes for linear layers.

class zensols.deeplearn.layer.linear.DeepLinear(net_settings, sub_logger=None)[source]

Bases: BaseNetworkModule

A layer that contains one or more nested layers, including batch normalization and activation. The input and output layer shapes are given, and zero or more optional middle layers are given as percent changes in size or exact numbers.

If the network settings are configured to have batch normalization, batch normalization layers are added after each linear layer.

The drop out and activation function (if any) are applied in between each layer allowing other drop outs and activation functions to be applied before and after. Note that the activation is implemented as a function, and not a layer.

For example, if batch normalization and an activation function are configured and two layers are configured, the network is configured as:

  1. linear

  2. batch normalization

  3. activation

  4. dropout

  5. linear

  6. batch normalization

  7. activation

  8. dropout
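The ordering above can be generated with a short sketch. The function layer_order and its boolean flags are hypothetical and only model the sequence of steps, not the actual modules.

```python
from typing import List


def layer_order(n_layers: int, batch_norm: bool = True,
                activation: bool = True, dropout: bool = True) -> List[str]:
    """List the steps applied for the given number of linear layers."""
    steps = []
    for _ in range(n_layers):
        steps.append('linear')
        if batch_norm:
            steps.append('batch normalization')
        if activation:
            steps.append('activation')
        if dropout:
            steps.append('dropout')
    return steps


for i, step in enumerate(layer_order(2), start=1):
    print(f'{i}. {step}')
```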

The module also provides the output features of each layer with n_features_after_layer() and the ability to forward through only the first given set of layers with forward_n_layers().

MODULE_NAME: ClassVar[str] = 'linear'

The module name used in the logging message. This is set in each inherited class.

__init__(net_settings, sub_logger=None)[source]

Initialize the deep linear layer.

Parameters:
  • net_settings (DeepLinearNetworkSettings) – the deep linear layer configuration

  • sub_logger (Logger) – the logger to use for the forward process in this layer

deallocate()[source]

Deallocate all resources for this instance.

forward_n_layers(x, n_layers, full_forward=False)[source]

Forward through the first N layers (0-based index).

Parameters:
  • n_layers (int) – the number of layers to forward through (0-based index)

  • full_forward (bool) – if True, also return the full forward as the second element of the returned tuple

Return type:

Tensor

Returns:

the tensor output of all layers or a tuple of (N-th layer, all layers)
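One way to read this contract is sketched below with plain callables standing in for layers. It assumes that n_layers is the 0-based index of the last layer applied; the standalone function and its layers argument are hypothetical and not the method's real signature.

```python
from typing import Any, Callable, Sequence


def forward_n_layers(x: Any, layers: Sequence[Callable[[Any], Any]],
                     n_layers: int, full_forward: bool = False) -> Any:
    """Apply layers 0..n_layers and optionally continue to the end."""
    n_layers = min(n_layers, len(layers) - 1)
    x_n = None
    for i, layer in enumerate(layers):
        x = layer(x)
        if i == n_layers:
            x_n = x            # output of the N-th layer
            if not full_forward:
                break
    # a tuple of (N-th layer, all layers) when the full forward is requested
    return (x_n, x) if full_forward else x


layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(forward_n_layers(1, layers, 1))                     # 4
print(forward_n_layers(1, layers, 1, full_forward=True))  # (4, 1)
```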

get_batch_norm_layers()[source]

Return all batch normalization layers.

Return type:

Tuple[Module]

get_linear_layers()[source]

Return all linear layers.

Return type:

Tuple[Module]

n_features_after_layer(nth_layer)[source]

Get the output features of the Nth (0 index based) layer.

Parameters:

nth_layer – the layer to use for getting the output features

Return type:

int

property out_features: int

The number of features output from all layers of this module.

class zensols.deeplearn.layer.linear.DeepLinearNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)[source]

Bases: ActivationNetworkSettings, DropoutNetworkSettings, BatchNormNetworkSettings

Settings for a deep fully connected network using DeepLinear.

__init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, in_features, out_features, middle_features, proportions, repeats)
get_module_class_name()[source]

Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.

Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.

Return type:

str

in_features: int

The number of features to the first layer.

middle_features: Tuple[Any]

The number of features in the middle layers; if proportions is True, then each number is how much to grow or shrink as a percentage of the last layer, otherwise, it’s the number of features.

out_features: int

The number of features as output from the last layer.

proportions: bool

Whether or not to interpret middle_features as a proportion of the previous layer or use directly as the size of the middle layer.

repeats: int

The number of repeats of the middle_features configuration.
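How in_features, middle_features, proportions, repeats and out_features might combine into layer sizes is sketched below. Two details are assumptions that the settings documentation does not state: a proportion multiplies the previous layer size and is truncated to an integer, and each middle_features entry is repeated in turn. The function name linear_shapes is hypothetical.

```python
from typing import List, Sequence, Tuple, Union


def linear_shapes(in_features: int, out_features: int,
                  middle_features: Sequence[Union[int, float]],
                  proportions: bool = False,
                  repeats: int = 1) -> List[Tuple[int, int]]:
    """Return (in, out) feature sizes for each linear layer."""
    shapes = []
    last = in_features
    for mf in middle_features:
        for _ in range(repeats):   # assumption: repeat each entry in turn
            # assumption: proportions scale the previous layer size
            nxt = int(last * mf) if proportions else int(mf)
            shapes.append((last, nxt))
            last = nxt
    shapes.append((last, out_features))
    return shapes


print(linear_shapes(100, 10, (0.5,), proportions=True, repeats=2))
# [(100, 50), (50, 25), (25, 10)]
print(linear_shapes(100, 10, (64, 32)))
# [(100, 64), (64, 32), (32, 10)]
```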

zensols.deeplearn.layer.recur module

This file contains a convenience wrapper around RNN, GRU and LSTM modules in PyTorch.

class zensols.deeplearn.layer.recur.RecurrentAggregation(net_settings, sub_logger=None)[source]

Bases: BaseNetworkModule

A recurrent neural network model with an output aggregation. This includes RNNs, LSTMs and GRUs.

MODULE_NAME: ClassVar[str] = 'recur'

The module name used in the logging message. This is set in each inherited class.

__init__(net_settings, sub_logger=None)[source]

Initialize the recurrent layer.

deallocate()[source]

Deallocate all resources for this instance.

property out_features: int

The number of features output from all layers of this module.

class zensols.deeplearn.layer.recur.RecurrentAggregationNetworkSettings(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)[source]

Bases: DropoutNetworkSettings

Settings for a recurrent neural network. This configures a RecurrentAggregation layer.

__init__(name, config_factory, torch_config, dropout, network_type, aggregation, bidirectional, input_size, hidden_size, num_layers)
aggregation: str

A convenience operation to aggregate the parameters; this is one of:

  • max: return the max of the output states

  • ave: return the average of the output states

  • last: return the last output state

  • none: do not apply an aggregation function
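The aggregation options can be pictured on a sequence of output state vectors. In this sketch, plain lists stand in for tensors, and the assumption is that max and ave reduce over the sequence (time) dimension; the function name aggregate is hypothetical.

```python
from typing import List, Union

Vector = List[float]


def aggregate(states: List[Vector],
              aggregation: str) -> Union[Vector, List[Vector]]:
    """Aggregate output states; states holds one vector per time step."""
    if aggregation == 'none':
        return states
    if aggregation == 'last':
        return states[-1]
    columns = list(zip(*states))   # group each feature across time steps
    if aggregation == 'max':
        return [max(c) for c in columns]
    if aggregation == 'ave':
        return [sum(c) / len(c) for c in columns]
    raise ValueError(f'unknown aggregation: {aggregation}')


states = [[1.0, 4.0], [3.0, 2.0]]
print(aggregate(states, 'max'))   # [3.0, 4.0]
print(aggregate(states, 'ave'))   # [2.0, 3.0]
print(aggregate(states, 'last'))  # [3.0, 2.0]
```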

bidirectional: bool

Whether or not the network is bidirectional.

get_module_class_name()[source]

Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.

Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.

Return type:

str

hidden_size: int

The size of the hidden states of the network.

input_size: int

The input size to the network.

network_type: str

One of rnn, lstm or gru.

num_layers: int

The number of “stacked” layers.

zensols.deeplearn.layer.recurcrf module

Contains an implementation of a recurrent network with a conditional random field layer. This is usually configured as a BiLSTM CRF.

class zensols.deeplearn.layer.recurcrf.RecurrentCRF(net_settings, sub_logger=None, use_crf=True)[source]

Bases: BaseNetworkModule

Adapt the CRF module using the framework based BaseNetworkModule class. This provides methods forward_recur_decode() and decode(), which decodes the input.

This adds a recurrent neural network and a fully connected feed forward decoder layer before the CRF layer.

MODULE_NAME: ClassVar[str] = 'recur crf'

The module name used in the logging message. This is set in each inherited class.

__init__(net_settings, sub_logger=None, use_crf=True)[source]

Initialize the recurrent CRF layer.

deallocate()[source]

Deallocate all resources for this instance.

decode(x, mask)[source]

Forward the input through the recurrent network, decoder, and then the CRF.

Parameters:
  • x (Tensor) – the input

  • mask (Tensor) – the mask used to block the last N states not provided

Return type:

Tuple[Tensor, Tensor]

Returns:

the CRF sequence output and the score provided by the CRF’s Viterbi algorithm as a tuple

forward_recur_decode(x)[source]

Forward the input through the recurrent network (i.e. LSTM), batch normalization and activation (if configured), and decoder output.

Note: this layer forwards batch normalization, activation and drop out (for those configured) after the recurrent layer is forwarded. However, the subordinate recurrent layer can also be configured with a dropout when having more than one stacked layer.

Parameters:

x (Tensor) – the network input

Return type:

Tensor

Returns:

the fully connected linear feed forward decoded output

to(*args, **kwargs)[source]

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)[source]
to(dtype, non_blocking=False)[source]
to(tensor, non_blocking=False)[source]
to(memory_format=torch.channels_last)[source]

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Parameters:
  • device (torch.device) – the desired device of the parameters and buffers in this module

  • dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module

  • tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

  • memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns:

self

Return type:

Module

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
class zensols.deeplearn.layer.recurcrf.RecurrentCRFNetworkSettings(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)[source]

Bases: ActivationNetworkSettings, DropoutNetworkSettings, BatchNormNetworkSettings

Settings for a recurrent neural network using RecurrentCRF.

__init__(name, config_factory, torch_config, batch_norm_d, batch_norm_features, dropout, activation, network_type, bidirectional, input_size, hidden_size, num_layers, num_labels, decoder_settings, score_reduction)
bidirectional: bool

Whether or not the network is bidirectional (usually True).

decoder_settings: DeepLinearNetworkSettings

The decoder feed forward network.

get_module_class_name()[source]

Returns the fully qualified class name of the module to create by ModelManager. This module takes as the first parameter an instance of this class.

Important: This method is not used for nested modules. You must declare specific class names in the configuration for those nested class names.

Return type:

str

hidden_size: int

The size of the hidden states of the network.

input_size: int

The input size to the layer.

network_type: str

One of rnn, lstm or gru (usually lstm).

num_labels: int

The number of output labels from the CRF.

num_layers: int

The number of “stacked” layers.

score_reduction: str

How the score output is reduced over batches.

See:

CRF

to_recurrent_aggregation()[source]
Return type:

RecurrentAggregationNetworkSettings

Module contents

Provides neural network layer implementations, which are all subclasses of torch.nn.Module.

exception zensols.deeplearn.layer.LayerError[source]

Bases: ModelError

Thrown for all deep learning layer errors.
