Einsum Expressions
einconv.expressions.convNd_forward
Generates einsum expression of the forward pass of a convolution.
einsum_expression
einsum_expression(x: Tensor, weight: Union[Tensor, Parameter], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[int, str, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, groups: int = 1, simplify: bool = True) -> Tuple[str, List[Union[Tensor, Parameter]], Tuple[int, ...]]
Generate einsum expression of a convolution's forward pass.
Parameters:
- x (Tensor) – Convolution input. Has shape [batch_size, in_channels, *input_sizes] where len(input_sizes) == N.
- weight (Union[Tensor, Parameter]) – Kernel of the convolution. Has shape [out_channels, in_channels / groups, *kernel_size] where kernel_size is an N-tuple of kernel dimensions.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[int, str, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- groups (int, default: 1) – In how many groups to split the input channels.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Union[Tensor, Parameter]] – Einsum operands in order: un-grouped input, patterns, un-grouped weight.
- Tuple[int, ...] – Output shape: [batch_size, out_channels, *output_sizes].
Source code in einconv/expressions/convNd_forward.py
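To make the roles of the three operands (input, patterns, weight) concrete, here is a minimal pure-Python sketch for a 1d convolution with a single channel and no batch axis. The helper names `conv1d_output_size`, `index_pattern`, and `conv1d_via_pattern`, and the `[kernel, output, input]` layout of the pattern, are assumptions made for illustration; they are not taken from the library's source. In einsum notation the contraction below would read `i,koi,k->o`.

```python
def conv1d_output_size(input_size, kernel_size, stride=1, padding=0, dilation=1):
    """Standard output-size formula of a convolution along one dimension."""
    return (input_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def index_pattern(input_size, kernel_size, stride=1, padding=0, dilation=1):
    """Binary tensor P[k][o][i]: 1 if input i meets kernel entry k at output o."""
    output_size = conv1d_output_size(input_size, kernel_size, stride, padding, dilation)
    pattern = [[[0] * input_size for _ in range(output_size)] for _ in range(kernel_size)]
    for k in range(kernel_size):
        for o in range(output_size):
            i = o * stride + k * dilation - padding
            if 0 <= i < input_size:  # positions inside the zero-padding contribute nothing
                pattern[k][o][i] = 1
    return pattern


def conv1d_via_pattern(x, w, stride=1, padding=0, dilation=1):
    """Contract input, pattern, and kernel: out[o] = sum_{i,k} x[i] P[k][o][i] w[k]."""
    pattern = index_pattern(len(x), len(w), stride, padding, dilation)
    output_size = len(pattern[0])
    return [
        sum(x[i] * pattern[k][o][i] * w[k] for k in range(len(w)) for i in range(len(x)))
        for o in range(output_size)
    ]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.0, -1.0]
out = conv1d_via_pattern(x, w, stride=1, padding=1)
print(out)  # [-2.0, -2.0, -2.0, -2.0, 4.0]
```

All hyperparameters (stride, padding, dilation) enter only through the pattern, which is why the input and the kernel can be passed to the contraction unchanged.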
einconv.expressions.convNd_input_vjp
Generates einsum expression of the input VJP of a convolution.
einsum_expression
einsum_expression(weight: Union[Tensor, Parameter], v: Tensor, input_size: Union[int, Tuple[int, ...]], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[int, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, groups: int = 1, simplify: bool = True) -> Tuple[str, List[Union[Tensor, Parameter]], Tuple[int, ...]]
Generate einsum expression of a convolution's input VJP.
Parameters:
- weight (Union[Tensor, Parameter]) – Kernel of the convolution. Has shape [out_channels, in_channels / groups, *kernel_size] where kernel_size is an N-tuple of kernel dimensions.
- v (Tensor) – Vector multiplied by the Jacobian. Has shape [batch_size, out_channels, *output_sizes] where len(output_sizes) == N (same shape as the convolution's output).
- input_size (Union[int, Tuple[int, ...]]) – Spatial dimensions of the convolution's input. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[int, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- groups (int, default: 1) – In how many groups to split the input channels.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Union[Tensor, Parameter]] – Einsum operands in order: un-grouped vector, patterns, un-grouped weight.
- Tuple[int, ...] – Output shape: [batch_size, in_channels, *input_sizes].
Source code in einconv/expressions/convNd_input_vjp.py
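The input VJP multiplies a vector v of output shape onto the Jacobian of the output with respect to the input, which yields a result of input shape. The following pure-Python sketch spells this out for a 1d convolution with one channel and no batch axis. The helpers `conv1d_input_vjp` and `conv1d` are hypothetical names written for this illustration; they do not mirror the library's implementation.

```python
def conv1d_input_vjp(w, v, input_size, stride=1, padding=0, dilation=1):
    """Return g with g[i] = sum_o v[o] * d out[o] / d x[i] for a 1d convolution."""
    grad = [0.0] * input_size
    for o, v_o in enumerate(v):
        for k, w_k in enumerate(w):
            i = o * stride + k * dilation - padding
            if 0 <= i < input_size:  # skip positions that fall into the padding
                grad[i] += v_o * w_k
    return grad


def conv1d(x, w, stride=1, padding=0, dilation=1):
    """Reference forward pass (cross-correlation), used only for the check below."""
    output_size = (len(x) + 2 * padding - dilation * (len(w) - 1) - 1) // stride + 1
    out = [0.0] * output_size
    for o in range(output_size):
        for k, w_k in enumerate(w):
            i = o * stride + k * dilation - padding
            if 0 <= i < len(x):
                out[o] += x[i] * w_k
    return out


w = [1.0, 2.0, 3.0]
input_size, stride = 5, 2
v = [1.0, 10.0]  # the output has size (5 - 2 - 1) // 2 + 1 = 2
grad_x = conv1d_input_vjp(w, v, input_size, stride=stride)
print(grad_x)  # [1.0, 2.0, 13.0, 20.0, 30.0]

# Check against the explicit Jacobian: the convolution is linear in x, so
# column i of the Jacobian is the forward pass applied to the i-th unit vector.
for i in range(input_size):
    unit = [1.0 if j == i else 0.0 for j in range(input_size)]
    column = conv1d(unit, w, stride=stride)
    assert grad_x[i] == sum(v_o * c for v_o, c in zip(v, column))
```

The example also shows why `input_size` has to be passed explicitly: with kernel size 3 and stride 2, inputs of size 5 and 6 both produce an output of size 2, so the input size cannot be recovered from `v` and the hyperparameters alone.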
einconv.expressions.convNd_weight_vjp
Generates einsum expression of the weight VJP of a convolution.
einsum_expression
einsum_expression(x: Tensor, v: Tensor, kernel_size: Union[int, Tuple[int, ...]], dilation: Union[int, Tuple[int, ...]] = 1, padding: Union[int, Tuple[int, ...]] = 0, stride: Union[int, Tuple[int, ...]] = 1, groups: int = 1, simplify: bool = True) -> Tuple[str, List[Tensor], Tuple[int, ...]]
Generate einsum expression of a convolution's weight VJP.
Parameters:
- x (Tensor) – Convolution input. Has shape [batch_size, in_channels, *input_sizes] where len(input_sizes) == N.
- v (Tensor) – Vector multiplied by the Jacobian. Has shape [batch_size, out_channels, *output_sizes] where len(output_sizes) == N (same shape as the convolution's output).
- kernel_size (Union[int, Tuple[int, ...]]) – Kernel dimensions. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[int, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- groups (int, default: 1) – In how many groups to split the input channels.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Tensor] – Einsum operands in order: un-grouped input, patterns, un-grouped vector.
- Tuple[int, ...] – Output shape: [out_channels, in_channels // groups, *kernel_size].
Source code in einconv/expressions/convNd_weight_vjp.py
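The output shape stated above can be derived from the shapes of `x` and `v` together with `kernel_size` and `groups`. The sketch below shows that bookkeeping, including the expansion of a single integer into an N-tuple. `expand_to_tuple` and `weight_vjp_shape` are hypothetical helpers introduced for illustration, not functions of the library.

```python
from typing import Sequence, Tuple, Union


def expand_to_tuple(value: Union[int, Sequence[int]], N: int) -> Tuple[int, ...]:
    """Turn an integer into an N-tuple; validate the length of a given sequence."""
    if isinstance(value, int):
        return (value,) * N
    if len(value) != N:
        raise ValueError(f"Expected {N} entries, got {len(value)}.")
    return tuple(value)


def weight_vjp_shape(
    x_shape: Sequence[int],
    v_shape: Sequence[int],
    kernel_size: Union[int, Sequence[int]],
    groups: int = 1,
) -> Tuple[int, ...]:
    """Shape of the weight VJP: [out_channels, in_channels // groups, *kernel_size]."""
    N = len(x_shape) - 2  # number of spatial dimensions
    in_channels, out_channels = x_shape[1], v_shape[1]
    if in_channels % groups or out_channels % groups:
        raise ValueError("groups must divide in_channels and out_channels.")
    return (out_channels, in_channels // groups, *expand_to_tuple(kernel_size, N))


# 2d convolution: batch size 8, 6 input channels, 4 output channels, 2 groups.
shape = weight_vjp_shape((8, 6, 32, 32), (8, 4, 30, 30), kernel_size=3, groups=2)
print(shape)  # (4, 3, 3, 3)
```

Unlike the forward pass, the kernel itself is not an argument here, so its spatial extent cannot be read off a tensor and must be supplied through `kernel_size`.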
einconv.expressions.convNd_unfold
einsum_expression
einsum_expression(x: Tensor, kernel_size: Union[int, Tuple[int, ...]], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[str, int, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, simplify: bool = True) -> Tuple[str, List[Tensor], Tuple[int, ...]]
Generate einsum expression to unfold the input of a convolution.
Parameters:
- x (Tensor) – Convolution input. Has shape [batch_size, in_channels, *input_sizes] where len(input_sizes) == N.
- kernel_size (Union[int, Tuple[int, ...]]) – Kernel dimensions. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[str, int, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Tensor] – Einsum operands in order: input, patterns.
- Tuple[int, ...] – Output shape: [batch_size, in_channels, tot_output_size].
Source code in einconv/expressions/convNd_unfold.py
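Unfolding (also known as im2col) collects, for every output position, the input entries that the kernel overlaps with. A minimal pure-Python sketch for a single channel in 1d is given below; the helper `unfold1d` and the `[kernel position][output position]` layout are assumptions chosen for readability and may differ from the layout the library produces.

```python
def unfold1d(x, kernel_size, stride=1, padding=0, dilation=1):
    """Return U with U[k][o] = input entry under kernel position k at output o."""
    output_size = (len(x) + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
    unfolded = [[0.0] * output_size for _ in range(kernel_size)]
    for k in range(kernel_size):
        for o in range(output_size):
            i = o * stride + k * dilation - padding
            if 0 <= i < len(x):  # entries inside the zero-padding stay 0
                unfolded[k][o] = x[i]
    return unfolded


x = [1.0, 2.0, 3.0, 4.0, 5.0]
unfolded = unfold1d(x, kernel_size=3, stride=2)
print(unfolded)  # [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]

# With the unfolded input, the convolution reduces to a matrix-vector product.
w = [1.0, 2.0, 3.0]
out = [sum(w[k] * unfolded[k][o] for k in range(len(w))) for o in range(len(unfolded[0]))]
print(out)  # [14.0, 26.0]
```

Here each column of `unfolded` is one patch, and the number of columns is the total output size (2 in this example).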
einconv.expressions.convNd_kfc
Input-based factor of the KFC Fisher approximation for convolutions.
KFC was introduced by:
- Grosse, R., & Martens, J. (2016). A Kronecker-factored approximate Fisher matrix for convolution layers. International Conference on Machine Learning (ICML).
einsum_expression
einsum_expression(x: Tensor, kernel_size: Union[int, Tuple[int, ...]], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[int, str, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, groups: int = 1, simplify: bool = True) -> Tuple[str, List[Tensor], Tuple[int, ...]]
Generate einsum expression of input-based KFC factor for convolution.
Parameters:
- x (Tensor) – Convolution input. Has shape [batch_size, in_channels, *input_sizes] where len(input_sizes) == N.
- kernel_size (Union[int, Tuple[int, ...]]) – Kernel dimensions. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[int, str, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- groups (int, default: 1) – In how many groups to split the input channels.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Tensor] – Einsum operands in order: un-grouped input, patterns, un-grouped input, patterns, normalization scaling.
- Tuple[int, ...] – Output shape: [groups, in_channels // groups * tot_kernel_sizes, in_channels // groups * tot_kernel_sizes].
Source code in einconv/expressions/convNd_kfc.py
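As a rough illustration of what the input-based KFC factor computes, the sketch below forms it from already-unfolded inputs in pure Python: for each sample, the unfolded input U (rows: channel and kernel positions, columns: output positions) contributes U Uᵀ, that is, a sum of outer products over spatial positions, and the result is averaged over the batch. The helper `kfc_factor` is a hypothetical name, a single group is assumed, and the 1 / batch_size normalization is an assumption based on the definition in Grosse & Martens (2016); the library's scaling operand is not confirmed here.

```python
def kfc_factor(unfolded_batch):
    """Input-based KFC factor for one group.

    unfolded_batch[b][d][t] is the unfolded input of sample b, where d runs over
    channel-and-kernel positions and t over output (spatial) positions.
    Returns the D x D matrix (1 / B) * sum_b U_b U_b^T.
    """
    batch_size = len(unfolded_batch)
    D = len(unfolded_batch[0])
    factor = [[0.0] * D for _ in range(D)]
    for U in unfolded_batch:
        for d1 in range(D):
            for d2 in range(D):
                # Sum over spatial positions, average over the batch (assumed scaling).
                factor[d1][d2] += sum(a * b for a, b in zip(U[d1], U[d2])) / batch_size
    return factor


unfolded_batch = [
    [[1.0, 3.0], [2.0, 4.0]],  # sample 0: D = 2 rows, 2 output positions
    [[0.0, 2.0], [2.0, 0.0]],  # sample 1
]
factor = kfc_factor(unfolded_batch)
print(factor)  # [[7.0, 7.0], [7.0, 12.0]]
```

The factor is symmetric, and its side length D corresponds to in_channels // groups * tot_kernel_sizes in the output shape above.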
einconv.expressions.convNd_kfac_reduce
Input-based factor of the K-FAC reduce approximation for convolutions.
KFAC-reduce was introduced by:
- Eschenhagen, R. (2022). Kronecker-factored approximate curvature for linear weight-sharing layers. Master's thesis.
einsum_expression
einsum_expression(x: Tensor, kernel_size: Union[int, Tuple[int, ...]], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[int, str, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, groups: int = 1, simplify: bool = True) -> Tuple[str, List[Tensor], Tuple[int, ...]]
Generate einsum expression of input-based KFAC-reduce factor for convolution.
Parameters:
- x (Tensor) – Convolution input. Has shape [batch_size, in_channels, *input_sizes] where len(input_sizes) == N.
- kernel_size (Union[int, Tuple[int, ...]]) – Kernel dimensions. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- stride (Union[int, Tuple[int, ...]], default: 1) – Stride of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- padding (Union[int, str, Tuple[int, ...]], default: 0) – Padding of the convolution. Can be a single integer (shared along all spatial dimensions), an N-tuple of integers, or a string. Allowed strings are 'same' and 'valid'.
- dilation (Union[int, Tuple[int, ...]], default: 1) – Dilation of the convolution. Can be a single integer (shared along all spatial dimensions), or an N-tuple of integers.
- groups (int, default: 1) – In how many groups to split the input channels.
- simplify (bool, default: True) – Whether to simplify the einsum expression.
Returns:
- str – Einsum equation.
- List[Tensor] – Einsum operands in order: un-grouped input, patterns, un-grouped input, patterns, normalization scaling.
- Tuple[int, ...] – Output shape: [groups, in_channels // groups * tot_kernel_sizes, in_channels // groups * tot_kernel_sizes].
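KFAC-reduce differs from KFC in the order of reduction: the unfolded input is first averaged over output (spatial) positions, and only then is the outer product formed and averaged over the batch. The pure-Python sketch below illustrates this for one group, starting from already-unfolded inputs. The helper `kfac_reduce_factor` is a hypothetical name, and the overall scaling 1 / (batch_size * tot_output_size²), which follows from averaging over positions before taking the outer product, is an assumption about what the normalization operand represents.

```python
def kfac_reduce_factor(unfolded_batch):
    """Input-based KFAC-reduce factor for one group.

    unfolded_batch[b][d][t] is the unfolded input of sample b, where d runs over
    channel-and-kernel positions and t over output (spatial) positions.
    Returns (1 / B) * sum_b mean_t(U_b) mean_t(U_b)^T as a D x D matrix.
    """
    batch_size = len(unfolded_batch)
    D = len(unfolded_batch[0])
    tot_output_size = len(unfolded_batch[0][0])
    scale = 1.0 / (batch_size * tot_output_size**2)  # assumed normalization
    factor = [[0.0] * D for _ in range(D)]
    for U in unfolded_batch:
        row_sums = [sum(row) for row in U]  # reduce over spatial positions first
        for d1 in range(D):
            for d2 in range(D):
                factor[d1][d2] += scale * row_sums[d1] * row_sums[d2]
    return factor


unfolded_batch = [
    [[1.0, 3.0], [2.0, 4.0]],  # sample 0: D = 2 rows, 2 output positions
    [[0.0, 2.0], [2.0, 0.0]],  # sample 1
]
factor = kfac_reduce_factor(unfolded_batch)
print(factor)  # [[2.5, 3.5], [3.5, 5.0]]
```

Because the spatial reduction happens before the outer product, each sample contributes a rank-one term, which is cheaper to form than the per-position outer products used by KFC.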