torch.nn.ConvTranspose2d

[An earlier post on transposed convolution]

torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', device=None, dtype=None)

Applies a 2D transposed convolution to an input image composed of several input planes.
It can be viewed as the gradient of Conv2d with respect to its input.
It is commonly used when implementing a Deep Convolutional GAN (DCGAN).
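The "gradient of Conv2d" view can be checked numerically. The sketch below is only an illustration: the tensor sizes, the seed, and the variable names are assumptions chosen for the demo. It compares the input gradient that autograd computes for a strided convolution against a transposed convolution that reuses the same weight.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
x = torch.randn(1, 3, 8, 8, requires_grad=True)  # input of the forward convolution
w = torch.randn(5, 3, 3, 3)                      # Conv2d-style weight: (out, in, kH, kW)

# Forward convolution (8x8 -> 4x4) and an arbitrary upstream gradient for its output.
y = F.conv2d(x, w, stride=2, padding=1)
grad_y = torch.randn_like(y)

# Gradient of the convolution with respect to its input, computed by autograd.
(grad_x,) = torch.autograd.grad(y, x, grad_y)

# The same quantity computed as a transposed convolution with the same weight.
# output_padding=1 is needed so that the 4x4 map goes back to 8x8 rather than 7x7.
grad_x_via_transpose = F.conv_transpose2d(
    grad_y, w, stride=2, padding=1, output_padding=1
)

matches = torch.allclose(grad_x, grad_x_via_transpose, atol=1e-4)
print(grad_x.shape, matches)
```

Note that the transposed convolution reads the weight as (in_channels, out_channels, kH, kW), which is why the Conv2d weight can be passed unchanged.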
stride controls the stride for the cross-correlation.
padding controls the amount of implicit zero padding on both sides for dilation * (kernel_size - 1) - padding number of points.
output_padding controls the additional size added to one side of the output shape.
dilation controls the spacing between the kernel points; also known as the à trous algorithm.
groups controls the connections between inputs and outputs.
in_channels and out_channels must both be divisible by groups. For example,
◦ At groups=1, all inputs are convolved to all outputs.
◦ At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels and producing half the output channels, and both subsequently concatenated.
◦ At groups=in_channels, each input channel is convolved with its own set of filters (of size $\frac{\text{out\_channels}}{\text{in\_channels}}$).
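One way to see the effect of groups is to inspect the layer's weight tensor. The following is a minimal sketch; the channel counts (4 and 8) are assumptions picked so that both are divisible by every groups value tried.

```python
import torch.nn as nn

# Hypothetical channel counts, divisible by each groups value below.
in_channels, out_channels = 4, 8

weight_shapes = {}
for groups in (1, 2, in_channels):
    layer = nn.ConvTranspose2d(in_channels, out_channels, kernel_size=3, groups=groups)
    # ConvTranspose2d stores its weight as (in_channels, out_channels // groups, kH, kW).
    weight_shapes[groups] = tuple(layer.weight.shape)
    print(f"groups={groups}: weight shape {weight_shapes[groups]}")
```

As groups grows, each input channel connects to fewer output channels, so the second weight dimension shrinks from 8 to 4 to 2.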
The parameters kernel_size, stride, padding, output_padding can either be:
a single int – in which case the same value is used for the height and width dimensions
a tuple of two ints – in which case the first int is used for the height dimension and the second int for the width dimension
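The two forms can be compared side by side. In this sketch the input size and layer settings are assumptions made up for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 10, 10)  # hypothetical input: N=1, C=3, H=W=10

# A single int is applied to both the height and the width dimension.
square = nn.ConvTranspose2d(3, 6, kernel_size=3, stride=2)

# A tuple gives separate (height, width) values.
rect = nn.ConvTranspose2d(3, 6, kernel_size=(3, 5), stride=(2, 1), padding=(1, 2))

square_shape = tuple(square(x).shape)
rect_shape = tuple(rect(x).shape)
print(square.kernel_size, square_shape)  # (3, 3) (1, 6, 21, 21)
print(rect.kernel_size, rect_shape)      # (3, 5) (1, 6, 19, 10)
```

The int form is stored internally as a pair, so `square.kernel_size` reads back as `(3, 3)`.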

parameters

in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
kernel_size (int or tuple) – Size of the convolving kernel
stride (int or tuple, optional) – Stride of the convolution. Default: 1
padding (int or tuple, optional) – dilation * (kernel_size - 1) - padding zero-padding will be added to both sides of each dimension in the input. Default: 0
output_padding (int or tuple, optional) – Additional size added to one side of each dimension in the output shape. Default: 0
groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
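The role of output_padding is easiest to see with stride > 1, where a Conv2d can map several input sizes to the same output size. The sketch below uses assumed single-channel 7x7 and 8x8 inputs to show that ambiguity, and how output_padding selects which size the transposed convolution produces.

```python
import torch
import torch.nn as nn

# With stride 2, a 7x7 and an 8x8 input both shrink to 4x4.
conv = nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)
down7 = tuple(conv(torch.randn(1, 1, 7, 7)).shape)
down8 = tuple(conv(torch.randn(1, 1, 8, 8)).shape)
print(down7, down8)  # (1, 1, 4, 4) (1, 1, 4, 4)

# Going back up from 4x4, output_padding chooses between the two candidates.
small = torch.randn(1, 1, 4, 4)
up_default = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1)
up_padded = nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, padding=1, output_padding=1)

up7 = tuple(up_default(small).shape)
up8 = tuple(up_padded(small).shape)
print(up7, up8)  # (1, 1, 7, 7) (1, 1, 8, 8)
```

The extra row and column are added on one side only, which matches the description of output_padding above.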

shape

input: $(N, C_{in}, H_{in}, W_{in})$
output: $(N, C_{out}, H_{out}, W_{out})$, where
$$H_{out} = (H_{in} - 1) \times \text{stride}[0] - 2 \times \text{padding}[0] + \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) + \text{output\_padding}[0] + 1$$
$$W_{out} = (W_{in} - 1) \times \text{stride}[1] - 2 \times \text{padding}[1] + \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) + \text{output\_padding}[1] + 1$$
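The output-size formula can be written as a small pure-Python helper for quick sanity checks. The function name `conv_transpose2d_output_size` is hypothetical (it is not part of PyTorch), and this sketch assumes every argument is given as a (height, width) tuple.

```python
def conv_transpose2d_output_size(h_in, w_in, kernel_size, stride=(1, 1),
                                 padding=(0, 0), output_padding=(0, 0),
                                 dilation=(1, 1)):
    """Return (H_out, W_out) for a 2D transposed convolution.

    Hypothetical helper, not a PyTorch API; all arguments are (height, width) tuples.
    """
    def one_dim(size, i):
        # (in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1
        return ((size - 1) * stride[i]
                - 2 * padding[i]
                + dilation[i] * (kernel_size[i] - 1)
                + output_padding[i]
                + 1)

    return one_dim(h_in, 0), one_dim(w_in, 1)


# kernel 4, stride 2, padding 1 (a setting often used in DCGAN generators)
# doubles each spatial dimension.
doubled = conv_transpose2d_output_size(16, 16, (4, 4), stride=(2, 2), padding=(1, 1))
print(doubled)  # (32, 32)

# Non-square settings: height and width are computed independently.
mixed = conv_transpose2d_output_size(50, 100, (3, 5), stride=(2, 1), padding=(4, 2))
print(mixed)  # (93, 100)
```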