Keras Convolutional Layers

Conv1D

Conv1D is a one-dimensional convolution layer (e.g., temporal convolution). It creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.

When we use Conv1D as the first layer in a model, we provide an input_shape argument, which is a tuple of integers or None. It does not include the batch axis.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of a single integer that specifies the length of the 1D convolution window.
  • strides: An integer or a tuple/list of a single integer that specifies the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid", "causal" or "same". "valid" means no padding. "same" pads the input so that the output has the same length as the original input. "causal" results in causal (dilated) convolutions, i.e., output[t] does not depend on input[t + 1:]; zero-padding is used so that the output has the same length as the original input. Causal padding is useful when modeling temporal data, where the model must not violate the temporal order.
  • data_format: A string, one of "channels_last" or "channels_first", that gives the ordering of the input dimensions. "channels_last" corresponds to inputs with shape (batch, steps, channels), which is the default format for temporal data in Keras, while "channels_first" corresponds to inputs with shape (batch, channels, steps).
  • dilation_rate: An integer or a tuple/list of a single integer that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • kernel_initializer: The initializer for the kernel weights matrix.
  • bias_initializer: The initializer for the bias vector.
  • kernel_regularizer: A regularizer function applied to the kernel weights matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: A constraint function applied to the kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

It refers to a 3D tensor of shape (batch, steps, channels).

Output shape

The output is a 3D tensor of shape (batch, new_steps, filters). The value of new_steps might differ from steps due to strides and padding.
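How new_steps follows from the arguments above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper name conv1d_output_length is hypothetical (it is not part of Keras), and the "same"/"causal" rule assumes the TensorFlow-style convention in which the output length is the input length divided by the stride, rounded up.

```python
import math


def conv1d_output_length(steps, kernel_size, strides=1, padding="valid", dilation_rate=1):
    """Hypothetical helper: estimate new_steps for a 1D convolution."""
    if strides != 1 and dilation_rate != 1:
        raise ValueError("strides != 1 is incompatible with dilation_rate != 1")
    # A dilated kernel covers more input steps than its nominal length.
    effective_kernel = dilation_rate * (kernel_size - 1) + 1
    if padding == "valid":
        # No padding: only positions where the whole kernel fits are kept.
        return (steps - effective_kernel) // strides + 1
    if padding in ("same", "causal"):
        # Assumed convention: zero-padding keeps the length at ceil(steps / strides).
        return math.ceil(steps / strides)
    raise ValueError(f"unknown padding: {padding!r}")


print(conv1d_output_length(10, 3))                                      # 8
print(conv1d_output_length(10, 3, strides=2, padding="same"))           # 5
print(conv1d_output_length(10, 3, padding="causal", dilation_rate=2))   # 10
```

With a stride of 1, both "same" and "causal" keep the sequence length unchanged; they differ only in where the zeros are added.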

Conv2D

Conv2D is a two-dimensional convolution layer (e.g., spatial convolution over images). It creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well. The layer can be used as the first layer in a model by passing the input_shape keyword argument, which is a tuple of integers that does not include the batch axis.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of 2 integers that specifies the height and width of the 2D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 2 integers that specifies the strides of the convolution along the height and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • data_format: A string, one of "channels_last" or "channels_first", that gives the ordering of the input dimensions. "channels_last" corresponds to inputs with shape (batch, height, width, channels), and "channels_first" corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 2 integers that specifies the dilation rate to use for dilated convolution. A single integer specifies the same value for all spatial dimensions. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • kernel_initializer: The initializer for the kernel weights matrix.
  • bias_initializer: The initializer for the bias vector.
  • kernel_regularizer: A regularizer function applied to the kernel weights matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: A constraint function applied to the kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 4D tensor of shape (batch, channels, rows, cols); if the data_format is "channels_last", the input is a 4D tensor of shape (batch, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 4D tensor of shape (batch, filters, new_rows, new_cols); if the data_format is "channels_last", the output is a 4D tensor of shape (batch, new_rows, new_cols, filters). The values of rows and cols might change due to the effect of padding.
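The relationship between data_format and the output shape can be sketched as follows. The function conv2d_output_shape and its helpers are hypothetical names introduced only for illustration, and the "same" rule again assumes the TensorFlow-style convention (output size equals input size divided by stride, rounded up).

```python
import math


def _as_pair(value):
    """Expand a single integer to (value, value); pass tuples/lists through."""
    return (value, value) if isinstance(value, int) else tuple(value)


def _output_length(size, kernel, stride, padding, dilation):
    effective_kernel = dilation * (kernel - 1) + 1
    if padding == "valid":
        return (size - effective_kernel) // stride + 1
    if padding == "same":
        return math.ceil(size / stride)  # assumed TensorFlow-style convention
    raise ValueError(f"unknown padding: {padding!r}")


def conv2d_output_shape(input_shape, filters, kernel_size, strides=1,
                        padding="valid", data_format="channels_last", dilation_rate=1):
    """Hypothetical helper: estimate the 4D output shape of a 2D convolution."""
    kernel, stride, dilation = _as_pair(kernel_size), _as_pair(strides), _as_pair(dilation_rate)
    if data_format == "channels_last":
        batch, rows, cols, _ = input_shape
    elif data_format == "channels_first":
        batch, _, rows, cols = input_shape
    else:
        raise ValueError(f"unknown data_format: {data_format!r}")
    new_rows = _output_length(rows, kernel[0], stride[0], padding, dilation[0])
    new_cols = _output_length(cols, kernel[1], stride[1], padding, dilation[1])
    if data_format == "channels_last":
        return (batch, new_rows, new_cols, filters)
    return (batch, filters, new_rows, new_cols)


print(conv2d_output_shape((None, 28, 28, 3), filters=32, kernel_size=3))
# (None, 26, 26, 32)
```

Note that the channel axis of the input never appears in the result: it is replaced by filters, whichever position it occupies.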

SeparableConv1D

SeparableConv1D is a depthwise separable 1D convolution. It first performs a depthwise spatial convolution, which acts on each input channel separately, and then a pointwise convolution that mixes the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step.

Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of a single integer that specifies the length of the 1D convolution window.
  • strides: An integer or a tuple/list of a single integer that specifies the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, steps), and "channels_last" corresponds to inputs with shape (batch, steps, channels).
  • dilation_rate: An integer or a tuple/list of a single integer that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • depth_multiplier: The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels is equal to filters_in * depth_multiplier.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • depthwise_initializer: The initializer for the depthwise kernel matrix.
  • pointwise_initializer: The initializer for the pointwise kernel matrix.
  • bias_initializer: The initializer for the bias vector.
  • depthwise_regularizer: A regularizer function applied to the depthwise kernel matrix.
  • pointwise_regularizer: A regularizer function applied to the pointwise kernel matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • depthwise_constraint: A constraint function applied to the depthwise kernel matrix.
  • pointwise_constraint: A constraint function applied to the pointwise kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 3D tensor of shape (batch, channels, steps); if the data_format is "channels_last", the input is a 3D tensor of shape (batch, steps, channels).

Output shape

If the data_format is "channels_first", the output is a 3D tensor of shape (batch, filters, new_steps); if the data_format is "channels_last", the output is a 3D tensor of shape (batch, new_steps, filters). The value of new_steps may differ from steps due to padding or strides.
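To make the two steps concrete, here is a minimal pure-Python sketch of a depthwise separable 1D convolution on a single sample. It is not the Keras implementation: the function names and the nested-list kernel layout are assumptions chosen for readability, and it only covers "valid" padding, a stride of 1, the "channels_last" layout, and no bias or activation.

```python
def depthwise_conv1d(x, depthwise_kernel):
    """Filter each input channel separately.

    x[t][c] is the value of channel c at step t (channels_last).
    depthwise_kernel[k][c][m] is the weight for kernel tap k, input channel c
    and multiplier index m (an assumed layout for this sketch).
    """
    kernel_size = len(depthwise_kernel)
    channels = len(x[0])
    multiplier = len(depthwise_kernel[0][0])
    out = []
    for t in range(len(x) - kernel_size + 1):
        row = []
        for c in range(channels):
            for m in range(multiplier):
                row.append(sum(x[t + k][c] * depthwise_kernel[k][c][m]
                               for k in range(kernel_size)))
        out.append(row)  # channels * depth_multiplier values per step
    return out


def pointwise_conv1d(x, pointwise_kernel):
    """Mix channels at each step; pointwise_kernel[i][f] maps channel i to filter f."""
    filters = len(pointwise_kernel[0])
    return [[sum(row[i] * pointwise_kernel[i][f] for i in range(len(row)))
             for f in range(filters)]
            for row in x]


def separable_conv1d(x, depthwise_kernel, pointwise_kernel):
    return pointwise_conv1d(depthwise_conv1d(x, depthwise_kernel), pointwise_kernel)


x = [[1, 10], [2, 20], [3, 30], [4, 40]]       # 4 steps, 2 channels
depthwise = [[[1], [1]], [[1], [-1]]]          # kernel_size=2, depth_multiplier=1
pointwise = [[1, 0, 1], [0, 1, 1]]             # 2 intermediate channels -> 3 filters
print(separable_conv1d(x, depthwise, pointwise))
# [[3, -10, -7], [5, -10, -5], [7, -10, -3]]
```

The depthwise step never combines values from different channels; all cross-channel mixing happens in the pointwise step, which is what makes the factorization cheaper than a full convolution.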

SeparableConv2D

SeparableConv2D is a depthwise separable 2D convolution. It first performs a depthwise spatial convolution, which acts on each input channel separately, and then a pointwise convolution that mixes the resulting output channels. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step.

Intuitively, separable convolutions can be understood as a way to factorize a convolution kernel into two smaller kernels, or as an extreme version of an Inception block.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of 2 integers that specifies the height and width of the 2D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 2 integers that specifies the strides of the convolution along the height and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, height, width), and "channels_last" corresponds to inputs with shape (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 2 integers that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • depth_multiplier: The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels is equal to filters_in * depth_multiplier.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • depthwise_initializer: The initializer for the depthwise kernel matrix.
  • pointwise_initializer: The initializer for the pointwise kernel matrix.
  • bias_initializer: The initializer for the bias vector.
  • depthwise_regularizer: A regularizer function applied to the depthwise kernel matrix.
  • pointwise_regularizer: A regularizer function applied to the pointwise kernel matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • depthwise_constraint: A constraint function applied to the depthwise kernel matrix.
  • pointwise_constraint: A constraint function applied to the pointwise kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 4D tensor of shape (batch, channels, rows, cols); if the data_format is "channels_last", the input is a 4D tensor of shape (batch, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 4D tensor of shape (batch, filters, new_rows, new_cols); if the data_format is "channels_last", the output is a 4D tensor of shape (batch, new_rows, new_cols, filters). The values of rows and cols may change due to padding or strides.
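One practical consequence of factorizing the kernel is a much smaller number of trainable weights. The sketch below compares a regular 2D convolution with a separable one. The function names are hypothetical, and the separable count assumes a single bias vector of length filters that is added after the pointwise step.

```python
def conv2d_param_count(kernel_size, in_channels, filters, use_bias=True):
    """Weights of a regular 2D convolution: one full kernel per output filter."""
    kh, kw = kernel_size
    return kh * kw * in_channels * filters + (filters if use_bias else 0)


def separable_conv2d_param_count(kernel_size, in_channels, filters,
                                 depth_multiplier=1, use_bias=True):
    """Weights of a depthwise step followed by a 1x1 pointwise step."""
    kh, kw = kernel_size
    depthwise = kh * kw * in_channels * depth_multiplier
    pointwise = in_channels * depth_multiplier * filters
    return depthwise + pointwise + (filters if use_bias else 0)


print(conv2d_param_count((3, 3), 64, 128))            # 73856
print(separable_conv2d_param_count((3, 3), 64, 128))  # 8896
```

For this 3x3 kernel with 64 input channels and 128 filters, the separable layer needs roughly an eighth of the weights of the regular layer.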

DepthwiseConv2D

DepthwiseConv2D is a depthwise 2D convolution layer. It performs only the first step of a depthwise separable convolution: a depthwise spatial convolution that acts on each input channel separately. The depth_multiplier argument controls how many output channels are generated per input channel in the depthwise step.

Arguments

  • kernel_size: An integer or a tuple/list of 2 integers that specifies the height and width of the 2D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 2 integers that specifies the strides of the convolution along the height and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, height, width), and "channels_last" corresponds to inputs with shape (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 2 integers that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • depth_multiplier: The number of depthwise convolution output channels for each input channel. The total number of depthwise convolution output channels is equal to filters_in * depth_multiplier.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • depthwise_initializer: The initializer for the depthwise kernel matrix.
  • bias_initializer: The initializer for the bias vector.
  • depthwise_regularizer: A regularizer function applied to the depthwise kernel matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • depthwise_constraint: A constraint function applied to the depthwise kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 4D tensor of shape (batch, channels, rows, cols); if the data_format is "channels_last", the input is a 4D tensor of shape (batch, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 4D tensor of shape (batch, channels * depth_multiplier, new_rows, new_cols); if the data_format is "channels_last", the output is a 4D tensor of shape (batch, new_rows, new_cols, channels * depth_multiplier). The values of rows and cols may change due to padding or strides.
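Because there is no filters argument, the number of output channels is fixed by channels * depth_multiplier. The following sketch summarizes the channel bookkeeping. The function name is hypothetical, and the ordering of the output channels (all multiplier copies of input channel 0 first, then channel 1, and so on) is an assumption that mirrors how TensorFlow's depthwise convolution is commonly described.

```python
def depthwise_conv2d_summary(channels, kernel_size, depth_multiplier=1, use_bias=True):
    """Hypothetical helper: channel and weight bookkeeping for a depthwise 2D convolution."""
    kh, kw = kernel_size
    out_channels = channels * depth_multiplier
    return {
        "out_channels": out_channels,
        # One kh x kw filter per (input channel, multiplier index) pair.
        "kernel_params": kh * kw * channels * depth_multiplier,
        "bias_params": out_channels if use_bias else 0,
        # Assumed ordering: output channel c * depth_multiplier + m reads input channel c.
        "source_channel": [c for c in range(channels) for _ in range(depth_multiplier)],
    }


summary = depthwise_conv2d_summary(channels=3, kernel_size=(3, 3), depth_multiplier=2)
print(summary["out_channels"])     # 6
print(summary["source_channel"])   # [0, 0, 1, 1, 2, 2]
```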

Conv2DTranspose

Conv2DTranspose is a transposed convolution layer, which is sometimes (incorrectly) called deconvolution; it does not actually perform a deconvolution. A transposed convolution is needed when the transformation has to go in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input.

The layer can be used as the first layer in a model by passing the input_shape argument, which is a tuple of integers that does not include the batch axis.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of 2 integers that specifies the height and width of the 2D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 2 integers that specifies the strides of the convolution along the height and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • output_padding: An integer or a tuple/list of 2 integers that specifies the amount of padding along the height and width of the output tensor. A single integer specifies the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. By default, it is set to None, which means that the output shape is inferred.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, height, width), and "channels_last" corresponds to inputs with shape (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 2 integers that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • kernel_initializer: The initializer for the kernel weights matrix.
  • bias_initializer: The initializer for the bias vector.
  • kernel_regularizer: A regularizer function applied to the kernel weights matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: A constraint function applied to the kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 4D tensor of shape (batch, channels, rows, cols); if the data_format is "channels_last", the input is a 4D tensor of shape (batch, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 4D tensor of shape (batch, filters, new_rows, new_cols); if the data_format is "channels_last", the output is a 4D tensor of shape (batch, new_rows, new_cols, filters). The values of rows and cols may change due to padding. If output_padding is specified, it is added to the computed size along each spatial dimension.
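The size of each spatial dimension of the output can be sketched as below. This is an illustration rather than the library's code: the function name is hypothetical, dilation is ignored, and the rules are modeled on the length calculation that Keras is generally documented to use for transposed convolutions, so treat the exact formulas as an assumption to verify against your Keras version.

```python
def conv_transpose_output_length(input_length, kernel_size, stride,
                                 padding="valid", output_padding=None):
    """Hypothetical helper: output size along one dimension of a transposed convolution."""
    if padding not in ("valid", "same"):
        raise ValueError(f"unknown padding: {padding!r}")
    if output_padding is None:
        # Output shape is inferred from stride, kernel size and padding mode.
        if padding == "valid":
            return input_length * stride + max(kernel_size - stride, 0)
        return input_length * stride
    if output_padding >= stride:
        raise ValueError("output_padding must be lower than the stride")
    pad = kernel_size // 2 if padding == "same" else 0
    return (input_length - 1) * stride + kernel_size - 2 * pad + output_padding


print(conv_transpose_output_length(4, kernel_size=3, stride=2))                    # 9
print(conv_transpose_output_length(4, kernel_size=3, stride=2, padding="same"))    # 8
print(conv_transpose_output_length(4, kernel_size=3, stride=2, output_padding=1))  # 10
```

Applying the function once to rows and once to cols gives new_rows and new_cols.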

Conv3D

Conv3D is a 3D convolution layer (e.g., spatial convolution over volumes). It creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.

The layer can be used as the first layer in a model by passing the input_shape keyword argument, which is a tuple of integers that does not include the batch axis.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of 3 integers that specifies the depth, height, and width of the 3D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 3 integers that specifies the strides of the convolution along the depth, height, and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3), and "channels_last" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 3 integers that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • kernel_initializer: The initializer for the kernel weights matrix.
  • bias_initializer: The initializer for the bias vector.
  • kernel_regularizer: A regularizer function applied to the kernel weights matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: A constraint function applied to the kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 5D tensor of shape (batch, channels, conv_dim1, conv_dim2, conv_dim3); if the data_format is "channels_last", the input is a 5D tensor of shape (batch, conv_dim1, conv_dim2, conv_dim3, channels).

Output shape

If the data_format is "channels_first", the output is a 5D tensor of shape (batch, filters, new_conv_dim1, new_conv_dim2, new_conv_dim3); if the data_format is "channels_last", the output is a 5D tensor of shape (batch, new_conv_dim1, new_conv_dim2, new_conv_dim3, filters). The values of new_conv_dim1, new_conv_dim2 and new_conv_dim3 may differ from the input dimensions due to padding.

Conv3DTranspose

Conv3DTranspose is a transposed convolution layer, which is sometimes also called deconvolution. It is needed when the transformation has to go in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input.

The layer can be used as the first layer in a model by passing the input_shape argument, which is a tuple of integers that does not include the batch axis.

Arguments

  • filters: An integer that gives the dimensionality of the output space, i.e., the number of output filters in the convolution.
  • kernel_size: An integer or a tuple/list of 3 integers that specifies the depth, height, and width of the 3D convolution window. A single integer specifies the same value for all spatial dimensions.
  • strides: An integer or a tuple/list of 3 integers that specifies the strides of the convolution along the depth, height, and width. A single integer specifies the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
  • padding: One of "valid" or "same". Note that "same" is slightly inconsistent across backends when strides != 1.
  • output_padding: An integer or a tuple/list of 3 integers that specifies the amount of padding along the depth, height, and width of the output tensor. A single integer specifies the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. By default, it is set to None, which means that the output shape is inferred.
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, depth, height, width), and "channels_last" corresponds to inputs with shape (batch, depth, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
  • dilation_rate: An integer or a tuple/list of 3 integers that specifies the dilation rate to use for dilated convolution. Specifying any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
  • activation: The activation function to use. If nothing is specified, no activation is applied (i.e., linear activation a(x) = x).
  • use_bias: A Boolean that indicates whether the layer uses a bias vector.
  • kernel_initializer: The initializer for the kernel weights matrix.
  • bias_initializer: The initializer for the bias vector.
  • kernel_regularizer: A regularizer function applied to the kernel weights matrix.
  • bias_regularizer: A regularizer function applied to the bias vector.
  • activity_regularizer: A regularizer function applied to the output of the layer (its "activation").
  • kernel_constraint: A constraint function applied to the kernel matrix.
  • bias_constraint: A constraint function applied to the bias vector.

Input shape

If the data_format is "channels_first", the input is a 5D tensor of shape (batch, channels, depth, rows, cols); if the data_format is "channels_last", the input is a 5D tensor of shape (batch, depth, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 5D tensor of shape (batch, filters, new_depth, new_rows, new_cols); if the data_format is "channels_last", the output is a 5D tensor of shape (batch, new_depth, new_rows, new_cols, filters). The values of depth, rows and cols may change due to padding. If output_padding is specified, it is added to the computed size along each spatial dimension.

Cropping1D

Cropping1D is a cropping layer for 1D input (e.g., a temporal sequence). It crops along the time dimension (axis 1).

Arguments

  • cropping: An int or a tuple of 2 ints that specifies how many units should be trimmed off at the beginning and end of the cropping dimension (axis 1). If a single int is provided, the same value is used for both the beginning and the end.

Input shape

It is a 3D tensor of shape (batch, axis_to_crop, features).

Output shape

It is a 3D tensor of shape (batch, cropped_axis, features).
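The effect of the cropping argument on a single sample can be sketched with ordinary list slicing. The function crop_1d is a hypothetical stand-in for the layer, operating on one sequence of steps rather than on a batched tensor.

```python
def crop_1d(sequence, cropping=(1, 1)):
    """Trim units from the beginning and end of the time axis of one sample."""
    if isinstance(cropping, int):
        cropping = (cropping, cropping)  # same value for both ends
    start, end = cropping
    return sequence[start:len(sequence) - end]


steps = [[0.0], [1.0], [2.0], [3.0], [4.0]]   # 5 steps, 1 feature each
print(crop_1d(steps, cropping=(1, 2)))        # [[1.0], [2.0]]
print(crop_1d(steps, cropping=1))             # [[1.0], [2.0], [3.0]]
```

The features axis is untouched; only the number of steps shrinks, from axis_to_crop to cropped_axis.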

Cropping2D

Cropping2D is a cropping layer for 2D input (e.g., a picture). It crops along the spatial dimensions, i.e., height and width.

Arguments

  • cropping: An int, a tuple of 2 ints, or a tuple of 2 tuples of 2 ints. If an int, the same symmetric cropping is applied to height and width. If a tuple of 2 ints, it is interpreted as two different symmetric cropping values for height and width: (symmetric_height_crop, symmetric_width_crop). If a tuple of 2 tuples of 2 ints, it is interpreted as ((top_crop, bottom_crop), (left_crop, right_crop)).
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, height, width), and "channels_last" corresponds to inputs with shape (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".

Input shape

If the data_format is "channels_first", the input is a 4D tensor of shape (batch, channels, rows, cols); if the data_format is "channels_last", the input is a 4D tensor of shape (batch, rows, cols, channels).

Output shape

If the data_format is "channels_first", the output is a 4D tensor of shape (batch, channels, cropped_rows, cropped_cols); if the data_format is "channels_last", the output is a 4D tensor of shape (batch, cropped_rows, cropped_cols, channels).

Examples
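As a rough illustration (not the original Keras example), the sketch below shows how the three accepted forms of cropping could be normalized and how they change the shape. Both function names are hypothetical, and the ((top, bottom), (left, right)) ordering is assumed to match the Keras convention.

```python
def normalize_cropping_2d(cropping):
    """Turn any accepted form into ((top, bottom), (left, right))."""
    if isinstance(cropping, int):
        return ((cropping, cropping), (cropping, cropping))
    height_crop, width_crop = cropping
    if isinstance(height_crop, int) and isinstance(width_crop, int):
        # Two symmetric values: one for height, one for width.
        return ((height_crop, height_crop), (width_crop, width_crop))
    return (tuple(height_crop), tuple(width_crop))


def cropping2d_output_shape(input_shape, cropping, data_format="channels_last"):
    (top, bottom), (left, right) = normalize_cropping_2d(cropping)
    if data_format == "channels_last":
        batch, rows, cols, channels = input_shape
        return (batch, rows - top - bottom, cols - left - right, channels)
    batch, channels, rows, cols = input_shape
    return (batch, channels, rows - top - bottom, cols - left - right)


print(cropping2d_output_shape((2, 28, 28, 3), cropping=((2, 2), (4, 4))))
# (2, 24, 20, 3)
```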

Cropping3D

Cropping3D is a cropping layer for 3D data (e.g., spatial or spatio-temporal).

Arguments

  • cropping: An int, a tuple of 3 ints, or a tuple of 3 tuples of 2 ints. If an int, the same symmetric cropping is applied to depth, height and width. If a tuple of 3 ints, it is interpreted as three different symmetric cropping values for depth, height and width: (symmetric_dim1_crop, symmetric_dim2_crop, symmetric_dim3_crop). If a tuple of 3 tuples of 2 ints, it is interpreted as ((left_dim1_crop, right_dim1_crop), (left_dim2_crop, right_dim2_crop), (left_dim3_crop, right_dim3_crop)).
  • data_format: A string, one of "channels_last" or "channels_first". "channels_first" corresponds to inputs with shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3), and "channels_last" corresponds to inputs with shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".

Input shape

The input is a 5D tensor. If data_format is "channels_first", its shape is (batch, depth, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop); if data_format is "channels_last", its shape is (batch, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop, depth). In these shapes, depth denotes the channels axis.

Output shape

The output is a 5D tensor. If data_format is "channels_first", its shape is (batch, depth, first_cropped_axis, second_cropped_axis, third_cropped_axis); if data_format is "channels_last", its shape is (batch, first_cropped_axis, second_cropped_axis, third_cropped_axis, depth).
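
How data_format moves the channel axis while the three cropped axes shrink can be sketched in plain Python as follows. This only reproduces the shape arithmetic and does not call Keras. The function name cropping3d_output_shape is hypothetical, and cropping is assumed to be already given as three (start, end) pairs.

```python
def cropping3d_output_shape(input_shape, cropping, data_format="channels_last"):
    """Output shape for per-axis (start, end) crops on a 5D input shape."""
    # Separate the batch and channel entries from the three spatial entries.
    if data_format == "channels_first":
        batch, channels, *spatial = input_shape
    else:
        batch, *spatial, channels = input_shape
    # Each spatial axis loses `start` entries at the front and `end` at the back.
    cropped = tuple(
        size - start - end for size, (start, end) in zip(spatial, cropping)
    )
    if data_format == "channels_first":
        return (batch, channels) + cropped
    return (batch,) + cropped + (channels,)


cropping = ((1, 1), (2, 2), (0, 4))
print(cropping3d_output_shape((4, 16, 32, 32, 3), cropping))
# (4, 14, 28, 28, 3)
print(cropping3d_output_shape((4, 3, 16, 32, 32), cropping, "channels_first"))
# (4, 3, 14, 28, 28)
```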

UpSampling1D

It is an upsampling layer for 1D inputs. It repeats each temporal step size times along the time axis.

Arguments

  • size: It is an integer, the upsampling factor.

Input shape

It is a 3D tensor of shape: (batch, steps, features).

Output shape

It is a 3D tensor with shape: (batch, upsampled_steps, features).
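
The repetition along the time axis can be pictured with a small plain-Python sketch for a single sample (no batch axis, no Keras). The function name upsample_1d is hypothetical and exists only for illustration.

```python
def upsample_1d(steps, size):
    """Repeat every time step `size` times along the time axis."""
    return [step for step in steps for _ in range(size)]


sequence = [[1, 10], [2, 20], [3, 30]]   # 3 steps, 2 features each
upsampled = upsample_1d(sequence, size=2)
print(upsampled)
# [[1, 10], [1, 10], [2, 20], [2, 20], [3, 30], [3, 30]]
print(len(sequence), "->", len(upsampled))
# 3 -> 6
```

The number of features per step is unchanged. Only the steps axis grows, from steps to steps * size.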

UpSampling2D

It is an upsampling layer for 2D inputs. It repeats the rows of the data size[0] times and the columns of the data size[1] times.

Arguments

  • size: It is an int or tuple of 2 integers, which is an upsampling factor for rows and columns.
  • data_format: It is a string, either 'channels_first', which corresponds to the input shape (batch, channels, height, width), or 'channels_last', which corresponds to (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you have never set it, it is "channels_last".
  • interpolation: It is a string, one of "nearest" or "bilinear". Note that CNTK does not yet support bilinear upscaling, and that with Theano only size=(2, 2) is possible.

Input shape

If data_format is "channels_last", then the input shape of a 4D tensor is (batch, rows, cols, channels), else if data_format is "channels_first", then the input shape of a 4D tensor is (batch, channels, rows, cols).

Output shape

If data_format is "channels_last", then the output shape of a 4D tensor is (batch, upsampled_rows, upsampled_cols, channels), else if the data_format is "channels_first", then the output shape of a 4D tensor is (batch, channels, upsampled_rows, upsampled_cols).
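
As a rough illustration of "nearest" upsampling, the sketch below repeats the rows and columns of one single-channel image stored as nested lists. It is plain Python and not the Keras implementation, and the name upsample_2d_nearest is hypothetical.

```python
def upsample_2d_nearest(image, size):
    """Repeat rows size[0] times and columns size[1] times."""
    row_factor, col_factor = size
    result = []
    for row in image:
        # Widen the row first by repeating every value col_factor times.
        wide_row = [value for value in row for _ in range(col_factor)]
        # Then emit the widened row row_factor times.
        for _ in range(row_factor):
            result.append(list(wide_row))
    return result


image = [[1, 2],
         [3, 4]]
for row in upsample_2d_nearest(image, size=(2, 3)):
    print(row)
# [1, 1, 1, 2, 2, 2]
# [1, 1, 1, 2, 2, 2]
# [3, 3, 3, 4, 4, 4]
# [3, 3, 3, 4, 4, 4]
```

A 2 x 2 image with size=(2, 3) becomes 4 x 6, i.e. upsampled_rows = rows * size[0] and upsampled_cols = cols * size[1].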

UpSampling3D

It is an upsampling layer for 3D inputs. It repeats the 1st dimension of the data size[0] times, the 2nd dimension size[1] times, and the 3rd dimension size[2] times.

Arguments

  • size: It is an int or tuple of 3 integers, which is an upsampling factor for dim1, dim2, and dim3.
  • data_format: It is a string, either 'channels_first', which corresponds to the input shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3), or 'channels_last', which corresponds to (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you have never set it, it is "channels_last".

Input shape

If data_format is "channels_last", then the input shape of a 5D tensor is (batch, dim1, dim2, dim3, channels), else if data_format is "channels_first", then the input shape of a 5D tensor is (batch, channels, dim1, dim2, dim3).

Output shape

If data_format is "channels_last", then the output shape of a 5D tensor is (batch, upsampled_dim1, upsampled_dim2, upsampled_dim3, channels), else if the data_format is "channels_first", then the output shape of a 5D tensor is (batch, channels, upsampled_dim1, upsampled_dim2, upsampled_dim3).

ZeroPadding1D

It is a zero-padding layer for 1D input (for example, a temporal sequence).

Arguments

  • padding: It is an int, or a tuple of int (length 2), or a dictionary. If it is an int, it gives the number of zeros to add at both the beginning and the end of the padding dimension (axis 1). If it is a tuple of int (length 2), it gives the number of zeros to add at the beginning and at the end of the padding dimension, respectively: (left_pad, right_pad).

Input shape

It is a 3D tensor of shape (batch, axis_to_pad, features).

Output shape

It is a 3D tensor of shape (batch, padded_axis, features).
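
The int and tuple forms of padding can be compared with a minimal plain-Python sketch that pads one sample of shape (steps, features). It does not use Keras, and the name zero_pad_1d is hypothetical.

```python
def zero_pad_1d(steps, padding):
    """Add all-zero steps before and after a (steps, features) sequence."""
    # An int pads both ends equally; a pair is read as (left_pad, right_pad).
    left, right = (padding, padding) if isinstance(padding, int) else padding
    features = len(steps[0])
    zero_step = [0] * features
    return ([list(zero_step) for _ in range(left)]
            + [list(step) for step in steps]
            + [list(zero_step) for _ in range(right)])


sequence = [[1, 2], [3, 4]]
print(zero_pad_1d(sequence, 1))
# [[0, 0], [1, 2], [3, 4], [0, 0]]
print(zero_pad_1d(sequence, (2, 0)))
# [[0, 0], [0, 0], [1, 2], [3, 4]]
```

In both cases padded_axis equals axis_to_pad + left_pad + right_pad, and the features axis is untouched.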

ZeroPadding2D

It is a zero-padding layer for 2D input (for example, a picture). It can add rows and columns of zeros at the top, bottom, left, and right of an image tensor.

Arguments

  • padding: It is an int, or a tuple of 2 ints, or a tuple of 2 tuples of 2 ints. If it is an int, the same symmetric padding is applied to height and width. If it is a tuple of 2 ints, it is interpreted as two distinct symmetric padding values for height and width: (symmetric_height_pad, symmetric_width_pad). If it is a tuple of 2 tuples of 2 ints, it is interpreted as ((top_pad, bottom_pad), (left_pad, right_pad)).
  • data_format: It is a string, either 'channels_first', which corresponds to the input shape (batch, channels, height, width), or 'channels_last', which corresponds to (batch, height, width, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you have never set it, it is "channels_last".

Input shape

If data_format is "channels_last", then the input shape of a 4D tensor is (batch, rows, cols, channels), else if data_format is "channels_first", then the input shape of a 4D tensor is (batch, channels, rows, cols).

Output shape

If data_format is "channels_last", then the output shape of a 4D tensor is (batch, padded_rows, padded_cols, channels), else if the data_format is "channels_first", then the output shape of a 4D tensor is (batch, channels, padded_rows, padded_cols).
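
The ((top_pad, bottom_pad), (left_pad, right_pad)) form can be sketched in plain Python on one single-channel image stored as nested lists. This is an illustration only, not the Keras implementation, and the name zero_pad_2d is hypothetical.

```python
def zero_pad_2d(image, padding):
    """Zero-pad a 2D grid; padding is ((top, bottom), (left, right))."""
    (top, bottom), (left, right) = padding
    width = len(image[0]) + left + right
    # Rows of zeros above the image.
    padded = [[0] * width for _ in range(top)]
    # Original rows, with zeros added on the left and right.
    for row in image:
        padded.append([0] * left + list(row) + [0] * right)
    # Rows of zeros below the image.
    padded.extend([0] * width for _ in range(bottom))
    return padded


image = [[1, 2],
         [3, 4]]
for row in zero_pad_2d(image, ((1, 0), (2, 1))):
    print(row)
# [0, 0, 0, 0, 0]
# [0, 0, 1, 2, 0]
# [0, 0, 3, 4, 0]
```

Here padded_rows = 2 + 1 + 0 = 3 and padded_cols = 2 + 2 + 1 = 5.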

ZeroPadding3D

It is a zero-padding layer for 3D data (for example, spatial or spatio-temporal data).

Arguments

  • padding: It is an int, or a tuple of 3 ints, or a tuple of 3 tuples of 2 ints. If it is an int, the same symmetric padding is applied to depth, height, and width. If it is a tuple of 3 ints, it is interpreted as three distinct symmetric padding values for depth, height, and width: (symmetric_dim1_pad, symmetric_dim2_pad, symmetric_dim3_pad). If it is a tuple of 3 tuples of 2 ints, it is interpreted as ((left_dim1_pad, right_dim1_pad), (left_dim2_pad, right_dim2_pad), (left_dim3_pad, right_dim3_pad)).
  • data_format: It is a string, either 'channels_first', which corresponds to the input shape (batch, channels, spatial_dim1, spatial_dim2, spatial_dim3), or 'channels_last', which corresponds to (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels). It defaults to the image_data_format value found in the Keras config file at ~/.keras/keras.json. If you have never set it, it is "channels_last".

Input shape

The input is a 5D tensor. If data_format is "channels_first", its shape is (batch, depth, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad); if data_format is "channels_last", its shape is (batch, first_axis_to_pad, second_axis_to_pad, third_axis_to_pad, depth). In these shapes, depth denotes the channels axis.

Output shape

The output is a 5D tensor. If data_format is "channels_first", its shape is (batch, depth, first_padded_axis, second_padded_axis, third_padded_axis); if data_format is "channels_last", its shape is (batch, first_padded_axis, second_padded_axis, third_padded_axis, depth).
