An illustration (GIF) to explain deep convolutional networks (DCNN)

TeeTracker
3 min read · Feb 21, 2022

In computer vision, the convolutional network is the most basic and most common image-recognition model. With the popularity of frameworks such as TensorFlow and PyTorch, convolutional networks have become easier to use: instead of implementing the backward pass ourselves, we can focus on the forward pass.

This article assumes that we are all familiar with convolutional networks, so we won’t go into much detail here.

Instead, I’ll just use a few GIFs to refresh your memory on the implementation of convolutional networks as we know them.

GIF source: https://developer.ibm.com/articles/introduction-to-convolutional-neural-networks/

For details, you can go directly there.

Understanding image channels (RGB)

Feature detection
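
To make the sliding-window idea concrete, here is a minimal sketch in plain Python with no framework. The function name `conv2d_valid`, the 4x4 image and the 2x2 kernel are all made up for illustration; real frameworks do the same multiply-and-sum, just much faster and with learned kernel values.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over a single-channel `image` with stride 1 and no padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the products.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map lights up only in the middle column, which is exactly where the edge sits in the image.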

Stride x 1

Stride x 2
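
The effect of the stride on the output size can be sketched with the usual one-dimensional formula, floor((n + 2p - k) / s) + 1. The helper name `conv_output_size` and the sizes below (a 7-pixel input, a 3-pixel kernel) are assumptions picked for illustration, not values taken from the GIFs.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Hypothetical sizes: a 7-pixel-wide input and a 3-pixel-wide kernel.
size_stride_1 = conv_output_size(7, 3, stride=1)
size_stride_2 = conv_output_size(7, 3, stride=2)
print(size_stride_1, size_stride_2)  # 5 3
```

With stride 2 the kernel skips every other position, so the feature map shrinks roughly by half.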

Padding (SAME in TF), where the feature map keeps the same size as the input image (at stride 1)
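
As a rough sketch of how much padding SAME needs, the helper below follows the rule as I understand TensorFlow documents it: the output size is ceil(n / s), and any odd leftover padding pixel goes at the end. The function name `same_padding` is hypothetical and the code works on one dimension only.

```python
import math


def same_padding(input_size, kernel_size, stride=1):
    """Return (pad_before, pad_after, output_size) along one dimension."""
    output_size = math.ceil(input_size / stride)
    pad_total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before  # any odd leftover pixel goes at the end
    return pad_before, pad_after, output_size


print(same_padding(5, 3, stride=1))  # (1, 1, 5)
print(same_padding(5, 2, stride=1))  # (0, 1, 5)
```

With a 3x3 kernel and stride 1, one ring of zeros around the image is enough to keep the feature map the same size as the input.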

Pooling (max pooling as example)

Correction: There is a bug in the original GIF. Pooling should start at row 1, column 1, not where the GIF shows.
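
Here is a small sketch of max pooling that starts at the first row and first column (index 0 in Python). The function name `max_pool2d`, the 2x2 window, the stride of 2 and the numbers in the feature map are assumptions for illustration.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value of each `size` x `size` window, starting at the top-left corner."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 8],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 4], [7, 9]]
```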

Conclusion 1

Feature detection and pooling are usually abbreviated as Conv & Pooling; together they can be repeated several times over the whole process.

Multi filters (kernels)

1 filter on an RGB image

3 filters on an RGB image
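
To show how filters and channels fit together, here is a plain-Python sketch: each filter has one kernel slice per input channel, the products are summed across all channels, and each filter yields exactly one feature map. The function name `conv2d_multichannel`, the tiny 3x3 image and the three filters are invented for illustration (stride 1, no padding).

```python
def conv2d_multichannel(image, filters):
    """image: [channel][row][col]; filters: [filter][channel][row][col].

    Returns one feature map per filter: [filter][row][col].
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    feature_maps = []
    for kernel in filters:
        k_h, k_w = len(kernel[0]), len(kernel[0][0])
        fmap = []
        for i in range(height - k_h + 1):
            row = []
            for j in range(width - k_w + 1):
                # Sum over all channels and kernel positions: one number per location.
                row.append(sum(
                    image[c][i + di][j + dj] * kernel[c][di][dj]
                    for c in range(channels)
                    for di in range(k_h)
                    for dj in range(k_w)
                ))
            fmap.append(row)
        feature_maps.append(fmap)
    return feature_maps


# Hypothetical 3x3 RGB image: R is all 1s, G is all 2s, B is all 3s.
rgb = [[[value] * 3 for _ in range(3)] for value in (1, 2, 3)]
# Three hypothetical 2x2 filters, each with one kernel slice per channel.
ones = [[1, 1], [1, 1]]
zeros = [[0, 0], [0, 0]]
filters = [
    [ones, zeros, zeros],  # looks at R only
    [zeros, ones, zeros],  # looks at G only
    [ones, ones, ones],    # sums all three channels
]
feature_maps = conv2d_multichannel(rgb, filters)
print(len(feature_maps))  # 3
print(feature_maps[2])    # [[24, 24], [24, 24]]
```

Three filters in, three feature maps out, regardless of how many channels the input image has.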

P.S.: Pooling can be applied right after extracting features.

Flatten or Global averaging
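
The difference between the two options can be sketched in a few lines: flattening keeps every value and produces one long vector, while global averaging reduces each feature map to a single number. The function names and the two 2x2 feature maps are made up for illustration.

```python
def flatten(feature_maps):
    """Concatenate every value of every feature map into one long vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def global_average(feature_maps):
    """Reduce each feature map to its mean: one number per feature map."""
    return [
        sum(value for row in fmap for value in row) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


# Two hypothetical 2x2 feature maps.
feature_maps = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
]
flat = flatten(feature_maps)
averaged = global_average(feature_maps)
print(flat)      # [1, 2, 3, 4, 10, 20, 30, 40]
print(averaged)  # [2.5, 25.0]
```

Global averaging gives a much shorter vector, so the classifier that follows needs far fewer weights.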

Conclusion 2

  • Input image
  • Loop (assume the conv and pooling strides are both 2):
    - Extract feature maps with an increasing number of filters
    - Apply max or average pooling to the feature maps
  • Flatten or globally average the feature maps
  • Attach a linear classifier
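
The steps above can be sketched as a simple shape trace with no real computation. It assumes an RGB input, SAME-padded convolutions with stride 2, and 2x2 pooling with stride 2. The function name `trace_shapes`, the 64x64 input, the filter counts and the 10 classes are all hypothetical.

```python
import math


def trace_shapes(height, width, filter_counts, num_classes, use_global_average=True):
    """Follow (height, width, channels) through repeated Conv & Pooling blocks."""
    channels = 3  # RGB input
    shapes = [(height, width, channels)]
    for filters in filter_counts:
        # Conv, SAME padding, stride 2: spatial size becomes ceil(n / 2),
        # and the channel count becomes the number of filters.
        height, width, channels = math.ceil(height / 2), math.ceil(width / 2), filters
        shapes.append((height, width, channels))
        # Pooling, 2x2 window, stride 2: spatial size is halved again.
        height, width = height // 2, width // 2
        shapes.append((height, width, channels))
    # Global averaging keeps one number per feature map; flatten keeps everything.
    features = channels if use_global_average else height * width * channels
    classifier_weights = features * num_classes  # biases not counted
    return shapes, features, classifier_weights


shapes, features, weights = trace_shapes(64, 64, [16, 32], num_classes=10)
print(shapes)    # [(64, 64, 3), (32, 32, 16), (16, 16, 16), (8, 8, 32), (4, 4, 32)]
print(features)  # 32
print(weights)   # 320
```

Switching to `use_global_average=False` in this made-up setup gives 4 * 4 * 32 = 512 features and 5120 classifier weights, which shows why global averaging is the lighter choice.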

Related read

https://developer.ibm.com/articles/introduction-to-convolutional-neural-networks/
