CNN

  • Convolution: $y = w * x + b$, where $\theta = \{w, b\}$ are the parameters, $w$ is the kernel, and $b$ is the bias.
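  • Channels, input: $x \in \mathbb{R}^{C_{\text{in}} \times H \times W}$. Output: $y \in \mathbb{R}^{C_{\text{out}} \times H' \times W'}$
  • Filters: $w_j \in \mathbb{R}^{C_{\text{in}} \times K_h \times K_w}$ (one per output channel), filter bank = $\{w_1, \dots, w_{C_{\text{out}}}\}$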
  • Spatial resolution
  • Convolutions: Strided, Dilated
  • Nonlinearity: Pooling (Mean, Max, Min).
  • Downsampling and upsampling
  • Receptive field
  • Feature maps
  • Architecture: Encoder & Decoder, AlexNet, UNet, ResNet
  • Why do CNNs process images locally, while MLPs process them globally?
    • Divide and Conquer
    • Translational Invariance
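
The convolution, stride, dilation, and pooling items above can be illustrated with a minimal pure-Python sketch. It assumes a single-channel image, no padding, and the cross-correlation convention that deep learning libraries usually call "convolution". The names `conv2d` and `max_pool2d` are hypothetical helpers written for this note, not a library API.

```python
def conv2d(image, kernel, bias=0.0, stride=1, dilation=1):
    """Valid (no padding) 2D cross-correlation of a single-channel image.

    `image` and `kernel` are lists of rows (assumed rectangular).
    """
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Effective kernel extent once dilation spreads the taps apart.
    eff_h = dilation * (kh - 1) + 1
    eff_w = dilation * (kw - 1) + 1
    out_h = (H - eff_h) // stride + 1
    out_w = (W - eff_w) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = bias
            for a in range(kh):
                for b in range(kw):
                    # Stride moves the window; dilation spaces the kernel taps.
                    acc += kernel[a][b] * image[i * stride + a * dilation][j * stride + b * dilation]
            row.append(acc)
        out.append(row)
    return out


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [max(fmap[i * size + a][j * size + b] for a in range(size) for b in range(size))
         for j in range(out_w)]
        for i in range(out_h)
    ]


image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]  # 6x6 ramp image
kernel = [[1.0, 0.0], [0.0, -1.0]]                                # 2x2 diagonal difference

plain = conv2d(image, kernel)                # 5x5 feature map
strided = conv2d(image, kernel, stride=2)    # 3x3 feature map (downsampled)
dilated = conv2d(image, kernel, dilation=2)  # 4x4 feature map (wider receptive field)
pooled = max_pool2d(plain)                   # 2x2 feature map after 2x2 max pooling
```

Each output value depends only on a small window of the input, which is the local processing that the list contrasts with an MLP. Striding and pooling both reduce spatial resolution, while dilation enlarges the window a single output sees without adding parameters.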

Equivariance and Invariance

Invariance: Let $G$ be a group of actions (for example, the group of translations of an image $I$), and let $g$ be a specific element of $G$. A function $f$ is said to be invariant under $G$ if, for every input $I$ and every $g \in G$, $f(g(I)) = f(I)$.

Equivariance: Let $G'$ be another group of actions. A function $f$ is said to be equivariant under $G$ if, for every input $I$ and every $g \in G$, there exists $g' \in G'$ such that $f(g(I)) = g'(f(I))$.

Source: Theoretical view
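
The two definitions can be checked numerically on a toy case. The sketch below assumes 1D signals, circular translations as the group G, and wrap-around padding so that the equivariance holds exactly. The helpers `shift`, `circular_conv`, and `global_max` are hypothetical names chosen for illustration.

```python
def shift(signal, k):
    """Circularly translate a 1D signal by k positions (the group action g)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def circular_conv(signal, kernel):
    """1D cross-correlation with wrap-around padding; output length equals input length."""
    n = len(signal)
    return [sum(w * signal[(i + a) % n] for a, w in enumerate(kernel)) for i in range(n)]


def global_max(signal):
    """Global max pooling: collapses the whole signal to one number."""
    return max(signal)


I = [0, 1, 4, 2, 0, 0, 3, 1]  # toy input signal
w = [1, -2, 1]                # toy kernel
k = 3                         # translation amount

# Equivariance: f(g(I)) = g'(f(I)); here g' is the same translation as g.
equivariant = circular_conv(shift(I, k), w) == shift(circular_conv(I, w), k)

# Invariance: f(g(I)) = f(I) for f = global max pooling after the convolution.
invariant = global_max(circular_conv(shift(I, k), w)) == global_max(circular_conv(I, w))
```

Under these assumptions both flags evaluate to `True`: the convolution commutes with the shift, and pooling over all positions removes the dependence on the shift. With zero padding or strides greater than one, the equivariance generally holds only approximately or only for shifts that are multiples of the stride.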