Capsule Networks (Part 1): Limitations of Convolutional Networks, Definition of Capsule, Follow-up Content



Original source: 小样本学习与智能前沿

author: Sargur Srihari srihari@buffalo.edu

This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/CSE676

Contents

Limitations of Convolutional Networks
- Convolutional Neural Networks
- Processing Steps and Training for ConvNets
- Pooling and Invariance
- Example of CNN Limitation
  - CNN to recognize faces extracts features from image
- Motivation for CapsNets
- Solution offered by CapsNets
- Visual Fixation
  - Human vision uses saccades
  - Parse Tree of a Fixation
  - Activation is a likelihood
- CNN versus CapsNets
Definition of Capsule
  - Example
- Representing Entity in Input
- Capsule Networks perform inverse Computer Graphics
- Pooling and Equivariance
  - Example of equivariance with Capsnet
- From Convolution to Capsule
Follow-up content

Limitations of Convolutional Networks

Convolutional Neural Networks

Figure source: https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc

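Compared with a regular (fully connected) neural network, a CNN minimizes computation
Convolution greatly simplifies the computation without losing the essence of the data
CNNs are well suited to image classification
The same knowledge (the same kernel weights) is used at every image location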
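
The saving from reusing the same weights at every location can be made concrete with a rough parameter count. The sketch below is only an illustration: the 28x28 grayscale input, the 32 units or feature maps, and the 3x3 kernel are assumed sizes, not figures from the slides, and the two layers are not functionally equivalent.

```python
def dense_params(height, width, channels, units):
    """Weights plus biases for a fully connected layer on a flattened image."""
    return height * width * channels * units + units


def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases for a conv layer; the same kernels are reused at every location."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Assumed sizes: 28x28 grayscale image, 32 units / feature maps, 3x3 kernels.
dense = dense_params(28, 28, 1, 32)
conv = conv_params(3, 1, 32)
print(dense, conv)
```

Note that the convolutional count does not depend on the image size at all, which is a direct consequence of weight sharing.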

Processing Steps and Training for ConvNets

Given an input image, a set of kernels (filters) scans it and performs the convolution operation.
This creates a feature map inside the network.
These features then pass through activation and pooling layers:
• Activation layers, e.g., ReLU, introduce nonlinearity. (A nonlinearity such as ReLU also helps mitigate the vanishing-gradient problem.)
• Pooling (e.g., max pooling) reduces training time. (Pooling summarizes each subregion, which yields invariance.)
Finally, the features pass through a sigmoid or softmax classifier.
Training uses backpropagation of the error measured against labeled data.
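
The forward pass described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the slides' implementation: the 5x5 toy image, the hand-picked edge kernel, and the untrained classifier weights are all hypothetical, and training by backpropagation is left out.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a 2-D list with a square kernel."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]


def relu(fmap):
    """Element-wise ReLU nonlinearity."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; leftover rows and columns are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 5x5 image containing a vertical edge, and a vertical-edge kernel.
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = max_pool(relu(conv2d(image, kernel)))
flat = [v for row in features for v in row]

# Hypothetical, untrained classifier weights for two classes.
weights = [[1.0], [-1.0]]
probs = softmax([sum(w * x for w, x in zip(ws, flat)) for ws in weights])
print(features, probs)
```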

Pooling and Invariance

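Pooling is meant to provide invariance to position, orientation, scale, or rotation.
In the figure's example, the input is shifted by one pixel. Every input value changes, but only half of the output values change, because max pooling is sensitive only to the maximum value in each neighborhood, not to its exact location.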
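
A small numeric sketch makes this concrete. The detector values below are illustrative assumptions (including the 0.3 that enters from the left after the shift), and the pooling window is assumed to be three wide with stride one, clipped at the borders.

```python
def max_pool_1d(values, width=3):
    """Stride-1 max pooling; windows are clipped at the borders."""
    half = width // 2
    n = len(values)
    return [max(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


# Illustrative detector outputs, and the same signal shifted right by one position.
original = [0.1, 1.0, 0.2, 0.1]
shifted = [0.3, 0.1, 1.0, 0.2]

pooled_original = max_pool_1d(original)
pooled_shifted = max_pool_1d(shifted)

inputs_changed = sum(a != b for a, b in zip(original, shifted))
outputs_changed = sum(a != b for a, b in zip(pooled_original, pooled_shifted))
print(pooled_original, pooled_shifted)
print(inputs_changed, outputs_changed)
```

All four inputs differ after the shift, while only two of the four pooled outputs do.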

Example of CNN Limitation

CNN to recognize faces extracts features from image

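The CNN ignores how the features are arranged: it still recognizes a face even when the parts are in the wrong positions.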
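
The effect can be imitated with a toy presence-only detector. This is a deliberately simplified sketch, not a real CNN: the feature names, the 4x4 grid, and the scoring rule (global max pooling per feature, then requiring every feature to be present) are all assumptions made for illustration.

```python
def presence_score(feature_maps):
    """Global max pooling per feature, then require every feature to be present."""
    strongest = {name: max(max(row) for row in fmap)
                 for name, fmap in feature_maps.items()}
    return min(strongest.values())


def place(positions, size=4):
    """Build one binary size x size map per feature from {name: [(row, col), ...]}."""
    maps = {}
    for name, cells in positions.items():
        grid = [[0.0] * size for _ in range(size)]
        for r, c in cells:
            grid[r][c] = 1.0
        maps[name] = grid
    return maps


# Hypothetical detector responses: a normal face and a scrambled one.
normal = place({"eye": [(0, 0), (0, 3)], "nose": [(1, 1)], "mouth": [(3, 1)]})
scrambled = place({"eye": [(3, 0), (2, 3)], "nose": [(0, 2)], "mouth": [(1, 0)]})

print(presence_score(normal), presence_score(scrambled))
```

Both layouts receive the same score, because the pooled representation records that each part exists but not where it is.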

Motivation for CapsNets

CapsNets are proposed as an improvement on CNNs:

• They are presented as the next step after CNNs
• They address problems caused by max pooling and deep networks
• In particular, they address the loss of information about feature order and orientation
Hinton: “The pooling operation used in CNNs is a big mistake and the fact that it works so well is a disaster”

Solution offered by CapsNets

Low-level features must also be arranged in a certain order for the object to be classified as a face.
The order is determined during training, when the network learns not only which features to look for but also what their relationships to one another should be.
Only an image whose features appear in the right arrangement is recognized as a face.
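
As a rough illustration of what checking relationships means, the sketch below adds a hand-written spatial rule. It is only a stand-in: a CapsNet learns such relationships during training, whereas the rule here (two eyes above the nose, nose above the mouth) and the feature names are hypothetical.

```python
def is_face(parts):
    """Hand-written stand-in for learned relationships: eyes above nose above mouth.

    `parts` maps a feature name to a list of (row, col) positions; row 0 is the top.
    """
    required = ("eye", "nose", "mouth")
    if any(not parts.get(name) for name in required):
        return False
    lowest_eye = max(r for r, _ in parts["eye"])
    nose_row = parts["nose"][0][0]
    mouth_row = parts["mouth"][0][0]
    return len(parts["eye"]) == 2 and lowest_eye < nose_row < mouth_row


normal = {"eye": [(0, 0), (0, 3)], "nose": [(1, 1)], "mouth": [(3, 1)]}
scrambled = {"eye": [(3, 0), (2, 3)], "nose": [(0, 2)], "mouth": [(1, 0)]}

print(is_face(normal), is_face(scrambled))
```

The same parts are present in both inputs, but only the correctly arranged one passes the check.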

Visual Fixation


Human vision uses saccades


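Irrelevant details are ignored by using a carefully chosen sequence of fixation points.
This ensures that only a tiny fraction of the optic array is processed at the highest resolution.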

We assume that a single fixation gives us much more than a single identified object and its properties:
• We assume our multilayer visual system creates a parse tree on each fixation
• We ignore the coordination of parse trees over multiple fixations

Parse Tree of a Fixation

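For a single fixation, a parse tree is carved out of a fixed multilayer neural network, like a sculpture carved from rock.
Each layer is divided into many small groups of neurons called "capsules".
Each node in the parse tree corresponds to an active capsule.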

Activation is a likelihood

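The activation level of a neuron can be interpreted as the likelihood that a particular feature has been detected.
A capsule is a group of neurons that captures not only this likelihood but also the parameters of the specific feature.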
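
One common way to realize this, which comes from Sabour, Frosst and Hinton's dynamic routing paper rather than from the text above, is to let the length of a capsule's output vector stand for the likelihood and its direction for the feature's parameters. The sketch below shows the squashing nonlinearity from that paper; the input vectors are hypothetical.

```python
import math


def squash(s):
    """Squashing nonlinearity: keeps the direction of s, maps its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a vector, read here as the detection likelihood."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical raw capsule outputs: same direction (same parameters), different strength.
weak = squash([0.1, 0.2])
strong = squash([3.0, 6.0])
print(length(weak), length(strong))
```

Both outputs point in the same direction, so they describe the same feature parameters, while their lengths differ to express low and high confidence that the feature is present.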

CNN versus CapsNets

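Max pooling layers keep the most important features of an image but lose the spatial arrangement and orientation of those features.
A CNN detects only whether a feature is present and largely ignores where it is.

CapsNets replace scalar-output feature detectors with vector-output capsules, and max pooling with routing-by-agreement.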
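
Routing-by-agreement is not spelled out here, so the following is a minimal sketch based on the dynamic routing procedure in Sabour, Frosst and Hinton's paper, assuming the prediction vectors are already given. The function name `route`, the three iterations, and the toy predictions are illustrative choices.

```python
import math


def squash(s):
    """Keep the direction of s and map its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(predictions, iterations=3):
    """predictions[i][j] is lower capsule i's predicted output vector for upper capsule j."""
    n_lower, n_upper = len(predictions), len(predictions[0])
    dim = len(predictions[0][0])
    logits = [[0.0] * n_upper for _ in range(n_lower)]
    for _ in range(iterations):
        # Each lower capsule distributes its output over the upper capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_upper):
            s = [sum(coupling[i][j] * predictions[i][j][d] for i in range(n_lower))
                 for d in range(dim)]
            outputs.append(squash(s))
        # Raise the logit where a prediction agrees (dot product) with the output.
        for i in range(n_lower):
            for j in range(n_upper):
                logits[i][j] += sum(p * v for p, v in zip(predictions[i][j], outputs[j]))
    return coupling, outputs


# Toy case: three lower capsules agree about upper capsule 0 and disagree about capsule 1.
predictions = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
coupling, outputs = route(predictions)
print(coupling)
print([math.hypot(*v) for v in outputs])
```

In this toy case the coupling coefficients drift toward upper capsule 0, whose incoming predictions agree, and its output vector ends up much longer than that of capsule 1. Unlike max pooling, nothing is discarded outright: each lower capsule's contribution is reweighted according to agreement.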