Principles of Convolutional Neural Networks (CNN)


I. A brief introduction to CNNs (details/43225445)

This article is mainly a detailed reading of a CNN implementation. If you have not studied CNNs before, I recommend senior labmate Zhou Xiaoyi's post "Deep Learning (深度学习) Study Notes Series, Part 7", as well as the UFLDL material on feature extraction using convolution and on pooling.

The defining characteristics of a CNN are sparse connectivity (local receptive fields) and weight sharing, as shown in the two figures below: sparse connectivity on the left, weight sharing on the right. Both reduce the number of parameters to train and lower the computational cost.

As for the structure of a CNN, the classic LeNet5 serves as the example. This figure is everywhere: any discussion of CNNs mentions LeNet5. The figure comes from the paper "Gradient-Based Learning Applied to Document Recognition". The paper is long; the description of LeNet5 starts on page 7, and that part is worth reading.

Briefly, reading the LeNet5 figure from left to right:

- input is the input layer, i.e. the input image.
- input layer to C1 is a convolutional layer (the convolution operation).
- C1 to S2 is a subsampling layer (the pooling operation). For the details of convolution and subsampling, see the figure below.
- S2 to C3 is again a convolution, and C3 to S4 is again a subsampling. Convolution and subsampling come in pairs: a convolution is usually followed by a subsampling.
- S4 to C5 is fully connected, which makes it equivalent to the hidden layer of an MLP. (If MLPs are unfamiliar, see "DeepLearning tutorial (3): MLP (multilayer perceptron), principles and detailed code walkthrough".)
- C5 to F6 is also fully connected and likewise equivalent to an MLP hidden layer.
- F6 to output is simply a classifier; this layer is called the classification layer.

That is roughly the basic structure of a CNN: an input layer, convolutional layers, subsampling layers, fully connected layers, a classification layer, and the output. How many convolutional and subsampling layers to use, and which classifier, depends on the application. Once the structure is fixed, the connection parameters between layers are usually learned with forward propagation (FP) plus backpropagation (BP); see the links above for details.

II. A detailed reading of the CNN code (Python + Theano)

The code comes from the deep learning tutorial "Convolutional Neural Networks (LeNet)". It implements a simplified LeNet5, with these differences:

- It does not implement the location-specific gain and bias parameters.
- It uses max pooling rather than average pooling.
- The classifier is softmax, whereas LeNet5 uses RBF.
- The second convolutional layer of LeNet5 is not fully connected to its input feature maps; this program uses full connectivity.

In addition, the code merges the convolutional layer and the subsampling layer into one class, LeNetConvPoolLayer (convolution + pooling layer). That is easy to understand, since the two always appear in pairs. One point needs attention, though: the code feeds the convolution output directly into the subsampling layer, without first adding the bias b and applying a sigmoid. In other words, the b_x and the sigmoid mapping that follow f_x in the figure below are absent, and C_x is obtained directly from f_x.

Finally, the first convolutional layer in the code uses 20 kernels and the second uses 50, rather than the 6 and 16 shown in the LeNet5 figure above.

With that in mind, here is the code.

1. Import the required modules

    import cPickle
    import gzip
    import os
    import sys
    import time

    import numpy

    import theano
    import theano.tensor as T
    from theano.tensor.signal import downsample
    from theano.tensor.nnet import conv

2. Define the basic building blocks of the CNN

The building blocks are the convolution + pooling layer, the hidden layer, and the classifier, defined as follows.

Defining LeNetConvPoolLayer (convolution + pooling layer); see the comments in the code:

    # Convolution and downsampling combined into one layer: LeNetConvPoolLayer
    # rng: random number generator, used to initialize W
    # input: 4D tensor, theano.tensor.dtensor4
    # filter_shape: (number of filters, num input feature maps, filter height, filter width)
    # image_shape: (batch size, num input feature maps, image height, image width)
    # poolsize: (#rows, #cols)
    class LeNetConvPoolLayer(object):
        def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):

            # assert condition: execution continues if the condition is True
            # and the program is interrupted if it is False.
            # image_shape[1] and filter_shape[1] are both num input feature maps,
            # so they must be equal.
            assert image_shape[1] == filter_shape[1]
            self.input = input

            # Each hidden unit (i.e. pixel) has
            # num input feature maps * filter height * filter width
            # connections to the previous layer, which is numpy.prod(filter_shape[1:]).
            fan_in = numpy.prod(filter_shape[1:])

            # Each unit in the lower layer receives a gradient from
            # num output feature maps * filter height * filter width / pooling size.
            fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                       numpy.prod(poolsize))

            # Plug fan_in and fan_out into the formula to initialize W randomly.
            # W is the linear convolution kernel.
            W_bound = numpy.sqrt(6. / (fan_in + fan_out))
            self.W = theano.shared(
                numpy.asarray(
                    rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                    dtype=theano.config.floatX
                ),
                borrow=True
            )

            # the bias is a 1D tensor -- one bias per output feature map
            # The number of output feature maps equals the number of filters,
            # so b is initialized with filter_shape[0], i.e. number of filters.
            b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
            self.b = theano.shared(value=b_values, borrow=True)

            # Convolve the input images with the filters (conv.conv2d).
            # No bias and no sigmoid are applied after the convolution;
            # this is one of the simplifications.
            conv_out = conv.conv2d(
                input=input,
                filters=self.W,
                filter_shape=filter_shape,
                image_shape=image_shape
            )

            # Max pooling (the subsampling step).
            pooled_out = downsample.max_pool_2d(
                input=conv_out,
                ds=poolsize,
                ignore_border=True
            )

            # Add the bias, then apply tanh to get the final output of the layer.
            # b is a 1D vector, so dimshuffle is used to reshape it: if b has
            # shape (10,), b.dimshuffle('x', 0, 'x', 'x') has shape (1, 10, 1, 1).
            self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
            # Parameters of the convolution + pooling layer.
            self.params = [self.W, self.b]

Defining the hidden layer, HiddenLayer. This is identical to the HiddenLayer in the previous article, "DeepLearning tutorial (3): MLP (multilayer perceptron), principles and detailed code walkthrough", so it is taken over unchanged:

    # This class defines the hidden layer. Its input is `input`; its output is
    # the activations of the hidden units. Input and hidden layer are fully
    # connected.
    # Suppose the input is an n_in-dimensional vector (n_in units) and the
    # hidden layer has n_out units. Because of the full connectivity there are
    # n_in * n_out weights, so W has shape (n_in, n_out): n_in rows and n_out
    # columns, each column holding the weights of one hidden unit.
    # b is the bias; the hidden layer has n_out units, so b is n_out-dimensional.
    # rng: random number generator, numpy.random.RandomState, used to initialize W.
    # input: all inputs used to train the model. This is not the MLP input layer:
    #   the input layer has n_in units, whereas `input` has shape
    #   (n_example, n_in), one sample per row, each row feeding the input layer.
    # activation: activation function, tanh by default.
    class HiddenLayer(object):
        def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                     activation=T.tanh):
            self.input = input  # the input passed to this HiddenLayer

            # To be GPU-compatible the code must use dtype=theano.config.floatX
            # and store the values in theano.shared variables.
            # Initialization rule for W: with tanh, draw values uniformly from
            # [-sqrt(6 / (n_in + n_hidden)), sqrt(6 / (n_in + n_hidden))];
            # with sigmoid, multiply those values by 4.

            # If W was not supplied, initialize it as described above.
            # The check exists because W is sometimes initialized from
            # already-trained parameters (see the previous article).
            if W is None:
                W_values = numpy.asarray(
                    rng.uniform(
                        low=-numpy.sqrt(6. / (n_in + n_out)),
                        high=numpy.sqrt(6. / (n_in + n_out)),
                        size=(n_in, n_out)
                    ),
                    dtype=theano.config.floatX
                )
                if activation == theano.tensor.nnet.sigmoid:
                    W_values *= 4
                W = theano.shared(value=W_values, name='W', borrow=True)

            if b is None:
                b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
                b = theano.shared(value=b_values, name='b', borrow=True)

            # Use the W and b defined above as this layer's W and b.
            self.W = W
            self.b = b

            # Output of the hidden layer.
            lin_output = T.dot(input, self.W) + self.b
            self.output = (
                lin_output if activation is None
                else activation(lin_output)
            )

            # Parameters of the hidden layer.
            self.params = [self.W, self.b]

Defining the classifier (softmax regression). The classifier is softmax, identical to the LogisticRegression in "DeepLearning tutorial (1): Softmax regression, principles and detailed code walkthrough", so it is taken over unchanged:

    # The classification layer, LogisticRegression, i.e. softmax regression.
    # The deep learning tutorial treats LogisticRegression as softmax directly;
    # the familiar two-class logistic regression is the case n_out = 2.
    #
    # Parameters:
    # input: shape (n_example, n_in), where n_example is the size of one batch.
    #   Training uses minibatch SGD, hence this definition of input.
    # n_in: size of the output of the preceding hidden layer.
    # n_out: number of output classes.
    class LogisticRegression(object):
        def __init__(self, input, n_in, n_out):

            # W has n_in rows and n_out columns; b is an n_out-dimensional
            # vector. Each output corresponds to one column of W and one
            # element of b.
            self.W = theano.shared(
                value=numpy.zeros(
                    (n_in, n_out),
                    dtype=theano.config.floatX
                ),
                name='W',
                borrow=True
            )

            self.b = theano.shared(
                value=numpy.zeros(
                    (n_out,),
                    dtype=theano.config.floatX
                ),
                name='b',
                borrow=True
            )

            # input is (n_example, n_in) and W is (n_in, n_out); their dot
            # product is (n_example, n_out). Adding b and passing the result
            # to T.nnet.softmax gives p_y_given_x, where each row holds the
            # estimated class probabilities of one sample.
            # Note: b is n_out-dimensional; adding it to an
            # (n_example, n_out) matrix broadcasts b across all rows.
            self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

            # argmax returns the index of the maximum. For MNIST that index
            # is the class itself. axis=1 means the operation runs per row.
            self.y_pred = T.argmax(self.p_y_given_x, axis=1)

            # Parameters of LogisticRegression.
            self.params = [self.W, self.b]

        # The two methods below are called by evaluate_lenet5 further down;
        # they follow the LogisticRegression class of the Theano tutorial.
        def negative_log_likelihood(self, y):
            return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

        def errors(self, y):
            return T.mean(T.neq(self.y_pred, y))

All the basic building blocks of the CNN are now in place. The next step is to assemble them into LeNet5 (the simplified version described above). Concretely:

    LeNet5 = input + LeNetConvPoolLayer_1 + LeNetConvPoolLayer_2 + HiddenLayer + LogisticRegression + output

The model is then applied to the MNIST dataset and solved with backpropagation to obtain the optimal parameters.

3. Load the MNIST dataset (mnist.pkl.gz)

    # Load the MNIST dataset: load_data()
    def load_data(dataset):
        # dataset is the path of the dataset. The program first checks whether
        # MNIST is present at that path and downloads it if not.
        # This part is not explained further; it is unrelated to the model.
        data_dir, data_file = os.path.split(dataset)
        if data_dir == "" and not os.path.isfile(dataset):
            # Check if dataset is in the data directory.
            new_path = os.path.join(
                os.path.split(__file__)[0],
                "..",
                "data",
                dataset
            )
            if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
                dataset = new_path

        if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
            import urllib
            origin = (
                'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
            )
            print('Downloading data from %s' % origin)
            urllib.urlretrieve(origin, dataset)

        print('... loading data')
        # Everything above only checks for and downloads mnist.pkl.gz and is
        # not the focus here. The real work of load_data starts below.

        # Load train_set, valid_set and test_set from mnist.pkl.gz; all of
        # them include labels. This mainly uses gzip.open() and cPickle.load().
        # 'rb' opens the file for reading in binary mode.
        f = gzip.open(dataset, 'rb')
        train_set, valid_set, test_set = cPickle.load(f)
        f.close()

        # Store the data in shared variables, mainly for GPU acceleration:
        # only shared variables can live in GPU memory.
        # Data on the GPU must be float, but data_y holds class labels, so it
        # is cast back to int before being returned.
        def shared_dataset(data_xy, borrow=True):
            data_x, data_y = data_xy
            shared_x = theano.shared(numpy.asarray(data_x,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            shared_y = theano.shared(numpy.asarray(data_y,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            return shared_x, T.cast(shared_y, 'int32')

        test_set_x, test_set_y = shared_dataset(test_set)
        valid_set_x, valid_set_y = shared_dataset(valid_set)
        train_set_x, train_set_y = shared_dataset(train_set)

        rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
                (test_set_x, test_set_y)]
        return rval

4. Implement LeNet5 and test it

    # Implement LeNet5.
    # LeNet5 has two convolutional layers: the first has 20 kernels and the
    # second has 50.
    def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
                        dataset='mnist.pkl.gz',
                        nkerns=[20, 50], batch_size=500):
        # learning_rate: learning rate, the factor in front of the gradient.
        # n_epochs: number of training epochs; each epoch visits every batch,
        #   i.e. every sample.
        # batch_size: 500 here, i.e. the gradient is computed and the
        #   parameters are updated once per 500 samples.
        # nkerns=[20, 50]: number of kernels in each LeNetConvPoolLayer;
        #   20 in the first and 50 in the second.

        rng = numpy.random.RandomState(23455)

        # Load the data.
        datasets = load_data(dataset)
        train_set_x, train_set_y = datasets[0]
        valid_set_x, valid_set_y = datasets[1]
        test_set_x, test_set_y = datasets[2]

        # Compute the number of batches.
        n_train_batches = train_set_x.get_value(borrow=True).shape[0]
        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
        n_test_batches = test_set_x.get_value(borrow=True).shape[0]
        n_train_batches //= batch_size
        n_valid_batches //= batch_size
        n_test_batches //= batch_size

        # index is the batch index, x the input training data, y its labels.
        index = T.lscalar()
        x = T.matrix('x')
        y = T.ivector('y')

        ######################
        # BUILD ACTUAL MODEL #
        ######################
        print('... building the model')

        # A loaded batch has shape (batch_size, 28 * 28), but
        # LeNetConvPoolLayer expects a 4D input, so it has to be reshaped.
        layer0_input = x.reshape((batch_size, 1, 28, 28))

        # layer0 is the first LeNetConvPoolLayer.
        # A single (28, 28) input image becomes (28-5+1, 28-5+1) = (24, 24)
        # after the convolution and (24/2, 24/2) = (12, 12) after max pooling.
        # Each batch holds batch_size images and the layer has nkerns[0]
        # kernels, so the output of layer0 is (batch_size, nkerns[0], 12, 12).
        layer0 = LeNetConvPoolLayer(
            rng,
            input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5),
            poolsize=(2, 2)
        )

        # layer1 is the second LeNetConvPoolLayer.
        # Its input is the output of layer0. Each (12, 12) feature map becomes
        # (12-5+1, 12-5+1) = (8, 8) after the convolution and
        # (8/2, 8/2) = (4, 4) after max pooling.
        # Each batch holds batch_size images and the layer has nkerns[1]
        # kernels, so the output of layer1 is (batch_size, nkerns[1], 4, 4).
        layer1 = LeNetConvPoolLayer(
            rng,
            input=layer0.output,
            # nkerns[0] input feature maps, i.e. the nkerns[0] maps layer0 outputs
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5),
            poolsize=(2, 2)
        )

        # layer1 is followed by layer2, a fully connected layer equivalent to
        # the hidden layer of an MLP, so it can be built with HiddenLayer.
        # The input of layer2 is 2D: (batch_size, num_pixels). The feature
        # maps that different kernels produce for the same image therefore
        # have to be merged into one vector: the (batch_size, nkerns[1], 4, 4)
        # output of layer1 is flattened to
        # (batch_size, nkerns[1] * 4 * 4) = (500, 800).
        # (500, 800) means 500 samples, one per row. The output of layer2 has
        # shape (batch_size, n_out) = (500, 500).
        layer2_input = layer1.output.flatten(2)
        layer2 = HiddenLayer(
            rng,
            input=layer2_input,
            n_in=nkerns[1] * 4 * 4,
            n_out=500,
            activation=T.tanh
        )

        # The last layer, layer3, is the classification layer, built with
        # LogisticRegression. Its input is the (500, 500) output of layer2 and
        # its output is (batch_size, n_out) = (500, 10).
        layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

        # Cost function: negative log-likelihood (NLL).
        cost = layer3.negative_log_likelihood(y)

        # test_model computes the test error. x and y are bound to concrete
        # data through the given index, and layer3 is evaluated. layer3 in
        # turn evaluates layer2, layer1 and layer0, so test_model is in
        # effect the whole CNN.
        # The input of test_model is x, y; its output is layer3.errors(y),
        # i.e. the error.
        test_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        # validate_model evaluates on the validation set; same analysis.
        validate_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        # train_model involves the optimization algorithm, SGD: gradients
        # must be computed and parameters updated.
        # The set of parameters:
        params = layer3.params + layer2.params + layer1.params + layer0.params

        # Gradients with respect to each parameter.
        grads = T.grad(cost, params)

        # There are too many parameters to write each update rule by hand,
        # so a for ... in generates the rule for every pair:
        # (param_i, param_i - learning_rate * grad_i)
        updates = [
            (param_i, param_i - learning_rate * grad_i)
            for param_i, grad_i in zip(params, grads)
        ]

        # train_model: same analysis as test_model, except that it also
        # carries the updates, which test_model and validate_model do not.
        train_model = theano.function(
            [index],
            cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size: (index + 1) * batch_size],
                y: train_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )

        ###############
        # TRAIN MODEL #
        ###############
        print('... training')
        patience = 10000
        patience_increase = 2
        improvement_threshold = 0.995

        validation_frequency = min(n_train_batches, patience // 2)
        # With this setting the validation set is evaluated once per epoch.

        best_validation_loss = numpy.inf  # best (i.e. lowest) validation loss
        # best_iter counts batches: best_iter = 10000 means that
        # best_validation_loss was reached after training on batch 10000.
        best_iter = 0
        test_score = 0.
        start_time = time.clock()

        epoch = 0
        done_looping = False

        # The training loop. The while loop controls the number of epochs;
        # one epoch visits every batch, i.e. every image.
        # The for loop visits the batches one at a time and trains on each by
        # calling train_model(minibatch_index), whose updates modify the
        # parameters.
        # The for loop also tracks iter, the number of batches trained so far.
        # Whenever iter + 1 is a multiple of validation_frequency, the
        # validation set is evaluated.
        # If this_validation_loss is below best_validation_loss, then
        # best_validation_loss and best_iter are updated and the test set is
        # evaluated.
        # If this_validation_loss is below
        # best_validation_loss * improvement_threshold, patience is updated.
        # Training ends when n_epochs is reached or when patience <= iter.
        while (epoch < n_epochs) and (not done_looping):
            epoch = epoch + 1
            for minibatch_index in xrange(n_train_batches):

                iter = (epoch - 1) * n_train_batches + minibatch_index

                if iter % 100 == 0:
                    print('training @ iter = %i' % iter)
                cost_ij = train_model(minibatch_index)
                # cost_ij is never used afterwards; it only exists because
                # train_model returns a value.
                if (iter + 1) % validation_frequency == 0:

                    # compute zero-one loss on validation set
                    validation_losses = [validate_model(i) for i
                                         in xrange(n_valid_batches)]
                    this_validation_loss = numpy.mean(validation_losses)
                    print('epoch %i, minibatch %i/%i, validation error %f %%' %
                          (epoch, minibatch_index + 1, n_train_batches,
                           this_validation_loss * 100.))

                    if this_validation_loss < best_validation_loss:

                        if this_validation_loss < best_validation_loss * \
                           improvement_threshold:
                            patience = max(patience, iter * patience_increase)

                        best_validation_loss = this_validation_loss
                        best_iter = iter

                        test_losses = [
                            test_model(i)
                            for i in xrange(n_test_batches)
                        ]
                        test_score = numpy.mean(test_losses)
                        print(('     epoch %i, minibatch %i/%i, test error of '
                               'best model %f %%') %
                              (epoch, minibatch_index + 1, n_train_batches,
                               test_score * 100.))

                if patience <= iter:
                    done_looping = True
                    break

        end_time = time.clock()
        sys.stderr.write('The code for file ' +
                         os.path.split(__file__)[1] +
                         ' ran for %.2fm\n' % ((end_time - start_time) / 60.))

Convolutional Neural Networks (LeNet)

The Convolution Operator

ConvOp is the main workhorse for implementing a convolutional layer in Theano. ConvOp is used by theano.tensor.signal.conv2d, which takes two symbolic inputs:

- a 4D tensor corresponding to a mini-batch of input images. The shape of the tensor is: [mini-batch size, number of input feature maps, image height, image width].
- a 4D tensor corresponding to the weight matrix W. The shape of the tensor is: [number of feature maps at layer m, number of feature maps at layer m-1, filter height, filter width].

Below is the Theano code for implementing a convolutional layer similar to the one of Figure 1. The input consists of 3 feature maps (an RGB color image) of size 120x160. We use two convolutional filters with 9x9 receptive fields.

    import theano
    from theano import tensor as T
    from theano.tensor.nnet import conv2d

    import numpy

    rng = numpy.random.RandomState(23455)

    # instantiate 4D tensor for input
    input = T.tensor4(name='input')

    # initialize shared variable for weights.
    w_shp = (2, 3, 9, 9)
    w_bound = numpy.sqrt(3 * 9 * 9)
    W = theano.shared(numpy.asarray(
                rng.uniform(
                    low=-1.0 / w_bound,
                    high=1.0 / w_bound,
                    size=w_shp),
                dtype=input.dtype), name='W')

    # initialize shared variable for bias (1D tensor) with random values
    # IMPORTANT: biases are usually initialized to zero. However in this
    # particular application, we simply apply the convolutional layer to
    # an image without learning the parameters. We therefore initialize
    # them to random values to "simulate" learning.
    b_shp = (2,)
    b = theano.shared(numpy.asarray(
                rng.uniform(low=-.5, high=.5, size=b_shp),
                dtype=input.dtype), name='b')

    # build symbolic expression that computes the convolution of input
    # with filters in w
    conv_out = conv2d(input, W)

    # build symbolic expression to add bias and apply activation function,
    # i.e. produce neural net layer output
    # A few words on dimshuffle:
    #   dimshuffle is a powerful tool in reshaping a tensor;
    #   what it allows you to do is to shuffle dimensions around
    #   but also to insert new ones along which the tensor will be
    #   broadcastable;
    #   dimshuffle('x', 2, 'x', 0, 1)
    #   This will work on 3d tensors with no broadcastable
    #   dimensions. The first dimension will be broadcastable,
    #   then we will have the third dimension of the input tensor as
    #   the second of the resulting tensor, etc. If the tensor has
    #   shape (20, 30, 40), the resulting tensor will have dimensions
    #   (1, 40, 1, 20, 30). (AxBxC tensor is mapped to 1xCx1xAxB tensor)
    #   More examples:
    #    dimshuffle('x') -> make a 0d (scalar) into a 1d vector
    #    dimshuffle(0, 1) -> identity
    #    dimshuffle(1, 0) -> inverts the first and second dimensions
    #    dimshuffle('x', 0) -> make a row out of a 1d vector (N to 1xN)
    #    dimshuffle(0, 'x') -> make a column out of a 1d vector (N to Nx1)
    #    dimshuffle(2, 0, 1) -> AxBxC to CxAxB
    #    dimshuffle(0, 'x', 1) -> AxB to Ax1xB
    #    dimshuffle(1, 'x', 0) -> AxB to Bx1xA
    output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))

    # create theano function to compute filtered images
    f = theano.function([input], output)

MaxPooling

    from theano.tensor.signal import pool

    input = T.dtensor4('input')
    maxpool_shape = (2, 2)
    pool_out = pool.pool_2d(input, maxpool_shape, ignore_border=True)
    f = theano.function([input], pool_out)

    invals = numpy.random.RandomState(1).rand(3, 2, 5, 5)
    print('With ignore_border set to True:')
    print('invals[0, 0, :, :] =\n%s' % invals[0, 0, :, :])
    print('output[0, 0, :, :] =\n%s' % f(invals)[0, 0, :, :])

    pool_out = pool.pool_2d(input, maxpool_shape, ignore_border=False)
    f = theano.function([input], pool_out)
    print('With ignore_border set to False:')
    print('invals[1, 0, :, :] =\n%s' % invals[1, 0, :, :])
    print('output[1, 0, :, :] =\n%s' % f(invals)[1, 0, :, :])

The Full Model: LeNet

Note that the term "convolution" can correspond to different mathematical operations:

1. theano.tensor.nnet.conv2d, which is the most common one in almost all recently published convolutional models. In this operation, each output feature map is connected to each input feature map by a different 2D filter, and its value is the sum of the individual convolutions of all inputs through the corresponding filters.
2. The convolution used in the original LeNet model. In that work, each output feature map is connected only to a subset of the input feature maps.
3. The convolution used in signal processing, theano.tensor.signal.conv.conv2d, which works only on single-channel inputs.

Here we use the first operation, so this model differs slightly from the original LeNet paper. One reason to use 2 would be to reduce the amount of computation needed, but modern hardware makes the full connection pattern just as fast. Another reason would be to slightly reduce the number of free parameters, but we have other regularization techniques at our disposal.

    class LeNetConvPoolLayer(object):
        """Pool Layer of a convolutional network """

        def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
            """
            Allocate a LeNetConvPoolLayer with shared variable internal parameters.

            :type rng: numpy.random.RandomState
            :param rng: a random number generator used to initialize weights

            :type input: theano.tensor.dtensor4
            :param input: symbolic image tensor, of shape image_shape

            :type filter_shape: tuple or list of length 4
            :param filter_shape: (number of filters, num input feature maps,
                                  filter height, filter width)

            :type image_shape: tuple or list of length 4
            :param image_shape: (batch size, num input feature maps,
                                 image height, image width)

            :type poolsize: tuple or list of length 2
            :param poolsize: the downsampling (pooling) factor (#rows, #cols)
            """

            assert image_shape[1] == filter_shape[1]
            self.input = input

            # there are "num input feature maps * filter height * filter width"
            # inputs to each hidden unit
            fan_in = numpy.prod(filter_shape[1:])
            # each unit in the lower layer receives a gradient from:
            # "num output feature maps * filter height * filter width" /
            #   pooling size
            fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                       numpy.prod(poolsize))
            # initialize weights with random weights
            W_bound = numpy.sqrt(6. / (fan_in + fan_out))
            self.W = theano.shared(
                numpy.asarray(
                    rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                    dtype=theano.config.floatX
                ),
                borrow=True
            )

            # the bias is a 1D tensor -- one bias per output feature map
            b_values = numpy.zeros((filter_shape[0],), dtyp
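
As a rough cross-check of the shape arithmetic and the weight-initialization bound described for LeNetConvPoolLayer, here is a minimal NumPy-only sketch. The helper names `conv_pool_output_size` and `uniform_init_bound` are hypothetical and not part of Theano; the sketch assumes a "valid" convolution with stride 1 and non-overlapping pooling that drops incomplete windows, which is what the shape comments in the walkthrough imply.

```python
import numpy as np


def conv_pool_output_size(image_size, filter_size, pool_size):
    """Side length after a 'valid' convolution followed by non-overlapping pooling.

    Assumes stride 1 for the convolution and that incomplete pooling windows
    are dropped (the ignore_border=True behaviour described in the text).
    """
    conv_size = image_size - filter_size + 1
    return conv_size // pool_size


def uniform_init_bound(filter_shape, poolsize):
    """Bound sqrt(6 / (fan_in + fan_out)) used to draw the initial kernels.

    filter_shape is (number of filters, num input feature maps, height, width).
    """
    fan_in = int(np.prod(filter_shape[1:]))
    fan_out = filter_shape[0] * int(np.prod(filter_shape[2:])) // int(np.prod(poolsize))
    return float(np.sqrt(6.0 / (fan_in + fan_out)))


# Sizes from the walkthrough: 28x28 images, 5x5 kernels, 2x2 pooling, nkerns = [20, 50].
layer0_size = conv_pool_output_size(28, 5, 2)            # 28 -> 24 -> 12
layer1_size = conv_pool_output_size(layer0_size, 5, 2)   # 12 -> 8 -> 4
hidden_inputs = 50 * layer1_size * layer1_size           # n_in of the hidden layer

# First layer: fan_in = 1*5*5 = 25, fan_out = 20*5*5/4 = 125.
w_bound_layer0 = uniform_init_bound((20, 1, 5, 5), (2, 2))

print(layer0_size, layer1_size, hidden_inputs, w_bound_layer0)
```

Changing the kernel or pooling sizes in the calls above shows directly how `n_in` of the hidden layer would have to change to stay consistent with the output of the second convolution + pooling layer.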

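
The pooling and bias steps discussed earlier (max pooling with incomplete border windows ignored, then adding one bias per feature map through a `dimshuffle('x', 0, 'x', 'x')`-style broadcast and applying tanh) can be illustrated without Theano. This is only a sketch under stated assumptions: `max_pool_2d` here is a hypothetical NumPy helper, not the Theano function, and the input values and biases are made up for illustration.

```python
import numpy as np


def max_pool_2d(x, pool=(2, 2)):
    """Non-overlapping max pooling over the last two axes.

    Rows and columns that do not fill a complete window are dropped,
    mirroring the ignore_border=True behaviour described in the text.
    """
    ph, pw = pool
    h, w = x.shape[-2] // ph, x.shape[-1] // pw
    cropped = x[..., :h * ph, :w * pw]
    # Split each spatial axis into (number of windows, window size).
    windows = cropped.reshape(x.shape[:-2] + (h, ph, w, pw))
    # Take the maximum inside every window.
    return windows.max(axis=(-3, -1))


# Illustrative input: batch of 2 images, 3 feature maps each, 5x5 pixels.
feature_maps = np.arange(2 * 3 * 5 * 5, dtype=float).reshape(2, 3, 5, 5)
pooled = max_pool_2d(feature_maps)  # 5x5 -> 2x2; the last row and column are dropped

# One bias per feature map. Indexing with np.newaxis gives shape (1, 3, 1, 1),
# the NumPy counterpart of b.dimshuffle('x', 0, 'x', 'x').
bias = np.array([0.1, 0.2, 0.3])
activated = np.tanh(pooled + bias[np.newaxis, :, np.newaxis, np.newaxis])

print(pooled.shape, activated.shape)
```

The reshape-then-max formulation only works for non-overlapping windows; overlapping pooling would need a different approach.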