How TensorFlow Conv2D and MaxPool2D Work

A convolutional neural network (CNN) is a neural network in which at least one layer uses the convolution operation, hence the name.

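So what is convolution? To convolve a 2D image, first define a convolution kernel, that is, a weight matrix that indicates which positions within each window matter more. The kernel then scans across the 2D input step by step. At each position it multiplies the weight matrix element-wise with the patch of data it currently covers and sums the products to produce one output pixel.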
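As a rough illustration of the multiply-and-sum step described above, the following plain-Python sketch computes one output pixel for a single kernel position. The helper name `window_response` and the sample values are made up for this illustration and are not part of TensorFlow.

```python
def window_response(patch, kernel):
    """Multiply a patch by the kernel element-wise and sum the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


# Hypothetical 3x3 image patch and 3x3 kernel (weight matrix).
patch = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 0.0],
    [3.0, 0.0, 1.0],
]
kernel = [
    [1.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 1.0],
]

pixel = window_response(patch, kernel)
print(pixel)  # -2.0
```

Like tf.nn.conv2d, this sketch does not flip the kernel before multiplying, so strictly speaking it computes a cross-correlation.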

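The stride is the number of cells the kernel moves at each step, and padding controls whether the input data is padded around its edges. Without padding (the VALID option in TensorFlow) the output is smaller than the input. With suitable padding (the SAME option in TensorFlow) the output can keep the same size as the input when the stride is 1. The figure below illustrates this: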
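The effect of stride and padding on the output size can be sketched with a small helper. The function name `conv_output_size` is hypothetical; the two formulas are the ones TensorFlow documents for VALID and SAME padding along one spatial dimension.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Output length along one spatial dimension for VALID or SAME padding."""
    if padding == "VALID":
        # Only positions where the kernel fits entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    if padding == "SAME":
        # The input is padded so the output depends only on the stride.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


print(conv_output_size(6, 3, 1, "VALID"))  # 4
print(conv_output_size(6, 3, 1, "SAME"))   # 6
print(conv_output_size(4, 2, 2, "VALID"))  # 2
```

The first and third calls use the sizes of the 3x3 convolution and the 2x2 pooling step discussed in this article.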

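So what is pooling? Pooling is also called subsampling or downsampling. It is mainly used to reduce the dimensionality of features: it compresses the amount of data and the number of parameters while keeping the features reasonably robust to small rotations, translations and scalings, which helps reduce overfitting and improves the model's fault tolerance. Two common variants are mean pooling, which tends to preserve more of the image's background information, and max pooling, which tends to preserve more texture information. As the figure below shows, pooling compresses the data considerably: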
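To make the difference between the two pooling variants concrete, here is a minimal plain-Python sketch that applies both to the same feature map. The helper `pool2d` and the 4x4 sample values are assumptions for illustration only; the sketch assumes non-overlapping windows whose size divides the input evenly.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to non-overlapping size x size windows (stride = size)."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


# Hypothetical 4x4 feature map.
feature_map = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 8.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]

max_pooled = pool2d(feature_map, 2, max)
mean_pooled = pool2d(feature_map, 2, mean)
print(max_pooled)   # [[7.0, 8.0], [2.0, 4.0]]
print(mean_pooled)  # [[4.0, 2.0], [2.0, 4.0]]
```

In the upper-right window a single strong response (8.0) survives max pooling but is averaged down to 2.0 by mean pooling.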

How it works

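The following test program is used to discuss how Conv2D (convolution) and MaxPool2D (pooling) are actually computed:

First define an input containing 1 image of size 6x6 with 1 channel, so the input data img has shape [1, 6, 6, 1].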

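Then define the filter. The input data is in NHWC format (batch, height, width, channels). The filter passed to tf.nn.conv2d uses a different layout, [filter_height, filter_width, in_channels, out_channels], which is [3, 3, 1, 1] for a single 3x3 filter on a 1-channel image.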

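Test program

In the program and figures below, a 3x3 filter (this kernel responds to features running from the upper left to the lower right of the image) moves over the 6x6 image with a stride of 1x1, from left to right and from top to bottom, computing one output value for each masked region.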

Convolution calculation:

Figure 1: 1*1 + 0*(-1) + 0*(-1) + 0*(-1) + 1*1 + 0*(-1) + 0*(-1) + 0*(-1) + 1*1 = 3

Figure 2: 1*1 + 0*(-1) + 0*(-1) + 0*(-1) + 1*1 + 1*(-1) + 0*(-1) + 0*(-1) + 0*1 = 1
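The two sums above can be checked by sliding the filter over the whole image in plain Python. This is only a sketch of the arithmetic, not TensorFlow's implementation; the helper `conv2d_valid` is hypothetical and assumes a square image, a square kernel and no padding. The image and filter values are copied from the test program.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and sum the products."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# The 6x6 image and 3x3 filter1 from the test program, as nested lists.
img = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
filter1 = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]

op1 = conv2d_valid(img, filter1)
for row in op1:
    print(row)
```

The entries at positions (0, 0) and (1, 1) of the resulting 4x4 map are 3 and 1, which match the window contents used in the Figure 1 and Figure 2 sums.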

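Pooling calculation:

Figures 3, 4, 5 and 6 each take the maximum value within their region as the pooling result (extracting the image's texture features rather than its background).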

As a result, the original 6x6 matrix becomes a 4x4 matrix after the convolution and then a 2x2 matrix after pooling.
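The pooling step can be sketched the same way. The helper name `max_pool2d` below is a plain-Python stand-in written for this illustration, not the TensorFlow function itself, and it assumes a square input without padding. The 4x4 input is the op1 result printed in the program output.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling without padding over a square 2D list."""
    out_size = (len(feature_map) - size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


# The 4x4 convolution result (op1) from the program output.
op1 = [
    [3, -1, -3, -1],
    [-3, 1, 0, -3],
    [-3, -3, 0, 1],
    [3, -2, -2, -1],
]

op2 = max_pool2d(op1)
print(op2)  # [[3, 0], [3, 1]]
```

Each of the four 2x2 regions contributes its largest value, which gives the 2x2 result reported as op2.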

With filter1, a single filter, the output is:

******************** op2 ********************
tf.Tensor(
[[[[3.] [0.]]
  [[3.] [1.]]]], shape=(1, 2, 2, 1), dtype=float32)

With filter3, which holds 2 filters, the output is:

******************** op4 ********************
tf.Tensor(
[[[[ 3. -1.] [ 0.  1.]]
  [[ 3.  0.] [ 1.  3.]]]], shape=(1, 2, 2, 2), dtype=float32)
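It may not be obvious how one flat list becomes two filters. A reshape to [3, 3, 1, 2] is row-major, so the last axis (the output channel) varies fastest and consecutive values alternate between the two filters. The sketch below, which uses hypothetical helper names and plain lists instead of tensors, splits the flat list from the test program back into its two 3x3 filters.

```python
# The 18 values passed to tf.reshape for filter3 in the test program.
flat = [
    1.0, -1.0, -1.0, 1.0, -1.0, -1.0,
    -1.0, -1.0, 1.0, 1.0, -1.0, -1.0,
    -1.0, -1.0, -1.0, 1.0, 1.0, -1.0,
]


def to_rows(values, width=3):
    """Group a flat list into rows of the given width."""
    return [values[i:i + width] for i in range(0, len(values), width)]


# Even positions belong to output channel 0, odd positions to channel 1.
channel0 = to_rows(flat[0::2])
channel1 = to_rows(flat[1::2])

print(channel0)  # [[1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
print(channel1)  # [[-1.0, 1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, 1.0, -1.0]]
```

Channel 0 has the same values as filter1 (the diagonal detector) and channel 1 the same values as filter2 (a vertical-line detector), which is why the first channel of op4 repeats the op2 result.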

The complete program is as follows:

from __future__ import division, print_function, absolute_import

import tensorflow as tf

# Input: 1 image, 6x6 pixels, 1 channel (NHWC).
img = tf.reshape([
    1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
    0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
    0.0, 0.0, 1.0, 1.0, 0.0, 0.0,
    1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
    0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
    0.0, 0.0, 1.0, 0.0, 1.0, 0.0], [1, 6, 6, 1])

# Filters have shape [height, width, in_channels, out_channels].
# filter1 responds to diagonals running from upper left to lower right.
filter1 = tf.reshape([
    1.0, -1.0, -1.0,
    -1.0, 1.0, -1.0,
    -1.0, -1.0, 1.0], [3, 3, 1, 1])

# filter2 responds to vertical lines.
filter2 = tf.reshape([
    -1.0, 1.0, -1.0,
    -1.0, 1.0, -1.0,
    -1.0, 1.0, -1.0], [3, 3, 1, 1])

print(img)
print(filter1)
print(filter2)

# One filter: convolution, then 2x2 max pooling with stride 2.
op1 = tf.nn.conv2d(img, filter1, strides=[1, 1, 1, 1], padding='VALID')
print('*' * 20 + ' op1 ' + '*' * 20)
print(op1)

op2 = tf.nn.max_pool2d(input=op1, ksize=[2, 2], strides=[2, 2], padding='VALID')
print('*' * 20 + ' op2 ' + '*' * 20)
print(op2)

# Two filters in one tensor: the values of filter1 and filter2 interleaved.
filter3 = tf.reshape([
    1.0, -1.0, -1.0, 1.0, -1.0, -1.0,
    -1.0, -1.0, 1.0, 1.0, -1.0, -1.0,
    -1.0, -1.0, -1.0, 1.0, 1.0, -1.0], [3, 3, 1, 2])

op3 = tf.nn.conv2d(img, filter3, strides=[1, 1, 1, 1], padding='VALID')
print('*' * 20 + ' op3 ' + '*' * 20)
print(op3)

op4 = tf.nn.max_pool2d(input=op3, ksize=[2, 2], strides=[2, 2], padding='VALID')
print('*' * 20 + ' op4 ' + '*' * 20)
print(op4)

The program output is:

tf.Tensor(
[[[[1.] [0.] [0.] [0.] [0.] [1.]]
  [[0.] [1.] [0.] [0.] [1.] [0.]]
  [[0.] [0.] [1.] [1.] [0.] [0.]]
  [[1.] [0.] [0.] [0.] [1.] [0.]]
  [[0.] [1.] [0.] [0.] [1.] [0.]]
  [[0.] [0.] [1.] [0.] [1.] [0.]]]], shape=(1, 6, 6, 1), dtype=float32)
tf.Tensor(
[[[[ 1.]] [[-1.]] [[-1.]]]
 [[[-1.]] [[ 1.]] [[-1.]]]
 [[[-1.]] [[-1.]] [[ 1.]]]], shape=(3, 3, 1, 1), dtype=float32)
tf.Tensor(
[[[[-1.]] [[ 1.]] [[-1.]]]
 [[[-1.]] [[ 1.]] [[-1.]]]
 [[[-1.]] [[ 1.]] [[-1.]]]], shape=(3, 3, 1, 1), dtype=float32)
******************** op1 ********************
tf.Tensor(
[[[[ 3.] [-1.] [-3.] [-1.]]
  [[-3.] [ 1.] [ 0.] [-3.]]
  [[-3.] [-3.] [ 0.] [ 1.]]
  [[ 3.] [-2.] [-2.] [-1.]]]], shape=(1, 4, 4, 1), dtype=float32)
******************** op2 ********************
tf.Tensor(
[[[[3.] [0.]]
  [[3.] [1.]]]], shape=(1, 2, 2, 1), dtype=float32)
******************** op3 ********************
tf.Tensor(
[[[[ 3. -1.] [-1. -1.] [-3. -1.] [-1. -1.]]
  [[-3. -1.] [ 1. -1.] [ 0. -2.] [-3.  1.]]
  [[-3. -1.] [-3. -1.] [ 0. -2.] [ 1.  1.]]
  [[ 3. -1.] [-2.  0.] [-2. -4.] [-1.  3.]]]], shape=(1, 4, 4, 2), dtype=float32)
******************** op4 ********************
tf.Tensor(
[[[[ 3. -1.] [ 0.  1.]]
  [[ 3.  0.] [ 1.  3.]]]], shape=(1, 2, 2, 2), dtype=float32)
