Chapter 5: Implementing Convolutional Neural Networks

This chapter covers the following two topics.

  • Cervical cancer classification
  • Digit recognition

Cervical cancer classification

The dataset can be downloaded from https://www.kaggle.com/c/intel-mobileodt-cervical-cancer-screening .

  • train.7z : training set. Each image is labeled with one of three classes: Type_1, Type_2, or Type_3.
  • test.7z : test set

First, clone https://github.com/ml-resources/deeplearning-keras/tree/ed1/ch05 from GitHub.

[Note] The old competition data no longer seems to be downloadable from Kaggle, so this topic is skipped for now.


Digit recognition

The task is to classify 28x28 grayscale images (pixel intensities 0-255) into the ten classes '0', '1', ..., '9'.

In [1]:
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
Using TensorFlow backend.
In [2]:
import numpy as np
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.layers.core import Activation

np.random.seed(0)

matplotlib.pyplot

subplot(nrows, ncols, index, **kwargs)
subplot(pos, **kwargs)
subplot(ax)

pos is a three-digit integer whose digits encode nrows, ncols, and index. Note that index starts at 1 and runs row by row. In the example below, subplot(221) selects the first (top-left) cell of a grid with 2 rows and 2 columns.
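As a minimal standalone sketch of this index order (not part of the original notebook), the following draws the panel number into each cell of a 2x2 grid; subplot(2, 2, k) and subplot(22k) address the same cell:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# k runs 1..4 across the grid row by row: top-left, top-right, bottom-left, bottom-right
for k in range(1, 5):
    ax = plt.subplot(2, 2, k)
    ax.text(0.5, 0.5, str(k), ha="center", va="center", fontsize=24)
    ax.set_xticks([])
    ax.set_yticks([])
plt.savefig("subplot_grid.png")
```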

In [4]:
import matplotlib.pyplot as plt
from keras import backend as K

K.set_image_data_format('channels_last')

# plot 4 images
plt.subplot(221)
plt.imshow(X_train[1], cmap=plt.get_cmap('gray'))
plt.subplot(222)
plt.imshow(X_train[2], cmap=plt.get_cmap('gray'))
plt.subplot(223)
plt.imshow(X_train[3], cmap=plt.get_cmap('gray'))
plt.subplot(224)
plt.imshow(X_train[4], cmap=plt.get_cmap('gray'))

plt.show()
In [5]:
# normalize the dataset
X_train = X_train / 255
X_test = X_test / 255

# data exploration
print("Number of training examples =", X_train.shape[0])
print("Number of classes =", len(np.unique(y_train)))
print("Dimension of images =", X_train[1].shape)

unique, count = np.unique(y_train, return_counts=True)
print("Number of occurrences of each class =", dict(zip(unique, count)))
Number of training examples = 60000
Number of classes = 10
Dimension of images = (28, 28)
Number of occurrences of each class = {0: 5923, 1: 6742, 2: 5958, 3: 6131, 4: 5842, 5: 5421, 6: 5918, 7: 6265, 8: 5851, 9: 5949}
In [6]:
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
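np_utils.to_categorical turns the integer labels into one-hot rows, which is what categorical_crossentropy expects. The same transform can be written in plain NumPy, which makes the effect easy to check (to_one_hot is an illustrative helper, not part of Keras):

```python
import numpy as np

def to_one_hot(y, num_classes=None):
    """Plain-NumPy equivalent of np_utils.to_categorical for 1-D integer labels."""
    y = np.asarray(y, dtype=int)
    if num_classes is None:
        num_classes = y.max() + 1
    out = np.zeros((y.size, num_classes), dtype="float32")
    out[np.arange(y.size), y] = 1.0  # set a single 1 in each row at the label's column
    return out

enc = to_one_hot([0, 3, 9])
print(enc.shape)  # (3, 10): one row per label, one column per class
```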
In [7]:
model = Sequential()
model.add(Conv2D(40, kernel_size=5, padding="same", input_shape=(28, 28, 1), activation='relu'))
model.add(Conv2D(50, kernel_size=5, padding="valid", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Dropout(0.2))

model.add(Flatten())
model.add(Dense(100))
model.add(Activation("relu"))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation("softmax"))

model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 28, 28, 40)        1040      
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 50)        50050     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 50)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 50)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 7200)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               720100    
_________________________________________________________________
activation_1 (Activation)    (None, 100)               0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 100)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
_________________________________________________________________
activation_2 (Activation)    (None, 10)                0         
=================================================================
Total params: 772,200
Trainable params: 772,200
Non-trainable params: 0
_________________________________________________________________
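The parameter counts in the summary can be reproduced by hand: a Conv2D layer has (kernel_h * kernel_w * in_channels + 1) * filters weights (the +1 is the per-filter bias), and a Dense layer has (in_units + 1) * out_units. A quick check against the table above:

```python
def conv2d_params(kh, kw, in_ch, filters):
    # one (kh x kw x in_ch) kernel plus one bias per output filter
    return (kh * kw * in_ch + 1) * filters

def dense_params(in_units, out_units):
    # full weight matrix plus one bias per output unit
    return (in_units + 1) * out_units

assert conv2d_params(5, 5, 1, 40) == 1040        # conv2d_1
assert conv2d_params(5, 5, 40, 50) == 50050      # conv2d_2
assert dense_params(12 * 12 * 50, 100) == 720100 # dense_1 (Flatten gives 7200 inputs)
assert dense_params(100, 10) == 1010             # dense_2
print(1040 + 50050 + 720100 + 1010)  # 772200, matching "Total params"
```

The pooling, dropout, flatten, and activation layers contribute no parameters, which is why they show 0 in the summary.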
In [8]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=32, batch_size=200, validation_split=0.2)
scores = model.evaluate(X_test, y_test, verbose=10)
print(scores)
Train on 48000 samples, validate on 12000 samples
Epoch 1/32
48000/48000 [==============================] - 99s 2ms/step - loss: 0.2473 - acc: 0.9239 - val_loss: 0.0631 - val_acc: 0.9806
Epoch 2/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0663 - acc: 0.9794 - val_loss: 0.0517 - val_acc: 0.9848
Epoch 3/32
48000/48000 [==============================] - 97s 2ms/step - loss: 0.0454 - acc: 0.9857 - val_loss: 0.0379 - val_acc: 0.9893
Epoch 4/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0339 - acc: 0.9891 - val_loss: 0.0384 - val_acc: 0.9886
Epoch 5/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0273 - acc: 0.9913 - val_loss: 0.0416 - val_acc: 0.9882
Epoch 6/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0250 - acc: 0.9921 - val_loss: 0.0381 - val_acc: 0.9898
Epoch 7/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0188 - acc: 0.9938 - val_loss: 0.0380 - val_acc: 0.9891
Epoch 8/32
48000/48000 [==============================] - 100s 2ms/step - loss: 0.0167 - acc: 0.9941 - val_loss: 0.0347 - val_acc: 0.9907
Epoch 9/32
48000/48000 [==============================] - 99s 2ms/step - loss: 0.0134 - acc: 0.9956 - val_loss: 0.0359 - val_acc: 0.9899
Epoch 10/32
48000/48000 [==============================] - 97s 2ms/step - loss: 0.0122 - acc: 0.9959 - val_loss: 0.0339 - val_acc: 0.9916
Epoch 11/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0101 - acc: 0.9967 - val_loss: 0.0390 - val_acc: 0.9905
Epoch 12/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0116 - acc: 0.9961 - val_loss: 0.0371 - val_acc: 0.9901
Epoch 13/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0099 - acc: 0.9967 - val_loss: 0.0378 - val_acc: 0.9912
Epoch 14/32
48000/48000 [==============================] - 98s 2ms/step - loss: 0.0105 - acc: 0.9965 - val_loss: 0.0423 - val_acc: 0.9902
Epoch 15/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0079 - acc: 0.9973 - val_loss: 0.0382 - val_acc: 0.9903
Epoch 16/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0083 - acc: 0.9971 - val_loss: 0.0448 - val_acc: 0.9897
Epoch 17/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0085 - acc: 0.9971 - val_loss: 0.0448 - val_acc: 0.9906
Epoch 18/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0069 - acc: 0.9977 - val_loss: 0.0370 - val_acc: 0.9919
Epoch 19/32
48000/48000 [==============================] - 95s 2ms/step - loss: 0.0089 - acc: 0.9967 - val_loss: 0.0346 - val_acc: 0.9918
Epoch 20/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0063 - acc: 0.9978 - val_loss: 0.0401 - val_acc: 0.9911
Epoch 21/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0060 - acc: 0.9978 - val_loss: 0.0430 - val_acc: 0.9909
Epoch 22/32
48000/48000 [==============================] - 98s 2ms/step - loss: 0.0069 - acc: 0.9976 - val_loss: 0.0411 - val_acc: 0.9918
Epoch 23/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0059 - acc: 0.9980 - val_loss: 0.0383 - val_acc: 0.9928
Epoch 24/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0057 - acc: 0.9983 - val_loss: 0.0402 - val_acc: 0.9919
Epoch 25/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0046 - acc: 0.9983 - val_loss: 0.0499 - val_acc: 0.9895
Epoch 26/32
48000/48000 [==============================] - 95s 2ms/step - loss: 0.0049 - acc: 0.9983 - val_loss: 0.0441 - val_acc: 0.9921
Epoch 27/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0048 - acc: 0.9984 - val_loss: 0.0478 - val_acc: 0.9913
Epoch 28/32
48000/48000 [==============================] - 95s 2ms/step - loss: 0.0052 - acc: 0.9982 - val_loss: 0.0510 - val_acc: 0.9917
Epoch 29/32
48000/48000 [==============================] - 93s 2ms/step - loss: 0.0058 - acc: 0.9982 - val_loss: 0.0465 - val_acc: 0.9911
Epoch 30/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0040 - acc: 0.9986 - val_loss: 0.0474 - val_acc: 0.9910
Epoch 31/32
48000/48000 [==============================] - 96s 2ms/step - loss: 0.0047 - acc: 0.9985 - val_loss: 0.0418 - val_acc: 0.9913
Epoch 32/32
48000/48000 [==============================] - 94s 2ms/step - loss: 0.0045 - acc: 0.9985 - val_loss: 0.0448 - val_acc: 0.9921
[0.03423846114096948, 0.993]
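model.evaluate returns [loss, accuracy], so the test accuracy here is about 99.3%. The accuracy is simply the fraction of samples where the arg-max of the softmax output matches the true label. A small self-contained sketch with made-up probabilities (no trained model needed; the numbers are illustrative only):

```python
import numpy as np

# fake softmax outputs for 4 samples over 3 classes
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
y_true = np.array([0, 1, 2, 1])  # integer ground-truth labels

# same idea as model.predict(X_test).argmax(axis=1) on the real model
y_pred = probs.argmax(axis=1)
accuracy = (y_pred == y_true).mean()
print(accuracy)  # 3 of 4 rows match -> 0.75
```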
In [ ]: