"Save to image file".
For the sake of simplicity, this tutorial will use the image data from the database of tensorflow.keras.
Google Colab can use tensorflow by default, but if tensorflow is not installed in your jupyter notebook environment, install it whth the pip command.
## If you use jupyter notebook on your PC, please install 'tensorflow' package.
# ! pip install tensorflow
uint8 type arrays of [0, 255] and color images to float32 type arrays of [0, 1], and then apply Axes.imshow().
Parameters:
X: Numpy Array or PIL image
Shapes of the supported Numpy array type:
(M, N) ... a gray scale image. call it with cmap='gray', vmin=0, vmax=255
(M, N, 3 or 4) ... a color image. Numpy array with 'float32' type elements of range[0.0, 1.0], or with uint8' type elements of range [0, 255].
For a gray scale image (number of channel = 1), convert it to a Numpy array with 'unit8' type elements of range [0, 255]. And for a color image (number of channels = 3 or 4), convert it to a Numpy array with 'float32' type element of range [0.0, 1.0].
Then apply the 'Axes.imshow()' function to the image.
Axes.imshow()¶The image displayed must be a 2-dimensional array. If the elements of the array are integers, the brightness (0: black, 255: white) is expressed in range [0, 255]. When the element of the array are 'float', the brightness (0.0: black, 1.0: white) is expressed in the range [0.0, 1.0].
import tensorflow as tf
print(tf.__version__)
2.8.0
# Prepare the image data of MNIST (handwritten characters)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz 11493376/11490434 [==============================] - 0s 0us/step 11501568/11490434 [==============================] - 0s 0us/step (60000, 28, 28) (60000,) (10000, 28, 28) (10000,)
# Examine the types of elements of the Numpy array of a gray scale image.
type(x_train[0][0][0])
numpy.uint8
# sample code 7-1
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
fig, ax = plt.subplots(1, 2, figsize=(2.8 * 2, 2.8))
img1 = x_train[0]
img2 = 255 - img1 # invert image
ax[0].imshow(img1, cmap='gray')
ax[0].axis('off')
ax[1].imshow(img2, cmap='gray')
ax[1].axis('off')
plt.show()
Axes.imshow()¶Color image data is a tensor in (Rows, Cols, Channels) or (Channels, Rows, Cols) format. The number of channels element is 3 which represents the brightness of RGB.
In CIFAR10 using now, the image format is (Rows, Cols, Channels).
When each element of RGB is an integer, it represents brightness (0: dark, 255: bright) in the range [0, 255]. When each element of RGB is a floating fraction, the brightness (0.0: dark, 1.0: bright) is expressed in the range [0.0, 1.0].
# Prepare image data of cifar10
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz 170500096/170498071 [==============================] - 4s 0us/step 170508288/170498071 [==============================] - 4s 0us/step (50000, 32, 32, 3) (50000, 1) (10000, 32, 32, 3) (10000, 1)
# Examine the types of elements of the Numpy array of a RGB image.
type(x_train[0][0][0][0])
numpy.uint8
# sample code 7-2
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
fig, ax = plt.subplots(1, 2, figsize=(3.2 * 2, 3.2))
img1 = x_train[6] # [0, 255]
img2 = img1.astype('float32') / 255. # [0.0, 1.0]
ax[0].imshow(img1)
ax[0].axis('off')
ax[1].imshow(img2)
ax[1].axis('off')
plt.show()
# The types of elements of the Numpy array of a RGB image.
# When using images in Deep Learning, it is easier to use 'float32' type in the range of [0.0, 1.0] or [-1.0, 1.0].
type(img2[0,0,0])
numpy.float32
In preparation for 7-3, download (a part of) the face image file of VidTIMIT dataset from the network and extract it.
Official WWW of VidTIMIT dataset:
http://conradsanderson.id.au/vidtimit/
zip files of 2 persons of VidTIMIT dataset:
https://zenodo.org/record/158963/files/fadg0.zip
https://zenodo.org/record/158963/files/faks0.zip
# Download data from the specified URL to the specified path.
import os
import urllib.request
url = 'https://zenodo.org/record/158963/files/fadg0.zip'
filepath = 'data/fadg0.zip'
dpath, fname = os.path.split(filepath)
os.makedirs(dpath, exist_ok=True)
urllib.request.urlretrieve(url, filepath)
('data/fadg0.zip', <http.client.HTTPMessage at 0x7f0a1626e250>)
# Examine the downloaded file.
if os.name == 'nt':
LS = 'dir'
LS_R = 'dir /s'
else:
LS = 'ls -l'
LS_R = 'ls -lR'
!{LS} data
total 79684 -rw-r--r-- 1 root root 81593138 Mar 28 13:43 fadg0.zip
# Extract the zip file to the specified folder.
import zipfile
with zipfile.ZipFile(filepath, 'r') as f:
f.extractall(dpath)
! {LS} data
total 79688 drwxr-xr-x 4 root root 4096 Mar 28 13:43 fadg0 -rw-r--r-- 1 root root 81593138 Mar 28 13:43 fadg0.zip
! {LS} data/fadg0
total 8 drwxr-xr-x 2 root root 4096 Mar 28 13:43 audio drwxr-xr-x 15 root root 4096 Mar 28 13:43 video
! {LS} data/fadg0/video
total 84 drwxr-xr-x 2 root root 12288 Mar 28 13:43 head drwxr-xr-x 2 root root 12288 Mar 28 13:43 head2 drwxr-xr-x 2 root root 20480 Mar 28 13:43 head3 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sa1 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sa2 drwxr-xr-x 2 root root 4096 Mar 28 13:43 si1279 drwxr-xr-x 2 root root 4096 Mar 28 13:43 si1909 drwxr-xr-x 2 root root 4096 Mar 28 13:43 si649 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sx109 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sx19 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sx199 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sx289 drwxr-xr-x 2 root root 4096 Mar 28 13:43 sx379
It is asumed that the image file is in the following path.
./data/fadg0/video/head/[0-9]*
load_img() function of 'tensorflow.keras' to load image data from an image file.load_img() / img_to_array() / array_to_img() / save_img() function is in either
Since the return value of the load_img() function is PIL format image data, it is easier to use later if it is converted to a Numpy array.
image_pil = load_img(path)
load_img() function might be immediately converted from PIL format to Numpy array. load_img() / img_to_array() / array_to_img() / save_img() function is in either
load_img() function is PIL format image data, it is easier to use later if it is converted to a Numpy array.
You can use the img_to_array() function for this conversion, but the numpy.array() function seems to be more popular.
The img_to_array() function returns a Numpy array whose element is 'uint8' of range [0, 255].
Applying np.array() to PIL format data without specifying dtype parameter also returns a Numpy array whose element is 'uint8' of range [0, 255].
image_uint8 = np.array(imgage_pil)
When passing image data through a neural network, convert it to the Numpy array with 'float32' element type of the range [0.0, 1.0] or [-1.0, 1.0].
Use the following code to convert from a Numpy array of 'uint8' element type of range [0, 255].
image = image_uint8.astype('float32') / 255. # [0, 255] --> [0, 1]
or
image = image_uint8.astype('float32') / 127.5 - 127.5 # [0, 255] --> [-1, 1]
The code of the reverse conversion is as follows.
image_uint8 = (image * 255).astype('uint8') # [0, 1] ---> [0, 255]
or
image_uint8 = ((image + 1) * 127.5).astype('uint8') # [-1, 1] --> [0, 255]
# Get the file paths at once
import os
import glob
DATA_DIR = './data/fadg0/video/head'
import re
def atoi(text):
return int(text) if text.isdigit() else text
def natural_keys(text):
return [ atoi(c) for c in re.split(r'(\d+)', text)]
# Use glob.glob to load the files in name order. The key argument was specified to support numbers in filenames.
DATA_PATHS = sorted(glob.glob(os.path.join(DATA_DIR, '*')), key=natural_keys)
print(len(DATA_PATHS))
print(DATA_PATHS[0])
346 ./data/fadg0/video/head/001
# sample code 7-3
# Loading image files and converting them to Numpy arrays.
import numpy as np
import tensorflow as tf
image_uint8 = np.array(tf.keras.preprocessing.image.load_img(DATA_PATHS[0]))
image = image_uint8.astype('float32') / 255.0
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1,1,figsize=(6,6))
ax.imshow(image)
ax.axis('off')
plt.show()
When displaying or saving a color image data, convert it to the Numpy array of 'float32' element with range[0.0, 1.0]. Use the following code to convert a Numpy array of 'float32' element type of the range [-1.0, 1.0] to the range [0.0, 1.0].
To convert between a Numpy array with 'float32' element type of range [0.0, 1.0] and that of range [-1.0, 1.0].
image = image * 2 - 1 # [0, 1] ---> [-1, 1]
image = (image + 1) / 2 # [-1, 1] --> [0, 1]
Use the numpy.clip() function to guarantee the range of element values.
Parameters:
a: array
a_min: Change value less than a_min to a_min
a_max: Change value greater than a_max to a_max
Returns:
clipped_array: array whose element value range is [a_min, a_max]
image = np.clip(image, 0, 1) # clipping element values between 0 and 1.
# sample code 7-4
# [0, 1] --> [-1, 1]
imageMP = image * 2 - 1
# [-1, 1] --> [0, 1]
image2 = np.clip((imageMP + 1) * 0.5, 0.0, 1.0)
save_img() function to save image data in Numpy array to a file.The function of load_img() / img_to_array() / array_to_img() / save_img() is in either
The format of the image file to be saved can be specified by the file_format parameter, but if omitted, it is determined from the extension of the file name.
image_pil = load_img(path)
Image data might be treaded as a Numpy array of the element type 'float32' and value range [0.0, 1.0] or [-1.0, 1.0]. If the range of element is [-1.0, 1.0], it is necessary to be converted to the range of [0.0, 1.0] when displaying or saving it.
To save it, use the save_img() function of 'tensorflow.keras'.
image data to the file path.# sample code 7-5
import tensorflow as tf
save_path = 'data/new_image.jpg'
tf.keras.preprocessing.image.save_img(save_path, image)
! {LS} {save_path}
-rw-r--r-- 1 root root 14875 Mar 28 13:43 data/new_image.jpg