ML-Fundamentals - Images Data Augmentation

Introduction
Requirements
- Knowledge
- Modules
Data
Exercises
Summary and Outlook
Literature
Licenses

Introduction

For image recognition tasks like image classification or image segmentation, having enough training data is one of the most important factors. And even if you think you have enough, more usually helps.

In this notebook you will practice common techniques to get the most out of your data set.

Requirements

Knowledge

You should have a basic knowledge of:

numpy

Suitable sources for acquiring this knowledge are:

numpy quickstart

Python Modules

By deep.TEACHING convention, all Python modules needed to run the notebook are loaded centrally at the beginning.

import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage

Data

Here we define a batch of images that we will use to test our methods. They contain only the features needed for the planned operations, so that excess detail does not distract from the effect. Just execute the cells.

n_images = 2
width = 8
height = 8
colors = 3
images = np.full((n_images,height,width,colors), 255)

def plot_images(imgs):
    for i in range(len(imgs)):
        ax = plt.subplot(1,len(imgs),i+1)
        ax.imshow(imgs[i])

w_tmp = width//2
h_tmp = height//2

images[0,:h_tmp,:w_tmp,0] = 50
images[0,:h_tmp,:w_tmp,1] = 50
images[0,h_tmp:,:w_tmp,1] = 0
images[0,h_tmp:,:w_tmp,2] = 0
    
images[1,:h_tmp,:w_tmp,0] = 50
images[1,:h_tmp,:w_tmp,2] = 50
images[1,:h_tmp,w_tmp:,1] = 150
images[1,:h_tmp,w_tmp:,2] = 0

images[images > 255] = 255
plot_images(images)

Exercises

Let us practice some common techniques here. Always keep in mind that not every technique can be applied to (or makes sense for) every kind of image, as briefly explained in the individual exercises.

Exercise - Mirroring

Mirroring, also called flipping, has the advantage that the aspect ratio of the image does not change: an image of 100x50 pixels, for example, stays 100x50. Real-world images like photographs of houses, animals or cars can almost always be mirrored about the vertical axis (left/right). Mirroring up and down is often not a good idea, since a photo with the sky at the bottom and a car driving along the top makes little sense. It can make sense for photos taken from the air that show only the ground, and for microscopy or astronomy images.
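To make the two mirroring directions concrete, here is a minimal sketch on a hypothetical toy batch (not the notebook's test images). It assumes the same axis layout as the Data section, (n_images, height, width, colors), and uses np.flip as one possible numpy method; other approaches such as reversed slicing would work as well.

```python
import numpy as np

# Hypothetical toy batch: 1 image, 2 rows, 3 columns, 1 channel.
batch = np.arange(6).reshape(1, 2, 3, 1)

# Axis 1 is the height axis: reversing it mirrors up/down.
up_down = np.flip(batch, axis=1)
# Axis 2 is the width axis: reversing it mirrors left/right.
left_right = np.flip(batch, axis=2)

print(batch[0, :, :, 0])       # rows [0 1 2] and [3 4 5]
print(up_down[0, :, :, 0])     # rows [3 4 5] and [0 1 2]
print(left_right[0, :, :, 0])  # rows [2 1 0] and [5 4 3]
```

Because the flip acts on a single axis of the whole array, every image in the batch is mirrored at once and the batch shape stays the same.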

Task:

Implement the functions to mirror the whole batch of images and return the result.

Hint:

Always use numpy methods. You should never need to implement a loop in this exercise.

Your plots should look like the following:

[Figure: expected output plot, not reproduced here]

def mirror_batch_up_down(imgs):
    raise NotImplementedError()
    
def mirror_batch_left_right(imgs):
    raise NotImplementedError()

plot_images(mirror_batch_up_down(images))

plot_images(mirror_batch_left_right(images))

Exercise - Cropping

Cropping is an excellent technique for multiplying the size of your data set by a large factor. Imagine you have images of size 256x256. Cropping a random 224x224 piece gives you $(256-224+1)^2 = 33^2 = 1089$ different crops per image, because the top-left corner of the crop can sit at any of 33 offsets (0 to 32) along each axis. But take care not to crop too much, or meaningful parts may vanish. For example, an image may be labeled with the class cat while the cat sits only in the 32x32 bottom-left corner, so a crop can easily cut it off.
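The following is a minimal sketch of a random crop, not a reference solution. The function name random_crop_sketch and the explicit rng argument are hypothetical, and it assumes that the same window is cut out of every image in a batch of shape (n_images, height, width, colors).

```python
import numpy as np

def random_crop_sketch(imgs, a, rng):
    """Cut the same random a x a window out of every image in the batch."""
    _, height, width, _ = imgs.shape
    # Valid top-left offsets run from 0 to (size - a), both inclusive;
    # rng.integers excludes the upper bound, hence the "+ 1".
    top = rng.integers(0, height - a + 1)
    left = rng.integers(0, width - a + 1)
    return imgs[:, top:top + a, left:left + a, :]

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
batch = np.zeros((2, 8, 8, 3), dtype=np.uint8)
cropped = random_crop_sketch(batch, 5, rng)
print(cropped.shape)  # (2, 5, 5, 3)
```

Drawing a separate offset per image is an equally valid design. It needs a little more indexing work but yields more variety within a batch.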

Task:

Implement the function to randomly crop an $a \times a$ piece from your batch, with $0 < a < 8$.

The region that is cropped is randomly picked. Your plots may look like the following:

[Figure: example output plot, not reproduced here]

def crop(imgs, a):
    raise NotImplementedError()

plot_images(crop(images, 5))

Exercise - Rotation

There are many ways to rotate images. The easiest, which does not create any artifacts, is rotation by 90, 180 or 270 degrees. Note that width and height swap when rotating by 90 or 270 degrees, so the aspect ratio of non-square images changes. Again, such heavy rotations often only make sense for microscopy or astronomy data. Rotation by smaller angles (like 5 degrees), on the other hand, can be used with real-world photos. There, however, you have to handle the artifacts that occur, and the rotated image no longer fits into the same array. A common practice is to rotate by 2 or 3 degrees and then crop the center of the image, so that it still fits into an array.
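As an illustration of the artifact-free case, the sketch below rotates a hypothetical toy batch by multiples of 90 degrees with np.rot90. It assumes the (n_images, height, width, colors) layout, so the rotation plane is given by axes 1 and 2.

```python
import numpy as np

# Hypothetical toy batch: 1 image of 2x2 pixels with 1 channel.
batch = np.arange(4).reshape(1, 2, 2, 1)

# k counts quarter turns; axes=(1, 2) rotates in the height/width plane only.
rotated = np.rot90(batch, k=1, axes=(1, 2))
print(batch[0, :, :, 0])    # rows [0 1] and [2 3]
print(rotated[0, :, :, 0])  # rows [1 3] and [0 2]

# For non-square images, height and width swap on odd k.
wide = np.zeros((2, 4, 6, 3))
print(np.rot90(wide, k=1, axes=(1, 2)).shape)  # (2, 6, 4, 3)
```

No pixel values are interpolated here, which is why these rotations leave no artifacts.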

Task:

1. Implement the function to rotate the whole batch of images by a multiple (param k) of 90 degrees.
2. Implement the function to rotate the batch by 45 degrees and crop as little as needed to fit the image into an array.

Hint:

For 2. you can use methods of the scipy.ndimage package.
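One function from scipy.ndimage that fits this hint is ndimage.rotate. The sketch below only shows how its reshape parameter affects the output size on a hypothetical float batch; it is not the cropping strategy the task asks for. The choices order=1 (linear interpolation) and mode="nearest" are assumptions made here to keep values inside the input range.

```python
import numpy as np
from scipy import ndimage

# Hypothetical batch: black 8x8 images with a white square in the middle.
batch = np.zeros((2, 8, 8, 3), dtype=float)
batch[:, 2:6, 2:6, :] = 255.0

# axes=(1, 2) selects the height/width plane. With reshape=False the output
# keeps the 8x8 frame, so whatever rotates outside of it is cut off.
rotated_same = ndimage.rotate(batch, 45, axes=(1, 2), reshape=False,
                              order=1, mode="nearest")

# With reshape=True (the default) the frame grows so that the whole
# rotated image fits; the new corner regions are filled with cval (0.0).
rotated_full = ndimage.rotate(batch, 45, axes=(1, 2), reshape=True, order=1)

print(rotated_same.shape)  # (2, 8, 8, 3)
print(rotated_full.shape)  # larger height and width than the input
```

Cropping the center of the enlarged result is one way to remove the filled corner regions afterwards.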

Your plots should look like the following:

[Figure: expected output plot, not reproduced here]

def rotate_batch_by_k_times_90_degree(imgs, k):
    raise NotImplementedError()

plot_images(rotate_batch_by_k_times_90_degree(images, 1))

def rotate_batch(imgs, degree=45):
    raise NotImplementedError()

plot_images(rotate_batch(images, 45))

Exercise - Noise

Adding noise is also a good technique to help your model generalize better.
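The notebook does not prescribe a noise model. As one possibility, the sketch below adds zero-mean Gaussian noise and clips the result back to the valid 0 to 255 range. The function name add_gaussian_noise_sketch, the sigma parameter and the explicit rng argument are hypothetical.

```python
import numpy as np

def add_gaussian_noise_sketch(imgs, sigma, rng):
    """Add zero-mean Gaussian noise and keep pixel values in [0, 255]."""
    noise = rng.normal(loc=0.0, scale=sigma, size=imgs.shape)
    # Work in float so that the sum can neither overflow nor wrap around.
    noisy = imgs.astype(float) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
batch = np.full((2, 8, 8, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise_sketch(batch, sigma=10.0, rng=rng)
print(noisy.shape, noisy.dtype)  # (2, 8, 8, 3) uint8
```

Clipping matters for pixels near 0 or 255: without it, converting back to an 8-bit type could wrap values around and produce strong speckles instead of small noise.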

Task:

Implement the function to apply small random noise to your images.

Your plots may look like the following:

[Figure: example output plot, not reproduced here]

def add_noise(imgs):
    raise NotImplementedError()

plot_images(add_noise(images))

Exercise - Combine Everything

Task:

Finally, implement the method get_augmented_data, which randomly combines all methods, each applied with a certain chance (except the rotation by 45 degrees).
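"With a certain chance" can be read as applying each transformation independently with some probability p. The sketch below shows that pattern with only two transformations. The names maybe_apply and augment_sketch and the default p=0.5 are hypothetical, and the selection of transformations is deliberately incomplete.

```python
import numpy as np

def maybe_apply(imgs, transform, p, rng):
    """Apply transform with probability p, otherwise return imgs unchanged."""
    return transform(imgs) if rng.random() < p else imgs

def augment_sketch(imgs, rng, p=0.5):
    """Chain a few batch transformations, each applied with probability p."""
    transforms = [
        # Mirror left/right along the width axis.
        lambda x: np.flip(x, axis=2),
        # Rotate by 90, 180 or 270 degrees in the height/width plane.
        lambda x: np.rot90(x, k=int(rng.integers(1, 4)), axes=(1, 2)),
    ]
    for transform in transforms:
        imgs = maybe_apply(imgs, transform, p, rng)
    return imgs

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
batch = np.arange(2 * 8 * 8 * 3).reshape(2, 8, 8, 3)
augmented = augment_sketch(batch, rng)
print(augmented.shape)  # (2, 8, 8, 3), since the images are square
```

Keeping the transformations in a list makes it easy to append cropping or noise later, or to give each one its own probability.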

Your plots should look like the following:

[Figure: expected output plot, not reproduced here]

def get_augmented_data(imgs):
    raise NotImplementedError()

plot_images(get_augmented_data(images))

Summary and Outlook

In this exercise you implemented techniques to modify a batch of images:

- mirroring
- cropping
- rotation
- adding noise

You learned in which scenarios each of the methods is useful and how to combine them to get the most out of your data set.

Licenses

Notebook License (CC-BY-SA 4.0)

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

Exercise: Images Data Augmentation
by Klaus Strohmenger
is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://gitlab.com/deep.TEACHING.

Code License (MIT)

The following license only applies to code cells of the notebook.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
