Data augmentation

In this example, we will see how to use n2d2.provider.DataProvider and n2d2.transform.Transformation to load data and do some data augmentation.

The full Python script is available in data_augmentation.py.

Preliminary

For this tutorial, we will use n2d2 for data augmentation, and numpy and matplotlib for the visualization.

We will define a helper function, plot_tensor, to save the images generated from an n2d2.Tensor:

import n2d2
import matplotlib.pyplot as plt

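def plot_tensor(tensor, path):
    # Show the first channel of the first image in the batch in grayscale,
    # then save the figure to `path`.
    plt.imshow(tensor[0,0,:], cmap='gray', vmin=0, vmax=255)
    plt.savefig(path)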

Loading data

We will begin by creating an n2d2.database.MNIST driver to load the MNIST dataset. We will then create a provider to get the images, using a batch size of 1 so that each batch contains a single image.

database = n2d2.database.MNIST(data_path="/local/DATABASE/mnist", validation=0.1)
provider = n2d2.provider.DataProvider(database, [28, 28, 1], batch_size=1)

You can get the number of stimuli per partition with the method n2d2.database.Database.get_partition_summary(), which prints how the data is split across partitions.

database.get_partition_summary()

Output:

Number of stimuli : 70000
Learn         : 54000 stimuli (77.14%)
Test          : 10000 stimuli (14.29%)
Validation    : 6000 stimuli (8.57%)
Unpartitioned : 0 stimuli (0.0%)
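
The figures above are consistent with validation=0.1 being applied to the 60,000 images of the original MNIST training set, with the 10,000 test images left untouched. The sketch below reproduces the summary under that assumption. The 60,000/10,000 split is the standard MNIST layout, and the variable names are illustrative rather than part of n2d2.

```python
# Reproduce the partition summary, assuming `validation` is a fraction
# of the original MNIST training set (an assumption, not n2d2 code).
train_images = 60000   # standard MNIST training set size
test_images = 10000    # standard MNIST test set size
validation = 0.1       # same value as passed to n2d2.database.MNIST

total = train_images + test_images
n_validation = round(train_images * validation)
n_learn = train_images - n_validation

partitions = {
    "Learn": n_learn,
    "Test": test_images,
    "Validation": n_validation,
}

# Share of each partition relative to the whole dataset, in percent.
shares = {name: round(100 * count / total, 2)
          for name, count in partitions.items()}

for name, count in partitions.items():
    print(f"{name:<14}: {count} stimuli ({shares[name]}%)")
```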

To select which partition to read from, use the method n2d2.provider.DataProvider.set_partition().

There are several ways to read data from an n2d2.provider.DataProvider. You can call the methods n2d2.provider.DataProvider.read_batch() or n2d2.provider.DataProvider.read_random_batch().

Note

Since n2d2.provider.DataProvider is an iterable, you can also use the next() function or a for loop.

# for loop example
for data in provider:
    pass
# next example
data = next(provider)
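
Both forms rely on Python's iterator protocol. The toy class below is a hypothetical stand-in, not n2d2's implementation: it only shows how an object that defines __iter__ and __next__ can be consumed with either a for loop or next(). The name ToyProvider and its list-backed batches are assumptions made for illustration.

```python
class ToyProvider:
    """Hypothetical stand-in for a data provider; not n2d2 code."""

    def __init__(self, samples, batch_size=1):
        self.samples = list(samples)
        self.batch_size = batch_size
        self.index = 0

    def __iter__(self):
        # Restart from the first batch each time a loop begins.
        self.index = 0
        return self

    def __next__(self):
        # Signal the end of the data once every sample has been served.
        if self.index >= len(self.samples):
            raise StopIteration
        batch = self.samples[self.index:self.index + self.batch_size]
        self.index += self.batch_size
        return batch


# for loop: visits every batch once
toy_provider = ToyProvider(["img0", "img1", "img2"], batch_size=1)
batches = [batch for batch in toy_provider]

# next(): fetches a single batch on demand
toy_provider = ToyProvider(["img0", "img1", "img2"], batch_size=1)
first = next(toy_provider)
```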

For this tutorial, we will use n2d2.provider.DataProvider.read_batch().

The following code reads the first image and plots it:

image = provider.read_batch(idx=0).to_numpy() * 255
plot_tensor(image, "first_stimuli.png")
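
The multiplication by 255 and the indexing used in plot_tensor can be checked without the dataset. The sketch below uses a random NumPy array as a stand-in for the provider output, assuming the batch has shape (batch, channel, height, width) and pixel values in [0, 1]. Both assumptions are inferred from the code above rather than taken from documented guarantees.

```python
import numpy as np

# Stand-in for provider.read_batch(idx=0).to_numpy(): one 28x28
# single-channel image with values assumed to lie in [0, 1].
rng = np.random.default_rng(seed=0)
batch = rng.random((1, 1, 28, 28))

# Rescale to the 0-255 range expected by vmin=0, vmax=255.
image = batch * 255

# Same indexing as plot_tensor: first image, first channel.
plane = image[0, 0, :]
print(plane.shape)
```

If the real array had a different layout or value range, the scaling factor and the indices passed to imshow would need to be adjusted accordingly.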