MNIST Image Classification Using LeNet (TensorFlow Backend)¶
[Notebook] [TF Implementation] [Torch Implementation]
In this example, we will demonstrate how to train an MNIST image classification model using the LeNet architecture with the TensorFlow backend.
Import the required libraries¶
import tensorflow as tf
import fastestimator as fe
import numpy as np
import matplotlib.pyplot as plt
import tempfile
from fastestimator.util import BatchDisplay, GridDisplay
# training parameters
epochs = 2
batch_size = 32
train_steps_per_epoch = None  # None: run through the full training set each epoch
eval_steps_per_epoch = None   # None: run through the full evaluation set each epoch
save_dir = tempfile.mkdtemp()  # temporary directory for the saved model
from fastestimator.dataset.data import mnist
train_data, eval_data = mnist.load_data()
test_data = eval_data.split(0.5)
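Note that split removes the requested fraction from the dataset it is called on and returns it as a new dataset, so after this call eval_data and test_data each hold half of the original evaluation images. A quick sanity check of the resulting sizes (a minimal sketch, assuming the standard 60,000/10,000 MNIST split):
# after the split, eval_data keeps one half and test_data gets the other
print(len(train_data), len(eval_data), len(test_data))  # expected: 60000 5000 5000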
Set up a preprocessing pipeline¶
In this example, the preprocessing steps include adding a channel dimension to the images (since they are grayscale) and normalizing the pixel values to the range [0, 1]. We set up these steps using Ops. The Pipeline also takes our data sources and batch size as inputs.
from fastestimator.op.numpyop.univariate import ExpandDims, Minmax
pipeline = fe.Pipeline(train_data=train_data,
                       eval_data=eval_data,
                       test_data=test_data,
                       batch_size=batch_size,
                       ops=[ExpandDims(inputs="x", outputs="x_out"),
                            Minmax(inputs="x_out", outputs="x_out")])
Validate Pipeline¶
To make sure the pipeline works as expected, we can visualize its output. Pipeline.get_results returns a batch of pipeline output for this purpose:
data = pipeline.get_results()
data_xin = data["x"]
data_xout = data["x_out"]
print("the pipeline input data size: {}".format(data_xin.numpy().shape))
print("the pipeline output data size: {}".format(data_xout.numpy().shape))
print("the maximum pixel value of output image: {}".format(np.max(data_xout.numpy())))
print("the minimum pixel value of output image: {}".format(np.min(data_xout.numpy())))
the pipeline input data size: (32, 28, 28)
the pipeline output data size: (32, 28, 28, 1)
the maximum pixel value of output image: 1.0
the minimum pixel value of output image: 0.0
num_samples = 5
indices = np.random.choice(batch_size, size=num_samples, replace=False)
inputs = tf.gather(data_xin.numpy(), indices)
outputs = tf.gather(data_xout.numpy(), indices)
fig = GridDisplay([BatchDisplay(image=inputs, title='Pipeline Input'),
                   BatchDisplay(image=outputs, title='Pipeline Output')
                   ])
fig.show()
Step 2 - Network construction¶
FastEstimator supports both PyTorch and TensorFlow, so this section could use either backend. We will demonstrate only the TensorFlow backend in this example.
Model construction¶
Here we are going to import one of FastEstimator's pre-defined model architectures, which was written in TensorFlow. We create a model instance by compiling our model definition function along with a specific model optimizer.
from fastestimator.architecture.tensorflow import LeNet
model = fe.build(model_fn=LeNet, optimizer_fn="adam")
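For comparison, only the architecture import would change under the PyTorch backend (a minimal sketch; FastEstimator ships an equivalent LeNet in fastestimator.architecture.pytorch, and fe.build stays the same; the TorchLeNet alias is just for illustration):
# PyTorch-backend equivalent (sketch): swap the architecture import
from fastestimator.architecture.pytorch import LeNet as TorchLeNet
torch_model = fe.build(model_fn=TorchLeNet, optimizer_fn="adam")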
Network definition¶
We are going to connect the model and Ops together into a Network. Ops are the basic building blocks of a Network: they can implement loss computation, model update rules, or even be models themselves.
from fastestimator.op.tensorop.loss import CrossEntropy
from fastestimator.op.tensorop.model import ModelOp, UpdateOp
network = fe.Network(ops=[
    ModelOp(model=model, inputs="x_out", outputs="y_pred"),
    CrossEntropy(inputs=("y_pred", "y"), outputs="ce"),
    UpdateOp(model=model, loss_name="ce")
])
Step 3 - Estimator definition and training¶
In this step, we define an Estimator to connect our Network with our Pipeline, and set the Traces, which compute accuracy (Accuracy), save the best model (BestModelSaver), and adjust the model learning rate over time (LRScheduler).
from fastestimator.schedule import cosine_decay
from fastestimator.trace.adapt import LRScheduler
from fastestimator.trace.io import BestModelSaver
from fastestimator.trace.metric import Accuracy
traces = [
    Accuracy(true_key="y", pred_key="y_pred"),
    BestModelSaver(model=model, save_dir=save_dir, metric="accuracy", save_best_mode="max"),
    LRScheduler(model=model, lr_fn=lambda step: cosine_decay(step, cycle_length=3750, init_lr=1e-3))
]
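The choice cycle_length=3750 matches the full training run: MNIST has 60,000 training images, so with batch_size = 32 each epoch takes 60000 / 32 = 1875 steps, and 2 epochs give 1875 × 2 = 3750 steps. The cosine schedule therefore decays the learning rate from init_lr=1e-3 down to its minimum exactly once over the whole training run (the log below confirms epoch boundaries at steps 1875 and 3750).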
estimator = fe.Estimator(pipeline=pipeline,
                         network=network,
                         epochs=epochs,
                         traces=traces,
                         train_steps_per_epoch=train_steps_per_epoch,
                         eval_steps_per_epoch=eval_steps_per_epoch)
estimator.fit() # start the training process
[FastEstimator ASCII banner]
FastEstimator-Warn: the key 'x' is being pruned since it is unused outside of the Pipeline. To prevent this, you can declare the key as an input of a Trace or TensorOp.
FastEstimator-Start: step: 1; logging_interval: 100; num_device: 1;
FastEstimator-Train: step: 1; ce: 2.3435714; model_lr: 0.0009999998;
FastEstimator-Train: step: 100; ce: 0.4765919; model_lr: 0.0009982482; steps/sec: 321.43;
[... intermediate training-step logs elided ...]
FastEstimator-Train: step: 1875; epoch: 1; epoch_time: 8.27 sec;
FastEstimator-BestModelSaver: Saved model to /tmp/tmpx2c0kg_c/model_best_accuracy.h5
FastEstimator-Eval: step: 1875; epoch: 1; accuracy: 0.9882; ce: 0.035068836; max_accuracy: 0.9882; since_best_accuracy: 0;
[... intermediate training-step logs elided ...]
FastEstimator-Train: step: 3750; epoch: 2; epoch_time: 7.16 sec;
FastEstimator-BestModelSaver: Saved model to /tmp/tmpx2c0kg_c/model_best_accuracy.h5
FastEstimator-Eval: step: 3750; epoch: 2; accuracy: 0.9908; ce: 0.027614402; max_accuracy: 0.9908; since_best_accuracy: 0;
FastEstimator-Finish: step: 3750; model_lr: 1e-06; total_time: 21.62 sec;
Model testing¶
Estimator.test triggers model testing using the test dataset that we specified in the Pipeline, letting us evaluate the model's accuracy on previously unseen data.
estimator.test()
FastEstimator-Test: step: 3750; epoch: 2; accuracy: 0.991; ce: 0.02567655;
Inferencing¶
Now let's run inference on several images directly, using the model we just trained. We randomly select 5 images from the test dataset and process them one at a time by leveraging Pipeline.transform and Network.transform:
num_samples = 5
indices = np.random.choice(len(test_data), size=num_samples, replace=False)  # sample from the whole test set
inputs = []
outputs = []
predictions = []
for idx in indices:
    inputs.append(test_data["x"][idx])
    data = {"x": inputs[-1]}
    # run the pipeline
    data = pipeline.transform(data, mode="infer")
    outputs.append(data["x_out"].squeeze(axis=(0, 3)))
    # run the network
    data = network.transform(data, mode="infer")
    predictions.append(np.argmax(data["y_pred"].numpy().squeeze(axis=0)))
fig = GridDisplay([BatchDisplay(image=np.stack(inputs), title="Pipeline Input"),
                   BatchDisplay(image=np.stack(outputs), title="Pipeline Output"),
                   BatchDisplay(text=np.stack(predictions), title="Predictions")
                   ])
fig.show()
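Since BestModelSaver wrote the best weights into save_dir during training, we can also restore them later for standalone inference. A minimal sketch, assuming the model_best_accuracy.h5 file name shown in the training log above:
import os
# rebuild the architecture and restore the best weights via fe.build's weights_path argument
best_weights = os.path.join(save_dir, "model_best_accuracy.h5")
restored_model = fe.build(model_fn=LeNet, optimizer_fn="adam", weights_path=best_weights)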