Preliminary Setup
Let's get some imports and object construction out of the way:
import tempfile
import os
import numpy as np
import fastestimator as fe
from fastestimator.architecture.tensorflow import LeNet
from fastestimator.dataset.data import mnist
from fastestimator.op.numpyop.univariate import ExpandDims, Minmax
from fastestimator.op.tensorop.loss import CrossEntropy
from fastestimator.op.tensorop.model import ModelOp, UpdateOp
from fastestimator.schedule import cosine_decay
from fastestimator.trace.adapt import LRScheduler
from fastestimator.trace.io import BestModelSaver
from fastestimator.trace.metric import Accuracy
root_output_dir = tempfile.mkdtemp()
def get_estimator(extra_traces):
    # step 1
    train_data, eval_data = mnist.load_data()
    test_data = eval_data.split(100)
    test_data['id'] = [i for i in range(len(test_data))]  # Assign some data ids for the test report to look at
    pipeline = fe.Pipeline(train_data=train_data,
                           eval_data=eval_data,
                           test_data=test_data,
                           batch_size=32,
                           ops=[ExpandDims(inputs="x", outputs="x"), Minmax(inputs="x", outputs="x")])
    # step 2
    model = fe.build(model_fn=LeNet, optimizer_fn="adam")
    network = fe.Network(ops=[
        ModelOp(model=model, inputs="x", outputs="y_pred"),
        CrossEntropy(inputs=("y_pred", "y"), outputs="ce"),
        UpdateOp(model=model, loss_name="ce")
    ])
    # step 3
    traces = [
        Accuracy(true_key="y", pred_key="y_pred"),
        BestModelSaver(model=model, save_dir=root_output_dir, metric="accuracy", save_best_mode="max"),
        LRScheduler(model=model, lr_fn=lambda step: cosine_decay(step, cycle_length=3750, init_lr=1e-3))
    ]
    traces.extend(extra_traces)
    estimator = fe.Estimator(pipeline=pipeline,
                             network=network,
                             epochs=2,
                             traces=traces,
                             train_steps_per_epoch=100,
                             eval_steps_per_epoch=100,
                             log_steps=10)
    return estimator
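We will call this helper several times below, passing in whichever report traces we want to try out. As a quick illustration of the call pattern (a minimal sketch, not part of the tutorial flow), a baseline run with no extra traces would look like:
# Minimal usage sketch (illustrative only): build the estimator with no extra traces
# and run a short training job. The string passed to fit() names the experiment summary.
baseline_estimator = get_estimator(extra_traces=[])
baseline_summary = baseline_estimator.fit("baseline")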
Overview and Dependencies
FastEstimator provides Traces that allow you to automatically generate traceability documents and test reports. These reports are written as LaTeX files and then automatically compiled into PDF documents if you have LaTeX installed on your machine. If you don't have LaTeX installed on your training machine, you can still generate the report files and then move them to a different computer to compile them manually. Generating traceability documents also requires Graphviz which, unlike LaTeX, must be installed in order for training to proceed.
Installing Dependencies:
On Linux:
apt-get install -y graphviz texlive-latex-base texlive-latex-extra
On SageMaker:
unset PYTHONPATH
export DEBIAN_FRONTEND=noninteractive
apt-get install -y graphviz texlive-latex-base texlive-latex-extra
On Mac:
brew install graphviz
brew install --cask mactex
On Windows:
winget install graphviz
winget install TeXLive
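If you are unsure whether these tools are available on your machine, a quick check like the sketch below (using only the Python standard library; 'dot' and 'pdflatex' are the executables that the Graphviz and LaTeX packages above normally provide) can save a failed run:
import shutil
# 'dot' is the Graphviz layout engine and 'pdflatex' is a standard LaTeX compiler.
# Graphviz is needed for training with Traceability to proceed at all, while a missing
# LaTeX install only prevents the final PDF compilation step.
for tool in ("dot", "pdflatex"):
    print(f"{tool}: {'found' if shutil.which(tool) else 'NOT found'}")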
Traceability
Traceability reports are designed to capture all the information about the state of your system when an experiment was run. The report will include training graphs, operator architecture diagrams, model architecture diagrams, a summary of your system configuration, and the values of all variables used to instantiate objects during training. It will also automatically save a copy of your log output to disk, which can be especially useful for comparing different experiment configurations without worrying about forgetting what settings were used for each run. To generate this report, simply add a Traceability trace to your list of traces:
from fastestimator.trace.io import Traceability
save_dir = os.path.join(root_output_dir, 'report')
est = get_estimator([Traceability(save_dir)])
print(f"The root save directory is: {root_output_dir}")
print(f"The traceability report will be written to: {save_dir}")
print(f"Logs and images from the report will be written to: {os.path.join(save_dir, 'resources')}")
The root save directory is: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na
The traceability report will be written to: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report
Logs and images from the report will be written to: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report/resources
When using Traceability, you must pass a summary name to the Estimator.fit() call. This will become the name of your report.
est.fit("Sample MNIST Report")
FastEstimator-Start: step: 1; logging_interval: 10; num_device: 0;
FastEstimator-Train: step: 1; ce: 2.29492; model_lr: 0.0009999998;
FastEstimator-Train: step: 10; ce: 2.2441; model_lr: 0.0009999825; steps/sec: 70.13;
FastEstimator-Train: step: 20; ce: 1.6545596; model_lr: 0.0009999298; steps/sec: 65.46;
FastEstimator-Train: step: 30; ce: 0.7863882; model_lr: 0.0009998423; steps/sec: 68.01;
FastEstimator-Train: step: 40; ce: 0.8883647; model_lr: 0.0009997196; steps/sec: 67.27;
FastEstimator-Train: step: 50; ce: 0.57198644; model_lr: 0.0009995619; steps/sec: 64.75;
FastEstimator-Train: step: 60; ce: 0.96408165; model_lr: 0.0009993691; steps/sec: 59.96;
FastEstimator-Train: step: 70; ce: 0.6204411; model_lr: 0.0009991414; steps/sec: 65.0;
FastEstimator-Train: step: 80; ce: 0.46853542; model_lr: 0.0009988786; steps/sec: 62.4;
FastEstimator-Train: step: 90; ce: 0.6022613; model_lr: 0.0009985808; steps/sec: 63.77;
FastEstimator-Train: step: 100; ce: 0.31527373; model_lr: 0.0009982482; steps/sec: 67.53;
FastEstimator-Train: step: 100; epoch: 1; epoch_time: 2.97 sec;
FastEstimator-BestModelSaver: Saved model to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/model_best_accuracy.h5
FastEstimator-Eval: step: 100; epoch: 1; accuracy: 0.88; ce: 0.4036578; max_accuracy: 0.88; since_best_accuracy: 0;
FastEstimator-Train: step: 110; ce: 0.2963678; model_lr: 0.0009978806; steps/sec: 16.23;
FastEstimator-Train: step: 120; ce: 0.7164498; model_lr: 0.000997478; steps/sec: 64.31;
FastEstimator-Train: step: 130; ce: 0.45244628; model_lr: 0.0009970407; steps/sec: 63.75;
FastEstimator-Train: step: 140; ce: 0.12593144; model_lr: 0.0009965684; steps/sec: 65.46;
FastEstimator-Train: step: 150; ce: 0.24340093; model_lr: 0.0009960613; steps/sec: 64.74;
FastEstimator-Train: step: 160; ce: 0.19809574; model_lr: 0.0009955195; steps/sec: 64.78;
FastEstimator-Train: step: 170; ce: 0.2975997; model_lr: 0.0009949428; steps/sec: 61.15;
FastEstimator-Train: step: 180; ce: 0.18596129; model_lr: 0.0009943316; steps/sec: 57.29;
FastEstimator-Train: step: 190; ce: 0.16816229; model_lr: 0.0009936856; steps/sec: 51.65;
FastEstimator-Train: step: 200; ce: 0.1781869; model_lr: 0.000993005; steps/sec: 60.26;
FastEstimator-Train: step: 200; epoch: 2; epoch_time: 2.09 sec;
FastEstimator-BestModelSaver: Saved model to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/model_best_accuracy.h5
FastEstimator-Eval: step: 200; epoch: 2; accuracy: 0.936875; ce: 0.20415045; max_accuracy: 0.936875; since_best_accuracy: 0;
FastEstimator-Finish: step: 200; model_lr: 0.000993005; total_time: 6.86 sec;
FastEstimator-Traceability: Report written to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report/sample_mnist_report.pdf
<fastestimator.summary.summary.Summary at 0x17a55be80>
If everything went according to plan, then inside your root save directory you should now have the following files:
/report
sample_mnist_report.pdf
sample_mnist_report.tex
/resources
sample_mnist_report_logs.png
sample_mnist_report_model.pdf
sample_mnist_report.txt
You could then switch up your experiment parameters and call .fit() with a new experiment name in order to write more reports into the same folder. A call to fastestimator logs ./resources would then allow you to easily compare these experiments, as described in Advanced Tutorial 6.
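For instance (an illustrative sketch; the second experiment name here is arbitrary), a second run into the same report directory would leave two log files under resources for the logs CLI to compare:
# Illustrative sketch: train a second configuration so that two experiments' logs end up
# in the same 'resources' folder, then compare them from the command line with
# `fastestimator logs <path to the resources folder>`.
est2 = get_estimator([Traceability(save_dir)])
est2.fit("Sample MNIST Report 2")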
Our report should look something like this (use Chrome or Firefox to view):
from IPython.display import IFrame
IFrame('../resources/t10a_traceability.pdf', width=600, height=800)
Test Report
Test Reports can provide an automatically generated overview summary of how well your model is performing. This could be useful if, for example, you needed to submit documentation to a regulatory agency. Test Reports can also be used to highlight particular failure cases so that you can investigate problematic data points in more detail.
The TestReport trace takes a list of TestCase objects as input. These are further subdivided into two types: aggregate and per-instance. Aggregate test cases run at the end of the test epoch and deal with aggregated information (typically metrics such as accuracy). Per-instance tests run at the end of every step during testing, and are meant to evaluate every element within a batch independently. If your data dictionary happens to contain data instance ids, you can also use these to find problematic inputs.
from fastestimator.trace.io.test_report import TestCase, TestReport
save_dir = os.path.join(root_output_dir, 'report2')
# Note that the name of the input to the 'criteria' function must match a key in the data dictionary
agg_test_easy = TestCase(description='Accuracy should be greater than 1%', criteria=lambda accuracy: accuracy > 0.01)
agg_test_hard = TestCase(description='Accuracy should be greater than 99%', criteria=lambda accuracy: accuracy > 0.99)
inst_test_hard = TestCase(description='All Data should be correctly classified',
                          criteria=lambda y, y_pred: np.equal(y, np.argmax(y_pred, axis=-1)),
                          aggregate=False,
                          fail_threshold=0)
inst_test_easy = TestCase(description='At least one image should be correctly classified',
                          criteria=lambda y, y_pred: np.equal(y, np.argmax(y_pred, axis=-1)),
                          aggregate=False,
                          fail_threshold=len(est.pipeline.data['test']) - 1)
report = TestReport(test_cases=[agg_test_easy, agg_test_hard, inst_test_easy, inst_test_hard], save_path=save_dir, data_id='id')
est = get_estimator([report])
print(f"The root save directory is: {root_output_dir}")
print(f"The test report will be written to: {save_dir}")
print(f"A json summary of the report will be written to: {os.path.join(save_dir, 'resources')}")
The root save directory is: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na
The test report will be written to: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report2
A json summary of the report will be written to: /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report2/resources
est.fit("MNIST")
est.test()
FastEstimator-Start: step: 1; logging_interval: 10; num_device: 0;
FastEstimator-Train: step: 1; ce: 2.3143482; model1_lr: 0.0009999998;
FastEstimator-Train: step: 10; ce: 2.1496544; model1_lr: 0.0009999825; steps/sec: 52.26;
FastEstimator-Train: step: 20; ce: 1.8073351; model1_lr: 0.0009999298; steps/sec: 53.76;
FastEstimator-Train: step: 30; ce: 1.1469091; model1_lr: 0.0009998423; steps/sec: 54.22;
FastEstimator-Train: step: 40; ce: 0.58918655; model1_lr: 0.0009997196; steps/sec: 51.02;
FastEstimator-Train: step: 50; ce: 0.51719546; model1_lr: 0.0009995619; steps/sec: 50.71;
FastEstimator-Train: step: 60; ce: 1.4550462; model1_lr: 0.0009993691; steps/sec: 53.08;
FastEstimator-Train: step: 70; ce: 0.30743462; model1_lr: 0.0009991414; steps/sec: 54.5;
FastEstimator-Train: step: 80; ce: 0.42680097; model1_lr: 0.0009988786; steps/sec: 49.95;
FastEstimator-Train: step: 90; ce: 0.40482238; model1_lr: 0.0009985808; steps/sec: 53.35;
FastEstimator-Train: step: 100; ce: 0.48199874; model1_lr: 0.0009982482; steps/sec: 51.36;
FastEstimator-Train: step: 100; epoch: 1; epoch_time: 2.67 sec;
FastEstimator-BestModelSaver: Saved model to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/model1_best_accuracy.h5
FastEstimator-Eval: step: 100; epoch: 1; accuracy: 0.8759375; ce: 0.39020747; max_accuracy: 0.8759375; since_best_accuracy: 0;
FastEstimator-Train: step: 110; ce: 0.42094475; model1_lr: 0.0009978806; steps/sec: 13.26;
FastEstimator-Train: step: 120; ce: 0.40665025; model1_lr: 0.000997478; steps/sec: 53.11;
FastEstimator-Train: step: 130; ce: 0.3574287; model1_lr: 0.0009970407; steps/sec: 51.02;
FastEstimator-Train: step: 140; ce: 0.14667201; model1_lr: 0.0009965684; steps/sec: 54.13;
FastEstimator-Train: step: 150; ce: 0.41308922; model1_lr: 0.0009960613; steps/sec: 54.12;
FastEstimator-Train: step: 160; ce: 0.12534323; model1_lr: 0.0009955195; steps/sec: 53.03;
FastEstimator-Train: step: 170; ce: 0.18102367; model1_lr: 0.0009949428; steps/sec: 51.0;
FastEstimator-Train: step: 180; ce: 0.13506973; model1_lr: 0.0009943316; steps/sec: 52.05;
FastEstimator-Train: step: 190; ce: 0.10690997; model1_lr: 0.0009936856; steps/sec: 51.8;
FastEstimator-Train: step: 200; ce: 0.14696348; model1_lr: 0.000993005; steps/sec: 53.67;
FastEstimator-Train: step: 200; epoch: 2; epoch_time: 2.51 sec;
FastEstimator-BestModelSaver: Saved model to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/model1_best_accuracy.h5
FastEstimator-Eval: step: 200; epoch: 2; accuracy: 0.9275; ce: 0.22968279; max_accuracy: 0.9275; since_best_accuracy: 0;
FastEstimator-Finish: step: 200; model1_lr: 0.000993005; total_time: 7.65 sec;
FastEstimator-Test: step: 200; epoch: 2; accuracy: 0.97; ce: 0.14020246;
FastEstimator-TestReport: Report written to /var/folders/3r/h9kh47050gv6rbt_pgf8cl540000gn/T/tmptrgna3na/report2/mnist_TestReport.pdf
<fastestimator.summary.summary.Summary at 0x17ba5ddc0>
If everything went according to plan, then inside your root save directory you should now have the following files:
/report2
mnist_TestReport.pdf
mnist_TestReport.tex
/resources
mnist_TestReport.json
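The JSON file holds the same pass/fail results in machine-readable form. A minimal sketch for inspecting it (the exact schema is not covered in this tutorial, so this simply pretty-prints whatever the file contains):
import json
# Load the machine-readable test summary written alongside the PDF. The file name
# follows the '<experiment name>_TestReport.json' pattern shown in the listing above.
with open(os.path.join(save_dir, 'resources', 'mnist_TestReport.json')) as f:
    test_summary = json.load(f)
print(json.dumps(test_summary, indent=2))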
Our report should look something like this (use Chrome or Firefox to view):
from IPython.display import IFrame
IFrame('../resources/t10a_test.pdf', width=600, height=800)