Experimenting with Sacred

The basics on experimenting with sacred: executing, logging and reproducing — October 2, 2017

When studying machine learning, you’ll often find yourself training multiple models, testing different features or tweaking parameters. While a spreadsheet might help you, it’s not perfect for keeping logs, results and code. Believe me, I tried. An alternative is Sacred. This package allows you to execute multiple experiments and record everything in an organized fashion.

I’ll go through the basics here, but it’s never a bad idea to go through the documentation.

The Inception-v3 architecture

Basics

It all begins with a script: an entry point to your experiment. Sacred requires you to wrap your script’s main function with a simple decorator. For example, suppose there’s a file named ex.py:

from sacred import Experiment

ex = Experiment('1-check-dataset')

@ex.config
def my_config():
  data_dir = '/datasets/imagenet/train/'
  classes = None
  batch_size = 32

@ex.automain
def main(data_dir, classes, batch_size):
  from matplotlib import pyplot
  from keras.applications.inception_v3 import InceptionV3, preprocess_input
  from keras.preprocessing.image import ImageDataGenerator

  g = ImageDataGenerator(preprocessing_function=preprocess_input)
  train = g.flow_from_directory(data_dir,
                                classes=classes,
                                batch_size=batch_size)

  model = InceptionV3(weights=None)
  model.compile(optimizer='adam', loss='categorical_crossentropy')
  model.fit_generator(train, epochs=1000, steps_per_epoch=100, ...)

This experiment can be easily executed with python ex.py -F ./logs/.

Sacred will take over and:

  1. execute the my_config function, capturing all compatible variables (i.e. str, int, float, dict, list) defined in it
  2. copy info on the execution and the config defined to ./logs/1/run.json and ./logs/1/config.json, and the code itself to ./logs/_sources/. The output buffer, in turn, is continuously written to ./logs/1/cout.txt
  3. call main with the parameters captured from my_config!

You can also pass new parameters on-the-fly by executing it with python ex.py with batch_size=256, but it’s always a good idea to confirm that everything is in order with python ex.py print_config.
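
The two can be combined as well: python ex.py print_config with batch_size=256 should print the updated configuration without actually running the experiment.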

Reproducing an experiment becomes easy too: python ex.py with ./logs/1/config.json

Other persistence back-ends are also supported (e.g. MongoDB).
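
Observers can also be attached in code rather than through the -F flag. A minimal sketch, assuming a MongoDB server is reachable (the URL and database name below are placeholders):

from sacred import Experiment
from sacred.observers import FileStorageObserver, MongoObserver

ex = Experiment('1-check-dataset')

# Equivalent to passing -F ./logs/ on the command line:
ex.observers.append(FileStorageObserver.create('./logs'))

# Additionally record every run in MongoDB; the url and db_name
# are placeholders for your own server and database.
ex.observers.append(MongoObserver.create(url='localhost:27017', db_name='sacred'))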

Hints

Importing inside the main script function.

Imports such as tensorflow take a while to complete, and sometimes you are just interested in checking the parameters with print_config. For this reason, I seldom import anything but sacred at module level. Just do it inside your experiment’s main function, like I did in the previous example.

Sacred and Progress Bars

When combining Sacred with scripts that train Keras models, I faced a problem of huge logs: hundreds of lines in cout.txt were just intermediate states of the progress bars. This can be easily solved with the following:

from sacred import Experiment, utils

ex = Experiment('a-nice-experiment')
ex.captured_out_filter = utils.apply_backspaces_and_linefeeds
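
As far as I can tell, this filter applies the backspaces and carriage returns that progress bars use to rewrite the same line, so only the final state of each line ends up in cout.txt.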