# Overview

While components encapsulate functionality using life cycle events, it is up to the execution to invoke the components. The event paradigm of the components allows for the composition of arbitrary component execution schedules. To make this more concrete, consider the following simple example:

from machinable import Experiment, execute

execute(Experiment().component("optimization").repeat(3))

The execution definition can be read as: "Import the component 'optimization' and repeat its execution in three independent trials." Note that the experiment object does not trigger the execution itself. It merely describes the execution plan, which is then triggered by the execute method.

Crucially, machinable can take care of the intricacies of the execution based on this high-level description, i.e. importing and constructing the components and triggering their event life cycle. It can also keep track of the configuration used, generate seeds for controlled randomness and prepare a unique storage path to keep results. Since the execution details are abstracted away, it does not matter whether you run on a local computer or a distributed remote cluster.
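To illustrate the separation between describing an execution and triggering it, here is a minimal, self-contained sketch. It does not use machinable: `Plan`, `Optimization`, `REGISTRY`, `run` and the event names `on_create`, `on_execute` and `on_destroy` are hypothetical stand-ins chosen for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Toy stand-in for an experiment description: data only, no side effects."""

    component: str
    repeats: int = 1

    def repeat(self, n: int) -> "Plan":
        # Return a new description; nothing is executed here.
        return Plan(self.component, self.repeats * n)


class Optimization:
    """Toy component that records the life cycle events it receives."""

    def __init__(self):
        self.events = []

    def on_create(self):
        self.events.append("create")

    def on_execute(self):
        self.events.append("execute")

    def on_destroy(self):
        self.events.append("destroy")


# Hypothetical lookup from component name to class.
REGISTRY = {"optimization": Optimization}


def run(plan: Plan):
    """Toy executor: constructs a fresh component per trial and fires its events."""
    trials = []
    for _ in range(plan.repeats):
        component = REGISTRY[plan.component]()
        component.on_create()
        component.on_execute()
        component.on_destroy()
        trials.append(component.events)
    return trials


plan = Plan("optimization").repeat(3)  # only a description so far
results = run(plan)                    # now the three trials actually run
print(len(results), results[0])
```

Because the plan is plain data, the same description could in principle be handed to a different executor without changing the component code.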

# Defining executions

All aspects of the execution can be controlled through the arguments of the execute() method.

from machinable import execute
execute(
    experiment,  # which components with what configuration
    storage,     # where to store results etc.
    engine,      # execution target, e.g. remote execution, multiprocessing etc.
    index,       # database that can be used to search for executions later
    project,     # the machinable project to use
    seed         # random seed
)

For even finer-grained control, you can instantiate the Execution object directly using the same arguments. Notably, execute() is an alias for Execution().summary().submit().

For every execution, machinable will generate a unique 6-character experiment ID (e.g. OY1p1o) that is printed at the beginning of the execution output. The ID encodes the global random seed and is used as a relative directory for any data generated by the experiment.
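The text does not specify how the ID encodes the seed. As a purely hypothetical illustration of how a 6-character alphanumeric ID can carry an integer, the sketch below uses a base-62 encoding. The alphabet, the function names `encode_seed` and `decode_id`, and the scheme itself are assumptions and should not be taken as machinable's actual implementation.

```python
import string

# Assumed alphabet: digits, lowercase, then uppercase letters (62 symbols).
ALPHABET = string.digits + string.ascii_letters
ID_LENGTH = 6


def encode_seed(seed: int) -> str:
    """Encode a non-negative integer as a fixed-width base-62 string."""
    if not 0 <= seed < len(ALPHABET) ** ID_LENGTH:
        raise ValueError("seed out of range for a 6-character ID")
    chars = []
    for _ in range(ID_LENGTH):
        seed, remainder = divmod(seed, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_id(experiment_id: str) -> int:
    """Recover the integer from a base-62 string produced by encode_seed."""
    seed = 0
    for char in experiment_id:
        seed = seed * len(ALPHABET) + ALPHABET.index(char)
    return seed


seed = decode_id("OY1p1o")
print(encode_seed(seed))
```

Any reversible encoding would serve the same purpose: given only the ID, the seed can be recovered and the run repeated.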

TIP

You can specify a system-wide default for storage, engine and index.

# Experiment

The experiment is the only required argument of the execution and specifies what components are executed. In the simplest case, it can be the name of a single component that will be executed using its default configuration. We will discuss the experiment specification in detail in the following section.

# Storage

By default, results are stored in non-persistent system memory, which is useful during development. To keep your results, pass a filesystem URL to the storage parameter.

execute(..., storage='~/results')         # local file system
execute(..., storage='s3://bucket')       # s3 store

# Engines

While experiments are executed locally and sequentially by default, machinable provides different Engines for parallel and remote execution. For example, to execute components in parallel processes, you can specify the number of processes:

execute(..., engine='native:5')

To learn more about available engines and their options, refer to the Engine section.

# Seeding and reproducibility

machinable chooses and sets a global random seed automatically at execution time. You can also set the seed yourself by passing a number or an experiment ID as the seed parameter:

execute(Experiment().component('controlled_randomness'), seed=42)

To re-use the seed of a given experiment and reproduce the execution results, you can pass the experiment ID as the seed:

execute(Experiment().component('controlled_randomness'), seed='OY1p1o')

If you need more control over randomness and how packages are seeded, you can override the on_seeding event in your component class.
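The idea of overriding a seeding hook can be sketched without machinable as follows. `ToyComponent` is a hypothetical stand-in base class, not machinable's component class, and the way the seed is passed in (`self.seed`) is an assumption. Consult the component reference for the real on_seeding signature.

```python
import random


class ToyComponent:
    """Stand-in base class for this sketch; NOT machinable's component class."""

    def __init__(self, seed: int):
        self.seed = seed

    def on_seeding(self):
        # Default behaviour in this sketch: seed Python's global generator.
        random.seed(self.seed)

    def on_execute(self):
        return random.random()

    def run(self):
        # The executor fires the seeding event before the component runs.
        self.on_seeding()
        return self.on_execute()


class ControlledRandomness(ToyComponent):
    def on_seeding(self):
        # Override: keep a private generator instead of touching global state.
        self.rng = random.Random(self.seed)

    def on_execute(self):
        return self.rng.random()


first = ControlledRandomness(42).run()
second = ControlledRandomness(42).run()
print(first == second)
```

Using a private generator keeps the component's randomness independent of any other code that draws from the global generator.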

# Code backups

machinable automatically backs up the code base at execution time in a zip file that can be used to reproduce the results. Note that the project directory needs to be under Git version control so that machinable can determine which files are included in or excluded from the backup (based on the .gitignore file).
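A Git-aware backup of this kind could be approximated as shown below. This is a rough sketch under stated assumptions, not machinable's backup routine: the helper names `tracked_files` and `write_backup` are hypothetical, and `tracked_files` requires the `git` executable and a Git repository, so the demonstration at the end passes an explicit file list instead.

```python
import subprocess
import tempfile
import zipfile
from pathlib import Path


def tracked_files(project_dir):
    """List tracked and untracked-but-not-ignored files, honouring .gitignore."""
    output = subprocess.run(
        ["git", "ls-files", "--cached", "--others", "--exclude-standard"],
        cwd=project_dir,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in output.splitlines() if line]


def write_backup(project_dir, relative_paths, archive_path):
    """Write the given project files into a zip archive, keeping relative paths."""
    project_dir = Path(project_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for relative in relative_paths:
            source = project_dir / relative
            if source.is_file():  # skip entries that no longer exist on disk
                archive.write(source, arcname=relative)
    return archive_path


# Demonstration with an explicit file list (no Git repository needed).
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    project.mkdir()
    (project / "main.py").write_text("print('hello')\n")
    archive = write_backup(project, ["main.py", "missing.py"], Path(tmp) / "code.zip")
    with zipfile.ZipFile(archive) as backup:
        names = backup.namelist()
print(names)
```

In a real project, the file list would come from `tracked_files(project_dir)` so that ignored files such as caches or result directories stay out of the archive.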

# Import arguments and CLI

It is often helpful to move frequently used execution arguments into modules, for example:

# ./experiments/baseline.py
from machinable import Experiment
baseline_experiment = Experiment().component('example')

# ./engines/remote.py
from machinable.engine import Remote
my_remote_execution_engine = Remote(host="ssh://remote", directory="~/project")

# ./main.py
from machinable import execute
from experiments.baseline import baseline_experiment
from engines.remote import my_remote_execution_engine
execute(baseline_experiment, engine=my_remote_execution_engine)

You can simplify such imports by passing the module path prefixed with @/ as an execution argument, for instance:

from machinable import execute
execute("@/experiments/baseline", engine="@/engines/remote")

Note that you do not need to specify the actual variable name (e.g. baseline_experiment) since machinable will search the module for instances automatically.

# ./experiments/baseline.py
from machinable import Experiment
# no need to assign the experiment to a variable
Experiment().component('example')

If the module contains more than one instance, only the last one will be returned.
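One way such a lookup could work, even when the instance is never assigned to a variable, is for the class to record every instance at construction time. The sketch below demonstrates that idea with hypothetical names (`TrackedExperiment`, `load_last_instance`). It is an assumption about a possible mechanism, not a description of machinable's internals.

```python
import runpy
import tempfile
from pathlib import Path


class TrackedExperiment:
    """Toy class that remembers every instance created, in creation order."""

    created = []

    def __init__(self, name):
        self.name = name
        TrackedExperiment.created.append(self)


def load_last_instance(module_path):
    """Run a module file and return the last instance it created, if any."""
    before = len(TrackedExperiment.created)
    # Make the toy class available inside the executed module.
    runpy.run_path(
        str(module_path), init_globals={"TrackedExperiment": TrackedExperiment}
    )
    new_instances = TrackedExperiment.created[before:]
    return new_instances[-1] if new_instances else None


# Demonstration: a module that creates two unassigned instances.
with tempfile.TemporaryDirectory() as tmp:
    module_file = Path(tmp) / "baseline.py"
    module_file.write_text(
        "TrackedExperiment('first')\nTrackedExperiment('second')\n"
    )
    found = load_last_instance(module_file)
print(found.name)
```

Here the second instance wins, which mirrors the rule that only the last instance in a module is returned.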

As a further simplification, a bare @ prefix (without the slash) instructs machinable to search in the following default modules, depending on the argument:

| Argument   | Module      |
|------------|-------------|
| experiment | experiments |
| storage    | storages    |
| engine     | engines     |
| index      | indexes     |

from machinable import execute
# @baseline -> @/experiments/baseline
# @remote -> @/engines/remote
execute("@baseline", engine="@remote")
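The expansion of the short @-notation into the full @/ module path follows directly from the argument-to-module mapping listed above and can be sketched in a few lines. The function name `expand_shorthand` and the dictionary are hypothetical helpers for illustration, not part of machinable's API.

```python
# Default module for each execution argument, as listed in the text.
DEFAULT_MODULES = {
    "experiment": "experiments",
    "storage": "storages",
    "engine": "engines",
    "index": "indexes",
}


def expand_shorthand(argument: str, value: str) -> str:
    """Expand '@name' to '@/<default module>/name'; leave other values untouched."""
    if not value.startswith("@") or value.startswith("@/"):
        return value
    return f"@/{DEFAULT_MODULES[argument]}/{value[1:]}"


print(expand_shorthand("experiment", "@baseline"))
print(expand_shorthand("engine", "@remote"))
```

Values that already use the explicit @/ form, or that are not module references at all, pass through unchanged.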

The @-notation is particularly useful when used in combination with the command line interface, as it allows you to specify complex arguments in a concise way.

$ machinable execute @baseline --engine @remote

With the basic execution concepts out of the way, the following sections will focus on the fundamental experiment and engine arguments in more detail.