Whenever you execute an experiment, machinable generates a unique 6-character experiment ID (e.g. 9eW1PC) and creates a new directory of the same name in the specified storage location. This directory holds all data generated by the experiment, including the configuration used, system metrics, status information and results. For example, it may look like this:

    ~/results
    ├── 9eW1PC
    │   ├── U6RTBBqSwK25/
    │   │   ├── component.json
    │   │   ├── components.json
    │   │   ├── host.json
    │   │   ├── log.txt
    │   │   ├── state.json
    │   │   └── data/
    │   ├── ...
    │   ├── host.json
    │   └── execution.json
    └── ...

While it is possible to read and navigate the folder manually, machinable provides interfaces for efficient data retrieval. One advantage when working with the storage abstraction is that it removes the overhead of thinking about how the data is actually being stored and read from the disk.
The storage interface is read-only, meaning it will never modify or remove data generated during the execution.
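
For comparison, here is a minimal sketch of what reading one experiment directory by hand might involve, using only the standard library. The helper name `read_experiment_dir` is hypothetical and not part of machinable; it assumes the layout shown above (an `execution.json` at the top level and one sub-directory per component containing a `component.json`) and makes no assumption about the keys inside those files.

```python
import json
from pathlib import Path


def read_experiment_dir(path):
    """Collect the JSON files of one experiment directory by hand.

    Hypothetical helper, not a machinable API. Assumes execution.json at
    the top level and one sub-directory per component that contains a
    component.json file.
    """
    root = Path(path).expanduser()
    execution = json.loads((root / "execution.json").read_text())

    components = {}
    for child in sorted(root.iterdir()):
        component_file = child / "component.json"
        # Skip plain files such as host.json and unrelated folders.
        if child.is_dir() and component_file.is_file():
            components[child.name] = json.loads(component_file.read_text())

    return {"execution": execution, "components": components}
```

Even this small helper has to hard-code file names and the directory layout, which is the kind of bookkeeping the storage interface is meant to hide.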
# Retrieving experiment data
To load an experiment from a storage location, you can use the get_experiment function:

    from machinable.storage import get_experiment

    experiment = get_experiment("~/results/9eW1PC")

The returned ExperimentStorage provides simplified access to the experiment's data.

    experiment.experiment_id
    >>> 9eW1PC
    experiment.started_at
    >>> DateTime(2020, 9, 13, 22, 9, 55, 470235, tzinfo=Timezone('+01:00'))
    experiment.is_finished()
    >>> True

The interface caches the data, so repeated access is fast and does not require reloading from disk. If experiments are still running, machinable will reload changing information automatically.
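
The caching behaviour described above can be pictured with a small sketch. This is not machinable's implementation: the class `CachedJsonFile` and its `is_final` callback are hypothetical names, and the only idea taken from the text is that data is served from a cache once it can no longer change and is re-read while it still can.

```python
import json
from pathlib import Path


class CachedJsonFile:
    """Hypothetical helper: cache a JSON file, reload only while it may change."""

    def __init__(self, path, is_final):
        self._path = Path(path)
        # is_final is a callable that receives the loaded data and returns
        # True once the data will no longer change (e.g. execution finished).
        self._is_final = is_final
        self._cache = None

    def read(self):
        # Serve the cached copy once the data is known to be final.
        if self._cache is not None and self._is_final(self._cache):
            return self._cache
        # Otherwise re-read from disk, because the file may still be updated.
        self._cache = json.loads(self._path.read_text())
        return self._cache
```

With a callback that inspects a (hypothetical) `finished` flag, `read()` hits the disk on every call while the flag is false and stops doing so as soon as it becomes true.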
To access the experiment's components, use experiment.components. Note that this returns a collection of component objects rather than a single object:

    experiment.components
    >>> Collection (1) <Storage: Component <d4tSlSA744Di>>

The collection interface wraps a list of items and provides a wealth of manipulation operations. For example, we could filter for the components that have already finished executing and take the first one:

    experiment.components.filter(lambda x: x.is_finished()).first()

If pandas is available, you can also turn the collection into a dataframe.
The collection reference documentation provides a comprehensive overview of all available options.
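
To make the idea of a list wrapper with chainable operations concrete, here is a minimal sketch. `SimpleCollection` is a hypothetical stand-in written for illustration, not machinable's collection class; it only mirrors the `filter(...)` and `first()` calls used in this section.

```python
class SimpleCollection:
    """Hypothetical stand-in for a collection: a thin wrapper around a list."""

    def __init__(self, items):
        self._items = list(items)

    def filter(self, predicate):
        # Return a new collection so that calls can be chained.
        return SimpleCollection(x for x in self._items if predicate(x))

    def first(self, default=None):
        # Return the first item, or the default if the collection is empty.
        return self._items[0] if self._items else default

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)
```

Because `filter` returns another collection instead of a plain list, a chain such as `SimpleCollection(items).filter(predicate).first()` works without intermediate variables.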
# Searching a directory
You can recursively retrieve all experiments within a directory using the find_experiments function, which returns a collection of the experiments found:

    from machinable.storage import find_experiments

    find_experiments("~/results")
    >>> Collection (1) <Storage: Experiment <9eW1PC>>

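
As a rough illustration of what a recursive search involves, the sketch below walks a directory tree with the standard library. The function `find_experiment_dirs` is hypothetical, and treating every folder that contains an `execution.json` as an experiment is an assumption based on the layout shown earlier, not a statement about how find_experiments works internally.

```python
from pathlib import Path


def find_experiment_dirs(root):
    """Return all directories below root that contain an execution.json.

    Hypothetical helper, not a machinable API. The execution.json marker
    is an assumption taken from the example directory layout.
    """
    root = Path(root).expanduser()
    # rglob searches root and all of its sub-directories recursively.
    return sorted(marker.parent for marker in root.rglob("execution.json"))
```

Each returned path could then be passed to get_experiment, which is roughly the bookkeeping that find_experiments takes care of for you.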
# Managing experiments
The storage interfaces discussed above are fairly minimal when it comes to organising your experiments. In particular, they require you to keep track of storage locations yourself. To organise and query many experiments more effectively, you can use indexes, which provide database-like features and are covered in the next section.