Whenever you execute an experiment, machinable generates a unique 6-character experiment ID (e.g. `9eW1PC`) and creates a new directory of the same name in the specified storage location. All data generated by the experiment is written to this directory, including the configuration used, system metrics, status information and results. For example, it may look like this:
```
~/results
├── 9eW1PC
│   ├── U6RTBBqSwK25/
│   │   ├── component.json
│   │   ├── components.json
│   │   ├── host.json
│   │   ├── log.txt
│   │   ├── state.json
│   │   └── store.json
│   ├── ...
│   ├── host.json
│   └── execution.json
└── ...
```
While it is possible to read and navigate the folder manually, machinable provides an interface for efficient data retrieval that allows you to query the store like a database.
The storage interface is read-only, meaning it will never modify or remove data generated during the execution.
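For comparison, manual navigation of the folder could look like the following minimal sketch. It relies only on the Python standard library and assumes the directory layout shown above; the helper names `list_experiments` and `read_component_files` are hypothetical and not part of machinable, and the contents of the JSON files are not specified here.

```python
import json
from pathlib import Path


def list_experiments(storage_root):
    """Return the names of all experiment directories under storage_root."""
    root = Path(storage_root).expanduser()
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def read_component_files(storage_root, experiment_id):
    """Load every JSON file in each component directory of one experiment.

    Returns a dict mapping component directory name -> {file stem: parsed JSON}.
    Non-JSON files such as log.txt are skipped.
    """
    experiment_dir = Path(storage_root).expanduser() / experiment_id
    components = {}
    for component_dir in sorted(p for p in experiment_dir.iterdir() if p.is_dir()):
        components[component_dir.name] = {
            f.stem: json.loads(f.read_text())
            for f in sorted(component_dir.glob("*.json"))
        }
    return components
```

Like the storage interface, this sketch only reads from the directory. Every caller still has to know the file layout, which is the overhead the storage interface is meant to remove.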
# Retrieving data
To load the experimental data from a storage location, instantiate the `Storage` interface and add the storage location.
```python
from machinable import Storage

storage = Storage()
storage.add('~/results')
# or shorter:
storage = Storage('~/results')
```
machinable indexes all directories of the storage and caches the data, so repeated access is fast and does not require reloading. If experiments are still running, machinable reloads their information automatically.
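To illustrate the general idea of cached access with automatic reloading, here is a minimal sketch of a reader that parses a JSON file once and reads it again only when the file changes on disk. This is not machinable's implementation; the class `CachedJsonReader` and its change detection based on modification time and file size are assumptions made purely for illustration.

```python
import json
from pathlib import Path


class CachedJsonReader:
    """Cache parsed JSON files and re-read a file only when it changes on disk."""

    def __init__(self):
        self._cache = {}  # path -> (change stamp, parsed data)
        self.disk_reads = 0  # counts how often a file was actually parsed

    def load(self, path):
        path = Path(path)
        stat = path.stat()
        # Assumption: modification time plus size is enough to detect changes.
        stamp = (stat.st_mtime_ns, stat.st_size)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # unchanged since last read: serve from cache
        data = json.loads(path.read_text())
        self.disk_reads += 1
        self._cache[path] = (stamp, data)
        return data
```

With such a scheme, data from finished experiments is served from memory, while files that a running experiment keeps updating are picked up again on the next access.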
To retrieve an experiment, use the `find()` method, which returns the experiment with the given unique ID.
```python
experiment = storage.find('3K45al')
>>> <Experiment (3K45al)>
```
The experiment object provides convenient access to the experiment's data.
To access the experiment's components, use `experiment.components`. Note that it returns a collection of component objects rather than a single object.
The collection interface forms a wrapper for working with the list of components and provides a wealth of manipulation operations. For example, we could select the components that have already finished executing:
```python
experiment.components.filter(lambda x: x.is_finished()).first()
```
The collection reference documentation provides a comprehensive overview of the available options.
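The chaining pattern used with `filter()` and `first()` can be sketched with a small stand-in. The `Collection` and `FakeComponent` classes below are hypothetical toys written for this example, not machinable's classes; they only mimic the two calls and the `is_finished()` check mentioned above so that the snippet runs without any stored experiments.

```python
class Collection:
    """Toy list wrapper with chainable helpers (not machinable's class)."""

    def __init__(self, items):
        self._items = list(items)

    def filter(self, predicate):
        # Return a new collection so that calls can be chained.
        return Collection(item for item in self._items if predicate(item))

    def first(self, default=None):
        # First item, or `default` when the collection is empty.
        return self._items[0] if self._items else default

    def __len__(self):
        return len(self._items)


class FakeComponent:
    """Hypothetical component record exposing only is_finished()."""

    def __init__(self, name, finished):
        self.name = name
        self._finished = finished

    def is_finished(self):
        return self._finished


components = Collection([
    FakeComponent("a", finished=False),
    FakeComponent("b", finished=True),
    FakeComponent("c", finished=True),
])

# Keep only finished components, then take the first one.
finished = components.filter(lambda x: x.is_finished())
first_finished = finished.first()
```

Because `filter()` returns another collection rather than a plain list, further operations can be appended to the chain before a single element is extracted.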
In practice, the interfaces allow you to quickly retrieve the data that is needed to analyse the results. One of the key advantages when working with the storage abstraction is that it removes the overhead of thinking about how the data is actually being stored and read from the disk. Compared with manual result management, it can significantly reduce the effort to organize, retrieve and analyse results.