# Storage

Whenever you execute an experiment, machinable generates a unique 6-character experiment ID (e.g. 9eW1PC) and creates a directory of the same name in the specified storage location. All data generated by the experiment is written to this directory, including the configuration used, system metrics, status information and results. For example, the directory may look like this:

~/results
├── 9eW1PC
│   ├── U6RTBBqSwK25/
│   │   ├── component.json
│   │   ├── components.json
│   │   ├── host.json
│   │   ├── log.txt
│   │   ├── state.json
│   │   └── data/
│   ├── ... 
│   ├── host.json
│   └── execution.json
└── ...

While you can read and navigate the folder manually, machinable provides interfaces for efficient data retrieval. An advantage of working with the storage abstraction is that you do not need to know how the data is laid out on disk or how it is read back.
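
For comparison, reading the folder by hand could look roughly like the following sketch. It uses only the Python standard library and relies on the file names from the layout above. The helper name `read_experiment_directory` is hypothetical and not part of machinable, and the structure of the returned dictionary is an assumption made for illustration.

```python
import json
from pathlib import Path


def read_experiment_directory(path):
    """Manually collect some JSON files of one experiment directory.

    Assumes the layout shown above: `execution.json` and `host.json` at the
    top level and one sub-directory per component.
    """
    root = Path(path).expanduser()

    def load(file):
        # Return the parsed JSON, or None if the file does not exist (yet)
        return json.loads(file.read_text()) if file.is_file() else None

    components = {}
    for child in sorted(root.iterdir()):
        if child.is_dir():
            components[child.name] = {
                "component": load(child / "component.json"),
                "state": load(child / "state.json"),
            }

    return {
        "experiment_id": root.name,
        "execution": load(root / "execution.json"),
        "host": load(root / "host.json"),
        "components": components,
    }
```

Even this small helper has to hard-code file names and handle missing files, which is the kind of bookkeeping the storage interface is meant to take over.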

TIP

The storage interface is read-only, meaning it will never modify or remove data generated during the execution.

# Retrieving experiment data

To load an experiment from a storage location, use the get_experiment function.

from machinable.storage import get_experiment
experiment = get_experiment("~/results/9eW1PC")

The returned ExperimentStorage provides simplified access to the experiment's data.

experiment.experiment_id
>>> 9eW1PC
experiment.started_at
>>> DateTime(2020, 9, 13, 22, 9, 55, 470235, tzinfo=Timezone('+01:00'))
experiment.is_finished()
>>> True

The interface caches the data so that repeated access does not reload it from disk. If an experiment is still running, machinable automatically reloads information that may change.
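
The caching behaviour can be pictured with a small sketch. This is not machinable's implementation: the class `CachedJsonFile` and its `is_finished` callback are hypothetical names, and the sketch only illustrates the general idea of re-reading a file while a run is active and keeping the result once the run has finished.

```python
import json
from pathlib import Path


class CachedJsonFile:
    """Illustrative cache: re-read while the run is active, keep once finished."""

    def __init__(self, path, is_finished):
        self._path = Path(path)
        self._is_finished = is_finished  # callable, returns True once the run is done
        self._cache = None
        self._final = False

    def get(self):
        if self._final:
            # The run has finished, so the file can no longer change
            return self._cache
        # Check the status before reading so that the final read
        # is guaranteed to happen after the run has completed
        finished = self._is_finished()
        self._cache = json.loads(self._path.read_text())
        self._final = finished
        return self._cache
```

Checking the status before reading the file avoids freezing a value that was read just before the run wrote its last update.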

To access the experiment's components, use experiment.components. Note that it returns a collection of component objects rather than a single object.

experiment.components
>>> Collection (1) <Storage: Component <d4tSlSA744Di>>

The collection interface wraps a list of objects and provides a wealth of manipulation operations. For example, we could select the first component that has already finished executing:

experiment.components.filter(lambda x: x.is_finished()).first()

TIP

If pandas is available, you can turn the collection into a dataframe using the as_dataframe() method.

The collection reference documentation provides a comprehensive overview of all available options.
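
To make the filter-then-first pattern concrete without a real experiment on disk, here is a minimal stand-in. `MiniCollection` and `FakeComponent` are hypothetical classes written for this illustration only. They mimic just the two operations used above and say nothing about how machinable's collection is implemented.

```python
class MiniCollection:
    """Hypothetical stand-in for a list wrapper with chainable operations."""

    def __init__(self, items=()):
        self._items = list(items)

    def filter(self, predicate):
        # Return a new collection so that calls can be chained
        return MiniCollection(item for item in self._items if predicate(item))

    def first(self, default=None):
        # Return the first item, or `default` if the collection is empty
        return self._items[0] if self._items else default

    def __len__(self):
        return len(self._items)


class FakeComponent:
    """Hypothetical component that only knows whether it has finished."""

    def __init__(self, name, finished):
        self.name = name
        self._finished = finished

    def is_finished(self):
        return self._finished


components = MiniCollection(
    [FakeComponent("a", False), FakeComponent("b", True), FakeComponent("c", True)]
)
finished = components.filter(lambda x: x.is_finished())
first_finished = finished.first()
```

Because `filter` returns another collection rather than a plain list, further operations such as `first` can be chained directly onto the result.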

# Searching a directory

You can recursively retrieve all experiments within a directory using the find_experiments function, which returns a collection of the experiments found.

from machinable.storage import find_experiments
find_experiments("~/results")
>>> Collection (1) <Storage: Experiment <9eW1PC>>
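
Conceptually, a recursive search of this kind can be sketched with the standard library alone. The sketch below assumes that an experiment directory can be recognised by a top-level `execution.json`, as in the layout shown earlier. Whether machinable uses this criterion is not stated here, and the function name `find_experiment_directories` is hypothetical.

```python
from pathlib import Path


def find_experiment_directories(root, marker="execution.json"):
    """Return all directories at or below `root` that contain the marker file.

    Assumption: an experiment directory is identified by its
    top-level `execution.json`.
    """
    base = Path(root).expanduser()
    # rglob walks the tree recursively; the parent of each match
    # is treated as one experiment directory
    return sorted(path.parent for path in base.rglob(marker))
```

Unlike find_experiments, this returns plain paths rather than a collection of experiment objects, so each result would still have to be loaded separately.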

# Managing experiments

The storage interfaces discussed so far are fairly minimal when it comes to organising your experiments. In particular, they require you to keep track of storage locations yourself. To organise and query many experiments more effectively, you can use Indexes, which provide database-like features and are covered in the next section.