Components are the core interface for implementing functionality in a machinable project. Technically, they are simply classes that inherit from the base class `machinable.Component` and are defined in the Python module that is specified in the `machinable.yaml`. For instance, to implement a component that encapsulates some optimization problem, we could create the following source file:
```python
# optimization.py
from machinable import Component


class DummyOptimization(Component):
    def on_create(self):
        print(
            "Creating the optimization model with the following configuration: ",
            self.config,
        )

    def on_execute(self):
        for i in range(3):
            print("Training step", i)
```
Note that it does not matter how you name the class, as long as it inherits from the component base class and is registered in the `machinable.yaml`, for instance:

```yaml
components:
  - optimization:
      learning_rate: 0.1
```
The component base provides a variety of interfaces that bootstrap the implementation of the component and are described below.
# Life cycle
Components expose a number of life cycle events that can be overridden to hook into the execution cycle at a certain point. All event methods start with `on_` and are documented in the event reference. In the example above, the `on_create` and `on_execute` events are implemented and will thus be triggered during execution.
The component life cycle allows you to implement your functionality using any framework and standard Python methods without worrying about the execution logic (i.e. configuration parsing, parallel execution, etc.). Moreover, the event paradigm provides clear semantics, while the object orientation enables flexible code sharing mechanisms (e.g. inheritance, mixins, etc.).
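To make the event paradigm concrete, here is a minimal sketch of how a base class can own the execution order while subclasses only fill in `on_` hooks. `MiniComponent`, its `dispatch` and `run` methods, and the `destroy` event are hypothetical stand-ins chosen for illustration; they are not machinable's implementation.

```python
class MiniComponent:
    """Hypothetical stand-in for a component base class (not machinable's API)."""

    def __init__(self, config):
        self.config = config
        self.events = []  # names of the events that were dispatched

    def dispatch(self, event):
        # Remember the event, then call the matching on_<event> hook.
        self.events.append(event)
        getattr(self, "on_" + event)()

    def run(self):
        # The base class decides the order; subclasses only implement hooks.
        for event in ("create", "execute", "destroy"):
            self.dispatch(event)

    # Default hooks do nothing, so subclasses override only what they need.
    def on_create(self):
        pass

    def on_execute(self):
        pass

    def on_destroy(self):
        pass


class DummyOptimization(MiniComponent):
    def on_create(self):
        self.steps = []

    def on_execute(self):
        for i in range(3):
            self.steps.append(i)


component = DummyOptimization({"learning_rate": 0.1})
component.run()
print(component.events)  # ['create', 'execute', 'destroy']
print(component.steps)   # [0, 1, 2]
```

Because the hooks are ordinary methods, sharing behaviour between components through inheritance or mixins requires no extra machinery.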
Components can consume their configuration via the `self.config` object:
```python
from machinable import Component


class MyComponent(Component):
    def on_create(self):
        print(self.config.config_value)
        print(self.config.nested.value)
        print(self.config['nested']['value'])
```

```
>>> 1
2
2
```
For convenience, the dict interface can be accessed using the
. object notation and provides a few helper methods like pretty-printing
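The combination of dict access and `.` object notation can be pictured with a small dict subclass. This is only a sketch: `DotDict` is a hypothetical name, and the example values are assumed to match the output shown above; machinable's own config object may work differently.

```python
class DotDict(dict):
    """Hypothetical dict subclass with attribute access (illustration only)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dictionaries so that chained access like a.b.c works.
        return DotDict(value) if isinstance(value, dict) else value


config = DotDict({"config_value": 1, "nested": {"value": 2}})
print(config.config_value)        # 1
print(config.nested.value)        # 2
print(config["nested"]["value"])  # 2
```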
Flags are configuration values that are associated with a particular execution, for example random seeds or worker IDs. They are accessible via the `self.flags` object, which supports the `.` object notation. You can add your own flags through basic assignment, e.g. `self.flags.counter = 1`. To avoid name collisions, all native machinable flags use UPPERCASE (e.g.
The observer interface `self.observer` stores the data and results of the component. Note that you don't have to specify where the observer data is being stored. You can specify the storage location before the execution, and machinable will manage unique directories automatically. The data can later be retrieved using the Observations interface.
`self.log` provides a standard logger interface that outputs to the console and a log file.

```python
self.log.info('Component created')
self.log.debug('Component initialized')
```
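For readers unfamiliar with this kind of logger, the following sketch builds a comparable one with Python's standard `logging` module: one handler for the console and one for a file. The logger name, the temporary file location, and the handler setup are assumptions made for this example; the source does not describe how machinable configures `self.log` internally.

```python
import logging
import os
import tempfile

# Assumed location; machinable manages the real log file location itself.
log_path = os.path.join(tempfile.mkdtemp(), "component.log")

log = logging.getLogger("component_sketch")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the example independent of the root logger

log.addHandler(logging.StreamHandler())  # console output (stderr by default)
file_handler = logging.FileHandler(log_path)
log.addHandler(file_handler)  # log file output

log.info("Component created")
log.debug("Component initialized")

# Read the file back to confirm that both messages were written.
file_handler.flush()
with open(log_path) as f:
    logged = f.read().splitlines()
```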
`self.record` provides an interface for tabular logging, that is, storing recurring data points at each iteration. The results become available as a table in which each row represents one iteration.

```python
for iteration in range(10):
    self.record['iteration'] = iteration
    loss, acc = ...  # placeholder: compute the metrics for this iteration

    # write column values
    self.record['accuracy'] = acc
    self.record['loss'] = loss

    # save at the end of the iteration to start a new row
    self.record.save()
```
If you use the `on_execute_iteration` event, iteration information is recorded and `record.save()` is triggered automatically at the end of each iteration.
Sometimes it is useful to have multiple tabular loggers, for example to record training and validation performance separately. You can create custom record loggers using `self.observer.get_record_writer(scope)`, which returns a new instance of a record writer that you can use just like the main record writer.
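The row semantics of tabular logging can be illustrated with a few lines of plain Python. `MiniRecord` is a hypothetical stand-in written for this example, not machinable's record writer; it only shows how assigning columns and calling `save()` produce one row per iteration, and how separate writers keep separate tables.

```python
class MiniRecord:
    """Hypothetical tabular logger (illustration only)."""

    def __init__(self):
        self.rows = []      # completed rows, one per save() call
        self._current = {}  # the row that is currently being filled in

    def __setitem__(self, column, value):
        self._current[column] = value

    def save(self):
        # Close the current row and start a fresh one.
        self.rows.append(self._current)
        self._current = {}


# Two independent writers, e.g. one per scope.
train = MiniRecord()
validation = MiniRecord()

for iteration in range(3):
    train["iteration"] = iteration
    train["loss"] = 1.0 / (iteration + 1)
    train.save()

validation["accuracy"] = 0.5
validation.save()

print(len(train.rows))  # 3
print(train.rows[0])    # {'iteration': 0, 'loss': 1.0}
print(validation.rows)  # [{'accuracy': 0.5}]
```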
You can use `self.observer.store()` to store any other Python object, for example:

```python
self.observer.store('final_accuracy', [0.85, 0.92])
```
Note that to protect against unintended data loss, overwriting will fail unless the `overwrite` argument is explicitly set.
For larger data structures, it can be more suitable to store data in specific file formats by appending a file extension, e.g.:

```python
self.observer.store('data.txt', 'a string')
self.observer.store('data.p', generic_object)
self.observer.store('data.json', jsonable_object)
self.observer.store('data.npy', numpy_array)
```
Refer to the observer reference for more details.
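The two behaviours described above, choosing a format from the file extension and refusing to overwrite by default, can be sketched with the standard library. The `store` function below is a hypothetical helper written for this example; its signature, the exception type, and the pickle fallback are assumptions and do not describe machinable's observer. The `.npy` case is left out to avoid a NumPy dependency.

```python
import json
import os
import pickle
import tempfile


def store(directory, name, data, overwrite=False):
    """Hypothetical helper: pick a serialisation format from the file extension."""
    path = os.path.join(directory, name)
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(f"{name} exists; pass overwrite=True to replace it")
    extension = os.path.splitext(name)[1]
    if extension == ".json":
        with open(path, "w") as f:
            json.dump(data, f)
    elif extension == ".txt":
        with open(path, "w") as f:
            f.write(str(data))
    else:
        # Fall back to pickle for everything else (e.g. '.p' or no extension).
        with open(path, "wb") as f:
            pickle.dump(data, f)
    return path


directory = tempfile.mkdtemp()
txt_path = store(directory, "data.txt", "a string")
json_path = store(directory, "data.json", {"final_accuracy": [0.85, 0.92]})

try:
    store(directory, "data.txt", "another string")
    refused = False
except FileExistsError:
    refused = True  # the second write is rejected

# An explicit overwrite succeeds.
store(directory, "data.txt", "another string", overwrite=True)
```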
# Config methods
While config references allow you to make static references, configuration values can be more complex. They might, for example, evolve during the course of execution or obey non-trivial conditions. Config methods allow you to implement such complex configuration values. To define a config method, just add a regular Python method to the component class. The method name must start with `config_`. You can then 'call' the method directly in the `machinable.yaml` configuration, for example:

```yaml
components:
  - my_components.base_component:
      learning_rate: learning_rate(base=0.01)
```
Here, the learning rate parameter is defined as a config method that takes a base learning rate parameter. The config method `config_learning_rate` needs to be defined in the corresponding component:

```python
from machinable import Component


class MyBaseModel(Component):
    def on_create(self):
        self.epoch = None

    def on_execute(self):
        for epoch in range(5):
            self.epoch = epoch
            print(epoch, self.config.learning_rate)

    def config_learning_rate(self, base=0.1):
        if not self.epoch:
            return base  # default learning rate
        return base * self.epoch
```
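The method is executed whenever `self.config.learning_rate` is being accessed. Since `not self.epoch` is also true for epoch 0, the first iteration uses the base rate; as a result, the execution output prints:

```
0 0.01
1 0.01
2 0.02
3 0.03
4 0.04
```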
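One way to picture how a string such as `learning_rate(base=0.01)` could be mapped onto a `config_` method is the following sketch. The `resolve` function, the `Model` class, and the use of the `ast` module are assumptions made for illustration; the source does not describe how machinable parses these expressions.

```python
import ast


def resolve(component, value):
    """Hypothetical resolver: 'name(key=literal)' -> component.config_name(key=literal)."""
    if not isinstance(value, str):
        return value
    try:
        node = ast.parse(value, mode="eval").body
    except SyntaxError:
        return value  # not a Python-like expression, keep the plain string
    if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
        return value
    method = getattr(component, "config_" + node.func.id, None)
    if method is None:
        return value
    # Only literal arguments are accepted, so no arbitrary code is evaluated.
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return method(*args, **kwargs)


class Model:
    epoch = None

    def config_learning_rate(self, base=0.1):
        if not self.epoch:
            return base
        return base * self.epoch


model = Model()
raw_config = {"learning_rate": "learning_rate(base=0.01)", "name": "demo"}

before_training = resolve(model, raw_config["learning_rate"])
model.epoch = 2
print(before_training)                               # 0.01
print(resolve(model, raw_config["learning_rate"]))   # 0.02
print(resolve(model, raw_config["name"]))            # demo
```

Resolving the value on every access, rather than once at start-up, is what lets the result follow the component's current state.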
Config methods hence allow for the expression of arbitrary configuration dependencies and are a powerful tool for implementing complex configuration patterns more efficiently. They can also be useful for parsing configuration values into Python objects. For instance, you might define a config method
dtype: dtype('f32') returns
# Child components
In many cases, it can be useful to organise components in a hierarchical way. For example, your component may implement a certain prediction problem and you want to encapsulate different prediction strategies in sub-components.
machinable allows you to use components as child components, meaning they become available to a parent node component. Consider the following components:
```python
# child_component_example.py
from machinable import Component


class PredictionStrategy(Component):
    def on_create(self):
        self.model = ...  # set up some model

    def predict(self, data):
        return self.model.predict(data)
```

```python
# node_component_example.py
from machinable import Component


class PredictionBenchmark(Component):
    def on_create(self, prediction_strategy):
        self.prediction_strategy = prediction_strategy

        # load data
        self.data = ...

        print(self.prediction_strategy.predict(self.data))
```
Here, the child component encapsulates the model while the node component implements the benchmark control flow. The child component becomes available as an argument to the `on_create` event of the node component.
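The hand-over of a child to its node can be imitated in plain Python to show the control flow. Everything here is a hypothetical stand-in: `MiniComponent`, the `create` helper, and the toy `predict` logic are assumptions for this sketch, not machinable's implementation.

```python
class MiniComponent:
    """Hypothetical stand-in base class (illustration only, not machinable's API)."""

    def __init__(self, node=None):
        self.node = node  # parent node if this component is used as a child

    def on_create(self, *children):
        pass


class PredictionStrategy(MiniComponent):
    def on_create(self):
        self.offset = 1  # stands in for a real model

    def predict(self, data):
        return [x + self.offset for x in data]


class PredictionBenchmark(MiniComponent):
    def on_create(self, prediction_strategy):
        self.prediction_strategy = prediction_strategy
        self.data = [1, 2, 3]
        self.predictions = self.prediction_strategy.predict(self.data)


def create(node_class, *child_classes):
    # Create the node, set up its children, then pass the children to the node.
    node = node_class()
    children = [child_class(node=node) for child_class in child_classes]
    for child in children:
        child.on_create()
    node.on_create(*children)
    return node


benchmark = create(PredictionBenchmark, PredictionStrategy)
print(benchmark.predictions)  # [2, 3, 4]
```

Swapping in a different strategy class changes the predictions without touching the benchmark's control flow, which is the point of the separation.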
In general, the child component can access the parent node via
self.node while the node component can access its child components via
To designate components as children, use the `children` argument of `Task.component()`, which is discussed in the following section.