Getting started
In this tutorial, you will learn how to define and execute experiments while trying out the key features of MLtraq. You can follow it sequentially or jump to specific sections as needed.
Installation
MLtraq requires Python 3.9+ and depends on SQLAlchemy 2.0+, Pandas 1.5.3+, and Joblib 1.3.2+, which are installed automatically as dependencies. To install the latest release with pip:
    pip install --upgrade mltraq
Examples
The code examples are fully self-contained, so you can run them as they are to reproduce the outputs. The first example shows the version of MLtraq used to compile this tutorial. Make sure you have the latest release installed.
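The version listing from the original page is not reproduced here. As a minimal sketch, the installed version can be checked with the standard library alone. The helper name installed_version is hypothetical, and the code assumes the package is distributed under the name "mltraq".

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name of MLtraq is "mltraq".
print(installed_version("mltraq"))
```

If this prints None, the package is not installed in the active environment.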
Key concepts
- Experimentation: The process of systematically varying the input values of an algorithm and observing their impact on its performance, behavior, or outcomes. Experiments can be defined and executed, with their results persisted for later analysis.
- Session: A session object lets you define the connection to a database, and load and add experiments. Sessions are bound to a database.
- Experiment: An experiment object manages a collection of run objects and implements the experimentation process. Experiments can be created, persisted, loaded, and executed. A run is an instantiation of the experiment with a configuration of input values. Executing an experiment requires executing all of its runs. Experiments are bound to a database and are unaware of sessions.
- Run: A run object is an instantiation of the experiment with a configuration of input values. The execution of a run is defined as the chained evaluation of step functions, whose sole parameter is the run object itself. Runs are unaware of databases, sessions, experiments, or other runs, and are isolated from the rest of the experiment.
- Step: Step functions are Python functions that take the run object as their sole input and change its internal state. There is no return value. Steps can access the configuration of the run through the attributes run.config and run.params, and can change the state of the run by modifying the attributes run.vars, run.state, and run.fields.
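To make the relationship between runs and steps concrete, here is a toy model in plain Python. It does not use the MLtraq API: the Run class and the make_runs and execute helpers are hypothetical stand-ins that only mirror the attribute names mentioned above (params, config, vars, state, fields). Each run is one combination of input values, and executing it means calling the step functions in order on the run object.

```python
from dataclasses import dataclass, field
from itertools import product
from types import SimpleNamespace


@dataclass
class Run:
    """Toy stand-in for a run (not the MLtraq class): inputs plus mutable state."""

    params: SimpleNamespace
    config: SimpleNamespace = field(default_factory=SimpleNamespace)
    vars: SimpleNamespace = field(default_factory=SimpleNamespace)
    state: SimpleNamespace = field(default_factory=SimpleNamespace)
    fields: SimpleNamespace = field(default_factory=SimpleNamespace)


def make_runs(**grid):
    """Create one run per combination of the given input values."""
    names = list(grid)
    return [
        Run(params=SimpleNamespace(**dict(zip(names, values))))
        for values in product(*grid.values())
    ]


def add(run):
    # A step takes the run as its sole input and returns nothing.
    run.vars.total = run.params.a + run.params.b


def square(run):
    # A later step can read what an earlier step stored on the same run.
    run.fields.result = run.vars.total ** 2


def execute(runs, *steps):
    """Evaluate the steps in order on every run; runs never see each other."""
    for run in runs:
        for step in steps:
            step(run)
    return runs


runs = execute(make_runs(a=[1, 2], b=[10, 20]), add, square)
results = [(r.params.a, r.params.b, r.fields.result) for r in runs]
print(results)
# → [(1, 10, 121), (1, 20, 441), (2, 10, 144), (2, 20, 484)]
```

Because each step only touches the run it receives, the runs in this sketch could be executed in any order, or in parallel, without affecting the results.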
Tip
An overview of the run state attributes can be found in State management, with a discussion of their semantics in the Model of computation.