Spaces:
Sleeping
Active Learning Module
The physicsnemo.active_learning namespace is used for defining the "scaffolding"
that can be used to construct automated, end-to-end active learning workflows.
For areas of science that are difficult to source ground-truths to train on
(of which there are many), an active learning curriculum attempts to train a
model with improved data efficiency; better generalization performance but requiring
fewer training samples.
Generally, an active learning workflow can be decomposed into three "phases" that are - in the simplest case - run sequentially:
- Training/fine-tuning: A "learner" or surrogate model is initially trained on available data, and in subsequent active learning iterations, is fine-tuned with the new data appended on the original dataset.
- Querying: One or more strategies that encode some heuristics for what new data is most informative for the learner. Examples of this include uncertainty-based methods, which may screen a pool of unlabeled data for those the model is least confident with.
- Labeling: A method of obtaining ground truth (labels) for new data points, pipelined from the querying stage. This may entail running an expensive solver, or acquiring experimental data.
The three phases are repeated until the learner converges. Because "convergence" may not be easily defined, we define an additional phase which we call metrology: this represents a phase most similar to querying, but allows a user to define some set of criteria to monitor over the course of active learning beyond simple validation metrics to ensure the model can be used with confidence as surrogates (e.g. within a simulation loop).
How to use this module
With the context above in mind, inspecting the driver module will give you
a sense for how the end-to-end workflow functions; the Driver class acts
as an orchestrator for all the phases of active learning we described above.
From there, you should realize that Driver is written in a highly abstract
way: we need concrete strategies that implement querying, labeling, and metrology
concepts. The protocols module provides the scaffolding to do so - we implement
various components as typing.Protocol which are used for structural sub-typing:
they can be thought of as abstract classes that define an expected interface
in a function or class from which you can define your own classes by either
inheriting from them, or defining your own class that implements the expected
methods and attributes.
In order to perform the training portion of active learning, we provide a
minimal yet functional DefaultTrainingLoop inside the loop module. This
loop simply requires a protocols.TrainingProtocol to be passed, which is
a function that defines the logic for computing the loss per batch/training
step.
Configuring workflows
The config module defines some simple dataclasses that can be used
to configure the behavior of various parts of active learning, e.g. how
training is conducted, etc. Because Driver is designed to be checkpointable,
with the exception of a few parts such as datasets, everything should be
JSON-serializable.
Restarting workflows
For classes and functions that are created at runtime, checkpointing requires
that these components can be recreated when restarting from a checkpoint. To
that end, the _registry module provides a user-friendly way to instantiate
objects: user-defined strategy classes can be added to the registry to enable
their creation in checkpoint restarts.