pypers.pipeline
- class pypers.pipeline.Configurator(pipeline: Pipeline)
Bases: object
Automatically configures hyperparameters of a pipeline.
- Parameters:
pipeline (Pipeline) – An instance of the Pipeline class.
- configure(base_cfg, input)
Configure the hyperparameters of the pipeline.
- class pypers.pipeline.Pipeline(configurator: Optional[Configurator] = None)
Bases: object
Defines a processing pipeline.
This class defines a processing pipeline that consists of multiple stages. Each stage performs a specific operation on the input data. The pipeline processes the input data by executing the process method of each stage successively.
Note that hyperparameters are not set automatically if the process_image() method is used directly. Hyperparameters are only set automatically if the configure() method or batch processing is used.
- Parameters:
configurator (Configurator, optional) – An instance of the Configurator class used to automatically configure hyperparameters of the pipeline. If not provided, a default Configurator instance will be created.
- configure(base_cfg, *args, **kwargs)
Automatically configures hyperparameters.
- property fields
- find(stage_id, not_found_dummy=inf)
Returns the position of the stage identified by stage_id.
Returns not_found_dummy if the stage is not found.
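The lookup behaviour described above can be sketched as follows. This is a stand-alone illustration, not the library's implementation: find_stage and the plain list of stage IDs are hypothetical stand-ins for the method and for however the pipeline stores its stages. A default of inf presumably lets a missing stage compare as later than every real position.

```python
import math

def find_stage(stage_ids, stage_id, not_found_dummy=math.inf):
    """Return the position of stage_id in stage_ids, or not_found_dummy.

    Hypothetical helper: stage_ids is a plain list of stage ID strings.
    """
    try:
        return stage_ids.index(stage_id)
    except ValueError:
        # The stage is not part of the pipeline.
        return not_found_dummy

stage_ids = ['denoise', 'segment', 'measure']
position = find_stage(stage_ids, 'segment')
missing = find_stage(stage_ids, 'unknown')
```

Because inf is greater than any integer position, a comparison such as position < missing holds without a separate "not found" check.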
- get_extra_stages(first_stage, last_stage, available_inputs)
- process(input, cfg, first_stage=None, last_stage=None, data=None, log_root_dir=None, out=None, **kwargs)
Processes the input.
The process() methods of the stages of the pipeline are executed successively.
- Parameters:
input – The input to be processed (can be None if and only if data is not None).
cfg – A Config object which represents the hyperparameters.
first_stage – The name of the first stage to be executed.
last_stage – The name of the last stage to be executed.
data – The results of a previous execution.
log_root_dir – Path to a directory where log files should be written to.
out – An instance of an Output sub-class, 'muted' if no output should be produced, or None if the default output should be used.
- Returns:
Tuple (data, cfg, timings), where data is the pipeline data object comprising all final and intermediate results, cfg are the finally used hyperparameters, and timings is a dictionary containing the execution time of each individual pipeline stage (in seconds).
The parameter data is used if and only if first_stage is not None. In this case, the outputs produced by the stages of the pipeline which are being skipped must be fed in using the data parameter obtained from a previous execution of this method.
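The interplay of first_stage, last_stage, and data can be illustrated with a small stand-alone sketch. It does not use the pypers classes: run_pipeline, the (stage_id, function) pairs, and the dictionary used as the data object are all assumptions made for illustration, and the cfg element of the returned tuple is omitted for brevity.

```python
import time

def run_pipeline(stages, input=None, first_stage=None, last_stage=None, data=None):
    """Run stages in order; stages is a list of (stage_id, func) pairs.

    Each func takes the data dictionary and returns a dictionary of outputs.
    Hypothetical stand-in for Pipeline.process(), without cfg handling.
    """
    if first_stage is None:
        # Fresh run: previous results are not used.
        data = {'input': input}
    else:
        # Resumed run: outputs of the skipped stages must be supplied.
        assert data is not None, 'data from a previous run is required'
        data = dict(data)
    timings = {}
    started = first_stage is None
    for stage_id, func in stages:
        if stage_id == first_stage:
            started = True
        if started:
            t0 = time.perf_counter()
            data.update(func(data))
            timings[stage_id] = time.perf_counter() - t0  # seconds
        if stage_id == last_stage:
            break
    return data, timings

stages = [
    ('double', lambda d: {'doubled': d['input'] * 2}),
    ('inc', lambda d: {'result': d['doubled'] + 1}),
]

# First run stops after 'double'; the second run resumes at 'inc'.
data1, timings1 = run_pipeline(stages, input=3, last_stage='double')
data2, timings2 = run_pipeline(stages, first_stage='inc', data=data1)
```

The second call needs no input because everything the 'inc' stage consumes is already contained in data1.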
- stage(stage_id)
- class pypers.pipeline.ProcessingControl(first_stage: Optional[str] = None, last_stage: Optional[str] = None)
Bases: object
A class used to control the processing of stages in a pipeline.
This class keeps track of the first and last stages of a pipeline, and determines whether a given stage should be processed based on its position in the pipeline.
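- Parameters:
first_stage (str, optional) – The first stage to be processed.
last_stage (str, optional) – The last stage to be processed.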
- step(stage)
Determines whether the given stage should be processed.
If the stage is the first stage of the pipeline, processing starts. If the stage is the last stage of the pipeline, processing stops after this stage.
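The start/stop logic described above can be sketched as a small state machine. This is an illustration written from the description, not the library's code: the class name ProcessingControlSketch, the started attribute, and the assumption that processing starts immediately when first_stage is None are all hypothetical.

```python
from typing import Optional

class ProcessingControlSketch:
    """Decides, stage by stage, whether a stage should be processed."""

    def __init__(self, first_stage: Optional[str] = None,
                 last_stage: Optional[str] = None):
        self.first_stage = first_stage
        self.last_stage = last_stage
        # Assumption: without a first stage, processing starts right away.
        self.started = first_stage is None

    def step(self, stage: str) -> bool:
        # Reaching the first stage switches processing on.
        if not self.started and stage == self.first_stage:
            self.started = True
        do_step = self.started
        # The last stage is still processed, then processing is switched off.
        if stage == self.last_stage:
            self.started = False
        return do_step

ctrl = ProcessingControlSketch(first_stage='b', last_stage='c')
decisions = [ctrl.step(stage) for stage in ['a', 'b', 'c', 'd']]
```

With first_stage='b' and last_stage='c', only the two middle stages are selected for processing.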
- class pypers.pipeline.Stage
Bases: object
A pipeline stage.
Each stage can be controlled by a separate set of hyperparameters. Refer to the documentation of the respective pipeline stages for details. Most hyperparameters reside in namespaces, which are uniquely associated with the corresponding pipeline stages.
- Parameters:
name – Readable identifier of this stage.
id – The stage ID, used as the hyperparameter namespace. Defaults to the result of the suggest_stage_id() function if not specified.
inputs – List of inputs required by this stage.
outputs – List of outputs produced by this stage.
Automation
Hyperparameters can be set automatically using the configure() method.
Inputs and outputs
Each stage must declare its required inputs and the outputs it produces. These are used by create_pipeline() to automatically determine the stage order. The input named input is provided by the pipeline itself.
- add_callback(name, cb)
- configure(*args, **kwargs)
- consumes = []
- enabled_by_default = True
- inputs = []
- outputs = []
- process(cfg: Optional[Config] = None, log_root_dir: Optional[str] = None, out: Optional[Output] = None, **inputs)
Executes the current pipeline stage.
This method runs the current stage of the pipeline with the provided inputs, configuration parameters, and logging settings. It then returns the outputs produced by this stage.
- Parameters:
inputs – The inputs required by this stage, passed as keyword arguments. Each keyword is an input name and its value is the corresponding input.
cfg (Config, optional) – A Config object containing the hyperparameters to be used by this stage.
log_root_dir (str, optional) – The path to the directory where log files will be written. If this parameter is None, no log files will be written.
out (Output, 'muted', or None, optional) – An instance of a subclass of Output to handle the output of this stage. If this parameter is 'muted', no output will be produced. If this parameter is None, the default output handler will be used.
- Returns:
A dictionary containing the outputs produced by this stage. Each key-value pair in the dictionary represents an output name and its corresponding value.
- Return type:
dict
- remove_callback(name, cb)
- skip(data, out=None, **kwargs)
- pypers.pipeline.create_pipeline(stages: Sequence)
Creates and returns a new Pipeline object configured for the given stages.
The stage order is determined automatically.
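One way such an automatic ordering could work is a dependency-driven sort over the declared inputs and outputs. The sketch below is an assumption about the general technique, not the algorithm used by create_pipeline(): order_stages and the dictionary mapping stage IDs to (inputs, outputs) pairs are hypothetical, and only the input named input is taken to be available from the start.

```python
def order_stages(stages, available_inputs=('input',)):
    """Order stages so that each one runs after its inputs are produced.

    stages maps a stage ID to a pair (inputs, outputs) of name lists.
    """
    available = set(available_inputs)
    remaining = dict(stages)
    order = []
    while remaining:
        # Stages whose inputs have all been produced already.
        ready = [stage_id for stage_id, (inputs, _) in remaining.items()
                 if set(inputs) <= available]
        if not ready:
            raise ValueError(f'unsatisfiable inputs for: {sorted(remaining)}')
        for stage_id in ready:
            order.append(stage_id)
            _, outputs = remaining.pop(stage_id)
            available |= set(outputs)
    return order

stages = {
    'segment': (['denoised'], ['mask']),
    'denoise': (['input'], ['denoised']),
    'measure': (['mask', 'input'], ['stats']),
}
order = order_stages(stages)
```

The stages are deliberately listed out of order here; the result depends only on which outputs each stage needs, not on the order in which the stages were given.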
- pypers.pipeline.suggest_stage_id(class_name: str) → str
Suggest stage ID based on a class name.
This function validates the class name, then finds and groups tokens in the class name. Tokens are grouped if they are consecutive and alphanumeric, but do not start with numbers. The function then converts the tokens to lowercase, removes underscores, and joins them with hyphens.
- Parameters:
class_name (str) – The name of the class to suggest a configuration namespace for.
- Returns:
A string of hyphen-separated tokens from the class name.
- Return type:
str
- Raises:
AssertionError – If the class name is not valid.
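The tokenization described above leaves some room for interpretation, so the following is only one plausible reading of it, written as a stand-alone sketch. The function suggest_stage_id_sketch and its regular expressions are assumptions, and its results may differ from those of the library function, particularly for names that mix digits and capital letters.

```python
import re

def suggest_stage_id_sketch(class_name: str) -> str:
    """Turn a class name into a lowercase, hyphen-separated identifier."""
    # Validate: a Python-style identifier containing at least one letter.
    assert (re.fullmatch(r'[A-Za-z_][A-Za-z0-9_]*', class_name)
            and re.search(r'[A-Za-z]', class_name)), \
        f'not a valid class name: "{class_name}"'
    # Tokens are runs of capitals (acronyms), capitalized or lowercase
    # words, and runs of digits. Underscores match nothing and are dropped.
    tokens = re.findall(r'[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+', class_name)
    return '-'.join(token.lower() for token in tokens)

simple = suggest_stage_id_sketch('MyStage')
acronym = suggest_stage_id_sketch('HTTPServer')
underscored = suggest_stage_id_sketch('Foo_Bar')
```

In this sketch, an acronym is kept together as a single token, and the capital letter that begins the following word starts a new token.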