detectron2.evaluation package

class detectron2.evaluation.CityscapesEvaluator(dataset_name)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate instance segmentation results using the Cityscapes API.

Note

  • It does not work in multi-machine distributed training.
  • It contains a synchronization, therefore has to be used on all ranks.
__init__(dataset_name)[source]
Parameters:dataset_name (str) – the name of the dataset. It must have the following metadata associated with it: “thing_classes”, “gt_dir”.
reset()[source]
process(inputs, outputs)[source]
evaluate()[source]
Returns:dict – has a key “segm”, whose value is a dict of “AP” and “AP50”.
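
A minimal usage sketch (not from the library docs): the builtin dataset name “cityscapes_fine_instance_seg_val” is illustrative, and cfg and model are assumed to have been built as in the COCOEvaluator example further below; the data loader comes from detectron2.data.build_detection_test_loader.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import CityscapesEvaluator, inference_on_dataset

    # cfg and model are assumed to exist (see the COCOEvaluator example below).
    evaluator = CityscapesEvaluator("cityscapes_fine_instance_seg_val")
    loader = build_detection_test_loader(cfg, "cityscapes_fine_instance_seg_val")

    # The evaluator synchronizes across ranks, so run this on every rank.
    results = inference_on_dataset(model, loader, evaluator)
    print(results["segm"]["AP"], results["segm"]["AP50"])
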
class detectron2.evaluation.COCOEvaluator(dataset_name, cfg, distributed, output_dir=None)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate object proposal, instance detection/segmentation, keypoint detection outputs using COCO’s metrics and APIs.

__init__(dataset_name, cfg, distributed, output_dir=None)[source]
Parameters:
  • dataset_name (str) –

    name of the dataset to be evaluated. It must either have the following metadata associated with it:

    “json_file”: the path to the COCO format annotation,

    or it must be in detectron2’s standard dataset format, so that it can be converted to COCO format automatically.

  • cfg (CfgNode) – config instance
  • distributed (bool) – if True, will collect results from all ranks for evaluation. Otherwise, will evaluate the results in the current process.
  • output_dir (str) – optional, an output directory to dump results.
reset()[source]
process(inputs, outputs)[source]
Parameters:
  • inputs – the inputs to a COCO model (e.g., GeneralizedRCNN). It is a list of dicts. Each dict corresponds to an image and contains keys like “height”, “width”, “file_name”, “image_id”.
  • outputs – the outputs of a COCO model. It is a list of dicts with key “instances” that contains Instances.
evaluate()[source]
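
A hedged end-to-end sketch: the model-zoo config path and the builtin dataset name “coco_2017_val” are illustrative, and the setup assumes detectron2’s standard get_cfg, model_zoo, build_model, DetectionCheckpointer, and build_detection_test_loader helpers are available.

    from detectron2 import model_zoo
    from detectron2.checkpoint import DetectionCheckpointer
    from detectron2.config import get_cfg
    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset
    from detectron2.modeling import build_model

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(
        "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
        "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

    model = build_model(cfg)  # built on cfg.MODEL.DEVICE
    DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

    evaluator = COCOEvaluator("coco_2017_val", cfg, distributed=False,
                              output_dir="./coco_eval")
    loader = build_detection_test_loader(cfg, "coco_2017_val")
    results = inference_on_dataset(model, loader, evaluator)
    print(results)  # e.g. {"bbox": {...}, "segm": {...}}
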
class detectron2.evaluation.DatasetEvaluator[source]

Bases: object

Base class for a dataset evaluator.

The function inference_on_dataset() runs the model over all samples in the dataset, and uses a DatasetEvaluator to process the inputs/outputs.

This class will accumulate information of the inputs/outputs (by process()), and produce evaluation results in the end (by evaluate()).

reset()[source]

Preparation for a new round of evaluation. Should be called before starting a round of evaluation.

process(input, output)[source]

Process an input/output pair.

Parameters:
  • input – the input that’s used to call the model.
  • output – the return value of model(input)
evaluate()[source]

Evaluate/summarize the performance, after processing all input/output pairs.

Returns:dict – A new evaluator class can return a dict of arbitrary format as long as the user can process the results. In our train_net.py, we expect the following format:
  • key: the name of the task (e.g., bbox)
  • value: a dict of {metric name: score}, e.g.: {“AP50”: 80}
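
As an illustration only (not part of the library), a toy evaluator that follows the reset()/process()/evaluate() contract above and reports the average number of predicted instances per image. The distributed gather step is omitted for brevity; process() assumes the batched list format that inference_on_dataset() passes through.

    from detectron2.evaluation import DatasetEvaluator

    class InstanceCounter(DatasetEvaluator):
        """Toy evaluator: average number of predicted instances per image."""

        def reset(self):
            self._num_instances = 0
            self._num_images = 0

        def process(self, inputs, outputs):
            # `outputs` is a list of dicts with an "instances" field,
            # as produced by models such as GeneralizedRCNN.
            for output in outputs:
                self._num_instances += len(output["instances"])
                self._num_images += 1

        def evaluate(self):
            avg = self._num_instances / max(self._num_images, 1)
            # {task: {metric: score}}, the format train_net.py expects.
            return {"counting": {"instances_per_image": avg}}
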
class detectron2.evaluation.DatasetEvaluators(evaluators)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

reset()[source]
process(input, output)[source]
evaluate()[source]
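
A hedged sketch of combining several evaluators so their metrics are computed in one pass over the data; it reuses the illustrative cfg/model and dataset name from the COCOEvaluator example above, plus the toy InstanceCounter sketched under DatasetEvaluator.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import (COCOEvaluator, DatasetEvaluators,
                                       inference_on_dataset)

    # cfg and model assumed as in the COCOEvaluator example above.
    evaluator = DatasetEvaluators([
        COCOEvaluator("coco_2017_val", cfg, distributed=False, output_dir="./out"),
        InstanceCounter(),  # the toy evaluator sketched above
    ])
    loader = build_detection_test_loader(cfg, "coco_2017_val")
    results = inference_on_dataset(model, loader, evaluator)
    # `results` contains the merged result dicts, e.g. keys "bbox", "segm", "counting".
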
detectron2.evaluation.inference_context(model)[source]

A context where the model is temporarily changed to eval mode, and restored to previous mode afterwards.

Parameters:model – a torch Module
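
A brief hedged sketch of manual inference with this context manager; model and data_loader are assumed to exist (see the examples above), and torch.no_grad() is added separately because the context only toggles the train/eval mode.

    import torch
    from detectron2.evaluation import inference_context

    with inference_context(model), torch.no_grad():
        for inputs in data_loader:
            outputs = model(inputs)
            # ... inspect or post-process `outputs` here ...
    # Afterwards, model.training is restored to whatever it was before.
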
detectron2.evaluation.inference_on_dataset(model, data_loader, evaluator)[source]

Run model on the data_loader and evaluate the metrics with evaluator. The model will be used in eval mode.

Parameters:
  • model (nn.Module) –

    a module which accepts an object from data_loader and returns some outputs. It will be temporarily set to eval mode.

    If you wish to evaluate a model in training mode instead, you can wrap the given model and override its behavior of .eval() and .train().

  • data_loader – an iterable object with a length. The elements it generates will be the inputs to the model.
  • evaluator (DatasetEvaluator) – the evaluator to run. Use DatasetEvaluators([]) if you only want to benchmark, but don’t want to do any evaluation.
Returns:

The return value of evaluator.evaluate()
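
As noted above, passing DatasetEvaluators([]) turns this into a pure benchmark of the forward passes. A hedged sketch, with cfg and model assumed as in the COCOEvaluator example earlier:

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import DatasetEvaluators, inference_on_dataset

    loader = build_detection_test_loader(cfg, "coco_2017_val")
    # No metrics are computed; inference_on_dataset just runs and times the model.
    results = inference_on_dataset(model, loader, DatasetEvaluators([]))
    # `results` is whatever evaluate() returned; here it is empty.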

class detectron2.evaluation.LVISEvaluator(dataset_name, cfg, distributed, output_dir=None)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate object proposal and instance detection/segmentation outputs using LVIS’s metrics and evaluation API.

__init__(dataset_name, cfg, distributed, output_dir=None)[source]
Parameters:
  • dataset_name (str) –

    name of the dataset to be evaluated. It must have the following metadata associated with it:

    “json_file”: the path to the LVIS format annotation
  • cfg (CfgNode) – config instance
  • distributed (bool) – if True, will collect results from all ranks for evaluation. Otherwise, will evaluate the results in the current process.
  • output_dir (str) – optional, an output directory to dump results.
reset()[source]
process(inputs, outputs)[source]
Parameters:
  • inputs – the inputs to an LVIS model (e.g., GeneralizedRCNN). It is a list of dicts. Each dict corresponds to an image and contains keys like “height”, “width”, “file_name”, “image_id”.
  • outputs – the outputs of an LVIS model. It is a list of dicts with key “instances” that contains Instances.
evaluate()[source]
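
Usage parallels COCOEvaluator; a brief hedged sketch in which “lvis_v0.5_val” is an illustrative LVIS dataset name carrying the “json_file” metadata, and cfg/model are assumed as before.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import LVISEvaluator, inference_on_dataset

    evaluator = LVISEvaluator("lvis_v0.5_val", cfg, distributed=False,
                              output_dir="./lvis_eval")
    loader = build_detection_test_loader(cfg, "lvis_v0.5_val")
    results = inference_on_dataset(model, loader, evaluator)
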
class detectron2.evaluation.COCOPanopticEvaluator(dataset_name, output_dir)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate Panoptic Quality metrics on COCO using PanopticAPI. It saves the panoptic segmentation predictions in output_dir.

It contains a synchronize call and has to be called from all workers.

__init__(dataset_name, output_dir)[source]
Parameters:
  • dataset_name (str) – name of the dataset
  • output_dir (str) – output directory to save results for evaluation
reset()[source]
process(inputs, outputs)[source]
evaluate()[source]
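
A brief hedged sketch; “coco_2017_val_panoptic_separated” is an illustrative builtin name for the COCO panoptic validation split, the model is assumed to produce “panoptic_seg” outputs (e.g. Panoptic FPN), and cfg/model are set up as in the earlier examples. Because the evaluator synchronizes, it must run on all workers.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import COCOPanopticEvaluator, inference_on_dataset

    evaluator = COCOPanopticEvaluator("coco_2017_val_panoptic_separated",
                                      output_dir="./panoptic_eval")
    loader = build_detection_test_loader(cfg, "coco_2017_val_panoptic_separated")
    results = inference_on_dataset(model, loader, evaluator)  # Panoptic Quality metrics
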
class detectron2.evaluation.PascalVOCDetectionEvaluator(dataset_name)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate Pascal VOC AP. It contains a synchronization, therefore has to be called from all ranks.

Note that this is a rewrite of the official Matlab API. The results should be similar to, but not identical with, those produced by the official API.

__init__(dataset_name)[source]
Parameters:dataset_name (str) – name of the dataset, e.g., “voc_2007_test”
reset()[source]
process(inputs, outputs)[source]
evaluate()[source]
Returns:dict – has a key “bbox”, whose value is a dict of “AP”, “AP50”, and “AP75”.
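
A brief hedged sketch using the “voc_2007_test” name from the docstring above; cfg and a model trained on the VOC classes are assumed.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import (PascalVOCDetectionEvaluator,
                                       inference_on_dataset)

    evaluator = PascalVOCDetectionEvaluator("voc_2007_test")
    loader = build_detection_test_loader(cfg, "voc_2007_test")
    results = inference_on_dataset(model, loader, evaluator)
    print(results["bbox"]["AP50"])
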
class detectron2.evaluation.SemSegEvaluator(dataset_name, distributed, num_classes, ignore_label=255, output_dir=None)[source]

Bases: detectron2.evaluation.evaluator.DatasetEvaluator

Evaluate semantic segmentation metrics.

__init__(dataset_name, distributed, num_classes, ignore_label=255, output_dir=None)[source]
Parameters:
  • dataset_name (str) – name of the dataset to be evaluated.
  • distributed (bool) – if True, will collect results from all ranks for evaluation. Otherwise, will evaluate the results in the current process.
  • num_classes (int) – number of classes
  • ignore_label (int) – value in semantic segmentation ground truth. Predictions for the corresponding pixels should be ignored.
  • output_dir (str) – an output directory to dump results.
reset()[source]
process(inputs, outputs)[source]
Parameters:
  • inputs – the inputs to a model. It is a list of dicts. Each dict corresponds to an image and contains keys like “height”, “width”, “file_name”.
  • outputs – the outputs of a model. It is either list of semantic segmentation predictions (Tensor [H, W]) or list of dicts with key “sem_seg” that contains semantic segmentation prediction in the same format.
evaluate()[source]

Evaluates standard semantic segmentation metrics (http://cocodataset.org/#stuff-eval):

  • Mean intersection-over-union averaged across classes (mIoU)
  • Frequency Weighted IoU (fwIoU)
  • Mean pixel accuracy averaged across classes (mACC)
  • Pixel Accuracy (pACC)
encode_json_sem_seg(sem_seg, input_file_name)[source]

Convert semantic segmentation to COCO stuff format with segments encoded as RLEs. See http://cocodataset.org/#format-results

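A brief hedged sketch for a model that produces “sem_seg” outputs (e.g. a semantic segmentation FPN); the dataset name “cityscapes_fine_sem_seg_val” and the class count of 19 are illustrative, and cfg/model are assumed as before.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import SemSegEvaluator, inference_on_dataset

    evaluator = SemSegEvaluator("cityscapes_fine_sem_seg_val", distributed=False,
                                num_classes=19, ignore_label=255,
                                output_dir="./sem_seg_eval")
    loader = build_detection_test_loader(cfg, "cityscapes_fine_sem_seg_val")
    results = inference_on_dataset(model, loader, evaluator)
    print(results["sem_seg"]["mIoU"], results["sem_seg"]["pACC"])
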
detectron2.evaluation.print_csv_format(results)[source]

Print main metrics in a format similar to Detectron, so that they are easy to copy-paste into a spreadsheet.

Parameters:results (OrderedDict[dict]) – task_name -> {metric -> score}
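
A small hedged sketch: the results dict is hand-written here in the expected task_name -> {metric -> score} shape (normally it comes straight from an evaluator’s evaluate()), and the numbers are made up.

    from collections import OrderedDict
    from detectron2.evaluation import print_csv_format

    results = OrderedDict([
        ("bbox", {"AP": 40.2, "AP50": 61.0, "AP75": 43.8}),  # made-up numbers
        ("segm", {"AP": 36.5, "AP50": 58.1}),
    ])
    print_csv_format(results)  # logs one copy-pasteable row of metrics per task
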
detectron2.evaluation.verify_results(cfg, results)[source]
Parameters:results (OrderedDict[dict]) – task_name -> {metric -> score}
Returns:bool – whether the verification succeeds or not
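
A hedged sketch, assuming (as in detectron2’s config defaults) that the expected values live in cfg.TEST.EXPECTED_RESULTS as [task, metric, expected value, tolerance] entries; both the config entry format and the numbers here are illustrative.

    from collections import OrderedDict
    from detectron2.config import get_cfg
    from detectron2.evaluation import verify_results

    cfg = get_cfg()
    # Assumed entry format: [task, metric, expected value, tolerance].
    cfg.TEST.EXPECTED_RESULTS = [["bbox", "AP", 38.5, 0.2]]

    results = OrderedDict([("bbox", {"AP": 38.6})])  # e.g. from inference_on_dataset
    ok = verify_results(cfg, results)  # True if 38.6 is within 0.2 of 38.5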