# fvcore documentation

Detectron2 depends on utilities in fvcore. We include part of the fvcore documentation here for easier reference.

## fvcore.nn

fvcore.nn.activation_count(model: torch.nn.Module, inputs: Tuple[Any, ...], supported_ops: Optional[Dict[str, Callable[[List[Any], List[Any]], Union[Counter[str], numbers.Number]]]] = None) → Tuple[DefaultDict[str, float], Counter[str]]

Given a model and an input to the model, compute the total number of activations of the model.

Parameters
• model (nn.Module) – The model to compute activation counts.

• inputs (tuple) – Inputs that are passed to model to count activations. Inputs need to be in a tuple.

• supported_ops (dict(str,Callable) or None) – provide additional handlers for extra ops, or overwrite the existing handlers for convolution and matmul. The key is operator name and the value is a function that takes (inputs, outputs) of the op.

Returns

tuple[defaultdict, Counter] – A dictionary that records the number of activations (mega) for each operation and a Counter that records the number of unsupported operations.

class fvcore.nn.ActivationCountAnalysis(model: torch.nn.Module, inputs: Union[torch.Tensor, Tuple[torch.Tensor, ...]])

Bases: fvcore.nn.jit_analysis.JitModelAnalysis

Provides access to per-submodule model activation count obtained by tracing a model with pytorch’s jit tracing functionality. By default, comes with standard activation counters for convolutional and dot-product operators.

Handles for additional operators may be added, or the default ones overwritten, using the .set_op_handle(name, func) method. See the method documentation for details.

Activation counts can be obtained as:

• .total(module_name=""): total activation count for a module

• .by_operator(module_name=""): activation counts for the module, as a Counter over different operator types

• .by_module(): Counter of activation counts for all submodules

• .by_module_and_operator(): dictionary, indexed by descendant module name, of Counters over different operator types

An operator is treated as within a module if it is executed inside the module’s __call__ method. Note that this does not include calls to other methods of the module or explicit calls to module.forward(...).

Example usage:

>>> import torch.nn as nn
>>> import torch
>>> from fvcore.nn import ActivationCountAnalysis
>>> class TestModel(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.fc = nn.Linear(in_features=1000, out_features=10)
...         self.conv = nn.Conv2d(
...             in_channels=3, out_channels=10, kernel_size=1
...         )
...         self.act = nn.ReLU()
...     def forward(self, x):
...         return self.fc(self.act(self.conv(x)).flatten(1))

>>> model = TestModel()
>>> inputs = (torch.randn((1,3,10,10)),)
>>> acts = ActivationCountAnalysis(model, inputs)
>>> acts.total()
1010
>>> acts.total("fc")
10
>>> acts.by_operator()
Counter({"conv" : 1000, "addmm" : 10})
>>> acts.by_module()
Counter({"" : 1010, "fc" : 10, "conv" : 1000, "act" : 0})
>>> acts.by_module_and_operator()
{"" : Counter({"conv" : 1000, "addmm" : 10}),
 "fc" : Counter({"addmm" : 10}),
 "conv" : Counter({"conv" : 1000}),
 "act" : Counter()
}
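The counts in this example can be checked by hand: judging from the numbers above, each counted operator contributes the number of elements in its output. The sketch below redoes that arithmetic in plain Python, so it runs without PyTorch. The helper `conv2d_output_shape` is a hypothetical name introduced for illustration and is not part of fvcore; it assumes a square kernel.

```python
from math import prod


def conv2d_output_shape(in_shape, out_channels, kernel_size, stride=1, padding=0):
    """Output shape (N, C, H, W) of a 2D convolution with a square kernel."""
    n, _, h, w = in_shape
    out_h = (h + 2 * padding - kernel_size) // stride + 1
    out_w = (w + 2 * padding - kernel_size) // stride + 1
    return (n, out_channels, out_h, out_w)


# Same sizes as TestModel above: input (1, 3, 10, 10), 1x1 conv with 10 output channels.
conv_out = conv2d_output_shape((1, 3, 10, 10), out_channels=10, kernel_size=1)
conv_acts = prod(conv_out)  # 1 * 10 * 10 * 10 output elements
fc_acts = 1 * 10            # batch size * out_features of the linear layer
total_acts = conv_acts + fc_acts

print(conv_acts, fc_acts, total_acts)  # prints: 1000 10 1010
```

The ReLU output is not counted here, which matches the `"act" : 0` entry in the example.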

__init__(model: torch.nn.Module, inputs: Union[torch.Tensor, Tuple[torch.Tensor, ...]]) → None
Parameters
• model – The model to analyze

• inputs – The inputs to the model for analysis.

We will trace the execution of model.forward(inputs). This means inputs have to be a tensor or a tuple of tensors (see https://pytorch.org/docs/stable/generated/torch.jit.trace.html#torch.jit.trace). In order to trace other methods or unsupported input types, you may need to implement a wrapper module.

ancestor_mode(mode: str) → T

Sets how to determine the ancestor modules of an operator. Must be one of “owner” or “caller”.

• “caller”: an operator belongs to all modules that are currently executing forward() at the time the operator is called.

• “owner”: an operator belongs to the last module that’s executing forward() at the time the operator is called, plus this module’s recursive parents. If a module has multiple parents (e.g. a shared module), only one will be picked.

For most cases, a module only calls submodules it owns, so both options would work identically. In certain edge cases, this option will affect the hierarchy of results, but won’t affect the total count.

by_module() → Counter[str]

Returns the statistics for all submodules, aggregated over all operators.

Returns

Counter(str) – statistics counter grouped by submodule names

by_module_and_operator() → Dict[str, Counter[str]]

Returns the statistics for all submodules, separated out by operator type for each submodule. The operator handle determines the name associated with each operator type.

Returns

dict(str, Counter(str)) – The statistics for each submodule and each operator. Grouped by submodule names, then by operator name.

by_operator(module_name: str = '') → Counter[str]

Returns the statistics for a requested module, grouped by operator type. The operator handle determines the name associated with each operator type.

Parameters

module_name (str) – The submodule to get data for. Defaults to the entire model.

Returns

Counter(str) – The statistics for each operator.

canonical_module_name(name: str) → str

Returns the canonical module name of the given name, which might be different from the given name if the module is shared. This is the name that will be used as a key when statistics are output using .by_module() and .by_module_and_operator().

Parameters

name (str) – The name of the module to find the canonical name for.

Returns

str – The canonical name of the module.

clear_op_handles() → fvcore.nn.jit_analysis.JitModelAnalysis

Clears all operator handles currently set.

copy(new_model: Optional[torch.nn.Module] = None, new_inputs: Union[None, torch.Tensor, Tuple[torch.Tensor, ...]] = None) → fvcore.nn.jit_analysis.JitModelAnalysis

Returns a copy of the JitModelAnalysis object, keeping all settings, but on a new model or new inputs.

Parameters
• new_model (nn.Module or None) – a new model for the new JitModelAnalysis. If None, uses the original model.

• new_inputs (typing.Tuple[object, ...] or None) – new inputs for the new JitModelAnalysis. If None, uses the original inputs.

Returns

JitModelAnalysis – the new model analysis object

set_op_handle(*args, **kwargs: Optional[Callable[[List[Any], List[Any]], Union[Counter[str], numbers.Number]]]) → fvcore.nn.jit_analysis.JitModelAnalysis

Sets additional operator handles, or replaces existing ones.

Parameters
• args – (str, Handle) pairs of operator names and handles.

• kwargs – mapping from operator names to handles.

If a handle is None, the op will be explicitly ignored. Otherwise, handle should be a function that calculates the desirable statistic from an operator. The function must take two arguments, which are the inputs and outputs of the operator, in the form of list(torch._C.Value). The function should return a counter object with per-operator statistics.

Examples

handlers = {"aten::linear": my_handler}
counter.set_op_handle(
    "aten::matmul", None, "aten::bmm", my_handler2
).set_op_handle(**handlers)
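To make the calling convention concrete, here is a toy sketch of the semantics described above: positional (name, handle) pairs, keyword mappings, `None` meaning "explicitly ignore", and chaining through the returned object. `ToyAnalysis`, its `count` method, and `count_outputs` are hypothetical stand-ins written for this illustration. They are not fvcore's `JitModelAnalysis`, and they operate on plain lists rather than `torch._C.Value` objects.

```python
from collections import Counter


class ToyAnalysis:
    """Hypothetical stand-in for illustration only; NOT fvcore's JitModelAnalysis."""

    def __init__(self):
        self._handles = {}

    def set_op_handle(self, *args, **kwargs):
        # Positional arguments come as (name, handle) pairs.
        if len(args) % 2 != 0:
            raise TypeError("positional arguments must be (name, handle) pairs")
        for name, handle in zip(args[::2], args[1::2]):
            self._handles[name] = handle
        self._handles.update(kwargs)
        return self  # returning self is what makes chaining possible

    def count(self, ops):
        """ops: list of (op_name, inputs, outputs) triples."""
        stats, unsupported = Counter(), Counter()
        for name, inputs, outputs in ops:
            if name not in self._handles:
                unsupported[name] += 1           # no handle registered
            elif self._handles[name] is not None:
                stats[name] += self._handles[name](inputs, outputs)
            # a handle of None means the op is explicitly ignored
        return stats, unsupported


def count_outputs(inputs, outputs):
    # Toy statistic: total number of output elements.
    return sum(len(o) for o in outputs)


analysis = (
    ToyAnalysis()
    .set_op_handle("aten::matmul", None, "aten::bmm", count_outputs)
    .set_op_handle(**{"aten::linear": count_outputs})
)
ops = [
    ("aten::linear", [[1, 2]], [[0, 0, 0]]),
    ("aten::matmul", [[1]], [[0]]),
    ("aten::relu", [[1]], [[0]]),
]
stats, unsupported = analysis.count(ops)
```

In this toy run, `aten::linear` is counted, `aten::matmul` is silently skipped because its handle is `None`, and `aten::relu` is reported as unsupported because it has no handle at all.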

total(module_name: str = '') → int

Returns the total aggregated statistic across all operators for the requested module.

Parameters

module_name (str) – The submodule to get data for. Defaults to the entire model.

Returns

int – The aggregated statistic.

tracer_warnings(mode: str) → T

Sets which warnings to print when tracing the graph to calculate statistics. There are three modes. Defaults to ‘no_tracer_warning’. Allowed values are:

• ‘all’ : keeps all warnings raised while tracing

• ‘no_tracer_warning’ : suppress torch.jit.TracerWarning only

• ‘none’ : suppress all warnings raised while tracing

Parameters

mode (str) – warning mode in one of the above values.

uncalled_modules() → Set[str]

Returns a set of submodules that were never called during the trace of the graph. This may be because they were unused, or because they were accessed via direct calls to .forward() or with other Python methods. In the latter case, statistics will not be attributed to the submodule, though the statistics will be included in the parent module.

Returns

set(str) – The set of submodule names that were never called during the trace of the model.

uncalled_modules_warnings(enabled: bool) → T

Sets if warnings from uncalled submodules are shown. Defaults to True. A submodule is considered “uncalled” if it is never called during tracing. This may be because it is actually unused, or because it is accessed via calls to .forward() or other methods of the module. The set of uncalled modules may be obtained from uncalled_modules() regardless of this setting.

Parameters

enabled (bool) – Set to ‘True’ to show warnings.

unsupported_ops(module_name: str = '') → Counter[str]

Lists the number of operators that were encountered but unsupported because no operator handle is available for them. Does not include operators that are explicitly ignored.

Parameters

module_name (str) – The submodule to list unsupported ops. Defaults to the entire model.

Returns

Counter(str) – The number of occurrences of each unsupported operator.

unsupported_ops_warnings(enabled: bool) → T

Sets if warnings for unsupported operators are shown. Defaults to True. Counts of unsupported operators may be obtained from unsupported_ops() regardless of this setting.

Parameters

enabled (bool) – Set to ‘True’ to show unsupported operator warnings.

fvcore.nn.flop_count(model: torch.nn.Module, inputs: Tuple[Any, ...], supported_ops: Optional[Dict[str, Callable[[List[Any], List[Any]], Union[Counter[str], numbers.Number]]]] = None) → Tuple[DefaultDict[str, float], Counter[str]]

Given a model and an input to the model, compute the per-operator Gflops of the given model.

Parameters
• model (nn.Module) – The model to compute flop counts.

• inputs (tuple) – Inputs that are passed to model to count flops. Inputs need to be in a tuple.

• supported_ops (dict(str,Callable) or None) – provide additional handlers for extra ops, or overwrite the existing handlers for convolution, matmul, and einsum. The key is operator name and the value is a function that takes (inputs, outputs) of the op. We count one Multiply-Add as one FLOP.

Returns

tuple[defaultdict, Counter] – A dictionary that records the number of gflops for each operation and a Counter that records the number of unsupported operations.

class fvcore.nn.FlopCountAnalysis(model: torch.nn.Module, inputs: Union[torch.Tensor, Tuple[torch.Tensor, ...]])

Bases: fvcore.nn.jit_analysis.JitModelAnalysis

Provides access to per-submodule model flop count obtained by tracing a model with pytorch’s jit tracing functionality. By default, comes with standard flop counters for a few common operators. Note that:

1. Flop is not a well-defined concept. We just produce our best estimate.

2. We count one fused multiply-add as one flop.

Handles for additional operators may be added, or the default ones overwritten, using the .set_op_handle(name, func) method. See the method documentation for details.

Flop counts can be obtained as:

• .total(module_name=""): total flop count for the module

• .by_operator(module_name=""): flop counts for the module, as a Counter over different operator types

• .by_module(): Counter of flop counts for all submodules

• .by_module_and_operator(): dictionary, indexed by descendant module name, of Counters over different operator types

An operator is treated as within a module if it is executed inside the module’s __call__ method. Note that this does not include calls to other methods of the module or explicit calls to module.forward(...).

Example usage:

>>> import torch.nn as nn
>>> import torch
>>> from fvcore.nn import FlopCountAnalysis
>>> class TestModel(nn.Module):
...    def __init__(self):
...        super().__init__()
...        self.fc = nn.Linear(in_features=1000, out_features=10)
...        self.conv = nn.Conv2d(
...            in_channels=3, out_channels=10, kernel_size=1
...        )
...        self.act = nn.ReLU()
...    def forward(self, x):
...        return self.fc(self.act(self.conv(x)).flatten(1))

>>> model = TestModel()
>>> inputs = (torch.randn((1,3,10,10)),)
>>> flops = FlopCountAnalysis(model, inputs)
>>> flops.total()
13000
>>> flops.total("fc")
10000
>>> flops.by_operator()
Counter({"addmm" : 10000, "conv" : 3000})
>>> flops.by_module()
Counter({"" : 13000, "fc" : 10000, "conv" : 3000, "act" : 0})
>>> flops.by_module_and_operator()
{"" : Counter({"addmm" : 10000, "conv" : 3000}),
 "fc" : Counter({"addmm" : 10000}),
 "conv" : Counter({"conv" : 3000}),
 "act" : Counter()
}
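The flop counts in this example follow from the "one multiply-add is one flop" convention stated above. The sketch below reproduces the numbers in plain Python. The helper names `conv2d_flops` and `linear_flops` are hypothetical and only model the common case (bias terms are ignored); fvcore's own handlers may differ in details.

```python
def conv2d_flops(batch, in_channels, out_channels, out_h, out_w,
                 kernel_h, kernel_w, groups=1):
    """Multiply-adds of a 2D convolution: one per kernel weight per output element."""
    per_output = (in_channels // groups) * kernel_h * kernel_w
    return batch * out_channels * out_h * out_w * per_output


def linear_flops(batch, in_features, out_features):
    """Multiply-adds of a fully connected layer (a batch x in by in x out matmul)."""
    return batch * in_features * out_features


# Same sizes as TestModel above.
conv_flops = conv2d_flops(batch=1, in_channels=3, out_channels=10,
                          out_h=10, out_w=10, kernel_h=1, kernel_w=1)
fc_flops = linear_flops(batch=1, in_features=1000, out_features=10)
total_flops = conv_flops + fc_flops

print(conv_flops, fc_flops, total_flops)  # prints: 3000 10000 13000
```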

__init__(model: torch.nn.Module, inputs: Union[torch.Tensor, Tuple[torch.Tensor, ...]]) → None
Parameters
• model – The model to analyze

• inputs – The inputs to the model for analysis.

We will trace the execution of model.forward(inputs). This means inputs have to be a tensor or a tuple of tensors (see https://pytorch.org/docs/stable/generated/torch.jit.trace.html#torch.jit.trace). In order to trace other methods or unsupported input types, you may need to implement a wrapper module.

ancestor_mode(mode: str) → T

Sets how to determine the ancestor modules of an operator. Must be one of “owner” or “caller”.

• “caller”: an operator belongs to all modules that are currently executing forward() at the time the operator is called.

• “owner”: an operator belongs to the last module that’s executing forward() at the time the operator is called, plus this module’s recursive parents. If a module has multiple parents (e.g. a shared module), only one will be picked.

For most cases, a module only calls submodules it owns, so both options would work identically. In certain edge cases, this option will affect the hierarchy of results, but won’t affect the total count.

by_module() → Counter[str]

Returns the statistics for all submodules, aggregated over all operators.

Returns

Counter(str) – statistics counter grouped by submodule names

by_module_and_operator() → Dict[str, Counter[str]]

Returns the statistics for all submodules, separated out by operator type for each submodule. The operator handle determines the name associated with each operator type.

Returns

dict(str, Counter(str)) – The statistics for each submodule and each operator. Grouped by submodule names, then by operator name.

by_operator(module_name: str = '') → Counter[str]

Returns the statistics for a requested module, grouped by operator type. The operator handle determines the name associated with each operator type.

Parameters

module_name (str) – The submodule to get data for. Defaults to the entire model.

Returns

Counter(str) – The statistics for each operator.

canonical_module_name(name: str) → str

Returns the canonical module name of the given name, which might be different from the given name if the module is shared. This is the name that will be used as a key when statistics are output using .by_module() and .by_module_and_operator().

Parameters

name (str) – The name of the module to find the canonical name for.

Returns

str – The canonical name of the module.

clear_op_handles() → fvcore.nn.jit_analysis.JitModelAnalysis

Clears all operator handles currently set.

copy(new_model: Optional[torch.nn.Module] = None, new_inputs: Union[None, torch.Tensor, Tuple[torch.Tensor, ...]] = None) → fvcore.nn.jit_analysis.JitModelAnalysis

Returns a copy of the JitModelAnalysis object, keeping all settings, but on a new model or new inputs.

Parameters
• new_model (nn.Module or None) – a new model for the new JitModelAnalysis. If None, uses the original model.

• new_inputs (typing.Tuple[object, ...] or None) – new inputs for the new JitModelAnalysis. If None, uses the original inputs.

Returns

JitModelAnalysis – the new model analysis object

set_op_handle(*args, **kwargs: Optional[Callable[[List[Any], List[Any]], Union[Counter[str], numbers.Number]]]) → fvcore.nn.jit_analysis.JitModelAnalysis

Sets additional operator handles, or replaces existing ones.

Parameters
• args – (str, Handle) pairs of operator names and handles.

• kwargs – mapping from operator names to handles.

If a handle is None, the op will be explicitly ignored. Otherwise, handle should be a function that calculates the desirable statistic from an operator. The function must take two arguments, which are the inputs and outputs of the operator, in the form of list(torch._C.Value). The function should return a counter object with per-operator statistics.

Examples

handlers = {"aten::linear": my_handler}
counter.set_op_handle(
    "aten::matmul", None, "aten::bmm", my_handler2
).set_op_handle(**handlers)

total(module_name: str = '') → int

Returns the total aggregated statistic across all operators for the requested module.

Parameters

module_name (str) – The submodule to get data for. Defaults to the entire model.

Returns

int – The aggregated statistic.

tracer_warnings(mode: str) → T

Sets which warnings to print when tracing the graph to calculate statistics. There are three modes. Defaults to ‘no_tracer_warning’. Allowed values are:

• ‘all’ : keeps all warnings raised while tracing

• ‘no_tracer_warning’ : suppress torch.jit.TracerWarning only

• ‘none’ : suppress all warnings raised while tracing

Parameters

mode (str) – warning mode in one of the above values.

uncalled_modules() → Set[str]

Returns a set of submodules that were never called during the trace of the graph. This may be because they were unused, or because they were accessed via direct calls to .forward() or with other Python methods. In the latter case, statistics will not be attributed to the submodule, though the statistics will be included in the parent module.

Returns

set(str) – The set of submodule names that were never called during the trace of the model.

uncalled_modules_warnings(enabled: bool) → T

Sets if warnings from uncalled submodules are shown. Defaults to True. A submodule is considered “uncalled” if it is never called during tracing. This may be because it is actually unused, or because it is accessed via calls to .forward() or other methods of the module. The set of uncalled modules may be obtained from uncalled_modules() regardless of this setting.

Parameters

enabled (bool) – Set to ‘True’ to show warnings.

unsupported_ops(module_name: str = '') → Counter[str]

Lists the number of operators that were encountered but unsupported because no operator handle is available for them. Does not include operators that are explicitly ignored.

Parameters

module_name (str) – The submodule to list unsupported ops. Defaults to the entire model.

Returns

Counter(str) – The number of occurrences of each unsupported operator.

unsupported_ops_warnings(enabled: bool) → T

Sets if warnings for unsupported operators are shown. Defaults to True. Counts of unsupported operators may be obtained from unsupported_ops() regardless of this setting.

Parameters

enabled (bool) – Set to ‘True’ to show unsupported operator warnings.

fvcore.nn.sigmoid_focal_loss(inputs: torch.Tensor, targets: torch.Tensor, alpha: float = -1, gamma: float = 2, reduction: str = 'none') → torch.Tensor

Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.

Parameters
• inputs – A float tensor of arbitrary shape. The predictions for each example.

• targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).

• alpha – (optional) Weighting factor in range (0,1) to balance positive vs negative examples. Default = -1 (no weighting).

• gamma – Exponent of the modulating factor (1 - p_t) to balance easy vs hard examples.

• reduction – ‘none’ | ‘mean’ | ‘sum’. ‘none’: No reduction will be applied to the output. ‘mean’: The output will be averaged. ‘sum’: The output will be summed.

Returns

Loss tensor with the reduction option applied.
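As a rough illustration of how alpha and gamma enter the loss, the sketch below evaluates the focal loss from the RetinaNet paper for a single logit in plain Python. It assumes the inputs are raw logits passed through a sigmoid, and that a negative alpha disables the class weighting as the parameter description states. `focal_loss_scalar` is a hypothetical helper, not fvcore's tensor implementation, and it is not numerically stable for extreme logits.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def focal_loss_scalar(logit, target, alpha=-1.0, gamma=2.0):
    """Focal loss for one prediction; target is 0.0 or 1.0."""
    p = sigmoid(logit)
    # Plain binary cross-entropy.
    ce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    # p_t is the probability assigned to the true class.
    p_t = p * target + (1.0 - p) * (1.0 - target)
    loss = ce * (1.0 - p_t) ** gamma  # down-weight easy examples
    if alpha >= 0:
        # Balance positives (alpha) against negatives (1 - alpha).
        alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
        loss *= alpha_t
    return loss


easy = focal_loss_scalar(4.0, 1.0)    # confident and correct
hard = focal_loss_scalar(-4.0, 1.0)   # confident and wrong
```

With gamma = 0 and alpha = -1 the function reduces to ordinary binary cross-entropy, and `easy` is far smaller than `hard` because of the modulating factor.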

fvcore.nn.sigmoid_focal_loss_star(inputs: torch.Tensor, targets: torch.Tensor, alpha: float = -1, gamma: float = 1, reduction: str = 'none') → torch.Tensor

FL* described in RetinaNet paper Appendix: https://arxiv.org/abs/1708.02002.

Parameters
• inputs – A float tensor of arbitrary shape. The predictions for each example.

• targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).

• alpha – (optional) Weighting factor in range (0,1) to balance positive vs negative examples. Default = -1 (no weighting).

• gamma – Gamma parameter described in FL*. Default = 1 (no weighting).

• reduction – ‘none’ | ‘mean’ | ‘sum’. ‘none’: No reduction will be applied to the output. ‘mean’: The output will be averaged. ‘sum’: The output will be summed.

Returns

Loss tensor with the reduction option applied.
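The paper's appendix defines FL* as -log(sigmoid(gamma * x_t + beta)) / gamma, where x_t is the logit for positives and its negation for negatives. The sketch below evaluates that formula for a single logit, under the assumption that the shift beta is 0 (this function has no beta parameter). `focal_loss_star_scalar` is a hypothetical helper for illustration, not fvcore's implementation.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def focal_loss_star_scalar(logit, target, alpha=-1.0, gamma=1.0):
    """FL* for one prediction, assuming beta = 0; target is 0.0 or 1.0."""
    x_t = logit * (2.0 * target - 1.0)  # +logit for positives, -logit for negatives
    loss = -log_sigmoid(gamma * x_t) / gamma
    if alpha >= 0:
        alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
        loss *= alpha_t
    return loss


# With gamma = 1 the loss equals binary cross-entropy on the logit.
bce_like = focal_loss_star_scalar(1.5, 0.0)
```

This is consistent with the parameter description: at the default gamma = 1 no extra weighting is applied.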

fvcore.nn.giou_loss(boxes1: torch.Tensor, boxes2: torch.Tensor, reduction: str = 'none', eps: float = 1e-07) → torch.Tensor

Generalized Intersection over Union Loss (Hamid Rezatofighi et al.): https://arxiv.org/abs/1902.09630

Gradient-friendly IoU loss with an additional penalty that is non-zero when the boxes do not overlap and scales with the size of their smallest enclosing box. This loss is symmetric, so the boxes1 and boxes2 arguments are interchangeable.

Parameters
• boxes1 (Tensor) – box locations in XYXY format, shape (N, 4) or (4,).

• boxes2 (Tensor) – box locations in XYXY format, shape (N, 4) or (4,).

• reduction – ‘none’ | ‘mean’ | ‘sum’. ‘none’: No reduction will be applied to the output. ‘mean’: The output will be averaged. ‘sum’: The output will be summed.

• eps (float) – small number to prevent division by zero
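For a single pair of boxes, the GIoU loss from the paper can be computed as 1 - (IoU - (C - U) / C), where U is the union area and C is the area of the smallest enclosing box. The plain-Python sketch below follows that definition; `giou_loss_single` is a hypothetical helper, and placing eps in both denominators is an assumption made here to avoid division by zero.

```python
def giou_loss_single(box1, box2, eps=1e-7):
    """GIoU loss for two boxes in XYXY format (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box1
    x1g, y1g, x2g, y2g = box2

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(x2, x2g) - max(x1, x1g))
    inter_h = max(0.0, min(y2, y2g) - max(y1, y1g))
    inter = inter_w * inter_h

    union = (x2 - x1) * (y2 - y1) + (x2g - x1g) * (y2g - y1g) - inter
    iou = inter / (union + eps)

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(x2, x2g) - min(x1, x1g)
    enclose_h = max(y2, y2g) - min(y1, y1g)
    enclose = enclose_w * enclose_h

    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou


same = giou_loss_single((0, 0, 1, 1), (0, 0, 1, 1))      # close to 0
disjoint = giou_loss_single((0, 0, 1, 1), (2, 0, 3, 1))  # close to 4/3
```

The disjoint case shows the extra penalty: IoU is 0, yet the loss exceeds 1 because the enclosing box contains area covered by neither input.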

fvcore.nn.parameter_count(model: torch.nn.Module) → DefaultDict[str, int]

Count parameters of a model and its submodules.

Parameters

model – a torch module

Returns

dict (str -> int) – the key is either a parameter name or a module name. The value is the number of elements in the parameter, or in all parameters of the module. The key “” corresponds to the total number of parameters of the model.
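The shape of the returned dictionary can be illustrated without PyTorch: every dotted prefix of a parameter name, including the empty prefix, accumulates that parameter's element count. The sketch below is one way to produce such a mapping from a name-to-shape dictionary. `count_by_prefix` is a hypothetical helper, and the two shapes are taken from the `backbone.fpn_lateral3` rows of the `parameter_count_table` listing in this document.

```python
from collections import defaultdict
from math import prod


def count_by_prefix(param_shapes):
    """Aggregate element counts over every dotted prefix of each parameter name."""
    counts = defaultdict(int)
    for name, shape in param_shapes.items():
        numel = prod(shape)
        parts = name.split(".")
        # k = 0 gives the "" key (whole model); k = len(parts) gives the parameter itself.
        for k in range(len(parts) + 1):
            counts[".".join(parts[:k])] += numel
    return counts


shapes = {
    "backbone.fpn_lateral3.weight": (256, 512, 1, 1),
    "backbone.fpn_lateral3.bias": (256,),
}
counts = count_by_prefix(shapes)
print(counts[""], counts["backbone.fpn_lateral3.bias"])  # prints: 131328 256
```

The 131,328 elements under `backbone.fpn_lateral3` agree with the 0.1M shown for that module in the table.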

fvcore.nn.parameter_count_table(model: torch.nn.Module, max_depth: int = 3) → str

Format the parameter count of the model (and its submodules or parameters) in a nice table. It looks like this:

| name                            | #elements or shape   |
|:--------------------------------|:---------------------|
| model                           | 37.9M                |
|  backbone                       |  31.5M               |
|   backbone.fpn_lateral3         |   0.1M               |
|    backbone.fpn_lateral3.weight |    (256, 512, 1, 1)  |
|    backbone.fpn_lateral3.bias   |    (256,)            |
|   backbone.fpn_output3          |   0.6M               |
|    backbone.fpn_output3.weight  |    (256, 256, 3, 3)  |
|    backbone.fpn_output3.bias    |    (256,)            |
|   backbone.fpn_lateral4         |   0.3M               |
|    backbone.fpn_lateral4.weight |    (256, 1024, 1, 1) |
|    backbone.fpn_lateral4.bias   |    (256,)            |
|   backbone.fpn_output4          |   0.6M               |
|    backbone.fpn_output4.weight  |    (256, 256, 3, 3)  |
|    backbone.fpn_output4.bias    |    (256,)            |
|   backbone.fpn_lateral5         |   0.5M               |
|    backbone.fpn_lateral5.weight |    (256, 2048, 1, 1) |
|    backbone.fpn_lateral5.bias   |    (256,)            |
|   backbone.fpn_output5          |   0.6M               |
|    backbone.fpn_output5.weight  |    (256, 256, 3, 3)  |
|    backbone.fpn_output5.bias    |    (256,)            |
|   backbone.top_block            |   5.3M               |
|    backbone.top_block.p6        |    4.7M              |
|    backbone.top_block.p7        |    0.6M              |
|   backbone.bottom_up            |   23.5M              |
|    backbone.bottom_up.stem      |    9.4K              |
|    backbone.bottom_up.res2      |    0.2M              |
|    backbone.bottom_up.res3      |    1.2M              |
|    backbone.bottom_up.res4      |    7.1M              |
|    backbone.bottom_up.res5      |    14.9M             |
|    ......                       |    .....             |

Parameters
• model – a torch module

• max_depth (int) – maximum depth to recursively print submodules or parameters

Returns

str – the table to be printed

fvcore.nn.get_bn_modules(model: torch.nn.Module) → List[torch.nn.Module]

Find all BatchNorm (BN) modules that are in training mode. See fvcore.nn.precise_bn.BN_MODULE_TYPES for a list of all modules that are included in this search.

Parameters

model (nn.Module) – a model possibly containing BN modules.

Returns

list[nn.Module] – all BN modules in the model.

fvcore.nn.update_bn_stats(model: torch.nn.Module, data_loader: Iterable[Any], num_iters: int = 200, progress: Optional[str] = None) → None

Recompute and update the batch norm stats to make them more precise. During training, both the BN stats and the weights change after every iteration, so the running average cannot precisely reflect the actual stats of the current model. In this function, the BN stats are recomputed with fixed weights, to make the running average more precise. Specifically, it computes the true average of per-batch mean/variance instead of the running average. See Sec. 3 of the paper “Rethinking Batch in BatchNorm” for details.

Parameters
• model (nn.Module) – the model whose BN stats will be recomputed.

Note that:

1. This function will not alter the training mode of the given model. Users are responsible for setting the layers that need precise-BN to training mode prior to calling this function.

2. Be careful if your model contains other stateful layers in addition to BN, i.e. layers whose state can change in forward iterations. This function will alter their state. If you wish them unchanged, you need to either pass in a submodule without those layers, or back up the states.

• data_loader (iterator) – an iterator that produces data as inputs to the model.

• num_iters (int) – number of iterations to compute the stats.

• progress – None or “tqdm”. If set, use tqdm to report the progress.
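The difference between a running average and the true average of per-batch statistics can be seen on a handful of numbers. The sketch below is a plain-Python illustration of that distinction only, not of fvcore's implementation. The momentum of 0.1 and the zero initial value mirror PyTorch's BatchNorm defaults for the running mean, and the batch means are made-up values.

```python
def running_average(values, momentum=0.1, init=0.0):
    """Exponential moving average, as kept by BatchNorm during training."""
    avg = init
    for v in values:
        avg = (1.0 - momentum) * avg + momentum * v
    return avg


def true_average(values):
    """Plain arithmetic mean over all batches."""
    return sum(values) / len(values)


batch_means = [1.0, 2.0, 3.0, 4.0]  # made-up per-batch means
ema = running_average(batch_means)
exact = true_average(batch_means)
```

After four batches the running average is still dominated by its initial value (about 0.905 here), while the true average is 2.5. With fixed weights and enough iterations, averaging the per-batch statistics directly removes that bias.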

fvcore.nn.flop_count_str(flops: fvcore.nn.flop_count.FlopCountAnalysis, activations: Optional[fvcore.nn.activation_count.ActivationCountAnalysis] = None) → str

Calculates the parameters and flops of the model with the given inputs and returns a string representation of the model that includes the parameters and flops of every submodule. The string is structured to be similar to that given by str(model), though it is not guaranteed to be identical in form if the default string representation of a module has been overridden. If a module has zero parameters and flops, statistics will not be reported for succinctness.

The trace can only register the scope of a module if it is called directly, which means flops (and activations) arising from explicit calls to .forward() or to other Python functions of the module will not be attributed to that module. Modules that are never called will have ‘N/A’ listed for their flops; this means they are either unused or their statistics are missing for this reason. Any such flops are still counted towards the parent module.

Example:

>>> import torch
>>> import torch.nn as nn
>>> from fvcore.nn import FlopCountAnalysis, flop_count_str

>>> class InnerNet(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.fc1 = nn.Linear(10,10)
...         self.fc2 = nn.Linear(10,10)
...     def forward(self, x):
...         return self.fc1(self.fc2(x))

>>> class TestNet(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.fc1 = nn.Linear(10,10)
...         self.fc2 = nn.Linear(10,10)
...         self.inner = InnerNet()
...     def forward(self, x):
...         return self.fc1(self.fc2(self.inner(x)))

>>> model = TestNet()
>>> inputs = torch.randn((1,10))
>>> print(flop_count_str(FlopCountAnalysis(model, inputs)))
TestNet(
  #params: 0.44K, #flops: 0.4K
  (fc1): Linear(
    in_features=10, out_features=10, bias=True
    #params: 0.11K, #flops: 100
  )
  (fc2): Linear(
    in_features=10, out_features=10, bias=True
    #params: 0.11K, #flops: 100
  )
  (inner): InnerNet(
    #params: 0.22K, #flops: 0.2K
    (fc1): Linear(
      in_features=10, out_features=10, bias=True
      #params: 0.11K, #flops: 100
    )
    (fc2): Linear(
      in_features=10, out_features=10, bias=True
      #params: 0.11K, #flops: 100
    )
  )
)

Parameters
• flops (FlopCountAnalysis) – the flop counting object

• activations (ActivationCountAnalysis or None) – If given, the activations of each layer will also be calculated and included in the representation.

Returns

str – a string representation of the model with the number of parameters and flops included.

fvcore.nn.flop_count_table(flops: fvcore.nn.flop_count.FlopCountAnalysis, max_depth: int = 3, activations: Optional[fvcore.nn.activation_count.ActivationCountAnalysis] = None, show_param_shapes: bool = True) → str

Format the per-module parameters and flops of a model in a table. It looks like this:

| model                            | #parameters or shape   | #flops    |
|:---------------------------------|:-----------------------|:----------|
| model                            | 34.6M                  | 65.7G     |
|  s1                              |  15.4K                 |  4.32G    |
|   s1.pathway0_stem               |   9.54K                |   1.23G   |
|    s1.pathway0_stem.conv         |    9.41K               |    1.23G  |
|    s1.pathway0_stem.bn           |    0.128K              |           |
|   s1.pathway1_stem               |   5.9K                 |   3.08G   |
|    s1.pathway1_stem.conv         |    5.88K               |    3.08G  |
|    s1.pathway1_stem.bn           |    16                  |           |
|  s1_fuse                         |  0.928K                |  29.4M    |
|   s1_fuse.conv_f2s               |   0.896K               |   29.4M   |
|    s1_fuse.conv_f2s.weight       |    (16, 8, 7, 1, 1)    |           |
|   s1_fuse.bn                     |   32                   |           |
|    s1_fuse.bn.weight             |    (16,)               |           |
|    s1_fuse.bn.bias               |    (16,)               |           |
|  s2                              |  0.226M                |  7.73G    |
|   s2.pathway0_res0               |   80.1K                |   2.58G   |
|    s2.pathway0_res0.branch1      |    20.5K               |    0.671G |
|    s2.pathway0_res0.branch1_bn   |    0.512K              |           |
|    s2.pathway0_res0.branch2      |    59.1K               |    1.91G  |
|   s2.pathway0_res1.branch2       |   70.4K                |   2.28G   |
|    s2.pathway0_res1.branch2.a    |    16.4K               |    0.537G |
|    s2.pathway0_res1.branch2.a_bn |    0.128K              |           |
|    s2.pathway0_res1.branch2.b    |    36.9K               |    1.21G  |
|    s2.pathway0_res1.branch2.b_bn |    0.128K              |           |
|    s2.pathway0_res1.branch2.c    |    16.4K               |    0.537G |
|    s2.pathway0_res1.branch2.c_bn |    0.512K              |           |
|   s2.pathway0_res2.branch2       |   70.4K                |   2.28G   |
|    s2.pathway0_res2.branch2.a    |    16.4K               |    0.537G |
|    s2.pathway0_res2.branch2.a_bn |    0.128K              |           |
|    s2.pathway0_res2.branch2.b    |    36.9K               |    1.21G  |
|    s2.pathway0_res2.branch2.b_bn |    0.128K              |           |
|    s2.pathway0_res2.branch2.c    |    16.4K               |    0.537G |
|    s2.pathway0_res2.branch2.c_bn |    0.512K              |           |
|    ............................. |    ......              |    ...... |

Parameters
• flops (FlopCountAnalysis) – the flop counting object

• max_depth (int) – The max depth of submodules to include in the table. Defaults to 3.

• activations (ActivationCountAnalysis or None) – If given, include activation counts as an additional column in the table.

• show_param_shapes (bool) – If true, shapes for parameters will be included in the table. Defaults to True.

Returns

str – The formatted table.

Examples:

print(flop_count_table(FlopCountAnalysis(model, inputs)))

fvcore.nn.smooth_l1_loss(input: torch.Tensor, target: torch.Tensor, beta: float, reduction: str = 'none') → torch.Tensor[source]

Smooth L1 loss defined in the Fast R-CNN paper as:

              | 0.5 * x ** 2 / beta   if abs(x) < beta
smoothl1(x) = |
              | abs(x) - 0.5 * beta   otherwise,


where x = input - target.

Smooth L1 loss is related to Huber loss, which is defined as:

           | 0.5 * x ** 2                  if abs(x) < beta
huber(x) = |
           | beta * (abs(x) - 0.5 * beta)  otherwise


Smooth L1 loss is equal to huber(x) / beta. This leads to the following differences:

• As beta -> 0, Smooth L1 loss converges to L1 loss, while Huber loss converges to a constant 0 loss.

• As beta -> +inf, Smooth L1 converges to a constant 0 loss, while Huber loss converges to L2 loss.

• For Smooth L1 loss, as beta varies, the L1 segment of the loss has a constant slope of 1. For Huber loss, the slope of the L1 segment is beta.

Smooth L1 loss can be seen as exactly L1 loss, but with the abs(x) < beta portion replaced with a quadratic function such that at abs(x) = beta, its slope is 1. The quadratic segment smooths the L1 loss near x = 0.

Parameters
• input (Tensor) – input tensor of any shape

• target (Tensor) – target value tensor with the same shape as input

• beta (float) – L1 to L2 change point. For beta values < 1e-5, L1 loss is computed.

• reduction (str) – ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied to the output. ‘mean’: the output will be averaged. ‘sum’: the output will be summed.

Returns

The loss with the reduction option applied.

Note

PyTorch’s builtin “Smooth L1 loss” implementation does not actually implement Smooth L1 loss, nor does it implement Huber loss. It implements the special case of both in which they are equal (beta=1). See: https://pytorch.org/docs/stable/nn.html#torch.nn.SmoothL1Loss.
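
The relationship between the two losses described above can be checked numerically. The following is a minimal pure-Python sketch for scalar inputs; the helper names `smooth_l1` and `huber` are illustrative only and are not part of fvcore, which operates on tensors.

```python
def smooth_l1(x: float, beta: float) -> float:
    """Scalar Smooth L1 loss for x = input - target (illustrative helper)."""
    if beta < 1e-5:
        # Mirrors the documented behaviour: a tiny beta falls back to plain L1.
        return abs(x)
    if abs(x) < beta:
        return 0.5 * x ** 2 / beta
    return abs(x) - 0.5 * beta


def huber(x: float, beta: float) -> float:
    """Scalar Huber loss with threshold beta (illustrative helper)."""
    if abs(x) < beta:
        return 0.5 * x ** 2
    return beta * (abs(x) - 0.5 * beta)


beta = 0.5
for x in (0.1, 0.5, 2.0):
    # Smooth L1 equals Huber divided by beta on both segments.
    assert abs(smooth_l1(x, beta) - huber(x, beta) / beta) < 1e-12
    print(x, smooth_l1(x, beta), huber(x, beta))
```

The loop covers one point in the quadratic segment, the change point itself, and one point in the linear segment.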

fvcore.nn.c2_msra_fill(module: torch.nn.Module) → None[source]

Initialize module.weight using the “MSRAFill” implemented in Caffe2. Also initializes module.bias to 0.

Parameters

module (torch.nn.Module) – module to initialize.

fvcore.nn.c2_xavier_fill(module: torch.nn.Module) → None[source]

Initialize module.weight using the “XavierFill” implemented in Caffe2. Also initializes module.bias to 0.

Parameters

module (torch.nn.Module) – module to initialize.

## fvcore.common¶

class fvcore.common.checkpoint.Checkpointer(model: torch.nn.Module, save_dir: str = '', *, save_to_disk: bool = True, **checkpointables: Any)[source]

Bases: object

A checkpointer that can save/load model as well as extra checkpointable objects.

__init__(model: torch.nn.Module, save_dir: str = '', *, save_to_disk: bool = True, **checkpointables: Any) → None[source]
Parameters
• model (nn.Module) – model.

• save_dir (str) – a directory to save and find checkpoints.

• save_to_disk (bool) – if True, save checkpoint to disk, otherwise disable saving for this checkpointer.

• checkpointables (object) – any checkpointable objects, i.e., objects that have the state_dict() and load_state_dict() methods. For example, it can be used like Checkpointer(model, "dir", optimizer=optimizer).

add_checkpointable(key: str, checkpointable: Any) → None[source]

Add checkpointable object for this checkpointer to track.

Parameters
• key (str) – the key used to save the object

• checkpointable – any object with state_dict() and load_state_dict() method

save(name: str, **kwargs: Any) → None[source]

Dump model and checkpointables to a file.

Parameters
• name (str) – name of the file.

• kwargs (dict) – extra arbitrary data to save.

load(path: str, checkpointables: Optional[List[str]] = None) → Dict[str, Any][source]

Parameters
• path (str) – path or url to the checkpoint. If empty, will not load anything.

• checkpointables (list) – List of checkpointable names to load. If not specified (None), will load all the possible checkpointables.

Returns

dict – extra data loaded from the checkpoint that has not been processed. For example, those saved with save(**extra_data).

has_checkpoint() → bool[source]
Returns

bool – whether a checkpoint exists in the target directory.

get_checkpoint_file() → str[source]
Returns

str – The latest checkpoint file in target directory.

get_all_checkpoint_files() → List[str][source]
Returns

list – All available checkpoint files (.pth files) in the target directory.

resume_or_load(path: str, *, resume: bool = True) → Dict[str, Any][source]

If resume is True, this method attempts to resume from the last checkpoint, if one exists. Otherwise, it loads the checkpoint from the given path. This is useful when restarting an interrupted training job.

Parameters
• path (str) – path to the checkpoint.

• resume (bool) – if True, resume from the last checkpoint if it exists and load the model together with all the checkpointables. Otherwise only load the model without loading any checkpointables.

Returns

same as load().

tag_last_checkpoint(last_filename_basename: str) → None[source]

Tag the last checkpoint.

Parameters

last_filename_basename (str) – the basename of the last filename.

class fvcore.common.checkpoint.PeriodicCheckpointer(checkpointer: fvcore.common.checkpoint.Checkpointer, period: int, max_iter: Optional[int] = None, max_to_keep: Optional[int] = None, file_prefix: str = 'model')[source]

Bases: object

Save checkpoints periodically. When .step(iteration) is called, it will execute checkpointer.save on the given checkpointer, if iteration is a multiple of period or if max_iter is reached.

checkpointer (Checkpointer) – the underlying checkpointer object

__init__(checkpointer: fvcore.common.checkpoint.Checkpointer, period: int, max_iter: Optional[int] = None, max_to_keep: Optional[int] = None, file_prefix: str = 'model') → None[source]
Parameters
• checkpointer – the checkpointer object used to save checkpoints.

• period (int) – the period to save checkpoint.

• max_iter (int) – maximum number of iterations. When it is reached, a checkpoint named “{file_prefix}_final” will be saved.

• max_to_keep (int) – maximum number of most recent checkpoints to keep; older checkpoints will be deleted.

• file_prefix (str) – the prefix of checkpoint’s filename

step(iteration: int, **kwargs: Any) → None[source]

Perform the appropriate action at the given iteration.

save(name: str, **kwargs: Any) → None[source]

Same argument as Checkpointer.save(). Use this method to manually save checkpoints outside the schedule.
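
To make the saving schedule concrete, the sketch below simulates which checkpoint names would remain after a run. It is a stand-alone illustration, not fvcore code: the function name `simulate_periodic_saves`, the 0-indexed iteration convention (so the period check uses `iteration + 1`), and the zero-padded name format for periodic checkpoints are all assumptions; only the `{file_prefix}_final` name comes from the description above.

```python
from typing import List, Optional


def simulate_periodic_saves(
    num_iters: int,
    period: int,
    max_iter: Optional[int] = None,
    max_to_keep: Optional[int] = None,
    file_prefix: str = "model",
) -> List[str]:
    """Return the checkpoint names that would remain on disk (illustrative)."""
    recent: List[str] = []  # periodic checkpoints, oldest first
    final: List[str] = []
    for iteration in range(num_iters):
        # Assumption: iterations are 0-indexed, so test iteration + 1.
        if (iteration + 1) % period == 0:
            # Hypothetical name format for periodic checkpoints.
            recent.append(f"{file_prefix}_{iteration:07d}")
            if max_to_keep is not None and len(recent) > max_to_keep:
                recent.pop(0)  # drop the oldest periodic checkpoint
        if max_iter is not None and iteration >= max_iter - 1:
            final = [f"{file_prefix}_final"]
    return recent + final


print(simulate_periodic_saves(num_iters=5, period=2, max_iter=5, max_to_keep=1))
```

With `max_to_keep=1`, only the newest periodic checkpoint survives alongside the final one.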

class fvcore.common.config.CfgNode(init_dict=None, key_list=None, new_allowed=False)[source]

Bases: yacs.config.CfgNode

Our own extended version of yacs.config.CfgNode. It contains the following extra features:

1. The merge_from_file() method supports the “_BASE_” key, which allows the new CfgNode to inherit all the attributes from the base configuration file(s).

2. Keys that start with “COMPUTED_” are treated as insertion-only “computed” attributes. They can be inserted regardless of whether the CfgNode is frozen or not.

3. With “allow_unsafe=True”, it supports pyyaml tags that evaluate expressions in config. See https://pyyaml.org/wiki/PyYAMLDocumentation#yaml-tags-and-python-types for examples. Note that this may lead to arbitrary code execution: you must not load a config file from untrusted sources before manually inspecting the content of the file.
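
The “_BASE_” inheritance in item 1 can be pictured as a recursive dictionary merge in which the child config overrides its base. The sketch below is a simplified illustration using in-memory dictionaries instead of YAML files; `merge_into`, `resolve_base` and the file names are hypothetical and do not reflect how fvcore resolves paths or handles multiple base files.

```python
from typing import Any, Dict


def merge_into(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `override` into a copy of `base` (override wins)."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_into(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_base(name: str, files: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Resolve the "_BASE_" chain for the config called `name`."""
    cfg = dict(files[name])
    base_name = cfg.pop("_BASE_", None)
    if base_name is None:
        return cfg
    return merge_into(resolve_base(base_name, files), cfg)


# Hypothetical configs standing in for YAML files on disk.
files = {
    "base.yaml": {"MODEL": {"DEPTH": 50, "NORM": "BN"}, "SOLVER": {"LR": 0.1}},
    "child.yaml": {"_BASE_": "base.yaml", "MODEL": {"DEPTH": 101}},
}
print(resolve_base("child.yaml", files))
```

The child only states what differs (`MODEL.DEPTH`); every other key is inherited from the base.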

classmethod load_yaml_with_base(filename: str, allow_unsafe: bool = False) → Dict[str, Any][source]
Just like yaml.load(open(filename)), but inherit attributes from its _BASE_.

Parameters
• filename (str or file-like object) – the file name or file of the current config. Will be used to find the base config file.

Returns

dict – the loaded yaml.

merge_from_file(cfg_filename: str, allow_unsafe: bool = False) → None[source]

Merge configs from a given yaml file.

Parameters
• cfg_filename – the file name of the yaml config.

merge_from_other_cfg(cfg_other: fvcore.common.config.CfgNode) → Callable[[], None][source]
Parameters

cfg_other (CfgNode) – configs to merge from.

merge_from_list(cfg_list: List[str]) → Callable[[], None][source]
Parameters

cfg_list (list) – list of configs to merge from.

class fvcore.common.history_buffer.HistoryBuffer(max_length: int = 1000000)[source]

Bases: object

Track a series of scalar values and provide access to smoothed values over a window or the global average of the series.

__init__(max_length: int = 1000000) → None[source]
Parameters

max_length – maximal number of values that can be stored in the buffer. When the capacity of the buffer is exhausted, old values will be removed.

update(value: float, iteration: Optional[float] = None) → None[source]

Add a new scalar value produced at certain iteration. If the length of the buffer exceeds self._max_length, the oldest element will be removed from the buffer.

latest() → float[source]

Return the latest scalar value added to the buffer.

median(window_size: int) → float[source]

Return the median of the latest window_size values in the buffer.

avg(window_size: int) → float[source]

Return the mean of the latest window_size values in the buffer.

global_avg() → float[source]

Return the mean of all the elements in the buffer. Note that this includes those getting removed due to limited buffer storage.

values() → List[Tuple[float, float]][source]
Returns

list[(number, iteration)] – content of the current buffer.
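
The difference between the windowed statistics and the global average is easiest to see on a tiny buffer. The class below is an illustrative stand-in written for this purpose, not fvcore's implementation; `TinyHistoryBuffer` and the default used when `iteration` is omitted are assumptions.

```python
import statistics
from typing import List, Optional, Tuple


class TinyHistoryBuffer:
    """Illustrative stand-in for HistoryBuffer; not fvcore's implementation."""

    def __init__(self, max_length: int = 1000000) -> None:
        self._max_length = max_length
        self._data: List[Tuple[float, float]] = []  # (value, iteration) pairs
        self._count = 0
        self._global_avg = 0.0

    def update(self, value: float, iteration: Optional[float] = None) -> None:
        if iteration is None:
            iteration = self._count  # assumption: default to the running count
        if len(self._data) == self._max_length:
            self._data.pop(0)  # drop the oldest element
        self._data.append((value, iteration))
        self._count += 1
        # Running mean over every value ever added, including evicted ones.
        self._global_avg += (value - self._global_avg) / self._count

    def latest(self) -> float:
        return self._data[-1][0]

    def median(self, window_size: int) -> float:
        return statistics.median(v for v, _ in self._data[-window_size:])

    def avg(self, window_size: int) -> float:
        return statistics.mean(v for v, _ in self._data[-window_size:])

    def global_avg(self) -> float:
        return self._global_avg


buf = TinyHistoryBuffer(max_length=3)
for loss in [4.0, 3.0, 2.0, 1.0]:
    buf.update(loss)
print(buf.latest(), buf.avg(2), buf.median(3), buf.global_avg())
```

The first value (4.0) has been evicted from the buffer, so it no longer affects `avg` or `median`, but it still contributes to `global_avg`.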

class fvcore.common.param_scheduler.ParamScheduler[source]

Bases: object

Base class for parameter schedulers. A parameter scheduler defines a mapping from a progress value in [0, 1) to a number (e.g. learning rate).

WHERE_EPSILON = 1e-06
__call__(where: float) → float[source]

Get the value of the param for a given point at training.

We update params (such as learning rate) based on the percent progress of training completed. This allows a scheduler to be agnostic to the exact length of a particular run (e.g. 120 epochs vs 90 epochs), as long as the relative progress where params should be updated is the same. However, it assumes that the total length of training is known.

Parameters

where – A float in [0,1) that represents how far training has progressed

class fvcore.common.param_scheduler.ConstantParamScheduler(value: float)[source]

Returns a constant value for a param.

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.CosineParamScheduler(start_value: float, end_value: float)[source]

Cosine decay or cosine warmup schedules based on start and end values. The schedule is updated based on the fraction of training progress. The schedule was proposed in ‘SGDR: Stochastic Gradient Descent with Warm Restarts’ (https://arxiv.org/abs/1608.03983). Note that this class only implements the cosine annealing part of SGDR, and not the restarts.

Example

CosineParamScheduler(start_value=0.1, end_value=0.0001)
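
For intuition, the cosine annealing curve can be written as a one-line function of the training progress. The sketch below assumes the standard half-cosine formula from the SGDR paper; `cosine_value` is an illustrative helper, not an fvcore function.

```python
import math


def cosine_value(start_value: float, end_value: float, where: float) -> float:
    """Half-cosine interpolation from start_value (where=0) towards end_value (where=1)."""
    return end_value + 0.5 * (start_value - end_value) * (1 + math.cos(math.pi * where))


for where in (0.0, 0.25, 0.5, 0.75):
    print(where, cosine_value(0.1, 0.0001, where))
```

Swapping the two values (a small `start_value` and a larger `end_value`) turns the same curve into a cosine warmup.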

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.ExponentialParamScheduler(start_value: float, decay: float)[source]

Exponential schedule parameterized by a start value and decay. The schedule is updated based on the fraction of training progress, where, using the formula param_t = start_value * (decay ** where).

Example

ExponentialParamScheduler(start_value=2.0, decay=0.02)


Corresponds to a decreasing schedule with values in [2.0, 0.04).
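
The formula param_t = start_value * (decay ** where) can be evaluated directly. In this small sketch, `exponential_value` is an illustrative helper, and start_value=2.0 with decay=0.02 are chosen because they reproduce the range quoted above (2.0 at where=0, approaching 0.04 as where approaches 1).

```python
def exponential_value(start_value: float, decay: float, where: float) -> float:
    """Evaluate start_value * decay ** where for progress `where` in [0, 1)."""
    return start_value * decay ** where


for where in (0.0, 0.5, 0.999):
    print(where, exponential_value(2.0, 0.02, where))
```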

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.LinearParamScheduler(start_value: float, end_value: float)[source]

Linearly interpolates parameter between start_value and end_value. Can be used for either warmup or decay based on start and end values. The schedule is updated after every train step by default.

Example

LinearParamScheduler(start_value=0.0001, end_value=0.01)


Corresponds to a linear increasing schedule with values in [0.0001, 0.01)

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.CompositeParamScheduler(schedulers: Sequence[fvcore.common.param_scheduler.ParamScheduler], lengths: List[float], interval_scaling: Sequence[str])[source]

Composite parameter scheduler composed of intermediate schedulers. Takes a list of schedulers and a list of lengths corresponding to percentage of training each scheduler should run for. Schedulers are run in order. All values in lengths should sum to 1.0.

Each scheduler also has a corresponding interval scale. If interval scale is ‘fixed’, the intermediate scheduler will be run without any rescaling of the time. If interval scale is ‘rescaled’, intermediate scheduler is run such that each scheduler will start and end at the same values as it would if it were the only scheduler. Default is ‘rescaled’ for all schedulers.

Example

schedulers = [
    ConstantParamScheduler(value=0.42),
    CosineParamScheduler(start_value=0.42, end_value=1e-4)
]
CompositeParamScheduler(
    schedulers=schedulers,
    interval_scaling=['rescaled', 'rescaled'],
    lengths=[0.3, 0.7])


The parameter value will be 0.42 for the first [0%, 30%) of steps, and then will cosine decay from 0.42 to 0.0001 for [30%, 100%) of training.
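
The difference between ‘fixed’ and ‘rescaled’ is easier to see in code. The following stand-alone sketch shows one way the dispatch could work; `composite_value`, the plain-function schedulers, and the way the epsilon tolerance is applied are assumptions for illustration, not fvcore's implementation.

```python
from typing import Callable, List, Sequence

WHERE_EPSILON = 1e-6  # tolerance value listed on the scheduler classes


def composite_value(
    schedulers: Sequence[Callable[[float], float]],
    lengths: List[float],
    interval_scaling: Sequence[str],
    where: float,
) -> float:
    """Pick the scheduler that owns `where` and evaluate it (illustrative)."""
    i = 0
    running_total = lengths[0]
    # Advance until `where` falls inside the i-th interval.
    while where + WHERE_EPSILON > running_total and i < len(schedulers) - 1:
        i += 1
        running_total += lengths[i]
    local_where = where
    if interval_scaling[i] == "rescaled":
        start = running_total - lengths[i]
        local_where = (where - start) / lengths[i]  # map the interval onto [0, 1)
    return schedulers[i](local_where)


def constant(where: float) -> float:
    return 0.42


def linear_decay(where: float) -> float:
    # Stand-in for a real decaying scheduler.
    return 0.42 * (1 - where)


for where in (0.0, 0.29, 0.3, 0.65):
    rescaled = composite_value([constant, linear_decay], [0.3, 0.7], ["rescaled"] * 2, where)
    fixed = composite_value([constant, linear_decay], [0.3, 0.7], ["fixed"] * 2, where)
    print(where, rescaled, fixed)
```

With ‘rescaled’, the second scheduler starts from its own beginning at 30% progress; with ‘fixed’, it is evaluated at the global progress value and therefore starts part-way through its curve.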

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.MultiStepParamScheduler(values: List[float], num_updates: Optional[int] = None, milestones: Optional[List[int]] = None)[source]

Takes a predefined schedule for a param value, and a list of epochs or steps which stand for the upper boundary (excluded) of each range.

Example

MultiStepParamScheduler(
    values=[0.1, 0.01, 0.001, 0.0001],
    milestones=[30, 60, 80, 120]
)


Then the param value will be 0.1 for epochs 0-29, 0.01 for epochs 30-59, 0.001 for epochs 60-79, 0.0001 for epochs 80-119. Note that when num_updates is given, the length of values must be equal to the length of milestones plus one; when it is omitted, as here, the last milestone marks the end of the schedule, so the two lengths are equal.

__init__(values: List[float], num_updates: Optional[int] = None, milestones: Optional[List[int]] = None) → None[source]
Parameters
• values – param value in each range

• num_updates – the end of the last range. If None, will use milestones[-1]

• milestones – the boundary of each range. If None, will evenly split num_updates

For example, all the following combinations define the same scheduler:

• num_updates=90, milestones=[30, 60], values=[1, 0.1, 0.01]

• milestones=[30, 60, 90], values=[1, 0.1, 0.01]

• milestones=[3, 6, 9], values=[1, 0.1, 0.01] (ParamScheduler is scale-invariant)
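
The equivalence of the three combinations above can be checked with a small stand-alone sketch. It assumes the lookup is a bisection of `where * num_updates` against the milestones; `multistep_value` is an illustrative helper rather than fvcore's code, and it does not cover the case where milestones is omitted.

```python
import bisect
from typing import List, Optional


def multistep_value(
    values: List[float],
    milestones: List[int],
    where: float,
    num_updates: Optional[int] = None,
) -> float:
    """Look up the value for progress `where` in [0, 1) (illustrative)."""
    if num_updates is None:
        # The last milestone marks the end of the schedule.
        num_updates, milestones = milestones[-1], milestones[:-1]
    assert len(values) == len(milestones) + 1
    update_index = int((where + 1e-6) * num_updates)
    return values[bisect.bisect_right(milestones, update_index)]


values = [1, 0.1, 0.01]
for where in (0.0, 0.2, 0.34, 0.5, 0.67, 0.99):
    a = multistep_value(values, [30, 60], where, num_updates=90)
    b = multistep_value(values, [30, 60, 90], where)
    c = multistep_value(values, [3, 6, 9], where)
    assert a == b == c  # all three definitions agree
    print(where, a)
```

Because the lookup only depends on the ratio between milestones and the total, scaling every number by the same factor leaves the schedule unchanged.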

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.StepParamScheduler(num_updates: Union[int, float], values: List[float])[source]

Takes a fixed schedule for a param value. If the length of the fixed schedule is less than the number of epochs, then the epochs are divided evenly among the param schedule. The schedule is updated after every train epoch by default.

Example

StepParamScheduler(values=[0.1, 0.01, 0.001, 0.0001], num_updates=120)


Then the param value will be 0.1 for epochs 0-29, 0.01 for epochs 30-59, 0.001 for epochs 60-89, 0.0001 for epochs 90-119.

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.StepWithFixedGammaParamScheduler(base_value: float, num_decays: int, gamma: float, num_updates: int)[source]

Decays the param value by gamma at equal number of steps so as to have the specified total number of decays.

Example

StepWithFixedGammaParamScheduler(
    base_value=0.1, gamma=0.1, num_decays=3, num_updates=120)


Then the param value will be 0.1 for epochs 0-29, 0.01 for epochs 30-59, 0.001 for epochs 60-89, 0.0001 for epochs 90-119.
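
The values quoted above follow from repeatedly multiplying the base value by gamma and splitting training evenly among the resulting values. The sketch below reproduces them with base_value=0.1, gamma=0.1, three decays and 120 updates; `fixed_gamma_values` and `step_value` are illustrative helpers, and the even-split lookup is an assumption about how the schedule is evaluated.

```python
from typing import List


def fixed_gamma_values(base_value: float, num_decays: int, gamma: float) -> List[float]:
    """Values after 0, 1, ..., num_decays decays."""
    return [base_value * gamma ** k for k in range(num_decays + 1)]


def step_value(values: List[float], where: float) -> float:
    """Split [0, 1) evenly among the values and pick the one that owns `where`."""
    index = int((where + 1e-6) * len(values))
    return values[index]


values = fixed_gamma_values(base_value=0.1, num_decays=3, gamma=0.1)
num_updates = 120
for epoch in (0, 29, 30, 60, 90, 119):
    print(epoch, step_value(values, epoch / num_updates))
```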

WHERE_EPSILON = 1e-06
class fvcore.common.param_scheduler.PolynomialDecayParamScheduler(base_value: float, power: float)[source]

Decays the param value after every epoch according to a polynomial function with a fixed power. The schedule is updated after every train step by default.

Example

PolynomialDecayParamScheduler(base_value=0.1, power=0.9)


Then the param value will be 0.1 for epoch 0, 0.099 for epoch 1, and so on.
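
As a rough illustration, the sketch below assumes the decay takes the form base_value * (1 - where) ** power, which the text above does not spell out, and a hypothetical run of 100 epochs. `polynomial_decay_value` is an illustrative helper, not an fvcore function.

```python
def polynomial_decay_value(base_value: float, power: float, where: float) -> float:
    """Assumed polynomial decay: base_value * (1 - where) ** power."""
    return base_value * (1 - where) ** power


num_epochs = 100  # hypothetical training length
for epoch in (0, 1, 50, 99):
    print(epoch, polynomial_decay_value(0.1, 0.9, epoch / num_epochs))
```

Under these assumptions the value starts at 0.1, drops to roughly 0.099 after the first epoch, and approaches 0 at the end of training.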

WHERE_EPSILON = 1e-06
class fvcore.common.registry.Registry(*args, **kwds)[source]

The registry that provides name -> object mapping, to support third-party users’ custom modules.

To create a registry (e.g. a backbone registry):

BACKBONE_REGISTRY = Registry('BACKBONE')


To register an object:

@BACKBONE_REGISTRY.register()
class MyBackbone():
    ...


Or:

BACKBONE_REGISTRY.register(MyBackbone)

__init__(name: str) → None[source]
Parameters

name (str) – the name of this registry

register(obj: Any = None) → Any[source]

Register the given object under the name obj.__name__. Can be used as either a decorator or not. See docstring of this class for usage.

get(name: str) → Any[source]
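
The dual decorator/function usage of register() can be sketched in a few lines of plain Python. `TinyRegistry` below is an illustrative stand-in, not fvcore's Registry; in particular, raising KeyError for duplicate or missing names is an assumption made for this example.

```python
from typing import Any, Dict, Optional


class TinyRegistry:
    """Illustrative name -> object registry; not fvcore's implementation."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._obj_map: Dict[str, Any] = {}

    def _do_register(self, obj: Any) -> None:
        key = obj.__name__
        if key in self._obj_map:
            raise KeyError(f"'{key}' is already registered in '{self._name}'")
        self._obj_map[key] = obj

    def register(self, obj: Optional[Any] = None) -> Any:
        if obj is None:
            # Used as a decorator: @registry.register()
            def deco(func_or_class: Any) -> Any:
                self._do_register(func_or_class)
                return func_or_class

            return deco
        # Used as a plain call: registry.register(MyClass)
        self._do_register(obj)
        return obj

    def get(self, name: str) -> Any:
        if name not in self._obj_map:
            raise KeyError(f"No object named '{name}' in '{self._name}' registry")
        return self._obj_map[name]


BACKBONE_REGISTRY = TinyRegistry("BACKBONE")


@BACKBONE_REGISTRY.register()
class MyBackbone:
    pass


class OtherBackbone:
    pass


BACKBONE_REGISTRY.register(OtherBackbone)
print(BACKBONE_REGISTRY.get("MyBackbone"), BACKBONE_REGISTRY.get("OtherBackbone"))
```
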
class fvcore.common.timer.Timer[source]

Bases: object

A timer which computes the time elapsed since the start/reset of the timer.

reset() → None[source]

Reset the timer.

pause() → None[source]

Pause the timer.

is_paused() → bool[source]
Returns

bool – whether the timer is currently paused

resume() → None[source]

Resume the timer.

seconds() → float[source]
Returns

(float) – the total number of seconds since the start/reset of the timer, excluding the time when the timer is paused.

avg_seconds() → float[source]
Returns

(float) – the average number of seconds between every start/reset and pause.
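
How paused time is excluded from seconds() can be shown with a small pausable timer. The class below is an illustrative sketch built on `time.perf_counter`, not fvcore's Timer; `TinyTimer`, its error handling, and the omission of avg_seconds() are choices made for this example.

```python
import time
from typing import Optional


class TinyTimer:
    """Illustrative pausable timer; not fvcore's implementation."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self._start = time.perf_counter()
        self._paused: Optional[float] = None  # time at which pause() was called
        self._total_paused = 0.0

    def pause(self) -> None:
        if self._paused is not None:
            raise ValueError("Timer is already paused")
        self._paused = time.perf_counter()

    def is_paused(self) -> bool:
        return self._paused is not None

    def resume(self) -> None:
        if self._paused is None:
            raise ValueError("Timer is not paused")
        self._total_paused += time.perf_counter() - self._paused
        self._paused = None

    def seconds(self) -> float:
        # While paused, the clock is frozen at the moment pause() was called.
        end = self._paused if self._paused is not None else time.perf_counter()
        return end - self._start - self._total_paused


timer = TinyTimer()
timer.pause()
frozen = timer.seconds()
time.sleep(0.01)
assert timer.seconds() == frozen  # no time accumulates while paused
timer.resume()
print(timer.seconds())
```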