Developer Documentation

Abstract Superclasses
Most of the implemented losses fall under the category of supervised losses. In other words, they represent functions of two parameters (the true targets and the predicted outcomes) that are used to compute their value.
class SupervisedLoss
    Abstract subtype of Loss. A loss is considered supervised if all the
    information needed to compute value(loss, features, targets, outputs)
    is contained in targets and outputs, and thus allows for the
    simplification value(loss, targets, outputs).
class DistanceLoss
    Abstract subtype of SupervisedLoss. A supervised loss that can be
    simplified to L(targets, outputs) = L(outputs - targets) is
    considered distance-based.
class MarginLoss
    Abstract subtype of SupervisedLoss. A supervised loss where the
    targets are in {-1, 1}, and which can be simplified to
    L(targets, outputs) = L(targets * outputs), is considered
    margin-based.
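The hierarchy above can be sketched directly in Julia. The concrete types below mirror losses mentioned later in this document, but these simplified definitions are illustrative only, not the package's actual implementation.

```julia
# Simplified sketch of the abstract loss hierarchy (illustrative only;
# the package's actual definitions carry more functionality).
abstract type Loss end
abstract type SupervisedLoss <: Loss end

# Distance-based: L(targets, outputs) = L(outputs - targets)
abstract type DistanceLoss <: SupervisedLoss end

# Margin-based: targets in {-1, 1}, L(targets, outputs) = L(targets * outputs)
abstract type MarginLoss <: SupervisedLoss end

# Two concrete examples used later in this document
struct L2DistLoss <: DistanceLoss end    # L(difference) = difference^2
struct L1HingeLoss <: MarginLoss end     # L(agreement) = max(0, 1 - agreement)
```

Because Julia dispatches on these abstract types, shared functionality (such as elementwise evaluation) can be written once against DistanceLoss or MarginLoss and reused by every concrete loss.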
Shared Interface
value(loss, difference)
    Computes the value of the loss function for each observation in
    difference individually and returns the result as an array of the
    same size as the parameter.

    Parameters:
        - loss (DistanceLoss) – An instance of the loss we are interested in.
        - difference (AbstractArray) – The result of subtracting the true
          targets from the predicted outputs.

    Returns: The value of the loss function for the elements in difference.

    Return type: AbstractArray
deriv(loss, difference)
    Computes the derivative of the loss function for each observation in
    difference individually and returns the result as an array of the
    same size as the parameter.

    Parameters:
        - loss (DistanceLoss) – An instance of the loss we are interested in.
        - difference (AbstractArray) – The result of subtracting the true
          targets from the predicted outputs.

    Returns: The derivatives of the loss function for the elements in difference.

    Return type: AbstractArray
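As a sketch of how this shared interface can be satisfied, the following hypothetical implementation defines the unary value and deriv of an L2 distance loss on a single number, then broadcasts them over an array. The real package implements this differently; the definitions here exist only to make the contract concrete.

```julia
# Hypothetical stand-ins for the documented interface (not the package code).
abstract type SupervisedLoss end
abstract type DistanceLoss <: SupervisedLoss end
struct L2DistLoss <: DistanceLoss end

# Unary forms on a single difference = output - target
value(loss::L2DistLoss, difference::Number) = abs2(difference)
deriv(loss::L2DistLoss, difference::Number) = 2 * difference

# Elementwise versions: return an array of the same size as `difference`
value(loss::DistanceLoss, difference::AbstractArray) = value.(Ref(loss), difference)
deriv(loss::DistanceLoss, difference::AbstractArray) = deriv.(Ref(loss), difference)
```

For example, value(L2DistLoss(), [1.0, -2.0]) yields [1.0, 4.0], and deriv(L2DistLoss(), [1.0, -2.0]) yields [2.0, -4.0]. Wrapping the loss in Ref keeps it scalar under broadcasting so only difference is iterated.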
Regression vs Classification

We can further divide the supervised losses into two useful
sub-categories: DistanceLoss for regression and MarginLoss for
classification.
Losses for Regression

Supervised losses that can be expressed as a univariate function
of output - target are referred to as distance-based losses.
value(L2DistLoss(), difference)
Distance-based losses are typically used for regression problems.
That said, there are other losses useful for regression that do not
fall into this category, such as the PeriodicLoss.
Note

In the literature that this package is partially based on,
the convention for distance-based losses is target - output
(see [STEINWART2008] p. 38).
We chose to diverge from this definition because it would otherwise
force a sign difference between the results of the unary and the
binary versions of the derivative.
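The point of this note can be illustrated with the squared distance loss. The helper functions below are hypothetical, defined only for this example: with difference = output - target, differentiating the unary form with respect to the difference gives the same result as differentiating the binary form with respect to the output.

```julia
# Hypothetical helpers for the L2 distance loss (not the package API).
l2(difference) = abs2(difference)

# Unary derivative: d/d(difference) of difference^2
l2_deriv(difference) = 2 * difference

# Binary derivative: d/d(output) of (output - target)^2
l2_deriv(target, output) = 2 * (output - target)

# With the output - target convention, both versions agree:
target, output = 1.0, 1.5
l2_deriv(output - target) == l2_deriv(target, output)   # true
```

Under the target - output convention the unary derivative would instead come out with the opposite sign of the derivative with respect to the output, which is the discrepancy the package avoids.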
Losses for Classification

Margin-based losses are supervised losses where the values of the
targets are restricted to \(\{1,-1\}\), and which can be
expressed as a univariate function of output * target.
value(L1HingeLoss(), agreement)
Note

Throughout the codebase we refer to the result of
output * target as the agreement.
The discussion that led to this convention can be found in
issue #9.
Margin-based losses are usually used for binary classification. In contrast to other formalisms, they do not natively provide probabilities as output.
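As a self-contained sketch, using a hand-written hinge function rather than the package's L1HingeLoss, the agreement and the elementwise loss can be computed as follows:

```julia
# Hand-written L1 hinge loss on the agreement output * target
# (illustrative; not the package's implementation).
hinge(agreement) = max(zero(agreement), one(agreement) - agreement)

targets = [1, -1, 1, -1]            # labels restricted to {-1, 1}
outputs = [0.7, -1.2, -0.4, 0.3]    # raw, non-probabilistic predictions

agreement = outputs .* targets      # positive where the predicted sign is correct
losses = hinge.(agreement)          # zero once the agreement reaches 1
```

Note that the second observation incurs zero loss (its agreement is 1.2, beyond the margin), while the misclassified third and fourth observations are penalized most heavily.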