Margin-based Losses
This section lists all the subtypes of MarginLoss
that are implemented in this package.
ZeroOneLoss

class ZeroOneLoss

The classical classification loss. It penalizes every misclassified observation with a loss of \(1\), while every correctly classified observation incurs a loss of \(0\). It is neither convex nor continuous and is therefore seldom used directly. Instead one usually works with a classification-calibrated surrogate loss, such as one of those listed below.
Loss function:

\[\begin{split}L(a) = \begin{cases} 1 & \quad \text{if } a < 0 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]

Derivative:

\[L'(a) = 0\]
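The formula above can be translated directly into code. This is an illustrative Python sketch of the definition, not the package's own interface; the function name is ours:

```python
def zero_one_loss(a):
    """Zero-one loss of the agreement a = y * f(x)."""
    return 1.0 if a < 0 else 0.0

# A negative agreement means prediction and label disagree.
print(zero_one_loss(-0.3))  # misclassified -> 1.0
print(zero_one_loss(1.7))   # correctly classified -> 0.0
```

Note that the loss ignores the magnitude of the agreement entirely, which is exactly why its derivative is zero almost everywhere and gradient-based training needs a surrogate.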
PerceptronLoss

class PerceptronLoss

The perceptron loss linearly penalizes every prediction where the resulting agreement \(a < 0\). It is Lipschitz continuous and convex, but not strictly convex.
Loss function:

\[L(a) = \max \{ 0, - a \}\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} -1 & \quad \text{if } a < 0 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
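A minimal Python sketch of this formula (illustrative only, not the package's API):

```python
def perceptron_loss(a):
    """Perceptron loss: linear penalty for negative agreement."""
    return max(0.0, -a)

# Correctly classified points (a >= 0) incur no loss at all,
# which is what distinguishes it from the hinge loss below.
print(perceptron_loss(-2.0))  # -> 2.0
print(perceptron_loss(0.5))   # -> 0.0
```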
L1HingeLoss

class L1HingeLoss

The hinge loss linearly penalizes every prediction where the resulting agreement \(a < 1\). It is Lipschitz continuous and convex, but not strictly convex.
Loss function:

\[L(a) = \max \{ 0, 1 - a \}\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} -1 & \quad \text{if } a < 1 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
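A quick Python sketch of the formula (illustrative, not the package's API). Unlike the perceptron loss, the hinge loss also penalizes correctly classified observations that fall inside the margin \(0 \le a < 1\):

```python
def l1_hinge(a):
    """L1 hinge loss: linear penalty whenever the agreement a < 1."""
    return max(0.0, 1.0 - a)

print(l1_hinge(0.5))   # inside the margin -> 0.5
print(l1_hinge(2.0))   # outside the margin -> 0.0
print(l1_hinge(-1.0))  # misclassified -> 2.0
```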
SmoothedL1HingeLoss

class SmoothedL1HingeLoss

Parameter: γ (the smoothing factor)

As the name suggests, a smoothed version of the L1 hinge loss. It is Lipschitz continuous and convex, but not strictly convex.
Loss function:

\[\begin{split}L(a) = \begin{cases} \frac{1}{2 \gamma} \cdot \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge 1 - \gamma \\ 1 - \frac{\gamma}{2} - a & \quad \text{otherwise}\\ \end{cases}\end{split}\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} - \frac{1}{\gamma} \cdot \max \{ 0, 1 - a \} & \quad \text{if } a \ge 1 - \gamma \\ - 1 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
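The two branches meet continuously at \(a = 1 - \gamma\), where both evaluate to \(\gamma / 2\). A Python sketch of the formula (illustrative, not the package's API) makes this easy to check:

```python
def smoothed_l1_hinge(a, gamma):
    """Smoothed L1 hinge loss: quadratic near the margin, linear far from it."""
    if a >= 1.0 - gamma:
        return max(0.0, 1.0 - a) ** 2 / (2.0 * gamma)
    return 1.0 - gamma / 2.0 - a

gamma = 0.5
# At the junction a = 1 - gamma both branches give gamma / 2:
print(smoothed_l1_hinge(1.0 - gamma, gamma))  # -> 0.25
print(smoothed_l1_hinge(2.0, gamma))          # outside the margin -> 0.0
```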
ModifiedHuberLoss

class ModifiedHuberLoss

A special case of the SmoothedL1HingeLoss with \(\gamma = 2\), scaled by a factor of \(4\). It is Lipschitz continuous and convex, but not strictly convex.
Loss function:

\[\begin{split}L(a) = \begin{cases} \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge -1 \\ - 4 a & \quad \text{otherwise}\\ \end{cases}\end{split}\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} - 2 \cdot \max \{ 0, 1 - a \} & \quad \text{if } a \ge -1 \\ - 4 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
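The scaling relationship can be verified numerically. The Python sketch below (illustrative only; the function names are ours, not the package's API) compares the modified Huber loss against four times the smoothed L1 hinge loss with \(\gamma = 2\):

```python
def smoothed_l1_hinge(a, gamma):
    if a >= 1.0 - gamma:
        return max(0.0, 1.0 - a) ** 2 / (2.0 * gamma)
    return 1.0 - gamma / 2.0 - a

def modified_huber(a):
    """Modified Huber loss: quadratic for a >= -1, linear below."""
    if a >= -1.0:
        return max(0.0, 1.0 - a) ** 2
    return -4.0 * a

# modified_huber(a) == 4 * smoothed_l1_hinge(a, gamma=2) for every a:
for a in [-3.0, -1.0, 0.0, 0.5, 1.0, 2.0]:
    assert abs(modified_huber(a) - 4.0 * smoothed_l1_hinge(a, 2.0)) < 1e-12
```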
DWDMarginLoss

class DWDMarginLoss

Parameter: q

The distance weighted discrimination margin loss: a differentiable generalization of the L1 hinge loss that is distinct from the SmoothedL1HingeLoss.
Loss function:

\[\begin{split}L(a) = \begin{cases} 1 - a & \quad \text{if } a \le \frac{q}{q+1} \\ \frac{1}{a^q} \frac{q^q}{(q+1)^{q+1}} & \quad \text{otherwise}\\ \end{cases}\end{split}\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} - 1 & \quad \text{if } a \le \frac{q}{q+1} \\ - \frac{1}{a^{q+1}} \left( \frac{q}{q+1} \right)^{q+1} & \quad \text{otherwise}\\ \end{cases}\end{split}\]
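Both branches agree at the junction \(a = \frac{q}{q+1}\), where the loss equals \(\frac{1}{q+1}\), so the function is continuous there. A small Python sketch (illustrative only, not the package's API) verifies this:

```python
def dwd_margin(a, q):
    """Distance weighted discrimination margin loss with power parameter q."""
    thresh = q / (q + 1.0)
    if a <= thresh:
        return 1.0 - a                                   # linear branch
    return (q ** q / (q + 1.0) ** (q + 1)) / a ** q      # polynomial tail

# For q = 1 the junction is at a = 0.5 with loss 1 / (q + 1) = 0.5:
print(dwd_margin(0.5, 1.0))  # -> 0.5
print(dwd_margin(1.0, 1.0))  # tail value -> 0.25
```

Unlike the hinge loss, the tail never reaches exactly zero, which is what keeps the loss differentiable everywhere.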
L2MarginLoss

class L2MarginLoss

The margin-based least-squares loss for classification, which quadratically penalizes every prediction where \(a \ne 1\). It is locally Lipschitz continuous and strongly convex.
Loss function:

\[L(a) = {\left( 1 - a \right)}^2\]

Derivative:

\[L'(a) = 2 \left( a - 1 \right)\]
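A one-line Python sketch of the formula (illustrative, not the package's API). Note that, because every \(a \ne 1\) is penalized, even confidently correct predictions with \(a > 1\) incur a loss:

```python
def l2_margin(a):
    """L2 margin loss: quadratic penalty for any deviation from a = 1."""
    return (1.0 - a) ** 2

print(l2_margin(1.0))   # -> 0.0
print(l2_margin(3.0))   # overshooting is penalized too -> 4.0
print(l2_margin(-1.0))  # -> 4.0
```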
L2HingeLoss

class L2HingeLoss

The truncated version of the least-squares loss. It quadratically penalizes every prediction where the resulting agreement \(a < 1\). It is locally Lipschitz continuous and convex, but not strictly convex.
Loss function:

\[L(a) = \max \{ 0, 1 - a \} ^2\]

Derivative:

\[\begin{split}L'(a) = \begin{cases} 2 \left( a - 1 \right) & \quad \text{if } a < 1 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
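A Python sketch of the truncation (illustrative, not the package's API), showing how it differs from the untruncated L2 margin loss for \(a > 1\):

```python
def l2_hinge(a):
    """L2 hinge loss: squared hinge, zero for a >= 1."""
    return max(0.0, 1.0 - a) ** 2

# Unlike l2_margin, confident correct predictions are not penalized:
print(l2_hinge(2.0))   # -> 0.0
print(l2_hinge(0.0))   # -> 1.0
print(l2_hinge(-1.0))  # -> 4.0
```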
LogitMarginLoss

class LogitMarginLoss

The margin version of the logistic loss. It is infinitely differentiable, strictly convex, and Lipschitz continuous.
Loss function:

\[L(a) = \ln (1 + e^{-a})\]

Derivative:

\[L'(a) = - \frac{1}{1 + e^a}\]
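The stated derivative can be confirmed against a central finite difference. A Python sketch (illustrative only, not the package's API), using `log1p` for numerical stability at large \(a\):

```python
import math

def logit_margin(a):
    """Logistic (margin) loss: ln(1 + exp(-a))."""
    return math.log1p(math.exp(-a))

def logit_margin_deriv(a):
    """Closed-form derivative: -1 / (1 + exp(a))."""
    return -1.0 / (1.0 + math.exp(a))

# Central finite difference agrees with the closed form:
a, h = 0.3, 1e-6
fd = (logit_margin(a + h) - logit_margin(a - h)) / (2.0 * h)
print(abs(fd - logit_margin_deriv(a)))  # tiny
```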
ExpLoss

class ExpLoss

The margin-based exponential loss for classification, which penalizes every prediction exponentially. It is infinitely differentiable, locally Lipschitz continuous, and strictly convex, but not clipable.
Loss function:

\[L(a) = e^{-a}\]

Derivative:

\[L'(a) = - e^{-a}\]
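A Python sketch of the formula (illustrative, not the package's API). The loss is strictly positive even for correct predictions and grows without bound on mistakes, which makes it sensitive to outliers:

```python
import math

def exp_loss(a):
    """Exponential loss: exp(-a), unbounded for negative agreement."""
    return math.exp(-a)

print(exp_loss(0.0))   # -> 1.0
print(exp_loss(-5.0))  # large penalty for a confident mistake
print(exp_loss(5.0))   # small but nonzero for a confident hit
```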
SigmoidLoss

class SigmoidLoss

The so-called sigmoid loss is a continuous margin-based loss that penalizes every prediction with a loss in the range \((0, 2)\). It is infinitely differentiable and Lipschitz continuous, but not convex.
Loss function:

\[L(a) = 1 - \tanh(a)\]

Derivative:

\[L'(a) = - \textrm{sech}^2 (a)\]
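A Python sketch of the formula (illustrative, not the package's API), showing the bounded \((0, 2)\) range that distinguishes this loss from the convex surrogates above:

```python
import math

def sigmoid_loss(a):
    """Sigmoid loss: 1 - tanh(a), bounded in (0, 2)."""
    return 1.0 - math.tanh(a)

print(sigmoid_loss(0.0))    # -> 1.0
print(sigmoid_loss(10.0))   # approaches 0 for confident hits
print(sigmoid_loss(-10.0))  # approaches 2 for confident mistakes
```

Because the loss saturates for large \(|a|\), gross misclassifications contribute at most a penalty of \(2\), making the loss robust to outliers at the cost of convexity.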