Margin-based Losses

This section lists all the subtypes of MarginLoss that are implemented in this package.

ZeroOneLoss

class ZeroOneLoss

The classical classification loss. It penalizes every misclassified observation with a loss of \(1\), while every correctly classified observation has a loss of \(0\). It is neither convex nor continuous and is thus seldom used directly. Instead one usually works with some classification-calibrated surrogate loss, such as one of those listed below.

Loss function and derivative:
\[\begin{split}L(a) = \begin{cases} 1 & \quad \text{if } a < 0 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
\[L'(a) = 0\]
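
As a rough illustration (a hypothetical helper transcribing the formula above, not the package API), the zero-one loss can be written as a plain Julia function of the agreement \(a\):

# Hypothetical helper transcribing the formula above; not the package API.
# The agreement is a = y * ŷ with y ∈ {-1, 1}.
zero_one_loss(a) = a < 0 ? 1.0 : 0.0

zero_one_loss(-0.3)   # 1.0 (misclassified)
zero_one_loss(2.0)    # 0.0 (correctly classified)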

PerceptronLoss

class PerceptronLoss

The perceptron loss linearly penalizes every prediction where the resulting agreement \(a < 0\). It is Lipschitz continuous and convex, but not strictly convex.

Loss function and derivative:
\[L(a) = \max \{ 0, - a \}\]
\[\begin{split}L'(a) = \begin{cases} -1 & \quad \text{if } a < 0 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]

L1HingeLoss

class L1HingeLoss

The hinge loss linearly penalizes every prediction where the resulting agreement \(a < 1\). It is Lipschitz continuous and convex, but not strictly convex.

Loss function and derivative:
\[L(a) = \max \{ 0, 1 - a \}\]
\[\begin{split}L'(a) = \begin{cases} -1 & \quad \text{if } a < 1 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
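
To contrast the perceptron loss with the hinge loss, here is a minimal sketch with hypothetical helpers that simply transcribe the two formulas above (not the package API); the hinge loss additionally penalizes correct predictions whose agreement falls short of the margin of \(1\):

# Hypothetical helpers transcribing the formulas above; not the package API.
perceptron_loss(a) = max(0, -a)
l1_hinge_loss(a)   = max(0, 1 - a)

perceptron_loss(0.5), l1_hinge_loss(0.5)    # (0.0, 0.5): correct, but inside the margin
perceptron_loss(-1.0), l1_hinge_loss(-1.0)  # (1.0, 2.0): misclassified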

SmoothedL1HingeLoss

class SmoothedL1HingeLoss
γ: the smoothing parameter that appears in the formulas below.

As the name suggests, a smoothed version of the L1 hinge loss. It is Lipschitz continuous and convex, but not strictly convex.

Loss function and derivative:
\[\begin{split}L(a) = \begin{cases} \frac{1}{2 \gamma} \cdot \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge 1 - \gamma \\ 1 - \frac{\gamma}{2} - a & \quad \text{otherwise}\\ \end{cases}\end{split}\]
\[\begin{split}L'(a) = \begin{cases} - \frac{1}{\gamma} \cdot \max \{ 0, 1 - a \} & \quad \text{if } a \ge 1 - \gamma \\ - 1 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
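
As a hedged sketch (a hypothetical helper transcribing the formula above, not the package API): the quadratic piece smooths out the kink of the L1 hinge loss at \(a = 1\), and for small \(\gamma\) the function approaches the plain L1 hinge loss:

# Hypothetical transcription of the formula above; not the package API.
function smoothed_l1_hinge(a; γ = 1.0)
    a >= 1 - γ ? max(0, 1 - a)^2 / (2γ) : 1 - γ/2 - a
end

smoothed_l1_hinge(0.5; γ = 1.0)    # 0.125 (quadratic regime)
smoothed_l1_hinge(0.5; γ = 0.01)   # 0.495, close to max(0, 1 - 0.5) = 0.5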

ModifiedHuberLoss

class ModifiedHuberLoss

A special case of the SmoothedL1HingeLoss with \(\gamma = 2\), scaled by a factor of \(4\). It is Lipschitz continuous and convex, but not strictly convex.

Loss function and derivative:
\[\begin{split}L(a) = \begin{cases} \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge -1 \\ - 4 a & \quad \text{otherwise}\\ \end{cases}\end{split}\]
\[\begin{split}L'(a) = \begin{cases} - 2 \cdot \max \{ 0, 1 - a \} & \quad \text{if } a \ge -1 \\ - 4 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
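
The scaling claim above can be checked numerically with hypothetical helpers that transcribe the two formulas (not the package API):

# Hypothetical helpers transcribing the formulas above; not the package API.
smoothed_l1_hinge(a; γ) = a >= 1 - γ ? max(0, 1 - a)^2 / (2γ) : 1 - γ/2 - a
modified_huber(a)       = a >= -1   ? max(0, 1 - a)^2          : -4a

# The modified Huber loss equals four times the smoothed L1 hinge loss at γ = 2.
all(modified_huber(a) ≈ 4 * smoothed_l1_hinge(a; γ = 2) for a in -3:0.25:3)  # true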

DWDMarginLoss

class DWDMarginLoss
q: the parameter that appears as the exponent in the formulas below.

The distance weighted discrimination margin loss. It is a differentiable generalization of the L1 hinge loss that is distinct from the SmoothedL1HingeLoss.

Loss function and derivative:
\[\begin{split}L(a) = \begin{cases} 1 - a & \quad \text{if } a \le \frac{q}{q+1} \\ \frac{1}{a^q} \frac{q^q}{(q+1)^{q+1}} & \quad \text{otherwise}\\ \end{cases}\end{split}\]
\[\begin{split}L'(a) = \begin{cases} - 1 & \quad \text{if } a \le \frac{q}{q+1} \\ - \frac{1}{a^{q+1}} \left( \frac{q}{q+1} \right)^{q+1} & \quad \text{otherwise}\\ \end{cases}\end{split}\]
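
Differentiability at the transition point \(a = \frac{q}{q+1}\) can be verified numerically with hypothetical helpers that transcribe the formulas above (not the package API): the loss and its derivative take the same values on both sides of the threshold:

# Hypothetical transcriptions of the DWD formulas above; not the package API.
dwd_loss(a; q)  = a <= q/(q+1) ? 1 - a : q^q / ((q+1)^(q+1) * a^q)
dwd_deriv(a; q) = a <= q/(q+1) ? -1.0  : -(q/(q+1))^(q+1) / a^(q+1)

q = 2.0
a₀ = q / (q + 1)                                    # transition point 2/3
dwd_loss(a₀; q = q), dwd_loss(a₀ + 1e-9; q = q)     # both ≈ 1/3
dwd_deriv(a₀; q = q), dwd_deriv(a₀ + 1e-9; q = q)   # both ≈ -1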

L2MarginLoss

class L2MarginLoss

The margin-based least-squares loss for classification, which quadratically penalizes every prediction where \(a \ne 1\). It is locally Lipschitz continuous and strongly convex.

Loss function and derivative:
\[L(a) = {\left( 1 - a \right)}^2\]
\[L'(a) = 2 \left( a - 1 \right)\]

L2HingeLoss

class L2HingeLoss

The truncated version of the least-squares loss. It quadratically penalizes every prediction where the resulting agreement \(a < 1\). It is locally Lipschitz continuous and convex, but not strictly convex.

Loss function and derivative:
\[L(a) = \max \{ 0, 1 - a \} ^2\]
\[\begin{split}L'(a) = \begin{cases} 2 \left( a - 1 \right) & \quad \text{if } a < 1 \\ 0 & \quad \text{otherwise}\\ \end{cases}\end{split}\]
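
To illustrate how the two quadratic losses relate (hypothetical helpers transcribing the formulas above, not the package API): they agree for \(a < 1\), while the truncated version assigns no loss beyond the margin:

# Hypothetical transcriptions of the formulas above; not the package API.
l2_margin_loss(a) = (1 - a)^2
l2_hinge_loss(a)  = max(0, 1 - a)^2

l2_margin_loss(0.5), l2_hinge_loss(0.5)  # (0.25, 0.25): identical below the margin
l2_margin_loss(2.0), l2_hinge_loss(2.0)  # (1.0, 0.0): only the margin loss penalizes a > 1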

LogitMarginLoss

class LogitMarginLoss

The margin version of the logistic loss. It is infinitely many times differentiable, strictly convex, and Lipschitz continuous.

Loss function and derivative:
\[L(a) = \ln (1 + e^{-a})\]
\[L'(a) = - \frac{1}{1 + e^a}\]
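
The stated derivative can be cross-checked against a central finite difference (hypothetical helpers transcribing the formulas above, not the package API):

# Hypothetical transcriptions of the formulas above; not the package API.
logit_loss(a)  = log(1 + exp(-a))
logit_deriv(a) = -1 / (1 + exp(a))

a, h = 0.7, 1e-6
(logit_loss(a + h) - logit_loss(a - h)) / (2h)   # ≈ -0.33181
logit_deriv(a)                                   # ≈ -0.33181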

ExpLoss

class ExpLoss

The margin-based exponential loss for classification, which penalizes every prediction with a loss that grows exponentially as the agreement decreases. It is infinitely many times differentiable, locally Lipschitz continuous and strictly convex, but not clipable.

Loss function and derivative:
\[L(a) = e^{-a}\]
\[L'(a) = - e^{-a}\]

SigmoidLoss

class SigmoidLoss

The so-called sigmoid loss is a continuous margin-based loss which assigns every prediction a loss within the open range \((0, 2)\). It is infinitely many times differentiable and Lipschitz continuous, but not convex.

Loss function and derivative:
\[L(a) = 1 - \tanh(a)\]
\[L'(a) = - \textrm{sech}^2 (a)\]
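
A quick illustration of the stated range (a hypothetical helper transcribing the formula above, not the package API): since \(\tanh\) maps into \((-1, 1)\), the loss stays inside the open interval \((0, 2)\):

# Hypothetical transcription of the formula above; not the package API.
sigmoid_loss(a) = 1 - tanh(a)

extrema(sigmoid_loss.(-10:0.1:10))  # ≈ (0.0, 2.0), but never exactly reaching 0 or 2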