Model loss functions
loss_mean_squared_error(y_true, y_pred)
loss_mean_absolute_error(y_true, y_pred)
loss_mean_absolute_percentage_error(y_true, y_pred)
loss_mean_squared_logarithmic_error(y_true, y_pred)
loss_squared_hinge(y_true, y_pred)
loss_hinge(y_true, y_pred)
loss_categorical_hinge(y_true, y_pred)
loss_logcosh(y_true, y_pred)
loss_categorical_crossentropy(y_true, y_pred)
loss_sparse_categorical_crossentropy(y_true, y_pred)
loss_binary_crossentropy(y_true, y_pred)
loss_kullback_leibler_divergence(y_true, y_pred)
loss_poisson(y_true, y_pred)
loss_cosine_proximity(y_true, y_pred)
Arguments
y_true | True labels (Tensor) |
y_pred | Predictions (Tensor of the same shape as y_true) |
Details
Loss functions are to be supplied in the loss parameter of the compile() function.
Loss functions can be specified either using the name of a built-in loss function (e.g. 'loss = binary_crossentropy'), a reference to a built-in loss function (e.g. 'loss = loss_binary_crossentropy()') or by passing an arbitrary function that returns a scalar for each data point and takes the following two arguments:
y_true | True labels (Tensor) |
y_pred | Predictions (Tensor of the same shape as y_true) |
The objective that is actually optimized is the mean of the output array across all data points.
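The following is a minimal sketch of the "one scalar per data point, then the mean" contract described above. It uses plain R matrices as stand-ins for tensors so that it runs without Keras; the function name per_sample_mae and the sample values are hypothetical, and a real custom loss passed to compile() would operate on tensors with backend operations instead.

```r
# Plain R matrices stand in for tensors (an assumption for illustration only).
# Each row is one data point.
y_true <- matrix(c(1, 0,
                   0, 1), nrow = 2, byrow = TRUE)
y_pred <- matrix(c(0.5, 0.5,
                   0.0, 1.0), nrow = 2, byrow = TRUE)

# Hypothetical custom loss: takes y_true and y_pred and returns
# one scalar per data point (here the mean absolute error of each row).
per_sample_mae <- function(y_true, y_pred) {
  rowMeans(abs(y_pred - y_true))
}

per_sample <- per_sample_mae(y_true, y_pred)  # one value per row

# The optimized objective is the mean of the per-data-point values.
objective <- mean(per_sample)
```

Here the first row contributes a loss of 0.5 and the second a loss of 0, so the objective is 0.25.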
Categorical Crossentropy
When using the categorical_crossentropy loss, your targets should be in
categorical format (e.g. if you have 10 classes, the target for each sample
should be a 10-dimensional vector that is all-zeros except for a 1 at the
index corresponding to the class of the sample). In order to convert
integer targets into categorical targets, you can use the Keras utility
function to_categorical():
categorical_labels <- to_categorical(int_labels, num_classes = NULL)
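To make the categorical format concrete, here is a small base R sketch of the encoding described above. It does not call to_categorical(); it only shows what a one-hot matrix looks like, assuming the integer labels are 0-based class indices. The label values and the name one_hot are illustrative.

```r
# Illustrative integer targets, assumed to be 0-based class indices.
int_labels <- c(0L, 2L, 1L)
num_classes <- 3L

# Pick rows of the identity matrix: row i is all zeros except for a 1
# at the position of the class of sample i (shifted by 1 because R
# indexing is 1-based).
one_hot <- diag(num_classes)[int_labels + 1L, , drop = FALSE]

print(one_hot)
```

Each row contains exactly one 1, in the column corresponding to the class of that sample.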
loss_logcosh
log(cosh(x)) is approximately equal to (x ** 2) / 2 for small x and
to abs(x) - log(2) for large x. This means that 'logcosh' works mostly
like the mean squared error, but will not be so strongly affected by the
occasional wildly incorrect prediction. However, it may return NaNs if the
intermediate value cosh(y_pred - y_true) is too large to be represented
in the chosen precision.
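The two approximations and the overflow concern can be checked numerically in base R. This is a sketch, not the Keras implementation: stable_logcosh is a hypothetical helper that uses the identity log(cosh(x)) = abs(x) + log(1 + exp(-2 * abs(x))) - log(2) to avoid forming cosh(x) directly.

```r
# Small x: log(cosh(x)) is close to x^2 / 2.
x_small <- 0.01
small_direct <- log(cosh(x_small))
small_approx <- x_small^2 / 2

# Large x: log(cosh(x)) is close to abs(x) - log(2).
x_large <- 20
large_direct <- log(cosh(x_large))
large_approx <- abs(x_large) - log(2)

# Very large x: cosh(x) overflows double precision, so the naive
# computation is no longer finite.
naive_overflow <- log(cosh(1000))

# Hypothetical helper: an algebraically equivalent form that never
# evaluates cosh(x), so it stays finite for large inputs.
stable_logcosh <- function(x) {
  abs(x) + log1p(exp(-2 * abs(x))) - log(2)
}

stable_value <- stable_logcosh(1000)
```

With these inputs the naive form overflows to an infinite value at x = 1000, while the rewritten form stays finite and agrees with abs(x) - log(2).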