Description
- [BUG Report]: EarlyStopping should trigger stopping the training if the loss value is not improved by at least _min_delta.
- [Suggestion]: I also suggest that the training should stop once the loss reaches _baseline, which should act as the threshold for the desired (optimal) loss value.
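For clarity, this is the behaviour I would expect from _min_delta and patience, sketched as a small standalone C# loop (my own illustration, not the library code; the loss values are made up):

using System;

// Expected EarlyStopping semantics (sketch only, not library code),
// assuming mode "min", min_delta = 0.0001f and patience = 3.
float minDelta = 0.0001f;
int patience = 3;
float best = float.PositiveInfinity;
int wait = 0;

// Hypothetical losses: from epoch 3 on, improvements are smaller than minDelta.
float[] losses = { 1.0f, 0.5f, 0.30f, 0.29998f, 0.29996f, 0.29994f, 0.29992f };

for (int epoch = 0; epoch < losses.Length; epoch++)
{
    float current = losses[epoch];
    if (current + minDelta < best)
    {
        // Count as an improvement only if the loss dropped by at least minDelta.
        best = current;
        wait = 0;
    }
    else if (++wait >= patience)
    {
        Console.WriteLine($"Expected early stop at epoch {epoch}"); // epoch 5 here
        break;
    }
}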
Reproduction Steps
var layers = tf.keras.layers;
Tensor input = tf.keras.Input((-1, 1), name: "input_place_holder");
Tensor hiddenLayer1 = layers.Dense(7, activation: "linear").Apply(input);
Tensor output = layers.Dense(2, activation: "linear").Apply(hiddenLayer1);
var model = tf.keras.Model(input, output);
Tensorflow.NumPy.NDArray inputs = new Tensorflow.NumPy.NDArray(new float[,] { { -5 }, { -4 }, { -3 }, { -2 }, { -1 }, { 0 }, { 1 }, { 2 }, { 3 }, { 4 }, { 5 } });
Tensorflow.NumPy.NDArray outputs = new Tensorflow.NumPy.NDArray(new float[,] { { 0, 2 }, { 1, 4 }, { 2, 6 }, { 3, 8 }, { 4, 10 }, { 5, 12 }, { 6, 14 }, { 7, 16 }, { 8, 18 }, { 9, 20 }, { 10, 22 } });
List<Tensorflow.NumPy.NDArray> weights = model.get_weights();
var callbackPars = new Tensorflow.Keras.Callbacks.CallbackParams() { Model = model };
var earlystoppingCB = new Tensorflow.Keras.Callbacks.EarlyStopping(parameters: callbackPars,
    monitor: "loss",
    patience: 3,
    mode: "min",
    restore_best_weights: true,
    //baseline: 0.001f,
    min_delta: 0.0001f);
Tensorflow.Keras.Engine.ICallback historyCB;
float learningRate = 0.1f;
// Retry from the initial weights with a 10x smaller learning rate until the final loss drops below 0.5.
for (int i = 0; i < 100; i++)
{
    model.compile(optimizer: tf.keras.optimizers.SGD(learningRate), loss: tf.keras.losses.MeanSquaredError());
    historyCB = model.fit(inputs, outputs, epochs: 1000, callbacks: new List<Tensorflow.Keras.Engine.ICallback>() { earlystoppingCB });
    if (historyCB.history["loss"][historyCB.history["loss"].Count - 1] < 0.5f)
        break;
    model.set_weights(weights);
    learningRate /= 10f;
}
Tensor results0 = model.predict(tf.constant(new float[,] { { -8 } }));
Known Workarounds
Suggested fix to the bug
- The fix for the bug is at line 134 of the file Tensorflow.Keras.Callbacks.EarlyStopping.cs:
bool less_op = (monitor_value - _min_delta) < reference_value;
---> bool less_op = (monitor_value + _min_delta) < reference_value;
Why:
Since _min_delta is always positive, a monitor_value that improves on reference_value by even a negligible amount (less than _min_delta) still satisfies (monitor_value - _min_delta) < reference_value, so the check is effectively always true. Therefore, _wait keeps being reset to 0 and early stopping never triggers.
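A minimal standalone C# sketch (my own code, not the library's) showing the difference between the two comparisons for an improvement smaller than _min_delta:

using System;

// Standalone demonstration; the values are made up, only the relative sizes
// matter (improvement of 0.00001 is smaller than min_delta of 0.0001).
float minDelta = 0.0001f;
float referenceValue = 0.30000f;  // best loss seen so far
float monitorValue = 0.29999f;    // current loss: improved by only 0.00001

// Current comparison: treats the negligible improvement as a real one,
// so _wait would be reset and early stopping would never fire.
bool currentCheck = (monitorValue - minDelta) < referenceValue;

// Proposed comparison: requires the improvement to be at least min_delta.
bool proposedCheck = (monitorValue + minDelta) < referenceValue;

Console.WriteLine($"current check:  {currentCheck}");   // True
Console.WriteLine($"proposed check: {proposedCheck}");  // False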
Suggested fix for the _baseline suggestion
- The fix for the suggestion is at line 88 of the file Tensorflow.Keras.Callbacks.EarlyStopping.cs:
if (_baseline == 0f || _is_improvement(current, _baseline))
---> if (_baseline == 0f || _is_improvement(_baseline, current))
Why:
We should check whether the current (improved) loss has dropped below _baseline. If it has, _wait should no longer be reset to 0, so that training can stop once the target loss is reached.
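A minimal standalone sketch (my own code, not the library's) of the intended _baseline behaviour under this suggestion, assuming mode "min" and ignoring _min_delta for simplicity: _wait is only reset while the loss is still above the baseline, so early stopping can fire once the target is reached.

using System;

// Illustration of the suggested _baseline semantics (mode "min", min_delta ignored).
// The loss values are made up; baseline acts as the target loss threshold.
float baseline = 0.001f;
int patience = 3;
int wait = 0;

float[] losses = { 0.5f, 0.05f, 0.005f, 0.0009f, 0.0008f, 0.0007f, 0.0006f };

for (int epoch = 0; epoch < losses.Length; epoch++)
{
    float current = losses[epoch];
    if (baseline < current)
    {
        // Target not reached yet: keep training, reset the patience counter.
        wait = 0;
    }
    else if (++wait >= patience)
    {
        // Loss has stayed at or below the baseline for `patience` epochs: stop.
        Console.WriteLine($"Stop at epoch {epoch}: loss {current} reached baseline {baseline}");
        break;
    }
}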
Configuration and Other Information
No response