Description
Here is a minimal example to reproduce it (the loaded model is one trained on MNIST):
C# code:
var model = keras.models.load_model("xxx");
model.summary();
model.compile(new Adam(0.0001f), new LossesApi().SparseCategoricalCrossentropy(), new string[] { "accuracy" });
var x = tf.reshape(tf.range(0, 784, dtype: TF_DataType.TF_FLOAT), (1, 784));
var result = model.Apply(x);
var loss = new LossesApi().SparseCategoricalCrossentropy().Call(tf.constant(3), result);
Console.WriteLine(loss);
Python code:
model: tf.keras.Model = tf.keras.models.load_model("xxx")
model.summary()
model.compile("adam", "sparse_categorical_crossentropy", ["accuracy"])
x = tf.reshape(tf.range(0, 784), [1, 28, 28])
result: tf.Tensor = model(x)
print(result, tf.float64)
loss = tf.keras.losses.sparse_categorical_crossentropy(tf.constant(3), result)
print(loss)
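Since both snippets feed a deterministic input, one practical way to localize the divergence is to dump the output tensor from each runtime to disk and diff the arrays with NumPy. The sketch below is illustrative only: the "C# output" is simulated with a small perturbation, since the real comparison requires saving `result.numpy()` from the Python run and the corresponding tensor from the TensorFlow.NET run.

```python
import numpy as np

np.random.seed(0)

# Illustrative stand-ins for the two dumped tensors.
# In practice: np.save("py_result.npy", result.numpy()) on the Python side,
# and write the TensorFlow.NET result to disk on the C# side, then np.load both.
py_logits = np.random.rand(1, 10)
py_logits /= py_logits.sum()                           # fake softmax output
cs_logits = py_logits + 1e-6 * np.random.randn(1, 10)  # simulated C# output

# Largest element-wise discrepancy, and whether the outputs agree to tolerance.
print(np.abs(py_logits - cs_logits).max())
print(np.allclose(py_logits, cs_logits, atol=1e-5))
```

If the forward outputs already disagree beyond float32 rounding noise, the loss and gradients will necessarily disagree too, so this narrows down whether the bug is in the forward pass or in the loss/backward computation.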
Model summary:
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_1 (InputLayer)        [(None, 28, 28, 1)]       0
 flatten (Flatten)           (None, 784)               0
 dense (Dense)               (None, 100)               78500
 dense_1 (Dense)             (None, 10)                1010
 softmax (Softmax)           (None, 10)                0
=================================================================
Total params: 79,510
Trainable params: 79,510
Non-trainable params: 0
_________________________________________________________________
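As a sanity check that both runtimes loaded the same architecture, the parameter counts in the summary follow directly from `(fan_in + 1) * units` per Dense layer (weights plus biases):

```python
# Dense layer parameters = (fan_in + 1) * units  (weight matrix + bias vector)
dense_params = (784 + 1) * 100    # flatten output -> 100 units
dense_1_params = (100 + 1) * 10   # 100 units -> 10 classes
total_params = dense_params + dense_1_params
print(dense_params, dense_1_params, total_params)
```

These reproduce the 78500, 1010, and 79,510 figures in the summary, so a mismatch here would point to a model-loading problem rather than a numerical one.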
Both the forward and backward results differ between Python and C#, which degrades model performance.
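When comparing the loss values themselves, it helps to recall what sparse categorical cross-entropy computes: with `from_logits=False` (the default, which matches the softmax output of this model), the loss for integer label `y` is simply `-log(p[y])`. A NumPy sketch, using an illustrative probability vector rather than the real model output:

```python
import numpy as np

def sparse_cce(y_true, probs, eps=1e-7):
    # Sparse categorical cross-entropy for probability inputs:
    # -log(probability assigned to the true class), clipped for stability.
    p = np.clip(probs[y_true], eps, 1.0 - eps)
    return -np.log(p)

# Illustrative softmax output over the 10 MNIST classes (not real model output).
probs = np.array([0.05, 0.05, 0.05, 0.55, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05])
print(sparse_cce(3, probs))  # -log(0.55)
```

Computing this reference value by hand for the same saved `result` tensor on both sides shows whether the discrepancy originates in the loss implementation or was already present in the forward output it consumed.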