Here are the details for both models:
8-bit:
Tensor serving_default_input_2:0 - dtype: , shape: [1 9]
Tensor FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd - dtype: , shape: [4]
Tensor FeedforwardNN/batch_normalization_2/batchnorm/mul_1;FeedforwardNN/batch_normalization_2/batchnorm/add_1;FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd - dtype: , shape: [ 4 20]
Tensor FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd - dtype: , shape: [20]
Tensor FeedforwardNN/batch_normalization_1/batchnorm/mul_1;FeedforwardNN/batch_normalization_1/batchnorm/add_1;FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd - dtype: , shape: [20 32]
Tensor FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd - dtype: , shape: [32]
Tensor FeedforwardNN/batch_normalization/batchnorm/mul_1;FeedforwardNN/batch_normalization/batchnorm/add_1;FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd - dtype: , shape: [32 32]
Tensor FeedforwardNN/dense/BiasAdd/ReadVariableOp - dtype: , shape: [32]
Tensor FeedforwardNN/dense/MatMul - dtype: , shape: [32 9]
Tensor tfl.quantize - dtype: , shape: [1 9]
Tensor FeedforwardNN/dense/MatMul;FeedforwardNN/dense/BiasAdd - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/dense/leaky_re_lu/LeakyRelu - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/batch_normalization/batchnorm/mul_1;FeedforwardNN/batch_normalization/batchnorm/add_1;FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd1 - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/dense_1/leaky_re_lu_1/LeakyRelu - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/batch_normalization_1/batchnorm/mul_1;FeedforwardNN/batch_normalization_1/batchnorm/add_1;FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd1 - dtype: , shape: [ 1 20]
Tensor FeedforwardNN/dense_2/leaky_re_lu_2/LeakyRelu - dtype: <class 'numpy.int8'>, shape: [ 1 20]
Tensor FeedforwardNN/batch_normalization_2/batchnorm/mul_1;FeedforwardNN/batch_normalization_2/batchnorm/add_1;FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd1 - dtype: <class 'numpy.int8'>, shape: [1 4]
Tensor StatefulPartitionedCall:01 - dtype: , shape: [1 4]
Tensor StatefulPartitionedCall:0 - dtype: , shape: [1 4]
16-bit:
Tensor serving_default_input_2:0 - dtype: , shape: [1 9]
Tensor FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd - dtype: , shape: [32]
Tensor FeedforwardNN/batch_normalization/batchnorm/mul_1;FeedforwardNN/batch_normalization/batchnorm/add_1;FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd - dtype: , shape: [32 32]
Tensor FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd - dtype: , shape: [20]
Tensor FeedforwardNN/batch_normalization_1/batchnorm/mul_1;FeedforwardNN/batch_normalization_1/batchnorm/add_1;FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd - dtype: , shape: [ 20 32]
Tensor FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd - dtype: , shape: [4]
Tensor FeedforwardNN/batch_normalization_2/batchnorm/mul_1;FeedforwardNN/batch_normalization_2/batchnorm/add_1;FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd - dtype: , shape: [ 4 20]
Tensor FeedforwardNN/dense/BiasAdd/ReadVariableOp - dtype: , shape: [32]
Tensor FeedforwardNN/dense/MatMul - dtype: , shape: [32 9]
Tensor FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd1 - dtype: , shape: [32]
Tensor FeedforwardNN/batch_normalization/batchnorm/mul_1;FeedforwardNN/batch_normalization/batchnorm/add_1;FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd1 - dtype: , shape: [32 32]
Tensor FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd1 - dtype: , shape: [20]
Tensor FeedforwardNN/batch_normalization_1/batchnorm/mul_1;FeedforwardNN/batch_normalization_1/batchnorm/add_1;FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd1 - dtype: , shape: [20 32]
Tensor FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd1 - dtype: , shape: [4]
Tensor FeedforwardNN/batch_normalization_2/batchnorm/mul_1;FeedforwardNN/batch_normalization_2/batchnorm/add_1;FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd1 - dtype: , shape: [ 4 20]
Tensor FeedforwardNN/dense/BiasAdd/ReadVariableOp1 - dtype: , shape: [32]
Tensor FeedforwardNN/dense/MatMul1 - dtype: , shape: [32 9]
Tensor FeedforwardNN/dense/MatMul;FeedforwardNN/dense/BiasAdd - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/dense/leaky_re_lu/LeakyRelu - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/batch_normalization/batchnorm/mul_1;FeedforwardNN/batch_normalization/batchnorm/add_1;FeedforwardNN/dense_1/MatMul;FeedforwardNN/dense_1/BiasAdd2 - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/dense_1/leaky_re_lu_1/LeakyRelu - dtype: , shape: [ 1 32]
Tensor FeedforwardNN/batch_normalization_1/batchnorm/mul_1;FeedforwardNN/batch_normalization_1/batchnorm/add_1;FeedforwardNN/dense_2/MatMul;FeedforwardNN/dense_2/BiasAdd2 - dtype: , shape: [ 1 20]
Tensor FeedforwardNN/dense_2/leaky_re_lu_2/LeakyRelu - dtype: , shape: [ 1 20]
Tensor FeedforwardNN/batch_normalization_2/batchnorm/mul_1;FeedforwardNN/batch_normalization_2/batchnorm/add_1;FeedforwardNN/dense_3/MatMul;FeedforwardNN/dense_3/BiasAdd2 - dtype: , shape: [1 4]
Tensor StatefulPartitionedCall:0 - dtype: , shape: [1 4]
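For context, a per-tensor dump like the one above can be reproduced with the interpreter's get_tensor_details() method; a minimal sketch, pointed at whichever .tflite file is being inspected:

Code:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="8bit_model.tflite")
interpreter.allocate_tensors()

# Each entry carries the fused tensor name, its numpy dtype and its shape -
# the three fields shown in the listings above.
for t in interpreter.get_tensor_details():
    print(f"Tensor {t['name']} - dtype: {t['dtype']}, shape: {t['shape']}")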
This is my test:
Code:

# This snippet of code takes 3.8 s
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="8bit_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def evaluate_quantized_model8bit(X_test):
    predicted_labels = []
    for i in range(len(X_test)):
        # One inference per sample; the 8-bit model is fed uint8 input
        input_data = X_test[i].reshape(1, -1).astype(np.uint8)
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        predicted_label = np.argmax(output_data)
        predicted_labels.append(predicted_label)
    return predicted_labels

predicted_labelsQ = evaluate_quantized_model8bit(test_data)
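Note that hard-coding np.uint8 here is an assumption about how the model was converted; the dtype the interpreter actually expects can be read from the input details instead, e.g. inside the loop:

Code:

# input_details[0]['dtype'] is the numpy type the model expects
# (uint8, int8 or float32, depending on inference_input_type at
# conversion time), so casting to it avoids guessing.
expected_dtype = input_details[0]['dtype']
input_data = X_test[i].reshape(1, -1).astype(expected_dtype)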
Code:

# And this other one takes 1.1 s
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="16bit_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def evaluate_quantized_model16bit(X_test):
    predicted_labels = []
    for i in range(len(X_test)):
        # The 16-bit model still takes float32 input
        input_data = X_test[i].reshape(1, -1).astype(np.float32)
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        predicted_label = np.argmax(output_data)
        predicted_labels.append(predicted_label)
    return predicted_labels

predicted_labelsQ = evaluate_quantized_model16bit(test_data)
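The timings quoted in the comments come from wall-clock measurements around each loop; a minimal sketch of a side-by-side comparison (time_model is a helper written just for this measurement, and it gives each model its own interpreter so the two runs don't share state):

Code:

import time
import numpy as np
import tensorflow as tf

def time_model(model_path, input_dtype, X_test):
    # Fresh interpreter per model, set up outside the timed region.
    interp = tf.lite.Interpreter(model_path=model_path)
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    out = interp.get_output_details()[0]
    start = time.perf_counter()
    for i in range(len(X_test)):
        interp.set_tensor(inp['index'], X_test[i].reshape(1, -1).astype(input_dtype))
        interp.invoke()
        interp.get_tensor(out['index'])
    return time.perf_counter() - start

print(f"8-bit:  {time_model('8bit_model.tflite', np.uint8, test_data):.2f} s")
print(f"16-bit: {time_model('16bit_model.tflite', np.float32, test_data):.2f} s")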
More details here: https://stackoverflow.com/questions/790 ... -bit-model