
TensorFlow BatchNormalization (Transfer Learning): Important Notes

비비이잉 2021. 9. 13. 16:41

Important notes about BatchNormalization layer

Many image models contain BatchNormalization layers. That layer is a special case on every imaginable count. Here are a few things to keep in mind.

  • BatchNormalization contains 2 non-trainable weights that get updated during training. These are the variables tracking the mean and variance of the inputs.
  • When you set bn_layer.trainable = False, the BatchNormalization layer will run in inference mode, and will not update its mean & variance statistics. This is not the case for other layers in general, as weight trainability & inference/training modes are two orthogonal concepts. But the two are tied in the case of the BatchNormalization layer.
  • When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training=False when calling the base model. Otherwise the updates applied to the non-trainable weights will suddenly destroy what the model has learned.

<Source: https://www.tensorflow.org/guide/keras/transfer_learning?hl=Eng >
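To make the last bullet concrete, here is a minimal sketch of the pattern the guide describes: the frozen base model is called with training=False so its BatchNormalization layers stay in inference mode even when parts of the model are unfrozen later. The backbone, input shape, and class count below are placeholders.

import tensorflow as tf
from tensorflow.keras import layers

# Pre-trained backbone, fully frozen for the first stage of transfer learning.
base_model = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)
base_model.trainable = False

inputs = tf.keras.Input(shape=(224, 224, 3))
# training=False keeps the BatchNormalization layers in inference mode,
# so their moving mean/variance are not updated during later fine-tuning.
x = base_model(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)  # 10 classes is a placeholder
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)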

 

import tensorflow as tf
from tensorflow.keras import layers

# model, ds_train, ds_test and plot_hist are defined earlier in the
# linked Keras example.


def unfreeze_model(model):
    # We unfreeze the top 20 layers while leaving BatchNorm layers frozen
    for layer in model.layers[-20:]:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = True

    # Recompile with a small learning rate so the unfrozen layers are
    # fine-tuned gently.
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )


unfreeze_model(model)

epochs = 10  # @param {type: "slider", min:8, max:50}
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)
plot_hist(hist)

# Source: https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/

Both the official Keras and TensorFlow documentation bring up BatchNormalization when they discuss transfer learning.

 

Tips for fine tuning EfficientNet

On unfreezing layers:

  • The BatchNormalization layers need to be kept frozen (more details). If they are also turned to trainable, the first epoch after unfreezing will significantly reduce accuracy.
  • In some cases it may be beneficial to open up only a portion of layers instead of unfreezing all. This will make fine tuning much faster when going to larger models like B7.
  • Each block needs to be all turned on or off. This is because the architecture includes a shortcut from the first layer to the last layer for each block. Not respecting blocks also significantly harms the final performance.

 

I have quoted this verbatim. It says that when unfreezing layers, the BatchNormalization layers must always be kept frozen. If these layers are made trainable, accuracy drops sharply in the first epoch after unfreezing.
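The block rule in the last tip can be followed by matching layer-name prefixes. Below is a rough sketch, assuming the keras.applications EfficientNet naming scheme in which the layers of each block share a prefix such as "block7"; the prefixes chosen here are only an example.

from tensorflow.keras import layers


def unfreeze_blocks(model, block_prefixes=("block6", "block7", "top")):
    # Unfreeze whole blocks at a time (plus the top layers), while still
    # keeping every BatchNormalization layer frozen.
    # The "blockN" prefixes are an assumption about the backbone's layer
    # names; adjust them for a different model.
    for layer in model.layers:
        if layer.name.startswith(block_prefixes) and not isinstance(
            layer, layers.BatchNormalization
        ):
            layer.trainable = True

After calling unfreeze_blocks(model), the model would be recompiled with a small learning rate, as in the snippet above.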

 

But there is one thing to watch out for here :)

 

A problem arises because tf.keras and standalone keras do not behave the same way when you set trainable.
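As a quick way to check the tf.keras side of this, here is a small sketch (the toy data and layer sizes are arbitrary): after setting trainable = False on a BatchNormalization layer, its weights, including the moving mean and variance, should stay unchanged across a training run.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Tiny model containing a single BatchNormalization layer.
bn = layers.BatchNormalization()
model = tf.keras.Sequential(
    [layers.Dense(4, input_shape=(4,)), bn, layers.Dense(1)]
)

bn.trainable = False  # in tf.keras this also puts the layer in inference mode
model.compile(optimizer="adam", loss="mse")

before = [w.copy() for w in bn.get_weights()]  # gamma, beta, moving mean/variance
model.fit(np.random.rand(64, 4), np.random.rand(64, 1), epochs=1, verbose=0)
after = bn.get_weights()

# With trainable = False the BatchNorm weights, including the moving
# statistics, should not have changed during training.
print([np.allclose(b, a) for b, a in zip(before, after)])

If every entry prints True, the layer was indeed left in inference mode, which is the tf.keras behavior quoted from the guide above; older standalone keras versions only froze the weights and did not switch the layer to inference mode.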

