
EfficientNet Transfer Learning & Fine tuning

๋น„๋น„์ด์ž‰ 2021. 9. 8. 11:20
๋ฐ˜์‘ํ˜•

💪🏻Training a model from scratch

: Accuracy improves very slowly, and the model is likely to overfit.

Printing the trainable and non-trainable parameter counts (e.g. via model.summary()) shows that the number of trainable parameters is far larger.

The loss stays somewhere between 1 and 4, training accuracy plateaus around 0.6, and validation accuracy stays around 0.2.

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB0

# strategy, img_augmentation, IMG_SIZE, NUM_CLASSES, ds_train, and ds_test
# are defined in earlier steps of the referenced Keras tutorial.
with strategy.scope():
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = img_augmentation(inputs)
    outputs = EfficientNetB0(include_top=True, weights=None, classes=NUM_CLASSES)(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
    )

model.summary()

epochs = 40  
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)
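To see the parameter counts mentioned above without scrolling through model.summary(), here is a minimal sketch (assuming the model built above) that sums the weight shapes directly:

import numpy as np

# Sum the element counts of all trainable and non-trainable weights
trainable = int(sum(np.prod(w.shape.as_list()) for w in model.trainable_weights))
non_trainable = int(sum(np.prod(w.shape.as_list()) for w in model.non_trainable_weights))
print(f"Trainable params: {trainable:,} / non-trainable params: {non_trainable:,}")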

💪🏻Transfer Learning from pre-trained weights

1️⃣ Freeze all layers and train only the top layers.

- A relatively large learning rate (1e-2) is used, and the validation accuracy and loss end up better than the training accuracy and loss. This is because the regularization is strong and only suppresses the training-time metrics.

- The model converges after about 50 epochs; if the augmentation layers are not added, validation accuracy only reaches about 60%.

 

def build_model(num_classes):
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = img_augmentation(inputs)
    model = EfficientNetB0(include_top=False, input_tensor=x, weights="imagenet")

    # Freeze the pretrained weights
    model.trainable = False

    # Rebuild top
    x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = layers.BatchNormalization()(x)

    top_dropout_rate = 0.2
    x = layers.Dropout(top_dropout_rate, name="top_dropout")(x)
    outputs = layers.Dense(num_classes, activation="softmax", name="pred")(x)

    # Compile
    model = tf.keras.Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )
    return model
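A usage sketch for this frozen-backbone stage (the epoch count here is illustrative; strategy, ds_train, and ds_test are assumed from the earlier setup):

with strategy.scope():
    model = build_model(num_classes=NUM_CLASSES)

epochs = 25  # illustrative; train until the new top layers converge
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)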

 

2️⃣ Unfreeze a number of layers and fit the model using a smaller learning rate.

- If the pre-trained model already extracts features well, there is a limit to how much this step can raise validation accuracy.

- However, when the pre-trained weights are used on a dataset that differs somewhat from ImageNet, this fine-tuning step becomes important for feature extraction (see the sketch below).

 

Setting layer.trainable = False on a BatchNormalization layer freezes the layer, so its internal state does not change during training. Note that the "frozen state" and "inference mode" are two clearly different concepts.
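A minimal sketch of this unfreezing step in the spirit of the cited tutorial: the top 20 layers are made trainable, BatchNormalization layers stay frozen, and the model is recompiled with a much smaller learning rate (1e-4). The layer count and epoch count are illustrative.

def unfreeze_model(model):
    # Unfreeze the top 20 layers while leaving the BatchNormalization layers frozen
    for layer in model.layers[-20:]:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = True

    # Recompile with a much smaller learning rate so the pretrained weights
    # are only nudged, not overwritten
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )


unfreeze_model(model)

epochs = 10  # illustrative; a short run is usually enough for fine tuning
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)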

 

💪🏻Tips for fine tuning EfficientNet

On unfreezing layers:

  • The BatchNormalization layers need to be kept frozen. If they are also made trainable, the first epoch after unfreezing will significantly reduce accuracy.

 

  • In some cases it may be beneficial to open up only a portion of layers instead of unfreezing all. This will make fine tuning much faster when going to larger models like B7.
  • Each block needs to be all turned on or off. This is because the architecture includes a shortcut from the first layer to the last layer for each block. Not respecting blocks also significantly harms the final performance.

Some other tips for utilizing EfficientNet:

  • Larger variants of EfficientNet do not guarantee improved performance, especially for tasks with less data or fewer classes. In such a case, the larger the EfficientNet variant chosen, the harder it is to tune hyperparameters.
  • EMA (Exponential Moving Average) is very helpful in training EfficientNet from scratch, but not so much for transfer learning.
  • Do not use the RMSprop setup as in the original paper for transfer learning. The momentum and learning rate are too high for transfer learning. It will easily corrupt the pretrained weights and blow up the loss. A quick check is to see whether loss (as categorical cross entropy) is getting significantly larger than log(NUM_CLASSES) after the same epoch; if so, the initial learning rate/momentum is too high (see the quick-check sketch after this list).
  • Smaller batch sizes benefit validation accuracy, possibly because they effectively provide regularization.
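A quick sketch of the log(NUM_CLASSES) sanity check mentioned in the RMSprop tip above (NUM_CLASSES and hist as in the earlier snippets):

import math

# Cross entropy of a uniform random guess over NUM_CLASSES classes
baseline = math.log(NUM_CLASSES)

if hist.history["loss"][-1] > baseline:
    print("Loss is above the random-guess baseline; "
          "the initial learning rate/momentum is likely too high.")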

Source: https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/


๋ฐ˜์‘ํ˜•