Session 8 - Batch Normalization and Regularization
Assignment
1) Change the dataset to CIFAR10
2) Make this network:
i) C1 C2 c3 P1 C3 C4 C5 c6 P2 C7 C8 C9 GAP C10
ii) Keep the parameter count less than 50000
iii) Try adding the output of one layer to another
iv) Max Epochs is 20
3) You are making 3 versions of the above code (in each case achieve above 70% accuracy):
i) Network with Group Normalization
ii) Network with Layer Normalization
iii) Network with Batch Normalization
4) Share these details
i) Training accuracy for 3 models
ii) Test accuracy for 3 models
5) Find 10 misclassified images for the BN model, and show them as a 5x2 image matrix in 3 separately annotated images.
6) Write an explanatory README file that explains:
i) what your code is all about
ii) your findings for normalization techniques
iii) add all your graphs
iv) your collection-of-misclassified-images
7) Upload your complete assignment on GitHub and share the link on LMS
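For step 5, one way to pick the images is to compare predicted and true labels and keep the first ten mismatches. This is a minimal sketch on plain Python lists: the function names are hypothetical, the two-rows-of-five layout is one assumed reading of "5x2", and the real code would take its predictions from the trained BN model on the test set.

```python
def first_misclassified(predictions, targets, limit=10):
    """Return up to `limit` (index, predicted, actual) tuples where the labels differ."""
    wrong = []
    for index, (predicted, actual) in enumerate(zip(predictions, targets)):
        if predicted != actual:
            wrong.append((index, predicted, actual))
            if len(wrong) == limit:
                break
    return wrong


def as_grid(items, columns=5):
    """Split a flat list into rows of `columns` items for plotting."""
    return [items[i:i + columns] for i in range(0, len(items), columns)]


# Toy labels standing in for model output on the test set.
predictions = [3, 5, 1, 0, 2]
targets = [3, 4, 1, 1, 2]
print(first_misclassified(predictions, targets))
```

Each tuple carries what is needed to annotate an image: its position in the test set, the predicted class, and the actual class.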
Normalization Summary
S.No. | Normalization Type | Results | Analysis | File Link |
---|---|---|---|---|
1 | Batch Normalization | <ul><li>Best Train Accuracy - 82.61%</li><li> Best Test Accuracy - 79.60%</li><li> Test Accuracy - 79.39%</li><li>Total Parameters - 39,420</li></ul> | Best-performing model, with the fewest parameters | Open |
2 | Group Normalization | <ul><li>Best Train Accuracy - 84.09%</li><li> Best Test Accuracy - 77.35%</li><li> Test Accuracy - 76.84%</li><li>Total Parameters - 40,360</li></ul> | Group normalization adds the same number of parameters per layer as batch normalization (two per channel); the total here is slightly higher because one convolution block uses 28 channels instead of 26. Train accuracy is higher than with batch normalization but test accuracy is lower, so this model generalizes less well. | Open |
3 | Layer Normalization | <ul><li>Best Train Accuracy - 76.42% </li><li> Best Test Accuracy - 68.48%</li><li> Test Accuracy - 68.48%</li><li>Total Parameters - 55,612</li></ul> | Layer normalization adds a large number of parameters, which makes it difficult to keep the total within the 50K limit (this model still has 55,612). Since there was little scope for changing the model’s structure, its capacity was reduced by using fewer channels, which lowers performance. | Open |
With CIFAR10 as the dataset, batch normalization gives the best test accuracy of the three normalization techniques, with the fewest parameters. Its normalization layers add only two parameters per channel, far fewer than layer normalization over the full feature map, which leaves room to increase the model’s capacity within the 50K limit.
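The parameter counts behind this comparison can be checked by hand. The sketch below assumes each normalization layer learns one scale and one shift per normalized element (the PyTorch default affine settings), and that layer normalization is applied over the full (C, H, W) feature map, which matches the per-layer counts in the model summaries. The function names are illustrative.

```python
def batch_norm_params(channels):
    # One scale (gamma) and one shift (beta) per channel.
    return 2 * channels


def group_norm_params(channels):
    # Affine parameters are per channel, independent of the number of groups.
    return 2 * channels


def layer_norm_params(channels, height, width):
    # Affine parameters cover every element of the normalized (C, H, W) shape.
    return 2 * channels * height * width


# First normalization layer of each model, shapes taken from the summaries.
print(batch_norm_params(16))         # 32
print(group_norm_params(16))         # 32
print(layer_norm_params(8, 30, 30))  # 14400
```

By this count, batch and group normalization add the same number of parameters per layer, while the layer normalization count grows with the size of the feature map.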
1) Batch Normalization
Model Architecture
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 16, 32, 32] 432
ReLU-2 [-1, 16, 32, 32] 0
BatchNorm2d-3 [-1, 16, 32, 32] 32
Dropout-4 [-1, 16, 32, 32] 0
Conv2d-5 [-1, 32, 32, 32] 4,608
ReLU-6 [-1, 32, 32, 32] 0
BatchNorm2d-7 [-1, 32, 32, 32] 64
Dropout-8 [-1, 32, 32, 32] 0
Conv2d-9 [-1, 16, 32, 32] 512
MaxPool2d-10 [-1, 16, 16, 16] 0
Conv2d-11 [-1, 24, 16, 16] 3,456
ReLU-12 [-1, 24, 16, 16] 0
BatchNorm2d-13 [-1, 24, 16, 16] 48
Dropout-14 [-1, 24, 16, 16] 0
Conv2d-15 [-1, 28, 16, 16] 6,048
ReLU-16 [-1, 28, 16, 16] 0
BatchNorm2d-17 [-1, 28, 16, 16] 56
Dropout-18 [-1, 28, 16, 16] 0
Conv2d-19 [-1, 32, 16, 16] 8,064
ReLU-20 [-1, 32, 16, 16] 0
BatchNorm2d-21 [-1, 32, 16, 16] 64
Dropout-22 [-1, 32, 16, 16] 0
Conv2d-23 [-1, 16, 16, 16] 512
MaxPool2d-24 [-1, 16, 8, 8] 0
Conv2d-25 [-1, 20, 8, 8] 2,880
ReLU-26 [-1, 20, 8, 8] 0
BatchNorm2d-27 [-1, 20, 8, 8] 40
Dropout-28 [-1, 20, 8, 8] 0
Conv2d-29 [-1, 26, 8, 8] 4,680
ReLU-30 [-1, 26, 8, 8] 0
BatchNorm2d-31 [-1, 26, 8, 8] 52
Dropout-32 [-1, 26, 8, 8] 0
Conv2d-33 [-1, 32, 8, 8] 7,488
ReLU-34 [-1, 32, 8, 8] 0
BatchNorm2d-35 [-1, 32, 8, 8] 64
Dropout-36 [-1, 32, 8, 8] 0
AvgPool2d-37 [-1, 32, 1, 1] 0
Conv2d-38 [-1, 10, 1, 1] 320
================================================================
Total params: 39,420
Trainable params: 39,420
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 2.50
Params size (MB): 0.15
Estimated Total Size (MB): 2.67
----------------------------------------------------------------
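The total in the summary can be reproduced from the layer shapes. This is a sketch that assumes bias-free convolutions and infers the kernel sizes (3x3 or 1x1) from the listed parameter counts; the `layers` list is transcribed from the summary, not taken from the model code.

```python
def conv_params(in_channels, out_channels, kernel):
    # Bias-free convolution: one kernel x kernel filter per (input, output) channel pair.
    return in_channels * out_channels * kernel * kernel


# (in_channels, out_channels, kernel_size, followed_by_batch_norm)
layers = [
    (3, 16, 3, True), (16, 32, 3, True), (32, 16, 1, False),
    (16, 24, 3, True), (24, 28, 3, True), (28, 32, 3, True), (32, 16, 1, False),
    (16, 20, 3, True), (20, 26, 3, True), (26, 32, 3, True),
    (32, 10, 1, False),
]

total = 0
for in_channels, out_channels, kernel, has_batch_norm in layers:
    total += conv_params(in_channels, out_channels, kernel)
    if has_batch_norm:
        total += 2 * out_channels  # BatchNorm2d scale and shift

print(total)  # 39420
```

The result matches the "Total params" line, so the 3x3 convolutions account for almost all of the parameter budget and the BatchNorm2d layers for only 420 of it.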
Results:
- Best Train Accuracy - 82.61%
- Best Test Accuracy - 79.60%
- Test Accuracy - 79.39%
- Total Parameters - 39,420
Train/Test Logs
Adjusting learning rate of group 0 to 1.0000e-01.
Epoch 1
Train: Loss=1.2112 Batch_id=390 Accuracy=49.44: 100%|██████████| 391/391 [00:21<00:00, 17.83it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.4187, Accuracy: 4908/10000 (49.08%)
Epoch 2
Train: Loss=0.8207 Batch_id=390 Accuracy=62.78: 100%|██████████| 391/391 [00:19<00:00, 20.27it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.9953, Accuracy: 6378/10000 (63.78%)
Epoch 3
Train: Loss=0.8988 Batch_id=390 Accuracy=67.42: 100%|██████████| 391/391 [00:18<00:00, 20.71it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.0392, Accuracy: 6375/10000 (63.75%)
Epoch 4
Train: Loss=1.0030 Batch_id=390 Accuracy=70.75: 100%|██████████| 391/391 [00:19<00:00, 19.72it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.8529, Accuracy: 6989/10000 (69.89%)
Epoch 5
Train: Loss=0.6245 Batch_id=390 Accuracy=72.43: 100%|██████████| 391/391 [00:20<00:00, 19.27it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.7671, Accuracy: 7314/10000 (73.14%)
Epoch 6
Train: Loss=0.7977 Batch_id=390 Accuracy=74.34: 100%|██████████| 391/391 [00:20<00:00, 19.40it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.8924, Accuracy: 6906/10000 (69.06%)
Epoch 7
Train: Loss=0.7652 Batch_id=390 Accuracy=75.23: 100%|██████████| 391/391 [00:19<00:00, 19.71it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.7223, Accuracy: 7506/10000 (75.06%)
Epoch 8
Train: Loss=0.7648 Batch_id=390 Accuracy=76.18: 100%|██████████| 391/391 [00:19<00:00, 19.71it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.7115, Accuracy: 7552/10000 (75.52%)
Epoch 9
Train: Loss=0.6539 Batch_id=390 Accuracy=79.76: 100%|██████████| 391/391 [00:19<00:00, 19.92it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.6143, Accuracy: 7865/10000 (78.65%)
Epoch 10
Train: Loss=0.8766 Batch_id=390 Accuracy=80.60: 100%|██████████| 391/391 [00:19<00:00, 19.67it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.6078, Accuracy: 7871/10000 (78.71%)
Epoch 11
Train: Loss=0.5852 Batch_id=390 Accuracy=80.99: 100%|██████████| 391/391 [00:20<00:00, 19.17it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.6078, Accuracy: 7879/10000 (78.79%)
Epoch 12
Train: Loss=0.7320 Batch_id=390 Accuracy=81.28: 100%|██████████| 391/391 [00:20<00:00, 19.20it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.6079, Accuracy: 7862/10000 (78.62%)
Epoch 13
Train: Loss=0.6432 Batch_id=390 Accuracy=81.33: 100%|██████████| 391/391 [00:20<00:00, 19.02it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.5993, Accuracy: 7896/10000 (78.96%)
Epoch 14
Train: Loss=0.6544 Batch_id=390 Accuracy=81.73: 100%|██████████| 391/391 [00:20<00:00, 19.25it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.5973, Accuracy: 7935/10000 (79.35%)
Epoch 15
Train: Loss=0.5268 Batch_id=390 Accuracy=81.69: 100%|██████████| 391/391 [00:20<00:00, 19.43it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.6023, Accuracy: 7891/10000 (78.91%)
Epoch 16
Train: Loss=0.5817 Batch_id=390 Accuracy=81.55: 100%|██████████| 391/391 [00:19<00:00, 19.71it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.6014, Accuracy: 7911/10000 (79.11%)
Epoch 17
Train: Loss=0.4692 Batch_id=390 Accuracy=82.18: 100%|██████████| 391/391 [00:19<00:00, 19.98it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.5913, Accuracy: 7956/10000 (79.56%)
Epoch 18
Train: Loss=0.6526 Batch_id=390 Accuracy=82.40: 100%|██████████| 391/391 [00:20<00:00, 18.86it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.5920, Accuracy: 7941/10000 (79.41%)
Epoch 19
Train: Loss=0.6541 Batch_id=390 Accuracy=82.61: 100%|██████████| 391/391 [00:20<00:00, 18.81it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.5910, Accuracy: 7960/10000 (79.60%)
Epoch 20
Train: Loss=0.6497 Batch_id=390 Accuracy=82.59: 100%|██████████| 391/391 [00:20<00:00, 19.21it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.5914, Accuracy: 7939/10000 (79.39%)
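The learning rates printed in this log are consistent with a step decay that starts at 0.1 and multiplies the rate by 0.1 every 8 epochs. The sketch below reproduces that schedule in plain Python; the step size and factor are inferred from the log, and the scheduler actually used is not shown in this README.

```python
def step_lr(epochs_completed, base_lr=0.1, step_size=8, gamma=0.1):
    # Learning rate in effect after the given number of completed epochs.
    return base_lr * gamma ** (epochs_completed // step_size)


for epochs_completed in (0, 7, 8, 15, 16, 20):
    print(epochs_completed, f"{step_lr(epochs_completed):.4e}")
```

The jump in test accuracy from 75.52% to 78.65% between epochs 8 and 9 coincides with the first drop in the learning rate.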
Train/Test Visualization
10 Misclassified Images
2) Group Normalization
Model Architecture
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 16, 32, 32] 432
ReLU-2 [-1, 16, 32, 32] 0
GroupNorm-3 [-1, 16, 32, 32] 32
Dropout-4 [-1, 16, 32, 32] 0
Conv2d-5 [-1, 32, 32, 32] 4,608
ReLU-6 [-1, 32, 32, 32] 0
GroupNorm-7 [-1, 32, 32, 32] 64
Dropout-8 [-1, 32, 32, 32] 0
Conv2d-9 [-1, 16, 32, 32] 512
MaxPool2d-10 [-1, 16, 16, 16] 0
Conv2d-11 [-1, 24, 16, 16] 3,456
ReLU-12 [-1, 24, 16, 16] 0
GroupNorm-13 [-1, 24, 16, 16] 48
Dropout-14 [-1, 24, 16, 16] 0
Conv2d-15 [-1, 28, 16, 16] 6,048
ReLU-16 [-1, 28, 16, 16] 0
GroupNorm-17 [-1, 28, 16, 16] 56
Dropout-18 [-1, 28, 16, 16] 0
Conv2d-19 [-1, 32, 16, 16] 8,064
ReLU-20 [-1, 32, 16, 16] 0
GroupNorm-21 [-1, 32, 16, 16] 64
Dropout-22 [-1, 32, 16, 16] 0
Conv2d-23 [-1, 16, 16, 16] 512
MaxPool2d-24 [-1, 16, 8, 8] 0
Conv2d-25 [-1, 20, 8, 8] 2,880
ReLU-26 [-1, 20, 8, 8] 0
GroupNorm-27 [-1, 20, 8, 8] 40
Dropout-28 [-1, 20, 8, 8] 0
Conv2d-29 [-1, 28, 8, 8] 5,040
ReLU-30 [-1, 28, 8, 8] 0
GroupNorm-31 [-1, 28, 8, 8] 56
Dropout-32 [-1, 28, 8, 8] 0
Conv2d-33 [-1, 32, 8, 8] 8,064
ReLU-34 [-1, 32, 8, 8] 0
GroupNorm-35 [-1, 32, 8, 8] 64
Dropout-36 [-1, 32, 8, 8] 0
AvgPool2d-37 [-1, 32, 1, 1] 0
Conv2d-38 [-1, 10, 1, 1] 320
================================================================
Total params: 40,360
Trainable params: 40,360
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 2.51
Params size (MB): 0.15
Estimated Total Size (MB): 2.67
----------------------------------------------------------------
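Group normalization requires the channel count of each layer to be divisible by the number of groups. This README does not state the group count used, so the sketch below only lists which counts would be valid if a single value were shared by every GroupNorm layer; `norm_channels` is transcribed from the summary above.

```python
from functools import reduce
from math import gcd

# Channel widths of the GroupNorm layers, in order, from the summary.
norm_channels = [16, 32, 24, 28, 32, 20, 28, 32]

# A shared group count must divide every width, so it must divide their gcd.
common = reduce(gcd, norm_channels)
valid_groups = [g for g in range(1, common + 1) if common % g == 0]

print(common)        # 4
print(valid_groups)  # [1, 2, 4]
```

Whichever count is chosen, the parameter total stays the same, because the learned scale and shift are per channel rather than per group.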
Results:
- Best Train Accuracy - 84.09%
- Best Test Accuracy - 77.35%
- Test Accuracy - 76.84%
- Total Parameters - 40,360
Train/Test Logs
Epoch 1
Train: Loss=1.5306 Batch_id=390 Accuracy=36.48: 100%|██████████| 391/391 [00:20<00:00, 19.48it/s]
Test set: Average loss: 1.4840, Accuracy: 4458/10000 (44.58%)
Epoch 2
Train: Loss=1.2313 Batch_id=390 Accuracy=53.96: 100%|██████████| 391/391 [00:19<00:00, 20.47it/s]
Test set: Average loss: 1.1905, Accuracy: 5691/10000 (56.91%)
Epoch 3
Train: Loss=0.9978 Batch_id=390 Accuracy=61.65: 100%|██████████| 391/391 [00:18<00:00, 20.66it/s]
Test set: Average loss: 1.0877, Accuracy: 6094/10000 (60.94%)
Epoch 4
Train: Loss=0.8820 Batch_id=390 Accuracy=65.29: 100%|██████████| 391/391 [00:19<00:00, 20.15it/s]
Test set: Average loss: 0.9620, Accuracy: 6614/10000 (66.14%)
Epoch 5
Train: Loss=0.9931 Batch_id=390 Accuracy=68.30: 100%|██████████| 391/391 [00:19<00:00, 19.57it/s]
Test set: Average loss: 0.8844, Accuracy: 6875/10000 (68.75%)
Epoch 6
Train: Loss=1.0570 Batch_id=390 Accuracy=71.02: 100%|██████████| 391/391 [00:19<00:00, 19.60it/s]
Test set: Average loss: 0.8330, Accuracy: 7032/10000 (70.32%)
Epoch 7
Train: Loss=0.6323 Batch_id=390 Accuracy=73.09: 100%|██████████| 391/391 [00:20<00:00, 19.55it/s]
Test set: Average loss: 0.8228, Accuracy: 7120/10000 (71.20%)
Epoch 8
Train: Loss=0.5310 Batch_id=390 Accuracy=74.65: 100%|██████████| 391/391 [00:19<00:00, 20.25it/s]
Test set: Average loss: 0.7560, Accuracy: 7325/10000 (73.25%)
Epoch 9
Train: Loss=0.7021 Batch_id=390 Accuracy=76.32: 100%|██████████| 391/391 [00:19<00:00, 20.39it/s]
Test set: Average loss: 0.7814, Accuracy: 7312/10000 (73.12%)
Epoch 10
Train: Loss=0.6399 Batch_id=390 Accuracy=77.36: 100%|██████████| 391/391 [00:19<00:00, 19.74it/s]
Test set: Average loss: 0.7428, Accuracy: 7348/10000 (73.48%)
Epoch 11
Train: Loss=0.6346 Batch_id=390 Accuracy=78.00: 100%|██████████| 391/391 [00:20<00:00, 19.43it/s]
Test set: Average loss: 0.7384, Accuracy: 7455/10000 (74.55%)
Epoch 12
Train: Loss=0.4105 Batch_id=390 Accuracy=79.51: 100%|██████████| 391/391 [00:20<00:00, 19.22it/s]
Test set: Average loss: 0.7197, Accuracy: 7489/10000 (74.89%)
Epoch 13
Train: Loss=0.6398 Batch_id=390 Accuracy=79.90: 100%|██████████| 391/391 [00:19<00:00, 20.23it/s]
Test set: Average loss: 0.7232, Accuracy: 7522/10000 (75.22%)
Epoch 14
Train: Loss=0.5218 Batch_id=390 Accuracy=80.68: 100%|██████████| 391/391 [00:19<00:00, 20.28it/s]
Test set: Average loss: 0.6646, Accuracy: 7698/10000 (76.98%)
Epoch 15
Train: Loss=0.4599 Batch_id=390 Accuracy=81.22: 100%|██████████| 391/391 [00:19<00:00, 20.27it/s]
Test set: Average loss: 0.6925, Accuracy: 7587/10000 (75.87%)
Epoch 16
Train: Loss=0.5416 Batch_id=390 Accuracy=82.04: 100%|██████████| 391/391 [00:20<00:00, 19.32it/s]
Test set: Average loss: 0.6946, Accuracy: 7587/10000 (75.87%)
Epoch 17
Train: Loss=0.4043 Batch_id=390 Accuracy=82.41: 100%|██████████| 391/391 [00:20<00:00, 19.29it/s]
Test set: Average loss: 0.6540, Accuracy: 7728/10000 (77.28%)
Epoch 18
Train: Loss=0.5031 Batch_id=390 Accuracy=82.89: 100%|██████████| 391/391 [00:20<00:00, 19.13it/s]
Test set: Average loss: 0.7014, Accuracy: 7632/10000 (76.32%)
Epoch 19
Train: Loss=0.3903 Batch_id=390 Accuracy=83.37: 100%|██████████| 391/391 [00:19<00:00, 19.74it/s]
Test set: Average loss: 0.6668, Accuracy: 7735/10000 (77.35%)
Epoch 20
Train: Loss=0.4610 Batch_id=390 Accuracy=84.09: 100%|██████████| 391/391 [00:19<00:00, 19.68it/s]
Test set: Average loss: 0.6781, Accuracy: 7684/10000 (76.84%)
Train/Test Visualization
10 Misclassified Images
3) Layer Normalization
Model Architecture
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 8, 30, 30] 216
ReLU-2 [-1, 8, 30, 30] 0
LayerNorm-3 [-1, 8, 30, 30] 14,400
Dropout-4 [-1, 8, 30, 30] 0
Conv2d-5 [-1, 10, 28, 28] 720
ReLU-6 [-1, 10, 28, 28] 0
LayerNorm-7 [-1, 10, 28, 28] 15,680
Dropout-8 [-1, 10, 28, 28] 0
Conv2d-9 [-1, 8, 28, 28] 80
MaxPool2d-10 [-1, 8, 14, 14] 0
Conv2d-11 [-1, 10, 14, 14] 720
ReLU-12 [-1, 10, 14, 14] 0
LayerNorm-13 [-1, 10, 14, 14] 3,920
Dropout-14 [-1, 10, 14, 14] 0
Conv2d-15 [-1, 12, 14, 14] 1,080
ReLU-16 [-1, 12, 14, 14] 0
LayerNorm-17 [-1, 12, 14, 14] 4,704
Dropout-18 [-1, 12, 14, 14] 0
Conv2d-19 [-1, 14, 14, 14] 1,512
ReLU-20 [-1, 14, 14, 14] 0
LayerNorm-21 [-1, 14, 14, 14] 5,488
Dropout-22 [-1, 14, 14, 14] 0
Conv2d-23 [-1, 8, 14, 14] 112
MaxPool2d-24 [-1, 8, 7, 7] 0
Conv2d-25 [-1, 10, 7, 7] 720
ReLU-26 [-1, 10, 7, 7] 0
LayerNorm-27 [-1, 10, 7, 7] 980
Dropout-28 [-1, 10, 7, 7] 0
Conv2d-29 [-1, 12, 7, 7] 1,080
ReLU-30 [-1, 12, 7, 7] 0
LayerNorm-31 [-1, 12, 7, 7] 1,176
Dropout-32 [-1, 12, 7, 7] 0
Conv2d-33 [-1, 14, 7, 7] 1,512
ReLU-34 [-1, 14, 7, 7] 0
LayerNorm-35 [-1, 14, 7, 7] 1,372
Dropout-36 [-1, 14, 7, 7] 0
AvgPool2d-37 [-1, 14, 1, 1] 0
Conv2d-38 [-1, 10, 1, 1] 140
================================================================
Total params: 55,612
Trainable params: 55,612
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.80
Params size (MB): 0.21
Estimated Total Size (MB): 1.03
----------------------------------------------------------------
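Most of this model's parameters sit in the normalization layers rather than in the convolutions. The sketch below totals the LayerNorm entries from the summary, assuming each one learns a scale and a shift for every element of its (C, H, W) input; the shape list is transcribed from the table.

```python
# (channels, height, width) of each LayerNorm input, from the summary.
layer_norm_shapes = [
    (8, 30, 30), (10, 28, 28),
    (10, 14, 14), (12, 14, 14), (14, 14, 14),
    (10, 7, 7), (12, 7, 7), (14, 7, 7),
]

norm_total = sum(2 * c * h * w for c, h, w in layer_norm_shapes)
model_total = 55_612  # "Total params" from the summary

print(norm_total)                         # 47720
print(model_total - norm_total)           # 7892
print(f"{norm_total / model_total:.1%}")  # 85.8%
```

Only 7,892 parameters are left for the convolutions, which is why the channel widths are so much smaller than in the other two models. One possible alternative, not reported in this README, is a normalization with per-channel affine parameters over the same elements, such as `nn.GroupNorm(1, C)` in PyTorch, which would need only two parameters per channel.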
Results:
- Best Train Accuracy - 76.42%
- Best Test Accuracy - 68.48%
- Test Accuracy - 68.48%
- Total Parameters - 55,612
Train/Test Logs
Adjusting learning rate of group 0 to 1.0000e-01.
Epoch 1
Train: Loss=1.5604 Batch_id=390 Accuracy=27.79: 100%|██████████| 391/391 [00:20<00:00, 19.18it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.6360, Accuracy: 3908/10000 (39.08%)
Epoch 2
Train: Loss=1.2691 Batch_id=390 Accuracy=44.53: 100%|██████████| 391/391 [00:20<00:00, 18.70it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.4476, Accuracy: 4765/10000 (47.65%)
Epoch 3
Train: Loss=1.2596 Batch_id=390 Accuracy=52.46: 100%|██████████| 391/391 [00:19<00:00, 20.09it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.2390, Accuracy: 5554/10000 (55.54%)
Epoch 4
Train: Loss=1.1582 Batch_id=390 Accuracy=57.66: 100%|██████████| 391/391 [00:19<00:00, 19.94it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.1633, Accuracy: 5809/10000 (58.09%)
Epoch 5
Train: Loss=0.9210 Batch_id=390 Accuracy=61.32: 100%|██████████| 391/391 [00:19<00:00, 20.07it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.0992, Accuracy: 6061/10000 (60.61%)
Epoch 6
Train: Loss=1.0223 Batch_id=390 Accuracy=63.65: 100%|██████████| 391/391 [00:19<00:00, 19.58it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.1038, Accuracy: 6042/10000 (60.42%)
Epoch 7
Train: Loss=0.9897 Batch_id=390 Accuracy=65.52: 100%|██████████| 391/391 [00:19<00:00, 19.62it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 1.0206, Accuracy: 6353/10000 (63.53%)
Epoch 8
Train: Loss=0.8556 Batch_id=390 Accuracy=67.30: 100%|██████████| 391/391 [00:19<00:00, 19.91it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.9880, Accuracy: 6479/10000 (64.79%)
Epoch 9
Train: Loss=1.0066 Batch_id=390 Accuracy=68.71: 100%|██████████| 391/391 [00:19<00:00, 20.25it/s]Adjusting learning rate of group 0 to 1.0000e-01.
Test set: Average loss: 0.9711, Accuracy: 6535/10000 (65.35%)
Epoch 10
Train: Loss=0.7505 Batch_id=390 Accuracy=69.90: 100%|██████████| 391/391 [00:19<00:00, 20.48it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9523, Accuracy: 6605/10000 (66.05%)
Epoch 11
Train: Loss=0.7378 Batch_id=390 Accuracy=73.83: 100%|██████████| 391/391 [00:19<00:00, 19.91it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9037, Accuracy: 6799/10000 (67.99%)
Epoch 12
Train: Loss=0.8557 Batch_id=390 Accuracy=74.50: 100%|██████████| 391/391 [00:20<00:00, 19.35it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9033, Accuracy: 6827/10000 (68.27%)
Epoch 13
Train: Loss=0.6750 Batch_id=390 Accuracy=74.85: 100%|██████████| 391/391 [00:20<00:00, 19.53it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9049, Accuracy: 6826/10000 (68.26%)
Epoch 14
Train: Loss=0.6658 Batch_id=390 Accuracy=75.02: 100%|██████████| 391/391 [00:19<00:00, 20.19it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9070, Accuracy: 6815/10000 (68.15%)
Epoch 15
Train: Loss=0.6038 Batch_id=390 Accuracy=75.38: 100%|██████████| 391/391 [00:18<00:00, 20.60it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9041, Accuracy: 6832/10000 (68.32%)
Epoch 16
Train: Loss=0.5226 Batch_id=390 Accuracy=75.62: 100%|██████████| 391/391 [00:19<00:00, 20.31it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9062, Accuracy: 6811/10000 (68.11%)
Epoch 17
Train: Loss=0.5457 Batch_id=390 Accuracy=75.79: 100%|██████████| 391/391 [00:20<00:00, 19.51it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9070, Accuracy: 6822/10000 (68.22%)
Epoch 18
Train: Loss=0.7538 Batch_id=390 Accuracy=75.95: 100%|██████████| 391/391 [00:19<00:00, 19.68it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9090, Accuracy: 6838/10000 (68.38%)
Epoch 19
Train: Loss=0.6772 Batch_id=390 Accuracy=76.19: 100%|██████████| 391/391 [00:19<00:00, 20.18it/s]Adjusting learning rate of group 0 to 1.0000e-02.
Test set: Average loss: 0.9156, Accuracy: 6814/10000 (68.14%)
Epoch 20
Train: Loss=0.5566 Batch_id=390 Accuracy=76.42: 100%|██████████| 391/391 [00:18<00:00, 20.65it/s]Adjusting learning rate of group 0 to 1.0000e-03.
Test set: Average loss: 0.9115, Accuracy: 6848/10000 (68.48%)
Train/Test Visualization
10 Misclassified Images