Commit d86dbe4

Update README.md and few more comments
1 parent 0d253e2 commit d86dbe4

2 files changed: 13 additions, 7 deletions
README.md

Lines changed: 5 additions & 1 deletion
@@ -2,8 +2,11 @@

 ## What's New

+### Feb 12, 2021
+* Update Normalization-Free nets to include new NFNet-F (https://arxiv.org/abs/2102.06171) model defs
+
 ### Feb 10, 2021
-* First Normalizer-Free model training experiments done,
+* First Normalization-Free model training experiments done,
 * nf_resnet50 - 80.68 top-1 @ 288x288, 80.31 @ 256x256
 * nf_regnet_b1 - 79.30 @ 288x288, 78.75 @ 256x256
 * More model archs, incl a flexible ByobNet backbone ('Bring-your-own-blocks')
@@ -164,6 +167,7 @@ A full version of the list below with source links can be found in the [document
 * Inception-ResNet-V2 and Inception-V4 - https://arxiv.org/abs/1602.07261
 * MobileNet-V3 (MBConvNet w/ Efficient Head) - https://arxiv.org/abs/1905.02244
 * NASNet-A - https://arxiv.org/abs/1707.07012
+* NFNet-F - https://arxiv.org/abs/2102.06171
 * NF-RegNet / NF-ResNet - https://arxiv.org/abs/2101.08692
 * PNasNet - https://arxiv.org/abs/1712.00559
 * RegNet - https://arxiv.org/abs/2003.13678

timm/models/nfnet.py

Lines changed: 8 additions & 6 deletions
@@ -236,7 +236,7 @@ def forward(self, x):


 class NormFreeBlock(nn.Module):
-    """Normalization-free pre-activation block.
+    """Normalization-Free pre-activation block.
     """

     def __init__(
@@ -351,6 +351,7 @@ def create_stem(in_chs, out_chs, stem_type='', conv_layer=None, act_layer=None):
     return nn.Sequential(stem), stem_stride, stem_feature


+# from https://github.com/deepmind/deepmind-research/tree/master/nfnets
 _nonlin_gamma = dict(
     identity=1.0,
     celu=1.270926833152771,
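The `_nonlin_gamma` constants look like per-activation rescaling factors. The following is a rough sketch, assuming each gamma is the reciprocal of the standard deviation of the activation's output when its input is a unit Gaussian, as in the NF-ResNet paper referenced in this file. Under that assumption the values can be approximated with plain numerical integration. The helpers `expectation` and `estimate_gamma` are hypothetical names for illustration and are not part of timm or the DeepMind code; CELU is taken with `alpha=1.0`, which is also an assumption.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expectation(fn, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal estimate of E[fn(x)] for x ~ N(0, 1) (hypothetical helper)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # halve the two end points
        total += weight * fn(x) * normal_pdf(x)
    return total * h


def estimate_gamma(act):
    """Assumed definition: gamma = 1 / sqrt(Var[act(x)]) with x ~ N(0, 1)."""
    mean = expectation(act)
    second_moment = expectation(lambda x: act(x) ** 2)
    return 1.0 / math.sqrt(second_moment - mean ** 2)


def identity(x):
    return x


def relu(x):
    return max(x, 0.0)


def celu(x, alpha=1.0):
    # CELU as defined in PyTorch: max(0, x) + min(0, alpha * (exp(x / alpha) - 1))
    return max(x, 0.0) + min(0.0, alpha * (math.exp(x / alpha) - 1.0))


if __name__ == "__main__":
    for name, act in [("identity", identity), ("relu", relu), ("celu", celu)]:
        print(name, estimate_gamma(act))
```

For ReLU the same quantity has the closed form `sqrt(2 / (1 - 1 / pi))`, which gives a quick sanity check on the integration. The identity and CELU estimates can be compared against the constants listed in the hunk above.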
@@ -371,10 +372,13 @@ def create_stem(in_chs, out_chs, stem_type='', conv_layer=None, act_layer=None):


 class NormFreeNet(nn.Module):
-    """ Normalization-free ResNets and RegNets
+    """ Normalization-Free Network

-    As described in `Characterizing signal propagation to close the performance gap in unnormalized ResNets`
+    As described in :
+    `Characterizing signal propagation to close the performance gap in unnormalized ResNets`
         - https://arxiv.org/abs/2101.08692
+    and
+    `High-Performance Large-Scale Image Recognition Without Normalization` - https://arxiv.org/abs/2102.06171

     This model aims to cover both the NFRegNet-Bx models as detailed in the paper's code snippets and
     the (preact) ResNet models described earlier in the paper.
@@ -432,7 +436,7 @@ def __init__(self, cfg: NfCfg, num_classes=1000, in_chans=3, global_pool='avg',
                 blocks += [NormFreeBlock(
                     in_chs=prev_chs, out_chs=out_chs,
                     alpha=cfg.alpha,
-                    beta=1. / expected_var ** 0.5,  # NOTE: beta used as multiplier in block
+                    beta=1. / expected_var ** 0.5,
                     stride=stride if block_idx == 0 else 1,
                     dilation=dilation,
                     first_dilation=first_dilation,
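The `beta=1. / expected_var ** 0.5` argument only makes sense alongside the bookkeeping for `expected_var`, which this hunk does not show. The sketch below is an assumption based on the scheme described in the NF-ResNet paper: the expected variance starts at 1, grows by `alpha ** 2` after every residual block, and is reset to 1 at the first (transition) block of each stage. The function name `block_betas` and the example depths are hypothetical and are not taken from timm.

```python
def block_betas(depths, alpha):
    """Return per-stage lists of beta values (hypothetical helper).

    Assumed scheme: beta = 1 / sqrt(expected_var), where expected_var
    tracks the analytically expected variance at the input of each block.
    """
    expected_var = 1.0
    betas = []
    for depth in depths:
        stage_betas = []
        for block_idx in range(depth):
            # beta is used as a multiplier that rescales the block input
            stage_betas.append(1.0 / expected_var ** 0.5)
            if block_idx == 0:
                # assumed: the transition block resets the expected variance
                expected_var = 1.0
            expected_var += alpha ** 2
        betas.append(stage_betas)
    return betas


if __name__ == "__main__":
    for stage_idx, stage in enumerate(block_betas(depths=(2, 2), alpha=0.2)):
        print(stage_idx, stage)
```

Under these assumptions, the first block of a later stage receives the variance accumulated by the previous stage, so its beta is smaller than that of the blocks that follow the reset.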
@@ -477,8 +481,6 @@ def __init__(self, cfg: NfCfg, num_classes=1000, in_chans=3, global_pool='avg',
                 if m.bias is not None:
                     nn.init.zeros_(m.bias)
             elif isinstance(m, nn.Conv2d):
-                # as per discussion with paper authors, original in haiku is
-                # hk.initializers.VarianceScaling(1.0, 'fan_in', 'normal')' w/ zero'd bias
                 nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='linear')
                 if m.bias is not None:
                     nn.init.zeros_(m.bias)
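The comment removed in this hunk equated the PyTorch initializer with Haiku's `VarianceScaling(1.0, 'fan_in', 'normal')`. A minimal sketch of why the two should agree is given below. It assumes that `kaiming_normal_` draws from a normal distribution with standard deviation `gain / sqrt(fan_in)`, with a gain of 1 for `nonlinearity='linear'`, and that variance scaling uses `sqrt(scale / fan_in)`. The helper names are hypothetical, and only the standard deviations are computed, so no weights are sampled.

```python
import math


def fan_in_of(weight_shape):
    """fan_in for a conv weight shaped (out_chs, in_chs // groups, kH, kW)."""
    receptive_field = 1
    for k in weight_shape[2:]:
        receptive_field *= k
    return weight_shape[1] * receptive_field


def kaiming_linear_std(weight_shape):
    """Std assumed for kaiming_normal_(mode='fan_in', nonlinearity='linear')."""
    gain = 1.0  # gain for a linear (identity) nonlinearity
    return gain / math.sqrt(fan_in_of(weight_shape))


def variance_scaling_std(weight_shape, scale=1.0):
    """Std assumed for variance scaling with mode 'fan_in' and a normal distribution."""
    return math.sqrt(scale / fan_in_of(weight_shape))


if __name__ == "__main__":
    shape = (64, 3, 7, 7)  # example 7x7 stem conv, chosen for illustration
    print(kaiming_linear_std(shape), variance_scaling_std(shape))
```

With `scale=1.0` the two expressions reduce to the same value, `1 / sqrt(fan_in)`, which matches the equivalence the removed comment described.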
