EfficientNet

From Wikipedia, the free encyclopedia
EfficientNet
Developer: Google AI
Initial release: May 2019
Repository: github.com/tensorflow/tpu/tree/master/models/official/efficientnet
Written in: Python
License: Apache License 2.0
Website: Google AI Blog

    EfficientNet is a family of convolutional neural networks (CNNs) for computer vision published by researchers at Google AI in 2019.[1] Its key innovation is compound scaling, which uniformly scales all dimensions of depth, width, and resolution using a single parameter.

    EfficientNet models have been adopted in various computer vision tasks, including image classification, object detection, and segmentation.

    Compound scaling

    EfficientNet introduces compound scaling: instead of scaling one dimension of the network at a time, such as depth (number of layers), width (number of channels), or resolution (input image size), it uses a compound coefficient $\phi$ to scale all three dimensions simultaneously. Specifically, given a baseline network, the depth, width, and resolution are scaled according to the following equations:[1]

    depth multiplier: $d = \alpha^{\phi}$
    width multiplier: $w = \beta^{\phi}$
    resolution multiplier: $r = \gamma^{\phi}$

    subject to $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$ and $\alpha \geq 1$, $\beta \geq 1$, $\gamma \geq 1$. Because FLOPs grow roughly in proportion to $d \cdot w^{2} \cdot r^{2}$, this condition means that increasing $\phi$ by $\phi_0$ increases the total FLOPs of running the network on an image by a factor of approximately $2^{\phi_0}$. The hyperparameters $\alpha$, $\beta$, and $\gamma$ are determined by a small grid search. The original paper suggested 1.2, 1.1, and 1.15, respectively.
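
    The scaling rule can be checked numerically. The following is a minimal sketch, not code from the official repository: the function names are chosen here for illustration, and the FLOPs estimate assumes the simple proportionality to depth times width squared times resolution squared.

```python
# Coefficients suggested in the original paper (found by a small grid search).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_multipliers(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi


def flops_multiplier(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Approximate FLOPs growth, assuming FLOPs scale as d * w^2 * r^2."""
    d, w, r = compound_multipliers(phi, alpha, beta, gamma)
    return d * w ** 2 * r ** 2


# The constraint alpha * beta^2 * gamma^2 is the FLOPs multiplier at phi = 1.
print(f"alpha * beta^2 * gamma^2 = {flops_multiplier(1):.3f}")

for phi in range(4):
    d, w, r = compound_multipliers(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{flops_multiplier(phi):.2f}")
```

    With the paper's coefficients the product is about 1.92, close to the target of 2, so each unit increase of the compound coefficient roughly doubles the estimated FLOPs.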

    Architecturally, the authors optimized the choice of modules by neural architecture search (NAS), and found that the inverted bottleneck convolution used in MobileNet (which they called MBConv) worked well.
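
    To make the shape of an inverted bottleneck concrete, the sketch below counts the convolution weights of one such block: a 1×1 expansion, a k×k depthwise convolution, and a 1×1 projection. It is an illustration under simplifying assumptions, not the reference implementation: the function names and the example channel counts are hypothetical, and batch normalization, biases, and the squeeze-and-excitation step are left out.

```python
def mbconv_weight_count(c_in, c_out, expand_ratio=6, kernel_size=3):
    """Count convolution weights in a simplified inverted bottleneck block.

    Assumption: only the three convolutions are counted (no batch norm,
    no biases, no squeeze-and-excitation).
    """
    c_mid = c_in * expand_ratio
    # 1x1 expansion is skipped when the expansion ratio is 1.
    expand = c_in * c_mid if expand_ratio != 1 else 0
    # Depthwise convolution: one k x k filter per expanded channel.
    depthwise = kernel_size * kernel_size * c_mid
    # 1x1 projection back down to the (narrow) output width.
    project = c_mid * c_out
    return expand + depthwise + project


def dense_conv_weight_count(c_in, c_out, kernel_size=3):
    """Weights of an ordinary k x k convolution, for comparison."""
    return kernel_size * kernel_size * c_in * c_out


# Hypothetical example: 40 channels in and out, expansion ratio 6, 3x3 kernel.
block = mbconv_weight_count(40, 40, expand_ratio=6, kernel_size=3)
dense_at_expanded_width = dense_conv_weight_count(240, 240, kernel_size=3)
print("inverted bottleneck weights:", block)
print("dense 3x3 conv at the expanded width:", dense_at_expanded_width)
```

    The comparison shows the design choice: the spatial convolution runs on the wide, expanded representation, but because it is depthwise its cost stays small relative to a dense convolution at the same width.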

    Each model in the EfficientNet family is a stack of MBConv layers, with shapes determined by the compound scaling. The original publication introduced eight models, EfficientNet-B0 through EfficientNet-B7, in order of increasing model size and accuracy. EfficientNet-B0 is the baseline network, and the subsequent models are obtained by scaling it up with increasing values of $\phi$.
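
    Applying the multipliers to a concrete network requires turning fractional widths and depths into integers. The sketch below shows one way to do this. Everything in it is an assumption made for illustration: the toy baseline stages are hypothetical (they are not the B0 specification), widths are rounded to a multiple of 8, and repeat counts are rounded up.

```python
import math

# Hypothetical toy baseline: (output channels, number of repeated layers) per stage.
TOY_BASELINE = [(16, 1), (24, 2), (40, 2)]


def round_channels(channels, width_mult, divisor=8):
    """Scale a channel count and round to a multiple of `divisor` (assumed convention)."""
    scaled = channels * width_mult
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Avoid rounding down by more than 10% of the scaled width.
    if rounded < 0.9 * scaled:
        rounded += divisor
    return rounded


def round_repeats(repeats, depth_mult):
    """Scale a layer count, rounding up so a stage never loses layers."""
    return math.ceil(repeats * depth_mult)


def scale_network(stages, base_resolution, phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (scaled stages, scaled input resolution) for compound coefficient phi."""
    d, w, r = alpha ** phi, beta ** phi, gamma ** phi
    scaled = [(round_channels(c, w), round_repeats(n, d)) for c, n in stages]
    return scaled, round(base_resolution * r)


for phi in range(4):
    stages, resolution = scale_network(TOY_BASELINE, 224, phi)
    print(f"phi={phi}: resolution={resolution}, stages={stages}")
```

    With a coefficient of 0 the function returns the baseline unchanged, and larger coefficients yield wider, deeper stages and a larger input image.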

    Variants

    EfficientNet has been adapted for fast inference on edge TPUs[2] and centralized TPU or GPU clusters by NAS.[3]

    EfficientNet V2 was published in June 2021. The architecture was improved by a further neural architecture search over more types of convolutional layers.[4] It also introduced a training method that progressively increases image size during training and uses regularization techniques such as dropout, RandAugment,[5] and Mixup.[6] The authors claim this approach mitigates the accuracy drops often associated with progressive resizing.
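
    A progressive training schedule of this kind can be sketched as follows. This is an illustration only: it assumes training is split into a fixed number of stages, that image size and dropout rate are both interpolated linearly from a start value to an end value, and that regularization is strengthened as images grow. The function name and the example values are hypothetical.

```python
def progressive_schedule(num_stages, size_range, dropout_range):
    """Return a list of (image_size, dropout_rate), one entry per training stage.

    Assumption: both quantities grow linearly from their start to end values.
    """
    (size_start, size_end) = size_range
    (drop_start, drop_end) = dropout_range
    schedule = []
    for stage in range(num_stages):
        # Fraction of the way through training, from 0.0 to 1.0.
        t = stage / (num_stages - 1) if num_stages > 1 else 1.0
        image_size = int(size_start + (size_end - size_start) * t)
        dropout = drop_start + (drop_end - drop_start) * t
        schedule.append((image_size, dropout))
    return schedule


# Hypothetical example: four stages, images from 128 to 300 pixels,
# dropout from 0.1 to 0.3.
for stage, (image_size, dropout) in enumerate(progressive_schedule(4, (128, 300), (0.1, 0.3))):
    print(f"stage {stage}: image size {image_size}, dropout {dropout:.2f}")
```

    Early stages train quickly on small, lightly regularized images, and later stages use larger images with stronger regularization.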
