Venue
German Conference on Pattern Recognition
Although large-scale labeled data are essential for deep convolutional neural networks (ConvNets) to learn high-level semantic visual representations, collecting and annotating large-scale datasets is time-consuming and often impractical. This study proposes ScaleNet, a simple and efficient unsupervised representation learning method based on multi-scale images, to enhance the performance of ConvNets when limited information is available. The input images are first resized to a smaller size and fed to the ConvNet to recognize the rotation degree. Next, the ConvNet learns the rotation-prediction task for the original-size images based on the parameters transferred from the previous model. The CIFAR-10 and ImageNet datasets are examined on different architectures such as AlexNet and ResNet50. The study demonstrates that specific image features, such as Harris corner information, play a critical role in the efficiency of the rotation-prediction task. ScaleNet outperforms RotNet by ≈7% on the limited CIFAR-10 dataset. The parameters transferred from a ScaleNet model trained with limited data improve the ImageNet classification task by about 6% compared to the RotNet model. This study also shows that ScaleNet can improve other cutting-edge models, such as SimCLR, by learning effective features for classification tasks.
This study proposes a novel unsupervised representation learning method called ScaleNet, optimized for situations with limited information. The approach enhances the performance of convolutional neural networks (ConvNets) by utilizing a multi-scale image framework that trains the network to recognize and predict the rotation of images. The paper evaluates ScaleNet on the CIFAR-10 and ImageNet datasets using architectures like AlexNet and ResNet50. The results indicate that ScaleNet outperforms existing methods like RotNet and SimCLR, particularly in limited data conditions, by effectively learning high-level visual representations. Specifically, the ScaleNet demonstrates a 7% improvement over RotNet on CIFAR-10 and a 6% boost in performance for ImageNet classification tasks. Key findings suggest that the inclusion of specific features, such as Harris corner information, is crucial for improving efficiency in rotation-prediction tasks, demonstrating the potential of ScaleNet for advancing self-supervised learning methodologies in data-scarce environments.
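The two-stage procedure described above (predict rotations on downscaled images first, then transfer to the original scale) can be sketched with a minimal pretext-task data pipeline. This is an illustrative sketch only, not the authors' implementation: the `make_rotation_batch` and `downscale` helpers are hypothetical names, and the average-pool downscaling stands in for whatever resizing the paper actually uses.

```python
import numpy as np

def make_rotation_batch(image):
    """Create the four rotated copies (0°, 90°, 180°, 270°) of an image,
    paired with the rotation-class labels used as self-supervised targets."""
    rotations = [np.rot90(image, k=k) for k in range(4)]
    labels = np.arange(4)  # 0 -> 0°, 1 -> 90°, 2 -> 180°, 3 -> 270°
    return np.stack(rotations), labels

def downscale(image, factor=2):
    """Naive average-pool downscaling to build the smaller-scale input
    for the first ScaleNet training stage (illustrative only)."""
    h, w, c = image.shape
    return image.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

# Example: a dummy 32x32 RGB image (CIFAR-10 sized).
img = np.random.rand(32, 32, 3)
small = downscale(img)                 # 16x16 image for stage one
batch, labels = make_rotation_batch(small)
print(batch.shape, labels.tolist())    # (4, 16, 16, 3) [0, 1, 2, 3]
```

In the full method, a ConvNet would first be trained to classify these four rotation labels on the downscaled batch, and its weights would then initialize a second rotation-prediction run on the original-size images.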
This paper employs the following methods:
- Self-supervised Learning
- Representation Learning
- ScaleNet
- RotNet
- SimCLR
- AlexNet
- ResNet50
The following datasets were used in this research:
- CIFAR-10
- ImageNet
The key results are as follows:
- ScaleNet outperforms RotNet by ≈ 7% on CIFAR-10
- ScaleNet improves ImageNet classification by 6% compared to RotNet
- SimCLR's performance enhanced by ∼4% when combined with ScaleNet
The authors identified the following limitations:
- Limited availability of labeled data
- Challenges in capturing high-level representations with small datasets
- Dependence on specific features like corner information
The following hardware was used:
- Number of GPUs: 2
- GPU Type: NVIDIA K80, RTX 2070
Keywords:
- ScaleNet
- unsupervised learning
- self-supervised learning
- rotation prediction
- multi-scale images