ImageNet subset generator
Generate subsets from the original ImageNet1K dataset. Some commonly used subsets:
- SimclrV2 10% subset
- SemiViT 10% subset
- SimclrV2 1% subset
- SemiViT 1% subset
- Extreme low-shot subsets from MSN
Usage
git clone https://github.com/BenediktAlkin/ImageNetSubsetGenerator
cd ImageNetSubsetGenerator
Generate subset
python main_subset.py --in1k_path <ImageNet1K_path> --out_path <out_path> --version in100_sololearn
- this will copy the corresponding samples from <ImageNet1K_path> to <out_path>
- it can then be readily used with e.g. torchvision ImageFolder
subset = ImageFolder(root=<out_path>)
For example: python main_subset.py --in1k_path /data/imagenet1k --out_path /data/imagenet1k_10percent_simclrv2 --version in1k_10percent_simclrv2
You can find all supported versions here or via python main_subset.py --help.
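The copy step described above can be pictured with a minimal sketch. This is not the code of main_subset.py; it assumes a subset is given as a list of training file names such as n01440764_10026.JPEG, where the part before the first underscore is the class folder. The function name copy_subset and its parameters are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable


def copy_subset(in1k_train: Path, out_train: Path, filenames: Iterable[str]) -> int:
    """Copy the listed images into a class-per-folder tree and return the count.

    Assumption: every name looks like "<wnid>_<id>.JPEG", so the class folder
    is the part before the first underscore.
    """
    copied = 0
    for name in filenames:
        wnid = name.split("_", 1)[0]
        src = in1k_train / wnid / name
        dst = out_train / wnid / name
        # Create the class folder on first use.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        copied += 1
    return copied
```

Because the output keeps one folder per class, a tree produced this way has the layout that torchvision's ImageFolder expects.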
Check classes/samples of dataset
python main_statistics.py <path>
train n_classes: 1000
valid n_classes: 1000
train n_samples: 1282169
valid n_samples: 50000
train classes: ['n01440764', ...]
valid classes: ['n01440764', ...]
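If you want the same kind of numbers from your own code, the following is a rough sketch of how they could be computed. It is not the implementation of main_statistics.py; it assumes each split is a folder with one sub-folder per class, and the helper name split_statistics is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def split_statistics(split_dir: Path) -> Tuple[List[str], int]:
    """Return (sorted class folder names, total number of files) for one split."""
    classes = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    n_samples = sum(
        1
        for cls in classes
        for f in (split_dir / cls).iterdir()
        if f.is_file()
    )
    return classes, n_samples
```

Calling split_statistics on the train folder of a generated subset and comparing the result with the numbers printed by main_statistics.py is a quick sanity check that the copy completed.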