Steam Filtered Image Data
This repository provides details about image data:
- downloaded from Steam,
- filtered following parts of the procedure in download-steam-banners-data.
A dataset, called Steam-OneFace, is shared in the "Steam-OneFace dataset" section.
Data
Data downloaded from Steam consists of:
- PNG logos with transparency,
- JPG vertical banners.
Filtered image data is shared on Google Drive.
The up-to-date list of appIDs for which I have tried to download image data is available in data/download-queries/app_ids.txt. Most of these ~48k appIDs do not feature any image data.
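As a minimal sketch, the list of appIDs could be loaded as follows. The helper name `load_app_ids` is hypothetical, and the code assumes that the file stores one integer appID per line, which the text above does not state.

```python
from pathlib import Path


def load_app_ids(path):
    """Read appIDs from a text file, assuming one integer per line."""
    app_ids = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:  # skip blank lines
            app_ids.append(int(line))
    return app_ids
```

From the root of the repository, this would be called as `load_app_ids('data/download-queries/app_ids.txt')`.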
Here is an example of a vertical banner:
Here is an example of a logo:
Filtering
Filtering consists of removing:
- blank images, i.e. images for which grayscale image intensity extrema are equal,
- images of uncommon resolution:
  - anything but 640x360 for logos,
  - anything but 300x450 for vertical banners,
- images with uncommon bands:
  - anything but RGBA for logos,
  - anything but RGB for banners,
- images for which the bounding box of the non-zero regions covers:
  - 100% of the image space for transparent logos,
  - strictly less than 100% of the image space for banners, which happens with vignetting.
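The four rules above can be sketched with Pillow as follows. This is an illustration under assumptions, not the code of the original notebooks: the use of Pillow, the function name `keep_image`, and the `is_logo` flag are all choices made for this example.

```python
from PIL import Image

# Expected (width, height) for each image type, as listed above.
LOGO_SIZE, BANNER_SIZE = (640, 360), (300, 450)


def keep_image(img, is_logo):
    """Return True if a PIL image passes the four filters listed above."""
    # Blank image: grayscale minimum equals grayscale maximum.
    lo, hi = img.convert('L').getextrema()
    if lo == hi:
        return False
    # Uncommon resolution (Pillow reports size as (width, height)).
    if img.size != (LOGO_SIZE if is_logo else BANNER_SIZE):
        return False
    # Uncommon bands.
    if img.mode != ('RGBA' if is_logo else 'RGB'):
        return False
    # Bounding box of the non-zero regions: a logo should leave some
    # empty space, whereas a banner should cover the whole image.
    covers_all = img.getbbox() == (0, 0) + img.size
    return (not covers_all) if is_logo else covers_all
```

An image loaded with `Image.open(path)` can be passed directly, e.g. `keep_image(Image.open(path), is_logo=False)` for a banner.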
Suggestions of filtering
Suggestions of filtering include:
- removal of duplicate images with imagededup,
- filtering based on the number of detected faces, as in steam-face-detection.
The enforcement of such filtering is left to the reader. Otherwise, it would be difficult to keep filtered data up-to-date.
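For the face-based suggestion, the selection step could look like the sketch below. It assumes that a detector has already been run and that its results are available as a mapping from appID to a face count; `keep_single_face` and the sample values are hypothetical.

```python
def keep_single_face(face_counts):
    """Keep appIDs whose banner has exactly one detected face.

    `face_counts` is a hypothetical mapping from appID to the number of
    faces reported by whichever detector was run beforehand.
    """
    return sorted(app_id for app_id, n in face_counts.items() if n == 1)


counts = {10: 0, 20: 1, 730: 3, 440: 1}  # made-up values for illustration
print(keep_single_face(counts))  # [20, 440]
```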
Steam-OneFace dataset
The notebook build_steam_oneface_dataset.ipynb shows an application of the filters suggested above.
There are three options for the face detector:
With the face_alignment module
This detector makes it possible to build a dataset, called Steam-OneFace, of 1688 images which should all feature exactly one face.
This dataset is shared on Google Drive in:
- the original resolution (300x450): steam-oneface-hr.tar.gz (94 MB),
- a lower resolution (256x256): steam-oneface-lr.tar.gz (52 MB).
To use this dataset on Google Colab, run the following:
!gdown 1QptHrW9vloTtP--YJsxMY8PZWI2D8NJt
!tar xf steam-oneface-lr.tar.gz
import glob
from pathlib import Path
# Each file is named after its appID, e.g. <appID>.jpg
file_names = glob.glob('steam-oneface-lr/*.jpg')
app_ids = [int(Path(fname).stem) for fname in file_names]
With the retinaface module
The dataset consists of 2472 images, shared in:
- the original resolution (300x450): steam-oneface-hr_with_retinaface.tar.gz (133 MB),
- a lower resolution (256x256): steam-oneface-lr_with_retinaface.tar.gz (74 MB).
To use this dataset on Google Colab, run the following:
!gdown 1-0Nk7H6Cn3Nt60EdHG_NWSA8ohi2oBqr
!tar xf steam-oneface-lr_with_retinaface.tar.gz
With the dlib module
The dataset consists of 305 images, shared in:
- the original resolution (300x450): steam-oneface-hr_with_dlib.tar.gz (16 MB),
- a lower resolution (256x256): steam-oneface-lr_with_dlib.tar.gz (9 MB).
To use this dataset on Google Colab, run the following:
!gdown 1-4RIn9G9Bee2JZ1bK1gkkgkLocHuWJJ4
!tar xf steam-oneface-lr_with_dlib.tar.gz
With several detection modules
The notebook trim_steam_oneface_dataset.ipynb trims the dataset by intersecting the results of different detectors.
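The intersection can be sketched as below. This is not the notebook's code: the helper names are hypothetical, and the sketch assumes that each extracted archive is a folder of JPG files named after their appID, as in the `steam-oneface-lr` folder.

```python
from pathlib import Path


def app_ids_in(folder):
    """Collect appIDs from the JPG file names (e.g. 730.jpg) in a folder."""
    return {int(p.stem) for p in Path(folder).glob('*.jpg')}


def intersect_datasets(*folders):
    """Return the sorted appIDs present in every given folder."""
    id_sets = [app_ids_in(folder) for folder in folders]
    return sorted(set.intersection(*id_sets)) if id_sets else []
```

Passing the folders obtained with two detectors returns the appIDs on which both detectors agree; passing three folders gives the stricter intersection.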
The trimmed datasets are:
Steam-OneFace-small
Steam-OneFace-small:
- 993 images,
- obtained with modules face_alignment and retinaface.
To use this dataset on Google Colab, run the following:
!gdown 1-1V5fDhPo75iDtAbrD18rppV-lf51bPW
!tar xf steam-oneface-small-lr.tar.gz
Steam-OneFace-tiny
Steam-OneFace-tiny:
- 168 images,
- obtained with modules dlib, face_alignment and retinaface.
To use this dataset on Google Colab, run the following:
!gdown 1-2sCVgBUmu6LFug1pzBfmL8zNFFBq27F
!tar xf steam-oneface-tiny-lr.tar.gz
References
- To download images: download_steam_banners.ipynb in woctezuma/google-colab
- To filter out duplicates, etc.:
  - for PNG logos: remove_duplicates.ipynb in woctezuma/google-colab
  - for JPG banners: remove_duplicates.ipynb in woctezuma/steam-stylegan2-ada
- To detect faces: detect_faces_on_steam_banners.ipynb in woctezuma/steam-face-detection