Keras implementation of PSPNet (caffe)
An implementation of the Pyramid Scene Parsing Network (PSPNet) architecture in Keras.
For the best compatibility, please use Python 3.5.
Setup
- Install dependencies:
  - TensorFlow (-gpu)
  - Keras
  - numpy
  - scipy
  - pycaffe (PSPNet) (optional, only needed for converting the weights)
pip install -r requirements.txt --upgrade
- Converted trained weights are needed to run the network.
Weights (in .h5 and .json format) have to be downloaded and placed into the directory weights/keras
Already converted weights can be downloaded here:
- pspnet50_ade20k.h5 pspnet50_ade20k.json
- pspnet101_cityscapes.h5 pspnet101_cityscapes.json
- pspnet101_voc2012.h5 pspnet101_voc2012.json
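Before running the network it can help to confirm that both files for a model are in place. The following is a minimal sketch, not part of this repository: the helper name missing_weight_files is hypothetical, and it assumes the files keep the names listed above and sit directly in weights/keras.

```python
from pathlib import Path

# Model names as listed in this README.
MODELS = ("pspnet50_ade20k", "pspnet101_cityscapes", "pspnet101_voc2012")


def missing_weight_files(model, weights_dir="weights/keras"):
    """Return the expected weight files for `model` that are not on disk.

    Hypothetical helper: assumes <model>.h5 and <model>.json live in weights_dir.
    """
    if model not in MODELS:
        raise ValueError("unknown model: {!r}".format(model))
    expected = [Path(weights_dir) / (model + ext) for ext in (".h5", ".json")]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    for name in MODELS:
        missing = missing_weight_files(name)
        status = "ok" if not missing else "missing " + ", ".join(map(str, missing))
        print("{}: {}".format(name, status))
```

An empty list means both the .h5 and the .json file were found for that model.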
Convert weights yourself (optional)
(Note: this is not required if you use the .h5/.json weights)
This requires the compiled original PSPNet caffe code and pycaffe.
python weight_converter.py <path to .prototxt> <path to .caffemodel>
Usage:
python pspnet.py -m <model> -i <input_image> -o <output_path>
python pspnet.py -m pspnet101_cityscapes -i example_images/cityscapes.png -o example_results/cityscapes.jpg
python pspnet.py -m pspnet101_voc2012 -i example_images/pascal_voc.jpg -o example_results/pascal_voc.jpg
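To segment a whole folder, the command above can be wrapped in a small script. This is a rough sketch under assumptions: build_command and run_on_directory are hypothetical helpers that are not shipped with this repository, only the -m, -i and -o flags from the usage line are used, and results are written as .jpg as in the examples.

```python
import subprocess
import sys
from pathlib import Path


def build_command(model, input_image, output_path):
    """Build the pspnet.py command line shown in the usage section."""
    return [sys.executable, "pspnet.py", "-m", model,
            "-i", str(input_image), "-o", str(output_path)]


def run_on_directory(model, input_dir, output_dir, dry_run=True):
    """Build (and optionally run) one pspnet.py call per .jpg/.png in input_dir."""
    output_dir = Path(output_dir)
    commands = []
    for image in sorted(Path(input_dir).iterdir()):
        if image.suffix.lower() not in (".jpg", ".png"):
            continue  # skip anything that is not an image
        result = output_dir / (image.stem + ".jpg")
        commands.append(build_command(model, image, result))
    if not dry_run:
        output_dir.mkdir(parents=True, exist_ok=True)
        for command in commands:
            subprocess.check_call(command)  # raises if pspnet.py fails
    return commands
```

With dry_run=True the function only returns the command lists, which makes it easy to inspect them first. Each call starts a new process and therefore reloads the model, so this is convenient rather than fast.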
List of arguments:
-m --model - which model to use: 'pspnet50_ade20k', 'pspnet101_cityscapes', 'pspnet101_voc2012'
--id - (int) GPU Device id. Default 0
-s --sliding - Use sliding window
-f --flip - Additional prediction of flipped image
-ms --multi_scale - Predict on multiscale images
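For reference, the flags above could be declared with argparse roughly as follows. This is an illustrative sketch, not the actual parser in pspnet.py: making -m, -i and -o required is an assumption, and so are the dest names input_image and output_path (only the short forms -i and -o appear in the usage line).

```python
import argparse

# Model names as listed in the argument description above.
MODELS = ["pspnet50_ade20k", "pspnet101_cityscapes", "pspnet101_voc2012"]


def build_parser():
    """Sketch of a parser for the documented flags (not the real pspnet.py code)."""
    parser = argparse.ArgumentParser(description="PSPNet inference (illustrative)")
    parser.add_argument("-m", "--model", choices=MODELS, required=True,
                        help="which model to use")
    # Assumed dest names; the README only documents the short forms -i and -o.
    parser.add_argument("-i", dest="input_image", required=True)
    parser.add_argument("-o", dest="output_path", required=True)
    parser.add_argument("--id", type=int, default=0, help="GPU device id")
    parser.add_argument("-s", "--sliding", action="store_true",
                        help="use sliding window")
    parser.add_argument("-f", "--flip", action="store_true",
                        help="additional prediction of flipped image")
    parser.add_argument("-ms", "--multi_scale", action="store_true",
                        help="predict on multiscale images")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["-m", "pspnet101_voc2012", "-i", "in.jpg", "-o", "out.jpg", "-f"])
print(args.model, args.flip, args.id)
```

The three switches are plain on/off flags, so they can be combined freely, for example -s -f -ms.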
Keras results:
Implementation details
- The interpolation layer is implemented as the custom layer "Interp".
- A forward step takes about 1 second for a single image.
- GPU memory usage can be capped (here at 30% of the device memory) with:
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
sess = tf.Session(config=config)
- ndimage.zoom can take a long time.
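One knob that affects the cost of ndimage.zoom is the spline order: scipy defaults to order=3 (cubic), while order=1 (linear) is cheaper. The sketch below is only an illustration of that parameter. The helper upscale_scores and the array shapes are assumptions, and it does not show how this repository actually calls ndimage.zoom.

```python
import time

import numpy as np
from scipy import ndimage


def upscale_scores(scores, target_hw, order=1):
    """Resize an (H, W, C) score map to target_hw with ndimage.zoom.

    order=1 is linear interpolation; scipy's default is order=3 (cubic
    spline), which is smoother but slower.
    """
    height, width = scores.shape[:2]
    # Zoom only the spatial axes; the channel axis keeps a factor of 1.
    factors = (target_hw[0] / height, target_hw[1] / width, 1.0)
    return ndimage.zoom(scores, factors, order=order)


if __name__ == "__main__":
    # Illustrative shapes only: a 60x60 map with 21 channels, upscaled to 473x473.
    scores = np.random.rand(60, 60, 21).astype(np.float32)
    for order in (1, 3):
        start = time.time()
        out = upscale_scores(scores, (473, 473), order=order)
        print("order={}: {} in {:.2f}s".format(order, out.shape, time.time() - start))
```

Whether linear interpolation is accurate enough depends on the use case, so it is worth comparing the resulting label maps before switching.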