[computer vision] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

PDF

Summary

Faster R-CNN introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network. Sharing these features makes region proposals nearly cost-free, which reduces overall object detection time. With the proposed RPN, the full system runs at about 5 FPS on a GPU with the deep VGG-16 model.

Architecture

Training

A 4-step training algorithm is adopted:

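Fine-tune the RPN, initialized with an ImageNet-pre-trained model, on the region proposal task using the labeled bounding boxes.
Train a separate detection network with Fast R-CNN, also initialized with an ImageNet-pre-trained model, using the proposals generated by the step 1 RPN. At this point the two networks do not yet share convolutional layers.
Initialize the RPN from the step 2 detection network, fix the shared convolutional layers, and fine-tune only the layers unique to the RPN.
Keep the shared convolutional layers fixed and fine-tune only the layers unique to Fast R-CNN.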
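The four steps can be summarized as a small schedule of which parameter groups are updated and which stay frozen at each step. The following is only a sketch for keeping the bookkeeping straight, not the authors' training code: the group names (shared_conv, rpn_layers, fast_rcnn_layers), the Step record, and the frozen_groups helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical parameter-group names, used only for this illustration.
SHARED, RPN_HEAD, DET_HEAD = "shared_conv", "rpn_layers", "fast_rcnn_layers"

# Parameter groups that make up each network.
NETWORK_GROUPS = {
    "RPN": frozenset({SHARED, RPN_HEAD}),
    "Fast R-CNN": frozenset({SHARED, DET_HEAD}),
}


@dataclass(frozen=True)
class Step:
    number: int
    trains: str           # network optimized in this step
    init_from: str        # source of the convolutional weights
    trainable: frozenset  # parameter groups that receive updates


SCHEDULE = (
    Step(1, "RPN", "ImageNet-pre-trained model", frozenset({SHARED, RPN_HEAD})),
    Step(2, "Fast R-CNN", "ImageNet-pre-trained model", frozenset({SHARED, DET_HEAD})),
    Step(3, "RPN", "step 2 detection network", frozenset({RPN_HEAD})),
    Step(4, "Fast R-CNN", "shared layers kept from step 3", frozenset({DET_HEAD})),
)


def frozen_groups(step):
    """Return the groups of the trained network that are held fixed."""
    return NETWORK_GROUPS[step.trains] - step.trainable


for step in SCHEDULE:
    frozen = ", ".join(sorted(frozen_groups(step))) or "none"
    print(f"step {step.number}: train {step.trains} "
          f"(init: {step.init_from}); frozen: {frozen}")
```

In steps 1 and 2 nothing is frozen, so the two networks end up with different convolutional weights. Only from step 3 onward, when the shared layers are fixed, do both networks use the same convolutional features.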

A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN

Summary

Architecture

Training

Related material