figure 1: classic object detection networks

model

figure 2: comparison between the YOLO and SSD architectures

details

Multi-scale feature maps for detection
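To make the multi-scale idea concrete, here is a minimal sketch that counts the default boxes contributed by each feature map. The layer sizes and boxes-per-location values are assumptions taken from the commonly cited SSD300 configuration, not from these notes, and `total_default_boxes` is a hypothetical helper name.

```python
# (feature map size, default boxes per location).
# Assumed SSD300-style values; they are not stated in these notes.
LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def total_default_boxes(layers):
    """Sum the default boxes over all square feature maps."""
    return sum(size * size * k for size, k in layers)


print(total_default_boxes(LAYERS))  # 8732
```

The large early maps supply most of the boxes and cover small objects; the small late maps cover large objects with only a handful of boxes.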

Convolutional predictors for detection: small 3x3 kernels applied to each feature map predict class scores and box offsets relative to the default boxes

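Default boxes and aspect ratios: each feature map gets its own box scale, and each location has 4 or 6 default boxes with different aspect ratios. For an m x n feature map with k default boxes per location and c classes, the predictors produce (c+4)kmn outputs.

figure 3: SSD working framework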
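The scale and aspect-ratio scheme can be sketched as below. The linear spacing of scales between `s_min=0.2` and `s_max=0.9`, the extra square box, and all function names are assumptions based on the usual SSD description, not on anything stated in these notes; only the (c+4)kmn count comes from the notes.

```python
import math


def default_box_scales(num_layers, s_min=0.2, s_max=0.9):
    """Linearly spaced scales, one per feature map (s_min/s_max are assumed)."""
    if num_layers == 1:
        return [s_min]
    step = (s_max - s_min) / (num_layers - 1)
    return [s_min + step * k for k in range(num_layers)]


def box_shapes(scale, next_scale, aspect_ratios):
    """(width, height) pairs for one location, relative to the image size."""
    shapes = [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]
    # Extra square box at the geometric mean of this scale and the next one.
    extra = math.sqrt(scale * next_scale)
    shapes.append((extra, extra))
    return shapes


def num_outputs(m, n, k, c):
    """Outputs for an m x n map: k boxes per cell, c scores + 4 offsets each."""
    return (c + 4) * k * m * n


scales = default_box_scales(6)
shapes = box_shapes(scales[0], scales[1], [1.0, 2.0, 0.5])
print(len(shapes))                 # 4 boxes per location
print(num_outputs(38, 38, 4, 21))  # 144400
```

With aspect ratios {1, 2, 1/2} plus the extra square box there are 4 boxes per location; adding {3, 1/3} gives 6.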

loss

Matching strategy: each ground-truth box is matched to every default box whose IoU with it is higher than a threshold (0.5)
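A minimal sketch of the matching step, assuming boxes are `(xmin, ymin, xmax, ymax)` tuples. The threshold pass follows the note above; the second pass, which guarantees every ground-truth box at least one default box, is an added assumption that is common in SSD implementations. `iou` and `match` are hypothetical names.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match(default_boxes, gt_boxes, threshold=0.5):
    """For each default box, return the matched ground-truth index or -1."""
    matches = [-1] * len(default_boxes)
    # Threshold pass: a default box is positive if its best IoU exceeds 0.5.
    for d, db in enumerate(default_boxes):
        best_iou, best_g = 0.0, -1
        for g, gb in enumerate(gt_boxes):
            v = iou(db, gb)
            if v > best_iou:
                best_iou, best_g = v, g
        if best_iou > threshold:
            matches[d] = best_g
    # Assumed extra pass: every ground truth keeps its best default box.
    if default_boxes:
        for g, gb in enumerate(gt_boxes):
            best_d = max(range(len(default_boxes)),
                         key=lambda d: iou(default_boxes[d], gb))
            matches[best_d] = g
    return matches


defaults = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)]
gts = [(0, 0, 2, 2), (4.0, 4.0, 5.5, 5.5)]
print(match(defaults, gts))  # [0, -1, 1]
```

The third default box overlaps the second ground truth with IoU below 0.5, so it is matched only because of the extra pass.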

Loss function: L(conf) is a softmax loss over the class confidences; L(loc) is a Smooth L1 loss on the box offsets
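The two loss terms can be sketched in plain Python as follows. The way they are combined (weight `alpha=1.0`, division by the number of matched boxes, zero loss when nothing matches) is an assumption based on the usual SSD formulation and is not stated in these notes; all function names are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Softmax confidence loss for one box: -log p(label)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def smooth_l1(pred, target):
    """Smooth L1 loss summed over the box offsets."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def ssd_loss(conf_losses, loc_losses, num_pos, alpha=1.0):
    """Assumed combination: (sum L_conf + alpha * sum L_loc) / num_pos."""
    if num_pos == 0:
        return 0.0
    return (sum(conf_losses) + alpha * sum(loc_losses)) / num_pos


print(smooth_l1([0.5], [0.0]))  # 0.125 (quadratic region)
print(smooth_l1([2.0], [0.0]))  # 1.5 (linear region)
```

Smooth L1 is quadratic for small errors and linear for large ones, so outlier boxes do not dominate the localization gradient.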

trick

Hard negative mining: negatives are sorted by confidence loss and only the hardest are kept, so that the ratio of negatives to positives is at most 3:1
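A minimal sketch of the 3:1 selection, assuming per-box confidence losses and a positive/negative flag are already available. Ranking negatives by confidence loss is an assumption drawn from the usual SSD recipe; `hard_negative_mining` is a hypothetical name.

```python
def hard_negative_mining(conf_losses, is_positive, neg_pos_ratio=3):
    """Return indices of the negatives kept for the loss (hardest first cut)."""
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives are those with the highest confidence loss.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    keep = negatives[: neg_pos_ratio * num_pos]
    return sorted(keep)


losses = [0.1, 2.0, 0.3, 1.5, 0.05, 0.9]
positive = [True, False, False, False, False, False]
print(hard_negative_mining(losses, positive))  # [1, 3, 5]
```

With one positive box, only the three highest-loss negatives contribute to the confidence loss; the remaining easy negatives are ignored.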

Data augmentation: each training image is randomly sampled using one of the following options:

Use the entire original input image.

Sample a patch so that the minimum IOU overlap with the objects is 0.1, 0.3, 0.5, 0.7, or 0.9.

Randomly sample a patch.

Then every sampled patch is horizontally flipped with probability 0.5.

With this augmentation, mAP improves from 65.4% to 74.3%.
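The sampling options above can be sketched as follows, in normalized image coordinates. The patch size range (0.1 to 1 of the image), the aspect-ratio limit (1/2 to 2), the retry count, and the fallback to the full image are assumptions; the minimum-IoU option is read here as every object overlapping the patch by at least that IoU, which real implementations may interpret differently. All names are hypothetical.

```python
import random

MIN_IOUS = (0.1, 0.3, 0.5, 0.7, 0.9)


def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def sample_patch(gt_boxes, rng, max_tries=50):
    """Pick one sampling option; return (patch, flip)."""
    option = rng.choice(("full", "random") + MIN_IOUS)
    patch = (0.0, 0.0, 1.0, 1.0)  # entire image, also the fallback
    if option != "full":
        min_iou = 0.0 if option == "random" else option
        for _ in range(max_tries):
            w = rng.uniform(0.1, 1.0)
            h = rng.uniform(0.1, 1.0)
            if not 0.5 <= w / h <= 2.0:  # assumed aspect-ratio limit
                continue
            x = rng.uniform(0.0, 1.0 - w)
            y = rng.uniform(0.0, 1.0 - h)
            cand = (x, y, x + w, y + h)
            if all(iou(cand, g) >= min_iou for g in gt_boxes):
                patch = cand
                break
    flip = rng.random() < 0.5  # horizontal flip with probability 0.5
    return patch, flip


rng = random.Random(0)
patch, flip = sample_patch([(0.2, 0.2, 0.8, 0.8)], rng)
print(patch, flip)
```

A full pipeline would also crop the image to the patch, keep the ground-truth boxes that remain inside it, and mirror box coordinates when `flip` is true; those steps are omitted here.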

experiment result

figure 4: PASCAL VOC2007 test detection results

figure 5: PASCAL VOC2007 test detection results of different models

figure 6: Sensitivity and impact of different object characteristics on VOC2007 test set using

figure 7: Detection examples on COCO test-dev with SSD512 model

model analysis

Data augmentation is crucial

More default box shapes are better

Atrous convolution is faster

Multiple output layers at different resolutions are better

figure 8: Effects of various design choices and components on SSD performance

figure 9: Effects of using multiple output layers

conclusion

The core of SSD (Single Shot MultiBox Detector): predicting category scores and box offsets for a fixed set of default bounding boxes across multiple feature-map scales

Faster and more accurate than earlier single-shot detectors such as YOLO
