4.2 Grad-CAM++

Grad-CAM++ is an extension of Grad-CAM, which we covered earlier. Grad-CAM is not good at localizing multiple objects of the same class in a single image, and for such images its heatmap often does not capture each object in its entirety. Covering the whole object matters for recognition tasks, and Grad-CAM++ is designed to address these shortcomings.

../../../_images/gradcampp.png

Grad-CAM++ Architecture

Grad-CAM++ applies a pixel-wise weighting to the gradients of the output with respect to each spatial position in the final convolutional feature map. This gives a measure of the importance of each pixel in the feature map towards the overall decision of the CNN.
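The pixel-wise weighting described above can be sketched in NumPy. This is a minimal illustration, not the implementation used to produce the figures on this page. It assumes the commonly used closed form in which the class output is the exponential of the class score and the network is piecewise linear (ReLU activations), so that the higher-order derivatives reduce to powers of the first-order gradient. The function name `grad_cam_pp` and its arguments are hypothetical; `activations` and `grads` are assumed to have been extracted from the network already.

```python
import numpy as np


def grad_cam_pp(activations, grads, eps=1e-8):
    """Sketch of the Grad-CAM++ map for one image and one class.

    activations: array (K, H, W), feature maps A^k of the chosen layer.
    grads:       array (K, H, W), gradient of the class score w.r.t. A^k.
    Returns (cam, weights): cam has shape (H, W) scaled to [0, 1],
    weights has shape (K,).
    """
    g2 = grads ** 2
    g3 = grads ** 3
    # Sum of every feature map over its spatial positions, shape (K, 1, 1).
    act_sum = activations.sum(axis=(1, 2), keepdims=True)
    # Denominator of the pixel-wise coefficients alpha (assumed closed form).
    denom = 2.0 * g2 + act_sum * g3
    # Guard against division by (near) zero; this choice is an assumption.
    denom = np.where(np.abs(denom) > eps, denom, 1.0)
    alpha = g2 / denom
    # Channel weights: alpha-weighted sum of the positive gradients.
    # The constant positive factor exp(score) is omitted here because it
    # only rescales the map and disappears after normalization.
    weights = (alpha * np.maximum(grads, 0.0)).sum(axis=(1, 2))
    # Weighted combination of the feature maps, followed by ReLU.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    cam_max = cam.max()
    if cam_max > 0:
        cam = cam / cam_max
    return cam, weights


# Usage with random stand-in data (real inputs would come from a CNN).
rng = np.random.default_rng(0)
acts = rng.random((4, 7, 7))           # non-negative, as after a ReLU
grds = rng.standard_normal((4, 7, 7))
heatmap, w = grad_cam_pp(acts, grds)
```

Unlike Grad-CAM, which averages the gradients of a feature map into a single weight, each spatial position here gets its own coefficient before the sum, which is what lets several instances of the same class contribute to the map. The resulting heatmap is then typically upsampled to the input resolution and overlaid on the image.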

../../../_images/gradcampp_intuit.png

Grad-CAM++ Intuition

Note

The images below contain the math and the underlying logic behind Grad-CAM++. They are uploaded as images because writing such complex math and getting it to render was difficult. Please bear with my handwriting.

../../../_images/page1.jpg

Math Page 1

../../../_images/page2.jpg

Math Page 2

../../../_images/page3.jpg

Math Page 3

../../../_images/page4.jpg

Math Page 4
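For readers who prefer typeset notation, the core Grad-CAM++ quantities can be summarized as below. This follows the notation of the original Grad-CAM++ paper and is an assumption about what the handwritten pages cover, so symbols may not match them one for one. Here A^k_{ij} is the activation at position (i, j) of feature map k, Y^c is the output for class c, and S^c is the class score before the exponential.

```latex
% Channel weights: pixel-wise coefficients applied to the positive gradients
w_k^c = \sum_i \sum_j \alpha_{ij}^{kc}\,
        \mathrm{ReLU}\!\left(\frac{\partial Y^c}{\partial A_{ij}^k}\right)

% Pixel-wise coefficients
\alpha_{ij}^{kc} =
  \frac{\dfrac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}}
       {2\,\dfrac{\partial^2 Y^c}{(\partial A_{ij}^k)^2}
        + \sum_a \sum_b A_{ab}^k\,
          \dfrac{\partial^3 Y^c}{(\partial A_{ij}^k)^3}}

% Closed form when Y^c = \exp(S^c) and the network is piecewise linear
\alpha_{ij}^{kc} =
  \frac{\left(\dfrac{\partial S^c}{\partial A_{ij}^k}\right)^2}
       {2\left(\dfrac{\partial S^c}{\partial A_{ij}^k}\right)^2
        + \sum_a \sum_b A_{ab}^k
          \left(\dfrac{\partial S^c}{\partial A_{ij}^k}\right)^3}

% Saliency map
L_{ij}^c = \mathrm{ReLU}\!\left(\sum_k w_k^c\, A_{ij}^k\right)
```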

../../../_images/gradcampp_vis.png

Original Image vs Grad-CAM++ visualization for layer 43 of VGG-16 network