I read the paper 'CRAFTING A MULTI-TASK CNN FOR VIEWPOINT ESTIMATION' and found this repository, which is used for object detection, and I looked at the example results. I would like to know whether the text in the top-left corner is the viewpoint estimation result, including the class and azimuthal angle.