INT8/Quantization in torch-TensorRT 1.4 #2086
chichun-charlie-liu started this conversation in General
            Replies: 1 comment
- | We are planning to bring INT8 support in dynamo up to the same level as the TorchScript path. This is scheduled for the next release. |
- Hello,
It appears that INT8 is not ready in the newly released torch-TRT 1.4: the new dynamo.compile() checks the requested precision and rejects anything other than FP32 and FP16. Digging deeper, though, there seem to be some INT8/quantization components similar to those in v1.3.
I'm just curious whether you could elaborate a little on the INT8 implementation plan or status and, if possible, share a schedule for a release that enables INT8.
Thanks a lot!
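For context on what INT8 support entails: TensorRT's INT8 engines map FP32 activations to 8-bit integers via a per-tensor scale derived during calibration. A minimal pure-Python sketch of that symmetric quantize/dequantize mapping (the scale value below is an illustrative example, not anything computed by torch-TensorRT):

```python
def quantize_int8(x: float, scale: float) -> int:
    # Symmetric INT8 quantization: round to the nearest step of `scale`,
    # then clamp into the signed 8-bit range [-128, 127].
    q = round(x / scale)
    return max(-128, min(127, q))

def dequantize_int8(q: int, scale: float) -> float:
    # Recover an approximation of the original value.
    return q * scale

# Example with an arbitrary scale of 0.05:
scale = 0.05
q = quantize_int8(1.0, scale)        # 20
x_hat = dequantize_int8(q, scale)    # 1.0 (exactly recoverable here)
clipped = quantize_int8(100.0, scale)  # 127 (saturates at the INT8 max)
```

Calibration's job is precisely to pick `scale` so that the saturation seen in the last line is rare on representative data; that is the piece the dynamo path in 1.4 does not yet expose.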