Since it's already integrated into NeMo and vLLM, what is the authors' opinion? What is the expected behavior? cc @abcdabcd987 @nandor @amanshanbhag