Description
System environment:
(Orion) PS D:\Huggin face\Orion-14B-App-Demo-CN\demo> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Feb__8_05:53:42_Coordinated_Universal_Time_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
demo.py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("\Huggin face\Orion-14B-Chat-Int4", trust_remote_code=True,use_safetensors=True)
model = AutoModelForCausalLM.from_pretrained("\Huggin face\Orion-14B-Chat-Int4", torch_dtype=torch.bfloat16,device_map="auto", trust_remote_code=True,use_safetensors=True)
messages = [{"role": "user", "content": "hi,who are you?"}]
response = model.chat(tokenizer, messages, streaming=False)
print(response)
(Orion) PS D:\Huggin face\Orion-14B-App-Demo-CN\demo> python demo.py
bin D:\Users\Administrator\anaconda3\envs\Orion\Lib\site-packages\bitsandbytes\libbitsandbytes_cuda121.dll
鲯榅鲯鲯榅 mathemat鲯鲯榅榅鲯鲯鲯鲯榅鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯榅榅榅鲯鲯鲯鲯鲯鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯鲯榅鲯鲯榅鲯榅鲯榅鲯鲯榅榅榅榅鲯榅鲯鲯鲯榅鲯鲯鲯鲯鲯榅鲯鲯鲯鲯鲯鲯榅鲯鲯鲯鲯榅鲯鲯鲯鲯榅鲯鲯榅鲯鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅榅鲯鲯鲯鲯鲯榅鲯鲯鲯榅 鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯鲯鲯榅鲯榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯榅鲯鲯榅鲯鲯鲯鲯鲯榅鲯榅榅鲯鲯榅鲯鲯鲯鲯榅鲯鲯鲯鲯鲯鲯鲯榅鲯鲯鲯榅榅鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯鲯榅鲯
Whatever question I ask, the model answers with this kind of gibberish.
Also: for the OrionStarAI/Orion-14B-Chat-Int4 model files downloaded directly from Hugging Face (safetensors format), do I need to manually run the quant.py script to quantize them before inference works correctly? Would appreciate an answer from anyone who knows, thanks!
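An editorial sketch related to the quantization question (an assumption, not a confirmed answer for this repo): checkpoints that are already quantized typically ship a `quantization_config` block in their `config.json`, in which case no manual quant.py step should be needed. One way to check the downloaded folder:

```python
import json

# Hypothetical config.json contents for illustration; inspect the real
# file in the Orion-14B-Chat-Int4 directory the same way.
config_text = '{"model_type": "orion", "quantization_config": {"bits": 4}}'
config = json.loads(config_text)

if "quantization_config" in config:
    # Checkpoint declares itself quantized; loading it as-is should work.
    print("already quantized:", config["quantization_config"]["bits"], "bit")
else:
    print("no quantization_config found; weights may be full-precision")
```

If the block is present, forcing `torch_dtype=torch.bfloat16` on a 4-bit checkpoint, as in demo.py above, is also a possible source of garbage output and may be worth removing as a test.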