Remove prefill calibration #17827

@abhinaykukkadapu

Description

To help with the long calibration time, @haowhsu-quic removed prefill calibration and copied the decode quant params to prefill instead, which should cut the significant time currently spent calibrating prefill.
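
For illustration only, here is a minimal sketch of the idea of reusing decode quant params for prefill. It assumes quant params are plain per-tensor `scale`/`zero_point` pairs keyed by tensor name and that prefill and decode share tensor names. `QuantParams` and `copy_decode_params_to_prefill` are hypothetical names, not ExecuTorch or QNN APIs, and this is not the actual change made by @haowhsu-quic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantParams:
    """Hypothetical per-tensor quantization parameters."""
    scale: float
    zero_point: int


def copy_decode_params_to_prefill(decode_params, prefill_tensor_names):
    """Reuse decode-calibrated params for each prefill tensor with a matching name.

    Returns the copied params plus the prefill tensors that had no decode
    counterpart, so the caller can decide how to handle them.
    """
    copied = {}
    missing = []
    for name in prefill_tensor_names:
        if name in decode_params:
            copied[name] = decode_params[name]
        else:
            missing.append(name)
    return copied, missing


# Assumed example data: tensor names and values are made up.
decode_params = {
    "layers.0.attn.q_proj": QuantParams(scale=0.02, zero_point=128),
    "layers.0.attn.k_proj": QuantParams(scale=0.015, zero_point=127),
}
prefill_names = [
    "layers.0.attn.q_proj",
    "layers.0.attn.k_proj",
    "layers.0.attn.mask",
]

prefill_params, missing = copy_decode_params_to_prefill(decode_params, prefill_names)
```

In this sketch, tensors that exist only in prefill are returned in `missing` rather than silently skipped, since they would still need params from some other source.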

cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin

Metadata

Assignees

Labels

partner: qualcomm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm)

Type

Projects

Status

To triage

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
