Reference model for LoRA #4309

@SpaceHunterInf

Description

Greetings,

Quick question about our LoRA setup: are we initializing a separate reference model during training (e.g., for KL/anchor comparisons), or are we reusing the frozen base model as the reference since the base is frozen under PEFT?

If a separate reference is currently spun up, is there a way to configure the pipeline to reuse the base model as the reference to save memory/compute? Pointers to the relevant config flags or code path would be much appreciated.
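To make the question concrete, here is a minimal sketch of the memory-saving pattern being asked about: under LoRA the base weights stay frozen, so reference logits can in principle be recovered by simply disabling the adapter, rather than holding a second copy of the model. The `LoRALinear` class below is a hypothetical stand-in, not the actual pipeline code (in Hugging Face PEFT the analogous mechanism would be the `PeftModel.disable_adapter()` context manager).

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a toggleable low-rank (LoRA) adapter."""
    def __init__(self, d_in, d_out, r=4):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():
            p.requires_grad = False          # base stays frozen under PEFT/LoRA
        self.lora_a = nn.Linear(d_in, r, bias=False)
        self.lora_b = nn.Linear(r, d_out, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # standard LoRA init: B = 0
        self.adapter_enabled = True

    def forward(self, x):
        out = self.base(x)
        if self.adapter_enabled:
            out = out + self.lora_b(self.lora_a(x))
        return out

torch.manual_seed(0)
layer = LoRALinear(8, 8)
with torch.no_grad():
    layer.lora_b.weight.add_(0.1)            # pretend some adapter training happened

x = torch.randn(2, 8)
policy_out = layer(x)                        # policy: frozen base + adapter
layer.adapter_enabled = False
ref_out = layer(x)                           # reference: same frozen base, adapter off
layer.adapter_enabled = True

# The reference output is exactly the frozen base's output,
# so no separate reference model needs to live in memory.
assert torch.allclose(ref_out, layer.base(x))
assert not torch.allclose(policy_out, ref_out)
```

If the pipeline instead instantiates a distinct reference model, a flag that routes reference forward passes through this adapter-disabled path would give the same KL/anchor comparison at roughly half the model memory.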

Thank you very much.

Metadata

    Labels

⚡ PEFT (Related to PEFT) · ❓ question (Seeking clarification or more information)
