Support composite calibration dataset by ivanl-cerebras · Pull Request #16 · CerebrasResearch/reap

ivanl-cerebras · 2026-03-17T09:57:18Z

This PR adds:

Support for tool calling datasets (Salesforce/xlam-function-calling-60k, SWE-bench/SWE-smith-trajectories)
Support for composite dataset specification (format used: theblackcat102/evol-codealpaca-v1:128,Salesforce/xlam-function-calling-60k:128,open-r1/Mixture-of-Thoughts[code]:128,open-r1/Mixture-of-Thoughts[math]:128,open-r1/Mixture-of-Thoughts[science]:128,SWE-bench/SWE-smith-trajectories(tool):128)
Support for batch size > 1 during calibration (constructs padding masks and accounts for them when computing pruning metrics)
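
To make the composite specification format concrete, here is a minimal parsing sketch. It is not the PR's implementation: `DatasetEntry`, `parse_composite_spec`, and the field names `subset` and `kind` are hypothetical, and reading `[...]` as a subset selector, `(...)` as a format tag, and the trailing number as a sample count is an assumption drawn only from the example string above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# One entry looks like: name[bracket_tag](paren_tag):count
# Both tags are optional. Field names are illustrative, not the repo's.
_ENTRY_RE = re.compile(
    r"^(?P<name>[^\[\]():,]+)"
    r"(?:\[(?P<subset>[^\]]+)\])?"
    r"(?:\((?P<kind>[^)]+)\))?"
    r":(?P<count>\d+)$"
)


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    subset: Optional[str]  # assumed meaning of "[code]", "[math]", ...
    kind: Optional[str]    # assumed meaning of "(tool)"
    count: int             # assumed to be a number of samples


def parse_composite_spec(spec: str) -> List[DatasetEntry]:
    """Split a comma-separated composite spec into structured entries."""
    entries = []
    for raw in spec.split(","):
        match = _ENTRY_RE.match(raw.strip())
        if match is None:
            raise ValueError(f"Malformed dataset entry: {raw!r}")
        entries.append(
            DatasetEntry(
                name=match["name"],
                subset=match["subset"],
                kind=match["kind"],
                count=int(match["count"]),
            )
        )
    return entries
```

Rejecting malformed entries with a `ValueError` is a design choice of this sketch; it surfaces typos in a long spec string before any dataset is loaded.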

Eval results for a trial run

Model: Qwen/Qwen3-30B-A3B
Compression level: 0.25
Dataset: "theblackcat102/evol-codealpaca-v1:8,Salesforce/xlam-function-calling-60k:8,open-r1/Mixture-of-Thoughts[code]:8,open-r1/Mixture-of-Thoughts[math]:8,open-r1/Mixture-of-Thoughts[science]:8,SWE-bench/SWE-smith-trajectories(tool):8"
Batch size: 16
Max sequence length (MSL): 2048

HumanEval / HumanEval+: 0.9207 / 0.8780
MBPP / MBPP+: 0.8492 / 0.7222

ivanl-cerebras · 2026-03-19T11:30:42Z

cc @nikolail-cerebras to review

mklasby

LGTM. Minor changes requested: clean up the leftover debugger call, and set the attention mask through a context manager to make it more robust.

mklasby · 2026-03-19T18:28:13Z

src/reap/main.py

        tokenizer.save_pretrained(merged_model_dir)
    except Exception as e:
-        import pdb; breakpoint()
+        import pdb


Let's raise here instead
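
A minimal sketch of what raising here could look like, assuming the intent is to log the failure and let it propagate rather than drop into a debugger. `save_artifacts` and its arguments are hypothetical stand-ins and are not taken from `src/reap/main.py`.

```python
import logging

logger = logging.getLogger(__name__)


def save_artifacts(save_fn, output_dir):
    """Run a save step; on failure, log the traceback and re-raise."""
    try:
        save_fn(output_dir)
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback.
        logger.exception("Failed to save artifacts to %s", output_dir)
        # A bare `raise` re-raises the original exception unchanged.
        raise
```

Re-raising keeps unattended runs from hanging at an interactive prompt and leaves the decision about recovery to the caller.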

mklasby · 2026-03-19T18:29:01Z

src/reap/observer.py

+        self._current_attention_mask: Optional[torch.Tensor] = None
+        super().__init__(model, hook_config)
+
+    def set_attention_mask(self, attention_mask: Optional[torch.Tensor]):


Rewrite as a context manager, or manage it in a forward pre-hook attached to the full model.
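
One way the context-manager variant might look is sketched below. `MaskAwareObserver` is a hypothetical stand-in that shows only the mask handling; the real observer's base class, constructor arguments, and hooks are omitted, and the mask is typed as `Any` so the sketch runs without PyTorch.

```python
from contextlib import contextmanager
from typing import Any, Iterator, Optional


class MaskAwareObserver:
    """Hypothetical observer stub; only attention-mask scoping is shown."""

    def __init__(self) -> None:
        self._current_attention_mask: Optional[Any] = None

    @contextmanager
    def set_attention_mask(self, attention_mask: Optional[Any]) -> Iterator[None]:
        previous = self._current_attention_mask
        self._current_attention_mask = attention_mask
        try:
            yield
        finally:
            # Restore the previous value even if the forward pass raises,
            # so a stale mask cannot leak into the next batch.
            self._current_attention_mask = previous
```

The robustness gain over a plain setter is the `finally` clause: the mask is scoped to exactly one forward pass. For the other option the reviewer mentions, PyTorch's `nn.Module.register_forward_pre_hook` accepts `with_kwargs=True`, which would let a hook on the full model read `attention_mask` from the call's keyword arguments; that route is not sketched here.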

mklasby · 2026-03-19T18:32:31Z

src/reap/main.py

                logger.info("No previous data found @ %s", f_name)
                for sample in tqdm(cat_data, desc=f"Processing {category} samples"):
-                    model(sample.to(model.device))
+                    attention_mask = sample.get("attention_mask", None)


Suggested change

-                    attention_mask = sample.get("attention_mask", None)
+                    attn_mask = sample.get("attention_mask", None)
+                    with observer.set_attention_mask(attn_mask):
+                        ...
+                        model(**sample)
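
The PR description says padding masks are taken into account when pruning metrics are computed with batch size > 1. The actual metric is not shown in this conversation, so the sketch below illustrates only the general idea with a masked per-feature mean. `masked_mean_per_feature` is a hypothetical helper written with plain lists; real code would presumably use tensor operations.

```python
from typing import List, Sequence


def masked_mean_per_feature(
    values: Sequence[Sequence[Sequence[float]]],  # indexed [batch][seq][feature]
    attention_mask: Sequence[Sequence[int]],      # indexed [batch][seq]; 1 = real token
) -> List[float]:
    """Average each feature over real tokens only, skipping padded positions."""
    num_features = len(values[0][0])
    totals = [0.0] * num_features
    real_tokens = 0
    for sample, mask_row in zip(values, attention_mask):
        for token, keep in zip(sample, mask_row):
            if not keep:
                continue  # padded position: contributes nothing
            real_tokens += 1
            for j, value in enumerate(token):
                totals[j] += value
    if real_tokens == 0:
        raise ValueError("attention_mask selects no tokens")
    # Divide by the number of real tokens, not by batch_size * seq_len.
    return [total / real_tokens for total in totals]
```

Without the mask, activations at padded positions would enter both the sum and the divisor, so shorter sequences in a batch would skew the statistic.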

support composite calibration dataset

8f4ceef

ivanl-cerebras force-pushed the il/composite_dataset branch from 2916882 to a67ba6c on March 18, 2026 03:08

enable batch_size > 1 during calibration

46d78eb

ivanl-cerebras force-pushed the il/composite_dataset branch from a67ba6c to 46d78eb on March 18, 2026 03:14

ivanl-cerebras mentioned this pull request Mar 18, 2026

Add layerwise calibration observer #17

Draft

ivanl-cerebras added 2 commits March 19, 2026 11:03

fixes for tool calling and composite datasets

e1f165b

add README note

012bddc

ivanl-cerebras marked this pull request as ready for review March 19, 2026 11:27

ivanl-cerebras requested review from mikel-cerebras and vithursant March 19, 2026 11:27

mklasby approved these changes Mar 19, 2026

ivanl-cerebras wants to merge 4 commits into main from il/composite_dataset


2 participants
