Commit da27db6

Merge pull request #33 from ShayanTalaei/sprint

Sprint Paper Upload

2 parents da69eed + af43359

File tree

5 files changed: +42 -0 lines changed


_data/people.yml

Lines changed: 5 additions & 0 deletions
@@ -14,6 +14,11 @@ jonsaadfalcon:
   url: https://jonsaadfalcon.com/
   title: PhD Student

+shayantalaei:
+  name: Shayan Talaei
+  url: https://www.linkedin.com/in/shayan-talaei-6b65a0229/
+  title: PhD Student
+
 # Visiting

 bradleybrown:

_pubs/sprint.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+---
+title: 'SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models'
+authors:
+  - name: Emil Biju
+    equal: true
+    affiliation: Stanford University
+  - key: shayantalaei
+    equal: true
+    affiliation: Stanford University
+  - name: Zhemin Huang
+    equal: true
+    affiliation: Stanford University
+  - name: Mohammadreza Pourreza
+    affiliation: Google
+  - name: Amin Saberi
+    affiliation: Stanford University
+  - key: azaliamirhoseini
+    affiliation: Stanford University
+venue: preprint
+year: 2025
+date: 2025-06-06
+has_pdf: true
+doi: 10.48550/arXiv.2506.05745
+tags:
+  - machine learning
+  - artificial intelligence
+  - reasoning
+teaser: SPRINT enables LRMs to dynamically identify and exploit parallelization opportunities during reasoning, reducing sequential tokens by up to 39% while maintaining performance.
+materials:
+  - name: Paper
+    url: https://arxiv.org/abs/2506.05745
+    type: file-pdf
+  - name: Codebase
+    url: https://github.com/ShayanTalaei/SPRINT
+    type: code
+---
+Large reasoning models (LRMs) excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought, resulting in long inference times before arriving at the final answer. To address this challenge, we introduce SPRINT, a novel post-training and inference-time framework designed to enable LRMs to dynamically identify and exploit opportunities for parallelization during their reasoning process. SPRINT incorporates an innovative data curation pipeline that reorganizes natural language reasoning trajectories into structured rounds of long-horizon planning and parallel execution. By fine-tuning LRMs on a small amount of such curated data, the models learn to dynamically identify independent subtasks within extended reasoning processes and effectively execute them in parallel. Through extensive evaluations, we show that the models fine-tuned with the SPRINT framework match the performance of reasoning models on complex domains such as mathematics while generating up to ~39% fewer sequential tokens on problems requiring more than 8000 output tokens. Finally, we observe consistent results transferred to two out-of-distribution tasks of GPQA and Countdown with up to 45% and 65% reduction in average sequential tokens for longer reasoning trajectories, while achieving the performance of the fine-tuned reasoning model.
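The abstract describes an inference-time pattern of alternating rounds: a sequential planning step that identifies independent subtasks, followed by parallel execution of those subtasks. A minimal sketch of that control flow in Python, assuming toy stand-ins: `plan` and `execute` here are hypothetical placeholder functions (in SPRINT these would be model generations learned via fine-tuning, not hand-written code), and a thread pool stands in for parallel decoding:

```python
from concurrent.futures import ThreadPoolExecutor

def plan(problem):
    """Planning phase (hypothetical): decompose the problem into
    independent subtasks. SPRINT learns this decomposition from
    curated reasoning trajectories; here it is hard-coded."""
    return [("square", n) for n in problem]

def execute(subtask):
    """Execute one independent subtask. Because subtasks from a
    planning round share no dependencies, they can run in parallel."""
    op, n = subtask
    return n * n if op == "square" else None

def sprint_round(problem):
    """One round of interleaved planning and parallelized execution:
    plan sequentially, then fan out the independent subtasks."""
    subtasks = plan(problem)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(execute, subtasks))
    return results

print(sprint_round([1, 2, 3]))  # → [1, 4, 9]
```

The point of the sketch is the wall-clock argument: only the planning step is sequential, so the sequential token count grows with the number of rounds rather than with the total work, which is where the reported ~39% reduction in sequential tokens comes from.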

imgs/people/shayantalaei.jpg

4.98 MB

imgs/thumbs/sprint.png

26.3 KB

pubs/sprint.pdf

830 KB
Binary file not shown.

0 commit comments
