Issue Description
Describe the bug
I encountered an OSError: 页面文件太小,无法完成操作。 (os error 1455) (English: "The paging file is too small for this operation to complete") when running the script to process a PDF.
The crash appears to happen while the marker-pdf / surya models are being initialized for content extraction. The script attempted to fall back to CPU after the initial failure, but it failed again with the same error.
To Reproduce
Run the following command on Windows:
python main.py "E:\temp\agentfuzz-security25.pdf" --language en --model gpt-4o --theme Madrid --output-dir output --verbose
Error Log
2025-12-15 11:53:04,688 - INFO - Starting marker-pdf content extraction: E:\temp\agentfuzz-security25.pdf
2025-12-15 11:53:04,688 - INFO - Initializing models... (device preference: None)
2025-12-15 11:53:06,062 - WARNING - Model initialization failed: 页面文件太小,无法完成操作。 (os error 1455). Retrying with device='cpu'...
2025-12-15 11:53:07,387 - ERROR - Content extraction failed: 页面文件太小,无法完成操作。 (os error 1455)
Traceback (most recent call last):
  File "D:\sourcecode\Auto-Slides\modules\lightweight_extractor.py", line 76, in extract_content
    converter = PdfConverter(artifact_dict=create_model_dict(device=device))
  ...
  File "D:\sourcecode\Auto-Slides\venv\Lib\site-packages\transformers\modeling_utils.py", line 4450, in from_pretrained
    with safe_open(checkpoint_files[0], framework="pt") as f:
OSError: 页面文件太小,无法完成操作。 (os error 1455)
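Since the CPU retry fails with the same error within about a second, it may be worth detecting error 1455 and skipping the retry. The following is only a sketch of that idea, not the project's actual code: `is_commit_limit_error` and `load_with_fallback` are hypothetical names, and `load` stands in for whatever wraps `create_model_dict(device=device)`. It assumes the message keeps the `(os error 1455)` suffix seen in the log above.

```python
COMMIT_LIMIT_CODE = 1455  # Windows ERROR_COMMITMENT_LIMIT


def is_commit_limit_error(exc: BaseException) -> bool:
    """Return True if exc looks like the Windows 'paging file too small' error."""
    if not isinstance(exc, OSError):
        return False
    # Errors raised by Python itself on Windows carry the code in `winerror`.
    if getattr(exc, "winerror", None) == COMMIT_LIMIT_CODE:
        return True
    # The error in the log only exposes the code in its message text,
    # so fall back to matching the "(os error 1455)" suffix.
    return f"(os error {COMMIT_LIMIT_CODE})" in str(exc)


def load_with_fallback(load, devices=(None, "cpu")):
    """Try load(device) for each device; stop early on a commit-limit error.

    `load` is a hypothetical callable, e.g.
    lambda device: create_model_dict(device=device)
    """
    last_error = None
    for device in devices:
        try:
            return load(device)
        except OSError as exc:
            if is_commit_limit_error(exc):
                # Switching devices does not free system memory, so retrying
                # on CPU would hit the same limit again.
                raise RuntimeError(
                    "Windows commit limit reached (os error 1455): "
                    "free memory or enlarge the page file, then retry."
                ) from exc
            last_error = exc
    if last_error is None:
        raise ValueError("devices must not be empty")
    raise last_error
```

With a check like this, the log would show one actionable message instead of two identical failures.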
Environment
- OS: Windows
- Project Path: D:\sourcecode\Auto-Slides
- Task: PDF Content Extraction
Additional context
Error 1455 (ERROR_COMMITMENT_LIMIT) usually indicates that the Windows commit limit (RAM + page file) has been reached. Loading the surya recognition model appears to require more committed memory than the system has left, and falling back to CPU does not help because the limit is system-wide rather than GPU-specific.
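To confirm this diagnosis, the commit limit and remaining headroom can be read before loading the models. This is a minimal sketch using the Win32 `GlobalMemoryStatusEx` call through `ctypes`. The function name `commit_limit_status` is hypothetical, and the function returns `None` on non-Windows systems.

```python
import ctypes
import sys


class MEMORYSTATUSEX(ctypes.Structure):
    """Mirror of the Win32 MEMORYSTATUSEX structure (64 bytes)."""
    _fields_ = [
        ("dwLength", ctypes.c_uint32),
        ("dwMemoryLoad", ctypes.c_uint32),
        ("ullTotalPhys", ctypes.c_uint64),
        ("ullAvailPhys", ctypes.c_uint64),
        ("ullTotalPageFile", ctypes.c_uint64),   # current commit limit
        ("ullAvailPageFile", ctypes.c_uint64),   # commit still available
        ("ullTotalVirtual", ctypes.c_uint64),
        ("ullAvailVirtual", ctypes.c_uint64),
        ("ullAvailExtendedVirtual", ctypes.c_uint64),
    ]


def commit_limit_status():
    """Return commit limit and available commit in bytes, or None off Windows."""
    if sys.platform != "win32":
        return None
    status = MEMORYSTATUSEX()
    # The API requires dwLength to be set before the call.
    status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
    if not ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status)):
        raise ctypes.WinError()
    return {
        "commit_limit": status.ullTotalPageFile,
        "commit_available": status.ullAvailPageFile,
    }


if __name__ == "__main__":
    info = commit_limit_status()
    if info is None:
        print("Commit limit check is only available on Windows.")
    else:
        gib = 1024 ** 3
        print(f"Commit limit:     {info['commit_limit'] / gib:.1f} GiB")
        print(f"Commit available: {info['commit_available'] / gib:.1f} GiB")
```

If the available commit is small compared with the size of the model checkpoints, enlarging the page file or closing other applications is the likely workaround.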