Skip to content

Conversation

@HarshitR2004
Copy link
Contributor

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Bug fix

  • Why was this change needed? (You can also link to an open issue here)
    Unhandled Vector Store Save Failures #2080
    Fixed unhandled vector store save failures in the embedding pipeline that could cause data loss and application crashes. The embed_and_store_documents function lacked error handling around store.save_local() calls which could lead to unexpected failures.

  • Other information:
    Improved observability with clear error messages distinguishing between different save failure types.

@vercel
Copy link

vercel bot commented Oct 19, 2025

@HarshitR2004 is attempting to deploy a commit to the Arc53 Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions bot added the application Application label Oct 19, 2025
@dartpain
Copy link
Contributor

Looks like ruff lint is failing:
--> application/parser/embedding_pipeline.py:3:26
|
1 | import os
2 | import logging
3 | from typing import List, Optional, Any
| ^^^^^^^^
4 | from retry import retry
5 | from tqdm import tqdm
|
help: Remove unused import: typing.Optional

Found 1 error.
[*] 1 fixable with the --fix option.

@codecov
Copy link

codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 20.00000% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.25%. Comparing base (498e2b7) to head (acc31a6).
⚠️ Report is 95 commits behind head on main.

Files with missing lines Patch % Lines
application/parser/embedding_pipeline.py 20.00% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2081      +/-   ##
==========================================
+ Coverage   34.67%   42.25%   +7.57%     
==========================================
  Files         131      136       +5     
  Lines        8703     9325     +622     
==========================================
+ Hits         3018     3940     +922     
+ Misses       5685     5385     -300     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@HarshitR2004
Copy link
Contributor Author

@dartpain I have fixed the code formating issue please look into the PR again

@vercel
Copy link

vercel bot commented Oct 23, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
oss-docsgpt Ready Ready Preview Comment Oct 23, 2025 9:23pm

@github-actions github-actions bot added the tests Tests label Oct 26, 2025
@HarshitR2004
Copy link
Contributor Author

@dartpain I have added the tests to improve the coverage, pls look at it

Copy link
Collaborator

@ManishMadan2882 ManishMadan2882 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @HarshitR2004,
I tested this update by triggering no space left on my system(Linux) with the command
fallocate -l 11.5G /tmp/fill_disk_test.img where 11.5G is the available space (df -h )
The logs had the OSError: Unable to save vector store to temp/local/https://docs.docsgpt.cloud/: [Errno 28] No space left on device: '/tmp/tmp6_g7_z2t' as expected
the implementation is FAISS specific, but without the error handling proposed the TASK_STATUS remained in progress state which was an issue

Thanks for the update!

@HarshitR2004
Copy link
Contributor Author

@ManishMadan2882 Thanks for the detailed testing and approval. Much appreciated.

@dartpain dartpain merged commit 695191d into arc53:main Oct 31, 2025
1 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

application Application tests Tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants