@JinHai-CN
Contributor

What problem does this PR solve?

As title.

Type of change

  • Refactoring

zpf121 and others added 5 commits December 16, 2025 11:59
Signed-off-by: zpf121 <1219290549@qq.com>
### What problem does this PR solve?

Environment:
```
Ubuntu 25.10
Clang-20/21
Gcc-15.2
```

The `simde` package triggers the error `definition with same mangled
name` because the macro `SIMDE_FUNCTION_ATTRIBUTES` is defined as
`static`, which leads to multiple definitions under C++23 modules.

Issue link: infiniflow#3166

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Ubuntu <ubuntu@ragflow-arm2.asia-northeast1-b.c.ragflow-462809.internal>
### What problem does this PR solve?

1. Remove all the code related to **BufferObj/BufferManager** and
refactor all FileWorker implementations.
- For FileWorker instances that contain a `char *` payload, we now
manage their loading state directly with **mmap** instead of using a
state machine + LRU.
- For other FileWorker types (currently only **HnswFileWorker**), we
manage their loading state solely with an LRU cache.
2. Because the state-machine-based loading management has been
eliminated, we use the `tmp` and `data` directories to track that part
of the state.
- All data generated after the previous checkpoint is mmapped into `tmp`.
- During a checkpoint we call `msync`, then copy the data from `tmp` to
`data`.
- This results in higher disk usage while the database is running, and
the copy operation creates a large amount of disk I/O, which can cause
background cleaning to fail (e.g., RocksDB read/write timeouts). To
mitigate the second issue, more frequent checkpoints may be required.
3. For **VersionFileWorker**, the payload is not a `char *`, but we
still manage it purely with `mmap`. This leads to sub-optimal
performance on certain code paths; the issue will be addressed in a
future fix.
4. Comment out all code related to `snapshot` until it has been adapted.
5. There are some bugs in the `show` command.
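
The `tmp`/`data` flow described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `MappedTmpFile` and `Checkpoint` are hypothetical names, and the sketch assumes a POSIX system, a fixed-size file, and a plain file copy as the `tmp` to `data` step.

```cpp
// Sketch only: MappedTmpFile and Checkpoint are hypothetical names,
// not identifiers taken from this PR.
#include <cstddef>
#include <cstring>
#include <filesystem>
#include <stdexcept>
#include <utility>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

namespace fs = std::filesystem;

// Owns a writable, file-backed mapping of one file under the tmp directory.
class MappedTmpFile {
public:
    MappedTmpFile(fs::path path, std::size_t size) : path_(std::move(path)), size_(size) {
        int fd = ::open(path_.c_str(), O_RDWR | O_CREAT, 0644);
        if (fd < 0) throw std::runtime_error("open failed: " + path_.string());
        // Size the file so that every mapped byte is backed by it.
        if (::ftruncate(fd, static_cast<off_t>(size_)) != 0) {
            ::close(fd);
            throw std::runtime_error("ftruncate failed");
        }
        void *addr = ::mmap(nullptr, size_, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        ::close(fd); // the mapping stays valid after the descriptor is closed
        if (addr == MAP_FAILED) throw std::runtime_error("mmap failed");
        data_ = static_cast<char *>(addr);
    }
    ~MappedTmpFile() {
        if (data_) ::munmap(data_, size_);
    }
    MappedTmpFile(const MappedTmpFile &) = delete;
    MappedTmpFile &operator=(const MappedTmpFile &) = delete;

    char *data() { return data_; }
    std::size_t size() const { return size_; }
    const fs::path &path() const { return path_; }

    // Block until dirty pages have been written back to the tmp file.
    void Sync() {
        if (::msync(data_, size_, MS_SYNC) != 0) throw std::runtime_error("msync failed");
    }

private:
    fs::path path_;
    std::size_t size_;
    char *data_ = nullptr;
};

// Checkpoint step: flush the mapping, then copy the tmp file into data_dir.
// Returns the path of the copy.
inline fs::path Checkpoint(MappedTmpFile &file, const fs::path &data_dir) {
    file.Sync();
    fs::create_directories(data_dir);
    fs::path dst = data_dir / file.path().filename();
    fs::copy_file(file.path(), dst, fs::copy_options::overwrite_existing);
    return dst;
}
```

A caller would write through `data()` between checkpoints and invoke `Checkpoint(file, data_dir)` at checkpoint time. The full-file copy in this sketch is also where the extra disk usage and I/O noted in item 2 would come from.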

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Breaking Change (fix or feature that could cause existing
functionality not to work as expected)
- [x] Refactoring
- [x] Performance Improvement
- [x] Test cases

---------

Signed-off-by: noob <yixiao121314@outlook.com>
Co-authored-by: qinling0210 <88864212+qinling0210@users.noreply.github.com>
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
@JinHai-CN JinHai-CN added the ci PR can be test label Dec 22, 2025
@JinHai-CN JinHai-CN marked this pull request as draft December 22, 2025 04:00
@JinHai-CN JinHai-CN marked this pull request as ready for review December 22, 2025 04:00
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
@codecov
codecov bot commented Dec 22, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.93%. Comparing base (394bea8) to head (7bfc099).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/storage/new_txn/new_txn_index_impl.cpp | 72.72% | 2 Missing and 1 partial ⚠️ |
| src/storage/invertedindex/memory_indexer_impl.cpp | 50.00% | 2 Missing ⚠️ |
| src/storage/bg_task/mem_index_appender_impl.cpp | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3186      +/-   ##
==========================================
- Coverage   47.04%   39.93%   -7.11%     
==========================================
  Files         722      722              
  Lines      150043   150043              
  Branches    27247    27247              
==========================================
- Hits        70591    59925   -10666     
- Misses      69835    81108   +11273     
+ Partials     9617     9010     -607     

| Flag | Coverage Δ |
|---|---|
| debug http test | 26.42% <27.77%> (-0.08%) ⬇️ |
| debug parallel test | 17.52% <16.66%> (+0.02%) ⬆️ |
| debug pysdk test | ? |
| debug sqllogical test | ? |
| debug unit test | 35.53% <66.66%> (-0.02%) ⬇️ |


Labels

ci PR can be test

4 participants