-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Refactor SqliteSessionService for better concurrency #4017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello @rishabhindoria, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the concurrency of the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Response from ADK Triaging Agent Hello @rishabhindoria, thank you for your contribution! I've noticed that this PR does not yet include unit tests. As per our contribution guidelines, all code changes should include tests to ensure code quality and prevent regressions. Could you please add unit tests covering the new concurrency improvements? This will help the reviewers validate the effectiveness of your fix. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors SqliteSessionService to enable SQLite's Write-Ahead Logging (WAL) mode, which is an excellent change to improve concurrency by allowing simultaneous reads and writes. The introduction of an asyncio.Lock to serialize write operations is also a good practice to prevent potential race conditions in an asynchronous environment. The overall approach is sound. I have provided a few specific comments for minor improvements, including fixing a typo in a log message, removing a redundant database operation, and addressing an indentation inconsistency to maintain code style.
| def __init__(self, db_path: str): | ||
| """Initializes the SQLite session service with a database path.""" | ||
| self._db_path = db_path | ||
| self._db_lock = asyncio.Lock() # Lock to serialize write operations | ||
|
|
||
| if self._is_migration_needed(): | ||
| raise RuntimeError( | ||
| f"Database {db_path} seems to use an old schema." | ||
| " Please run the migration command to" | ||
| " migrate it to the new schema. Example: `python -m" | ||
| " google.adk.sessions.migrate_from_sqlalchemy_sqlite" | ||
| f" --source_db_path {db_path} --dest_db_path" | ||
| f" {db_path}.new` then backup {db_path} and rename" | ||
| f" {db_path}.new to {db_path}." | ||
| ) | ||
|
|
||
| # Initialize WAL mode for better concurrency | ||
| self._initialize_wal_mode() | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation throughout the SqliteSessionService class has been changed from 2 spaces to 4 spaces. This introduces an inconsistency with other files in the project (e.g., base_session_service.py, _session_util.py) which use 2-space indentation. To maintain a consistent code style across the repository, please revert the indentation of this class to use 2 spaces.
| conn.execute("PRAGMA busy_timeout=30000") | ||
| conn.commit() | ||
| except sqlite3.Error as e: | ||
| logger.warning(f"Failed to initialize ≈ mode: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| await db.execute("PRAGMA journal_mode=WAL") | ||
| await db.execute("PRAGMA busy_timeout=30000") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The journal_mode=WAL pragma is being set on every new database connection. However, this pragma is persistent for the database file and is already being set once during initialization in _initialize_wal_mode. Setting it on every connection is redundant. You can remove this line for a minor performance improvement and code cleanup. The busy_timeout pragma is a per-connection setting, so it is correct to set it here.
| await db.execute("PRAGMA journal_mode=WAL") | |
| await db.execute("PRAGMA busy_timeout=30000") | |
| await db.execute("PRAGMA busy_timeout=30000") |
Please ensure you have read the contribution guide before creating a pull request.
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
NA
2. Or, if no issue exists, describe the change:
Problem:
While doing adk eval using the api http://127.0.0.1:8000/apps/agentic_framework/eval_sets/{eval_dataset_filename}/run_eval for a simple 1 jsoj file containing like 10+ conversations/testcases, I was getting database locked error after investigating more online I found: By default adk eval uses local session.db file inside the hidden .adk folder. This db is a sqlite db which allows only one writer and blocks all readers.
Solution:
The breakthrough was to go inside this source code file, modify the sqlite db to be in WAL (Write ahead log mode) which enables one writer and multiple readers to read in parallel thereby increasing parallelism significantly.
Testing Plan
I created a small demo.evalset.json on my local with 4 agentic chatbot conversations and called the api using custom python script. I saw 2x speedup with this optimization since we can call the above api in a multithreaded fashion in a python script.
Unit Tests:
NA
Manual End-to-End (E2E) Tests:
NA
Checklist