
Add faq consistency check #209

Open
cremertim wants to merge 14 commits into main from feature/faq/add-rewrite-consistency

Conversation


cremertim (Contributor) commented Mar 3, 2025

Adds a consistency check for FAQ rewriting. Returns any inconsistencies found, along with a newly proposed version of the FAQ.

Summary by CodeRabbit

  • New Features

    • Enhanced rewriting outputs with detailed feedback—including improvement suggestions and alerts on content inconsistencies—to provide clearer status updates.
    • Introduced course-specific identification and improved FAQ integration, enabling tailored content retrieval via advanced search techniques.
    • Added new attributes for more comprehensive status updates, including suggestions, inconsistencies, and improvement.
    • Implemented a new method for fetching FAQs from the database with flexible search capabilities.
    • Added a new prompt for verifying FAQ consistency against final results.
  • Bug Fixes

    • Updated the status update method to handle new parameters for inconsistencies and improvement.
  • Chores

    • Removed an outdated FAQ rewriting prompt to streamline the overall process.


coderabbitai bot commented Mar 3, 2025

"""

Walkthrough

This pull request introduces several changes to the rewriting pipeline and its associated modules. It adds a new course_id field to the RewritingPipelineExecutionDTO class and new attributes (suggestions, inconsistencies, improvement) to the RewritingStatusUpdateDTO class. A new prompt for FAQ consistency checking is introduced, while an outdated FAQ rewriting prompt is removed. The RewritingPipeline class is enhanced with FAQ retrieval and consistency checking functionalities, and the StatusCallback class's done method is updated to accommodate new parameters for reporting inconsistencies and improvements.

Changes

File(s) — Change Summary

  • app/domain/rewriting_pipeline_execution_dto.py, app/domain/status/rewriting_status_update_dto.py — Added a new course_id field to RewritingPipelineExecutionDTO and new attributes (suggestions, inconsistencies, improvement) to RewritingStatusUpdateDTO.
  • app/pipeline/prompts/faq_consistency_prompt.py, app/pipeline/prompts/faq_rewriting.py — Introduced the faq_consistency_prompt variable; removed the legacy faq_rewriting.py prompt file.
  • app/pipeline/rewriting_pipeline.py — Extended the pipeline with new dependencies (FaqRetrieval, VectorDatabase); added FAQ retrieval, consistency checking via check_faq_consistency, and inconsistency parsing via parse_inconsistencies; updated the __call__ method to handle FAQ consistency verification and enrich callback data.
  • app/retrieval/faq_retrieval.py — Added a get_faqs_from_db method to fetch FAQs from the database using a course ID and hybrid search parameters.
  • app/web/status/status_update.py — Updated the done method in StatusCallback to handle additional parameters (inconsistencies, improvement) for enhanced status reporting; updated callback URLs to include "iris" in API paths across multiple callback classes.
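The done-method change summarized above can be illustrated with a minimal sketch. This is a hypothetical stand-in, not the real StatusCallback from app/web/status/status_update.py; only the parameter names come from the summary, everything else is assumed for the example.

```python
from typing import List, Optional


class StatusCallback:
    """Minimal stand-in: only the fields this PR touches are modeled."""

    def __init__(self) -> None:
        self.result: Optional[str] = None
        self.suggestions: List[str] = []
        self.inconsistencies: List[str] = []
        self.improvement: str = ""

    def done(
        self,
        result: str,
        suggestions: Optional[List[str]] = None,
        inconsistencies: Optional[List[str]] = None,
        improvement: str = "",
    ) -> None:
        # The new parameters default to empty values, so pre-existing
        # callers that pass only `result` keep working unchanged.
        self.result = result
        self.suggestions = suggestions or []
        self.inconsistencies = inconsistencies or []
        self.improvement = improvement


cb = StatusCallback()
cb.done("rewritten text", inconsistencies=["FAQ 7 conflicts"], improvement="v2")
print(cb.result, cb.inconsistencies, cb.improvement)
```

Callers that predate the PR, passing only `result`, remain source-compatible because every new parameter has a default.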

Possibly related PRs

  • FAQ rewriting pipeline #191: The main PR extends the existing rewriting pipeline by adding FAQ consistency verification and related data fields, building directly on the rewriting pipeline infrastructure introduced in the retrieved PR, which initially implemented the FAQ rewriting pipeline and its core DTOs and callback classes.
  • Add problem statement rewriting to the rewriting pipeline #192: The main PR extends the rewriting pipeline with FAQ consistency checks and related data handling, while the retrieved PR adds support for multiple rewriting variants (FAQ and problem statement) with variant-specific prompts and pipeline initialization. Both modify the RewritingPipeline class but address distinct features: variant handling versus FAQ consistency verification.

Suggested reviewers

  • bassner


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 60463af and 4b87ff1.

📒 Files selected for processing (1)
  • app/pipeline/rewriting_pipeline.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/pipeline/rewriting_pipeline.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image


coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (14)
app/pipeline/prompts/faq_consistency_prompt.py (5)

7-9: Consider removing unnecessary empty lines.

There are several consecutive empty lines here that don't serve a purpose and could be removed to improve readability.

-

-

-

19-27: Improve clarity of inconsistencies detection instructions.

The instructions for the "faqs" entry could be more concise and better organized. Consider restructuring for clarity while maintaining all requirements.

-"faqs": This entry should be a list of Strings, each string represents an FAQ.
--Make sure each faq is separated by comma.
--Also end each faq with a newline character.
--The fields are exactly named faq_id, faq_question_title and faq_question_answer
-and reside within properties dict of each list entry.
--Make sure to only include inconsistent faqs
--Do not include any additional FAQs that are consistent with the final_result.
+"faqs": This entry should be a list of strings, each representing an inconsistent FAQ.
+- Each FAQ must be comma-separated and end with a newline character
+- Only include fields named faq_id, faq_question_title, and faq_question_answer within the properties dict
+- Only include FAQs inconsistent with the final_result, exclude consistent ones

35-40: Simplify suggestion formatting instructions.

The instructions for the "suggestion" entry contain unnecessary newline characters and could be more concise.

-"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.\n
-- Each suggestion should focus on a different inconsistency.
-- Each suggestions highlights what is the inconsistency and how it can be improved. 
-- Do not mention the term final result, call it provided text
-- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.\n
-Both should have the same amount of entries.
+"suggestion": This entry is a list of strings, each representing a suggestion to improve the final result.
+- Each suggestion should focus on a unique inconsistency
+- Highlight both the inconsistency and how it can be improved
+- Refer to "final result" as "provided text"
+- Ensure the number of suggestions exactly matches the number of inconsistencies

42-44: Remove unnecessary empty lines after "improved version" explanation.

There are consecutive empty lines that don't serve a purpose in the prompt.

-"improved version": This entry should be a string that represents the improved version of the final result.
-

-
+"improved version": This entry should be a string that represents the improved version of the final result.

47-47: Missing newline at end of file.

The file is missing a newline character at the end, which is a common convention for Python files.

-"""
+"""
app/domain/status/rewriting_status_update_dto.py (1)

10-10: Fix spacing in type annotation.

There's a missing space between the type annotation and the assignment operator.

-    improvement: str= ""
+    improvement: str = ""
app/retrieval/faq_retrieval.py (4)

77-77: Translate German comment to English for consistency.

The codebase appears to use English for code and comments, but this line contains a German comment.

-        hybrid_factor: float = 0.75,  # Gewichtung zwischen Vektor- und Textsuche
+        hybrid_factor: float = 0.75,  # Weighting between vector and keyword search

79-87: Translate German docstring to English for consistency.

The method docstring is written in German while the rest of the codebase appears to be in English. Consider translating the docstring to maintain language consistency.

-        """
-        Holt FAQs direkt aus der Datenbank, optional mit einer Ähnlichkeitssuche auf question_title und question_answer.
-
-        :param course_id: ID des Kurses, um nur FAQs eines bestimmten Kurses zu holen.
-        :param search_text: Optionaler Suchtext, der für eine semantische Suche verwendet wird.
-        :param result_limit: Anzahl der gewünschten Ergebnisse.
-        :param hybrid_factor: Gewichtung zwischen vektorbasierten und keywordbasierten Ergebnissen (0 = nur Vektor, 1 = nur Keywords).
-        :return: Liste der gefundenen FAQs.
+        """
+        Retrieves FAQs directly from the database, optionally with similarity search on question_title and question_answer.
+
+        :param course_id: ID of the course to filter FAQs for a specific course.
+        :param search_text: Optional search text used for semantic searching.
+        :param result_limit: Number of desired results.
+        :param hybrid_factor: Weighting between vector-based and keyword-based results (0 = vector only, 1 = keywords only).
+        :return: List of retrieved FAQs.

89-90: Consider adding a comment explaining the Filter usage.

Adding a brief English comment explaining what the filter does would improve code readability.

-        # Filter für den Kurs
+        # Filter results by course ID
         filter_weaviate = Filter.by_property("course_id").equal(course_id)

95-98: Add explanatory comments for hybrid search parameters.

The hybrid search parameters could benefit from brief explanations to help future maintainers understand their purpose.

         response = self.collection.query.hybrid(
-            query=search_text,  # Keyword-Suche
-            vector=vec,  # Vektorbasierte Ähnlichkeitssuche
-            alpha=hybrid_factor,  # Mischung aus Vektor- und Textsuche
+            query=search_text,  # Keyword search
+            vector=vec,  # Vector-based similarity search
+            alpha=hybrid_factor,  # Balance between vector and keyword search (0-1)
             return_properties=self.get_schema_properties(),
app/web/status/status_update.py (2)

113-114: Consider extending the docstring to include inconsistencies and improvement.

The method signature adds new parameters inconsistencies and improvement without updating the docstring to reflect them. Adding these parameters to the docstring would promote clarity and maintain consistency.


128-131: Use consistent approach for setting new attributes.

Using hasattr(self.status, ...) before assigning inconsistencies and improvement is sensible. Ensure all new attributes added in the status DTO (e.g., inconsistencies, improvement) follow a similar pattern, so that the presence or absence of corresponding fields in child DTOs won't cause errors.
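The hasattr-guarded pattern described above can be sketched as follows. The class and field names are illustrative stand-ins, not the real DTOs from the codebase; they are assumed for the example.

```python
class StatusDTO:
    """Base DTO that predates the new fields."""

    def __init__(self) -> None:
        self.suggestions = None


class RewritingStatusDTO(StatusDTO):
    """Child DTO that declares the new fields."""

    def __init__(self) -> None:
        super().__init__()
        self.inconsistencies = None
        self.improvement = ""


def apply_updates(status, inconsistencies, improvement):
    # Guard each assignment with hasattr, so DTOs that lack the new
    # fields are skipped instead of silently gaining new attributes.
    if hasattr(status, "inconsistencies"):
        status.inconsistencies = inconsistencies
    if hasattr(status, "improvement"):
        status.improvement = improvement


old, new = StatusDTO(), RewritingStatusDTO()
apply_updates(old, ["conflict with FAQ 7"], "improved text")
apply_updates(new, ["conflict with FAQ 7"], "improved text")
print(hasattr(old, "inconsistencies"), new.improvement)
```

The base DTO passes through `apply_updates` untouched, while the child DTO picks up both values.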

app/pipeline/rewriting_pipeline.py (2)

46-49: Handle VectorDatabase instantiation failures.

Calls to VectorDatabase() and FaqRetrieval(self.db.client) may fail if the database connection is misconfigured or offline. Consider adding try-except blocks or fallback handlers for reliability.


87-90: Initialize newly introduced fields more descriptively.

inconsistencies = [], improvement = "", and suggestions = [] are clear placeholders. Consider adding in-code comments or docstrings around them to clarify their usage, aiding future maintainers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4337c6 and fcf8a5a.

📒 Files selected for processing (7)
  • app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
  • app/domain/status/rewriting_status_update_dto.py (1 hunks)
  • app/pipeline/prompts/faq_consistency_prompt.py (1 hunks)
  • app/pipeline/prompts/faq_rewriting.py (0 hunks)
  • app/pipeline/rewriting_pipeline.py (5 hunks)
  • app/retrieval/faq_retrieval.py (2 hunks)
  • app/web/status/status_update.py (3 hunks)
💤 Files with no reviewable changes (1)
  • app/pipeline/prompts/faq_rewriting.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (12)
app/domain/rewriting_pipeline_execution_dto.py (1)

7-7: LGTM: Course ID field correctly added.

The new course_id field is properly typed as an integer with the appropriate JSON alias "courseId", which aligns well with the PR objective of enhancing FAQ consistency checking by allowing course-specific operations.

app/pipeline/prompts/faq_consistency_prompt.py (1)

1-46: Well-structured prompt with clear guidelines for FAQ consistency checking.

The prompt effectively outlines the task of verifying consistency between FAQs and a final result, with detailed instructions on the expected output format including consistency status, inconsistencies, suggestions, and improvement options.

app/domain/status/rewriting_status_update_dto.py (1)

8-10: New status attributes added for consistency reporting.

The new attributes for suggestions, inconsistencies, and improvement are properly typed and initialized with appropriate default values.

app/retrieval/faq_retrieval.py (2)

5-6: Added appropriate import for the new functionality.

The import of Filter from Weaviate is correctly added for the new database filtering functionality.


72-114: Well-implemented database retrieval method for FAQs.

The new get_faqs_from_db method provides a robust way to fetch FAQs filtered by course ID with optional semantic search capabilities. The method handles both cases (with and without search text) appropriately and returns properly formatted results.

app/web/status/status_update.py (1)

143-144: Resetting inconsistencies but not improvement may be inconsistent.

Currently, self.status.suggestions and self.status.inconsistencies are reset to None at the end of the done method, but improvement is not. If the improvement field is also intended to be ephemeral, consider resetting it as well for consistency or provide a rationale for preserving it.

app/pipeline/rewriting_pipeline.py (6)

1-3: Imports look good.

New imports for json and expanded typing support are appropriate for handling the JSON responses and improved type hints.


21-22: Check usage of FaqRetrieval and VectorDatabase.

These additions expand the pipeline’s retrieval layer. Ensure that database credentials or connections are handled securely and that usage is tested, especially if running in production or with real data.


65-68: Good approach: Reusing stored prompts for rewriting.

Using a standard prompt template helps maintain consistency. Just ensure that any special characters in dto.to_be_rewritten are escaped or handled if needed (e.g., backticks or braces).


91-104: Check potential issues if dto.course_id is missing or invalid.

When variant == "faq", you retrieve FAQs using dto.course_id. Validate that dto.course_id is not None or empty to avoid unexpected behavior from the database call.
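One way to add the suggested guard is sketched below. The names get_faqs_safely and FakeRetriever are invented for illustration and do not exist in the codebase; the retriever stand-in only mimics the get_faqs_from_db signature mentioned in this PR.

```python
import logging


def get_faqs_safely(faq_retriever, course_id, search_text, result_limit=10):
    # Skip retrieval entirely when course_id is missing, instead of
    # passing None into the database query and failing downstream.
    if course_id is None:
        logging.warning("DTO has no course_id; skipping FAQ consistency check.")
        return []
    return faq_retriever.get_faqs_from_db(
        course_id=course_id, search_text=search_text, result_limit=result_limit
    )


class FakeRetriever:
    """Stand-in for FaqRetrieval so the guard can be exercised
    without a Weaviate connection."""

    def get_faqs_from_db(self, course_id, search_text, result_limit):
        return [{"faq_id": 1, "course_id": course_id}]


print(get_faqs_safely(FakeRetriever(), None, "query"))  # []
print(get_faqs_safely(FakeRetriever(), 42, "query"))
```

With the guard in place, a missing course_id degrades to an empty FAQ list rather than an exception inside the pipeline.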


147-154: Validate data structure for parse_inconsistencies.

The function directly accesses faq_id, faq_question_title, and faq_question_answer. If the JSON structure lacks these fields, this will raise a KeyError. A defensive check or default value would make the parsing more robust.
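A defensive variant of the parsing, sketched under the assumption that each entry is a plain dict, could use dict.get with placeholders:

```python
from typing import Dict, List


def parse_inconsistencies(inconsistencies: List[Dict[str, str]]) -> List[str]:
    # dict.get degrades a missing field to a placeholder instead of
    # raising KeyError on malformed LLM output.
    return [
        "FAQ ID: {}, Title: {}, Answer: {}".format(
            entry.get("faq_id", "unknown"),
            entry.get("faq_question_title", ""),
            entry.get("faq_question_answer", ""),
        )
        for entry in inconsistencies
    ]


# An entry with a missing answer field no longer crashes the parser.
print(parse_inconsistencies([{"faq_id": 7, "faq_question_title": "Deadline?"}]))
```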


16-16: ❓ Verification inconclusive

Ensure the new prompt module is committed and tested.

faq_consistency_prompt is imported here. Verify that this module is properly tracked in the repository and tested to confirm that expected content exists in the prompt.


🏁 Script executed:

#!/bin/bash
# Confirm that faq_consistency_prompt.py exists and is tested
fd faq_consistency_prompt.py
rg -A 3 "test_faq_consistency_prompt"

Length of output: 115


I didn't find any direct test references using the string "test_faq_consistency_prompt". To be extra sure, let's run an additional check for any test files that reference faq_consistency_prompt:

Please run this script and share the output so we can confirm whether tests for the new prompt module exist.


🏁 Script executed:

#!/bin/bash
# Searching for any occurrence of "faq_consistency_prompt" in files likely to be tests.
rg -l "faq_consistency_prompt" --glob '*test*'

Length of output: 195


ACTION: Verify prompt module tests are in place

The module faq_consistency_prompt is properly tracked at app/pipeline/prompts/faq_consistency_prompt.py. However, our automated checks did not locate any tests explicitly referencing this module. Please manually verify that tests exist (or add tests if they’re missing) to confirm that the prompt's expected content is validated.

coderabbitai bot previously approved these changes Mar 3, 2025
cremertim changed the title from "Feature/faq/add rewrite consistency" to "Add faq consistency check" on Mar 3, 2025
coderabbitai bot previously approved these changes Mar 3, 2025
coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
app/pipeline/rewriting_pipeline.py (1)

111-151: ⚠️ Potential issue

Add error handling for JSON parsing and missing keys.

There's no error handling for JSON parsing or accessing dictionary keys, which could lead to unhandled exceptions if the LLM returns malformed responses.

Add error handling for JSON parsing and missing keys:

-        result = response.contents[0].text_content
-        data = json.loads(result)
-
-        result_dict = {
-            "type": data["type"],
-            "message": data["message"],
-            "faqs": data["faqs"],
-            "suggestion": data["suggestion"],
-            "improved version": data["improved version"],
-        }
+        result = response.contents[0].text_content
+        try:
+            data = json.loads(result)
+            
+            # Default values for keys that might be missing
+            result_dict = {
+                "type": data.get("type", "unknown"),
+                "message": data.get("message", ""),
+                "faqs": data.get("faqs", []),
+                "suggestion": data.get("suggestion", []),
+                "improved version": data.get("improved version", ""),
+            }
+        except json.JSONDecodeError as e:
+            logging.error(f"Failed to parse JSON from LLM response: {str(e)}")
+            # Return a default dictionary indicating the error
+            result_dict = {
+                "type": "error",
+                "message": f"Failed to parse response: {str(e)}",
+                "faqs": [],
+                "suggestion": [],
+                "improved version": "",
+            }

Also, update the docstring to accurately reflect the return value structure:

        """
        Checks the consistency of the given FAQs with the provided final_result.
        Returns "consistent" if there are no inconsistencies, otherwise returns "inconsistent".

        :param faqs: List of retrieved FAQs.
        :param final_result: The result to compare the FAQs against.
+       :return: A dictionary containing:
+           - type: "consistent" or "inconsistent"
+           - message: A message explaining the result
+           - faqs: List of inconsistent FAQs with details
+           - suggestion: List of suggestions for improvement
+           - improved version: An improved version of the final_result
        """
🧹 Nitpick comments (4)
app/pipeline/rewriting_pipeline.py (4)

1-23: Imports look good, though consider organizing them in logical groups.

The imports are correctly updated to include the necessary modules and types for the new FAQ consistency checking functionality.

Consider grouping imports more logically - standard library imports, then third-party libraries, then application imports - with a blank line between each group:

import json
import logging
from typing import Literal, Optional, List, Dict

from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import (
    ChatPromptTemplate,
)

from app.common.PipelineEnum import PipelineEnum
from app.common.pyris_message import PyrisMessage, IrisMessageRole
from app.domain.data.text_message_content_dto import TextMessageContentDTO
from app.domain.rewriting_pipeline_execution_dto import RewritingPipelineExecutionDTO
from app.llm import CapabilityRequestHandler, RequirementList, CompletionArguments
from app.pipeline import Pipeline
from app.pipeline.prompts.faq_consistency_prompt import faq_consistency_prompt
from app.pipeline.prompts.rewriting_prompts import (
    system_prompt_faq,
    system_prompt_problem_statement,
)
from app.retrieval.faq_retrieval import FaqRetrieval
from app.vector_database.database import VectorDatabase
from app.web.status.status_update import RewritingCallback

133-135: Consider using a variable for temperature value.

The temperature value is hardcoded, which may be fine for this specific case, but a constant or configuration parameter would be more maintainable.

Consider defining the temperature as a class constant or configuration parameter:

+    # Constants for LLM parameters
+    FAQ_CONSISTENCY_TEMPERATURE = 0.0
+    REWRITING_TEMPERATURE = 0.4

    def __init__(
        self, callback: RewritingCallback, variant: Literal["faq", "problem_statement"]
    ):
        # ...existing code...

    # Then in the __call__ method:
    response = self.request_handler.chat(
-        [prompt], CompletionArguments(temperature=0.4), tools=None
+        [prompt], CompletionArguments(temperature=self.REWRITING_TEMPERATURE), tools=None
    )

    # And in check_faq_consistency:
    response = self.request_handler.chat(
-        [prompt], CompletionArguments(temperature=0.0), tools=None
+        [prompt], CompletionArguments(temperature=self.FAQ_CONSISTENCY_TEMPERATURE), tools=None
    )

153-159: Remove debug logging and improve function documentation.

The debug logging statement should be removed, and the function could benefit from better documentation.

Improve the parse_inconsistencies function:

def parse_inconsistencies(inconsistencies: List[Dict[str, str]]) -> List[str]:
-   logging.info("parse consistency")
+   """
+   Formats a list of inconsistency dictionaries into human-readable strings.
+   
+   :param inconsistencies: List of dictionaries containing inconsistency details
+   :return: List of formatted strings describing each inconsistency
+   """
    parsed_inconsistencies = [
        f"FAQ ID: {entry['faq_id']}, Title: {entry['faq_question_title']}, Answer: {entry['faq_question_answer']}"
        for entry in inconsistencies
    ]
    return parsed_inconsistencies

149-149: Consistent logging level for similar events.

There's inconsistent use of logging levels - warning for inconsistencies but info for the consistency check response.

Use consistent logging levels for related information:

-        logging.info(f"Consistency FAQ consistency check response: {result_dict}")
+        if "inconsistent" in result_dict["type"].lower():
+            logging.warning(f"Consistency FAQ consistency check response: {result_dict}")
+        else:
+            logging.info(f"Consistency FAQ consistency check response: {result_dict}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c5617e and 8be2f5c.

📒 Files selected for processing (3)
  • app/domain/status/rewriting_status_update_dto.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (5 hunks)
  • app/web/status/status_update.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/domain/status/rewriting_status_update_dto.py
  • app/web/status/status_update.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (3)
app/pipeline/rewriting_pipeline.py (3)

65-67: Refactored prompt formatting looks good.

The code now uses a format_args dictionary to pass variables to the prompt template, which is a cleaner approach.


93-94: ✅ Verification successful

Verify course_id existence.

The code assumes that dto.course_id exists, but this might not be the case if the DTO wasn't properly initialized or if this field was recently added.

Ensure the RewritingPipelineExecutionDTO class has the course_id field:


🏁 Script executed:

#!/bin/bash
# Check the RewritingPipelineExecutionDTO class for the course_id field
rg -A 20 "class RewritingPipelineExecutionDTO" app/domain/rewriting_pipeline_execution_dto.py

Length of output: 280


Course ID Field Verified

The RewritingPipelineExecutionDTO class has a defined course_id field (with the alias "courseId"), so the assumption in the code is correct. No changes are needed in this part.


46-49: ❓ Verification inconclusive

Database initialization looks good, but verify connection lifecycle.

The Vector Database initialization and FAQ retriever setup are properly implemented.

Make sure the database connection is properly managed throughout the application lifecycle. Consider if the database connection should be created in the constructor or passed in as a dependency.


🏁 Script executed:

#!/bin/bash
# Check whether the VectorDatabase connection is properly closed elsewhere
rg -A 5 "def __del__" app/vector_database/database.py
# Or check if there's a close method that should be called
rg -A 5 "def close" app/vector_database/database.py

Length of output: 109


Ensure Proper Lifecycle Management for Database Connection

The initialization of VectorDatabase and the FAQ retriever in app/pipeline/rewriting_pipeline.py appears fine. However, our checks did not reveal any explicit teardown methods (like __del__ or close) in the VectorDatabase class (referenced from app/vector_database/database.py). Please manually verify that the database connection is properly closed or otherwise managed during the application lifecycle. Consider whether the connection should be instantiated in the constructor or injected as a dependency to ensure resource cleanup.
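One common way to make such a lifecycle explicit is a context-manager wrapper, sketched below. This is a hypothetical pattern, not the real VectorDatabase class; whether the actual Weaviate client needs an explicit close() must be verified against the codebase.

```python
class ManagedConnection:
    """Hypothetical wrapper illustrating a scoped connection lifetime;
    the real VectorDatabase may manage its client differently."""

    def __init__(self) -> None:
        self.closed = False  # stands in for an open client connection

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "ManagedConnection":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs even if the body raises, guaranteeing cleanup.
        self.close()
        return False  # never swallow exceptions


with ManagedConnection() as conn:
    in_scope = conn.closed  # connection is open inside the block
print(in_scope, conn.closed)
```

Injecting such a managed connection into the pipeline, rather than constructing it in the pipeline's own constructor, would also make the cleanup responsibility explicit.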

coderabbitai bot previously approved these changes Mar 6, 2025
coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
app/pipeline/rewriting_pipeline.py (4)

87-109: 🛠️ Refactor suggestion

Add error handling for FAQ retrieval and consistency checking.

While the code correctly processes FAQ inconsistencies, it lacks error handling for potential issues during retrieval or consistency checking, which could cause pipeline failures.

        final_result = response
        inconsistencies = []
        improvement = ""
        suggestions = []

        if self.variant == "faq":
-           faqs = self.faq_retriever.get_faqs_from_db(
-               course_id=dto.course_id, search_text=response, result_limit=10
-           )
-           consistency_result = self.check_faq_consistency(faqs, final_result)
-
-           if "inconsistent" in consistency_result["type"].lower():
-               logging.warning("Detected inconsistencies in FAQ retrieval.")
-               inconsistencies = parse_inconsistencies(consistency_result["faqs"])
-               improvement = consistency_result["improved version"]
-               suggestions = consistency_result["suggestion"]
+           try:
+               faqs = self.faq_retriever.get_faqs_from_db(
+                   course_id=dto.course_id, search_text=response, result_limit=10
+               )
+               if faqs:  # Only check consistency if we have FAQs to check against
+                   consistency_result = self.check_faq_consistency(faqs, final_result)
+                   
+                   if consistency_result and "type" in consistency_result and "inconsistent" in consistency_result["type"].lower():
+                       logging.warning("Detected inconsistencies in FAQ retrieval.")
+                       inconsistencies = parse_inconsistencies(consistency_result.get("faqs", []))
+                       improvement = consistency_result.get("improved version", "")
+                       suggestions = consistency_result.get("suggestion", [])
+               else:
+                   logging.info("No FAQs retrieved for consistency check.")
+           except Exception as e:
+               logging.error(f"Error during FAQ consistency checking: {str(e)}")
+               # Continue with the pipeline even if FAQ checking fails

122-122: 🛠️ Refactor suggestion

Handle potential KeyError for 'properties'.

There's no error handling if 'properties' key is missing from a FAQ entry.

-        properties_list = [entry["properties"] for entry in faqs]
+        properties_list = []
+        for entry in faqs:
+            if "properties" in entry:
+                properties_list.append(entry["properties"])
+            else:
+                logging.warning(f"FAQ entry missing 'properties' key: {entry}")

123-126: 🛠️ Refactor suggestion

Check for empty FAQs list before processing.

There's no check to see if the properties_list is empty before trying to process it, which could lead to unexpected behavior in the consistency checking.

        properties_list = [entry["properties"] for entry in faqs]
+        
+        # Return early if no FAQs to check against
+        if not properties_list:
+            return {
+                "type": "consistent",
+                "message": "No FAQs found to check consistency against.",
+                "faqs": [],
+                "suggestion": [],
+                "improved version": "",
+            }

        consistency_prompt = faq_consistency_prompt.format(
            faqs=properties_list, final_result=final_result
        )

139-139: 🛠️ Refactor suggestion

Add error handling for JSON parsing.

json.loads(result) can raise exceptions if the LLM response is malformed. Add a try-except block to handle this case.

-        result = response.contents[0].text_content
-        data = json.loads(result)
+        result = response.contents[0].text_content
+        try:
+            data = json.loads(result)
+        except json.JSONDecodeError as e:
+            logging.error(f"Failed to parse JSON from LLM response: {e}")
+            return {
+                "type": "error",
+                "message": "Failed to parse consistency check response",
+                "faqs": [],
+                "suggestion": [],
+                "improved version": "",
+            }
🧹 Nitpick comments (3)
app/pipeline/rewriting_pipeline.py (3)

111-121: Adjust the return type annotation for check_faq_consistency.

The method is typed to return Dict[str, str], but it actually returns a dictionary with mixed value types (strings, lists, and possibly other types).

    def check_faq_consistency(
        self, faqs: List[dict], final_result: str
-    ) -> Dict[str, str]:
+    ) -> Dict[str, Any]:  # requires: from typing import Any
        """
        Checks the consistency of the given FAQs with the provided final_result.
        Returns "consistent" if there are no inconsistencies, otherwise returns "inconsistent".

        :param faqs: List of retrieved FAQs.
        :param final_result: The result to compare the FAQs against.

        """

133-135: Consider adding a timeout for the consistency check LLM call.

The LLM request for consistency checking doesn't have a timeout, which could lead to the pipeline hanging if there's an issue with the LLM service.

        response = self.request_handler.chat(
-            [prompt], CompletionArguments(temperature=0.0), tools=None
+            [prompt], CompletionArguments(temperature=0.0, timeout=30), tools=None
        )

149-150: Improve logging for the consistency check response.

The current logging statement could be improved to better indicate the consistency check status.

-        logging.info(f"Consistency FAQ consistency check response: {result_dict}")
+        log_level = logging.WARNING if result_dict.get("type", "").lower() == "inconsistent" else logging.INFO
+        logging.log(log_level, f"FAQ consistency check result: {result_dict.get('type', 'unknown')} - {result_dict.get('message', 'No message')}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8be2f5c and 7ec5dff.

📒 Files selected for processing (2)
  • app/pipeline/prompts/faq_consistency_prompt.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/pipeline/prompts/faq_consistency_prompt.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 11, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a42bcb7 and de0e94c.

📒 Files selected for processing (1)
  • app/retrieval/faq_retrieval.py (2 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/retrieval/faq_retrieval.py

[error] 80-80: E501 line too long (125 > 120 characters)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (4)
app/retrieval/faq_retrieval.py (4)

5-6: Added appropriate Weaviate filter import.

The addition of the Filter import is necessary for implementing the database filtering functionality in the new method.


72-113: Well-implemented FAQ database retrieval method.

This new method enhances the class by providing direct database access to FAQs with flexible search options. The implementation supports both direct retrieval by course ID and hybrid search combining keyword and vector-based approaches.

🧰 Tools
🪛 GitHub Actions: Run linters

[error] 80-80: E501 line too long (125 > 120 characters)


89-100: Robust hybrid search implementation.

The hybrid search implementation elegantly combines keyword-based and vector-based searches using the embedding model, allowing for semantic understanding of search queries while maintaining the course_id filter constraint.


102-107: Well-structured fallback for non-search scenarios.

The method correctly handles the case when no search text is provided by falling back to a simple filtered query, maintaining consistent behavior with the search case.

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 13, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
app/pipeline/prompts/faq_consistency_prompt.py (1)

49-49: ⚠️ Potential issue

Remove or fix incomplete line.

Line 49 contains just the number "49" without any code or assignment, which appears to be a mistake.

-49
🧹 Nitpick comments (3)
app/pipeline/prompts/faq_consistency_prompt.py (3)

35-40: Improve formatting consistency in the suggestions section.

The formatting in the suggestions section is inconsistent with some lines using hyphens and others not. There's also an inconsistent use of \n vs. actual newlines.

-"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.\n
-- Each suggestion should focus on a different inconsistency.
-- Each suggestions highlights what is the inconsistency and how it can be improved.
-- Do not mention the term final result, call it provided text
-- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.\n
-Both should have the same amount of entries.
+"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.
+- Each suggestion should focus on a different inconsistency.
+- Each suggestion highlights what is the inconsistency and how it can be improved.
+- Do not mention the term final result, call it provided text.
+- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.
+- Both should have the same amount of entries.

28-34: Improve clarity in the message section.

The instructions for the "message" field could be more clearly formatted. The current format might cause confusion about what exactly should be included in the message.

-"message": "The provided text was rephrased, however it contains inconsistent information with existing FAQs."
--Make sure to always insert two new lines after the last character of this sentences.
-The affected FAQs can only contain the faq_id, faq_question_title, and faq_question_answer of inconsistent FAQs.
-Make sure to not include any additional FAQs, that are consistent with the final_result.
-Insert the faq_id, faq_question_title, and faq_question_answer of the inconsistent FAQ in the placeholder.
+"message": "The provided text was rephrased, however it contains inconsistent information with existing FAQs."
+- Make sure to always insert two new lines after the last character of this sentence.
+- The affected FAQs can only contain the faq_id, faq_question_title, and faq_question_answer of inconsistent FAQs.
+- Make sure to not include any additional FAQs that are consistent with the final_result.
+- Insert the faq_id, faq_question_title, and faq_question_answer of the inconsistent FAQ in the placeholder.

17-18:

Details

❓ Verification inconclusive

Enhance language detection logic.

The language detection instruction is fairly simplistic. Consider providing more specific guidance on how to extract the language from FAQs or use a more robust approach.

Can the language be more reliably determined from the course_id parameter mentioned in the PR summary rather than inferring it from FAQ content?


🌐 Web query:

What are reliable approaches to detect German vs English text programmatically?

💡 Result:

Reliable approaches to distinguish German from English text programmatically include a combination of statistical methods, machine learning models, and rule-based systems. The most effective methods leverage character-level analysis, pre-trained language models, and specialized libraries:

1. Specialized Language Detection Libraries

  • Lingua ([4][12])
    The most accurate option for short texts, using n-grams (1–5 characters) and rule-based detection of unique characters (e.g., German umlauts ä, ö, ü). Achieves 89.23% accuracy on German single words and 99.7% on sentences.

    # Example using Lingua (Java/JVM, Rust, or Python bindings)
    from lingua import Language, LanguageDetectorBuilder
    detector = LanguageDetectorBuilder.from_languages(Language.GERMAN, Language.ENGLISH).build()
    confidence_values = detector.compute_language_confidence_values("Text hier")
  • FastText ([3][10])
    Facebook's library for text classification. Pre-trained models like lid.176.bin detect 176 languages with high accuracy for sentences but are less reliable for single words.

2. Statistical & Traditional Methods

  • Naive Bayes with Character Bigrams ([10])
    Trained on character sequences (e.g., ch, sch in German vs th, sh in English), achieving 97% accuracy on multilingual datasets. Effective for short texts and computationally lightweight.

    # Python example with sklearn
    from sklearn.pipeline import Pipeline
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.feature_extraction.text import CountVectorizer
    pipeline = Pipeline([("vectorizer", CountVectorizer(analyzer="char", ngram_range=(2,2))),
                         ("model", MultinomialNB())])
  • Langdetect ([3][7])
    A Python port of Google’s language detection system. Simple to implement but struggles with very short texts:

    from langdetect import detect
    detect("Kurzer Text auf Deutsch")  # Returns 'de'

3. Deep Learning & Transformer Models

  • Spark NLP’s LanguageDetectorDL ([6])
    A deep learning model trained on Wikipedia and Tatoeba datasets. Supports 375 languages and integrates with Apache Spark:

    from sparknlp.annotator import LanguageDetectorDL
    detector = LanguageDetectorDL.pretrained().setInputCols(["text"]).setOutputCol("language")
  • Multilingual BERT/XLM-Roberta ([2][11])
    Use embeddings from models like bert-base-multilingual or xlm-roberta-large-xnli for zero-shot classification:

    from transformers import pipeline
    classifier = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")
    result = classifier("Beispieltext", candidate_labels=["English", "German"])

4. Hybrid & Rule-Based Systems

  • Unique Character Detection ([4][14])
    German-specific characters (ß, ä, ö, ü) and compound words (e.g., Donaudampfschifffahrtsgesellschaft) can be flagged using regex or Unicode checks.
    import re
    has_german_umlauts = re.search(r"[äöüß]", text, re.IGNORECASE)

5. Cloud APIs

  • Google Cloud Natural Language API ([14])
    Supports 10+ languages, including German and English, with syntax analysis and entity recognition:
    from google.cloud import language_v1
    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
    response = client.detect_language(request={'document': document})

Comparison of Approaches

| Method | Accuracy (Short Text) | Ease of Use | Speed |
| --- | --- | --- | --- |
| Lingua | 89–99% | Moderate | Fast |
| Naive Bayes (Bigrams) | ~97% | High | Very Fast |
| Langdetect | ~85% | Very High | Fast |
| Spark NLP | >95% | Moderate | Medium |
| Google Cloud API | >95% | High | Slow |

For most use cases, Lingua or Langdetect are optimal for short texts, while Spark NLP or transformer models like XLM-Roberta excel with sentences and paragraphs[4][6][10]. Hybrid approaches combining rule-based checks (e.g., umlauts) with statistical models improve robustness for edge cases[4][14].


Refine Language Detection Strategy

  • The current approach extracts the language solely by scanning the FAQ text, which can be error-prone for short or ambiguous examples.
  • Consider checking if the course_id parameter consistently encodes language information (for example, via identifiable prefixes or patterns). Using this as a primary signal might yield more reliable results.
  • If the course_id isn’t a definitive indicator, integrate a dedicated language detection module (e.g., leveraging libraries such as langdetect, Lingua, or a rule-based approach checking for German-specific characters) to robustly determine whether the course language is German or English.
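
As a minimal sketch of the rule-based fallback suggested above (the word list and threshold are illustrative assumptions; a dedicated library such as Lingua or langdetect is more robust in production):

```python
import re

# German-specific characters and a few very common German function words.
GERMAN_CHARS = re.compile(r"[äöüß]", re.IGNORECASE)
GERMAN_WORDS = {"der", "die", "das", "und", "nicht", "ist", "ein", "eine"}


def guess_language(text: str) -> str:
    """Crude German-vs-English guess; returns 'de' or 'en'."""
    # Umlauts or ß are a strong German signal on their own.
    if GERMAN_CHARS.search(text):
        return "de"
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    hits = sum(1 for t in tokens if t in GERMAN_WORDS)
    # If German function words make up a notable share, call it German.
    return "de" if tokens and hits / len(tokens) >= 0.2 else "en"
```

A hybrid like this covers the short-text edge cases that purely statistical detectors miss, while still being cheap enough to run on every FAQ.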
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de0e94c and 9262d2b.

📒 Files selected for processing (1)
  • app/pipeline/prompts/faq_consistency_prompt.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/pipeline/prompts/faq_consistency_prompt.py

[warning] 44-44: W291 trailing whitespace

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (1)
app/pipeline/prompts/faq_consistency_prompt.py (1)

1-48: Overall prompt structure looks good.

The prompt is well-structured with clear sections for task description, input data, and expected output format. It provides detailed instructions for consistency checking and handling inconsistencies with appropriate formatting requirements.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] 44-44: W291 trailing whitespace

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 19, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (5)
app/pipeline/rewriting_pipeline.py (5)

87-109: Add error handling for FAQ retrieval and consistency checking.

The code lacks error handling for potential issues in the FAQ retrieval or consistency checking process. If either operation fails, the entire pipeline will fail without graceful degradation.

As suggested in a previous review, consider adding try-except blocks around these operations to ensure the pipeline continues even if there are issues with FAQ consistency checking.


122-122: Handle potential KeyError for 'properties'.

The list comprehension assumes that every entry in faqs has a 'properties' key, which may not always be true.

Consider using a safer approach to handle missing keys as suggested in the previous review.


143-143: Add error handling for JSON parsing.

JSON parsing can fail if the response is malformed, leading to an unhandled exception.

Consider adding a try-except block to handle potential JSON parsing errors.


156-162: Add error handling for missing keys in parse_inconsistencies.

The function assumes each inconsistency entry contains specific keys, but there's no error handling if these keys are missing.

As suggested in a previous review, consider using .get() with default values instead of direct dictionary access.
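
A defensive sketch of that suggestion (the key names mirror those used in the consistency prompt; the returned string format is a hypothetical choice, not the pipeline's actual output):

```python
from typing import Any, Dict, List


def parse_inconsistencies(inconsistencies: List[Dict[str, Any]]) -> List[str]:
    """Format inconsistency entries, tolerating missing keys via .get()."""
    parsed = []
    for entry in inconsistencies:
        faq_id = entry.get("faq_id", "unknown")
        title = entry.get("faq_question_title", "")
        answer = entry.get("faq_question_answer", "")
        parsed.append(f"FAQ {faq_id}: {title} - {answer}")
    return parsed
```

With `.get()` defaults, a malformed LLM response degrades to partially filled entries instead of raising `KeyError` mid-pipeline.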


123-126: Check for empty FAQs list before processing.

There's no check to see if the properties list is empty before trying to process it.

As suggested in a previous review, consider adding a check for an empty properties list.

🧹 Nitpick comments (5)
app/pipeline/rewriting_pipeline.py (5)

97-101: Replace magic string "inconsistent" with a constant or enum.

Using string literals for checking condition types can lead to maintenance issues if the type values change in the future.

Consider defining a constant or enum for consistency types:

+ # At the top of the file with other imports
+ from enum import Enum
+ 
+ class ConsistencyType(Enum):
+     CONSISTENT = "consistent"
+     INCONSISTENT = "inconsistent"

# Then in the condition check:
- if "inconsistent" in consistency_result["type"].lower():
+ if ConsistencyType.INCONSISTENT.value in consistency_result["type"].lower():

134-134: Add a comment explaining the temperature setting.

The temperature is set to 0.0, which is good for deterministic responses in consistency checking, but lacks documentation.

Add a comment explaining why you're using temperature 0.0:

        response = self.request_handler.chat(
-            [prompt], CompletionArguments(temperature=0.0), tools=None
+            [prompt], CompletionArguments(temperature=0.0), tools=None  # Using temperature 0.0 for deterministic consistency checking results
        )

146-153: Simplify result dictionary construction.

The current approach of checking each key individually is verbose and could be simplified.

Use dictionary comprehension for cleaner code:

-        result_dict = {}
-
-        keys_to_check = ["type", "message", "faqs", "suggestion", "improved version"]
-
-        for key in keys_to_check:
-            if key in data:
-                result_dict[key] = data[key]
+        keys_to_check = ["type", "message", "faqs", "suggestion", "improved version"]
+        result_dict = {key: data[key] for key in keys_to_check if key in data}
+
+        # Ensure required keys have default values
+        if "type" not in result_dict:
+            result_dict["type"] = "consistent"

157-157: Improve logging in parse_inconsistencies.

Logging the entire inconsistencies list could be verbose and potentially expose sensitive information.

- logging.info(f"Parsing inconsistencies: {inconsistencies}")
+ logging.info(f"Parsing {len(inconsistencies)} inconsistencies")

112-112: Improve type hint for faqs parameter.

The type hint List[dict] is less specific than it could be.

- def check_faq_consistency(
-     self, faqs: List[dict], final_result: str
- ) -> Dict[str, str]:
+ def check_faq_consistency(
+     self, faqs: List[Dict[str, Any]], final_result: str
+ ) -> Dict[str, Any]:

The return type should also be Dict[str, Any] (with Any imported from typing) since it can contain lists and other types, not just strings.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9262d2b and 0bf244f.

📒 Files selected for processing (2)
  • app/pipeline/prompts/faq_consistency_prompt.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/pipeline/prompts/faq_consistency_prompt.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image

Comment on lines +140 to +142
if not result or result.strip() == "":
result = '{"type": "consistent"}'



🛠️ Refactor suggestion

Add more comprehensive error handling for empty responses.

The code handles empty responses by defaulting to a consistent result, but doesn't log this fallback behavior.

Improve the error handling by adding logging:

        if not result or result.strip() == "":
+            logging.warning("Received empty response from consistency check, defaulting to 'consistent'")
            result = '{"type": "consistent"}'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if not result or result.strip() == "":
result = '{"type": "consistent"}'
if not result or result.strip() == "":
logging.warning("Received empty response from consistency check, defaulting to 'consistent'")
result = '{"type": "consistent"}'
🤖 Prompt for AI Agents
In app/pipeline/rewriting_pipeline.py around lines 140 to 142, the code defaults
to a consistent result when the response is empty but does not log this
fallback. Add a logging statement before setting the default result to record
that an empty or invalid response was received and the fallback was applied,
improving traceability and debugging.
