
fix(full_techniques): add proper result handling for 75-technique evaluation mode#265

Merged
ComBba merged 3 commits into main from
fix/full-techniques-sommelier-display
Feb 9, 2026

Conversation

@ComBba ComBba (Contributor) commented Feb 9, 2026

Summary

  • Add full_techniques branch in save_evaluation_results() to convert BMAD category scores to sommelier_outputs format
  • Add BMAD category themes (Problem Definition, Technical Design, Implementation, Documentation) to frontend sommeliers.ts
  • Add SOMMELIER_ROLES mapping for full_techniques categories in api.ts

Problem

When running 75-technique evaluation mode (full_techniques), all results were incorrectly displayed as "Jean-Pierre" / "Final Synthesis" because:

  1. Backend: save_evaluation_results() had no handling for full_techniques mode, so it fell through to the six_sommeliers logic, which found no results
  2. Frontend: getSommelierTheme() returned the jeanpierre theme as a fallback whenever an ID didn't match any defined sommelier

Solution

  1. Backend (evaluation_service.py):

    • Added elif is_full_techniques: branch to handle BMAD category scores
    • Converts 4 BMAD categories (A: Problem Definition, B: Technical Design, C: Implementation, D: Documentation) to sommelier_outputs format
    • Scales raw scores to 0-100 range
  2. Frontend (sommeliers.ts):

    • Added 4 new sommelier themes for BMAD categories with appropriate colors and images
  3. Frontend (api.ts):

    • Added SOMMELIER_ROLES mapping for the new category names
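
The backend branch described above might look roughly like the following sketch. The shape of BMAD_CATEGORY_CONFIG, the result dicts, and the helper name are assumptions based on the PR description, not the actual evaluation_service.py code:

```python
# Hypothetical sketch of the full_techniques branch described in this PR.
# The config shape and per-category max scores are assumptions.
BMAD_CATEGORY_CONFIG = {
    "A": {"name": "Problem Definition", "max_score": 25},
    "B": {"name": "Technical Design", "max_score": 25},
    "C": {"name": "Implementation", "max_score": 25},
    "D": {"name": "Documentation", "max_score": 25},
}

def build_sommelier_outputs(category_scores: dict) -> list[dict]:
    """Convert raw BMAD category scores into sommelier_outputs entries."""
    outputs = []
    for key, cfg in BMAD_CATEGORY_CONFIG.items():
        raw_score = category_scores.get(key, 0)
        # Scale the raw score into the 0-100 range the UI expects.
        scaled_score = round(raw_score / cfg["max_score"] * 100)
        outputs.append({
            "sommelier_id": cfg["name"].replace(" ", "").lower(),
            "role": cfg["name"],
            "score": scaled_score,
        })
    return outputs
```

The lowercase IDs (e.g. problemdefinition) line up with the theme keys the frontend change adds, which is what lets getSommelierTheme() resolve something other than the jeanpierre fallback.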

Files Changed

  • backend/app/services/evaluation_service.py - Added full_techniques result handling
  • frontend/src/lib/sommeliers.ts - Added BMAD category themes
  • frontend/src/lib/api.ts - Added role mappings

Summary by CodeRabbit

Release Notes

  • New Features
    • Added a detailed per-category evaluation mode: Problem Definition, Technical Design, Implementation, Documentation
    • Per-category score calculation and aggregate evaluation results
    • New themes and visualizations for the added evaluation categories

…luation mode

- Add full_techniques branch in save_evaluation_results() to convert BMAD category scores to sommelier_outputs format
- Add BMAD category themes (Problem Definition, Technical Design, Implementation, Documentation) to frontend sommeliers.ts
- Add SOMMELIER_ROLES mapping for full_techniques categories in api.ts

Fixes issue where all 75-technique results displayed as Jean-Pierre due to missing mode-specific handling

@vercel

vercel bot commented Feb 9, 2026

The latest updates on your projects:

Project: somm-dev | Status: Ready | Actions: Preview, Comment | Updated (UTC): Feb 9, 2026 3:54pm

@gemini-code-assist

Summary of Changes

Hello @ComBba, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the 75-technique evaluation mode was failing to display results accurately on the frontend. The changes introduce robust backend logic to correctly parse and format BMAD category scores, which are then seamlessly integrated into the frontend with new, distinct themes and role mappings. This ensures that users receive comprehensive and visually appropriate evaluation feedback for the full_techniques mode.

Highlights

  • Backend Result Handling: The save_evaluation_results() function in the backend now includes a dedicated branch to correctly process and convert BMAD category scores for the full_techniques evaluation mode, scaling them to a 0-100 range.
  • Frontend Theme Integration: New SommelierTheme definitions have been added to sommeliers.ts for the four BMAD categories (Problem Definition, Technical Design, Implementation, Documentation), complete with unique visual elements like emojis, colors, and images.
  • API Role Mapping: The SOMMELIER_ROLES mapping in api.ts has been extended to correctly associate the new BMAD categories with their respective display roles, ensuring proper data interpretation on the frontend.
Changelog
  • backend/app/services/evaluation_service.py
    • Introduced an is_full_techniques flag to identify the 75-technique evaluation mode.
    • Added a new conditional branch to handle full_techniques evaluation data, extracting normalized_score, quality_gate, and coverage.
    • Implemented logic to iterate through BMAD_CATEGORY_CONFIG (Problem Definition, Technical Design, Implementation, Documentation), retrieve raw scores, and scale them to a 0-100 range.
    • Constructed detailed summaries and recommendations for each BMAD category based on trace metadata, then appended them as SommelierOutput objects.
  • frontend/src/lib/api.ts
    • Expanded the SOMMELIER_ROLES mapping to include new entries for 'Problem Definition', 'Technical Design', 'Implementation', and 'Documentation', mapping them to 'BMAD Category A' through 'D' respectively.
  • frontend/src/lib/sommeliers.ts
    • Added four new SommelierTheme objects: problemdefinition, technicaldesign, implementation, and documentation.
    • Each new theme includes properties such as id, name, role, emoji, color, bgColor, borderColor, textColor, lightBg, description, and image for distinct visual representation.
Activity
  • No human activity has been recorded on this pull request yet.


coderabbitai bot commented Feb 9, 2026


Walkthrough

Adds a new "full_techniques" evaluation mode to the backend evaluation service that computes per-category scores and produces detailed evaluation results. The frontend supports this with four BMAD category mappings and matching theme definitions.

Changes

Cohort / File(s) Summary
Backend Evaluation Logic
backend/app/services/evaluation_service.py
Adds a new "full_techniques" evaluation-mode branch. Uses BMAD_CATEGORY_CONFIG to compute scores for the four categories, derive grades, generate per-category SommelierOutput objects, and track technique progress.
Frontend API Mapping
frontend/src/lib/api.ts
Adds four new keys to the SOMMELIER_ROLES mapping in getEvaluationResult: 'Problem Definition', 'Technical Design', 'Implementation', and 'Documentation', each mapped to its BMAD category.
Frontend Theme Configuration
frontend/src/lib/sommeliers.ts
Adds four new entries to SOMMELIER_THEMES: problemdefinition, technicaldesign, implementation, documentation. Each is a complete SommelierTheme object including id, name, role, emoji, color, bgColor, borderColor, textColor, lightBg, description, and image.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Four new categories softly arrive,
Evaluation's elegance, added and divided,
Sommeliers lighting each facet of technique,
From problem to documentation, our journey complete,
Finished perfectly in a dance of fine-grained scores! 🍇✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change (adding proper result handling for the 75-technique evaluation mode) and fully matches the core purpose of the changeset.



… technique details

- Change full_techniques result to use 8 tasting note categories instead of 4 BMAD categories
- Add TechniqueDetail model to backend and type to frontend
- Add expandable technique details section in SommelierCard showing individual technique results
- Add 8 tasting note themes (Aroma, Palate, Body, Finish, Balance, Vintage, Terroir, Cellar Notes)
- Each category shows success/failed/skipped techniques with expand/collapse UI
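
The per-category grouping described above could be sketched as follows. The field names (category, status, name) and the helper are assumptions for illustration; the actual TechniqueDetail model is not shown on this page:

```python
from collections import defaultdict

def group_by_status(details: list[dict]) -> dict:
    """Group hypothetical technique-detail records by category, then by
    success/failed/skipped status, as the expandable UI would consume them."""
    grouped = defaultdict(lambda: {"success": [], "failed": [], "skipped": []})
    for detail in details:
        grouped[detail["category"]][detail["status"]].append(detail["name"])
    return dict(grouped)
```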

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/services/evaluation_service.py`:
- Around line 371-372: The computed scaled_score can exceed 100 if raw_score >
max_score; update the scaling logic in evaluation_service.py (the raw_score,
max_score, scaled_score computation) to clamp the result into the 0–100 range
(e.g., compute the fraction, multiply by 100, then apply min(max(..., 0), 100)
and convert to int) so UI components like getScoreTier or progress bars never
receive values >100.
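
A minimal version of the clamping fix this comment asks for might look like the following (the function name is illustrative):

```python
def scale_score(raw_score: float, max_score: float) -> int:
    """Scale a raw category score into 0-100, clamped so out-of-range
    inputs (e.g. raw_score > max_score) can never reach the UI."""
    fraction = raw_score / max_score if max_score else 0.0
    return int(min(max(fraction * 100, 0), 100))
```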
🧹 Nitpick comments (2)
backend/app/services/evaluation_service.py (2)

346-347: Using round() instead of int() would be more accurate.

int(normalized_score) truncates the fractional part (e.g. 84.9 → 84). The grand_tasting path uses int(aggregate_score * 20), so the pattern is similar, but when converting a score in the 0-100 range, round() better matches user expectations.

♻️ Suggested round() change
-        overall_score = int(normalized_score)
+        overall_score = round(normalized_score)

358-363: Consider extracting BMAD_CATEGORY_CONFIG to a module-level constant.

Uppercase naming is the Python convention for module-level constants, but the dict is currently defined inside the function and recreated on every call. The performance impact is negligible; moving it to the top of the module would still improve testability and reusability.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses a bug where evaluation results for the full_techniques mode were not handled correctly. The changes introduce a new logic branch in the backend to process and format these results, and the frontend is updated accordingly with new themes and mappings to display them. The implementation is clear and follows the existing structure. I've included one suggestion to improve maintainability by defining a configuration dictionary as a module-level constant.

Comment on lines +358 to +363

summary = (
f"Comprehensive evaluation using 75 techniques. "
f"Quality Gate: {quality_gate}. "
f"Coverage: {coverage * 100:.1f}%."
)


medium

For better maintainability and performance, consider defining BMAD_CATEGORY_CONFIG as a module-level constant outside of this function. Since it's a static configuration, it doesn't need to be redefined on every function call. This also makes it clearer that this is a fixed configuration for the full_techniques mode.
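
As a sketch of the suggested refactor (the config contents and the helper shown are assumptions; the real function body is not visible on this page):

```python
# Module-level constant: built once at import time instead of on every
# call to the evaluation function, per the review suggestion above.
BMAD_CATEGORY_CONFIG = {
    "A": "Problem Definition",
    "B": "Technical Design",
    "C": "Implementation",
    "D": "Documentation",
}

def label_category_scores(scores: dict) -> dict:
    """Replace category keys (A-D) with their display names; the function
    body now only reads the shared constant."""
    return {BMAD_CATEGORY_CONFIG[k]: v for k, v in scores.items()}
```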

- Move TASTING_NOTE_CONFIG to module-level constant for maintainability
- Use round() instead of int() for normalized_score (more accurate)
- Clamp scaled_score to 0-100 range to prevent UI issues
@ComBba ComBba merged commit f8b4c5b into main Feb 9, 2026
4 of 5 checks passed
@ComBba ComBba deleted the fix/full-techniques-sommelier-display branch February 9, 2026 15:55
ComBba added a commit that referenced this pull request Feb 10, 2026
…lier-display

fix(full_techniques): add proper result handling for 75-technique evaluation mode
