feat(search): More accurate Chinese-language resource search #2631

Merged
Big-Cake-jpg merged 1 commit into dev from feat/comp-search on Apr 5, 2026

Pigeon0v0 (Contributor) commented Mar 21, 2026

Some of the code comes from the mainline 2.12.3 update.

Summary by Sourcery

Improve Chinese resource search accuracy and search weighting while refining related metadata handling and logging.

New Features:

  • Introduce an alternative CurseForge search text specifically for Chinese searches to better align with CurseForge API behavior.

Bug Fixes:

  • Prevent duplicate Minecraft game versions from appearing in parsed CurseForge and Modrinth version lists.

Enhancements:

  • Refine Chinese search term extraction and weighting to derive more accurate English keywords for Mod search across platforms.
  • Generalize the weighted search source model to support multiple aliases per source and update all callers to use the new abstraction.
  • Adjust project scoring to better reward exact matches and normalize similarity contributions.
  • Improve keyword post-processing, including special handling for OptiForge and OptiFabric, and more robust filtering of noisy terms.
  • Enrich de-duplication logs with current result counts for easier debugging.
  • Skip downloading unnecessary Modrinth changelog data to reduce request payloads.

pcl-ce-automation bot added the labels 🛠️ 等待审查 (awaiting review: the pull request is complete and awaits code review by a maintainer or owner) and size: L (PR size estimate: large) on Mar 21, 2026

sourcery-ai bot commented Mar 21, 2026

Reviewer's Guide

Refines the multilingual search system (especially for Chinese queries) across mods, favorites, local files, saves, and help pages by introducing a structured SearchSource abstraction with aliases, improving keyword extraction and weighting, adding CurseForge-specific search text handling, tuning scoring, and making a few related API and data-cleanup adjustments.

Sequence diagram for refined Chinese mod search with CurseForge-specific handling

sequenceDiagram
    actor User
    participant UI_SearchBox
    participant ModSearchService
    participant CurseForgeAPI
    participant ModrinthAPI

    User->>UI_SearchBox: Enter Chinese keywords
    UI_SearchBox->>ModSearchService: StartSearch(filter)

    ModSearchService->>ModSearchService: Build SearchEntry list
    ModSearchService->>ModSearchService: Search(entries, SearchText, 40, 0.2)
    ModSearchService->>ModSearchService: ExtractWords() per result
    ModSearchService->>ModSearchService: Aggregate WordWeights
    ModSearchService->>ModSearchService: Choose SearchText and CurseForgeAltSearchText
    ModSearchService->>ModSearchService: processKeywords(SearchText)
    ModSearchService->>ModSearchService: processKeywords(CurseForgeAltSearchText)

    ModSearchService->>CurseForgeAPI: GET /mods?searchFilter=CurseForgeAltSearchText
    ModSearchService->>ModrinthAPI: GET /search?query=SearchText

    CurseForgeAPI-->>ModSearchService: CurseForge results
    ModrinthAPI-->>ModSearchService: Modrinth results

    ModSearchService->>UI_SearchBox: Display merged search results
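
The last two requests in the diagram can be sketched as follows. This is a minimal Python illustration, not the project's VB.NET code: the field names mirror CompSearchRequest, while the helper names `curseforge_query` and `modrinth_query` are hypothetical. It assumes, as the file-level changes state, that CurseForge falls back to SearchText when the alternative text is null.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class CompSearchRequest:
    # Field names follow the PR's CompSearchRequest; the rest is hypothetical.
    search_text: str
    curseforge_alt_search_text: Optional[str] = None


def curseforge_query(request: CompSearchRequest) -> str:
    """Build the CurseForge query string, preferring the alternative text."""
    text = request.curseforge_alt_search_text
    if text is None:
        # Fall back to the primary search text when no alternative exists.
        text = request.search_text
    return urlencode({"searchFilter": text})


def modrinth_query(request: CompSearchRequest) -> str:
    """Modrinth always receives the primary search text."""
    return urlencode({"query": request.search_text})
```

With this shape, only the CurseForge side needs to know that an alternative text exists; the Modrinth path is unchanged.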

Class diagram for updated search model with SearchSource and SearchEntry

classDiagram
    class SearchEntry_T_ {
        +T Item
        +List_SearchSource_ SearchSource
        +double Similarity
        +bool AbsoluteRight
    }

    class SearchSource {
        +string[] Aliases
        +double Weight
        +SearchSource(aliases string[], weight double)
        +SearchSource(text string, weight double)
    }

    class SearchModule {
        +double SearchSimilarityWeighted(source List_SearchSource_, query string)
        +List_SearchEntry_T_ Search(entries List_SearchEntry_T_, query string, maxBlurCount int, minBlurSimilarity double)
    }

    class CompSearchRequest {
        +string SearchText
        +string CurseForgeAltSearchText
    }

    SearchEntry_T_ --> SearchSource : uses *
    SearchModule --> SearchSource : weights
    SearchModule --> SearchEntry_T_ : evaluates
    SearchModule --> CompSearchRequest : fills
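
As a rough picture of the model in the class diagram, the two data classes might look like this in Python. The real types are VB.NET; the snake_case names and the default values here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Generic, List, Sequence, TypeVar, Union

T = TypeVar("T")


class SearchSource:
    """One weighted text source that may be known under several aliases."""

    def __init__(self, aliases: Union[str, Sequence[str]], weight: float) -> None:
        # Python has no constructor overloading, so a single string is wrapped
        # into a one-element list to mirror the two constructors in the diagram.
        if isinstance(aliases, str):
            aliases = [aliases]
        self.aliases: List[str] = list(aliases)
        self.weight = weight


@dataclass
class SearchEntry(Generic[T]):
    item: T
    search_source: List[SearchSource]
    similarity: float = 0.0        # default value is an assumption
    absolute_right: bool = False   # default value is an assumption
```

Keeping a single-text constructor means call sites that only have one name per source do not need to build an array themselves.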

File-Level Changes

Change: Introduce SearchSource abstraction and update weighted similarity/search logic to support multiple aliases per text source.
Details:
  • Change SearchSource from List(Of KeyValuePair(Of String, Double)) to List(Of SearchSource) in SearchEntry and all call sites.
  • Implement SearchSource class with alias array and weight, plus constructors for text and alias arrays.
  • Update SearchSimilarityWeighted to use max similarity over aliases per source, weighted by source weight.
  • Normalize aliases (remove spaces, lowercase) before exact-part matching in the Search function.
Files:
  • Plain Craft Launcher 2/Modules/Base/ModBase.vb
  • Plain Craft Launcher 2/Pages/PageInstance/PageInstanceCompResource.xaml.vb
  • Plain Craft Launcher 2/Pages/PageInstance/PageInstanceSaves/PageInstanceSavesDatapack.xaml.vb
  • Plain Craft Launcher 2/Pages/PageTools/PageToolsHelp.xaml.vb
  • Plain Craft Launcher 2/Pages/PageDownload/PageDownloadCompFavorites.xaml.vb
  • Plain Craft Launcher 2/Pages/PageInstance/PageInstanceSaves.xaml.vb
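
The "max similarity over aliases, weighted by source weight" rule can be sketched in Python as below. Two things are assumptions rather than the PR's code: the `similarity` function is a stand-in built on `difflib`, since the launcher's own metric is not shown here, and the zero-weight guard reflects the review suggestion rather than the merged implementation.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple

# Each source is (aliases, weight); a stand-in for the SearchSource class.
Source = Tuple[Sequence[str], float]


def similarity(a: str, b: str) -> float:
    """Stand-in metric (assumption): difflib ratio on lowercased text."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search_similarity_weighted(sources: List[Source], query: str) -> float:
    total_weight = sum(weight for _, weight in sources)
    if total_weight <= 0:
        # Guard against dividing by zero for empty or all-zero source lists.
        return 0.0
    score = 0.0
    for aliases, weight in sources:
        # A source counts with its best-matching alias only.
        best = max((similarity(alias, query) for alias in aliases), default=0.0)
        score += best * weight
    return score / total_weight
```

Taking the maximum per source means adding a Chinese alias next to an English name cannot lower the score of a query that already matches the English name.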

Change: Improve Chinese mod search by extracting and weighting candidate English keywords and handling CurseForge-specific search filter behavior.
Details:
  • Build SearchEntry sources for Chinese names using aliases (primary name and suffix/slug combination with different weights).
  • Increase Search() result window and adjust minimum similarity for Chinese search.
  • Extract English-like words from top search results, filter stopwords/numbers/special cases, and accumulate word weights based on similarity, giving very high weight to exact alias matches.
  • Derive Request.SearchText and a new Request.CurseForgeAltSearchText from weighted words, with different selection rules for exact vs fuzzy matches, and log the chosen keywords.
  • Introduce processKeywords helper to normalize/filter keywords, reuse for both SearchText and CurseForgeAltSearchText, and keep OptiForge/OptiFabric special-case handling.
  • Use CurseForgeAltSearchText when building CurseForge API searchFilter, falling back to SearchText when alternative text is null.
Files:
  • Plain Craft Launcher 2/Modules/Minecraft/ModComp.vb
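
The word extraction and weighting step can be illustrated with a small Python sketch. The stopword list, the bonus value, and the function names are all hypothetical; the PR's actual filters and selection rules (including the OptiForge/OptiFabric handling) are not reproduced here.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

STOPWORDS = {"the", "of", "and", "for", "mod", "a"}  # illustrative list (assumption)
EXACT_MATCH_BONUS = 1000.0  # arbitrary "very high" weight (assumption)


def extract_words(text: str) -> List[str]:
    """Return lowercase English-like tokens, dropping stopwords and digit-only tokens."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return [t.lower() for t in tokens
            if not t.isdigit() and t.lower() not in STOPWORDS]


def aggregate_word_weights(
    results: Iterable[Tuple[str, float, bool]]
) -> Dict[str, float]:
    """results holds (english_name, similarity, is_exact_alias_match) triples."""
    weights: Dict[str, float] = defaultdict(float)
    for name, sim, exact in results:
        for word in extract_words(name):
            # Exact alias matches dominate; fuzzy matches add their similarity.
            weights[word] += EXACT_MATCH_BONUS if exact else sim
    return dict(weights)
```

For example, if "Twilight Forest" and "Forest Expansion" are both fuzzy hits, "forest" accumulates weight from both and becomes the strongest keyword candidate.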

Change: Tune scoring and result handling for component searches and metadata.
Details:
  • Change similarity contribution to Scores so that absolute-right matches get a fixed strong bonus and relative scaling differs when top result is absolute-right.
  • Add an abort check before adding sorted Scores to Storage.Results to respect task cancellation.
  • Improve logging of result accumulation to include current Storage.Results count.
  • Deduplicate GameVersions lists from CurseForge and Modrinth before sorting and trimming.
  • Add include_changelog=false to Modrinth versions API request to reduce payload.
Files:
  • Plain Craft Launcher 2/Modules/Minecraft/ModComp.vb
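
The GameVersions cleanup (deduplicate, then sort, then trim) can be sketched in Python. The sort key and the `limit` parameter are assumptions for illustration; the launcher's real ordering rules are not described on this page.

```python
from typing import Iterable, List, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.20.1' into (1, 20, 1); non-numeric parts count as 0 (assumption)."""
    return tuple(int(p) if p.isdigit() else 0 for p in version.split("."))


def clean_game_versions(versions: Iterable[str], limit: int) -> List[str]:
    # Drop duplicates while keeping the first occurrence of each version.
    unique = list(dict.fromkeys(versions))
    # Sort newest first, then trim to the requested length.
    unique.sort(key=version_key, reverse=True)
    return unique[:limit]
```

Deduplicating before trimming matters: otherwise repeated entries could take up slots that distinct versions should occupy.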

Possibly linked issues

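  • #unnumbered (mod search optimization suggestion): This PR rewrites and enhances Chinese search along with the keyword weighting and matching logic, directly improving mod search accuracy, which corresponds to that suggestion.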

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • In Search, processedSources is now a List(Of String()) and appears to be unused after being built; if the intent is only to normalize Aliases in place, consider replacing the Select with a simple For Each over Entry.SearchSource and dropping processedSources altogether to avoid confusion and unnecessary allocations.
  • In the ExtractWords lambda, the condition If w.Split(" ").Count > 3 AndAlso w.Contains("ftb") Then Return False will never be true because w is already a single token (no spaces); consider removing or rewriting this check so the FTB special case actually works as intended.
  • The new SearchSimilarityWeighted implementation assumes totalWeight > 0; if a caller ever passes an empty or all‑zero SearchSource list this will cause a divide‑by‑zero, so it may be safer to early‑return 0 when totalWeight is 0.
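
The second point can be checked with a short Python rendering of the VB.NET condition. It assumes, as the comment states, that the words passed to the check are already single tokens obtained by splitting on spaces; the function name is hypothetical.

```python
def ftb_check_fires(token: str) -> bool:
    # Python rendering of: w.Split(" ").Count > 3 AndAlso w.Contains("ftb")
    return len(token.split(" ")) > 3 and "ftb" in token


# Tokens produced by splitting on spaces contain no spaces themselves,
# so splitting each one again always yields exactly one part.
tokens = "ftb ultimine for forge edition".split(" ")
fired = [t for t in tokens if ftb_check_fires(t)]
```

Here `fired` stays empty for any token list, whereas applying the same check to the whole, unsplit name would trigger it. That suggests the check belongs before tokenization if the FTB special case is meant to apply.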

pcl-ce-automation bot added the 🕑 等待合并 label (awaiting merge: handling is complete and the code is waiting to be merged into the main branch) and removed the 🛠️ 等待审查 label (awaiting review: the pull request is complete and awaits code review by a maintainer or owner) on Apr 5, 2026
MoYuan-CN linked an issue on Apr 5, 2026 that may be closed by this pull request (4 tasks)
Big-Cake-jpg merged commit 9c9db3f into dev on Apr 5, 2026 (3 checks passed)
pcl-ce-automation bot added the 👌 完成 label (done: the related issue has been fixed or the feature implemented, and it is scheduled to ship in the next version update) and removed the 🕑 等待合并 label (awaiting merge) on Apr 5, 2026
Big-Cake-jpg deleted the feat/comp-search branch on April 5, 2026 15:56
LuLu-ling mentioned this pull request on Apr 5, 2026 (81 tasks)
SALTWOOD added a commit to PCL-Community/PCL-CSharpE that referenced this pull request on Apr 6, 2026

Labels

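size: L (PR size estimate: large), 👌 完成 (done: the related issue has been fixed or the feature implemented, and it is scheduled to ship in the next version update)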

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When searching for a mod, the corresponding mod is not displayed

4 participants