Fix #12271: Integrity checker for year, location, and page numbers in booktitle#15465
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Review Summary by QodoEnhance BooktitleChecker with year, location, and page detection
WalkthroughsDescription• Adds three new integrity checks to BooktitleChecker for year, location, and page numbers • Detects 4-digit years (1000–2999) in booktitle fields • Detects country names from UN-recognized list using pre-compiled regex • Detects explicit page-number patterns (pp., p., pages keywords) • Creates Countries utility class with hard-coded country name set • Adds comprehensive parameterized tests covering all new checks • Adds localization keys for new warning messages Diagramflowchart LR
BC["BooktitleChecker"]
YC["Year Check<br/>1000-2999"]
CC["Country Check<br/>UN list"]
PC["Page Check<br/>pp/pages"]
CO["Countries<br/>utility class"]
L10N["Localization<br/>keys"]
BC --> YC
BC --> CC
BC --> PC
CC --> CO
BC --> L10N
File Changes1. jablib/src/main/java/org/jabref/logic/integrity/BooktitleChecker.java
|
Code Review by Qodo
1.
|
|
The |
CI Infrastructure Issue:
|
✅ All tests passed ✅🏷️ Commit: 7a50c37 Learn more about TestLens at testlens.app. |
|
I would appreciate it if someone could review and merge it, please |
Enhance BooktitleChecker to flag booktitle values that contain: - A 4-digit year (e.g. 2015) - A country name (e.g. Norway, Austria, Singapore) - Explicit page-number patterns (e.g. "pp. 1–10", "pages 3-7") Add Countries.java with a hard-coded set of all UN-recognised country names used for the country-presence check. The set is built as a single pre-compiled regex alternation so the pattern is compiled only once. Update BooktitleCheckerTest with parameterised tests covering all three new integrity rules and the blank-value / valid-value edge cases. Closes JabRef#12271
f31059c to
5108512
Compare
- Remove separate Countries.java utility class; inline the Locale-derived country set directly into BooktitleContainsCountryChecker - Use Locale.of() instead of Locale.Builder (matches existing JabRef convention in Language.java) - Remove javadoc from new checkers to match existing checker style (BracketChecker, TitleChecker, YearChecker have none)
6d73d94 to
2268a62
Compare
Previously the year regex used digit-only lookarounds, so tokens like
ICML2015 matched the 2015 and triggered a false positive. Switching the
boundary to \p{Alnum} mirrors the country checker and prevents matching
years that are embedded inside larger alphanumeric tokens.
Also adds a regression test and a CHANGELOG entry under [Unreleased].
|
Pushed Qodo review finding 5 — year regex over-matches digits inside tokens The previous pattern Added a regression test in @Test
void booktitleYearNotFlaggedInsideAlphanumericToken() {
// "ICML2015" should NOT be flagged — the digits are part of a larger token
assertEquals(Optional.empty(), yearChecker.checkValue("Proceedings ICML2015"));
}CHANGELOG entry Added one line under
@Siedlerchr — your earlier inline feedback Both points should now be resolved on the current HEAD:
Windows CI red check on this PR The Windows-side failure ( Ready for re-review. |
Please avoid using AI to communicate with us, It is against our guidelines for contribution/interaction. |
|
Status update after the last push:
All other checks are green. Ready for re-review. |
As mentioned above, this would be a final warning - please respect our code of conduct. |
Sorry about that, you're right. It won't happen again. I'll write my own replies from here on. |
Related issues and pull requests
Closes #12271
PR Description
Adds three new integrity checks on the
booktitlefield that warn when content belongs in a dedicated field instead:ICML2015is left alone butECCTD 2015is flagged.Locale.getISOCountries()+Locale.of("", code).getDisplayCountry(Locale.ENGLISH)), with the same alphanumeric-boundary guard soUSA2015is not flagged.pp. 1234-1242orpages 1234.All four checkers (the original
BooktitleCheckerplus the three new ones) are registered againstStandardField.BOOKTITLEinFieldCheckers.Steps to test
@inproceedingsentry and set thebooktitlefield to:2015 {IEEE} International Conference on Digital Signal Processing, {DSP} 2015, Singapore→ flagged for year and locationEuropean Conference on Circuit Theory and Design, {ECCTD} 2015, Trondheim, Norway→ flagged for year and locationAdvances in Neural Information Processing Systems, pp. 1234-1242→ flagged for page numbersInternational Conference on Machine Learning→ no warningChecklist
CHANGELOG.mdin a way that can be understood by the average user (if change is visible to the user)