Skip to content

Conversation

@kpj2006
Copy link
Contributor

@kpj2006 kpj2006 commented Dec 25, 2025

The workflow applies labels in the following strict order:

  1. Issue-based labels (Step 1)

    • Fetches labels from linked issues
    • If no issue is linked, applies no-issue-linked label
    • Supports patterns: Fixes #123, Closes #123, Resolves #123, or #123
  2. File-based labels (Step 2)

    • Automatically detects file types and applies appropriate labels
    • Supported categories:
      • documentation - .md, README, CONTRIBUTING files
      • frontend - .html, .css, .jsx, .tsx files
      • backend - .py, .java, .go, .rb files
      • javascript - .js, .ts files
      • python - .py files
      • configuration - .yml, .json, .toml files
      • github-actions - workflow files
      • dependencies - package.json, requirements.txt
      • tests - test files and directories
      • docker - Dockerfile, docker-compose
      • ci-cd - CI/CD configuration files
  3. Contributor-based labels (Step 3)

    • first-time-contributor - For new contributors
    • external-contributor - For returning external contributors
    • member - For organization members
    • maintainer - For repository maintainers

    To create EXTERNAL_LABELLER_TOKEN in step 3:

    1. Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
    2. Generate new token with repo and read:org scopes
    3. Add as repository secret: Settings → Secrets and variables → Actions → New repository secret
    4. Name: EXTERNAL_LABELLER_TOKEN

    how step 3 operate??
    1.Reads PR number and author username from the GitHub event payload.
    2.Calls GitHub API: is this user a member of the org?
    3.Who are you?
    ├─ org member → member
    ├─ external + new → first-time-contributor(if commitCount ≤ 1)
    ├─ external + existing → external-contributor
    └─ maintainer → maintainer (extra trust)

Summary by CodeRabbit

  • Chores
    • Introduced automated PR labeling workflow that labels pull requests based on linked issues, files changed, and contributor type (member, first-time contributor, external contributor, or maintainer).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 25, 2025

📝 Walkthrough

Walkthrough

A new GitHub Actions workflow named "Sync PR Labels" has been introduced at .github/workflows/sync-pr-labels.yml. This workflow automatically synchronizes labels to pull requests based on three criteria: linked GitHub issues, modified files, and contributor type classification.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
\.github/workflows/sync-pr-labels\.yml
New workflow that labels PRs on open/reopen/synchronize/edit events. Implements issue-based labeling (extracts linked issue number via regex and copies labels), file-based labeling (maps changed files to predefined label categories), and contributor-based labeling (identifies member/first-time/external/maintainer status). Includes comprehensive API error handling and logging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hop through workflows, labels we sow,
PR metadata dances with labels aglow,
From issues to files, contributors too,
Syncing and sorting, automation anew!
The AOSSIE org now labels with flair,
Thanks to this workflow, organized with care.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add sync-pr-labels' is directly related to the main change—introducing a new GitHub Actions workflow for automatically labeling pull requests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
.github/workflows/sync-pr-labels.yml (2)

19-22: Remove unnecessary fetch-depth: 0.

The workflow doesn't analyze git history, so fetching the entire repository history is unnecessary. This slows down workflow execution without providing any benefit.

🔎 Proposed fix
       - name: Checkout repository
         uses: actions/checkout@v4
-        with:
-          fetch-depth: 0

66-190: Consider adding label existence checks.

The workflow assumes all labels exist in the repository. If a label doesn't exist, the addLabels API call will fail. Consider either:

  1. Creating labels automatically if they don't exist, or
  2. Filtering to only apply existing labels

This prevents workflow failures when labels are not pre-configured.

🔎 Example: Filter to existing labels

Add a helper step that fetches all repository labels, then filter the labels to apply:

// Get all available labels in the repository
const { data: repoLabels } = await github.rest.issues.listLabelsForRepo({
  owner: context.repo.owner,
  repo: context.repo.repo
});

const availableLabels = repoLabels.map(l => l.name);

// Filter labels to only those that exist
const labelsToApply = fileLabels.filter(label => availableLabels.includes(label));

if (labelsToApply.length > 0) {
  await github.rest.issues.addLabels({
    owner: context.repo.owner,
    repo: context.repo.repo,
    issue_number: prNumber,
    labels: labelsToApply
  });
}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1df0dcd and 07f6015.

📒 Files selected for processing (1)
  • .github/workflows/sync-pr-labels.yml
🔇 Additional comments (4)
.github/workflows/sync-pr-labels.yml (4)

66-100: LGTM!

The issue-based labeling logic correctly fetches labels from the linked issue and handles both string and object label formats. Error handling is appropriate.


102-118: LGTM!

The fallback labeling for PRs without linked issues is correctly implemented.


258-279: LGTM!

The summary step provides useful observability by logging all applied labels.


193-227: Configure EXTERNAL_LABELLER_TOKEN with org read permissions or organization membership checks will silently fail.

The orgs.checkMembershipForUser API call (line 215) requires OAuth scope read:org or equivalent org-level permissions, which the default GITHUB_TOKEN cannot have. Without EXTERNAL_LABELLER_TOKEN properly configured as a repository secret with the required scopes, this call will fail and the catch block (line 220) will treat all users as non-members, incorrectly labeling organization members as external contributors or first-time contributors.

Comment on lines +3 to +22
on:
pull_request:
types: [opened, reopened, synchronize, edited]
pull_request_target:
types: [opened, reopened, synchronize, edited]

permissions:
contents: read
pull-requests: write
issues: read

jobs:
sync-labels:
if: ${{ github.repository_owner == 'AOSSIE-Org' }}
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Remove pull_request_target to prevent security vulnerability.

Using pull_request_target with repository checkout creates a serious security risk. The pull_request_target trigger runs workflows from the base branch with write permissions, and checking out the code (line 19-22) allows malicious code from forked PRs to execute with elevated privileges.

Since this workflow only needs to read PR metadata and apply labels (no untrusted code execution), use only the pull_request trigger. The pull-requests: write permission is sufficient for labeling operations.

🔎 Recommended fix
 on:
   pull_request:
     types: [opened, reopened, synchronize, edited]
-  pull_request_target:
-    types: [opened, reopened, synchronize, edited]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
on:
pull_request:
types: [opened, reopened, synchronize, edited]
pull_request_target:
types: [opened, reopened, synchronize, edited]
permissions:
contents: read
pull-requests: write
issues: read
jobs:
sync-labels:
if: ${{ github.repository_owner == 'AOSSIE-Org' }}
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
on:
pull_request:
types: [opened, reopened, synchronize, edited]
permissions:
contents: read
pull-requests: write
issues: read
jobs:
sync-labels:
if: ${{ github.repository_owner == 'AOSSIE-Org' }}
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
🤖 Prompt for AI Agents
.github/workflows/sync-pr-labels.yml lines 3-22: the workflow uses the
pull_request_target trigger which combined with actions/checkout is a security
risk; remove the entire pull_request_target block so only pull_request remains,
keep the types array (opened, reopened, synchronize, edited) on pull_request,
and ensure permissions remain minimal (contents: read and pull-requests: write)
for labeling operations; do not check out PR code in this workflow.

Comment on lines +44 to +64
script: |
const prBody = context.payload.pull_request.body || '';

// Match patterns: Fixes #123, Closes #123, Resolves #123, etc.
const issuePatterns = [
/(?:fix|fixes|fixed|close|closes|closed|resolve|resolves|resolved)\s+#(\d+)/gi,
/#(\d+)/g
];

let issueNumber = null;
for (const pattern of issuePatterns) {
const match = prBody.match(pattern);
if (match) {
const numbers = match.map(m => m.match(/\d+/)[0]);
issueNumber = numbers[0];
break;
}
}

core.setOutput('issue_number', issueNumber || '');
return issueNumber;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix overly broad issue number pattern.

The regex pattern /#(\d+)/g on line 50 will match ANY #number in the PR body, causing false positives. For example, it would incorrectly extract issue numbers from version tags, commit references, or arbitrary text containing # followed by digits.

The first pattern already handles standard linking keywords. If you want to support bare #123 references, ensure they appear in a linking context (e.g., at the start of a line or after whitespace).

🔎 Recommended fix
             // Match patterns: Fixes #123, Closes #123, Resolves #123, etc.
             const issuePatterns = [
               /(?:fix|fixes|fixed|close|closes|closed|resolve|resolves|resolved)\s+#(\d+)/gi,
-              /#(\d+)/g
+              /(?:^|\s)#(\d+)(?:\s|$)/gm
             ];

This revised pattern matches #number only when it appears at the start of a line or after whitespace, and is followed by whitespace or end of line.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
script: |
const prBody = context.payload.pull_request.body || '';
// Match patterns: Fixes #123, Closes #123, Resolves #123, etc.
const issuePatterns = [
/(?:fix|fixes|fixed|close|closes|closed|resolve|resolves|resolved)\s+#(\d+)/gi,
/#(\d+)/g
];
let issueNumber = null;
for (const pattern of issuePatterns) {
const match = prBody.match(pattern);
if (match) {
const numbers = match.map(m => m.match(/\d+/)[0]);
issueNumber = numbers[0];
break;
}
}
core.setOutput('issue_number', issueNumber || '');
return issueNumber;
script: |
const prBody = context.payload.pull_request.body || '';
// Match patterns: Fixes #123, Closes #123, Resolves #123, etc.
const issuePatterns = [
/(?:fix|fixes|fixed|close|closes|closed|resolve|resolves|resolved)\s+#(\d+)/gi,
/(?:^|\s)#(\d+)(?:\s|$)/gm
];
let issueNumber = null;
for (const pattern of issuePatterns) {
const match = prBody.match(pattern);
if (match) {
const numbers = match.map(m => m.match(/\d+/)[0]);
issueNumber = numbers[0];
break;
}
}
core.setOutput('issue_number', issueNumber || '');
return issueNumber;
🤖 Prompt for AI Agents
In .github/workflows/sync-pr-labels.yml around lines 44 to 64 the second regex
/#(\d+)/g is too broad and matches any “#number” (versions, refs) causing false
positives; replace it with a stricter pattern that only matches bare issue refs
at start-of-line or after whitespace and followed by whitespace or end-of-line
(e.g. use a regex like (?:^|\s)#(\d+)(?=\s|$)), keep the loop logic the same and
use the capture group to set issueNumber so only true bare #123 references are
picked up.

Comment on lines +166 to +177
// Check each file against label mappings
for (const file of changedFiles) {
for (const [label, patterns] of Object.entries(labelMappings)) {
for (const pattern of patterns) {
if (file.includes(pattern) || file.endsWith(pattern)) {
if (!fileLabels.includes(label)) {
fileLabels.push(label);
}
}
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix file pattern matching to prevent false positives.

The pattern matching logic on line 170 uses file.includes(pattern), which will cause false positives:

  • 'test/' matches 'my-test-utils.js' and 'contest/foo.js'
  • '.test.' matches 'latest.release.js'
  • Directory patterns don't distinguish between path segments

This could apply incorrect labels to PRs.

🔎 Recommended fix
             // Check each file against label mappings
             for (const file of changedFiles) {
               for (const [label, patterns] of Object.entries(labelMappings)) {
                 for (const pattern of patterns) {
-                  if (file.includes(pattern) || file.endsWith(pattern)) {
+                  let matches = false;
+                  if (pattern.endsWith('/')) {
+                    // Directory pattern
+                    matches = file.startsWith(pattern) || file.includes(`/${pattern}`);
+                  } else if (pattern.startsWith('.') && !pattern.includes('/')) {
+                    // Extension pattern
+                    matches = file.endsWith(pattern);
+                  } else {
+                    // Filename or substring pattern
+                    matches = file.includes(pattern);
+                  }
+                  
+                  if (matches) {
                     if (!fileLabels.includes(label)) {
                       fileLabels.push(label);
                     }
                   }
                 }
               }
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Check each file against label mappings
for (const file of changedFiles) {
for (const [label, patterns] of Object.entries(labelMappings)) {
for (const pattern of patterns) {
if (file.includes(pattern) || file.endsWith(pattern)) {
if (!fileLabels.includes(label)) {
fileLabels.push(label);
}
}
}
}
}
// Check each file against label mappings
for (const file of changedFiles) {
for (const [label, patterns] of Object.entries(labelMappings)) {
for (const pattern of patterns) {
let matches = false;
if (pattern.endsWith('/')) {
// Directory pattern
matches = file.startsWith(pattern) || file.includes(`/${pattern}`);
} else if (pattern.startsWith('.') && !pattern.includes('/')) {
// Extension pattern
matches = file.endsWith(pattern);
} else {
// Filename or substring pattern
matches = file.includes(pattern);
}
if (matches) {
if (!fileLabels.includes(label)) {
fileLabels.push(label);
}
}
}
}
}
🤖 Prompt for AI Agents
In .github/workflows/sync-pr-labels.yml around lines 166-177, the current file
matching uses file.includes(pattern) which creates false positives (e.g.,
'test/' matching 'my-test-utils.js'); change the logic to do precise matches:
treat patterns ending with '/' as directory prefixes (use normalized path and
file.split('/') checks or file.startsWith(normalizedDir)), treat patterns that
look like file extensions or exact filenames using file.endsWith(pattern) or
exact equality, and for complex globs add a proper glob matcher
(minimatch/micromatch) and use that instead of includes so only true
path/filename matches produce labels.

Comment on lines +211 to +242
const contributorLabels = [];

// Check if contributor is a member of the organization
try {
await github.rest.orgs.checkMembershipForUser({
org: context.repo.owner,
username: prAuthor
});
contributorLabels.push('member');
} catch (error) {
// Not a member
if (commits.data.length <= 1) {
contributorLabels.push('first-time-contributor');
} else {
contributorLabels.push('external-contributor');
}
}

// Check if PR author is a collaborator
try {
const permissionLevel = await github.rest.repos.getCollaboratorPermissionLevel({
owner: context.repo.owner,
repo: context.repo.repo,
username: prAuthor
});

if (permissionLevel.data.permission === 'admin' || permissionLevel.data.permission === 'maintain') {
contributorLabels.push('maintainer');
}
} catch (error) {
console.log('Could not check collaborator status');
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent overlapping contributor labels.

A user can receive both member and maintainer labels (lines 219 and 238), which is redundant. The logic should apply only one label based on the highest privilege level.

Consider this hierarchy: maintainer > member > external-contributor > first-time-contributor.

🔎 Recommended fix
               const contributorLabels = [];
+              let contributorType = null;
               
               // Check if contributor is a member of the organization
               try {
                 await github.rest.orgs.checkMembershipForUser({
                   org: context.repo.owner,
                   username: prAuthor
                 });
-                contributorLabels.push('member');
+                contributorType = 'member';
               } catch (error) {
                 // Not a member
                 if (commits.data.length <= 1) {
-                  contributorLabels.push('first-time-contributor');
+                  contributorType = 'first-time-contributor';
                 } else {
-                  contributorLabels.push('external-contributor');
+                  contributorType = 'external-contributor';
                 }
               }
               
               // Check if PR author is a collaborator
               try {
                 const permissionLevel = await github.rest.repos.getCollaboratorPermissionLevel({
                   owner: context.repo.owner,
                   repo: context.repo.repo,
                   username: prAuthor
                 });
                 
                 if (permissionLevel.data.permission === 'admin' || permissionLevel.data.permission === 'maintain') {
-                  contributorLabels.push('maintainer');
+                  contributorType = 'maintainer';
                 }
               } catch (error) {
                 console.log('Could not check collaborator status');
               }
               
-              if (contributorLabels.length > 0) {
-                console.log(`Applying contributor-based labels: ${contributorLabels.join(', ')}`);
+              if (contributorType) {
+                console.log(`Applying contributor-based label: ${contributorType}`);
                 
                 await github.rest.issues.addLabels({
                   owner: context.repo.owner,
                   repo: context.repo.repo,
                   issue_number: prNumber,
-                  labels: contributorLabels
+                  labels: [contributorType]
                 });
               }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In .github/workflows/sync-pr-labels.yml around lines 211 to 242, the current
logic can push multiple overlapping contributor labels (e.g., both "member" and
"maintainer"); instead determine a single highest-priority label per the
hierarchy maintainer > member > external-contributor > first-time-contributor.
Change the flow to compute a single role variable (start as null), set it to
"member" only if membership check succeeds, then run the collaborator permission
check and if permission is "admin" or "maintain" override the role to
"maintainer"; if membership check failed, set role to "first-time-contributor"
or "external-contributor" based on commit count, and finally push only that
single role label (and avoid pushing duplicates or multiple contributor labels).

@Zahnentferner Zahnentferner merged commit 9b1a35b into AOSSIE-Org:main Dec 25, 2025
1 of 2 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Dec 25, 2025
@kpj2006
Copy link
Contributor Author

kpj2006 commented Dec 25, 2025

superseded: #11

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants