GitLab Detector False Positive for webhooks headers #4493

@Maxi91f

Summary

The GitLab detector incorrectly flags GitLab webhook UUIDs (specifically x-gitlab-event-uuid headers) as potential GitLab API tokens.

TruffleHog Version

trufflehog 3.90.8

Minimal Reproducible Example (MRE)

The detector requires:

  1. The keyword "gitlab" (triggers PrefixRegex(["gitlab"]))
  2. A 20-22 character string matching [a-zA-Z0-9\-=_] starting with alphanumeric
  3. Shannon entropy >= 3.6
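Condition 3 can be checked by hand. The following is a minimal Go sketch, assuming the threshold refers to standard Shannon entropy in bits per character computed over character frequencies. The function name shannonEntropy is illustrative, and TruffleHog's own implementation may differ in detail.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per character,
// based on how often each character occurs in s.
// Illustrative helper, not TruffleHog's actual function.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	// The MRE value and the UUID fragment from the webhook header example.
	samples := []string{"a1b2-c3d4-e5f6a7b8c9d0", "4c58-924b-38d8186e200a"}
	for _, s := range samples {
		e := shannonEntropy(s)
		fmt.Printf("%s entropy=%.2f passes=%v\n", s, e, e >= 3.6)
	}
}
```

Under this assumption both sample strings clear the 3.6 threshold, so the entropy check does not filter out hyphenated hex identifiers.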

File: gitlab_mre.json

{
  "gitlab": "a1b2-c3d4-e5f6a7b8c9d0"
}

This represents a minimal case where a 22-character hyphenated hex string near "gitlab" triggers the detector, similar to how GitLab webhook UUIDs are falsely detected.

Command

trufflehog filesystem gitlab_mre.json --no-verification --fail

Output

Found unverified result 🐷🔑❓
Detector Type: Gitlab
Decoder Type: PLAIN
Raw result: a1b2-c3d4-e5f6a7b8c9d0
Rotation_guide: https://howtorotate.com/docs/tutorials/gitlab/
Version: 1
File: gitlab_mre.json
Line: 2

JSON Output (with --json flag)

{
  "SourceMetadata": {
    "Data": {
      "Filesystem": {
        "file": "gitlab_mre.json",
        "line": 2
      }
    }
  },
  "SourceID": 1,
  "SourceType": 15,
  "SourceName": "trufflehog - filesystem",
  "DetectorType": 9,
  "DetectorName": "Gitlab",
  "DetectorDescription": "GitLab is a web-based DevOps lifecycle tool that provides a Git repository manager providing wiki, issue-tracking, and CI/CD pipeline features. GitLab API tokens can be used to access and modify repository data and other resources.",
  "DecoderName": "PLAIN",
  "Verified": false,
  "VerificationFromCache": false,
  "Raw": "a1b2-c3d4-e5f6a7b8c9d0",
  "RawV2": "a1b2-c3d4-e5f6a7b8c9d0https://gitlab.com",
  "Redacted": "",
  "ExtraData": {
    "rotation_guide": "https://howtorotate.com/docs/tutorials/gitlab/",
    "version": "1"
  },
  "StructuredData": null
}

Root Cause Analysis

Current Pattern (from pkg/detectors/gitlab/v1/gitlab.go)

keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"gitlab"}) + `\b([a-zA-Z0-9][a-zA-Z0-9\-=_]{19,21})\b`)

Issue

The detector triggers on any 20-22 character string of letters, digits, hyphens, equals signs, or underscores that starts with a letter or digit and appears shortly after the word "gitlab". This pattern matches:

  1. GitLab webhook UUIDs - Standard UUIDs in headers like x-gitlab-event-uuid
  2. GitLab URLs - Parts of GitLab documentation URLs (as noted in issue #3671, "GitLab v1 detector: incorrect pattern resulting in false positives, 429 rate limits")
  3. Other non-secret identifiers - Any UUID or identifier in GitLab-related contexts

Real-World Example

GitLab webhooks include standard headers like:

{
  ...
  "x-gitlab-event-uuid": "c3e2f2b7-a945-4c58-924b-38d8186e200a",
  ...
}

A 20-22 character fragment of the UUID that starts and ends on a word boundary (here the last three groups, 4c58-924b-38d8186e200a) is detected as a potential token because:

  • It follows the keyword "gitlab" (in the header name)
  • It matches the capture group [a-zA-Z0-9][a-zA-Z0-9\-=_]{19,21}
  • It has sufficient entropy (>= 3.6)

However, this is just a standard UUID event identifier, not an API token.
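The match can be reproduced outside TruffleHog with a short Go sketch. The helper prefixRegex below is a local stand-in for detectors.PrefixRegex and is an assumption (a case-insensitive keyword followed lazily by up to 40 arbitrary characters); only the capture group is copied from the pattern quoted above. The function name candidate is also illustrative.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// prefixRegex is an assumed approximation of detectors.PrefixRegex:
// a case-insensitive keyword, then lazily up to 40 arbitrary characters.
func prefixRegex(keywords []string) string {
	return `(?i:` + strings.Join(keywords, "|") + `)(?:.|[\n\r]){0,40}?`
}

// keyPat combines the assumed prefix with the capture group from the issue.
var keyPat = regexp.MustCompile(
	prefixRegex([]string{"gitlab"}) + `\b([a-zA-Z0-9][a-zA-Z0-9\-=_]{19,21})\b`)

// candidate returns the first captured token candidate in data, or "" if
// the pattern does not match.
func candidate(data string) string {
	m := keyPat.FindStringSubmatch(data)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	header := `"x-gitlab-event-uuid": "c3e2f2b7-a945-4c58-924b-38d8186e200a"`
	// Prints the UUID fragment that the pattern treats as a token candidate.
	fmt.Println(candidate(header))
}
```

With this stand-in prefix, the earlier groups of the UUID are skipped because no 20-22 character run starting there ends on a word boundary; the last three groups are exactly 22 characters and end at the closing quote, so they are captured.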

Related Issues

  • #3671 - GitLab v1 detector: incorrect pattern resulting in false positives, 429 rate limits
    • Documents similar issues with URL fragments being detected

I don't know whether this can be fixed, but for now it has forced us to exclude the GitLab detector. We store many test scenarios that contain the word "gitlab" with an ID near it, and even in production code we had to rename variables from gitlab to something else to avoid the detection.
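One possible direction, offered only as a sketch and not as TruffleHog's code or a tested proposal, would be a post-match filter that discards candidates lying inside a canonical UUID. The names uuidPat and insideUUID are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidPat matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
var uuidPat = regexp.MustCompile(
	`(?i)\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b`)

// insideUUID reports whether candidate is a substring of any UUID found
// in data. Hypothetical helper for illustration only.
func insideUUID(data, candidate string) bool {
	for _, u := range uuidPat.FindAllString(data, -1) {
		if strings.Contains(u, candidate) {
			return true
		}
	}
	return false
}

func main() {
	header := `"x-gitlab-event-uuid": "c3e2f2b7-a945-4c58-924b-38d8186e200a"`
	fmt.Println(insideUUID(header, "4c58-924b-38d8186e200a"))

	mre := `{"gitlab": "a1b2-c3d4-e5f6a7b8c9d0"}`
	fmt.Println(insideUUID(mre, "a1b2-c3d4-e5f6a7b8c9d0"))
}
```

Such a filter would cover the webhook header case but not the MRE above, whose value is not a full UUID, so it addresses only part of the false positives described here.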
