Public GitLab repositories exposed more than 17,000 secrets
Public GitLab repos exposed 17,000+ secrets across 2,800 domains, leaking API keys, tokens and credentials, and highlighting critical DevOps security risks.
After scanning all 5.6 million public repositories on GitLab Cloud, a security engineer found more than 17,000 exposed secrets across more than 2,800 unique domains.
Using the open-source TruffleHog tool, Luke Marshall tested the code in the repositories for sensitive credentials, such as API keys, passwords, and tokens.
Previously, the same researcher scanned Bitbucket, where he found 6,212 secrets across 2.6 million repositories. He also checked the Common Crawl dataset, which is used to train AI models, and found 12,000 valid secrets in it.
GitLab is a web-based Git platform used by software developers, maintainers, and DevOps teams for code hosting, CI/CD operations, development collaboration, and repository management.
Marshall used a GitLab public API endpoint to enumerate each public GitLab Cloud repository, using a custom Python script to page through all results and sort them by project ID.
This process returned 5.6 million unique repositories, whose names were then sent to an AWS Simple Queue Service (SQS) queue.
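The enumeration and queueing steps can be sketched roughly as follows. This is an illustration, not Marshall's script: it assumes the GitLab `GET /api/v4/projects` endpoint with the `visibility`, `order_by`, `sort`, `per_page`, and `id_after` parameters, and an SQS client object exposing `send_message_batch` (as a boto3 SQS client does). The function names, the queue URL, and the choice of `path_with_namespace` as the message body are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://gitlab.com/api/v4/projects"


def fetch_page(id_after, per_page=100):
    """Fetch one page of public projects whose IDs are greater than id_after."""
    query = urllib.parse.urlencode({
        "visibility": "public",
        "order_by": "id",   # stable ordering by project ID
        "sort": "asc",
        "per_page": per_page,
        "id_after": id_after,
    })
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def iter_public_projects(fetch=fetch_page, start_after=0):
    """Yield every public project, paging forward by the last project ID seen."""
    last_id = start_after
    while True:
        page = fetch(last_id)
        if not page:
            return  # an empty page means there is nothing left to enumerate
        yield from page
        last_id = page[-1]["id"]


def enqueue_projects(projects, sqs_client, queue_url, batch_size=10):
    """Send repository paths to SQS in batches (SQS accepts up to 10 per call).

    Assumption: the message body is the project's path_with_namespace.
    Partial batch failures are not retried in this sketch.
    """
    batch, sent = [], 0
    for project in projects:
        batch.append({
            "Id": str(project["id"]),
            "MessageBody": project["path_with_namespace"],
        })
        if len(batch) == batch_size:
            sqs_client.send_message_batch(QueueUrl=queue_url, Entries=batch)
            sent += len(batch)
            batch = []
    if batch:  # flush the final, possibly short, batch
        sqs_client.send_message_batch(QueueUrl=queue_url, Entries=batch)
        sent += len(batch)
    return sent
```

With boto3 installed, this could be driven by something like `enqueue_projects(iter_public_projects(), boto3.client("sqs"), queue_url)`, where `queue_url` is a placeholder for a real queue. Paging by the last seen ID rather than by page number keeps the walk stable while new projects are being created.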
Next, an AWS Lambda function pulled each repository name from SQS, ran TruffleHog against it, and logged the results.
"Each Lambda invocation executed a simple TruffleHog scan command with concurrency set to 1000," Marshall explains.
"This setup allowed me to complete the scan of 5,600,000 repositories in just over 24 hours."
The total cost of scanning all public GitLab Cloud repositories with this method was $770.
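A Lambda worker of the kind described might look something like the sketch below. It is not the researcher's code. It assumes the standard SQS-to-Lambda event shape (`Records` with a `body` field), a `trufflehog` binary available in the function's runtime, and the TruffleHog v3 flags `--results=verified` and `--json`, which may differ between versions. It also assumes each finding is emitted as a JSON line containing a `DetectorName` field. The function names and the repository URL construction are hypothetical; the concurrency of 1,000 would be configured on the Lambda side rather than in the handler.

```python
import json
import subprocess

GITLAB_BASE = "https://gitlab.com"


def build_command(repo_path):
    """Build the TruffleHog command line for one repository path."""
    return [
        "trufflehog", "git", f"{GITLAB_BASE}/{repo_path}.git",
        "--results=verified",  # assumption: report only verified (live) secrets
        "--json",              # one JSON object per line
    ]


def parse_findings(stdout):
    """Parse JSON-lines output, keeping only records that look like findings."""
    findings = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON noise
        if isinstance(record, dict) and "DetectorName" in record:
            findings.append(record)
    return findings


def handler(event, context, run=subprocess.run):
    """Scan every repository named in the SQS batch and log a summary."""
    summaries = []
    for record in event.get("Records", []):
        repo_path = record["body"]
        proc = run(build_command(repo_path), capture_output=True,
                   text=True, timeout=840)  # stay under Lambda's 15-minute cap
        findings = parse_findings(proc.stdout)
        summary = {"repo": repo_path, "findings": len(findings)}
        print(json.dumps(summary))  # log counts only, never the raw secrets
        summaries.append(summary)
    return summaries
```

The `run` parameter exists only so the handler can be exercised locally with a stub instead of a real TruffleHog process.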
The researcher found 17,430 verified live secrets, nearly three times as many as on Bitbucket, with a 35% higher secret density (secrets per repository).
Historical data indicates that most of the leaked secrets date from after 2018. However, Marshall also found some much older secrets, dating from 2009, that are still valid today.
The largest group of leaked secrets, more than 5,200, consisted of GCP credentials, followed by MongoDB keys, Telegram bot tokens, and OpenAI keys.
The researcher also found a little more than 400 leaked GitLab keys in the repositories.
In the spirit of responsible disclosure, and because the secrets discovered were associated with 2,804 unique domains, Marshall relied on automation to notify the affected parties. He used Claude Sonnet 3.7 with web search ability and created a Python script to generate emails.
He collected several bug bounties totaling $9,000 in the process.
The researcher reports that in response to his notifications, many organizations revoked their secrets. However, an undisclosed number remains exposed on GitLab.