Newly Open-Sourced Tool from Comcast Helps Developers Protect Sensitive Info

By Vaibhav Garg

GitHub is a vital resource for developers, but with so much information being uploaded every day, it also presents a unique cybersecurity challenge. That’s why we’re so excited to announce that xGitGuard™ – a tool built by Comcast technologists to keep authentication secrets out of open-source repositories – is now available as open-source software.

GitHub is the largest open-source community with millions of repositories and users. Our own Comcast development teams work on GitHub every day – whether it is publishing code, collaborating on software development, or using code from GitHub for new projects and solutions.  

xGitGuard™ was developed by Dr. Bahman Rashidi, Director of Comcast Cable’s Cybersecurity & Privacy Engineering Research team, to address the global issue of potential authentication secrets being inadvertently uploaded to GitHub. xGitGuard™ can be used to scan GitHub at scale and identify proprietary authentication secrets, specifically passwords, API keys, and tokens.

Within Comcast, xGitGuard™ is used by a variety of teams. Software development teams use xGitGuard™ to identify the presence of credentials in their own repositories. Our Product Security Incident Response Team (PSIRT) has successfully used it to detect and address potential issues with Comcast GitHub accounts. Since 2020, when xGitGuard™ was first deployed, the tool has empowered our teams to take advantage of the full technical benefit of GitHub while also maintaining the peace of mind that their work isn’t inadvertently opening the door to external threats.

We’re now excited to share this technology with the global open-source community, and to see how they continue to evolve it to make open-source development even more secure.

xGitGuard™ uses advanced natural language processing to detect authentication secrets. It has two separate models: one for detecting credentials and one for detecting API tokens and keys. The tool follows a six-step process: search GitHub at scale, filter results, detect and extract secrets, identify the developer, validate secrets, and submit for remediation.

Search: xGitGuard™ takes a two-tier approach to searching GitHub. Primary keywords (PKEY) find documents that are related to the organization, and secondary keywords (SKEY) then target the documents that potentially contain secrets. With this approach, the entirety of GitHub is searched, but only documents that are both relevant and sensitive are returned.
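To make the two-tier keyword idea concrete, here is a minimal sketch that pairs every primary keyword with every secondary keyword to form search strings. The keyword values, the function name build_queries, and the quoted query format are illustrative assumptions, not xGitGuard's actual configuration or query syntax.

```python
from itertools import product


def build_queries(primary_keywords, secondary_keywords):
    """Pair each primary keyword with each secondary keyword.

    The quoted "primary" "secondary" format is an assumption made for
    illustration; the real tool may build its queries differently.
    """
    return [f'"{p}" "{s}"' for p, s in product(primary_keywords, secondary_keywords)]


# Hypothetical keywords: organization terms first, secret-related terms second.
primary = ["example.com", "example-corp"]
secondary = ["password", "api_key"]

queries = build_queries(primary, secondary)
for query in queries:
    print(query)
```

Two primary and two secondary keywords yield four queries, so the number of searches grows with the product of the two lists.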

Filter Results: The query engine within xGitGuard™ runs multiple queries at the same time to expedite the search process and more rapidly cover the scale of GitHub. xGitGuard™ also maintains a hash list of documents that have been previously processed and automatically skips them, ensuring each file is processed only once.
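The combination of parallel queries and a hash list can be sketched roughly as follows. This is an assumption-laden illustration: the choice of SHA-256, the thread pool, and the names run_queries, filter_new, and fake_search are hypothetical stand-ins, and the search function is a stub rather than a real GitHub call.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def content_hash(text):
    """Hash a document's text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_queries(queries, search_fn, max_workers=4):
    """Run several queries concurrently and flatten their results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(search_fn, queries))  # keeps query order
    return [doc for batch in batches for doc in batch]


def filter_new(documents, seen_hashes):
    """Return only documents whose hash is not yet in seen_hashes."""
    fresh = []
    for doc in documents:
        digest = content_hash(doc)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(doc)
    return fresh


def fake_search(query):
    """Stub standing in for a real search call (hypothetical)."""
    return [f"document matching {query}", "document shared by all queries"]


seen = set()
documents = run_queries(["q1", "q2"], fake_search)
first_pass = filter_new(documents, seen)   # duplicates are dropped
second_pass = filter_new(documents, seen)  # everything is already known
```

In practice the set of hashes would need to be persisted between runs for the skip logic to carry over.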

Detect and Extract Secrets: The core of xGitGuard™ is an artificial intelligence model that processes the filtered results for secrets. Using NLP and other text processing algorithms developed in-house, the model breaks documents down into smaller tokens, removes common English words (“the,” “a,” and “an”), and extracts the secrets along with extra metadata about each detection (e.g., the masked secret and the line of code).
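The preprocessing described here (tokenizing, dropping common words, and masking the detected secret) might look something like the sketch below. It does not reproduce the detection model itself; the regular expression, the three-word stop list, and the masking rule that keeps four visible characters are all assumptions made for illustration.

```python
import re

# Assumed stop-word list, taken from the examples named in the text.
STOP_WORDS = {"the", "a", "an"}


def tokenize(line):
    """Split a line into word-like tokens and drop common English words."""
    tokens = re.findall(r"[A-Za-z0-9_\-]+", line)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def mask_secret(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


line = 'the password = "hunter2secret"'
tokens = tokenize(line)
masked = mask_secret(tokens[-1])
print(tokens, masked)
```

Masking the value before it is stored or reported keeps the detection record itself from becoming another copy of the secret.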

Developer Identification: xGitGuard™ then identifies the developer who posted the code on GitHub, using the commit logs of the relevant repositories to produce the corresponding name and email address.
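As a rough sketch of this step, the function below picks an author out of a list of commit records. The record layout mirrors the author fields (name, email, date) that the GitHub REST API returns for commits, but choosing the most recent commit as "the" developer is an assumption for illustration, and the sample records are made up.

```python
def identify_developer(commits):
    """Return (name, email) of the most recent commit's author, or None.

    Assumes every date is an ISO 8601 UTC string in the same format,
    so plain string comparison orders the commits chronologically.
    """
    if not commits:
        return None
    latest = max(commits, key=lambda c: c["commit"]["author"]["date"])
    author = latest["commit"]["author"]
    return author["name"], author["email"]


# Hypothetical commit records for a single file.
sample_commits = [
    {"commit": {"author": {"name": "Alex Doe", "email": "alex@example.com",
                           "date": "2021-03-01T10:00:00Z"}}},
    {"commit": {"author": {"name": "Sam Roe", "email": "sam@example.com",
                           "date": "2021-05-12T08:30:00Z"}}},
]

developer = identify_developer(sample_commits)
print(developer)
```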

Validate Secrets: Each potential secret is then passed to a machine learning model trained on historical detections, which outputs whether the input is an actual secret or not. The model was built using several features, with the focus on the secret itself and the line of code where the secret was found. This model is over 90% accurate in distinguishing secrets from non-secret text.
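The article does not list the model's features, so the following is only a guess at what features drawn from "the secret itself and the line of code" could look like. Length, Shannon entropy, digit presence, and a keyword check on the line are hypothetical choices, and no trained model is included.

```python
import math
from collections import Counter

# Hypothetical hint words that suggest a line assigns a credential.
HINT_KEYWORDS = ("password", "passwd", "secret", "token", "api_key")


def shannon_entropy(text):
    """Bits of entropy per character; random-looking strings score higher."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(secret, line):
    """Build a feature dictionary from a candidate secret and its line."""
    lowered = line.lower()
    return {
        "length": len(secret),
        "entropy": shannon_entropy(secret),
        "has_digit": any(ch.isdigit() for ch in secret),
        "line_has_keyword": any(k in lowered for k in HINT_KEYWORDS),
    }


features = extract_features("ab12", 'api_key = "ab12"')
print(features)
```

Feature dictionaries like this could then be fed to any standard classifier trained on labeled historical detections.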

Submit for Remediation: To close the loop, the validated secrets are then submitted for remediation to various internal teams.

xGitGuard workflow steps:

1. Search GitHub at Scale
2. Filter Results
3. Detect & Extract Secrets
4. Developer Identification
5. Validate Secrets
6. Submit for Remediation

At Comcast, xGitGuard™ has quickly become an invaluable tool for supporting our secure development lifecycle, allowing our development teams to build and iterate at Internet speed while keeping our technology safe. We hope others find it equally helpful, and can’t wait to see how it evolves.

Vaibhav Garg is Executive Director, Cybersecurity Research & Public Policy, Comcast Cable.