Newly Open-Sourced Tool from Comcast Helps Developers Protect Sensitive Info

GitHub is a vital resource for developers, but with so much information being uploaded every day, it also presents a unique cybersecurity challenge. That’s why we’re so excited to announce that xGitGuard™ – a tool built by Comcast technologists to keep authentication secrets out of open-source repositories – is now available as open-source software.

GitHub is the largest open-source community with millions of repositories and users. Our own Comcast development teams work on GitHub every day – whether it is publishing code, collaborating on software development, or using code from GitHub for new projects and solutions.

xGitGuard™ was developed by Dr. Bahman Rashidi, Director of Comcast Cable’s Cybersecurity & Privacy Engineering Research team, to address the global issue of potential authentication secrets being inadvertently uploaded to GitHub. xGitGuard™ can be used to scan GitHub at scale and identify proprietary authentication secrets, specifically passwords, API keys, and tokens.

Within Comcast, xGitGuard™ is used by a variety of teams. Software development teams use xGitGuard™ to identify the presence of credentials in their own repositories. Our Product Security Incident Response Team (PSIRT) has successfully used it to detect and address potential issues with Comcast GitHub accounts. Since xGitGuard™ was first deployed in 2020, the tool has empowered our teams to take full advantage of the technical benefits of GitHub while maintaining the peace of mind that their work isn’t inadvertently opening the door to external threats.

We’re now excited to share this technology with the global open-source community, and to see how they continue to evolve it to make open-source development even more secure.

xGitGuard™ uses advanced natural language processing to detect authentication secrets. It has two separate models: one for detecting credentials and one for detecting API tokens and keys. The tool follows a six-step process: search GitHub at scale, filter the results, detect and extract secrets, identify the developer, validate the secrets, and then submit them for remediation.

Search: xGitGuard™ takes a two-tier approach to searching GitHub. Primary keywords (PKEY) find documents that are related to the organization, and secondary keywords (SKEY) then target the documents that potentially contain secrets. With this approach, the entirety of GitHub is searched, but only documents that are both relevant and sensitive are returned.
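As a rough illustration of the primary/secondary keyword idea, the sketch below pairs every primary keyword with every secondary keyword to form search strings. The function name `build_queries`, the example keywords, and the quoted query format are assumptions made for illustration and are not taken from xGitGuard itself.

```python
from itertools import product


def build_queries(primary_keywords, secondary_keywords):
    """Pair each primary keyword with each secondary keyword.

    The '"primary" secondary' query shape is a hypothetical format
    chosen for this sketch, not xGitGuard's actual query syntax.
    """
    return [f'"{p}" {s}' for p, s in product(primary_keywords, secondary_keywords)]


# Hypothetical keywords: organization identifiers first, secret hints second.
queries = build_queries(["example.com", "examplecorp"], ["password", "api_key"])
for q in queries:
    print(q)
```

Each resulting string narrows the search to documents that mention the organization and also contain a term that hints at a secret.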

Filter Results: The query engine within xGitGuard™ runs multiple queries at the same time to expedite the search and cover the scale of GitHub more quickly. xGitGuard™ also maintains a hash list of previously processed documents and automatically skips them, ensuring each file is processed only once.
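The combination of parallel queries and a hash list could look something like the following minimal sketch. It assumes documents are identified by a SHA-256 hash of their content, and it uses a stand-in `fake_search` function with made-up data instead of a real GitHub call. All names here are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Made-up search results for the sketch: query -> list of (url, content).
FAKE_CORPUS = {
    "q1": [("repo-a/config.py", "password = hunter2"), ("repo-b/app.py", "print('hi')")],
    "q2": [("repo-a/config.py", "password = hunter2")],
}


def fake_search(query):
    """Stand-in for a real search call (assumption for illustration)."""
    return FAKE_CORPUS.get(query, [])


def content_hash(text):
    """Hash the document content so repeats can be recognized."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_queries(queries, search_fn, seen_hashes, max_workers=4):
    """Run queries concurrently and return only documents not seen before."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        result_lists = list(pool.map(search_fn, queries))
    fresh = []
    for results in result_lists:
        for url, content in results:
            digest = content_hash(content)
            if digest in seen_hashes:
                continue  # already processed in this or an earlier run
            seen_hashes.add(digest)
            fresh.append((url, content))
    return fresh


seen = set()
first_run = run_queries(["q1", "q2"], fake_search, seen)
second_run = run_queries(["q1", "q2"], fake_search, seen)
```

In this sketch the second run returns nothing because every document hash is already in `seen`. A real deployment would need to persist the hash list between runs.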

Detect and Extract Secrets: The core of xGitGuard™ is an artificial intelligence model that processes the filtered results for secrets. Using natural language processing (NLP) and other text processing algorithms developed in-house, the model breaks documents down into smaller tokens, removes common English words (“the,” “a,” and “an”), and extracts the secrets along with extra metadata about each detection (e.g., the masked secret and the line of code).
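The tokenize, filter, and extract flow can be sketched in a few lines. This is a simplified, rule-based stand-in and not the AI model described above: it assumes a candidate secret is the token that follows a secondary keyword, and the masking format (first four characters visible) is likewise an assumption.

```python
import re

STOP_WORDS = {"the", "a", "an"}


def tokenize(line):
    """Split a line into word-like tokens and drop common English words."""
    tokens = re.findall(r"[A-Za-z0-9_-]+", line)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def mask(secret, visible=4):
    """Keep the first few characters and hide the rest (hypothetical format)."""
    if len(secret) <= visible:
        return "#" * len(secret)
    return secret[:visible] + "#" * (len(secret) - visible)


def extract_candidates(document, secondary_keywords):
    """Return candidate detections with metadata about where they were found."""
    findings = []
    for line_no, line in enumerate(document.splitlines(), start=1):
        tokens = tokenize(line)
        for i, token in enumerate(tokens):
            # Assumption: the token right after a keyword is the candidate.
            if token.lower() in secondary_keywords and i + 1 < len(tokens):
                findings.append({
                    "line_no": line_no,
                    "line": line.strip(),
                    "masked_secret": mask(tokens[i + 1]),
                })
    return findings


sample = 'set the password = hunter2secret\nprint("hello")'
findings = extract_candidates(sample, {"password", "api_key"})
```

Here `findings` holds one entry for line 1, with the secret stored only in masked form alongside the line it came from.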

Developer Identification: xGitGuard™ then identifies the developer who posted the code on GitHub, using the commit logs of the relevant repository to produce the developer’s name and email address.
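One way to picture this step is parsing commit log text into author records. The sketch below assumes the log has already been exported with one commit per line as `name|email|hash`, which is the shape `git log --format="%an|%ae|%H"` produces. The sample data and the function name `parse_authors` are made up for illustration, and how xGitGuard actually retrieves commit logs is not described here.

```python
def parse_authors(log_output):
    """Return distinct (name, email) pairs in the order they appear."""
    authors = []
    seen = set()
    for raw in log_output.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        # Split from the right so a "|" inside a name does not break parsing.
        name, email, _commit = raw.rsplit("|", 2)
        key = (name, email)
        if key not in seen:
            seen.add(key)
            authors.append(key)
    return authors


# Made-up log text; git lists the newest commit first.
sample_log = """\
Ada Example|ada@example.com|3f2a9c1
Bob Sample|bob@example.com|9b1d4e7
Ada Example|ada@example.com|0c8e5aa
"""
authors = parse_authors(sample_log)
```

With the sample above, `authors` contains Ada once and Bob once, so each developer would be contacted a single time.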

Validate Secrets: Each potential secret is passed to a machine learning model trained on historical detections, which outputs whether the input is an actual secret or not. The model was built using several features, with the focus on the secret itself and the line of code where the secret was found. This model is over 90% accurate in distinguishing secrets from non-secret text.
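To make "features of the secret and its line of code" more concrete, here is a small feature-extraction sketch. The specific features (length, character entropy, digit and uppercase flags, whether the line looks like an assignment) are assumptions chosen for illustration. The features and classifier used by xGitGuard are not specified in this post.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Bits of entropy per character; random-looking strings score higher."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_features(secret, line):
    """Build a feature dict from the candidate secret and its source line."""
    return {
        "length": len(secret),
        "entropy": shannon_entropy(secret),
        "has_digit": any(ch.isdigit() for ch in secret),
        "has_upper": any(ch.isupper() for ch in secret),
        "line_has_assignment": "=" in line or ":" in line,
        "line_length": len(line),
    }


features = extract_features("abcd1234", 'token = "abcd1234"')
```

A feature dictionary like this would be turned into a numeric vector and passed to a trained classifier, which returns the secret or not-secret decision.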

Submit for Remediation: To close the loop, the validated secrets are then submitted for remediation to various internal teams.

Figure: xGitGuard workflow steps.