The SolarWinds-related break-in that exposed Microsoft source code should be a wake-up call to all organizations, big or small. The attackers did not exploit source-code vulnerabilities; this was an ‘internal’ intrusion in which the bad guys got inside the organization through other parts of the technology perimeter.
Fixing code vulnerabilities and tightening access mechanisms are one part of cybersecurity; data protection is another. SolarWinds-related vulnerabilities allowed the bad guys to access GitHub and Bitbucket after first breaking into the organization by another route.
Code burglary is very common
The following are a few of the higher-profile codebase breaches of recent years. Most share a common theme: they originated with employees or their credentials. In some cases, sensitive data and credentials within the affected organizations were also exposed, compounding the damage.
- Apple iOS source code leak in 2018, when an intern walked away with code that landed on public GitHub repos.
- Snapchat source code leak by a disgruntled employee.
- A former DJI employee leaked private keys and source code on public GitHub repos.
- Symantec source code was stolen over the course of many years.
- The Uber source code leak was especially alarming: it yielded credentials that were then used to access the data of 7 million Uber drivers and 50 million customers.
- Scotiabank source code leaked due to misconfigured GitHub repositories.
What can be leaked?
The most common types of ‘secrets’ that can be found in source code breaches include:
- SSH keys
- API keys
- Passwords
- Login credentials
- PII/PHI data of employees or customers
- AWS credentials
- Keys for Google, Twitter and Facebook services
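To make the list above concrete, here is a minimal Python sketch of how a few of these secret types can be spotted in text. The three patterns and the `find_secrets` helper are illustrative assumptions, not the rule set of any particular scanner; real tools use far larger rule sets and extra checks to cut false positives.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs conventionally start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes SSH/TLS private keys.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
    # A quoted value assigned to something named like a password.
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    sample = 'aws_key = "AKIAIOSFODNN7EXAMPLE"\npassword = "hunter2secret"\n'
    for lineno, rule in find_secrets(sample):
        print(f"line {lineno}: {rule}")
```

A pattern match is only a hint that a secret may be present; each finding still needs a human or a validation step to confirm it.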
Methods of securing codebase
- GitHub provides token scanning and other searches that can reduce this exposure, but these are nowhere near a foolproof solution.
- Penetration testing programs
- Malware scanning of the code packages you use. This is especially relevant if open-source or npm packages are an ingredient in your development.
- Sanitizing sensitive data in all packages
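As one small illustration of the sanitizing step, the Python sketch below walks a package or repository directory and flags files whose names suggest they hold keys or credentials. The `RISKY_NAMES` and `RISKY_SUFFIXES` lists and the `find_risky_files` helper are hypothetical choices for this example; a real review would also inspect file contents and version history.

```python
from pathlib import Path

# Hypothetical lists of file names and suffixes that often hold secrets.
RISKY_NAMES = {".env", "id_rsa", "id_dsa", "credentials"}
RISKY_SUFFIXES = {".pem", ".key", ".p12", ".pfx"}


def find_risky_files(root):
    """Return sorted relative paths under root whose name or suffix looks sensitive."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if path.name in RISKY_NAMES or path.suffix in RISKY_SUFFIXES:
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current directory and list anything that deserves a closer look.
    for hit in find_risky_files("."):
        print(hit)
```

A check like this is cheap enough to run before publishing a package or pushing a repository, though it only catches secrets stored in conventionally named files.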
Keeping sensitive data out of the codebase
Polymer DLP allows organizations to implement least-privilege access protocols on sensitive data. For codebases, this is done via a detailed scan of all repositories, codebases and user access. Any policy violation triggers redaction and alerting when any of the following is found:
- Secrets, including commonly used password patterns
- AWS, Facebook, Twitter, GCP, Azure and other popular cloud credentials
- Sensitive PII/PHI data elements covered by HIPAA, GDPR or CCPA
- Organization-specific sensitive items
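The general idea of "redact and alert" can be sketched in a few lines of Python. This is not how Polymer DLP is implemented; the `RULES` table and the `redact` function are assumptions made up for illustration, and the patterns cover only a tiny sample of the categories listed above.

```python
import re

# Hypothetical policy rules: one credential pattern and two PII patterns.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact(text):
    """Mask every rule match in text and return (redacted_text, alerts).

    alerts lists one rule name per match, grouped in rule order.
    """
    alerts = []
    for name, pattern in RULES.items():
        def _mask(match, name=name):
            alerts.append(name)  # record the violation for alerting
            return f"[REDACTED:{name}]"
        text = pattern.sub(_mask, text)
    return text, alerts


if __name__ == "__main__":
    clean, alerts = redact("key=AKIAIOSFODNN7EXAMPLE ssn=123-45-6789")
    print(clean)
    print(alerts)
```

Returning the alerts alongside the redacted text keeps the two actions separate: the masked text can be stored or displayed, while the alert list can be routed to whoever owns the policy.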
Codebase security goes beyond vulnerability analysis and access controls. Security and governance protocols need to account for risk reduction in scenarios where some parts of the source code repositories may be exposed. Removing sensitive data, secrets and credentials from GitHub, GitLab and Bitbucket repositories is of paramount importance in making organizations more secure.