Unfortunately, security is often more about what we don’t think of than what we do, and it’s important to cover as many bases as possible.
A trend has emerged recently of scraping sites such as GitHub for sensitive information such as passwords, access keys, and databases, and it has become a surprisingly effective way to steal information. You may be wondering why people would upload their passwords to a public site. The answer is simple: they are unaware they’re doing it. To find out how big this problem is, we did a bit of scraping ourselves…
How widespread are Git security issues?
We took a sample of data from the public GitHub stream to get an idea of the scale of this issue. Of the 78.3 thousand commits we checked, 62 matched sensitive file patterns. That may not sound like a lot (a mere 0.07%), but at the volume of activity GitHub sees, the impact is quite large.
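To make "matched sensitive file patterns" concrete, here is a minimal Python sketch of that kind of check. It is an illustration only, not the scanner we ran: the pattern list, the `find_sensitive_paths` helper, and the sample file paths are all assumptions made up for this example, and a real scan would need a far longer pattern list.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns for file names that often hold secrets.
SENSITIVE_PATTERNS = ["id_rsa", "*.pem", ".env", "*.sql", "credentials.json"]


def find_sensitive_paths(paths):
    """Return the paths whose file name matches any sensitive pattern."""
    hits = []
    for path in paths:
        # Compare only the final path component, e.g. ".env" in "config/.env".
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            hits.append(path)
    return hits


# File paths as they might appear in a single commit (made-up data).
commit_files = ["README.md", "config/.env", "deploy/server.pem", "src/app.py"]
flagged = find_sensitive_paths(commit_files)
print(flagged)
```

Matching on file names alone is a deliberately crude choice: it is cheap enough to run over a large stream of commits, but it says nothing about whether the flagged file actually contains a secret.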
According to today’s data alone from the GitHub Archive, there were 459,991 push events. If you apply that 0.07% to this number, it equates to roughly 322. That’s over 300 database logins, server credentials, and SSH private keys becoming public information each day! Note also that we’re talking about commits here, not projects.
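The arithmetic behind that estimate is easy to reproduce. The sketch below uses only the figures quoted above; the variable names are ours, and it assumes, as the estimate does, that each push event can be treated like one sampled commit.

```python
# Figures quoted in the article.
sampled_commits = 78_300
sensitive_commits = 62
daily_push_events = 459_991

# Rate observed in the sample, before any rounding.
sample_rate = sensitive_commits / sampled_commits
# The 0.07% figure used in the text.
quoted_rate = 0.0007

estimate_quoted = round(daily_push_events * quoted_rate)
estimate_unrounded = round(daily_push_events * sample_rate)

print(f"sample rate: {sample_rate:.4%}")
print(f"estimate at 0.07%: {estimate_quoted}")
print(f"estimate at the unrounded sample rate: {estimate_unrounded}")
```

The 0.07% figure is the sample rate rounded down, so the estimate of roughly 322 is on the conservative side; carrying the unrounded rate through gives a somewhat higher number.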
With many developers pushing code several times per day, understanding and using gitignore properly is essential to protecting company secrets: the easiest way for criminals to break in is for you to hand them the keys. Generally speaking, configuration files that contain passwords, keys, and similar information should not be public. Luckily, there is a solution!
The gitignore file was created to keep files from being uploaded without your needing to exclude them explicitly each time. Untracked files that match a pattern in the gitignore are skipped when you stage changes, so they stay out of your commits; files that Git is already tracking are not affected. Not only does this feature leave system-specific files untouched, it also helps ensure that sensitive files are never uploaded. As an example, take a project directory that contains a file named “example.txt”.
If we wanted to exclude the file “example.txt”, we would simply create a file named “.gitignore” containing this single line: example.txt
Easy, right? If we wanted to exclude all text files, we would instead add the line: *.txt
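To see how those two rules behave, here is a minimal Python sketch that applies them to a short list of file names. It uses the standard library's `fnmatch` as a rough stand-in for Git's own matcher, which handles more cases (paths, directories, negation); the `matches_rules` helper and the file names other than example.txt are assumptions for illustration.

```python
from fnmatch import fnmatchcase


def matches_rules(filename, gitignore_text):
    """Return True if filename matches any pattern in the gitignore text."""
    patterns = [
        line.strip()
        for line in gitignore_text.splitlines()
        # Skip blank lines and comment lines, as Git does.
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


files = ["example.txt", "notes.txt", "main.rb"]

single_file_rules = "example.txt\n"  # exclude one specific file
all_text_rules = "*.txt\n"           # exclude every .txt file

only_example = [f for f in files if matches_rules(f, single_file_rules)]
all_text = [f for f in files if matches_rules(f, all_text_rules)]
print(only_example)
print(all_text)
```

The first rule catches only example.txt, while the wildcard rule also catches notes.txt and leaves main.rb alone.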
Each line pertains to a specific file or set of files to exclude. Here’s an example of a full-fledged gitignore file (specifically one for Ruby):
### Ruby ###
*.gem
*.rbc
/.config
/coverage/
/InstalledFiles
/pkg/
/spec/reports/
/spec/examples.txt
/test/tmp/
/test/version_tmp/
/tmp/
.dat*
.repl_history
build/
*.bridgesupport
build-iPhoneOS/
build-iPhoneSimulator/

## Documentation cache and generated files:
/.yardoc/
/_yardoc/
/doc/
/rdoc/

## Environment normalization:
/.bundle/
/vendor/bundle
/lib/bundler/man
There are quite a few other useful gitignore features, such as ignoring entire directories or whitelisting (re-including) specific files. We’ve included a few helpful links below if you’re interested in some of the more advanced gitignore functionality.
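Whitelisting works through negation: a pattern that starts with "!" re-includes files that an earlier pattern excluded, and the last matching pattern decides. The Python sketch below models just that rule for plain file names. It is a simplification under stated assumptions: `is_ignored` is a hypothetical helper, `fnmatch` only approximates Git's matcher, and directory rules are not modeled at all.

```python
from fnmatch import fnmatchcase


def is_ignored(filename, patterns):
    """Apply patterns in order; the last one that matches decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if fnmatchcase(filename, pattern):
            # A plain pattern ignores the file; a "!" pattern re-includes it.
            ignored = not negated
    return ignored


# Ignore every text file except README.txt.
rules = ["*.txt", "!README.txt"]

results = {name: is_ignored(name, rules)
           for name in ["notes.txt", "README.txt", "main.rb"]}
print(results)
```

Order matters here: with the two rules swapped, the later *.txt pattern would match README.txt last and the file would be ignored again.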
Before adding a gitignore file to your project, it’s worth checking to see if one already exists. This is important even if you’re working solo, because many services and libraries come with gitignore files included. Once you’ve verified that you need a fresh one, you can use gitignore.io, a great tool for finding or generating gitignore files. This tool will give you a baseline gitignore, to which you can add important files or from which you can remove rules you’re not using.
Here are a few additional resources if you’re interested in learning more:
Gitignore documentation: https://git-scm.com/docs/gitignore
Power user CLI tools to make gitignoring easy: https://github.com/joeblau/gitignore.io/wiki/Advanced-Command-Line-Improvements
A collection of useful gitignore templates: https://github.com/github/gitignore
If you’re interested in doing some digging yourself into what sorts of things aren’t being gitignored, here’s a link to the code we used to do our own digging, along with a readme that shows how to extend it for your own pattern matching. Happy hunting!
Dylan Katz is an Engineer at GitPrime. He began coding at the age of nine, and hasn’t stopped since. He specializes in software prototyping and research. Follow @Plazmaz1 on Twitter.