Securely Wiping Data

Securely wiping a hard drive before you pass it on to someone else or dispose of it is a very good idea. But it's not easy. The most reliable way, of course, is to put the hard drive in a crucible, heat it past the melting point of steel, and wait for the entire thing to liquefy. There are some obvious problems with this method: the hard drive can no longer be used, toxic gases are probably released into the atmosphere, and most people don't have the equipment.

The lengths you go to in wiping a file or a drive should be commensurate with the importance of the data to you. These days, medical records, credit cards, and banking details can all easily leak onto your hard drive in the form of your browser cache, so I think you should be pretty paranoid about any drive, any time. I've been amazed at the stuff I've found on drives from garage sales and flea markets: be cautious.

The basic premise of wiping a file or directory on an old-fashioned spinning hard disk is to over-write it multiple times with random junk. Many years ago - when hard drives were measured in MB instead of TB - I would download the latest version of Netscape (woo - remember that?) and use a script to write it to the disk multiple times with different names until the drive was full. This is the basic principle, but it's better to use random data and over-write several times. Why several times? Because on magnetic media, recording a value leaves a physical trace that remains detectable (with a scanning electron microscope - admittedly not a resource owned by most hackers) even after a rewrite.
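The principle can be sketched in a few lines of shell. This is a toy illustration only - use a real tool like wipe for actual work; the function name, pass count, and block handling are my own choices, and this does nothing about bad-sector remapping or SSD wear-levelling:

```shell
# Toy multi-pass overwrite: rewrite a file in place with random data,
# then remove it. Suitable only as a demonstration on small files.
overwrite_file() {
    f="$1"
    passes="${2:-3}"
    size=$(stat -c %s "$f")                  # file size in bytes (GNU stat)
    [ "$size" -gt 0 ] || { rm -f "$f"; return 0; }
    i=1
    while [ "$i" -le "$passes" ]; do
        # conv=notrunc overwrites the existing allocation instead of
        # truncating the file and letting it be reallocated elsewhere
        dd if=/dev/urandom of="$f" bs="$size" count=1 conv=notrunc status=none
        sync                                 # push each pass out to the disk
        i=$((i + 1))
    done
    rm -f "$f"
}
```

The conv=notrunc flag is the important part: a plain redirection (`> file`) truncates the file first, and the filesystem may then write the new data somewhere else entirely, leaving the original blocks untouched.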

Because of SSD wear-levelling, this over-writing process doesn't work well (or at all) on SSDs and is in fact actively bad for them. I haven't had to deal with wiping an SSD yet, although I'm sure it's coming.

If I'm planning on disposing of a spinning hard drive (as opposed to an SSD), my preferred tool is DBAN. Once upon a time that meant Darik's Boot And Nuke, and it was an entirely free product; it appears to now be only semi-free (I haven't tested the newest version). It's a Linux-based boot disk, so it's OS-independent (although it assumes x86 architecture). If you're looking to wipe individual files or folders, the tool of choice under Linux is wipe (apt-get install wipe); DBAN only deals in partitions. wipe can handle partitions too, but my experience suggests it's orders of magnitude slower than DBAN (which itself often takes a couple of days to do a proper job).

Keep in mind that modern hard drives have for many years been smart enough to fix their own bad sectors: they quietly migrate data off questionable sectors, recruiting previously unused spares in their stead. What this means is that someone with the right tools is likely to find random chunks of your data littering the "bad" sectors that the drive has hidden from the operating system (and thus from any wiping program): hence my earlier suggestion of the crucible. SSDs are even harder to wipe reliably. And if you're not feeling paranoid enough, read the wipe man page: it goes off on a paranoid (but not totally implausible, and somewhat amusing) rant about how governments probably have agreements with hard drive companies to hide data for later retrieval, especially if they see a program like wipe coming.
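You can at least see how much of this remapping your own drive has done with the smartmontools package. A sketch - /dev/sdX is a placeholder for your device, attribute names vary between manufacturers, and it needs root:

```shell
# Dump the drive's SMART attributes and pull out the remapping counters:
# Reallocated_Sector_Ct counts sectors the firmware has already retired,
# Current_Pending_Sector counts ones it's planning to retire.
smartctl -A /dev/sdX | grep -Ei 'reallocat|pending'
```

A non-zero reallocated count means there are exactly the kind of hidden, unreachable sectors described above.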

DBAN is pretty self-explanatory after you boot it. I'd encourage you to remove any drive from the machine except the one you want wiped to avoid unpleasant mistakes.

If the drive is old enough, small enough, or broken enough to have no reuse value, I then disassemble it, harvest the magnets (they're neodymium, they're fun), take out the hard drive platters and scratch and bang the crap out of them. Be careful about that though: some hard drive platters are made of metal, but others are made of glass. Not kidding: good way to injure yourself if you weren't expecting it.

wipe, as mentioned, is significantly slower than DBAN per gigabyte. As such, I've taken to overriding its defaults. I commonly use this pattern:

# wipe -r -c -q -Q 3 tmp/

-r indicates recursion; -c means "change permissions and wipe read-only files"; -q requests a "quick" wipe, which by default does four over-writing passes per file; and -Q 3 overrides even that, so only three passes are used. This is, of course, less secure than the default.

Recently with a media drive where I knew the person who was getting the HD and didn't have the time for a full DBAN wipe, I did this:

# wipe -r -c -l 1K -q -Q 1 media/

A recursive quick wipe with only one pass, and even then -l 1K means that only the first 1024 bytes of each file are wiped. Since wipe also mangles and rewrites file names multiple times, this should make the media files unusable for any normal mortal. A determined government agency would still be able to recover 99% of each file, and from that probably wouldn't have too much trouble determining what it was: but again, this is a question of paranoia level, and this kind of wipe was good enough for what I was doing at the time.

Wiping an SSD appears to be manufacturer-specific. This article (and many other articles) has a link to various manufacturers' SSD software, but you should be able to find it yourself by searching for something like "secure erase <manufacturer> SSD" on Google. The good news is that this is immensely faster than on a spinning disk: instead of laboriously over-writing sector by sector, the drive's firmware resets every cell in a single internal operation.
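On Linux there's also a generic route via the ATA security feature set and hdparm, which many of the manufacturer tools wrap anyway. This is a sketch of the usual sequence, not a tested procedure: /dev/sdX is a placeholder, the password "pass" is a throwaway, the drive must not report "frozen" (suspending and resuming the machine sometimes unfreezes it), and getting this wrong can leave a drive locked:

```shell
# 1. Confirm the drive supports the security feature set and isn't frozen
hdparm -I /dev/sdX | grep -A8 '^Security'

# 2. Set a temporary user password (an erase can't be issued without one)
hdparm --user-master u --security-set-pass pass /dev/sdX

# 3. Issue the secure erase; the firmware does the rest internally
hdparm --user-master u --security-erase pass /dev/sdX
```

On an SSD this typically completes in seconds to minutes rather than the hours or days an over-writing wipe takes.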

While researching this I came across an article that pointed out that if you've encrypted your drive with a strong passphrase, you could give the drive away without bothering to wipe it. I think that's naive for a couple of reasons: excessive faith in your own brilliant choice of password, and excessive faith in modern encryption technology. Setting aside the fact that multiple encryption schemes have been proven easily breakable due to poor design, the drive's recipient could put it aside for a couple of years and eventually use his/her new quantum computer to crack it by brute force in an hour or so. So I would still wipe a drive. But the article makes a good point: if the drive fails and you can't erase it, encrypted data will be much harder to recover even if someone can get the drive running again. You're still going to apply the crucible (or at least a hammer) to that non-functioning drive, right?

Update

Other options: I'd still recommend the tools mentioned above, but these are also possibilities.

(2017-04-04) The shred command seems to be available by default on Debian and Ubuntu. It serves a similar function to wipe, but unlike wipe's man page, shred's doesn't go into methodology much: you need to look at the online documentation. One thing it does that wipe doesn't is deal properly with devices (like /dev/sda5). All it does is overwrite the file (or device) - using patterned or pseudo-random data; that part's not clear. The man page recommends working on devices rather than files (because some filesystems deliberately relocate writes to other areas of the drive rather than the one you thought you were overwriting). The default is three passes, but it says "on modern disks, a single pass should be adequate" (and by that I suspect they mean disks after about 2000 - its methodology mentions a paper from 1996).

# shred -n1 --verbose /dev/sdb5

'--verbose' is annoying but very helpful, telling you every minute or so how much progress it's made. '-n1' means one pass.

I'm beginning to think this is the simplest and best cleaning method for spinning disks.


For a simple single-pass cleaning with a command that's always available:

# dd if=/dev/zero of=/dev/sdb5

Or:

# dd if=/dev/urandom of=/dev/sdb5

You may know that /dev/random supplies much better random data than /dev/urandom: this is true, but it supplies it several orders of magnitude more slowly, because it has to wait on various entropy sources to supply good random data. In practical terms, /dev/random supplies a few characters per second - try to imagine how long that would take to fill a hard drive. I discovered this problem when I tried to fill even a small file with genuinely random data. It's a source that should only be used for a few digits at a time. An alternative that's faster than either:

# dd if=/dev/zero of=/dev/sdb5

Using /dev/urandom takes about 50% more time (192 minutes for an external 40G HD on a USB2 connection) than using /dev/zero.
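You can get a feel for the difference without touching a disk by timing each source against /dev/null (the 256 MB size here is an arbitrary choice; only the source's throughput is being measured):

```shell
# Read 256 MB from each source and throw it away; on most machines
# /dev/zero will come back several times faster than /dev/urandom,
# since urandom's output is generated by the CPU rather than just copied.
time dd if=/dev/zero    of=/dev/null bs=1M count=256 status=none
time dd if=/dev/urandom of=/dev/null bs=1M count=256 status=none
```

Don't bother benchmarking /dev/random the same way: as noted above, it may never finish.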