## Just how big IS a terabyte?

### December 21, 2016

Being an old guy in computing terms, I still remember the occasion when one of my colleagues told me that they’d bought a new PC with a gigabyte hard drive.  That’s 1 GB.  1,000 MB.  It was, I believe, back in 1994.  They’d got themselves a new home machine with a 60 MHz Pentium (built on an 800 nm process) and a 1 GB hard disk.  How they gloated.  How envious was I, stuck with my 500 MB disk and a 486DX2-66 processor?

These days, we have 512 GB SD cards, which are merely the size of a postage stamp, available for only GBP 314.  A terabyte version has recently been announced by SanDisk.

The sheer size of today’s storage devices means that people no longer appreciate just how vast those stores are!

Here comes the science part: An SI terabyte, which is a usual measure of persistent storage, is 1,000,000,000,000 (10¹²) bytes.  Each byte is 8 bits, and can encode a single character in ASCII or EBCDIC schemes – let’s ignore internationalisation and multi-byte Unicode for now as that just complicates things.

Note that this is distinct from the binary tebibyte, which is 1,099,511,627,776 (2⁴⁰) bytes and is more commonly used as a measurement for memory.
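To make the gap between the two units concrete, here is a small sketch that just compares the two definitions given above. The variable names are my own, purely for illustration.

```python
# Decimal (SI) versus binary prefixes at the "tera" scale.
TERABYTE = 10**12   # SI terabyte, the usual unit for disk capacities
TEBIBYTE = 2**40    # binary tebibyte, the usual unit for memory

difference = TEBIBYTE - TERABYTE
ratio = TEBIBYTE / TERABYTE

print(f"terabyte: {TERABYTE:,} bytes")
print(f"tebibyte: {TEBIBYTE:,} bytes")
print(f"a tebibyte is {difference:,} bytes ({ratio - 1:.1%}) larger")
```

The difference is roughly a tenth of a terabyte, which is why a disk can look smaller than advertised when a tool reports its size in binary units.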

So, to put these numbers into some perspective, just how big IS a terabyte?

It’s quite common to talk about it in terms of hours of music or video, but that is really kind of misleading. Sure, you can imagine a stack of 213 Blu-ray movie discs, two and a half years of continual MP3 music or nearly 2 weeks of worldwide tweets, but what does that actually mean in more visceral terms?

Let’s work it out in terms of space and time using, whilst they still exist as physical artefacts, traditional paperback books.  For this example, let’s use Tolstoy’s War and Peace to generate the numbers. We’ll do time first.

War and Peace has a word count of 587,285.  It’ll depend on the particular translation, I guess, but let’s use that as a working number for now.

The average typing speed for a typical person is around 40 words per minute (but obviously more for a professional typist).  That means that it would take someone, maybe you, around 245 hours to type a copy of War and Peace.  We can convert time into money too – the minimum wage is  around GBP 7 per hour, so the opportunity cost of the time taken for that typing would be around GBP 1,715.  That represents about 6 weeks of work effort.
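The sums for a single copy can be sketched as follows. The 40-hour working week is my assumption (the text only says "about 6 weeks"), and the names are illustrative.

```python
# Time and cost to type one copy of War and Peace.
WORDS = 587_285            # word count quoted in the text
TYPING_WPM = 40            # average typing speed, words per minute
WAGE_GBP_PER_HOUR = 7      # approximate minimum wage used in the text
HOURS_PER_WEEK = 40        # assumption: a standard full-time week

hours = WORDS / TYPING_WPM / 60          # minutes of typing, then hours
rounded_hours = round(hours)             # the text works with whole hours
cost_gbp = rounded_hours * WAGE_GBP_PER_HOUR
weeks = rounded_hours / HOURS_PER_WEEK

print(f"typing time: {hours:.1f} hours (call it {rounded_hours})")
print(f"opportunity cost: GBP {cost_gbp:,}")
print(f"working weeks: {weeks:.1f}")
```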

The English language has around 4.7 letters per word on average but don’t forget the spaces and punctuation too – so make it 5.7 characters per word; 5.7 x 587,285 = 3,347,525 bytes.  A Project Gutenberg copy of War and Peace is 3,359,550 bytes so that’s quite a good correlation.  Let’s just round up to 3.4 MB to make the numbers easier.

A terabyte is 1,000,000 MB so it could hold 294,118 copies of War and Peace.  As calculated above for a single copy and multiplied up, that represents around  72,058,910 hours of human effort and so a labour value of about GBP 504,412,370.  (You can see why photocopiers took off…)
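Scaling the single-copy figures up to a full terabyte is just multiplication, sketched here with the rounded working numbers from the text (3.4 MB and 245 hours per copy, GBP 7 per hour). The names are illustrative.

```python
# How many copies fit in a terabyte, and what would typing them cost?
MB_PER_TB = 1_000_000
MB_PER_COPY = 3.4          # rounded working size of one copy
HOURS_PER_COPY = 245       # rounded typing time for one copy
WAGE_GBP_PER_HOUR = 7

copies = round(MB_PER_TB / MB_PER_COPY)
total_hours = copies * HOURS_PER_COPY
total_cost_gbp = total_hours * WAGE_GBP_PER_HOUR

print(f"copies per TB: {copies:,}")
print(f"typing effort: {total_hours:,} hours")
print(f"labour value: GBP {total_cost_gbp:,}")
```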

Those 294,118 copies would hold about 368,529,854 pages (see below), each with maybe 469 words on it on average.  Every adult in the UK could have 7 pages dedicated to them.

A terabyte is worth over 500 million pounds sterling of minimum wage workers’ typing effort and, at 8 hours a day for 260 days a year, would take 34,644 years to complete.

Best get some help to finish that before lunch.

The adult population of the UK is about 52 million so, if they all mucked in, they should get the job done in just under an hour and a half by typing some 3,325 words each, which just so happens to be around 7 pages of content. How convenient.
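Here is a rough sketch of the solo-versus-crowd comparison, using the total typing effort worked out earlier and the 8 hours a day, 260 days a year pattern from the text. The names are illustrative.

```python
# One typist versus the whole adult population of the UK.
TOTAL_HOURS = 72_058_910        # typing effort for a terabyte of text
HOURS_PER_YEAR = 8 * 260        # 8 hours a day, 260 working days
UK_ADULTS = 52_000_000
TYPING_WPM = 40

years_solo = TOTAL_HOURS / HOURS_PER_YEAR
hours_each = TOTAL_HOURS / UK_ADULTS
words_each = hours_each * 60 * TYPING_WPM

print(f"one typist: {years_solo:,.0f} working years")
print(f"everyone mucking in: {hours_each:.2f} hours each")
print(f"words per person: {words_each:,.0f}")
```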

It’s also worth noting that it would take 34.6 years of work for that average typist to fill a single GB, which holds 294 copies of the book.  This is why, when everything was held as plain ASCII text files, even 500 MB felt cavernous to most people!

People read, on average, at about 250 words per minute so it’d take around 5.5 years to just read a GB of text, and about 5,540 years to read a TB. (Assuming a 9-5 job, 5 days a week.)
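The reading estimate works the same way as the typing one. This is only a rough check: it assumes the 587,285 word count, 8 hours a day for 260 days a year, and the copy counts used earlier, and the exact figures shift a little depending on where you round.

```python
# Time to read a gigabyte and a terabyte of War and Peace.
WORDS = 587_285
READING_WPM = 250
COPIES_PER_GB = 294
COPIES_PER_TB = 294_118
HOURS_PER_YEAR = 8 * 260        # 9-5, 5 days a week

hours_per_copy = WORDS / READING_WPM / 60
gb_years = COPIES_PER_GB * hours_per_copy / HOURS_PER_YEAR
tb_years = COPIES_PER_TB * hours_per_copy / HOURS_PER_YEAR

print(f"one copy: {hours_per_copy:.1f} hours")
print(f"one GB: {gb_years:.1f} working years")
print(f"one TB: {tb_years:,.0f} working years")
```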

So that’s time, and by extrapolation, money.  Now for space.

Copies of War and Peace differ in physical size and page count depending on extraneous supporting content and typeface used.  Here are three from Amazon; we’ll take an average of them as our working numbers.

| ISBN-13 | Pages | Dimensions | Volume |
|---|---|---|---|
| 978-1853260629 | 1,024 | 12.2 x 5.6 x 19.8 cm | 1,352.74 cm³ |
| 978-0140447934 | 1,440 | 13.0 x 6.1 x 19.8 cm | 1,570.14 cm³ |
| 978-0099512246 | 1,296 | 13.1 x 5.5 x 21.6 cm | 1,556.28 cm³ |
| Averaged | 1,253 | 12.8 x 5.7 x 20.4 cm | 1,493.19 cm³ |
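For anyone wanting to check the averaged row, here is a sketch of how it can be reproduced. It assumes the averaged volume is the product of the unrounded average dimensions, which is the reading that matches the 1,493.19 figure; the data structure and names are mine.

```python
# Average the three editions: (pages, width cm, depth cm, height cm).
editions = {
    "978-1853260629": (1024, 12.2, 5.6, 19.8),
    "978-0140447934": (1440, 13.0, 6.1, 19.8),
    "978-0099512246": (1296, 13.1, 5.5, 21.6),
}

n = len(editions)
# zip(*values) regroups the tuples column by column before averaging.
avg_pages, avg_w, avg_d, avg_h = (
    sum(column) / n for column in zip(*editions.values())
)
# Volume of the "average book", using the unrounded average dimensions.
avg_volume_cm3 = avg_w * avg_d * avg_h

print(f"pages: {avg_pages:,.0f}")
print(f"dimensions: {avg_w:.1f} x {avg_d:.1f} x {avg_h:.1f} cm")
print(f"volume: {avg_volume_cm3:,.2f} cm3")
```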

We’ll just assume that the extraneous text is not material to the argument and stick with the 3.4 MB data per edition.

So the printed information density of a War and Peace paperback book is around 2,277 bytes per cubic centimetre, or 2.28 KB/cm³.  This equates to about 2.28 GB/m³.  (You can see why electronic storage is so popular…)

There are 1,000 GB in 1 TB, so a TB in such a printed form would take up around 439 cubic metres and neatly contain those 294,118 copies of War and Peace.  That’s a cube 7.6 metres (or about 25 feet) on each side.
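The density and volume figures can be sketched like this, starting from the 3.4 MB per copy and the averaged 1,493.19 cm³ book volume. The names are illustrative, and the code keeps full precision rather than rounding the density to 2.28 first, so the last digits may differ slightly from a hand calculation.

```python
# Information density of the paperback and the volume of a printed terabyte.
BYTES_PER_COPY = 3_400_000        # 3.4 MB working figure
BOOK_VOLUME_CM3 = 1_493.19        # averaged paperback volume
CM3_PER_M3 = 1_000_000
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12
METRES_PER_FOOT = 0.3048

density_bytes_per_cm3 = BYTES_PER_COPY / BOOK_VOLUME_CM3
density_gb_per_m3 = density_bytes_per_cm3 * CM3_PER_M3 / BYTES_PER_GB

tb_volume_m3 = BYTES_PER_TB / (density_bytes_per_cm3 * CM3_PER_M3)
cube_side_m = tb_volume_m3 ** (1 / 3)

print(f"density: {density_bytes_per_cm3:,.0f} bytes/cm3 "
      f"({density_gb_per_m3:.2f} GB/m3)")
print(f"printed terabyte: {tb_volume_m3:,.0f} m3")
print(f"cube side: {cube_side_m:.1f} m "
      f"({cube_side_m / METRES_PER_FOOT:.0f} ft)")
```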

It’s still a little abstract so let’s make it more concrete.

A community swimming pool is generally about 25m long and 13m wide. It obviously varies depending on the number of lanes but this is roughly what you’d expect.  Depth varies too but, for this, imagine a 2m depth throughout – no shallow end or diving area!

That means that a terabyte of books would fill such a community pool to a depth of 1.35 metres or nearly 4.5 feet.  Each pool could hold the equivalent of around 1.5 TB in printed form.

A 6 TB disk, now commonly available for under GBP 200, holds enough data that, if printed in a paperback novel form, would fill 4 standard swimming pools, or a 50m Olympic pool to a depth of 2.1 metres.
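The swimming pool numbers fall out of the 439 cubic metres per terabyte worked out earlier. In this sketch the 25 m width for the Olympic pool is my assumption (the text only gives its 50 m length), and the names are illustrative.

```python
# Filling swimming pools with printed terabytes.
TB_VOLUME_M3 = 439                     # printed volume of one terabyte

POOL_LENGTH_M, POOL_WIDTH_M, POOL_DEPTH_M = 25, 13, 2
OLYMPIC_LENGTH_M, OLYMPIC_WIDTH_M = 50, 25   # width is an assumption

pool_area_m2 = POOL_LENGTH_M * POOL_WIDTH_M
pool_volume_m3 = pool_area_m2 * POOL_DEPTH_M

depth_one_tb_m = TB_VOLUME_M3 / pool_area_m2
tb_per_pool = pool_volume_m3 / TB_VOLUME_M3

disk_volume_m3 = 6 * TB_VOLUME_M3      # a 6 TB disk in paperback form
pools_for_disk = disk_volume_m3 / pool_volume_m3
olympic_depth_m = disk_volume_m3 / (OLYMPIC_LENGTH_M * OLYMPIC_WIDTH_M)

print(f"1 TB fills a community pool to {depth_one_tb_m:.2f} m")
print(f"a full pool holds {tb_per_pool:.1f} TB")
print(f"6 TB fills {pools_for_disk:.1f} pools, "
      f"or an Olympic pool to {olympic_depth_m:.1f} m")
```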

Hopefully that gives you a feel for just how much data a terabyte, or even a mere gigabyte, actually is, when it represents textual information rather than high-definition video or music content.

As an aside (and this is highly speculative and could be utterly wrong in so many, many ways), just how small a volume can a TB be stored in? Well, it’s looking like a holographic universe (if that’s an actual thing) stores one bit in an area one Planck length per side, or 2.56E-70 m².  If we want to store 8E12 bits we’d need 2.048E-57 m², or a sphere at least 2.55E-29 metres in diameter surrounding the information.  That’s the size of the black hole formed by a mass of 8.9 g packed into a singularity, or something like that anyway. Of course, we’d likely also need sufficient information to be held to fully describe the substrate containing the data which would push out the size very substantially.  As the man said, “there’s plenty of room at the bottom”.
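For what it’s worth, the area and diameter in that aside can be reproduced with the sketch below. It assumes a Planck length rounded to 1.6E-35 m (which gives the 2.56E-70 m² quoted) and treats the required area as the surface of a sphere, so area = π x diameter². Both are my reading of the calculation rather than anything rigorous.

```python
import math

# One bit per Planck-length square on the surface of a sphere.
PLANCK_LENGTH_M = 1.6e-35        # assumption: rounded value
BITS_PER_TB = 8e12               # 10**12 bytes at 8 bits each

bit_area_m2 = PLANCK_LENGTH_M ** 2
total_area_m2 = BITS_PER_TB * bit_area_m2

# Sphere surface area = pi * d**2, solved for the diameter d.
diameter_m = math.sqrt(total_area_m2 / math.pi)

print(f"area per bit: {bit_area_m2:.2e} m2")
print(f"area for 1 TB: {total_area_m2:.3e} m2")
print(f"sphere diameter: {diameter_m:.2e} m")
```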