ZPi | Metric System

Much has been made of Google's new email service, Gmail, which promises a gigabyte of free storage. Although true paranoids have already rejected the service for important reasons, many are excited at the idea of getting all that free storage space.

But will you really be getting as much as you think?

According to the Gmail FAQ, that 1 gigabyte is actually 1,000 megabytes (and presumably by megabyte they mean 1,000,000 bytes -- otherwise, this way madness lies). Consider: if the current mailbox on your computer is reported by your OS as having an even 100 megabytes in it, you might naively think you could store ten times that on a Gmail account. Unfortunately, you would be wrong by 48,576,000 bytes (about 46 megabytes by your OS's reckoning -- quite a lot of email).
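For the skeptical, the arithmetic above can be checked with a minimal sketch. The constant names are my own, and it assumes (as the paragraph does) that your OS counts a megabyte as 2^20 bytes while Gmail counts it as 1,000,000 bytes.

```python
# Assumption: the OS reports sizes in binary megabytes (2**20 bytes each).
MEGABYTE_OS = 2**20
# Assumption: Gmail's gigabyte is 1,000 megabytes of 1,000,000 bytes each.
GIGABYTE_GMAIL = 1000 * 10**6

mailbox = 100 * MEGABYTE_OS       # "an even 100 megabytes" = 104,857,600 bytes
hoped_for = 10 * mailbox          # ten times that = 1,048,576,000 bytes
shortfall = hoped_for - GIGABYTE_GMAIL

print(shortfall)                          # 48576000
print(round(shortfall / MEGABYTE_OS, 1))  # 46.3 (megabytes, by the OS's reckoning)
```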

This is the sort of confusion and sneaky business practice that results when the kibioctet standard is not in wide use, as it should be.

To address this issue, as well as others, ZPi is proud to announce ZPiMail. Unlike Gmail, ZPiMail offers infinite gibioctets of storage space by leveraging the transcendental irrationality of nature itself:

Every email you have stored can be expressed as a mere string of digits (in fact, it's already stored as such on your computer). Since the number π has an infinite number of essentially random digits, the string of digits that represents one of your emails can be found within it, as can the digits representing your entire mailbox, no matter how large it may be. Instead of storing all those gibioctets of digits on your computer, why not just store the offset in the expansion of π that matches them? With ZPiMail, now you can!
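As a rough illustration of the lookup, here is a minimal sketch, not official ZPiMail software. The names `pi_digits` and `zpi_offset` and the `search_depth` parameter are hypothetical. It computes digits of π with Machin's formula in integer arithmetic and searches only a finite prefix, and it takes on faith that every digit string eventually turns up, which has not actually been proven for π.

```python
def pi_digits(n):
    """First n decimal digits of pi after the "3.", as a string (Machin's formula)."""
    scale = 10 ** (n + 10)  # 10 guard digits absorb truncation error

    def arctan_inv(x):
        # scale * arctan(1/x) = scale * (1/x - 1/(3x^3) + 1/(5x^5) - ...)
        power = scale // x
        total = power
        k, sign = 1, 1
        while power:
            power //= x * x
            k += 2
            sign = -sign
            total += sign * (power // k)
        return total

    # Machin: pi = 4 * (4*arctan(1/5) - arctan(1/239))
    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi_scaled)[1:n + 1]


def zpi_offset(message_digits, search_depth=10_000):
    """Hypothetical helper: 0-based offset of message_digits among the first
    search_depth decimals of pi, or -1 if it does not appear that early."""
    return pi_digits(search_depth).find(message_digits)


print(pi_digits(10))         # 1415926535
print(zpi_offset("999999"))  # 761
```

Note that nothing guarantees the offset is any shorter than the email it points to; for a typical string of digits it can be expected to have about as many digits as the message itself.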

(NOTE: ZPi does not currently offer software to facilitate reading your email from π; however, you can rest assured that everything in your mailbox is already safely stowed away in there, as well as any future email you may receive and hypothetical emails to you from Jimmy Carter explaining all the mysteries of the universe in Farsi. I apologize for this oversight, but I have been forced to prematurely announce ZPiMail in order to head off my archnemesis, Dr. Ernesto, who is attempting to steal focus with his derivative EeMail.)

In 1884, the Lord Kelvin had the following to say about inadequate measurement systems:

"You, in this country, are subjected to the British insularity in weights and measures; you use the foot, inch and yard. I am obliged to use that system, but must apologize to you for doing so, because it is so inconvenient, and I hope Americans will do everything in their power to introduce the French metrical system. ... I look upon our English system as a wickedly, brain-destroying system of bondage under which we suffer. The reason why we continue to use it, is the imaginary difficulty of making a change, and nothing else; but I do not think in America that any such difficulty should stand in the way of adopting so splendidly useful a reform." [Source]

120 years later, America (and, sadly, much of Cascadia) still hasn't heeded His words. To the contrary, we have shackled ourselves with an additional modern form of measuremental bondage that is even more brain-destroying than anything the most wicked Brit could have devised -- one that even perverts the system that Kelvin advocated. I am speaking of the units we use to measure data on computers: the bytes, kilobytes, megabytes, gigabytes, terabytes, and, yes, even petabytes that we have all become so familiar with, and yet can be so confused by.

Let's start with the basic units. The bit, in case you didn't know, is the smallest unit of information in a binary system. There are no fractional bits, and the term is unambiguous. This is an acceptable unit. The byte is a little more convoluted. In present-day usage, 1 byte = 8 bits. However, the term originally referred to the number of bits needed to encode a character. Consequently, there were computer systems where bytes were different numbers of bits. This system-specific functional term only later became a general unit of information when the 8-bit character size became a standard, resulting in one term having two incompatible meanings (albeit one now considered obsolete) in the same field.

But the real confusion comes when bit and byte are used together. The abbreviation for byte is uppercase B, whereas the abbreviation for bit is lowercase b. In theory this seems simple and even elegant, but in practice people often use B/b indiscriminately, usually out of ignorance of the difference (not to mention problems caused by caps lock scofflaws and e. e. cummings wannabes). Oddly, the original "bite" was given a "y" so that it wouldn't be misspelled "bit," but this rather obvious abbreviation problem was overlooked.

The next level of trouble comes from the prefixes used with these two terms. How many bytes are in a kilobyte? The answer depends on whom you ask. Computer science people would say 1 kilobyte = 1,024 bytes (1,024 is a round number in binary notation: 10000000000). But this ignores the accepted meanings of the Metric prefixes (kilo = 1,000, mega = 1,000,000, etc.), which means proponents of the Metric System correctly reject this usage as improper.

Now, if it were just people in an unrelated field being persnickety, then maybe this wouldn't really be a practical problem for computer users. However, the proper Metric meanings of the prefixes are used by some in the computing industry, although often out of ulterior motives. For instance, if a hard drive manufacturer says a drive has 10 gigabytes, then it actually has 10,000,000,000 bytes, not 10,737,418,240 bytes as your operating system would measure 10 gigabytes -- you may be getting less than you think you're getting. The result of this mix of proper and improper use of Metric prefixes is ambiguity and the potential for errors (or dishonest pricing).
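To see how much of that advertised drive your operating system will admit to, here is a small sketch. The function name `os_reported_gigabytes` is made up for illustration, and it assumes the manufacturer counts 10^9 bytes per gigabyte while the OS counts 2^30.

```python
def os_reported_gigabytes(advertised_gb):
    """Convert a label in decimal gigabytes (10**9 bytes each) into the
    binary "gigabytes" (2**30 bytes each) an OS is assumed to report."""
    return advertised_gb * 10**9 / 2**30


print(bin(1024))                            # 0b10000000000 (round, in binary)
print(10 * 2**30)                           # 10737418240 (the OS's 10 gigabytes)
print(round(os_reported_gigabytes(10), 2))  # 9.31 (what the 10 GB drive shows up as)
```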

So how should we solve these problems? For starters, we need to replace the term byte with one that has a more obvious abbreviated distinction from bit. Just as Kelvin urged Americans to follow the lead of the French, I am urging everyone to use the French term for 8 bits: octet. This term -- born out of anglophobia -- is both unambiguous and descriptive. Octet literally means a group of eight. In the context of informational measurement, it means 8 bits. Octet is abbreviated o, so there's no confusing it with bits. Plus, the French have already been using it for years, with no problems.

Next, we need to consistently stop misusing the Metric prefixes. A kilo is defined as 1,000 and it should never be used for something else. Instead, we should widely adopt the binary prefixes that were approved as a standard by the International Electrotechnical Commission (IEC) in 1998. These prefixes are as follows:

kibi- (Ki) = 2¹⁰ (1024)
mebi- (Mi) = 2²⁰ (1048576)
gibi- (Gi) = 2³⁰ (1073741824)
tebi- (Ti) = 2⁴⁰ (1099511627776)
pebi- (Pi) = 2⁵⁰ (1125899906842624)

(For more, see the NIST reference page on prefixes for binary multiples.)

Thus, the old, tired, and confused kilobyte (8,192 bits) will be reborn as the kibioctet (Kio). So, computer users, say hello to your new friend kibioctet, as well as his buddies mebioctet (Mio), gibioctet (Gio), tebioctet (Tio), and, yes, even pebioctet (Pio).
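For anyone eager to start reporting sizes properly, a formatter might look like the following sketch. The function name `format_octets` and the two-decimal output are my own choices; the unit spellings (Kio, Mio, and so on) follow this article, whereas the IEC's own prefix symbols are Ki, Mi, Gi, Ti, and Pi.

```python
PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi"]


def format_octets(n):
    """Format a count of octets with binary prefixes, e.g. 1536 -> '1.50 Kio'."""
    value = float(n)
    for prefix in PREFIXES:
        # Stop at the first prefix that keeps the value under 1024,
        # or at pebi-, the largest prefix listed here.
        if value < 1024 or prefix == PREFIXES[-1]:
            return f"{value:.2f} {prefix}o"
        value /= 1024


print(format_octets(1024))   # 1.00 Kio
print(format_octets(10**9))  # 953.67 Mio
print(format_octets(2**30))  # 1.00 Gio
```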

UPDATE (2004-06-21): Here are some trendy badges so you can educate your visitors about these important new data measurement terms and show your site's support of sensible standards:

UPDATE (2004-07-06): This article, with additional information, can now be found at ZPi Labs: Kibioctets. Any further updates will go on that page. If you want to link to this information, link there.

Metric Time & Muffins

Lyle Zapato | 2012-08-20.7160 LMT | Food | Technology

Gmail, Kibioctets, And Introducing ZPiMail

Lyle Zapato | 2004-07-17.9400 LMT | Site | Technology

Kibioctet: Your New Friend!

Lyle Zapato | 2004-06-20.8930 LMT | Technology