
Overbenny's Blog

From computers via origami and faith to the nonsensical


Comments to Ubuntu 10.04 Reads File Sizes Differently

25. March 2010 by overbenny

I stumbled upon the post “Ubuntu 10.04 Reads File Sizes Differently” and I have to correct some statements.

First I want to ask you, my blog reader, to read the units policy. Then think about it and read it again.

Now my criticism of the blog post:

  • We didn’t change the units policy. There was no such policy; we created one.
  • KB does not exist (in the SI or IEC standard). It’s either kB (meaning 1000 bytes) or KiB (meaning 1024 bytes). Did the author read the policy?

Now my clarifications for the commenter:

  • This policy was not Canonical’s decision. You have to blame me for creating the draft of the policy and the Technical Board for approving it.
  • This policy has nothing to do with Apple. I have never used a Mac and I don’t care what kind of byte prefixes Apple uses.
  • This policy is not connected to the decision to change the window buttons position of the default theme. This was done by different people. These two things are absolutely independent.

Correcting all applications to comply with the units policy is a goal for lucid+1 (Ubuntu 10.10). We are too late in the release cycle to make the change in lucid (Ubuntu 10.04). My current plan is to create a library for reading byte sizes from users and displaying them to users. The user can then configure this library to display the units in base-2 (KiB), base-10 (kB), or the historical totally fucked-up format (KB).
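
A minimal sketch of what the output side of such a configurable library could look like. This is only an illustration under stated assumptions: the function name `format_size`, the style constants, and the one-decimal rounding are all hypothetical and do not describe the API of any existing library.

```python
# Hypothetical sketch of a configurable byte-size formatter.
# None of these names come from a real library.

BASE2 = "base2"        # IEC prefixes, factor 1024: KiB, MiB, GiB, TiB
BASE10 = "base10"      # SI prefixes, factor 1000: kB, MB, GB, TB
HISTORIC = "historic"  # factor 1024 but SI-looking labels: KB, MB, GB, TB

_STYLES = {
    BASE2: (1024, ["B", "KiB", "MiB", "GiB", "TiB"]),
    BASE10: (1000, ["B", "kB", "MB", "GB", "TB"]),
    HISTORIC: (1024, ["B", "KB", "MB", "GB", "TB"]),
}


def format_size(num_bytes, style=BASE10):
    """Return a human-readable size string for a non-negative byte count."""
    base, units = _STYLES[style]
    value = float(num_bytes)
    index = 0
    # Divide by the base until the value fits the current unit
    # or we run out of larger units.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    if index == 0:
        # Plain bytes are shown as an integer, without decimals.
        return f"{num_bytes} B"
    return f"{value:.1f} {units[index]}"


if __name__ == "__main__":
    for style in (BASE2, BASE10, HISTORIC):
        print(style, format_size(1_500_000, style))
```

The same 1,500,000 bytes come out as `1.4 MiB`, `1.5 MB`, or `1.4 MB` depending on the configured style, which is exactly the ambiguity the policy wants to make visible in the unit label.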

Edit: My clarifications to the commenter apply to the commenter of Ubuntu implements units policy, will switch to base-10 units in future release too.


Posted in Planet Ubuntu, Ubuntu | Tagged units policy | 37 Comments

37 Responses

  1. on 25. March 2010 at 14:00 ScottK

    If you can show me a microbyte, then I’ll agree that kilobytes are at all related to SI units.


    • on 25. March 2010 at 18:48 overbenny

      My hobby is collecting microbytes (μB). I now have 62,500 microbytes. Together they can store half a bit. It’s zero today. ;)


    • on 30. March 2010 at 10:53 somejan

      Fractional bits are actually very common in information theory. If I tell you the value of my six-sided-die roll, that gives you log2(6) = 2.585 bits of information. So that would be 2585 millibits or 2647 mibibits :) . AFAIK a byte is just 8 bits.


  2. on 25. March 2010 at 14:19 Miguel

    Well, one bit holds the same information as 0.125 bytes. That is, 1 bit is 125000 microbytes.


  3. on 25. March 2010 at 14:40 d0od

    Hello, I wrote the post :)

    The KB vs. kB thing is a typo, and one that, for the vast majority of our readership, is likely superfluous.

    Although I did read the units policy (and should have paid more attention regarding the typo above; I will freely admit the units policy was a bit dense to make sense of), the aim of the post wasn’t to delve into the semantic debates or the reasoning leading to the policy, but just to present to the reader the “change” they will notice upon using Lucid: that something has changed; the way file sizes are displayed is different. Hence the use of the term ‘change’. It wasn’t meant in any negative sense. The term ‘has changed’ was also used in some of the bug reports we referenced, so I feel it’s more likely a pedantic issue than a misleading one.

    OMG! is aimed at end-users rather than more technical minded readers (though we have plenty of those too) and as such we always straddle a fine line when simplifying or boiling changes down into easy to understand digestible chunks.

    As for reader comments – those I have no control over!


  4. on 25. March 2010 at 18:20 Harz

    Ok, so it’s your fault, not Canonical’s. Fair enough. Still a really stupid decision.

    These days it seems Ubuntu is dying a death of a thousand paper cuts (quite ironic) brought on by a bunch of hubristic circle jerks employed by Canonical.


  5. on 25. March 2010 at 18:41 Mackenzie

    ScottK:
    None of the x10^(n) where n is a negative number prefixes are used because that’d be a bit silly. Are you maybe mixing up kilo- and milli-?


  6. on 25. March 2010 at 19:01 ScottK

    @Mackenzie: Which is why pretending that the kilobyte is an SI unit is ridiculous. If people are so anxious for consistency they should have made a byte 10 bits while they were at it.


  7. on 25. March 2010 at 19:17 CoolGoose

    Harz what are you smoking ? I want some.


  8. on 25. March 2010 at 19:33 En fri man

    Nice job, I have seen that some programs define MB and so on … using standards makes it better…


  9. on 25. March 2010 at 20:49 Raphink

    Thanks for posting this, and for writing this policy. At first, I was quite afraid of where it would lead, but I was soon convinced of the utility of it (as long as it only targets UI and not CLI).

    I actually just asked two people who are average computer users how many kilobytes you can put in a megabyte. One answered 1000, and other said she didn’t care. These are people who use computers daily, so that just convinces me that it won’t hurt most users.

    Additionally, I added a comment on the wiki page about command-line tools, as I would love to have proper CLI options to specify the output metric (without changing the default of each command, quite obviously).


  10. on 25. March 2010 at 21:04 Victor

    I like the kB/MB convention better because it’s a better estimate for humans at how many bytes are actually in a file. The error when truncating to KiB/MiB/GiB/TiB grows larger as the units grow. You won’t have this with kB/MB/GB/TB.

    1 TB = 1 000 000 000 000 B. Blam. EASY.
    1 TiB = 1 099 511 627 776 B?!

    It’s just easier, at a glance, to compare sizes that cross the unit barriers, e.g.:

    4 517 781 kB? 4.518 GB. BLAM. No calculations needed.

    4 517 781 KiB? Uhh.. let’s see here let me just get out gcalctool… Right, “4517781/(1024^2)” aaand we have 4.308 GiB. :-| Highly dissatisfying.

    Notice how 4.5 compares to 4.3? That “error”, as I incorrectly refer to it, will just keep growing the bigger the difference between the units you convert between. Although I have to admit there is a charm in counting using base-2, when 1 TiB is expressible as just 2^40. Then again 1 TB is just 10¹².
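
The arithmetic in the comment above can be checked with a short sketch. The helper names (`kb_to_gb`, `kib_to_gib`, `binary_vs_decimal_gap`) are made up for this illustration; only the numbers come from the comment.

```python
# Hypothetical helpers that reproduce the conversions from the comment above.

def kb_to_gb(size_kb):
    """Decimal units: kB -> bytes -> GB, each step a factor of 1000."""
    return size_kb * 1000 / 1000**3


def kib_to_gib(size_kib):
    """Binary units: KiB -> GiB is a division by 1024 squared."""
    return size_kib / 1024**2


def binary_vs_decimal_gap(power):
    """Relative gap between 1024**power and 1000**power (0.024 means 2.4 %)."""
    return 1024**power / 1000**power - 1


if __name__ == "__main__":
    print(round(kb_to_gb(4_517_781), 3))    # 4.518
    print(round(kib_to_gib(4_517_781), 3))  # 4.308
    # The gap grows with every prefix step: kilo, mega, giga, tera.
    for power, name in enumerate(["k/Ki", "M/Mi", "G/Gi", "T/Ti"], start=1):
        print(name, f"{binary_vs_decimal_gap(power) * 100:.1f} %")
```

The decimal conversion only moves the decimal point, while the binary one needs a real division, and the gap between the two families grows from about 2.4 % at the kilo level to about 10 % at the tera level.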


  11. on 25. March 2010 at 21:10 Victor

    Also, on the UnitsPolicy page, we can see this:

    “Correct basis
    Use base-10 for:

    * network bandwidth (for example, 6 MBit/s or 50 kB/s)
    * disk sizes (for example, 500 GB hard drive or 4.7 GB DVD)”

    I thought MBit/s is usually written Mbit/s? Or Mbps, sometimes. Since we’re picking apart kB versus KB, we should talk about Bit and bit also, I reckon. :)


    • on 25. March 2010 at 21:55 overbenny

      The MBit/s was a typo. I corrected it to Mbit/s. Thanks for pointing that out.

      A bit is either called bit or b. Transfer rate can be displayed in bit/s, b/s, bps, or B/s. It’s a good idea to add a recommended way to the policy. IMHO we should use B/s in most cases. If you need bits, use bit/s. It’s clearer than bps and cannot cause confusion like b/s.


  12. on 25. March 2010 at 23:56 Jeff Parsons

    I thought it was pretty simple, really:

    KiB is unambiguous. It means 2^10 bytes.

    kB/KB/kilobyte are ALL ambiguous; they have all been widely used to mean both 2^10 bytes and 10^3 bytes.

    Certainly, bits and bytes are not SI units—they don’t belong there, either; SI is for units whose definitions are defined and continually refined in terms of things we can observe in the natural world.

    But people _do_ commonly use the SI prefixes for non-SI units, even in the sciences; it surely can’t be unspeakably evil to do so.

    Now we need to decide on a convention for how to treat kB. We have KiB for 2^10 bytes. We don’t need another one for that. Using kB to mean 10^3 bytes keeps the prefix consistent with what “k” means almost everywhere else, and gives us a unit to neatly describe a whole bunch of bit quantities in common use that already do use multiples of 10.

    So since we have to somewhat arbitrarily assign a meaning to kB (for this particular context, given that it is in general meaningless), why not use the one that is the most useful?

    (Aside: the concepts of _portions of bits_, e.g. the microbyte mentioned above, are not generally referred to in these terms, but are most certainly used in information theory.)

    (Aside: we should be using KiB wherever possible, since it’s the only option with a universally unambiguous meaning. If we’re not, it’s only because people will say “huh?” when they see it, even if they wouldn’t notice the difference between the different definitions of “kB”. At very least, an option to make all units displayed with binary prefixes is a must; it’s the only future; everything else is icky and ambiguous!)


  13. on 26. March 2010 at 0:20 William Chambers

    Hey, just wanted to say I support your effort to try to clean things up and bring some form of standards to the desktop. I realize there are a lot of people complaining about every little improvement that’s made, but in the long run the changes are well worth it.


  14. on 26. March 2010 at 3:05 michaeleriksson

    Every change brings some degree of dissatisfaction and need of adaption. What must be done is to make a decision as to whether the short-term pain outweighs the long-term gain.

    Having been bugged by the “prefix confusion” for close to twenty years, I am very much in favour of the recent trend towards use of distinct names for the binary and decimal prefixes. My regret is obviously that this change did not come sooner, which would both have made life simpler at an earlier time and the transition easier.

    Where to best use what kind of unit is another matter—but a very secondary one.

    As for any relation to SI: It is not in any way relevant. What matters is that certain prefixes have certain established meanings. Notably, most people (at least in the western world) will have a clear understanding of kilometer = 1000 meter and kilogram = 1000 gram. Using them in alternate meanings is asking for (and, in this case, receiving) trouble.


  15. on 26. March 2010 at 6:05 DF

    I support what you’re doing as well. Good call.


  16. on 26. March 2010 at 7:58 bullgard

    I appreciate your work towards a more standardized way of using units of measurement in Ubuntu.
    Old habits persist long. I remember the introduction of “MHz” in American journals and the fuss of irrational comments this caused at the time.


  17. on 26. March 2010 at 10:17 George

    You know, having half of one thing and half of another is worse than just having the wrong thing everywhere.

    I really like that it will be easier to understand once it is done but I simply cannot believe that you would put something half-baked into an LTS.


    • on 26. March 2010 at 12:52 overbenny

      We have reverted the change in lucid and will delay all units policy related changes to lucid+1.


      • on 26. March 2010 at 17:26 George

        Great! Looking forward to a consistent and easily comprehensible display of units in lucid+1.


  18. on 26. March 2010 at 19:21 Tomi

    The problem is not choosing between Kio and Ko, but making sure the operating system uses the right unit at the right moment.

    For example, your RAM should be sized and displayed in Kio because RAM is manufactured in powers of two.
    By contrast, your hard drive is not.

    You can use either Kio or Ko, but you just have to know what you are talking about.


  19. on 27. March 2010 at 20:25 BlogoFlux – Latest news on Gadgets, Internet, Applications & Hardware » Blog Archive » Ubuntu implements units policy, will switch to base-10 units in future release

    [...] reported that 10.04 Lucid Lynx would make the switch to base-10, but it was eventually delayed to lucid+1 (10.10) as all applications didn’t comply with the new units policy and were still using [...]


  20. on 28. March 2010 at 3:38 Rodney Dawes

    RE: bps vs. bit/s vs. whatever else, you left out baud. :)

    And KB does exist. It is from the JEDEC standard. However, since nobody makes terabyte semiconductor chips yet, the JEDEC standard only specifies KB, MB, and GB for sizes of 2^n in. When was the last time you bought RAM that was base 10? Bytes are not base 10, they are base 2. How about a giant class action suit against storage manufacturers instead? Why aren’t they changing to display the correct values on their packaging and devices, and in advertisements? How does perpetuating the lies help users?


  21. on 28. March 2010 at 11:38 David

    I can’t stand the Language Police telling everyone to stop using ‘kilobyte’ one way and to start using it another way. It doesn’t work and it just creates confusion and havoc. If they wanted to measure bytes in base 10 (in a horribly inconsistent manner), they should have created a new unit for it, instead of pretending the standard unit was morally wrong because it’s “inconsistent”. Despite what is done in France, language cannot be dictated.

    Anyway…what I would like to see is fully-decimal measurements everywhere. That means using bits, not bytes. To human beings, the quantity of ‘8 bits’ is meaningless and confusing. A bit, on the other hand, makes sense: it’s the smallest unit of information. Wouldn’t life be much simpler if information were always measured in base 10 fully?


    • on 28. March 2010 at 11:53 michaeleriksson

      You may, as I read you, be missing the point: It is not a question of forcing a change of consistent use. The main problem is not that e.g. the “k” prefix has another meaning than elsewhere, but that its use even wrt computers was/is highly inconsistent and confusing. This change is better likened to e.g. enforcing a consistent terminology with regard to various kinds of gallons.

      (As an aside: There are large practical advantages to having numbers divisible by factors of 2, and having a unit like the byte, or otherwise counting many things by 2s, still makes sense.)


      • on 28. March 2010 at 13:03 David

        Human beings are taught to think in base 10. The confusion over the meaning of ‘kilobyte’ didn’t occur until computing became mainstream and some standards organization thought that they had some sort of Divine Right to change the meaning from the accepted usage in the field of computing. So, yes, it is about forcing a change.

        And, now, because of that arrogance, people are confused, and consistency is badly needed. The only way to do that is to avoid ambiguity. This means using either base 10 wholeheartedly or base 2 wholeheartedly. The Frankenstein mixture of the two that I have never seen outside of hard drive marketing should just go away forever.

        As I said, humans think in base 10. A fully base-10 system would be far less confusing for people (just like the metric system is far better than the English system). If the binary system makes more sense somewhere and it even makes sense to expose this fact to normal human beings, use the binary system, not the evil hybrid. I suspect that most people (though not I) would simply not care about the fact that RAM is designed in such a way that it comes only in powers of 2.


  22. on 28. March 2010 at 19:12 michaeleriksson

    @David

    I do not quite follow you, and in as far as I do, I largely disagree:

    I do not know how old the confusion is, but it must be at least several decades, and the actions of the standards organization are meant to remove this already existing confusion—it is not creating a new confusion.

    This is what a standards organization does: It suggests standards to remove confusion, make interoperation easier, whatnot. The rest of the world can choose to adopt these standards or not. There is no element of force in the sense of e.g. “Use this standard or we hit you on the head with a keyboard.” (but there may be in the less drastic sense of “driving for a change”).

    There is nothing wrong with using several bases as long as they are clearly distinguishable (and just the lack of distinction). If one is inferior people (outside the US, ahem) will eventually drop it. Notably the difference between these prefixes is mostly uninteresting to the man on the street, who can work by a good heuristic of 1 k ~ 1Ki, etc.

    Further, even if one type of unit is eventually used exclusively with the public, there is no reason for the specialists to forego other units. Consider e.g. measures like the Planck length, the AU, or the parsec among physicists.


    • on 28. March 2010 at 20:34 David

      SI and other standards organizations not only defined the ‘kibi-’ prefix, but they define the meaning of ‘kilo-’ in the context of bytes. This does not remove any confusion at all, but forces the issue. The far more common usage of ‘kilobyte’ has been (and still continues to be) ‘1024 bytes’, while these organizations felt it their duty to declare that, from now on, ‘kilobyte’ should mean ‘1000 bytes’. That’s enforcing a change in common usage—Language Police.

      Their “standards” are what has made this confusion so bad today, since people thought they should start following them, leading to an even worse situation where the common usage is not as reliable. In the end, it will never ever be the case that everyone uses ‘kilobyte’ to mean ‘1000 bytes’. The ambiguity will always be there, thanks to “standards”.

      Bytes, as opposed to bits, are useful in binary contexts, such as RAM and hard-drive platters. In those contexts, the fully-binary system makes sense, while the hybrid system these organizations are trying to enforce makes no sense whatsoever.

      Can you think of one context where using decimal values of bytes (rather than binary values, or decimal values of bits) makes sense, where it conveys some meaningful information? I cannot, because it makes no sense, and it never did. That’s why the only “decades-old” confusion you think existed was for users who saw these technical terms and got confused. And users have a right to get confused—not because of the ambiguous ‘kilo-’ but because we’re talking about binary values. That’s why I think it makes more sense to go all-decimal for users and just measure bits.

      Anyway, personally, I would love to see sizes displayed as, for example, ‘3.14 Mb (383.5 KiB)’. No ambiguous terms, normal metric values given primacy, so they can be read by actual human beings and not just by us geeks. And it would be a great way to transition everyone to a more humane system.


      • on 28. March 2010 at 20:49 michaeleriksson

        I see little room for a mutually beneficial discussion at the moment, but reiterate that I consider your “language police” take to be a fundamental misunderstanding.

        In between, I have read http://en.wikipedia.org/wiki/Binary_prefix for some background information on the history, the complications, etc. You too may find it informative.


  23. on 28. March 2010 at 22:31 Jasso

    Amazing how so many seem to miss the point. This process shouldn’t be about whether your favourite program uses KB or KiB as a unit, nor about how many bytes 12091832 GB means at the moment. For a regular user the most important point of this policy is that (in future) when a program shows a value of 53274 KiB or 1328 kB the user will finally know exactly how many bytes it actually is.


  24. on 29. March 2010 at 12:03 OurLife » Ubuntu units policy

    [...] as trivial as it first seems, which is well illustrated by the fact that its introduction was eventually postponed by one release. A fairly good picture of the difficulties that arise is given by the following [...]


  25. on 29. March 2010 at 22:02 EnNegrita » Ubuntu 10.10

    [...] official developer, as well as the wiki with the policy on the use of units of measurement in Ubuntu, confirm that the distro [...]


  26. on 30. March 2010 at 4:47 Ubuntu 10.10 medirá la información en kibibytes (cuando sea necesario) | AlfaLibre

    [...] An official developer, as well as the wiki with the policy on the use of units of measurement in Ubuntu, confirm that the distro is in the middle of a process (a democratic one this time, by the way) of standardizing the way it represents the size of information, with a view to finishing in Ubuntu 10.10. [...]


  27. on 30. March 2010 at 5:00 Ubuntu 10.10 medirá la información en kibibytes (cuando sea necesario) « Swichers Linux

    [...] An official developer, as well as the wiki with the policy on the use of units of measurement in Ubuntu, confirm that the distro is in the middle of a process (a democratic one this time, by the way) of standardizing the way it represents the size of information, with a view to finishing in Ubuntu 10.10. [...]


  28. on 31. March 2010 at 8:09 Ubuntu 10.10 medirá la información en kibibytes (cuando sea necesario) | Todos Geek

    [...] official developer, as well as the wiki with the policy on the use of units of measurement in Ubuntu, confirm that the distro [...]



Comments are closed.
