What’s in a Name? (Part I)

What’s in a name? That which we call a rose
By any other name would smell as sweet;

William Shakespeare, Romeo and Juliet

Just because I am quoting one of literature’s most over-quoted lines, don’t take me to be a Shakespeare Snob.  (Not that I wouldn’t want to be a Shakespeare Snob–I consider it one of my life’s failings not to have read and seen more of his plays, but I digress…)  I just thought it would be a good way to ease into the topic of variable names.  I need to go slowly because getting into this topic is like wading into a lake filled with piranha.

There is something in the Bard’s words that is applicable to programming, and it might be stated thus:

What’s in a variable name? That which we call NumberOfItems
By any other name would still be able to count the number of items.

Kind of rolls off the tongue, doesn’t it?  The compiler certainly doesn’t care if we call a variable iNumberOfItems, NrItms, or xxxxx.  So who does?  Well, maybe your boss does, if he came up through the nerd ranks and has tried to impose some variable naming standards.  Maybe the poor soul who has to read and debug your code long after you’ve moved off to your next job does, when s/he cannot remember what noits means.  But really, it should be you who cares!
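
To make that concrete, here is a minimal sketch in C++ (my pick for illustration; the point is not tied to any one language).  The three function names are invented for this example.  The bodies are identical apart from the name of the local counter, so the compiler has no reason to treat them differently; only the human reader does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the non-empty entries, using a name that says what it holds.
std::size_t countClear(const std::vector<std::string>& items) {
    std::size_t numberOfItems = 0;
    for (const std::string& item : items) {
        if (!item.empty()) ++numberOfItems;
    }
    return numberOfItems;
}

// Same logic, abbreviated name. Quick: what does "noits" mean?
std::size_t countAbbreviated(const std::vector<std::string>& items) {
    std::size_t noits = 0;
    for (const std::string& item : items) {
        if (!item.empty()) ++noits;
    }
    return noits;
}

// Same logic again, with a name that carries no meaning at all.
std::size_t countOpaque(const std::vector<std::string>& items) {
    std::size_t xxxxx = 0;
    for (const std::string& item : items) {
        if (!item.empty()) ++xxxxx;
    }
    return xxxxx;
}
```

All three return the same count for the same input.  The only one you can still read six months later is the first.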

Now there is a plethora of naming standards out there.  I’ve dealt with several ad hoc standards in my day, and even tried to devise some of my own.  For Microsoft’s .NET Framework there is a codified set of rules here.  A quick web search turned up a set for Java here.  I’m sure I could find more.

For me it’s a constant battle to avoid the sin of using abbreviations in variable names.  Perhaps it’s the emotional scarring of having spent so many years with IBM assembler where a label could never be more than eight characters.  Back in the mists of history I even managed to run into a Basic interpreter that would only permit one-character variables, $A, $B, %I, etc.  (The horror, the horror!)

I’ve come a long way since the days of uppercase-only variables from FORTRAN, PL/I, and 360 Assembler.  When UNIX and C came along, there seemed to have been a backlash against all of the upper-cased-ness of the IBM world, despite the fact that UNIX was UNIX and not unix.  But I found it all kind of silly, the idea that your program in C was more readable because everything was in lowercase, and my program in PL/I was less readable because all the keywords and variables were in uppercase.  I say silly because I believe that Humanity as a collective had decided that mixed-case writing was superior to mono-cased writing about the time of… well, since people have been writing. (You know, back in the day, with that IBM 026 card punch, all they had was uppercase letters, and we were damn lucky to have those, sonny!)

As a kind of modern guild, we developers seem to have arrived at a consensus that mixed-case variables, function names, labels, etc. are better.  Sure, there are some exceptions.  In C/C++/C# the names established via #define are conventionally all uppercase.  (Although C# has done its best to take #define out behind the garage and put a bullet in its head–rightfully so.)  And it seems that language keywords will probably forevermore be in all lower case, despite the occasional holdout (e.g. me when coding SQL).

But variable and function naming conventions, while dovetailing into the mixed-case school of design, are still fractured along other lines, leaving room for more controversy, and yes, silliness.

Hungarian notation, a legacy of Charles Simonyi, was a wonderful idea, particularly when using untyped or weakly-typed languages.  While it seems to be somewhat disparaged nowadays, it still has its proponents.  So it’s still likely you’ll run into a set of corporate coding standards somewhere, dictated by one of this discipline’s disciples.

The OCD programmer in me is still lured by Hungarian.  Almost as soon as I type “int” for an integer variable, the middle digit of my right hand is atop the I key, and something deep within me wants the variable name to begin with a lower case “i”.  But I should have known something was horribly wrong by the time I got to the lpsz prefix in Win32 C.  (For those never traumatized by this, it stands for “long pointer to a zero-terminated string.”)  And this is where Hungarian notation falls on its ass.  Where do you stop?  Sure, ints, bools, chars, and strings–they all have easy selections for a prefix (i, b, c, s).  But what about byte?  Eh, can’t use “b”, that’s already taken by bool.  How about “y”?  Great, except only you will know that means byte.  What about “by”?  Well, hell, you’re already halfway there, may as well just use “byte”.
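
Here is a rough sketch of that collision in C++.  The struct and field names are hypothetical, made up for this example, and "by" for byte is just the compromise described above, not any official prefix.

```cpp
#include <cstdint>
#include <string>

// Hungarian style: each field name starts with a type prefix.
struct HungarianOrder {
    int iQuantity;          // i  = int
    bool bShipped;          // b  = bool
    char cGrade;            // c  = char
    std::string sCustomer;  // s  = string
    std::uint8_t byFlags;   // by = byte, because "b" was already taken by bool
};

// Same data with the prefixes dropped; the declared types already say it all.
struct PlainOrder {
    int quantity;
    bool shipped;
    char grade;
    std::string customer;
    std::uint8_t flags;
};

// The two functions do the same thing; compare how each reads at the call site.
bool isBulkHungarian(const HungarianOrder& order) {
    return order.bShipped && order.iQuantity >= 100;
}

bool isBulkPlain(const PlainOrder& order) {
    return order.shipped && order.quantity >= 100;
}
```

Five fields in and there is already one prefix that needs a footnote.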

OK, so you spend a day or two, get the complete list of all the native types in your language and come up with clever alternatives for the prefix collisions: byte/bool, string/short, decimal/double, etc.  Put them up on the Wiki, and… Mission Accomplished!

Not so fast.  What about those secondary types you use all the time?  In my job I use way more .NET DateTime variables than I do chars.  OK, we’ll use “dt” for that.  I sure deal with a lot of DataSets.  OK, use “ds” for that.  There are DataTables, too, so we’ll use “dt” for that–oh no, we can’t, we used that for DateTime already.  The .NET Framework contains thousands of classes and structs.  You mean to say you’re going to take another few days and figure out which are the ones your team uses and then assign prefixes to them?  Push ‘em out to the Wiki, send an email indicating that All Must Follow These Hungarian Notation Prefixes, start code reviews for compliance checks…  If your co-workers are nice, they will laugh at you and recommend you get professional help.  If they are not, they will kill you, and rightfully so.
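
If you want to see how quickly the Wiki table runs out of room, here is a small C++ sketch.  PrefixRegistry is a hypothetical class invented for this example; it just models the rule "one prefix, one type" and refuses a second claim on a prefix that is already taken.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the team's Wiki page of Hungarian prefixes.
class PrefixRegistry {
public:
    // Returns true if the prefix was free or already belongs to this type,
    // false if some other type claimed it first.
    bool claim(const std::string& prefix, const std::string& typeName) {
        auto result = prefixToType.insert({prefix, typeName});
        return result.second || result.first->second == typeName;
    }

    // Returns the type that owns the prefix, or an empty string if nobody does.
    std::string ownerOf(const std::string& prefix) const {
        auto found = prefixToType.find(prefix);
        return found == prefixToType.end() ? std::string() : found->second;
    }

private:
    std::map<std::string, std::string> prefixToType;
};
```

Claim "dt" for DateTime and "ds" for DataSet and both succeed.  Claim "dt" for DataTable and you get false back, three types into a framework that has thousands.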

Even if this worked, so what?  A significant fraction of the variable names in my code are for classes that are defined in the application itself.  This means you will never stop updating that Wiki.  And those snickers behind your back won’t stop until the day your boss decides enough is enough, and the two of you head down to HR for a short chat.

So Hungarian is a slippery slope.  Perhaps you could try to draw the line at only basic or native types, but I think that’s just a gateway drug.  Look, it’s 2010, and we just have to say “NO” to Hungarian!

Except…

OK, so here’s where I come clean.  I find going Hungarian particularly helpful in one or two situations.  I know, it’s kind of like saying, “I only shoot up on weekends,” but when I am programming a UI (WinForms in my case, but a Web UI would apply here as well), I find it very helpful to use short prefixes for the controls on the form, e.g. btn=Button, txt=TextBox, chk=CheckBox, etc.  Since the list of controls in my IDE (VS2008) is arranged alphabetically, this keeps them grouped into types.  With scores of controls on a complex form, this can be pretty useful for finding a particular control quickly.  It also lets me easily find controls I may have dropped onto the form and neglected to rename from the boilerplate name the IDE assigned.  So I don’t think I’m going to stop this any time soon, despite its flying in the face of the orthodoxy I proposed in the previous paragraph.
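
A quick sketch of why the prefixes pay off in an alphabetical list, written in C++ rather than against any real IDE.  The control names and both helper functions are made up for illustration, and the sort here is a plain byte-wise comparison, which may not match exactly how a given IDE orders its list.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the control names in alphabetical (byte-wise) order.
std::vector<std::string> sortedControlNames(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// True if the name starts with one of the short control prefixes.
// A name that fails this check was probably never renamed from its default.
bool hasKnownPrefix(const std::string& name) {
    static const std::vector<std::string> prefixes = {"btn", "txt", "chk"};
    for (const std::string& prefix : prefixes) {
        if (name.compare(0, prefix.size(), prefix) == 0) return true;
    }
    return false;
}
```

Feed it txtName, btnSave, chkActive, button1, txtEmail, and btnCancel, and the sorted list comes back as btnCancel, btnSave, button1, chkActive, txtEmail, txtName.  The buttons sit together, the text boxes sit together, and button1, the one I forgot to rename, is the only name without a known prefix.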

My second sin in this department is that I cannot get away from “scope” prefixes.  You’ve probably seen these in some form or another.  One of the more common is to use “m_” to prefix a private or protected member variable.  (Alternatively you might see just a plain “_”, but I did C for too long and will always fear starting a name with an underscore.)  I use “k_” for a const, and “s_” for a static.  That’s pretty much the list.  Since I’m doing .NET, I do tend to avoid these for public names, since the Framework is clearly anti-Hungarian, and to a consumer of my class, I don’t want to present a different paradigm.
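
For the record, here is roughly what that looks like, sketched in C++ instead of my usual C#.  ItemCounter and everything in it are hypothetical names invented for this example; the only thing taken from the paragraph above is the prefix scheme itself.

```cpp
// Hypothetical class showing the three scope prefixes side by side.
class ItemCounter {
public:
    // Public names carry no prefix, so callers see nothing unusual.
    // Adds howMany items unless that would exceed the limit.
    void add(int howMany) {
        if (m_count + howMany <= k_maxItems) {
            m_count += howMany;
            ++s_totalAdds;
        }
    }

    int count() const { return m_count; }

    static int totalAdds() { return s_totalAdds; }

private:
    static constexpr int k_maxItems = 100;  // k_ = constant
    static int s_totalAdds;                 // s_ = static, shared by all instances
    int m_count = 0;                        // m_ = per-instance member
};

// Storage for the static member.
int ItemCounter::s_totalAdds = 0;
```

Inside add() you can tell at a glance which names belong to the instance, which are shared, and which never change, without scrolling up to the declarations.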

I realize that, at least as far as the .NET coding guidelines go, I’m committing some sins here, especially with the “m_” business.  The recommendation from the .NET guidelines is to use “camel case” for what they call “fields.”  I’m going to continue this in the next post because (1) this has gone on long enough, and (2) this whole “camel case” thing is another rat hole that I’m going to dive into, and I need to get my strength up for that adventure.  Until then…

Just say “NO” to Hungarian, except in certain cases where you can’t bear to let go.
