Thursday, April 27, 2017

No comment

There are times when code comments are very helpful. I could spend quite a bit of time trying to understand the point of this method:

 public long nextLong() {
   this.l.lock();
   try {
      this.u = this.u * 2862933555777941757L + 7046029254386353087L;
      this.v ^= this.v >>> 17;
      this.v ^= this.v << 31;
      this.v ^= this.v >>> 8;
      this.w = 4294957665L * (this.w & 0xffffffff) + (this.w >>> 32);
      long x = this.u ^ (this.u << 21);
      x ^= x >>> 35;
      x ^= x << 4;
      long ret = (x + this.v) ^ this.w;
      return ret;
   } finally {
      this.l.unlock();
   }
 }


But code comments are the absolute wrong place to elucidate on this. Instead just preface the class with something like this:

//Mid-quality random numbers, better than java.util.Random,
// faster than java.security.SecureRandom
//Source: Press, Teukolsky, Vetterling, Flannery
// Numerical Recipes 3rd Edition: The Art of Scientific Computing


Now I instantly know the intent and, if I really want the details, I know what book to read. Similarly, sometimes it's obvious what a piece of code does, but not why. Such situations usually arise around edge cases that someone encountering the code for the first time wouldn't have thought of. A 1-line comment like:

 // if we get here, it means we weren't able to find a match

is a good thing.

That said, the academic infatuation with code documentation is just nutty. Every class must have a header? Really? You need me to explain what this 3-line interface file does? If you do, you really shouldn't be looking at my code at all.

package query;

public interface IQueryDistribution {
   QueryPattern getQueryPattern();
}


Even if you don't know Java, you should be able to figure out from the words "public interface" that this is some kind of interface. The fact that it's called IQueryDistribution is a tipoff that it has something to do with the distribution of queries. Similarly, the fact that the only exposed method is called "getQueryPattern" and it returns an object of type "QueryPattern" is a pretty big clue that any object conforming to this interface should produce query patterns.

Oh, but I didn't tell you all the places it's used and what classes implement it. You know why? I have no idea! The whole point of writing an interface is so you insulate your code from that kind of dependency. Any such comments would be obsolete the minute I turn it over to the rest of the development team.

I think a lot of academics don't realize just how quickly code is developed in industry. When I started programming in the late 70's, extensive header documentation actually made some sense. It probably took all day to write the 50-line function being documented, so spending another half hour filling in the boilerplate header was not terribly burdensome. Code was typically reviewed printed on greenbar rather than online, so you couldn't just right click on an object to get a list of all the places it was referenced. Nor could you instantly reference version control to find out who messed with this code last.

That's simply not the way things are done anymore. With today's development tools, even an old codger like me can knock out hundreds of lines on the rare day I spend actually writing code. Junior programmers who are doing it all day every day may produce over a thousand. Just about everything I used to get from comments can be produced by the development environment almost instantly. The ability to run the entire suite of unit tests in seconds on any code change means that I get immediate feedback if I've misunderstood something. For that matter, assuming the unit test cases really do exercise the object, they provide much better insight into how the methods should be called and the structures returned. If I want to invoke an object, I'll often cut and paste the unit test that matches my use case the best right into my code and modify from there. It's always better to start with a piece of code that works.

Ah, well, it doesn't take that long to put a brief header on each file. So, since I've put way too much effort into this term project to get dinged on comments, I spent a few hours dutifully inserting explanations into my code even though they would already be clear to any competent programmer.

Don't get me wrong; I'm not opposed to documentation. Having some decent wiki-style entries on how to find, fix, and enhance the code is a great thing. The place for those are on the development team's shared web page. Having test cases written down in advance greatly reduces misunderstandings on requirement. Those should go in whatever test management tool you're using. And, of course, if a class is going to be published for general use by other groups, then one really ought to spend a little time writing up some sort of user guide telling what the class does and how to invoke it. None of these things choke your source files with a bunch of unneeded text that simply gets in the way of reading the actual code.

What is silly is the idea that taking code written in a non-ambiguous computer language and paraphrasing it in an ambiguous natural language somehow helps understand it. When the intent is truly unclear, by all means, let the reader know what you were thinking. But, if all you're doing is re-writing your code in English, you're wasting your time.

No comments:

Post a Comment