Tuesday, December 10, 2013

An amusing historical analysis of the origin of zero-based array indexing (hint: C wasn't the first). There's a twist to the story which I won't reveal, so as not to spoil it for you. All in all, it's a nice anecdote, but it seems to me that many of the objections raised in the comments are valid.
This is a well-written, engaging story. But I can't help thinking that he misinterpreted the explanation he was given as "saving on computation time", while it was in fact "in this context, indexing at 0 gives me a natural, more elegant design than indexing at 1". The quote says: "I can see no sensible reason why the first element of a BCPL array should have subscript one." This is a comment about design and taste, not efficiency.
I find it awkward that we've got a post that eloquently warns us about mythology and neglect of history, insists on the importance of going back to historical sources, yet exemplifies this process with an extremely debatable interpretation.
I'm still not convinced that compilation efficiency is actually the major justification for this design choice. I've seen no convincing argument that it may have mattered at the time (would adding 1 to a compile-time constant actually take noticeable time, given the bounded number of array access operations appearing in source programs?), and Martin Richards' text doesn't mention compilation time, but uses a taste/design vocabulary: "Naturally ...", "behaves like", "I can see no sensible reason why...".
That's ok. I didn't want to put my view in the original post, since it would give the story away. I guess the compilation time consideration is not an impossibility (and compilation time efficiency is often something language designers think about!). But the evidence in this case is not convincing enough, I suppose.
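To make the subscript arithmetic in that exchange concrete, here is a minimal C sketch (an illustration added here, not code from the post or the thread). With zero-based subscripts the index simply is the offset from the start of the array; a one-based convention needs an extra subtraction, and when the subscript is a compile-time constant that subtraction folds away anyway, which is why the compilation-cost reading is debatable.

    #include <stdio.h>

    int main(void)
    {
        int a[4] = {10, 20, 30, 40};

        /* In C, a[i] is defined as *(a + i): the subscript is the offset. */
        int zero_based = *(a + 2);

        /* A hypothetical one-based a[3] would mean *(a + 3 - 1); with a
         * constant subscript the "- 1" is folded away at compile time.   */
        int one_based = *(a + (3 - 1));

        printf("%d %d\n", zero_based, one_based);  /* prints "30 30" */
        return 0;
    }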
While I like his writing style, I don't see where Mike Hoye gets the idea that anyone spends time justifying zero- versus one-based indexing styles. I never heard C's zero-based indexing discussed when I was in college, nor in years of industry afterward; I never had a conversation about it at school or at work. But Hoye makes it sound like we're South Pacific islanders, cargo-culting our way along without any logical rationale:
Where are all these folks telling and retelling stories? I get an urge to write an Uncle Remus story about how Brer Rabbit fools Brer Fox by using better indexing. We can change the ending of the Tar Baby episode this way.
Is it productive to discuss? Will it ever change for a language already in use? Analyzing it smacks of a style statement, so folks who do it another way can be dinged for bad taste. (When you approach the pearly gates, Saint Peter asks whether you prefer zero-based or one-based indexes, then studies you carefully, keeping one hand on the lever for a trap-door under your feet.)
Such off-by-one errors are rare in practice. I never catch anyone making one in review. They forget to free allocations after an error when they goto a cleanup label, sure, but they never make off-by-one errors. I'm sure I must have made such an error at least once in a loop, but it was so long ago I don't recall. The bugs we close at work never have the resolution "Oh, it was an off-by-one error." As a source of problems, indexing dwindles to insignificance compared to simple things like wondering: what was this function ever expected to do?
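For reference, the error-handling pattern mentioned above looks roughly like this in C (a sketch with made-up names, not code from anyone's actual codebase); the common slip is releasing some but not all of the resources acquired before the jump to the cleanup label:

    #include <stdlib.h>
    #include <string.h>

    /* Sketch of the goto-cleanup idiom: the bug class described above is a
     * leaked allocation on the error path, not a miscounted index. */
    int build_buffers(size_t n, char **out_a, char **out_b)
    {
        char *a = NULL, *b = NULL;

        a = malloc(n);
        if (a == NULL)
            goto fail;

        b = malloc(n);
        if (b == NULL)
            goto fail;      /* forgetting free(a) below would leak here */

        memset(a, 0, n);
        memset(b, 0, n);
        *out_a = a;
        *out_b = b;
        return 0;

    fail:
        free(b);            /* free(NULL) is a no-op, so this is safe */
        free(a);
        return -1;
    }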
Sometimes I write an assertion that fails because it turns out it really is possible to target all of a collection, not just a strict subset. It's the reasoning that's wrong, not the indexing, like the time you find a partial match can actually correspond to all the bytes in a referenced source, not just some of them. (It's a strict subset in the source where you find it, but the whole thing in the destination where you decode it.)
When I work in C I'm usually managing space, and I imagine a picture. I don't talk to myself about the "first" element of an array. There's a block of memory where content is located, and I know its address. I do arithmetic in offsets rather than counting elements. When it gets really complex, and it becomes necessary to analyze what I'm doing formally, I think in terms of coordinate systems on the number line and map between them. Sometimes there are five or more coordinate systems when doing things like pattern-matching one scatter-gather array with another, where both are subsets in the middle of bigger streams.
For example, the following picture comes up in TCP deduplication, when packets show up in Ethernet frames presented as scatter-gather iovec arrays and you try to figure out which parts correspond.
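To illustrate the kind of coordinate-system mapping being described (a sketch under assumed names, not the commenter's code), here is one way in C to translate an offset in the logical byte stream into a pair of "which iovec segment" and "offset within that segment", using the POSIX struct iovec:

    #include <stddef.h>
    #include <sys/uio.h>    /* struct iovec { void *iov_base; size_t iov_len; } */

    /* One coordinate system is "offset within the whole stream"; the other
     * is "(segment index, offset within that segment)".  Mapping between
     * them is plain offset arithmetic; no element counting is involved.  */
    struct stream_pos {
        size_t seg;     /* index into the iovec array       */
        size_t off;     /* byte offset within that segment  */
    };

    /* Hypothetical helper: returns 0 and fills *pos if stream_off falls
     * inside the data covered by iov[0..iovcnt-1], or -1 if it is past
     * the end. */
    static int locate(const struct iovec *iov, size_t iovcnt,
                      size_t stream_off, struct stream_pos *pos)
    {
        for (size_t i = 0; i < iovcnt; i++) {
            if (stream_off < iov[i].iov_len) {
                pos->seg = i;
                pos->off = stream_off;
                return 0;
            }
            stream_off -= iov[i].iov_len;
        }
        return -1;
    }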
