Day 281 – Nature of Programming

85 – “Hello World” in 100 Programming Languages

This is a more technical post than the usual 365 fare.
Apologies to the non-programmers.

On the face of it, programming is about creating programs that do something useful. Turning a problem into a solution. And hopefully a working one at that.

To me, programming is about quite a bit more than that; it’s the act of taking an abstract idea and trying to encode all aspects of that idea into a correctly functioning piece of software. I use the word “trying” with purpose, because expressions in real-world programming languages are all just approximations of this ideal. Real-world programs always have some level of defects and “good enough” about them.

In the first instance, though, I’d always strive for the ideal.

Programming languages provide a great many tools to express meaning: comments, named variables and methods, automated tests, type systems, and code contracts with formal verification.

The strongest expressions of meaning are always preferred; a formally verified code contract beats using parameter types alone. Boxing in the meaning through parameter types beats well-considered names. And even code comments beat constraints inherent in the idea that are left unexpressed altogether.

Every piece of meaning from the original idea that is added to a piece of code improves the chances of it getting implemented and maintained correctly. Additional meaning might have the compiler catch you in a lie before it becomes a bug. Additional meaning might help you remember the nit-picky details of how things hang together when you try to work on a piece of code years after you originally wrote it.

When I start with an idea, I look at it from all the angles. I try to work out what the underlying truths of the idea are. And then I try to find a way to translate all of these truths into pieces of the program.

Code contracts can expose code paths that can lead to null pointers where my feeble brain told me it was impossible for them to appear. Exploiting the type system can let the compiler warn me when I’m about to crash the Mars lander by mixing up my feet and metres. Expressively naming my methods and variables can help me spot where I’ve forgotten to sanitise user input.
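As a rough sketch of the feet-and-metres point, here is one way the two units could be given separate types. The class names `Metres` and `Feet` are made up for this illustration. Python only raises the error at run time, but a static checker such as mypy should flag the same mistake before the program ever runs, which is closer to the compiler warning described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metres:
    """Hypothetical wrapper type: a length known to be in metres."""
    value: float

    def __add__(self, other: "Metres") -> "Metres":
        # Refuse to add anything that is not also a Metres value.
        if not isinstance(other, Metres):
            return NotImplemented
        return Metres(self.value + other.value)


@dataclass(frozen=True)
class Feet:
    """Hypothetical wrapper type: a length known to be in feet."""
    value: float

    def __add__(self, other: "Feet") -> "Feet":
        if not isinstance(other, Feet):
            return NotImplemented
        return Feet(self.value + other.value)

    def to_metres(self) -> Metres:
        # One international foot is defined as exactly 0.3048 metres.
        return Metres(self.value * 0.3048)


altitude = Metres(1200.0)
correction = Feet(50.0)

# altitude + correction              # raises TypeError: the units do not match
total = altitude + correction.to_metres()  # the conversion has to be spelled out
```

The mistake can still be made, but only by writing an explicit (and wrong) conversion, which is much easier to spot in a review than a bare number in the wrong unit.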

And the trick is to push the encoding of concepts as far as possible into the strongest encodings the language has available.

Sure, I could make all my variables dynamic, then add unit tests to verify appropriate behaviour by variable type… but why not use a strong type system instead? Dynamic variables should really only be used where there is no alternative; maybe because some aspect of the strong type system is too rigid to express the flexibility inherent in an implementation. But to use them routinely and then patch the hole with unit tests shows a very profound misunderstanding of the purpose of the type system.

Sometimes approaches can be combined to make an even stronger encoding of a concept. Just using Hungarian Notation to differentiate sanitised user input from unsanitised user input is a good start… but adding program-wide automated testing to the build system, to verify that variables using the naming convention for sanitised input are only ever assigned from other sanitised variables or from the sanitisation method, reinforces the concept in a way that makes it almost part of the language itself. It becomes a visible portion of the code that is verified by almost-the-compiler.
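A build-time check along those lines might look something like the sketch below. It assumes a made-up convention where sanitised variables start with `s_` and the sanitisation function is called `sanitise`; both names are placeholders. It only understands plain `name = value` assignments (no tuple unpacking, attributes, or annotated assignments), so treat it as a starting point and not a complete checker.

```python
import ast
from typing import List

SANITISED_PREFIX = "s_"      # hypothetical naming convention for sanitised values
SANITISER_NAME = "sanitise"  # hypothetical name of the sanitisation function


def _is_trusted(value: ast.expr) -> bool:
    """True if the value is a sanitised variable or a direct sanitise(...) call."""
    if isinstance(value, ast.Name):
        return value.id.startswith(SANITISED_PREFIX)
    if isinstance(value, ast.Call) and isinstance(value.func, ast.Name):
        return value.func.id == SANITISER_NAME
    return False


def find_violations(source: str) -> List[int]:
    """Return the line numbers where an s_ variable gets an untrusted value."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if (isinstance(target, ast.Name)
                    and target.id.startswith(SANITISED_PREFIX)
                    and not _is_trusted(node.value)):
                violations.append(node.lineno)
    return sorted(violations)


sample = """raw = input()
s_name = sanitise(raw)
s_copy = s_name
s_bad = raw
"""

print(find_violations(sample))  # [4]
```

Run over every source file as part of the build, a non-empty result would fail the build in much the same way a type error would.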

And there will be concepts that are hard or impossible to encode to the ultimate degree.

A variable name can indicate that the assigned value should only ever be a prime number… but there is very little that can be done to guarantee this is true, beyond hoping everyone is careful not to break that promise in the code. There is no way to reasonably implement a strongly typed “PrimeNumber” type.

But that doesn’t mean we shouldn’t keep trying.
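One partial encoding, sketched here under the assumption that a run-time check is acceptable, is a wrapper that tests for primality once when the value is created. The `PrimeNumber` class and its helper are illustrative only. This is not the compile-time guarantee a real type would give, and the trial-division check gets slow for large values, which is part of why a “reasonable” implementation stays out of reach.

```python
def _is_prime(n: int) -> bool:
    """Trial division; fine for small values, too slow for very large ones."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


class PrimeNumber:
    """Hypothetical wrapper: the promise is checked once, at construction."""

    __slots__ = ("_value",)

    def __init__(self, value: int) -> None:
        if not _is_prime(value):
            raise ValueError(f"{value} is not a prime number")
        self._value = value

    @property
    def value(self) -> int:
        # Read-only, so the checked value cannot be swapped out later.
        return self._value


modulus = PrimeNumber(7)   # accepted
# PrimeNumber(9)           # raises ValueError at run time
```

Any function that takes a `PrimeNumber` parameter can then rely on the promise without re-checking it, which at least moves the constraint out of a variable name and into something the program enforces.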

And sooner or later, the ad-hoc encodings that have broad use and applicability will turn into new programming language paradigms for the next generation of languages. And they will be harder to program in… but only because they won’t allow us to be nearly as imprecise with our “words”.

You can lament the fact that not using dynamic variables means that you need to put in some extra effort… but all I hear is “why can’t you let me have some more hard-to-diagnose bugs?”