Programming Language Design

One of the things that strikes me is how different programming languages are, not necessarily in the way they look, but in the way they require you to use them. In some sense, this is no surprise, of course, so I’d better clarify.

Many, if not most, programming languages have their syntax derived from C these days, and therefore share syntactic elements like curly braces for delimiting blocks, loop and branch syntax, and so forth. In that sense, C, C++, Perl, PHP, Java, Go, Pike, and a plethora of lesser-known languages look almost exactly the same.

Sure, there are differences in that some of these define extra keywords and syntactic constructs, for example to support object-oriented programming. But those syntactic constructs again look similar from language to language, and are closely modelled on the C syntax for defining structures.

Personally, I’d consider these languages very similar in looks. But of course there are others, from the venerable Pascal to Python or Ruby, that do not share the same syntactic elements. To name an example, blocks in Ruby function a bit like more complex lambdas (or less complex closures), concepts that simply don’t exist in all languages.

So when I say that different languages require you to use them in different ways, I’m not talking about syntax, directly. I’m also not talking about e.g. how documentation for third-party libraries is expected to be delivered, or anything to do with the wider ecosystem in which the language is used. These differences are, for the most part, directly related to language features.

What I am talking about, then, is that language designers seem to find it perfectly acceptable to design languages as magic incantations, where weird sequences of symbols need to be learned by rote rather than grasped by intuition.

To name an example, different languages define namespace separators as a single dot (.), a colon (:), a double colon (::) or various other strings of symbols. Worse, some of these languages assign more than one meaning to that symbol: in Java, for example, the single dot is both the namespace separator and the operator for referencing an object member.
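The same double duty is easy to reproduce in Python, one of the languages used further down in this post. What follows is only a minimal sketch: the Point class is made up for illustration, and strictly speaking Python treats modules as objects, so one could argue both dots are attribute lookups. To the reader, though, the first reads as "navigate the namespace" and the second as "get a member".

```python
import os.path  # here the dot separates a package from a submodule

# First dot: steps from the os module into its path submodule.
# Second dot: looks up the function join inside that submodule.
joined = os.path.join("usr", "lib")


class Point:
    """Hypothetical class, used only to show member access."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
# Same symbol again, now referencing members of an object.
total = p.x + p.y
```

Nothing in the notation itself tells you which of the two roles a given dot plays; you have to know what sits on its left-hand side.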

I should, I think, point out that I don’t blame these designers for their choice. There tend to be plenty of reasons why they made their choice, and offhand I can think of a few that I consider to be good:

  1. As I pointed out, many languages share similar syntax. That’s usually deliberate: it makes it easier for a programmer to transition from one language to another, and quite often that’s a design goal.
  2. Programming language syntax can have tricky corner cases. When writing a parser, it’s easier to stick to a syntax that’s already well understood. And it’s important to not introduce too much difficulty when writing parsers, as the parser needs to be reliable. Simple is better.
  3. Most programming languages are designed to solve a specific programming problem better than other languages, and the “mundane” stuff like syntax for conditional statements, etc., simply isn’t on the designer’s mind.

All of these are good reasons. Yet all of these lead to perpetuation of bad design choices. To illustrate in a little more detail, let me give you an example of variable declaration (with initialization), function declaration and class declaration in C++ and Python:

int foo = 123;
int bar(int c, double d);
class baz : public quux, private foobar { /* ... */ };

Conversely in Python:

foo = 123
def bar(c, d): pass
class baz(quux, foobar): pass

Notice something? In C++, the declarations of three different things share almost no similarities in syntax. Python, on the other hand, is a bit more consistent in that function and class declarations are almost identical, with the exception of a different leading keyword.
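To make the Python half of that comparison concrete, here is a small runnable sketch. The empty quux and foobar base classes are my own stand-ins so that the snippet executes; everything else is the example from above. It suggests that, whatever the surface syntax, all three statements do the same basic thing: they bind a name to a new object.

```python
# Stand-in base classes (assumed, only so the example runs).
class quux: pass
class foobar: pass

foo = 123
def bar(c, d): pass
class baz(quux, foobar): pass

# Each statement above bound a name in the current namespace;
# only the kind of object behind the name differs.
kinds = {
    name: type(obj).__name__
    for name, obj in [("foo", foo), ("bar", bar), ("baz", baz)]
}
```

Here kinds maps foo to "int", bar to "function" and baz to "type", which is one way of seeing why someone might expect the three forms to look more alike than they do.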

There are good and bad things to say about the choices made in these languages. On the one hand, good design makes similar things look similar; as all three are declarations, one might argue that they should look similar. On the other hand, they all declare different things, so maybe letting them appear very different is better.

I’m not writing this post to argue for one or the other, really, though I have my preference.

The reason I’m writing this post is that I think it’s time language designers understood a simple fact: programming languages are user interfaces.