Why Neptune?

Lately I’ve been writing a lot about the neptune language, the replacement for smalltalk on the OSVM platform. I’ve tried to focus on the technical aspects, both because there’s lots to write about and because when you’re blogging about work, keeping it technical makes it less likely that you’ll accidentally write something, shall we say, career limiting. In this post I’ll take my chances and write about one of the less technical aspects: why we made the switch from smalltalk to a different language, and why we decided not to use an existing language but to design our own. It is not about the strategic business decision of changing the language — for that you’ll have to go through the official channels — but my perspective on the switch as one of the people developing the platform.

I’ll start off by explaining a bit about OSVM and why we used smalltalk for the platform in the first place.

Why Smalltalk?

To give some context I’ll start off by describing what OSVM is. The OSVM platform is basically a virtual machine targeted at memory-constrained embedded devices. The platform has been designed with several goals in mind. It should be very simple in order to make the runtime as small as possible, since memory is limited. It should never be necessary to restart the system — if you’re dealing with software for applications such as security or medical equipment, you want the system to be able to run without any interruptions, ever. If an error occurs, and of course errors do occur, it shouldn’t kill the system; rather, it must be possible to diagnose and fix the problem in the field. That’s not always possible, but there’s a lot you can do to make it more likely that the system will survive an error. Finally, it must be possible to update the running program in the field, both to fix errors and to install software upgrades. Being able to diagnose a bug in your program is no good if you can’t update the software to fix the problem. This, in short, is what OSVM is all about.

These requirements didn’t pop out of thin air. When you’re used to working with smalltalk this is what you expect from your platform: robustness, flexibility and the ability to interact directly with running programs and to update code on the fly. The smalltalk language and environment were developed together, which means that the language was designed with all the issues that are important to us in mind. This is not the case with any language I know outside the smalltalk family, which makes smalltalk uniquely suitable for our use.

But using smalltalk is not without problems.

Why not smalltalk?

The languages most used in writing embedded software are C and C++. If a company considers using a platform like OSVM instead, they not only have to consider the direct cost of buying the system from Esmertec. They also have to consider the indirect cost in time and money of training their employees to use the system before they can actually get to writing software.

C++ is an object-oriented programming language (let’s be generous and call it that) but other than that, C++ and smalltalk are completely different. In order to use smalltalk, a C++ programmer would have to learn everything from scratch: how to declare a variable, how to write an if or while statement, how to write a string or character literal, everything. Since most embedded software is developed in C and C++, that means that there is a considerable indirect cost associated with using smalltalk with OSVM.

The pragmatic programmers recommend that you learn a new language every year. I think learning new languages, both programming and natural, is a great hobby. If I have a deadline, however, “broadening my thinking” by learning a new programming language is not my top priority. Imagine that you were writing a book and had the choice to either write it in English using a primitive ASCII text editor, or in Danish using a state-of-the-art graphical text editor. Assuming that you don’t speak Danish, what would you choose? Or, perhaps more realistically, what would your boss choose for you?

Even though the OSVM system is very much inspired by smalltalk and the smalltalk environment, what is important is not the language itself. The important thing is the properties of the language: robustness, flexibility, serviceability, etc. The goal of the neptune language is to retain all the important aspects of smalltalk but to adapt the less important ones, like the syntax, to make it easier to learn and use by programmers who are used to more mainstream programming languages. That is not to say that syntax is not an important part of smalltalk but it is not essential for its use as a part of OSVM.

I may give the impression that I think smalltalk is the pinnacle of programming language design and any change is a change for the bad. That is not the case: I think there are plenty of ways to improve smalltalk. In changing the language we’ve tried to address many of the quirks and shortcomings of plain smalltalk to get a system that is all-round better to use and not just easier to learn for C or C++ programmers. I won’t go into any details about how I think neptune improves upon smalltalk, though, since that’s a touchy subject that I don’t want to obscure the point I’m trying to make here.

Why Neptune?

Why design a whole new language instead of using an existing one? After all, the whole point is to not force people to learn a new programming language, and if we design a new one we can be absolutely certain that no one knows it.

It all comes down to the nature of the OSVM platform. The language has to be very simple since the runtime must be minimal. This requirement alone excludes many languages, including Java, C# and python. But the real killer is that it must be possible to update the code dynamically. I don’t know of any languages (except smalltalk) where you can give a reasonable meaning to updating the code of a running program in all cases. This is especially bad in languages where the runtime behavior of a program depends on static typing. When designing a language it is very easy to make decisions that make it very hard to update code on the fly. You have to keep this kind of use in mind during the design, and smalltalk is really the only language designed that way. Having decided that smalltalk is not the way to go, that means that there are really no existing languages for us to use. Hence neptune.

Designing the language from scratch also means that we’re free to adapt the language to fit our use. The result is a language that, beneath an added layer of syntax, is still very close to smalltalk, and retains many of the properties that make smalltalk unique. An indication of how close smalltalk and neptune are is that the virtual machine is almost exactly the same as before, the biggest change being that indexing is now 0-based instead of 1-based. On the surface, however, neptune is much closer to C++ or Java. Basic things like variable declarations, control structures, operators and literals will all be familiar to a C++ or Java programmer. When learning the language this means people will be able to use much of what they already know, be productive almost immediately, and move on more quickly to the more advanced concepts such as traits, blocks and optional types.

We’ve also made a conscious effort to make it possible for people to use the language without necessarily being familiar with its full generality, even though some might consider that “dumbing” the language down. When you’re using an ifTrue:ifFalse: expression in smalltalk it might be intellectually gratifying to know that it’s not a “magic” construct, just a method call with two block arguments. But it’s not something you necessarily want people to have to understand from day one in order to use the language. In neptune, we have a separate if statement, but that is still just a shorthand for a call to the if method. You’re free to call that method directly, without the special syntax, to the exact same effect. The full generality of smalltalk, along with some new constructs, is still available to you in neptune, but instead of having it forced upon you, you can grow into it gradually. Even though it makes the language a bit less clean than smalltalk, I think that is a pretty reasonable tradeoff.
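To make the “control structure as a method call” idea concrete, here is a rough sketch in Python (not neptune) of a conditional that is nothing but an ordinary function call taking two closures. The name if_then_else is invented for the illustration:

```python
# A sketch (Python, for illustration): a control structure that is just an
# ordinary function call with two zero-argument closures, in the spirit of
# smalltalk's ifTrue:ifFalse: and neptune's if method. Names are invented.

def if_then_else(condition, then_block, else_block):
    """Select and call one of two closures; no special syntax involved."""
    if condition:
        return then_block()
    return else_block()

# The surface "if" syntax can be read as sugar for exactly this kind of call:
result = if_then_else(3 < 5, lambda: "less", lambda: "not less")
```

The point is only that nothing magic is needed: once blocks are first-class values, the special syntax is a convenience, not a primitive.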

Traits

One of the changes we’ve made in going from smalltalk to neptune, beyond pure syntax, is in the object model. We’ve added an optional type system and introduced two new concepts: traits and protocols. I don’t mean new in the sense that we invented them, they’re pretty well known actually, but they’re new in the sense that they’re not part of standard smalltalk. In this post I’ll give a brief introduction to traits.

Smalltalk’s single inheritance model is clean and simple, but it also has many limitations. The most obvious limitation is that a class can only share code with one other class, its superclass. In addition, there are also more conceptual limitations. If I create a subclass B of class A, an instance of B should be considered a special case of an instance of A. The typical toy example of this is animals. Let’s say that the class Eagle extends the class Bird which extends the class Animal. This models the fact that an eagle is a bird and a bird is an animal. On the other hand, a bat shares many of the characteristics of a bird but it is not a bird so the class Bat cannot inherit from Bird. Even though there is much code in the class Bird that could be used in class Bat, single inheritance doesn’t allow them to share code because it is just conceptually wrong. I guess this is a pretty poor example but at least if someone disagrees and believes a bat can be implemented by using parts of birds I can just sit back and wait for the villagers with torches and pitchforks to take care of him. But I digress.

C++, another ancestor of neptune, has multiple inheritance which at least solves some of the problem — a class can now share code with more than one other class — but introduces so many other problems that we’ve never really considered it as a serious alternative. The solution we’ve decided to go for is traits. Traits seem to be slowly making their way into the mainstream as a more flexible way of sharing code between classes; variants can be found in languages such as perl6, scala and fortress.

To explain how traits work in neptune I’ll show you an example from our libraries. One place where we’ve used them is for implementing relational operators such as <, >=, etc. In most languages I know, the standard way of making an object x comparable to other objects is to add a method to x that takes the object to compare it with, y, and returns an integer signifying whether x is less than, equal to, or greater than y. In java that method is compareTo: if x is less than y, x.compareTo(y) returns a negative number; if they're equal it returns zero; and if x is greater than y it returns a positive number. In neptune, we use the spaceship operator, <=>, for comparing objects. For instance,

"aardvark" <=> "zebra"

returns a negative number because the string "aardvark" is lexicographically less than "zebra". That's all nice and general but it is not very convenient or readable. Writing

if ((x <=> y) < 0) return y;

is a lot less clear than writing

if (x < y) return y;

This is where traits come in. Since the relational operators are completely defined by the comparison operator, I can implement them once and for all as a trait:

trait TRelation {

  require int operator <=>(var that);

  bool operator ==(var that) {
    return (this <=> that) == 0;
  }

  bool operator <(var that) {
    return (this <=> that) < 0;
  }

  // ... and >=, <=, !=, > ...

}

This doesn't define a class but a set of methods that can be used by any class, provided that it has the <=> operator. One example of this could be an implementation of fractions:

class Fraction {

  use TRelation;

  int num, denom;

  int operator <=>(var that) {
    // ... fraction comparison ...
  }

  ...

}

The easiest way to think of the use declaration is that it essentially copies the methods declared in the TRelation trait and pastes them into the Fraction class. The only requirement in this case is that the class using the trait must define the comparison operator. By the way, a class can use as many traits as it wants.
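For readers more comfortable with mainstream languages, here is a rough Python approximation of the same idea. Python has no real traits, so a mixin class and ordinary inheritance stand in for the use declaration, and a compare method plays the role of the required <=> operator; all names are invented for the sketch:

```python
# Rough Python analogue of TRelation (invented names): a mixin whose
# relational operators are all derived from one compare() method that the
# using class is "required" to supply.

class TRelation:
    # require: the using class must define compare(self, that) -> int

    def __eq__(self, that):
        return self.compare(that) == 0

    def __lt__(self, that):
        return self.compare(that) < 0

    def __le__(self, that):
        return self.compare(that) <= 0

    # ... >, >=, != along the same lines ...

class Fraction(TRelation):
    def __init__(self, num, denom):
        self.num, self.denom = num, denom

    def compare(self, that):
        # For positive denominators, a/b <=> c/d has the sign of a*d - c*b.
        return self.num * that.denom - that.num * self.denom
```

Note the difference from real traits: here Fraction is in an inheritance relationship with the mixin, which is exactly what the next paragraph says traits avoid.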

Unlike inheritance, the class Fraction is not put in any kind of "relationship" with the trait TRelation. Traits are nothing more than collections of methods that you can import into your own classes, almost as if you had used a #include in C. It should be completely irrelevant to anyone who uses the class Fraction whether the < operator is implemented by using a trait or by writing a method in the class itself. Because of this, a trait cannot be used as a type -- you cannot, for instance, declare a variable of type TRelation. This is also the reason why you write use declarations in the body of the class and not the header: using a trait is an implementation detail, not something you should need to know about in order to use the class.

A trait is only allowed to contain methods, not fields. This might seem like a severe limitation but it simplifies the model considerably, and if a trait needs a field it can simply require the class to have it. For instance, this trait, which contains methods for dynamically extending an array or vector, needs a contents field:

trait TVector {

  require accessor contents;
  require accessor contents=(var value);

  void ensure_capacity(int new_size) {
    if (new_size > this.contents.size) {
      // ... enlarge contents ...
    }
  }

}

TVector does not have the field contents itself but requires any class that uses it to have accessors for it. A class that uses TVector can implement this by actually having a field contents, but since accessors are just ordinary methods (you can think of them as getter and setter methods), the class is free to implement them however it wants.
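A minimal Python sketch of the same pattern, with invented names: the mixin supplies only behavior and assumes the using class provides a contents attribute, the moral equivalent of the two required accessors:

```python
# Python sketch of the TVector idea (invented names): the mixin contains
# only methods and relies on the using class to provide `contents`.

class TVector:
    def ensure_capacity(self, new_size):
        if new_size > len(self.contents):
            # enlarge contents, preserving the existing elements
            self.contents = self.contents + [None] * (new_size - len(self.contents))

class GrowableArray(TVector):
    def __init__(self):
        self.contents = []   # a plain field satisfies the "requirement"

v = GrowableArray()
v.ensure_capacity(4)
```

Because Python attributes can be plain fields or properties, the class really is free to implement contents however it wants, just as the text describes for accessors.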

Our traits are very closely modeled on the trait mechanism for smalltalk described in the article linked above (and again here for your convenience), which I can definitely recommend reading. There are plenty of subtleties to the mechanism that I haven't mentioned. What happens if there's a name clash when importing more than one trait? (There's a mechanism for renaming methods.) Can traits use other traits? (Yes.) How do super calls work with trait methods? (I can't answer that in a single sentence.) For the full answers to all this you'll just have to read the article. The original motivation for adding traits was the trouble we've had with implementing the collection hierarchy. It turns out that some of the people who wrote the smalltalk traits article have also written another one on exactly that: using traits in the design of a collection hierarchy.

So far we haven't used them that much in our libraries -- my guess would be that there is at most one trait to every ten classes. But in the cases where we have used them, they've really saved the day.

Exceptions

It seems that these days whenever anyone describes some language feature, I’m compelled to compare/contrast that with how we’ve done things in the new neptune language. Most recently James Robertson has written about exceptions in smalltalk so, well, here we go.

In smalltalk exceptions capture the stack when they’re thrown, which is a bit similar to the way stack traces are captured by exceptions in java. In java, though, what is stored in the exception is exactly a stack trace, which is a serialized copy of the real stack that you can print and… well, not much else. In smalltalk the exception actually captures the real live stack, which means that execution can be resumed from wherever the exception was thrown. This borders on what MenTaLguY calls “the Lovecraftian madness of full continuations”. But even though this model is neat and general, everyone I’ve ever talked with about resumable exceptions, at least those who have used them in practice, has told me that their usefulness is actually pretty limited. James’ example code does little to change my attitude; that code is so convoluted that I take it more as an argument against resumable exceptions than for them.

My attitude against resumable exceptions may just be a convenient rationalization, though. On the OSVM platform, which must be able to run on platforms with very little memory and processing power, we can’t afford the full smalltalk model anyway. In the previous incarnation of OSVM, which was based on smalltalk, there were no stacks or stack traces. You could throw any object as an exception and if someone else caught it there was no way to tell where it had been thrown from. In addition, there was no exception hierarchy and we usually just threw text strings or symbols: calling a method that hadn’t been defined caused the system to simply throw the string "LookupError" with no further information about which method was called on which object. Clearly there was room for improvement and now that we’ve had an opportunity to revisit the design, we’ve improved it quite a bit. The biggest change we’ve made has to do with, you guessed it, the stack. But that aspect will have to wait a bit.

The old system allowed you to throw any object so there was really no technical reason not to have an exception hierarchy and throw structured and informative exceptions rather than flat text strings. We just never really got around to it. For the new language we were rewriting all our libraries anyway, so we added structured exceptions as soon as we had implemented exception handling. And, I must confess, doing it early on also meant that it would be a major hassle to take them out later if it turned out that someone thought it was a bad idea. Did I mention that the code name of our next release is “sneaky monkey”? Anyway, now when an error occurs because you’re trying to reverse a string (which you can’t currently do) you’ll get an error that tells you exactly that: LookupError("ksiftseh".reverse()).

In neptune, the syntax for exception handlers is somewhat similar to C++, Java and C#. Here’s how an exception handler looks in C++:

try {
  ...
} catch (exception& e) {
  ...
}

This exception handler catches any objects thrown that descend from the class exception, because that’s the declared type of the exception variable e. Java and C# use the same approach. In neptune we have to use a slightly different model because we have optional typing, which means that type declarations are not allowed to affect the runtime semantics of a program. Here’s an example of a neptune exception handler:

try {
  ...
} catch (Exception e: IOException) {
  ...
}

The exceptions handled by this catch clause are listed after the colon, in this case IOException, and the declared type of the variable, Exception, has no influence on which exceptions are caught. One thing this means is that you can now catch several unrelated exceptions with the same exception handler:

try {
  ...
} catch (Exception e: IOException, AssertionError) {
  ...
}
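As an aside, Python happens to offer a close analogue: a single handler can list several unrelated exception types as a tuple. A small illustrative sketch, with an invented helper function:

```python
# Python also lets one handler catch several unrelated exception types,
# listed as a tuple; classify() is invented for the illustration.

def classify(exc):
    try:
        raise exc
    except (IOError, AssertionError):
        return "handled"
    except Exception:
        return "unhandled"

caught = classify(IOError("no such file"))        # "handled"
missed = classify(KeyError("not our problem"))    # "unhandled"
```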

If the variable has the same type as the exception being caught, which is usually the case, you can just leave out the type declaration:

try {
  ...
} catch (e: IOException) {
  ...
}

The try/catch syntax is not “magic” but a shorthand that, like all our other shorthands, expands straightforwardly into simple method calls. A try statement corresponds to a call to the try_catch method on Function. The handler above expands into something like this:

fun {
  ...
}.try_catch(new Vector {
  IOException.protocol,
  AssertionError.protocol
}, fun (Exception e) {
  ...
})

Hey, I didn’t say that the result was straightforward, just that the transformation was. And even though it may look expensive, with allocations and whatnot, the generated code is very efficient whether you use the try/catch syntax or write the call to try_catch yourself. An explanation of what .protocol means will have to wait.
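To illustrate the general shape of this desugaring, here is a sketch in Python of try_catch written as an ordinary higher-order function. It mirrors the expansion above only loosely, and the name and signature are invented:

```python
# Sketch (Python, invented names): try/catch as an ordinary higher-order
# function taking the protected body, the exception types to catch, and a
# handler closure.

def try_catch(body, exception_types, handler):
    try:
        return body()
    except tuple(exception_types) as e:
        return handler(e)

result = try_catch(
    lambda: 1 // 0,          # body that throws
    [ZeroDivisionError],     # the exceptions to catch
    lambda e: "caught")      # the handler block
```

The design point carries over: once the expansion bottoms out in ordinary calls on closures, a compiler is free to generate the same efficient code for both spellings.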

Now for the stack traces. As you’ll remember, in Java and smalltalk the stack is stored in the exception object being thrown. I’m not one to underestimate the ingenuity of the people that implement smalltalk or java (since a majority of my colleagues have done both) but no matter how clever you are, storing the stack in an exception is bound to cost space and time. The worst part is that when creating or throwing an exception in smalltalk or java, you don’t actually know if the stack will be used or not so you always have to store it, just in case. That won’t do in OSVM because we don’t have processing power or space enough to do stuff “just in case”. Instead, you can specify that you want a stack trace at the location where you’ll be using it: the exception handler:

try {
  ...
} catch (e: IOException) at (StackTrace s) {
  ...
}

This way the stack trace and the exception object are completely decoupled and we only need to instantiate the trace when it is really needed. Another advantage of decoupling the trace is that throwing an object doesn’t change it. In smalltalk, throwing an exception that has been thrown before causes the previous stack to be overwritten, which is not an especially clean model. In Java the stack trace stored in the exception is captured when the exception is instantiated, not when it is thrown. In my experience that is the desired semantics only in one situation: if you want to get the stack trace but don’t want to throw an exception. Doing

new Throwable().printStackTrace()

is not uncommon in java but it’s a hack really. In neptune there’s a much cleaner way to get the current stack: just call the static Thread.current_stack() method.
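For comparison, Python's standard traceback module supports the same "grab the current stack without throwing anything" style directly:

```python
# Capture the current live call stack at any point, no exception required --
# a rough analogue of neptune's Thread.current_stack(). The helper function
# name is invented.

import traceback

def where_am_i():
    # a snapshot of the call stack at this exact point
    return traceback.extract_stack()

stack = where_am_i()
top_function = stack[-1].name    # the innermost frame: "where_am_i"
```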

I don’t know of any other language with a similar model but my experience so far is that it is exactly what you want and, in addition, simpler to implement and less expensive than the alternatives. There are still issues to work out, like what happens when you rethrow an exception and what exactly should be stored in a stack trace object. The biggest problem for me, though, is the keyword at. This is something we can still easily change so if you have a good alternative let me know.

Match

When I first heard about Scala, the very first thing I did was find the specification and look up the details of pattern matching. I’m a big fan of pattern matching, something I apparently share with Martin Odersky, and I wanted to see if Scala used the same model as his previous language, Pizza. Unfortunately, as far as I can see, it does. The reason I think it’s unfortunate is that I think there is great potential in introducing pattern matching in an object-oriented language, but I don’t think Scala’s model quite reaches that potential.

Before getting to Scala’s mechanism I’ll start off by explaining why I think pattern matching is something you would want to have in an object-oriented language. Conventional wisdom says that in an object-oriented language you’re not supposed to decide which action to take based on the type of other objects. Scott Meyers says in Effective C++:

Anytime you find yourself writing code of the form “if the object is of type T1, then do something, but if it’s of type T2, then do something else,” slap yourself.

Instead of dispatching on an object’s type, you should use polymorphism. This rule is a good first approximation, but it’s not the whole story. Sometimes you’re not in a position to add methods to other objects. And even if you are, adding methods can be a bad idea because it often means polluting other objects with methods that may be convenient for you but are completely irrelevant to the object itself and anyone else who might use it. Case in point: in Squeak the Object class has over 400 methods including isSketchMorph, playSoundNamed and, the winner by unanimous decision, hasModelYellowButtonMenuItems. Encapsulation? We don’t need no steenkin’ encapsulation!

This is where a pattern matching mechanism can be really useful. If you have a reasonably fixed set of object types that you want to operate on it gives you a structured, simple and readable way of distinguishing between objects. Unfortunately Scala’s model, and Pizza’s before it, imposes some limitations on the objects involved that make it unnecessarily difficult to use.

Scala’s pattern matching mechanism is pretty simple: you can declare a class as a case class, which enables you to match against it:

class Expr
case class Var(x: String) extends Expr
case class Apply(f: Expr, e: Expr) extends Expr
case class Lambda(x: String, e: Expr) extends Expr

Given an Expr object, or any object really, you can match it against these three cases like this:

val e: Expr = ...
e match {
  case Var(x) => ...
  case Apply(f, e) => ...
  case Lambda(x, e) => ...
}

This looks very similar to functional-style pattern matching as in ML and Haskell.

The most important problem with this mechanism is that a case is bound to a particular class. The Var(x) case only matches instances of the Var class and its subclasses. It’s fine to dispatch based on whether or not an object implements a particular interface, because any class is free to declare that it implements that interface and to implement it however it wants. Requiring classes to have a particular superclass, on the other hand, dictates how the class should be implemented, which is a bad thing indeed.

Here’s an example of why that’s a problem. Let’s say I’m implementing a linear algebra library with various functions that act on points, lists, matrices, etc. I have a basic Point class:

case class Point(x: Int, y: Int) {
  // All sorts of numerical methods
}

Let’s say that my library turns out to be a bit slow and after profiling it I discover that one point in particular is used a lot, (0, 0), and a lot of the algorithms can be simplified considerably in that case. So I make a separate Origin class which represents (0, 0) and has its own simplified implementations of all operations (here in an imaginary case-extension syntax):

class Origin extends Object case Point(0, 0) {
  // Methods specialized for (0, 0)
}

In Scala you can’t do that — Origin must be a subclass of Point if it wants to match as a point. In this case extending Point means that Origin gets two fields and a lot of methods, none of which it is interested in using. Bugger.

The second problem is that the values in the pattern cannot be computed; they are always read directly from the object. You might want to optimize the numerical library by making some operations lazy, so you have a LazyPoint which is just like a point except that it doesn’t calculate its coordinates until they are requested. Unfortunately you can’t: the coordinates are read through an accessor, which is generated for you, that simply returns the value of a variable. Double bugger.

The last problem is that once a class has declared that it’s a case, subclasses can’t change or override that. An example from the post about equality was the class LinePosition, which represents a position corresponding to a particular line in a text file. A subclass of this class was CharacterPosition, which not only specifies the line in the file but also a particular character within that line. I would probably want LinePosition to match LinePosition(line: Int) and CharacterPosition to match CharacterPosition(line: Int, char: Int). Unfortunately I can’t because if one is a subclass of the other then they must match the same pattern. In this case, CharacterPosition must match as LinePosition(line: Int). Triple bugger.

The conclusion here is that if I use case classes and pattern matching I’ve put a lot of restrictions on the way I can implement my classes. If I were to implement the linear algebra library I mentioned above and use pattern matching to distinguish between points, lists and matrices, I’d severely limit my freedom to decide how to actually implement these objects. And the limitations are especially uncomfortable because they dictate how you should implement your objects: the superclass must be a particular class, and the internal state of the object, or at least part of it, must be readable directly from variables.

What I’m saying is not that I think pattern matching in Scala was poorly designed — it’s more that I’m trying to use it to solve a problem it probably wasn’t designed to solve. But I don’t think it’s a fundamental problem with pattern matching. So without further stalling, here’s my suggestion for an alternative semantics that is much less restrictive.

There is already a well-known way to dispatch on the type of an object: the visitor design pattern. When you pattern match an object, you shouldn’t inspect the object “from the outside” to select the right case, you should instead let the object itself select the right case using double dispatch. If you were to write a double-dispatch version of the lambda example from before, without any syntactic sugar, the code would look something like this:

class Expr
class Var(x: String) extends Expr {
  def match(m: Matcher) = m.var(x)
}
class Apply(f: Expr, e: Expr) extends Expr {
  def match(m: Matcher) = m.apply(f, e)
}
class Lambda(x: String, e: Expr) extends Expr {
  def match(m: Matcher) = m.lambda(x, e)
}

and a match expression would expand into something like:

val e: Expr = ...
e.match(new {
  def var(x) = ...
  def apply(f, e) = ...
  def lambda(x, e) = ...
})

The match expression is packed into a Matcher object which is really a mini-visitor. The matcher is then given to the object being matched, which calls back to select the case to match. Note that I’m not suggesting that you write the code like that. In simple cases the same syntax could be used as in Scala; in more complex cases you might need a more general syntax, but I haven’t quite thought that through.
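Here is the same double-dispatch idea sketched in runnable Python, with invented names (lambda_ because lambda is a Python keyword). ShowMatcher is a trivial matcher that just reports which case was selected:

```python
# Double-dispatch pattern matching: each node hands itself to a matcher
# object (the mini-visitor), which has one method per case. The matched
# object itself decides which case it matches.

class Var:
    def __init__(self, x):
        self.x = x
    def match(self, m):
        return m.var(self.x)

class Apply:
    def __init__(self, f, e):
        self.f, self.e = f, e
    def match(self, m):
        return m.apply(self.f, self.e)

class Lambda:
    def __init__(self, x, e):
        self.x, self.e = x, e
    def match(self, m):
        return m.lambda_(self.x, self.e)

class ShowMatcher:
    # one method per case; the matched object picks which one gets called
    def var(self, x):
        return "var " + x
    def apply(self, f, e):
        return "apply"
    def lambda_(self, x, e):
        return "lambda " + x

result = Var("x").match(ShowMatcher())   # "var x"
```

Since the node chooses the callback, an unrelated class (like the hypothetical Origin) could implement match to call m.var or any other case it pleases, which is exactly the flexibility argued for below.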

Using visitor-pattern-matching, an object has full control over what it “matches” and how. In the Origin example this means that an Origin object is free to implement its match method so that it matches as a point. It also means that the lazy point object is free to calculate the coordinates before calling back to the Matcher. Finally, CharacterPosition can override the match method so it is no longer forced to match as a LinePosition. Nice!

Of course there are problems. If you want to be able to have default cases that match if no other cases do, the matcher object must be able to handle calls to methods it doesn’t implement — sort of like the way smalltalk does with doesNotUnderstand. Another problem is that the extra flexibility enables objects to abuse the Matcher object for instance by matching more than once or not at all. Finally, you don’t want to use regular method names like point since that’s a flat namespace — instead you would use some form of qualified names. But none of these problems are very difficult to solve.

Bottom line: I think visitor-ifying the pattern matching mechanism in Scala would make it much more flexible and useful but still provide, as a special case, the current mechanism. And oh, I almost forgot, it can also be used to solve the equality problem that I described in an earlier post but that will have to wait.

Update: I discussed this with some people from the scala group and they managed to convince me that visitors are just not a realistic implementation strategy. Bugger!

Reputation

Clearly, the whole cartoon controversy has done a lot of damage to Denmark’s reputation in the muslim world. This clip from the Daily Show gives an indication of how other western countries have come to view Denmark. Towards the end Michael Mandelbaum says that the reason other countries don’t oppose the US more actively, however exasperated they may be, is that they need it to play the role it does, and no other country can play that role. Jon Stewart’s comment on that is:

And that, not forgetting that Denmark could be out there, could be watching us right now. Cause those people don’t give a [beep].

Scary.

Equals

I recently discovered that if you look closely at the seemingly simple problem of comparing objects for equality it’s actually very hard, if not impossible, to solve satisfactorily. I looked around a bit to see if this was a well-known issue and it turns out that I’m certainly not the first person to notice this problem. See for instance here and here. This post introduces the problem (or one of the problems) of implementing object equality methods.

I have a standard template for implementing the equals method in Java:

public class X ... {

  // ... body of X ...

  public boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof X)) return false;
    final X that = (X) obj;
    // X-specific comparison
  }

}

Unfortunately it turns out that implementing equals using instanceof doesn’t work in general. Let’s say you have a class LinePosition which specifies a particular line in a text file. Two line positions are equal if they represent the same line:

public boolean equals(final Object obj) {
  if (this == obj) return true;
  if (!(obj instanceof LinePosition)) return false;
  final LinePosition that = (LinePosition) obj;
  return this.getLine() == that.getLine();
}

Now you might later want to create a subclass, CharacterPosition, which not only specifies a line but also a particular character in the line. Two character positions are equal if they specify the same line and the same character in that line:

public boolean equals(final Object obj) {
  if (this == obj) return true;
  if (!(obj instanceof CharacterPosition)) return false;
  final CharacterPosition that = (CharacterPosition) obj;
  return this.getLine() == that.getLine()
      && this.getCharacter() == that.getCharacter();
}

This might look reasonable but it doesn’t work: equals is required to be symmetric but in this case it isn’t:

final LinePosition a = new LinePosition(3);
final CharacterPosition b = new CharacterPosition(3, 5);
final boolean aIsB = a.equals(b); // true
final boolean bIsA = b.equals(a); // false

The problem is that b is more picky about who it considers itself to be equal to than a. Damn.

An alternative approach is to avoid instanceof and use getClass instead:

public boolean equals(final Object obj) {
  if (this == obj) return true;
  if (obj == null || obj.getClass() != this.getClass()) return false;
  final X that = (X) obj;
  // X-specific comparison
}

Using this technique to implement equals on the text position classes would make a and b equally picky about who they consider themselves to be equal to, so equals would be symmetric. Unfortunately, now the objects are so picky that they break Liskov substitutability, which is a Bad Thing.

Here’s an example of where that might hit you. Let’s say that during debugging you notice that instances of LinePosition turn up in places where you didn’t expect them. To find out where these positions came from, I would probably make an empty subclass of LinePosition for each place where I instantiate them, so each instantiation site uses its own class. That way, whenever I find a line position where I didn’t expect it, I can see where it was instantiated.

But if equals has been implemented using class equality I can’t do that, because even though I don’t change any methods, the behavior of my program has changed in a subtle way. One position can now only be equal to another position if they were created at the same place, which is very different from how the program behaved before. I’ve tried hunting bugs that occurred because a subclass behaved differently from its superclass merely because it had a different class. Bugs like that can be extremely evil because they’re so unexpected if your mind is warped the way object orientation usually warps people’s minds.
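To make the pitfall concrete, here is a minimal sketch: a LinePosition with getClass-based equals, plus a hypothetical empty tagging subclass of the kind described above (the name ParserLinePosition is mine, for illustration). No behavior is overridden, yet equality changes:

```java
class LinePosition {
    private final int line;
    LinePosition(int line) { this.line = line; }
    int getLine() { return line; }

    // getClass-based equality: only exact-class matches are accepted.
    @Override public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || obj.getClass() != this.getClass()) return false;
        return this.getLine() == ((LinePosition) obj).getLine();
    }
    @Override public int hashCode() { return line; }
}

// Empty subclass marking one instantiation site, e.g. the parser.
// It changes nothing except the class identity of its instances.
class ParserLinePosition extends LinePosition {
    ParserLinePosition(int line) { super(line); }
}
```

Two positions for line 3, one of each class, now compare unequal in both directions, even though they represent exactly the same line.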

At the core, this problem is a question of when you can consider two objects to be the same kind of object. Having a common superclass does not imply that two objects are the same kind, as the LinePosition and CharacterPosition example demonstrates. On the other hand, sometimes objects should be considered the same kind even if they have different classes. So how do you determine whether two objects should be considered the same kind of object? In Java, I don’t know. I think the getClass approach is the greater of the two evils, so I’ll probably keep doing things the way I’ve always done, using instanceof. Looking outside Java and Java-like languages, I have an idea for a solution using a mechanism that combines Scala-style pattern matching and the visitor design pattern. But that will be the subject of another post in the near future.

Hotswap

Gilad writes about JSR292 (Supporting Dynamically Typed Languages on the JavaTM Platform) and adding support in the JVM for updating a program’s code without stopping the program. Since hotswapping is part of the general “dynamically typed languages support” JSR, I guess the main focus is on the kind of updates required by a full implementation of languages like Python or Ruby, where a program can change the class of an object, add methods to a class, etc. In other words, updates where a program changes its own structure.

What is most interesting to me in all this, though, is the kind of hotswapping that allows you to develop applications while they’re running, as demonstrated in this smalltalk screencast. When I spend a lot of time working on a particular corner of our IDE, I tend to focus too much on the details and forget the big picture. Watching that video, I was reminded of what kind of system we should, and do, aim to make OSVM, and that’s probably the kind of interactive programming Gilad has in mind when he talks about how powerful and useful hotswapping is.

In my experience you don’t need full hotswapping for it to be useful. Eclipse, for instance, only supports relatively limited updates — all you can do, more or less, is change the implementation of existing methods. It is still very useful, though, and far preferable to always having to restart everything after every little change. Even if they fall short of a general solution, any steps they can take in the right direction will be great.

Wow, that sounded a lot like me being positive. That’s so unlike me.

Implementing support for this in OSVM was, to say the least, non-trivial. And our VM is not only very simple but was also designed with this kind of use in mind. As for implementing this in the JVM… as they say, good luck with that. But of course this JSR doesn’t have to be ready until Java SE 7 so they have a few years to get it done.

Fortran

I found an article at dusty decks about the original Fortran I compiler. This is interesting because the designers of the fortran compiler were the first to solve a whole bunch of interesting problems. Being a parser geek, I find their solution to operator precedence parsing especially neat.

The goal of operator precedence parsing is to take an expression such as

2 * x + y / 8

and group it according to a set of precedence rules such as “* and / bind tighter than +”, in this case producing

((2 * x) + (y / 8))

Today this problem is well understood and there are a bunch of algorithms for solving it; the one used in tedir is shift/reduce expression parsing. The fortran guys, on the other hand, had to solve it from scratch.

The way they did it was to scan through the expression and expand it in various places with a sequence of parentheses. In the simple case for a language with only +, -, / and *, the algorithm would do the following:

  • Insert (( at the beginning of the expression and )) at the end
  • For each + and - insert ))+(( and ))-(( respectively.
  • For each * and / insert )*( and )/( respectively.

And that’s it! Once the expression has been expanded using these rules it will be correctly parenthesized. For instance

  1. 2*x+y/8
  2. ((2*x+y/8
  3. ((2)*(x+y/8
  4. ((2)*(x))+((y/8
  5. ((2)*(x))+((y)/(8
  6. ((2)*(x))+((y)/(8))

It might not be obvious that this works in all situations, and inserting unmatched parentheses should make any programmer feel queasy. But it does work (Knuth says so), and it generalizes nicely; you just have to insert additional parentheses. Of course, once you get above 5 levels of precedence you might want to investigate ways to optimize the algorithm to avoid inserting endless amounts of parentheses. And once you do that, the algorithm starts looking a lot like more “modern” algorithms such as shift/reduce parsing, which was, coincidentally, invented just a few years later by Knuth, who also wrote about this algorithm.
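The three rules above fit in a few lines of code. Here is a sketch in Java (the class and method names are mine, not from the original paper), covering only the four-operator case described above:

```java
class FortranParens {
    static String parenthesize(String expr) {
        StringBuilder out = new StringBuilder("((");   // rule 1: open
        for (char c : expr.toCharArray()) {
            switch (c) {
                case '+': out.append("))+(("); break;  // rule 2
                case '-': out.append("))-(("); break;  // rule 2
                case '*': out.append(")*(");   break;  // rule 3
                case '/': out.append(")/(");   break;  // rule 3
                default:  out.append(c);  // operands pass through unchanged
            }
        }
        return out.append("))").toString();            // rule 1: close
    }
}
```

Running it on the example reproduces the steps above: parenthesize("2*x+y/8") yields ((2)*(x))+((y)/(8)).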

Moved

I’ve moved the blog to my own domain: quenta.org.

Using

Argh, this whole “neptune looks like C#” thing is self-perpetuating. After we first noticed the similarities between the languages I looked into C# a bit more, and having that at the back of my head as we’re discussing the language only paves the way for new C#-like constructs and so the hideous circle continues.

Example. And this is where smalltalk programmers should look away, or at least read the disclaimer before continuing. The example is about synchronization and locking. In Java, all objects can be used as monitors and you usually guard a critical section by synchronizing on some object:

final Object lock = ...;
synchronized (lock) {
  // critical section
}

That’s pretty readable even though it’s somewhat warped that all objects can be used as monitors. But that’s beside the point. In smalltalk it’s even neater: since you can use blocks, you don’t need a special synchronized statement; you can just call a method on a Monitor or Mutex:

mutex acquireDuring: [
  "critical section"
].

In neptune, being derived from smalltalk, we also have blocks (they’re called functions) and until recently we used the exact same technique as smalltalk for synchronization:

Mutex mutex = ...;
mutex.acquire_during(fun {
  // critical section
});

Yeah I know, it doesn’t look very good. It’s exactly the same as in smalltalk except that method calls are written as mutex.acquire_during(...) and that the block syntax has changed into fun { ... }.

It turns out that you use acquire_during all over the place, so we really want a more concise syntax. The idea is to add a special-purpose syntax for acquiring a resource, for instance a mutex or a file, in a particular scope, which makes sure the resource is released properly. It’s an old idea we’ve had floating around but it never really got anywhere because we couldn’t find the right syntax.

This afternoon we ended up discussing this problem again and it seems like we finally have a strawman: the using statement:

Mutex mutex = ...;
using (mutex) {
  // critical section
}

C# programmers will find this very familiar — C# has a construct with a very similar purpose and a very similar syntax. It is important, though, to point out that we haven’t taken the whole construct from C#, only the name of the keyword. But that’s a pretty important part of the construct.

As with all our shorthands, the using statement is just syntactic sugar for a simpler construct, in this case a method call:

mutex.using(fun {
  // critical section
});

Any object is free to implement the using operator, so it’s a much more general construct than synchronized in Java. A using statement can also define variables, for instance when acquiring a resource produces a result:

File my_file = ...;
using (InputStream stream : my_file) {
  // read file
}

Will this be a popular statement? That probably depends on how we use it in the libraries. But that goes for most language constructs: if we don’t use them in the libraries, they probably won’t be used by others either. We’ll see.


Disclaimer: I cannot be held responsible for any discomfort felt by smalltalkers while reading this post. To a smalltalk programmer the following probably seems like utter insanity since smalltalk already has a great solution to the problem in this example, and the people who are designing this language are all smalltalk programmers. I’m trying to find the courage to write a post about why we’re moving away from smalltalk and I’m not quite there yet.