Sydney

Wow, it’s been a long time since I’ve posted here. I’ve actually had a lot of material for blogposts, most recently about my adventures with scsh, but it somehow doesn’t manage to get from my head onto the blog.

So, what’s new with me? Well…

A few months ago I left the v8 project which I had worked on almost since it started. I’ve enjoyed working on v8 a lot and it’s been a great experience to be an active part in the browser race over the last few years. However, I always knew that I wasn’t going to spend the rest of my life implementing JavaScript.

Last year I visited Sydney and spent a few weeks working at Google’s office there. I happened to be there during the first big demo of Google Wave and watched it with the team, which is mainly located in Sydney. I was totally blown away, and that feeling never really wore off. One year and quite a bit of paperwork later, I’m now living in Sydney and working on wave.

Wave is a very different kind of project than v8 and a departure from the programming language related projects I’ve worked on so far. That’s a feature, not a bug.

Of course, now that I’m drinking the wave Kool-Aid I simply must include a wave element for comments.

Double

Many (most?) virtual machines for dynamically typed languages use tagged integers. On a 32-bit system the “natural” tag size for a value is 2 bits since objects are (or can easily be) 4-byte aligned. This gives you the ability to distinguish between four different types of values each with 30 bits of data, or three different types where two have 30 bits and one has 31 bits of data. V8 uses the latter model: 31-bit integers, 30-bit object pointers and 30-bit “failure” pointers used for internal bookkeeping. On 64-bit systems the natural tag size is 3, which allows for more types of tagged values, and inspired by this I’ve been playing around with tagging another kind of value: double precision floating-point numbers.

Whenever you want to tag a value you need to find space in the value for the tag. With objects it’s easy: they’re already aligned so the lowest few bits are automatically free:

oooooooooooooo..oooooooooooooo000 (untagged)
oooooooooooooo..oooooooooooooottt (tagged)
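
In code, pointer tagging is just a couple of bit operations. Here is a sketch; the tag size and tag value are made up for illustration, not v8’s actual ones:

```cpp
#include <cstdint>
#include <cassert>

// Hypothetical tag assignments: three tag bits on a 64-bit system.
const uintptr_t kTagMask = 0x7;
const uintptr_t kObjectTag = 0x1;

uintptr_t tag_object(void* object) {
  uintptr_t bits = reinterpret_cast<uintptr_t>(object);
  assert((bits & kTagMask) == 0);  // the pointer must be 8-byte aligned
  return bits | kObjectTag;
}

void* untag_object(uintptr_t value) {
  assert((value & kTagMask) == kObjectTag);
  return reinterpret_cast<void*>(value & ~kTagMask);
}
```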

With integers no bits are automatically free, so you can only tag a value whose highest bits are all the same, by shifting the value up and adding the tag in the lowest bits, for instance:

0000iiiiiiiiiiiii..iiiiiiiiiiiiii (untagged)
0iiiiiiiiiiiii..iiiiiiiiiiiiiittt (tagged)

The reason both the highest bits 0000 and 1111 work is that when you shift up by 3 there will be one bit of the original 4 left, and when shifting down again sign extension spreads that bit back out over all the top 4 bits.
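
A sketch of the corresponding integer tagging, assuming 64-bit values, a 3-bit tag and (an assumption for illustration) tag value 0 for integers:

```cpp
#include <cstdint>

const int kTagSize = 3;
const int64_t kSmiTag = 0;  // assumed tag value for integers

int64_t tag_smi(int64_t value) {
  // Shift via uint64_t to sidestep signed-overflow pitfalls.
  return static_cast<int64_t>(static_cast<uint64_t>(value) << kTagSize) | kSmiTag;
}

int64_t untag_smi(int64_t tagged) {
  return tagged >> kTagSize;  // arithmetic shift sign-extends the top bit
}

// A value fits if tagging and untagging reproduces it, i.e. if its top
// kTagSize + 1 bits are all zeroes or all ones.
bool fits_smi(int64_t value) {
  return untag_smi(tag_smi(value)) == value;
}
```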

With doubles it becomes harder to find free bits. A double is represented as one sign bit s, an 11-bit exponent e and a 52-bit fraction f:

seeeeeeeeeeeffffffffff..fffffffff (untagged)

The value of the number is (-1)^s · 2^e · 1.f.

To get a feel for how this representation works I tried decomposing some different values into their parts, here using single precision but the same pattern applies to double precision:

value     sign  exponent  fraction
1.0       0     0         1
-1.0      1     0         1
2.0       0     1         1
0.5       0     -1        1
3.0       0     1         1.5
100.0     0     6         1.5625
10000.0   0     13        1.2207031
-10000.0  1     13        1.2207031
0.0001    0     -14       1.6384
0         0     -127      0.0 (denormalized)
NaN       0     128       1.5
Infinity  0     128       1.0
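
The decomposition in the table can be reproduced with a few bit operations; a sketch for single precision, handling only normal nonzero numbers:

```cpp
#include <cstdint>
#include <cstring>

// Split a normal, nonzero single-precision float into the three columns
// of the table above. Zeroes, denormals, NaNs and infinities would need
// special cases since the implicit leading 1 doesn't apply to all of them.
void decompose(float value, int* sign, int* exponent, float* fraction) {
  uint32_t bits;
  memcpy(&bits, &value, sizeof bits);
  *sign = bits >> 31;
  *exponent = static_cast<int>((bits >> 23) & 0xFF) - 127;  // remove the bias
  // Scale the 23 stored fraction bits and reattach the implicit leading 1.
  *fraction = 1.0f + (bits & 0x7FFFFF) / static_cast<float>(1 << 23);
}
```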

Looking at this table it’s clear that in an interval around ±1.0, starting close to 0.0 on the one side and stretching far out into the large numbers, the exponent stays fairly small. There are 11 bits available to the exponent, that’s a range from -1022 to 1023 since one exponent is used for special values, but for all values between 0.0001 and 10000 for instance its actual value only runs from -14 to 13.

Even though the high bits of the exponent are unused for many numbers you can’t just grab them for the tag. The 11 exponent bits don’t contain the exponent directly but its value plus a bias of 1023, apparently to make comparison of doubles easier. But after subtracting the bias it is indeed just a matter of stealing the top three bits of the exponent (leaving the sign bit in place):

su000uuuuuuuffffffffff..fffffffff (untagged, bias subtracted)
suuuuuuuuffffffffff..fffffffffttt (tagged, bias subtracted)

Using this approach all doubles whose exponent is between -127 and 128 can be tagged. Since single precision numbers use 8 bits for the exponent, any double whose magnitude lies between the numerically smallest and numerically greatest single precision numbers, positive or negative, can be represented as a tagged double. One potential concern is that this encoding does not allow ±0.0, NaNs or ±∞ but I’m not too worried about that; it’s easy to handle 0.0 specially and it’s unclear how problematic the other values are.

What’s the performance of this? The short answer is: I don’t know, haven’t tried it. The long answer is: I can guess and I’m fairly certain that it’s worth it.

One reason it might not be worth it is if most doubles are not covered by this. To test this I instrumented v8 to test each time a heap number was allocated whether the value would fit as a tagged double. Furthermore I disabled some optimizations that avoid number allocation to make sure the allocation routines saw as many numbers as possible. The results after running the v8 benchmarks were encouraging:

+----------------------------------------+-------------+
| Name                                   | Value       |
+----------------------------------------+-------------+
| c:SmallDoubleZero                      | 16149       |
| c:SmallDoubleNaN                       | 2           |
| c:SmallDoubleInfinity                  | 32959       |
| c:SmallDoubleOther                     | 1290        |
| c:SmallDoubleHits                      | 628660      |
+----------------------------------------+-------------+

The count in c:SmallDoubleHits is the number of doubles that fit in a tagged double: 92%. Furthermore, of the numbers that could not be represented all but 2% were zero or infinity. The v8 benchmark suite contains two floating-point heavy benchmarks: a cryptography benchmark and a raytracer. Interestingly, the cryptography benchmark is responsible for almost all the ∞s and the raytracer is responsible for almost all the 0.0s. If a special case were added for 0.0 then 99.3% of all the doubles used in the raytracer would not require heap allocation.

Note: people often ask if we plan to add 64-bit support to v8. Don’t read this as an indication one way or the other. I’m just testing this using v8 because that’s the code base and benchmarks I know best.

Note that this is just a single data point; there’s no basis to conclude that the same holds across most applications. Note also that since v8 uses tagged integers there is some overlap between this and existing optimizations. But since all 32-bit integers can be represented as tagged doubles, any nonzero number that went uncounted because it was recognized as a small integer would also have fit as a tagged double.

Another reason it might not be worth it could be that tagging and untagging is expensive; I doubt that it is considering how relatively few operations it takes:

static const uint64_t kBias = 0x3800000000000000ULL;
static const uint64_t kStolenBitsMask = 0x7000000000000000ULL;
static const uint64_t kLowerMask = 0x0fffffffffffffffULL;
static const uint64_t kUpperMask = 0x8000000000000000ULL;
static const uint64_t kTagSize = 3;
static const uint64_t kTag = 0x3;

uint64_t double_bits(double value) {
  return *reinterpret_cast<uint64_t*>(&value);
}

double make_double(uint64_t value) {
  return *reinterpret_cast<double*>(&value);
}

bool fits_small_double(double value) {
  return ((double_bits(value) - kBias) & kStolenBitsMask) == 0;
}

uint64_t tag_small_double(double value) {
  uint64_t bits = double_bits(value) - kBias;
  return (bits & kUpperMask)
      | ((bits & kLowerMask) << kTagSize)
      | kTag;
}

double untag_small_double(uint64_t bits) {
  uint64_t result = (bits & kUpperMask)
      | ((bits >> kTagSize) & kLowerMask);
  return make_double(result + kBias);
}

I haven’t benchmarked this so I can only guess about the performance. I have no doubt that testing and tagging a double is faster overall than allocating a heap object but I suspect reading from memory could turn out to be cheaper than untagging. The overall performance also depends on how doubles are used: if most are only used briefly and then discarded then the possible cheaper reads don’t make up for the more expensive allocation. But I’m just guessing here.

In any case I think this is a promising strawman and it would be interesting to see how it performed in a real application. I also suspect there are more efficient ways of tagging/untagging the same class of doubles. The problem is actually a fairly simple one: reduce the highest bits from 01000, 11000, 01111 or 11111 to only two bits and be able to go back again to the same five bits.

Further reading: alternative approaches are here, which is essentially the same as this but has more special cases, and this which goes the other way and stores other values in the payload of NaN values.

Finally

Before I had used C++ I had quite a strong dislike for it. I remember wondering why anyone would voluntarily use C++, and being frustrated when reading interviews with Bjarne Stroustrup because he wouldn’t admit that C++, or at least parts of it, sucks.

Well, for the last few years I’ve actually been using C++ and have ended up changing my mind completely. As I expect most C++ programmers do I stay within a subset of the language, but that subset is now one of my favorite languages: not too complicated, powerful and concise. With the other languages I use regularly, Java, Python and JavaScript, I often find myself thinking “man, what were they thinking when they designed this feature?”. That practically never happens with C++. It can be complex and confusing but when taking the constraints and interactions with other language features into account I can’t think of a single instance where I’ve been able to come up with a better solution. I realize that is quite an arrogant way to put it but look at it this way: coming from an arrogant and opinionated person it is a huge compliment.

One of my favorite features is stack-allocated objects because they enable the resource acquisition is initialization pattern. Given a choice between doing resource handling in Java, using finalizers for objects and finally clauses for scopes, and C++ using RAII, I would choose C++ any day.

The weaknesses of finalizers are well-known so I won’t say more about them. The problem with finally clauses is closely related to the subject of making wrong code look wrong. Finally clauses divorce the code that releases a resource from the code that allocates it, which makes it much harder to see when you forget to clean up.

FileReader in = new FileReader(fileName);
try {
  /* process file */
} finally {
  in.close();
}

Using a finally clause means having the code for cleaning up in on the other side of the code that uses it. You have to scan the whole method to convince yourself that in will indeed be closed. An even worse issue is the fact that the try part has a scope of its own that the finally clause can’t see. This means that anything you do after having allocated a resource will, by default, be hidden from the finally clause. Because of this the pattern above doesn’t scale if you want to open two files:

FileReader in = new FileReader(inFileName);
FileWriter out = new FileWriter(outFileName);
try {
  /* process files */
} finally {
  in.close();
  out.close();
}

If the second constructor throws an exception, and you can’t be sure it doesn’t, then the finally clause is not yet in effect and so in won’t be cleaned up. You can’t declare the out variable in the try clause because then the finally clause can’t see it so what people often end up doing is:

FileReader in = null;
FileWriter out = null;
try {
  in = new FileReader(inFileName);
  out = new FileWriter(outFileName);
  /* process files */
} finally {
  if (in != null) in.close();
  if (out != null) out.close();  // [1]
}

What we’re trying to do here is really very simple: we want to be sure that these files are closed before leaving the scope. With try/finally the solution is more complicated than the problem.

In C++ the destructors of stack-allocated objects give you a way to execute code exactly when you leave a scope. Here’s how you could express the same thing in C++:

own_resource in_resource = fopen(in_file_name, "r");
FILE* in = *in_resource;
if (in == NULL) return;
own_resource out_resource = fopen(out_file_name, "w");
FILE* out = *out_resource;
if (out == NULL) return;
/* process files */

The resources we acquire here are stored in a stack-allocated own_resource; the destructor of the own_resource class takes care of releasing the value stored in it. The language ensures that no matter how control leaves the current scope the own_resource destructors will be called first. Furthermore I can see right at the call to fopen that the result will be cleaned up. You don’t have to use this pattern long before any call that allocates a resource and does not store it in an own_resource (or an own_ptr or some other own_ object) looks very wrong.
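
I haven’t shown own_resource itself; here is a minimal sketch of what a version for FILE* might look like (a guess at the shape, not the real class):

```cpp
#include <cstdio>

// A guess at the shape of own_resource for FILE*: the destructor closes
// the file when control leaves the scope, however it leaves. A real
// version would also forbid copying, to avoid closing the file twice.
class own_resource {
 public:
  own_resource(FILE* file) : file_(file) { }  // implicit, so fopen(...) converts
  ~own_resource() { if (file_ != NULL) fclose(file_); }
  FILE* operator*() const { return file_; }
 private:
  FILE* file_;
};
```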

How to manage resources is closely related to how to signal errors. Most new languages use some form of exceptions, even though they have acquired something of a bad name. There are many problems with exceptions (none of which can be solved with error codes by the way) but being able to specify how a resource should be cleaned up at the same time as you allocate it does solve many of those problems. Getting this right requires you to assume that any part of your code could throw an exception. It’s a slightly nicer version of preemption really: someone might interrupt you at any nontrivial operation but unlike preemption all you must be able to do is bail out, you don’t have to be able to continue running. That is doable. Furthermore it’s just a fact of life: whether you use exceptions or not you may have to bail out at any nontrivial operation, because of stack overflow, memory exhaustion, SIGINT, or whatever. The best way to defend yourself against that is to program defensively, and to write your code in a style where resource allocation operations stand out if they don’t immediately make provisions for deallocation. I don’t know a language that does that better than C++.

However, as much as I’ve been praising C++, its solution relies much too heavily on static typing for my taste. An interesting challenge is to design a way for flexible object-oriented RAII to work in dynamically typed languages. That is something I will be working on and I hope I’ll eventually be back here with a good solution.


1: As Bob points out in the comments there’s a bug here: if in.close() throws an exception then out will not be closed. It’s another illustration of how hard this stuff is to get right. Note that the C++ example is not affected by this issue: if the destructor for out (which is the first to be called) throws an exception the same rules apply with regards to in: the exception causes us to leave its scope so its destructor will be called.

Irregexp

The latest thing I’ve been working on has just been announced: irregexp. It’s a new regular expression engine for v8 that we’ve been working on for a few months and now finally it has made it onto the dev-channel release of Google Chrome.

For me an interesting aspect of the irregexp sub-project is that it has taken place completely out in the open. The trail of code and code reviews is out there from the day we created the branch, then known as regexp2000 (599), to when we merged back into the mainline (832), to when we became feature complete (1104), until finally irregexp could take over all regexp handling (1117). The openness aspect of the v8 project does occasionally freak me out a little bit — almost all the work I do and much of my work-related communication takes place in public and will, barring the cataclysmic destruction of the internet, stay public forever. But in this case I have no reservations, I like the idea very much.

Since it’s been so long since I’ve posted here I have a whole backlog of stuff I’d like to write about but haven’t gotten around to. So to get myself started again here’s what I’ll do: I promise that within a week I will have published a new post. There.

V8

Finally, Google Chrome is out! If you haven’t tried it already you should do so forthwith. This is the project I work on at Google, specifically the JavaScript engine V8 (pronounced [fi: ɛit]). If you want to know more about V8 I would recommend, in order of increasing detail:

  • V8 under the hood for a quick and accurate overview.
  • The comic which has a section about V8, starting at page 13 (but do read the whole thing). It features cartoon versions of Lars and Kasper and what I choose to believe is Mr. Bananagrabber without his segway.
  • Our documentation, specifically the design elements section, for a slightly more detailed look at the basic techniques.
  • The source code itself.

By the way, if you happen to notice something that looks wrong or could be improved as you browse the source code do send us a patch – we’ve already accepted the first external contributions.

One last thing. I’m a big fan of Stephen Fry who, besides being a great comedian, author and what wikipedia calls “wit”, happens to write the column dork talk in the Guardian where he writes about digital “stuff”: gizmos, programs, etc. The columns are also posted on his blog, which I can recommend. Obviously I’m hoping to see a Google Chrome review on dork talk – that would be a great final flourish to the launch for me.

dprintf

I’ve recently gone from being indifferent to varargs in C++ to thinking that they should be avoided at all cost. It’s a collection of values (but not references or objects passed by value) that you don’t know the length or type of, that you can’t pass on directly to other variadic functions, and that can only be streamed over, not accessed randomly. Bleh! In the case of printf-like functions, which I expect is the most popular use of varargs, you add the information that is missing from the arguments themselves to the format string. This gives ample opportunity for mistakes, which some compilers will warn you of but others won’t. For me, the conclusion has been that varargs are harmful and should not be used under any circumstances.

Of course, not having printf-like functions in C++ is a drag if you want to build strings so I’ve been trying to come up with something to replace it that isn’t too painful. What the remainder of this post is about is that you can actually have your cake and eat it too: you can have something that looks just like printf that doesn’t use varargs but takes a variable (but bounded) number of arguments where the arguments each carry their own type information and can be accessed randomly. Where you would write this using classic printf:

int position = ...;
const char *value = ...;
printf("Found %s at %i", value, position);

this is how it looks with dprintf (dynamically typed printf):

dprintf("Found % at %", args(value, position))

You’ll notice that dprintf is different from printf in that:

  • There is no type information in the format string; arguments carry their own type info. Alternatively you could allow type info in the format string and use the info carried by the arguments for checking.
  • The arguments are wrapped in a call to args. This is the price you have to pay syntactically.

To explain how this works I’ll take it one step at a time and start with how to automatically collect type information about arguments.

Type Tagging

When calling dprintf each argument is coupled with information about its type. This is done using implicit conversion and static overloading. If you pass an argument to a function in C++ and the types don’t match then C++ will look at the expected argument types and, if possible, implicitly convert the argument to the expected type. This implicit conversion can be used to collect type information:

enum TypeTag { INT, C_STR, STR, FLOAT, ... };

class TaggedValue {
 public:
  TaggedValue(int v) : tag_(INT) { value_.u_int = v; }
  TaggedValue(const char* v) : tag_(C_STR) { value_.u_c_str = v; }
  TaggedValue(const string &v) : tag_(STR) { value_.u_str = &v; }
  ...
  TypeTag tag() const { return tag_; }
 private:
  TypeTag tag_;
  union {
    int u_int;
    const char* u_c_str;
    const string *u_str;
  } value_;
};

void call_me(const TaggedValue &value) {
  printf("Type tag: %i\n", value.tag());
}

call_me(4);     // Prints the value of INT
call_me("foo"); // Prints the value of C_STR

When calling call_me the arguments don’t match so C++ implicitly finds the int and const char* constructors of TaggedValue and calls them. Each constructor in TaggedValue does a minimal amount of processing to store the value and type info about the value. The important part here is that you don’t see this when you invoke call_me so as a user you don’t have to worry about how this works, all you know is that type info is somehow being collected.

This has the limitation that it only works for a fixed set of types, the types for which there is a constructor in TaggedValue — but on the other hand this is no different from the fixed set of formatting directives understood by printf. On the plus side it allows you to pass any value by reference and by value as long as TaggedValue knows about it in advance. The issue of automatic type tagging is actually an interesting one; in this case we only allow a fixed set of “primitive” types but there is no reason why tagging can’t work for array types or other even more complex types[1].

The next step is to allow a variable number of these tagged values. Enter the args function.

Variable argument count

This is the nastiest part of it all since we have to define an args function for every number of arguments we want to allow. This function will pack the arguments up in an object and return it to the dprintf call:

// Abstract superclass of the actual argument collection
class TaggedValueCollection {
 public:
  TaggedValueCollection(int size) : size_(size) { }
  int size() { return size_; }
  virtual const TaggedValue &get(int i) = 0;
 private:
  int size_;
};

template <int n>
class TaggedValueCollectionImpl : public TaggedValueCollection {
 public:
  TaggedValueCollectionImpl(int size)
      : TaggedValueCollection(size) { }
  virtual const TaggedValue &get(int i) {
    return *(elms_[i]);
  }
  const TaggedValue *elms_[n];
};

static TaggedValueCollectionImpl<3> args(const TaggedValue &v1,
    const TaggedValue &v2, const TaggedValue &v3) {
  TaggedValueCollectionImpl<3> result(3);
  result.elms_[0] = &v1;
  result.elms_[1] = &v2;
  result.elms_[2] = &v3;
  return result;
}

The TaggedValueCollectionImpl holds the arguments and provides access to them. The TaggedValueCollection superclass is there to allow methods to access the arguments without knowing how many there are; the number of arguments is part of the type of TaggedValueCollectionImpl so we can’t use that directly. The args function above is the one that handles 3 arguments but they all look like this: it creates a new collection and stores its arguments in it. While it is bad to have to define a function for each number of arguments it is much better than varargs. Also, you only have to write these functions once, and then every function with printf-like behavior can use them.

Finally, the dprintf function looks like this:

void dprintf(string format, const TaggedValueCollection &elms) {
  // Do the formatting
}

One advantage about having random access to the arguments is that the format string can access them randomly, they don’t have to be accessed in sequence. I’ve used the syntax $i to access the i‘th argument:

dprintf("Reverse order (2: $2, 1: $1, 0: $0)", args("a", "b", "c"))
// -> prints "Reverse order (2: c, 1: b, 0: a)"

Coupled with the fact that format strings don’t have to contain type info this gives a lot more flexibility in dealing with format strings. Honestly I’ve never had any use for this but I could imagine that it could be useful for instance in internationalization where different languages have different word order.
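
For completeness, here is a sketch of what the substitution loop inside dprintf might look like. It handles only the $i directives and assumes the arguments have already been rendered to strings; the real dprintf would format each TaggedValue according to its tag:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Expand '$i' directives (single-digit indices only in this sketch) by
// substituting the i'th pre-rendered argument string.
std::string expand(const std::string& format,
                   const std::vector<std::string>& args) {
  std::string result;
  for (size_t i = 0; i < format.size(); i++) {
    if (format[i] == '$' && i + 1 < format.size() && isdigit(format[i + 1])) {
      size_t index = format[++i] - '0';
      if (index < args.size()) result += args[index];
    } else {
      result += format[i];
    }
  }
  return result;
}
```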

To sum up, the advantages of this scheme are:

  • Type information is automatically inferred from and stored with the arguments.
  • You can pass any type of value by reference or by value.
  • The argument collections (the const TaggedValueCollection &s) can be passed around straightforwardly. You only have to define one function, there is no need for functions like vprintf.
  • Format strings can access arguments in random order.

I’ve switched to using this in my hobby project and it works like a charm. You can see a full implementation of this here: string.h and string.cc (see string_buffer::printf and note that the classes have different names than I’ve used here). To see how it is being used see conditions.cc for instance.

A final note: if you find this kind of thing interesting (and since you’ve come this far I expect you do) you’ll probably enjoy Danvy’s Functional Unparsing.


1: To represent more complex types you just have to be a bit more creative with how you encode the type tag. A simple scheme is to let the lowest nibble specify the outermost type (INT, C_STR, ARRAY, …) and then, if the outermost type is ARRAY, let the 28 higher bits hold the type tag of the element type. For more complex types, like pair<T1, T2>, the 28 remaining bits can be divided into two 14-bit fields that hold the type tags for T1 and T2 respectively. For instance, pair<const char*, int**> would be represented as

bits   28-31   24-27   20-23   16-19   12-15   8-11    4-7     0-3
data   (---------- c_str ----------)   int     array   array   pair

While this scheme is easy to encode and decode it can definitely be improved upon. For instance, there are many type tag values that are “wasted” because they don’t correspond to a legal type. Also, it allows you to represent some types that you probably don’t need very often while some types you are more likely to use cannot be represented. For instance, you can represent int****** but not pair<int, pair<int*, const char*>>. And of course, if you don’t have exactly 16 different cases you’re guaranteed to waste space in each nibble. But finding a better encoding is, as they say, the subject of future research.
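
This encoding is easy to sketch in code; the concrete tag values below are made up:

```cpp
#include <cstdint>

// Made-up tag values for the nibble scheme sketched above.
enum { TAG_INT = 1, TAG_C_STR = 2, TAG_ARRAY = 3, TAG_PAIR = 4 };

// ARRAY: outermost type in the low nibble, element tag in the 28 bits above.
uint32_t array_of(uint32_t element) {
  return TAG_ARRAY | (element << 4);
}

// PAIR: pair in the low nibble, then two 14-bit fields for T1 and T2.
uint32_t pair_of(uint32_t first, uint32_t second) {
  return TAG_PAIR | ((first & 0x3FFF) << 4) | ((second & 0x3FFF) << 18);
}

uint32_t outermost(uint32_t tag) { return tag & 0xF; }
```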

Gestures

I’ve been pretty unhappy with Firefox for a while — I think it’s just annoying and crashy. I’ve tried various other browsers, I used Epiphany and Safari for a while, but today I thought I’d try something new so I installed Firefox 3 Beta to see if it really was as good as Mozilla claim. And the news is good: it’s probably fair to describe it as the best browser ever. I like the new UI, I like the combined URL/search box, I really like the new dialog-less would you like to save your password UI, and so far it hasn’t crashed on me. Yay!

And the best thing I discovered today isn’t even the new Firefox — it was the FireGestures extension, which I think has been around for a while. This extension is seriously cool!

FireGestures works very straightforwardly. When you hold the right mouse button down it keeps track of which direction you move the mouse. For instance, when you draw a counter-clockwise circle it records that you moved the mouse left down right up left, or LDRUL, and when you release the mouse button it executes the associated action. In the case of LDRUL it opens the FireGestures configuration dialog. There is a bunch of predefined gestures: L goes back, R goes forward, LR opens a new tab and DR closes the current tab. I like the DR gesture because it draws an upper case L, and l is the first letter in the Danish word for “close”.
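
The direction tracking itself is easy to sketch: quantize each mouse movement to its dominant axis and append a letter whenever the direction changes. This is my guess at the approach (written in C++ here, not FireGestures’ actual code):

```cpp
#include <cstdlib>
#include <string>

// Quantize one mouse movement to its dominant axis. Screen coordinates:
// y grows downwards, so positive dy means 'D'.
char direction(int dx, int dy) {
  if (std::abs(dx) >= std::abs(dy)) return dx < 0 ? 'L' : 'R';
  return dy < 0 ? 'U' : 'D';
}

// Append a letter to the gesture, collapsing repeats of the same direction.
void record_move(std::string* gesture, int dx, int dy) {
  char d = direction(dx, dy);
  if (gesture->empty() || *gesture->rbegin() != d) gesture->push_back(d);
}
```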

The best thing about FireGestures though is that you can add your own gestures with a custom sequence of mouse moves and an action implemented in JavaScript. The code for custom gestures has access to information about the page, for instance the DOM node on which the gesture started and the text currently selected.

I’ve used this to add a custom gesture that triggers when you draw a W (or DRUDRU) and searches wikipedia for the currently selected text, or if no text is selected goes to the wikipedia front page:

var text = FireGestures.getSelectedText();
if (text) {
  var encoded = encodeURI(text);
  var root = "http://en.wikipedia.org/wiki/Special:Search?search=";
  gBrowser.loadURI(root + encoded);
} else {
  gBrowser.loadURI("http://www.wikipedia.org");
}

Pretty easy — just a few straightforward lines — but really useful. Another gesture I’ve added shows the bug tracker page for the selected bug number in a new tab when I draw a ‘b’ (or DURDL):

var text = FireGestures.getSelectedText();
if (text) {
  var root = "https://bugzilla.mozilla.org/show_bug.cgi?id=";
  var encoded = encodeURI(text);
  gBrowser.selectedTab = gBrowser.loadOneTab(root + encoded);
}

(I’ve replaced the URL to our internal bug tracking system with Firefox’s).

This is a great extension — simple, straightforward and extensible.

Elsewhere

There was a time, maybe a year and a half ago, when I would post here almost every week. Now, I’m lucky if I post once a month; for instance, it’s been almost three months between this and the previous post. What has happened since then?

Besides being kept pretty busy at work I’ve been spending basically all of the time I set aside for extracurricular activities, like blogging, on a new hobby project. It is a new programming language (what else) which is a child of Saturn and grandchild of Neptune (note to self: find out if there is a roman deity with the same lineage). This is where I’ve been spending my time and in fact this is probably the longest I’ve managed to stay focused on a hobby project without losing interest at any point — and I think the lack of blogging is caused by that. I recently read Vampires of the Internet, which is about writers who blog instead of writing, but which I could also relate to in part. Here is a relevant quote, adapted slightly:

Every day, when you try to sit down to code on your project, you will notice a strange weariness in your fingers. Your mind will go blank as you look at the blank screen. And then, almost of their own volition, your fingers will dance on the keys, typing in the dreadful www.blogger.com

Don’t get me wrong, I’m not saying that it’s a waste of time, but sometimes less blogging activity is not altogether a bad thing. If you come back in another three months and I haven’t posted anything new then hopefully all it means is that I’m over at antineutrino (don’t ask about the name) working on my language.

Speaking of language design I came across this great talk, The story of the Ribbon, about the evolution of the most recent Microsoft Office UI. Besides being an all-round interesting talk about UI design (with, however, surprisingly tasteless slides and use of music) he made a number of important points that apply to programming language design as well.

A big problem with previous versions of Office was that it had become so complex that people’s sense of mastery was gone (around 23:55). “Sense of mastery is the feeling that you get when you use a piece of software and you sort of understand what it’s capable of.” His first example is notepad, where he says that “Even if you can’t recite all of the features that are in notepad I feel pretty confident that you could form an opinion on all of the things that are in notepad.” This applies directly to programming language design. It should be at the front of every language designer’s mind that programmers should be able to master their language, to understand it and to feel that they and not the software are in control. Sadly, I can’t come up with more than two or three languages that even come close to having this property.

Later on he goes through the design tenets of the new office version (around 38:25) and I especially noted two of the rules. The first one is “Give features a permanent home. Prefer consistent-location UI over ‘smart’ UI.” When the user wants to accomplish something it should be clear where to go to do it. If you have several types of menus, toolbars, task panes, context menus and whatnot, it’s unclear where to look for the tool that you need. If you have all the tools in the same place you always know where to go, especially if that one place is well organized.

Of course this also applies to programming languages. In smalltalk, for instance, there is more or less just one thing you can do: send a message to someone. To accomplish anything you just have to identify 1) who is the subject of the operation I want to perform, and 2) how do I phrase this as a message to that object. If your code and libraries are well organized then this structure is very helpful. Conversely, the more specialized features your language has the more difficult it becomes to identify which feature to use in a given situation. Of course Tim Toady languages are especially guilty here; having the same feature present in different forms according to context is just like a “smart” UI that tries to adapt itself to fit what you’re doing. Introducing special “smart” features only highlights the fact that the standard features are stupid.

The final point, and one that doesn’t need any comments, is “straightforward is better than clever”.

Constructors #2

In my last post I wrote about the problems with constructors. In this post I’ll describe the approach I’m currently considering using for my hobby language. But first I have to set things up a bit and describe some related language constructs.

Protocols

First of all, there are no classes but only protocols. You can think of protocols as classes without fields or, equivalently, as interfaces where methods can have implementations or, most accurately, as something very similar to traits. Here is an example:

protocol Point2D {
  def x();
  def y();
  def to_string() {
    return "a Point (x: ${this.x()}, y: ${this.y()})";
  }
}

In a class-based language Point2D would have had two instance variables, x and y. Since protocols don’t have fields, this one instead has two virtual methods, x() and y(), where the fields would otherwise be. It also has a to_string method which uses string interpolation, a feature I’ve stolen directly from the neptune language.

One way to create an instance of Point2D is to use the new syntax1 to instantiate a subprotocol that defines the methods x and y:

protocol Point2DAt3Comma4 : Point2D {
  def x() { return 3; }
  def y() { return 4; }
}

def p := new Point2DAt3Comma4();
print(p.to_string()); // -> a Point (x: 3, y: 4)

Even though protocols are like interfaces there is nothing to stop you from creating instances of them. And because they don’t have any fields, initialization is trivial.
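For the rest of this post I’ll occasionally sketch the ideas in Python. These are loose analogues, not the hobby language itself; here Python’s abstract methods stand in for the undefined x() and y():

```python
from abc import ABC, abstractmethod

class Point2D(ABC):
    """A 'protocol': behavior only, no fields."""
    @abstractmethod
    def x(self): ...
    @abstractmethod
    def y(self): ...

    def to_string(self):
        return f"a Point (x: {self.x()}, y: {self.y()})"

class Point2DAt3Comma4(Point2D):
    """A subprotocol that pins the 'fields' down as methods."""
    def x(self): return 3
    def y(self): return 4

p = Point2DAt3Comma4()
print(p.to_string())  # -> a Point (x: 3, y: 4)
```

The subclass supplies the missing accessors, so creating an instance involves no initialization step at all.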

Records

However, it obviously won’t do that you have to write a named subprotocol whenever you want a concrete point. Instead, there is a dual to protocols: records. Where a protocol is pure behavior and no state, a record is pure state and no behavior. In the case of Point2D the associated record could be

def p := new { x: 3, y: 4 };
print(p.x()); // -> 3
print(p.y()); // -> 4

All a record can do is return the field value when the associated method is called. As with protocols, initializing a record cannot fail because it is only created after all the field values have been evaluated.
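In Python terms a record could be faked with a hypothetical record helper (not part of any real library) that turns each field into a zero-argument accessor method:

```python
def record(**fields):
    # A record is pure state: each field becomes a zero-argument method
    # returning the value it was created with.  The _v=value default
    # freezes the value per field (avoiding closure late-binding).
    ns = {name: (lambda self, _v=value: _v) for name, value in fields.items()}
    return type("_record", (), ns)()

p = record(x=3, y=4)
print(p.x())  # -> 3
print(p.y())  # -> 4
```

All the field values are evaluated before the object comes into existence, which is exactly why record initialization cannot fail.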

So now that we have behavior without state and state without behavior we just have to combine them. This is done using the “extended” method call syntax:

def p := new Point2D() { x: 3, y: 4 };
print(p.to_string()); // -> a Point (x: 3, y: 4)

This syntax creates a record like the previous example, but makes the record an instance of Point2D. Initialization is trivial, as with plain records, because all field values have been evaluated before the instance is created. This works in the simple case but unfortunately doesn’t solve the general problem. First of all, we’ve exposed the “fields” of Point2D. If we later want to change to a polar representation, every occurrence of the above syntax that binds x and y directly will give us broken points. Also, this does not take inheritance into account.
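Continuing the Python analogy, the extended call syntax can be mimicked by a hypothetical new helper that combines a protocol (behavior) with a record (state):

```python
class Point2D:
    """The protocol: behavior written purely in terms of accessors."""
    def to_string(self):
        return f"a Point (x: {self.x()}, y: {self.y()})"

def new(protocol, **fields):
    # Mimics `new Protocol() { field: value, ... }`: build an anonymous
    # subclass whose "fields" are accessor methods (the _v=value default
    # freezes each value per field).
    ns = {name: (lambda self, _v=value: _v) for name, value in fields.items()}
    return type("_anon", (protocol,), ns)()

p = new(Point2D, x=3, y=4)
print(p.to_string())  # -> a Point (x: 3, y: 4)
```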

For this reason, the syntax above is not intended to be used in client code, only in Point2D’s methods. Instead, Point2D should define factory methods for client code to call:

protocol Point2D {
  ...
  static factory def new(x: i, y: j) {
    return new this() { x: i, y: j };
  }
}

The static modifier means that the method is the equivalent of a smalltalk class method. x: and y: are python-style keywords with a more smalltalk-like syntax. The factory keyword does not have any effect until we get to subprotocols. In any case, with this definition you can now create a new point by writing

def p := new Point2D(x: 3, y: 4);

If, at some later point, you decide to switch to a polar representation you simply have to change the definition of new(x:, y:):

static factory def new(x: i, y: j) {
  def r := sqrt(i * i + j * j);
  def t := arc_cos(i / r);
  return new this() { rho: r, theta: t };
}

As long as people use the factory methods you are free to change the implementation of the protocol. Also, the protocol can do arbitrary pre- and postprocessing and the actual instantiation and initialization is atomic. All that’s left now is to account for subprotocols. Enter Point3D.

Inheritance

Here is Point3D:

protocol Point3D : Point2D {
  def z();
  override def to_string() {
    return "a Point (x: ${this.x()}, y: ${this.y()}, z: ${this.z()})";
  }
}

This protocol has three “fields”, x, y and z that have to be initialized. One way to do it would be for Point3D to initialize all of them:

protocol Point3D : Point2D {
  ...
  static factory def new(x: i, y: j, z: k) {
    return new this() { x: i, y: j, z: k };
  }
}

This sort of works but it is error-prone, breaks encapsulation, and breaks down if we were to change Point2D to use polar coordinates. What we really want is for Point3D to call the new(x:, y:) method on Point2D. That’s where the factory keyword from before comes in. A factory method is one that allows you to specify a set of fields. Because new(x:, y:) is a factory method the instance created by this call

new this() { x: i, y: j }

will not only have an x and y field but also any fields specified by the caller of new(x:, y:). This allows us to rewrite new(x:, y:, z:) as

static factory def new(x: i, y: j, z: k) {
  return new super(x: i, y: j) { z: k };
}

If Point2D were to change its representation to polar coordinates then Point3D would still work, provided that Point2D had x and y methods. The result would just have a rho, theta and z field.
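In Python the field-merging behavior of factory methods can be approximated with keyword arguments; a sketch, with **extra standing in for the caller-specified record fields:

```python
class Point2D:
    @classmethod
    def new(cls, x, y, **extra):
        # "Factory" behavior: fields supplied by the caller (extra) are
        # merged with the fields this level contributes itself.
        self = cls()
        self.__dict__.update({"x": x, "y": y, **extra})
        return self

class Point3D(Point2D):
    @classmethod
    def new(cls, x, y, z):
        # Delegate x and y to the superclass factory; contribute only z.
        return super().new(x, y, z=z)

p = Point3D.new(1, 2, 3)
print(p.x, p.y, p.z)  # -> 1 2 3
```

Note that cls flows through the whole chain, so the superclass factory still instantiates Point3D, not Point2D.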

Let’s look at another example: the one from the previous post where points have a serial number.

protocol Point2D {
  def x();
  def y();
  def id();
  static factory def new(x: i, y: j) {
    def s := current_id++;
    return new this() { x: i, y: j, id: s };
  }
}

protocol Point3D : Point2D {
  def z();
  static factory def new(x: i, y: j, z: k) {
    return new super(x: i, y: j) { z: k };
  }
}

Even though this requires nontrivial computation in a superprotocol there is no risk of creating improperly initialized objects. In terms of the construction steps described in the last post this allows initialization to proceed like this:

  • Arbitrary preprocessing in Point3D
  • Arbitrary preprocessing in Point2D
  • Object instantiation and full initialization
  • Arbitrary postprocessing in Point2D
  • Arbitrary postprocessing in Point3D
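The whole serial-number example, with the ordering above, can be sketched in Python (the log list is only there to make the order visible):

```python
import itertools

log = []
_ids = itertools.count()

class Point2D:
    @classmethod
    def new(cls, **fields):
        log.append("pre Point2D")     # arbitrary preprocessing
        fields["id"] = next(_ids)     # the serial number
        self = cls()
        self.__dict__.update(fields)  # instantiation + full initialization
        log.append("post Point2D")    # arbitrary postprocessing
        return self

class Point3D(Point2D):
    @classmethod
    def new(cls, x, y, z):
        log.append("pre Point3D")
        self = super().new(x=x, y=y, z=z)
        log.append("post Point3D")
        return self

p = Point3D.new(1, 2, 3)
print(log)  # -> ['pre Point3D', 'pre Point2D', 'post Point2D', 'post Point3D']
```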

The fact that protocols don’t declare fields but that fields are provided upon construction also allows new kinds of specialization. For instance, we don’t actually have to change the implementation of Point2D to create an instance that uses polar coordinates:

protocol Polar2D : Point2D {
  def rho();
  def theta();
  def x() { return this.rho() * cos(this.theta()); }
  def y() { return this.rho() * sin(this.theta()); }
  static factory def new(rho: r, theta: t) {
    return new super() { rho: r, theta: t };
  }
}

In this case we choose to bypass the new(x:, y:) method and instead implement x and y as ordinary methods rather than record fields. This way we can share all methods in Point2D. The same thing is possible in other languages if all access to x and y goes through accessors, but there you would get two unused fields in all instances.
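Here is the accessor-based variant in Python terms: a subclass that supplies x and y as computed methods, sharing Point2D’s other methods without carrying unused cartesian fields:

```python
import math

class Point2D:
    # to_string is written purely in terms of the accessors x() and y().
    def to_string(self):
        return f"a Point (x: {self.x():g}, y: {self.y():g})"

class Polar2D(Point2D):
    # x and y are computed methods; no unused cartesian fields anywhere.
    def __init__(self, rho, theta):
        self._rho, self._theta = rho, theta

    def x(self):
        return self._rho * math.cos(self._theta)

    def y(self):
        return self._rho * math.sin(self._theta)

p = Polar2D(5, math.atan2(4, 3))
print(p.to_string())  # -> a Point (x: 3, y: 4)
```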

Finally, let me just mention that any method, not just new, can be a factory; the syntax is just slightly different when you invoke it. For instance:

def q := Point2D.at(x: 3, y: 4) { z: 5 };

It looks slightly different but the mechanism is exactly the same.


1 The new syntax is, as in neptune, not a special construct but an ordinary method call. The syntax new X() is exactly equivalent to the ordinary method call X.new().

Newspeak Constructors

There is one language design problem I’ve never been able to solve to my own satisfaction, and which no language I know solves completely: constructing objects. Gilad brought the subject up in two blog posts a while back, but while the newspeak solution he describes has some nice properties it just doesn’t feel quite… right. Now, I’m currently (as always) working on a hobby language and had to find some, any, solution to this problem. While working on that I came up with a slightly weird solution that has some nice properties. But before going into that, this warm-up post takes a look at the newspeak solution. Note: because I know little of newspeak I may be making some wrong assumptions about how the language works; apologies in advance if I get things wrong.

Update: Peter has written an update that demonstrates that I did indeed get things wrong, newspeak constructors are not as restricted as I assume in this post. So you may want to just skip over the newspeak part.

I’ll warm up by rehashing a well-known problem with the most widely used constructor mechanisms: they are just not object oriented. In Java, if you write

new C(...)

you’ve tied your code directly to the class C because the constructor is forced to return a new instance of C. Not a subclass, not a cached instance, not anything else. There is no abstraction there, no wiggle room.

One way to make things slightly better is to use a static factory method instead; that frees you from some of the restrictions of constructors so you’re no longer tied to a particular implementation, but it still ties you to a particular factory. You can then go a step further and implement factory objects that can be passed around as arguments and changed dynamically, but that gives you a lot of boilerplate code. A much more lightweight solution, in languages like Smalltalk that support it, is to use class methods on first-class class objects — it’s really just factory objects, but the language happens to give them to you for free. This is a pretty decent solution but has the disadvantage that it requires first-class class objects. A class object is a reflective, meta-level thing which people shouldn’t have to use in programs that are otherwise non-reflective. But that is a small detail and probably not a problem in practice.
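Python makes the same point conveniently because classes are first-class values; a sketch in which the “constructor” is free to return a cached instance and the class itself is passed around as a factory object:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class CachedPoint(Point):
    _cache = {}

    def __new__(cls, x, y):
        # Unlike Java's `new C(...)`, this "constructor" can hand back a
        # cached instance instead of always allocating a fresh one.
        if (x, y) not in cls._cache:
            cls._cache[(x, y)] = super().__new__(cls)
        return cls._cache[(x, y)]

def make_unit(factory):
    # The class itself doubles as a factory object we can pass around.
    return factory(1, 1)

print(make_unit(CachedPoint) is make_unit(CachedPoint))  # -> True
print(make_unit(Point) is make_unit(Point))              # -> False
```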

However, the issue of how best to invoke constructors is only part of the problem with object creation, and not the hard part. The hard part is initializing instances once the constructor has been called: setting the values of an object’s instance variables.

What makes this especially hard is the fact that instance variables are essentially non-object-oriented. There are ways of making them less bad, like hiding them and providing accessor methods, as in Self and (apparently) newspeak. That gives you some wiggle room, for instance by allowing you to change something that was an instance variable into a computed value or to override the accessor, without breaking client code. However, even with Self-style accessors, we’re still not off the hook because instance variables have to be initialized. This is where, in my opinion, the newspeak model breaks down.

Consider this newspeak example:

class Point2D x: i y: j = ( |
  public x ::= i.
  public y ::= j.
| ) (
  public printString = (
    ^ ' x = ', x printString, ' y = ', y printString
  )
)

There is a 1:1 correspondence between arguments to the primary constructor and instance variables in the object. And it must be so, except for trivial variations like reorderings, since any nontrivial initialization of instance variables would introduce the possibility of exceptions or non-local returns and hence void the guarantee of correct initialization[1].

Imagine that we want to give each point a unique integer id. In Java you could easily implement it something like this (ignoring synchronization issues):

class Point2D {
  static int currentId = 0;
  int x, y, id;
  Point2D(int x, int y) {
    this.x = x;
    this.y = y;
    this.id = currentId++;
  }
}

In newspeak you can use a secondary constructor to implement this but as soon as you try to subclass Point2D you’re in trouble: the subclass has to call the primary constructor, which is only allowed to do trivial computation before setting the instance variables. So either the id generation code has to be duplicated in a secondary constructor for the subclass, or the id slot has to stay uninitialized and then be initialized in a second pass. The first option is obviously bad and the second pass erodes the guarantee of correct initialization. At least, that’s how I understand the mechanism.

The root of the problem, and this also applies to Java constructors, is that steps of object construction that logically belong together are executed in the wrong order. A simplified model of how the construction process takes place is that there are three steps:

  1. Preprocess: calculate the values to be stored in the object, for instance the point’s id
  2. Initialize: store the calculated values in instance variables. Only trivial field stores take place here, no general computation
  3. Postprocess: perform any additional work after the object is fully initialized.

If we consider the simple Point3D inheritance hierarchy where

Object <: Point2D <: Point3D

the newspeak model allows the following sequence of steps:

  • Arbitrary preprocessing in secondary constructor in Point3D
  • Instance creation by primary constructor of Point3D, recursively through primary constructor of Point2D
  • Initialization of Point2D slots
  • Initialization of Point3D slots
  • Arbitrary postprocessing in secondary Point3D constructor

Since no arbitrary computation is allowed from the time the object is created until every class in the chain has had a chance to initialize its variables, the object is guaranteed to be fully initialized. However, only the bottom-most class in the inheritance chain is allowed to perform pre- and postprocessing.

In Java the model is different: there, each constructor in the chain is allowed to do arbitrary computation, as long as it starts off by calling its superconstructor. Expressed in the simple steps from before, Java allows this sequence:

  • Immediate instance creation
  • Arbitrary preprocessing in Point2D
  • Initialization of Point2D
  • Arbitrary postprocessing in Point2D
  • Arbitrary preprocessing in Point3D
  • Initialization of Point3D
  • Arbitrary postprocessing in Point3D

Object initialization and arbitrary computations are interspersed, which gives plenty of opportunity to leave the object half-initialized if an exception occurs, and for superclasses to accidentally call methods overridden by subclasses before the subclass has had a chance to properly initialize itself.
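The classic failure mode can be reproduced in Python, whose __init__ chain has the same shape as Java’s constructor chain:

```python
class Point2D:
    def __init__(self, x, y):
        self.x, self.y = x, y
        # Hazard: describe() may be overridden by a subclass whose own
        # fields have not been initialized yet.
        self.label = self.describe()

    def describe(self):
        return f"({self.x}, {self.y})"

class Point3D(Point2D):
    def __init__(self, x, y, z):
        super().__init__(x, y)  # self.z does not exist yet at this point
        self.z = z

    def describe(self):
        return f"({self.x}, {self.y}, {getattr(self, 'z', '?')})"

p = Point3D(1, 2, 3)
print(p.label)       # -> (1, 2, ?)  computed before z was set
print(p.describe())  # -> (1, 2, 3)
```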

The ideal model would be one that looked something like this sequence of steps:

  • Arbitrary preprocessing in Point3D
  • Arbitrary preprocessing in Point2D
  • Object instantiation
  • Initialization of Point2D
  • Initialization of Point3D
  • Arbitrary postprocessing in Point2D
  • Arbitrary postprocessing in Point3D

Here, all classes are allowed to do arbitrary pre- and postprocessing but the actual instantiation and initialization is atomic so the object is guaranteed to be well-formed. Finally, once the object has been initialized, each inheritance level can do postprocessing with a fully initialized object.
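One possible, purely illustrative realization of that sequence in Python: collect each level’s field values first, initialize in a single atomic step, then run each level’s post hook (the pre/post hooks and the Constructible base are my own invented names, not any existing mechanism):

```python
class Constructible:
    @classmethod
    def create(cls, **args):
        chain = [c for c in cls.__mro__ if "pre" in c.__dict__]
        fields = {}
        for c in chain:                  # subclass first: preprocessing
            fields.update(c.pre(args))
        self = cls.__new__(cls)
        self.__dict__.update(fields)     # atomic instantiation + initialization
        for c in reversed(chain):        # superclass first: postprocessing
            c.post(self)
        return self

order = []

class Point2D(Constructible):
    @staticmethod
    def pre(args):
        order.append("pre Point2D")
        return {"x": args["x"], "y": args["y"]}
    @staticmethod
    def post(self):
        order.append("post Point2D")

class Point3D(Point2D):
    @staticmethod
    def pre(args):
        order.append("pre Point3D")
        return {"z": args["z"]}
    @staticmethod
    def post(self):
        order.append("post Point3D")

p = Point3D.create(x=1, y=2, z=3)
print(order)  # -> ['pre Point3D', 'pre Point2D', 'post Point2D', 'post Point3D']
```

No hook ever sees a partially initialized object: the pre hooks run before it exists, and the post hooks run after every field is in place.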

My next post will explain how I think that can be made to work.


[1] I assume that newspeak constructors guarantee that objects are always correctly initialized because of this paragraph in Gilad's post: "One detail that’s new here is the superclass clause: Point3D inherits from Point2D, and calls Point2D’s primary constructor. This is a requirement, enforced dynamically at instance creation time. It helps ensure that an object is always completely initialized."