Sequence Comprehensions

Incredible: it turns out that the for expression syntax for list comprehensions I described in my last post is almost identical to the syntax of sequence comprehensions in Scala. Compare this saturn code:

for (x: ints, x % 2 == 0) -> x

with this code in Scala:

for (val x <- ints; x % 2 == 0) yield x

Looks pretty familiar right? I do like the saturn syntax better, though.

Why do I keep having other people’s ideas?

Details

I had originally decided that I wasn’t going to spent a lot of time coming up with new features for saturn but would just reimplement neptune. However, as I might have expected, I just can’t help myself. Many of the “new” ideas are things we discussed back when we designed neptune, though, so it’s more “outtakes” than actual new material.

The thing I’ve been most happy about adding so far is also the very simplest: a compact syntax for creating intervals. In C, Java and even neptune, you iterate over an interval by using the standard for loop:

for (int i = 0; i < n; i++) ...

This is unfortunate for more than one reason. First of all, it's too much boilerplate. 90% of for loops that iterate over a sequence of numbers are the same except for the name of the loop variable and the limit to iterate up to. Below, I've underlined the parts of the statement that aren't boilerplate:

for (int i = 0; i < n; i++) ...

That's actually less than half the code! Also, the loop variable is mentioned three times which gives ample opportunity to get it wrong. If you're like me, for instance, and have the variable i hardwired in your brain but need to iterate for a different variable there's a very real risk that you'll end up writing

for (int j = 0; i < n; i++) ...

The third problem with this kind of for loops is a classic: the limit is evaluated for each iteration. In rare cases that is exactly what you need. More often, however, it is either unnecessary (and may waste execution time) or just plain wrong.

In saturn it takes just one method to fix this: the .. method on the class Integer:

class Integer {
...
operator ..(that) {
return new Interval(this, that);
}
}

With this method, writing from .. to creates a new open interval from from to to. What can you do with an interval? Most importantly, iterate through it:

for (i : 0 .. n) {
// whatever
}

Almost all the boilerplate code is gone. The iteration variable is only mentioned once and the limit is only evaluated once, when the interval is created. And since the receiver of the .. method is known at compile time, the compiler can optimize away the creation of the interval with no visible consequences other than a performance improvement. Yessir!

Another change I've made is to add more shorthands for occurrences of expressions. In neptune we had two syntaxes for blocks: one for statement blocks and one for blocks whose body was just a single expression:

// statement block
fun (x) {
var str = x.to_string();
System.print_on_stdout(str);
}

// expression block
fun (x) -> x * 2

In saturn, I've decided to introduce this distinction more places. Possibly the most visible place I've added this is methods -- there is now a compact syntax for methods that return the value of a single expression:

class Point {
var x, y;
...
norm() -> sqrt(x * x + y * y);
}

In neptune (and Java and ...) you would have had to write something like

norm() {
return sqrt(x * x + y * y);
}

Regular "statement methods" still exist but this makes it easier to write short methods and having short methods is generally a Good Thing. Also, to me, it just sort of looks right in a way I can't explain.

I'm also considering adding a new form of for statement along the same lines:

var ints = for (i : 0 .. 100) -> 2 * i;
// ints = [0, 2, 4, ..., 198]

As with a regular for statement, a for expression evaluates its body repeatedly but the expression collects the values of the body in a new collection. It's essentially a list comprehension

ints = [ i * 2 | i <- [0 .. 100] ]

but with an untraditional syntax that has a number of advantages: it's very easy to explain in terms of for statements and the variable definitions occur syntactically before their use. In the Haskell-like syntax above, i is being used before it's defined which I find non-obvious and difficult to read.

There you have it: three new features, two of which have already been implemented. I haven't figured out all the details of how the new for expression should work but I'm sure I can take inspiration from existing list comprehension mechanisms. I'm also working towards an alpha release of the compiler and runtime but currently you'll have to get it through CVS from saturn-language at sourceforge to try it out.

While

I’m still working on saturn and I recently implemented tail-call optimization. One really neat thing about tail-call optimization is that now, unlike in neptune, loops can be implemented purely within the programming language without any “magic” from the compiler. Okay a little magic is still required to parse the while loop syntax so that

while (i < n) {
System.print_on_stdout(i);
i = i + 1;
}

can be translated into

(fun -> i < n).while(fun {
System.print_on_stdout(i);
i = i + 1;
})

but where the while method was magic in neptune it can be implemented directly in saturn. It turned out, however, that in order for this method to behave as you would expect you have to be careful with how you implement it. My first implementation was pretty straightforward:

while(body) {
if (this()) {
body();
this.while(body);
}
}

The most important property of while, and the reason why it couldn’t be implemented in neptune, is that it must not grow the stack. The way to make this happen is to be sure that the recursive call to while is a tail call which is guaranteed not to grow the stack.

The problem with the method above, or one of the problems, is that in saturn, an if statement is actually a call to the if method. The if statement in the method above corresponds to this:

(this()).if(fun {
body();
this.while(body);
})

Unfortunately, that method call grows the stack. It’s easy to get rid of though, we can just change it into a guard instead:

while(body) {
if (!this()) return;
body();
this.while(body);
}

Now the recursive call to while is not surrounded by any other method calls. However, because of a nasty subtlety the recursive call is still not a tail call.

When is a call a tail call? When the call is the absolutely last thing done in the surrounding method before it returns. It would seem that in the method above the recursive call to while is indeed the very last thing that happens — so it should be a tail call right? The snag is that methods that don’t explicitly return a value implicitly return a dummy value, void. If we could write the method out in full it would correspond to this:

while(body) {
if (!this()) return;
body();
this.while(body);
return void;
}

But as before, it’s easy to get around this. If we return the value of the recursive call instead of letting the method return void the call finally becomes a tail call:

while(body) {
if (!this()) return;
body();
return this.while(body);
}

That is indeed how the method is implemented.

It is unfortunate that the last method call in a method is not a tail call but that would require methods without an explicit return to return the value of the last statement (for some suitable definition) and that brings up endless problems. Returning a void value proved to work very well in neptune so I’m not inclined to change that. Maybe there’s some implementation trick that can allow both things? I suspect there might be.

Anyway, this is one example of tail-call optimization interacting in interesting ways with other features of the language. I’m sure there’ll be more examples along the way.

Saturn!

Maybe it’s because it’s summer and nobody is registering new projects, but the registration of saturn-language at sourceforge came though extremely quickly, in less than a day. And now that I’ve made a new logo it looks like the saturn project is slowly coming alive. Yes, the new logo looks a lot like the old one but still: the ring is broader and now it’s got a few colored bands. Clearly that’s saturn, not neptune.

Someone asked me what the goals of the project are. I don’t have any long-term goals yet but currently the goal is to design the language and get an implementation, any implementation, up and running as quickly as possible. Most of the language design is already done so it’s mainly a question of implementing it. Currently there is a compiler, a simple interpreter written in Java and a very simple class library; all three evolve in parallel. The interpreter is very inefficient but on the other hand it can be changed very quickly to accomodate changes in the language or the compiler. Once the language is more stable and there are plenty of library code (tests, in particular) I will look into implementing a more efficient runtime in C or C++. But it will be some time before the project reaches that point and anyway, I’m more of a compiler/IDE guy than a VM guy so maybe I’ll have to get some more people involved to get a decent VM together. On the other hand, plenty of language get along fine with less-than-decent VMs so we’ll just have to see.

Will this future C or C++ runtime be small enough to run on embedded systems? I’m pretty sure it will. Neptune definitely did, it was designed for it. Saturn makes some extensions that will definitely require more of the runtime: transactional memory, dynamic privacy, full blocks and most likely other things I haven’t even thought of yet. But I believe there are implementation techniques that allows these things to be implemented efficiently in a reasonably small VM. But that’s way way into the future — the project is still less than a month old.

A complete aside: while I was looking for pictures of saturn I found this disturbing painting by Goya, Saturn devours his children.


I don’t even now what is most spooky: the half-eaten son or the mad look in Saturn’s eyes. But then again Goya was apparently half-mad when he painted it, not on a canvans but on his dining room wall.

The story goes that Saturn overthrew his father Uranus and castrated him, no less. Later it was foretold that the same would happen to Saturn himself so to be on the safe side he decided to eat his children at birth. It didn’t do him much good, though, he was just overthrown by someone else: Jupiter. John Keats wrote the (unfinished) poem hyperion about that:

And the sad Goddess weeping at his feet:
Until at length old Saturn lifted up
His faded eyes, and saw his kingdom gone
And all the gloom and sorrow of the place

Incidentally, both the painting and the poem are from 1819. But enough about that.

Super/Inner

The title of my last post was “Saturn?”, which I never really explained. Obviously, Saturn is an alternative name for the new implementation of neptune. The reason why I’ve considered changing the name is both that the language I’m implementing now is somewhat different from neptune and to avoid potential, how shall I put it, “conflict” with Esmertec (not that I expect any problems with Esmertec but just to be on the safe side).

Last week I hadn’t made up my mind about it but now I have: I’m changing the name. Neptune is dead; long live Saturn!

The final straw was the decision I made today to support both super and inner calls. Super calls are a well-known mechanism that exist in most object-oriented languages. Inner calls, a mechanism that originally comes from the beta language, are much less known and completely unknown in mainstream languages.

In most object-oriented languages, overriding a method means replacing it completely. An overriding method can call the overridden method through a super call, but it is entirely up to the implementer of the overriding method if and how it wants to do that.

class Window {
paint() {
...; // paint background
}
}

class BorderWindow: Window {
paint() {
super.paint(); // paint background
...; // draw border
}
}

This works well most of the time since it gives overriding method complete control while it still has the option of calling the overridden method. Occasionally, though, having the subclass in the driver’s seat just isn’t what you want. For instance, sometimes there is no good place to call super in the subclass. One place I’ve met this is when implementing caching in a superclass: the superclass handles caching and subclasses are only called upon to calculate a value if there is a cache miss.

abstract class Node {

get_index() {
if (has_cached_index(this))
return get_cached_index(this);
int index = calculate_index();
cache_index(this, index);
return index;
}

abstract calculate_index();

}

class SomeNode: Node {

override calculate_index() {
...; // expensive calculation
}

}

Here, the superclass handles caching in get_index and the subclasses can specialize how the index is calculated by overriding calculate_index. But this approach is not without problems: it introduces two methods that are deceivingly similar but must be used in two different ways. Only get_index is allowed to call calculate_index directly, since otherwise you won’t get the proper caching behavior. On the other hand, subclasses must override calculate_index, not get_index, but must of course not call calculate_index themselves.

This is where inner comes in. An inner method, as opposed to an overriding method, specializes a method in a superclass but doesn’t replace it. Instead, the method in the superclass can explicitly call the inner method in the subclass:

class A {

print() {
System.out.println("a");
inner.print();
}

}

class B: A {

inner print() {
System.out.println("b");
}

}

new A().print();
> a

new B().print();
> a
> b

Here there are two print methods but when you call print on an instance of B, the method in A will be called, not the one in B. It will only be called when the super implementation calls inner.print(). It’s in some sense overriding turned upside down, with calls downwards in the inheritance hierarchy through inner instead of calls upwards through super. Using this, we can rewrite the caching example from before:

abstract class Node {

get_index() {
if (has_cached_index(this))
return get_cached_index(this);
int index = inner.get_index();
cache_index(this, index);
return index;
}

}

class SomeNode: Node {

inner get_index() {
...; // expensive calculation
}

}

Now there is only one method so there is no confusion about which method to call and which method to override.

As demonstrated above, you don’t need inner as such since you can always just create a new method like calculate_index instead. However, a lot of situations where you would usually use super can be described more elegantly using inner. For instance, if you’ve ever come across a method that required overriding methods to call the super method first (that happens often in eclipse), that’s really just an inner call struggling to get out:

class AbstractPlugin {

/**
* Overriding methods must call super first.
**/
start() {
...; // start plugin
}

}

class MyPlugin: AbstractPlugin {

start() {
super.start();
...; // start my plugin
}

}

With inner methods that could be rewritten as:

class AbstractPlugin {

start() {
...; // start plugin
inner.start();
}

}

class MyPlugin: AbstractPlugin {

inner start() {
...; // start my plugin
}

}

Here, subclasses don’t have to implement start in any particular way, and the fact that the superclass implementation must be executed first is encapsulated in the super implementation of start. Also, the superclass is free to call the inner method however it wants: conditionally, three times, not at all, etc., which is impossible with super calls.

There’s a bit more to inner calls, for instance: what if there is no subclass implementation, how do inner calls interact with super calls, etc. All that is described in the paper I referenced above, Super and Inner – Together at Last!. The model in Saturn is pretty close to the one in the paper (but not exactly the same).

Saturn?

As I mentioned in my last post, I’ve started work on a new implementation of neptune. Even though Esmertec Denmark has been closed down it didn’t mean that the language died, not as such. What it meant was that an implementation of the language died. Now that I have some time off I might as well write a new one.

Will the new implementation be exactly the same as the old one? Not exactly, no. Now that there’s no marketing aspect I’ll take the opportunity to try out a few things. One clear difference is that concurrency control will be based on a transactional memory model. I’ve experimented with changing the syntax a little, for instance allowing -, ?, ! and such in identifiers and method names but in the end I decided against that. There will probably be “real” dynamic privacy. There will definitely be tail call optimization and full closures. But the plan is to otherwise stay close to the original language. There’s a very good reason for that. I know myself: if I’m completely free to design a language I get stuck before I even begin with theoretical issues such as “what is a method name, really?” and “how can I unify instance variables and local variables?”. So to get any work done at all I’ll have to stick with what we already have except for adding a few well-understood mechanisms.

I don’t know if this language will ever get anywhere. I have no world domination plans, so far I’m just doing this because I enjoy implementing programming languages. Of course I do hope that at some point there will be room in the world for a statically typed object-oriented language that is not C++, Java or C#. Otherwise I’m afraid the next few decades of my career will be something of a disappointment.

By the way, if you feel like contributing or just taking a look at how things are going, neplang at sourceforge is open for business.

Back

I’m back after a week of sailing in Sweden. Both the experience of sailing and the experience of going to Sweden depend a lot on the weather. This year, the weather was a bit less than optimal. Oh well, at least that gave me a chance to finally read some books I’ve had lying around. The stars’ tennis balls by Stephen Fry deservers honorable mention. Guaranteed to be read in one sitting it said on the cover. I promised myself to just read a chapter or two and save the rest for the trip home. Well, apparently book covers know me better than I do. So on the way home I just stared out the window.

Now that I’m back I plan to spend some time working on a project I started a few weeks ago. It’s a new implementation of a programming language that readers of this blog may be familiar with. I’ll write more about what my plans are when I know myself, right now I’m just typing in code.

RIP

Esmertec sent out an innocent-looking press release today: Jean-Claude Martinez Confirmed as CEO of Esmertec. But there’s a bit more to it than the title suggests:

Cost reductions: Management has decided to close down Esmertec’s subsidiary in Denmark and reduce significantly the operations in Japan. The company has assigned resources from other Esmertec’s offices and in Japan to continue supporting the products and customers in these regions.

What that means is that I now have a really long summer vacation. It also means that you shouldn’t expect an open source version of neptune to be released any time soon — in fact, don’t expect any version to be released at all.

Why has this happened? This press release gives a hint.

Types #2

When I wrote the first post about types in neptune I ran out of steam before I was completely finished. In this post I’ll finish by describing two features I only mentioned briefly in the first post, the void type and protocol literals.

Checks

But first: as Isaac points out, I’ve spent a lot of time explaining what the type system doesn’t do and doesn’t require, and very little explaining what it actually does do for you. I guess I didn’t think to write about it because the checks that are performed are pretty simple and conventional. Here’s a summary of the checks performed by the compiler.

  • If you call a method m and the type system knows that the receiver has a type that doesn’t define the method m, the checker will issue a warning:
    protocol Runnable {
    run();
    }

    Runnable action = ...;
    action.runn(); // warning: undefined method Runnable.runn()
  • If the type checker knows the signature of a method and you pass it an argument with an incompatible type, a warning will be issued:
    int gcd(int a, int b) { ... }
    gcd("foo", "bar"); // warning: incompatible types

    I’ll return to what “incompatible” means in a second.

  • If a value is assigned to a variable of type T, a warning is issued if the type of that value is not a subtype of T:
    int x = "4"; // warning: incompatible assignment
  • When returning a value from a method with return type T, the returned value must be a subtype of T.
    int foo() {
    return "bar"; // warning: incompatible return type
    }

    As a special case of this rule which I’ll return to later, returning a value from a void method is illegal, and not returning a value from a non-void method is illegal.

  • Finally, the compiler checks that methods override each other properly. An overriding method is allowed to return a subtype of the overridden method (return type covariance) and it is allowed to take arguments that are supertypes of the arguments of the overridden method (argument contravariance):
    class SuperClass {
    Object foo(int x) { ... }
    }

    class SubClass: SuperClass {
    Object foo(String x) { ... } // incompatible override
    int foo(Object x) { ... } // legal
    }

There may be more checks that I’ve forgotten but at least these are the most important ones. All but one of these checks use the subtype relation between two types which is simple in neptune: The type X is a subtype of Y they are equal, if Y is a class that extends X, directly or indirectly, or if Y implements the protocol of X, again directly or indirectly. It’s a simple version of the nominal typing relation of many other object-oriented languages, including Java and C#, except that it is more general because one class can be a subtype another class without having that class as a superclass.

Besides these checks, the compiler also issues warnings if it thinks a class has not been properly implemented: if non-abstract classes have abstract methods, if a class doesn’t provide all the methods required by the traits it uses, if it declares that a method overrides another but it doesn’t, etc. However, all those things could be checked even if there was no static type system.

I hope that clarifies what the type system will do for you, and then move on to describing the two last features of the type system.

Void

In most languages, methods that don’t return a meaningful value are treated differently than other methods. In the C language family, such functions and methods are declared as if they did return a value of type void, but the void type is special and magical; what it really means is that the function don’t return a value at all. In particular, there is no value of type void and you can’t declare a variable of that type:

void do_nothing() { }

void my_void = do_nothing(); // illegal

Void methods and functions in the C languages are fundamentally different from functions and methods that return values and the type system keeps track of whether or not you’re allowed to use the result of a call. In neptune we can’t do that because you are free to call a method even if the type system knows nothing whatsoever about it. So we couldn’t use the C model, at least not directly.

In smalltalk, on the other hand, there is no such thing as a method that doesn’t return a value. You are free to not explicitly return a value from a method, but in that case the method returns self. I don’t consider that to be a particularly good design and long before we considered switching from smalltalk to another language we experimented with changing this in OSVM. The problem is that many methods really aren’t intended to return a meaningful value but return self by default. Then someone accidentally uses this value and then later when the method is changed, under the assumption that no one is using the return value, the program breaks somewhere completely unexpected. The problem is that the intention of the method isn’t clear because sometimes methods really do intend to return self but you can’t tell if this is the case since the return isn’t written explicitly. Personally, I used the convention that if I wanted a method to return a value I would always explicitly return it, even self. If I didn’t want the method to return a meaningful value, I would return nil.

In neptune, we wanted a combo solution: all methods should return a value but we also wanted to avoid the problems from smalltalk and we wanted the code to look like C. The solution was to make void into a real type and introduce a void value. All methods that don’t explicitly return a value automatically return the void value, and a return statement without an explicit value also returns the void value. The void value itself is a full, real object just like null:

void get_void_value() {
return;
}

var void_value = get_void_value();
System.out.println(void_value.to_string()); // prints "void"

This model gives you almost the same behavior as C, Java and co. but for different reasons. If you declare that a method returns void the type system will warn you if you return a value of another type, since the returned value is not a subtype of void. If a method is declared to return something other than void but don’t return a value, the type check fails because a method that doesn’t explicitly return is made to return the void value which is not a subtype of the return type. Finally, if the void value ever turns up as the value of a variable or somewhere else where you don’t expect it you’ll know that there is a problem.

This model does give a little more flexibility, some of it is probably useless (but you never know) and some of it is slightly useful. First of all, since void is a regular type you can declare variables of type void. I don’t know why you would do that but you can. Secondly, it is legal to explicitly return a value from a void method if that value is the void value. That might not sound useful but it can be since it means that you can call another method and return in the same statement:

void add(var obj) {
if (delegate) return target.add(obj);
...
}

In java, you can’t call a void method and return in the same statement so you would have to write

void add(Object obj) {
if (delegate) {
target.add(obj);
return;
}
...
}

It’s a small thing but it does give greater uniformity. By the way, I believe you can actually do the same thing in C++.

Of course, having a dynamic representation of the “nothing” value is not a new thing, many languages have that. My experience with it, at least in the context of this kind of language, is that it works very well in practice and better than any of the alternatives I’ve tried.

Protocol literals

The last thing is sort of an oddball. Sometimes it can be practical to represent a type as a run-time value. For instance, in our unit test framework we have a method that asserts that running a particular piece of code throws a particular exception

assert_throws(IOException.protocol, fun -> file.open());

The expression IOException.protocol gives you an object that represents the protocol IOException. A protocol object can be used for exactly one thing: to test whether other objects implement the protocol it represents:

Protocol p = String.protocol;
var x = "Beige";
if (p.is_supported_by(x)) ...

That may not sound very exciting but it can be useful in combination with other abstractions that are built on protocol objects, for instance exception handling. In a post a long time ago I mentioned that in neptune, an exception handler such as this:

try {
...
} catch (e: IOException) {
...
}

is expanded into this:

fun {
...
}.try_catch(
new Vector { IOException.protocol },
fun (Exception e) {
...
}
)

Exception handling is based on the try_catch method which uses a list of protocol literals to decide whether or not it catches a particular exception. The fact that you can call this method directly means that you can make exception handlers that are not statically bound to a particular exception type but where it can be determined dynamically which exceptions to catch. This is how assert_throws is implemented:


void assert_throws(Protocol type, fun f) {
fun {
f();
fail_not_thrown(type);
}.try_catch(new Vector { type }, fun {
// Ignore exceptions of the right type
});
}

It’s not something you use often but on those rare occasions where you do use it it’s really handy.

Types

The last big thing in neptune that I haven’t written about is the static type system. We’re not actually 100% done with it yet — the design is mostly done but there is still a few late additions that we seem to agree are good ideas but that we haven’t implemented yet. I think we’re far enough, though, that I can give a pretty accurate overview.

The way the type system in neptune came about is kind of unusual. Neptune descends from smalltalk which is not statically typed, but an important goal in the design of the language was to make it look familiar to C and C++ programmers. If you look at these two versions of the same program:

int gcd(int a, int b) {
while (a != 0) {
int c = a;
a = b % a;
b = c;
}
return b;
}

and

gcd(a, b) {
while (a != 0) {
var c = a;
a = b % a;
b = c;
}
return b;
}

the first program looks just like C (because that’s what it is) and the second one looks sort of C-ish but also pretty different. The only difference between the two programs is, of course, that one has type annotations and the other one doesn’t. When we set out to design neptune there was really no question: if we wanted it to look like C, we had to have a static type system. That is not to say that this was the only reason to have a static type system, there are plenty of very good reasons for that, but it is the reason why not having a type system was never an option. Besides the syntax, the most important reasons were that it allows you to write documentation that can be understood and checked by the compiler which again enables much better tool support. The most important non-reasons were optimization security — they were basically irrelevant in the design.

So, what’s the neptune type system like? Well, first of all it’s an optional type system. This means that you are never required to specify a type for anything. For instance, the two gcd programs above are both legal neptune code1. Basically, we want the type system to be there as a service that you can use if you want to but which doesn’t impose itself on you. From a “genealogical” viewpoint, it is closely related to the Strongtalk type system for smalltalk.

The examples above demonstrate the type syntax. If you use type annotations the syntax is the same as in the C language family. If you don’t use type annotations you either just leave the type out, as with return types and parameters, or use the keyword var, as with local variables. In some cases, for instance method parameters, you can do both.

The type system is based on the concept of protocols, which is smalltalk for what most people know as “interfaces”. A protocol is simply a collection of methods:

protocol SimpleOutputStream {
void write(int b);
void flush();
void close();
}

Unlike most statically typed object-oriented languages, including Java and C#, neptune does not have class types. Class types are a Bad Thing because they not only specify the interface of an object but also dictate how the object must be implemented, breaking encapsulation.

There are two ways to define a protocol. One way is to write it directly, as the SimpleOutputStream example above. Also, whenever you define a class you automatically get a protocol with the same name. That might sound confusing but in practice it isn’t. Protocols work much the same way as interfaces, and protocols defined explicitly work exactly the same way as protocols defined by classes. For instance, you are free to the protocol of a class without subclassing it:

class MyString is String {
...
}

In this declaration, “is String” means that MyString implements the protocol of String, not that it extends String — the superclass of MyString is in this case Object. To create a subclass of String you would write

class MyString: String {
...
}

A class can extend one other class but implement an arbitrary number of protocols.

If you use String as the type of a variable it’s usually safe to think of it as if the variable can contain instances of String. But that’s is not accurate in general: the variable can actually contain any object that implements String‘s protocol. Neptune has no way to express that a variable can only contain instances of a particular class.

Programs can dynamically test whether or not an object supports a particular protocol by using the is operator:

if (obj is String) ...

One of the things we haven’t implemented yet but seem to agree is a good idea is to combine this with a limited form of dependent types. In most other languages, for instance Java, the compiler doesn’t “remember” instanceof checks:

if (obj instanceof String) {
label.setText(obj); // illegal: obj not String
}

The compiler doesn’t recognize that you only enter the body of the if if obj is a String — instead, you have to insert a cast:

if (obj instanceof String) {
String str = (String) obj;
label.setText(str);
}

In neptune, we plan to allow the compiler to automatically remember extra type information about local variables after is checks, something that will in many cases make casts unnecessary:

if (obj is String) {
label.setText(obj); // okay, obj is a String
}

In a language like Java that might make programs harder to understand, since the runtime behavior of the program can be affected if the compiler gains more type information. In neptune all it does is avoid unnecessary type warnings, since the runtime behavior of a program is not affected by static types.

This leads me to another aspect: type warnings. In neptune, all problems detected by the type checker are reported as warnings, not errors. It doesn’t matter how wrong you get the types, the program will still compile and run. Well, except for one case: you can’t mix function objects (blocks) and other objects, you have to get the types right there. That is because the compiler needs to be able to track that no references exist to a block after the method that created it returns — otherwise the runtime will die horribly. The philosophy of when to give warnings and errors is simple: we only give errors in a program if we can’t give meaning to the program, otherwise we give warnings. You won’t, for instance, see errors on unreachable code or other such nonsense. You can configure the IDE to give errors instead of warnings but that’s not the default.

Neptune’s static type system is “weak”. Even if x has type int there is no guarantee that it can only contain integers:

var my_string = "blah";
int x = my_string;
int y = x + 18;

In the second line an object of an unknown type is assigned to an integer-typed variable. That is perfectly legal and does not cause a warning so this code will compile without issuing any errors or warnings but will probably fail at runtime in the third line when adding an integer to a string. That doesn’t mean that you can use this to break the runtime. The runtime is strongly typed so no matter how wrong you get the static types it will survive. You might ask why we don’t add runtime checks to catch these situations but that is potentially expensive and it’s probably not worth the trouble. We could also add warnings if we’re not sure that an assignment is safe, but that would generate warnings whenever untyped code called typed code, which goes against the principle that the type system should never impose itself on you if you decide not to use it. Personally, this has never been a problem for me, not even once.

The other thing we’ve discussed and seem to agree upon but haven’t implemented yet are nullable types. By default all types contain the null value:

String str = null; // legal

But often a value is not actually supposed to be null. Nullable types allow you to express this:

String! str = null; // illegal
String? str = null; // legal
String str = null; // legal

The type String! does not allow null whereas String? does. The type String also allows null but is different from String? in that Strings are completely unchecked whereas using a String? in place of a String! without a null check causes a warning.

Ok, I think that’s enough about types for now. There are still interesting details about the type system that I haven’t written about — protocol objects, the void type, etc. — but that will have to wait.


1In case you were wondering, int is not a basic type in neptune — neptune has no basic types. It is just a convenient shorthand for the type Integer.