Ideas for making occam-pi's syntax more pleasant.

Something to do with nanopasses

2004-10-11 11:14; in Adam's PhD Stuff, occam syntax, Education; 139 words

Matt also pointed me at "A nanopass infrastructure for compiler education" (if you have a Kent account, use chain to read that).

The idea of nanopasses strikes me as elegant because it'd be easy to write and test extensions to a language by inserting new passes. For instance, I sometimes find myself wanting a BREAK statement to immediately exit an occam loop. Since it's possible to rewrite:

WHILE continue.condition
  do.something ()
  IF
    exit.condition
      BREAK
    TRUE
      SKIP
  do.something.else ()

as:

INITIAL BOOL foo IS TRUE:
WHILE (continue.condition AND foo)
  do.something ()
  IF
    exit.condition
      foo := FALSE
    TRUE
      do.something.else ()

it would be fairly straightforward to drop in an extra pass to do that transformation automatically. You could implement CONTINUE and RETURN (with the Python semantics) the same way.

Doing this would offend the single-exit-point purists, of course, but they could always just disable that pass.
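As a rough illustration of what such a pass might look like, here is a minimal Python sketch that removes BREAK from a toy syntax tree. Everything about the representation is an assumption made for the example: statements are tuples such as ("while", cond, body) and ("if", branches), the flag name running.N is invented, and the INITIAL declaration is modelled as an ordinary statement placed before the loop. Unlike the hand-written version above, it guards the rest of the loop body with a second IF rather than moving it into the TRUE branch; the behaviour should be the same.

```python
import itertools

def lower_block(stmts, counter):
    """Apply the pass to every statement in a list."""
    return [new for s in stmts for new in lower(s, counter)]

def lower(stmt, counter):
    """Rewrite one statement so that its loops no longer contain BREAK."""
    kind = stmt[0]
    if kind == "while":
        _, cond, body = stmt
        flag = "running.%d" % next(counter)   # hypothetical fresh name
        new_body, broke = guard(body, flag, counter)
        if not broke:
            return [("while", cond, new_body)]
        return [("initial", flag, "TRUE"),
                ("while", ("and", cond, flag), new_body)]
    if kind == "if":
        return [("if", [(c, lower_block(b, counter)) for c, b in stmt[1]])]
    return [stmt]

def guard(stmts, flag, counter):
    """Rewrite a loop body; return (new statements, whether it can break)."""
    out = []
    for i, s in enumerate(stmts):
        if s[0] == "break":
            out.append(("assign", flag, "FALSE"))
            return out, True               # code after BREAK is unreachable
        if s[0] == "if":
            branches, broke = [], False
            for cond, body in s[1]:
                new_body, b = guard(body, flag, counter)
                branches.append((cond, new_body))
                broke = broke or b
            out.append(("if", branches))
            if broke:
                # Only run the rest of the body if the flag is still set.
                rest, _ = guard(stmts[i + 1:], flag, counter)
                if rest:
                    out.append(("if", [(flag, rest), ("TRUE", [("skip",)])]))
                return out, True
        else:
            out.extend(lower(s, counter))  # nested loops get their own flag
    return out, False

# The loop from the example above, in the toy representation.
example = [("while", "continue.condition", [
    ("call", "do.something"),
    ("if", [("exit.condition", [("break",)]),
            ("TRUE", [("skip",)])]),
    ("call", "do.something.else"),
])]
result = lower_block(example, itertools.count())
```

Because the pass only consumes and produces the same kind of tree, disabling it is just a matter of leaving it out of the pipeline.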

occam abstract syntax

2004-10-10 12:48; in Adam's PhD Stuff, occam syntax; 73 words

While chatting to Matt Jadud about my occam Python-style syntax plan, he suggested something that I'd thought about a while ago and then forgotten again back when I first considered alternate syntaxes: having an abstract syntax that the occam compiler takes as input. I was originally thinking in terms of XML, but Matt suggested S-expressions instead; they're very simple to manipulate in Lisp-family languages, and they're much easier for humans to deal with.
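To give a feel for how little machinery S-expressions need, here is a small Python sketch of a reader that turns one into nested lists. The node names used in the sample (while, seq, call, if) are made up for illustration; no actual abstract syntax for occam has been defined here.

```python
def tokenize(text):
    """Split an S-expression into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one S-expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)                # discard the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    return token                     # atoms are kept as plain strings

# A hypothetical encoding of a small WHILE loop.
source = "(while foo (seq (call bar) (if (condition (call baz)))))"
tree = parse(tokenize(source))
```

The result is an ordinary nested list, so later passes can pattern-match on tree[0] without any parser library.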

Python-style syntax for occam

2004-10-10 09:14; in Adam's PhD Stuff, occam syntax; 915 words

The most striking feature about occam's syntax for newcomers to the language is that it uses significant indentation: rather than delimiting code blocks with {} like C or Perl, the parser simply looks for changes in indentation level. Since good programmers indent their code anyway, this reduces redundancy and visual clutter.

occam isn't alone in this: the popular modern languages Python and Haskell are also indentation-based. Their syntaxes have some nice features that occam currently doesn't. I'd like to revise the occam syntax so that it's comparable to more recent indentation-based languages.

I don't want to change the semantics of the language at all; I'm just proposing an alternative syntax for the existing occam-pi language. The intention is to borrow the useful syntactical features from newer languages, and hopefully make occam look a bit less alien to new programmers at the same time. (I've added a wiki page for things that confuse new occam programmers.)

A few really useful features don't fit into this scheme very well at the moment: replicated IFs, extended channel inputs, VALOF expressions. I'll need to think some more about how these could be represented.

I'd also like to come up with an example bit of occam code that uses all the features here, and mutate it as I work through the suggestions.

(One of the suggestions here was to add ASSERT to the language, until Fred pointed out that KRoC already has it!)

Flexible indentation

In occam, each indentation step must be two spaces.

WHILE foo
  SEQ
    bar ()
    IF
      condition
        baz ()

Python and Haskell don't care, provided you're consistent between lines in the same block. Python counts tabs as eight spaces; some people have suggested making it complain if you mix tabs and spaces.

WHILE foo
    SEQ
        bar ()
        IF
          condition
            baz ()

(Not that I'd actually want to indent code like that!)
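One way to handle flexible indentation is the stack-based approach Python uses: remember the width of each open block, and compare each new line against the top of the stack. The following Python sketch assumes spaces only (it rejects tabs rather than picking a width for them), and the INDENT/DEDENT event names are just labels chosen for the example.

```python
def block_events(source):
    """Return a flat list of "INDENT", "DEDENT" and stripped source lines."""
    stack = [0]                      # indentation widths of the open blocks
    events = []
    for raw in source.splitlines():
        if not raw.strip():
            continue                 # blank lines don't affect structure
        lead = raw[:len(raw) - len(raw.lstrip())]
        if "\t" in lead:
            raise SyntaxError("tabs in indentation are not supported here")
        width = len(lead)
        if width > stack[-1]:        # any deeper indentation opens a block
            stack.append(width)
            events.append("INDENT")
        while width < stack[-1]:     # close blocks until the widths match
            stack.pop()
            events.append("DEDENT")
        if width != stack[-1]:
            raise SyntaxError("dedent does not match any outer block")
        events.append(raw.strip())
    events.extend("DEDENT" for _ in stack[1:])   # close whatever is still open
    return events

# The unevenly indented example from above.
uneven = "\n".join([
    "WHILE foo",
    "    SEQ",
    "        bar ()",
    "        IF",
    "          condition",
    "            baz ()",
])
events = block_events(uneven)
```

With this scheme the two-space version of the same code produces exactly the same event list, so the rest of the parser never sees the difference.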

Lowercase keywords

These days, most languages don't make you SHOUT ALL THE TIME. Modern syntax-highlighting editors differentiate keywords by colour, so there's no particular need for them to be capitalised any more.

while foo
  seq
    bar ()
    if
      condition
        baz ()

Simpler IF syntax

The occam IF syntax is elegant for complicated stuff, but for simple usage it's a bit verbose:

IF
  condition
    do.something ()
  other.condition
    do.something.else ()
  TRUE
    SKIP

A Python-style syntax could write this as:

if condition:
    do_something()
elif other_condition:
    do_something_else()
else:
    skip

Perhaps the compiler could even insert the skip clause automatically when there's no else. (In occam, an IF in which no condition is true behaves like STOP; when you want that old behaviour, you can always add an else: stop clause yourself.)
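Going in the other direction is mechanical. This Python sketch renders an if/elif/else chain as a classic occam IF, adding the TRUE/SKIP clause when no else is given. The function name and its arguments are invented for the example, and it assumes each branch body is a single process written on one line.

```python
def render_if(branches, otherwise=None, indent=0):
    """Render (condition, process) pairs as an occam IF.

    otherwise is the process for the else clause; if it is omitted,
    a TRUE/SKIP clause is added so the IF can never act like STOP.
    """
    pad = " " * indent
    lines = [pad + "IF"]
    for cond, process in branches + [("TRUE", otherwise or "SKIP")]:
        lines.append(pad + "  " + cond)
        lines.append(pad + "    " + process)
    return "\n".join(lines)

text = render_if([("condition", "do.something ()"),
                  ("other.condition", "do.something.else ()")])
```

Passing otherwise="STOP" gives back the behaviour of an occam IF with no TRUE clause.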

Changing what colons mean

occam uses colons to indicate that a declaration is in force for the next block:

INT foo:
BOOL bar:
SEQ
  c ? foo
  do.something (c)

Python puts a colon at the end of any statement that introduces an indented block:

while foo:
    bar()
    if x == 3:
        print "foo"
    else:
        print "bar"

In neither case are the colons actually needed (the Ruby language is pretty similar but has neither, for instance), but I find the Python style to be a bit more readable.

Implicit SEQs

Current occam requires you to insert SEQ or PAR whenever you have multiple processes:

WHILE condition
  SEQ
    do.one.thing ()
    do.another.thing ()
    IF
      other.condition
        SEQ
          do.third.thing ()
          do.fourth.thing ()

A quick look at the occamnet code shows that I use about three times as many SEQs as I do PARs. It'd be possible to make occam code rather more compact by assuming that any set of multiple processes is wrapped in a SEQ unless it's already wrapped in a PAR. I suspect this'd be a controversial change, because it could result in programmers thinking less about opportunities to parallelise their code; we'd have to decide whether the increased readability makes it worthwhile.

WHILE condition
  do.one.thing ()
  do.another.thing ()
  IF
    other.condition
      do.third.thing ()
      do.fourth.thing ()

You'd still need to have SEQ in the language, so that you can do replicated SEQs or force an extra internal scope. It may also be necessary to come up with a different syntax for extended inputs, which need to have two code blocks specified (without an implicit SEQ).
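Turning implicit SEQs back into explicit ones could be another small pass. The Python sketch below works on a toy tuple-based tree (an assumption for the example, not a real compiler structure): wherever a block holds more than one process it wraps them in a SEQ, and it leaves the contents of an explicit SEQ or PAR as they are.

```python
def explicit_seq(block):
    """Turn a list of processes into one process, adding SEQ if needed."""
    procs = [expand(p) for p in block]
    return procs[0] if len(procs) == 1 else ("seq", procs)

def expand(proc):
    """Make the implicit SEQs inside a single process explicit."""
    kind = proc[0]
    if kind == "while":
        return ("while", proc[1], explicit_seq(proc[2]))
    if kind == "if":
        return ("if", [(cond, explicit_seq(body)) for cond, body in proc[1]])
    if kind in ("seq", "par"):
        # Already explicit: keep the wrapper, just process the children.
        return (kind, [expand(p) for p in proc[1]])
    return proc

# The SEQ-free loop from above, in the toy representation.
implicit = [("while", "condition", [
    ("call", "do.one.thing"),
    ("call", "do.another.thing"),
    ("if", [("other.condition", [("call", "do.third.thing"),
                                 ("call", "do.fourth.thing")])]),
])]
explicit = explicit_seq(implicit)
```

A block with a single process is left unwrapped, which matches how you'd write it by hand in current occam.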

Better syntax for INITIAL

occam-pi lets you initialise a variable when you declare it. However, initialising a variable currently looks more like a VAL declaration:

INT foo:
INITIAL INT bar IS 4:
VAL INT baz IS 5:

I'd prefer to have a syntax that looks more like a variable declaration, such as either of these:

INT foo IS 4:
INT foo := 4:

Underscores in variable names

occam, unlike pretty much every other programming language, allows dots in variable names.

INT my.string:

Many languages use underscores instead for the same purpose.

int my_string

Field access

occam uses array-like syntax to refer to fields in structures.

packet[ip.id] := 3

C-derived languages use dots.

packet.ip_id := 3

C-style assignment and equality operators

occam uses the same scheme as many 70s-era languages for setting and comparing variables.

num := 3
IF
  num = 3
    print ("occam works!")
  num <> 3
    print ("occam doesn*'t work!")
  TRUE
    print ("occam really doesn*'t work!")

Modern languages tend to use a set of operators derived from C instead:

num = 3
if num == 3:
    print("occam works!")
elif num != 3:
    print("occam doesn't work!")
else:
    print("occam really doesn't work!")
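Mapping the C-style operators back to occam's is mostly a lookup, with the one catch that == and != have to be matched before a bare =. Here is a Python sketch of that idea using a single regular expression. It is only an illustration: the function name is made up, and it assumes the text contains no string literals or comments, since it would happily rewrite operators inside those too.

```python
import re

# C-style operator -> occam operator. <=, >= and := map to themselves so
# that the "=" inside them is never rewritten on its own.
OPERATORS = {"==": "=", "!=": "<>", "=": ":=",
             "<=": "<=", ">=": ">=", ":=": ":="}

# Longer operators come first in the alternation so they win over "=".
_PATTERN = re.compile(r"==|!=|<=|>=|:=|=")

def to_occam_operators(text):
    """Rewrite C-style assignment and comparison operators as occam ones."""
    return _PATTERN.sub(lambda m: OPERATORS[m.group()], text)

converted = [to_occam_operators(line)
             for line in ["num = 3", "if num == 3:", "elif num != 3:"]]
```

A real implementation would do this on tokens rather than raw text, which avoids the string-literal problem entirely.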

C-style string escapes

occam (following Modula?) uses * to escape characters in string constants (and complains if you don't escape ' characters).

print ("Hello, world!*c*n")

Modern languages follow the C conventions.

print("Hello, world!\r\n")

It's also unusual to use \r\n as a line-ending sequence on the systems that occam's used on these days; I'd rather just be able to write \n. (As with C, you could have \n translated to the appropriate line-ending sequence on platforms that use \r or \r\n.)
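Translating between the two escape styles is straightforward. This Python sketch converts the body of a C-style string literal into occam's notation; the function name is invented, and it covers only a handful of escapes (\n, \r, \t, \\, \" and \'), raising an error for anything else.

```python
# What follows the backslash in C -> the occam spelling.
C_TO_OCCAM = {"n": "*n", "r": "*c", "t": "*t",
              "\\": "\\", '"': '*"', "'": "*'"}

def c_to_occam_string(body):
    """Convert the inside of a C-style string literal to occam escapes."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)      # the character being escaped
            if escaped not in C_TO_OCCAM:
                raise ValueError("unsupported escape: \\%s" % escaped)
            out.append(C_TO_OCCAM[escaped])
        elif ch in "*'\"":
            out.append("*" + ch)             # these need escaping in occam
        else:
            out.append(ch)
    return "".join(out)

hello = c_to_occam_string("Hello, world!\\r\\n")
```

Note that a literal * in the C string has to become ** on the occam side, since * is occam's escape character.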

Contact: <ats@offog.org>

Copyright © 2004 Adam Sampson