dcreager.net

Slip and slurp

Swanson uses a linear, continuation-passing, concatenative language called S₀ as its assembly language. (As of right now, at least—I've gone back and forth more times than I'd like to count on most of those!)

There shouldn't be any names in S₀

Continuation-passing S₀: The return

Concatenative languages look backwards

In a typical concatenative language, programs look “backwards”, since to invoke a quotation, you have to push the inputs onto the stack first, then the quotation, and then you invoke some kind of apply instruction. Take this example from the Factor documentation for comparing two numbers:

USING: io kernel math ;
10 3 < [ "Math is broken" print ] [ "Math is good" print ] if
Math is good

if [Factor documentation]

We first evaluate the conditional, then push quotations for the then and else clauses, and then we apply the if word.

But the program is only backwards for individual calls. Higher-level logic—the sequencing of calls you want to make—still appear in “forwards” program order:

USING: io kernel math ;
10 3 < [ "Math is broken" print ] [ "Math is good" print ] if
20 7 > [ "Math is still good" print ] [ "Math is still broken" print ] if
Math is good
Math is still good

Let's make it worse

Just as a warning, this example is going to look MUCH WORSE in S₀! Here it is:

                               # ¹uint_constants ³print
{                              # ⁰ten ¹uint_constants ³print
  {                            # ⁰ten three ¹uint_constants ³print
    {                          # ⁰is_lt ¹uint_constants ³print
      ₀to₁                     # ¹uint_constants is_lt ³print
      &true {                  # ¹uint_constants ³print
        {                      # ³print
          ("Math is broken")₁  # ¹message ³print
          :print₃
        }                      # ¹uint_constants ³print κ
        :drop₁
      }
      &false {                 # ¹uint_constants ³print
        {                      # ³print
          ("Math is good")₁    # ¹message ³print
          :print₃
        }                      # ¹uint_constants ³print κ
        :drop₁
      }                        # ¹uint_constants is_lt ³print κ
      :evaluate₁
    }                          # ⁰ten three ¹uint_constants ³print κ
    :lt₀
  }                            # ⁰ten ¹uint_constants ³print κ
  :three₁
}                              # ¹uint_constants ³print κ
:ten₁

So everything is backwards, just like in the Factor example. But it's also somehow “inside out”, with worse callback hell than anything you'd see in JavaScript!

This isn't an S₀ reference, so I'm glossing over some details, but to help you follow along:

  • There are four stacks, numbered 0 through 3.
  • Comments start with #, and I'm using them to show the contents of the stacks after the instruction on the line has been executed.
  • Each instruction has a subscript indicating which stack it operates on.
  • Most things on the stack are “invokables”, which are like quotations with multiple possible branches. You invoke them with a :foo instruction, which invokes the “foo” branch of the invokable at top of stack.
  • A quotation is an invokable whose branches consist of S₀ code, appearing within braces. &foo gives the name of a quotation branch. If you don't provide one, the branch has an empty name.
  • There are also primitives (such as print and uint_constants) that are provided by the host. You can't tell (and don't need to know) whether an invokable is a quotation or a primitive—you invoke them the same way.
  • ("text") pushes a binary literal onto the stack.
  • There are no other literals! To get numeric constants, you have to invoke one of the branches of the uint_constants primitive.
  • Everything is linear! Invoking an invokable consumes it. But many primitives are “magic”, in that they have a way to put themselves back on the stack so you can use them again.
  • Continuation-passing style! Each branch must contain precisely one invocation, and it must be the last instruction in the branch. You have to push a “continuation” onto the stack—typically called κ in my stack comments and typically going onto stack 3—and most invocations will pass on control to their continuation parameter as the final thing they do.
  • evaluate is our equivalent to Factor's if. It is a method (branch) of boolean values directly, instead of being a word that takes in the conditional as a parameter.
  • Invocations are linear, which means you cannot drop them. That's why they have multiple branches—evaluate wants to invoke either the “then” branch or the “else” branch. So we pass it one quotation that has two branches. evaluate is obligated to invoke the quotation exactly once, but it can choose which branch to invoke based on the value of the conditional.

So, what can we do to make this less painful? I recently introduced two new operators that I call “slip” and “slurp”, which make this look much nicer.

Importantly, both of these operators are purely syntactic sugar. You can apply their transformations to the instruction stream at parse time; they are not part of S₀'s AST or runtime model or anything like that.

Slip implemented at parse time [git.sr.ht]

Slip

First up is slip. A slip precedes an instruction, and consists of 1 or more >s. Its job is to “undo” the backwardsness of having to push parameters before invoking things. You'll typically use slip with an invocation, and your intuition should be that the number of > tells you the number of parameters (including any continuation!) that you want to pass to the invocation.

How it works is that you parse the slip, then parse an instruction (the “slipped instruction”) and put it off to the side. Then you parse one instruction for each > in the slip, adding them to the parsed instruction stream as usual. Lastly you add the slipped instruction to the instruction stream.

(Or put another way, slipping an instruction means “go ahead and parse the next instruction, but don't add it to the parsed result until after you've parsed N other instructions”.)

Looking at a small snippet of our example, using slip we could transform

("Math is good")₁ :print₃

into

>:print₃ ("Math is good")₁

And stepping one level out, we could then transform

{
  >:print₃
  ("Math is good")₁
}
:drop₁

into

>:drop₁
{
  >:print₃
  ("Math is good")₁
}

(Remember, the continuation quotation is a single instruction, and the :drop invocation gets slipped past the whole thing.)

And as one last example, we can slip over multiple instructions, transforming

₀to₁
&true {
  >:drop₁
  {
    >:print₃
    ("Math is broken")₁
  }
}
&false {
  >:drop₁
  {
    >:print₃
    ("Math is good")₁
  }
}
:evaluate₁

into

>>:evaluate₁
₀to₁
&true {
  >:drop₁
  {
    >:print₃
    ("Math is broken")₁
  }
}
&false {
  >:drop₁
  {
    >:print₃
    ("Math is good")₁
  }
}

(There are two >s, so we skip over two instructions: ‘₀to₁’ and the multi-branch continuation quotation.)

Slurp

The other new operator is “slurp”. Slurp is signified by a ~, and lets you write out a quotation branch without introducing another level of braces. You replace the opening brace with the tilde, and leave out the closing brace.

For instance, we can transform

>:drop₁
{
  >:print₃
  ("Math is good")₁
}

into

>:drop₁ ~
>:print₃
("Math is good")₁

Slurp turns out to be most useful when you're already using slip, and when invocations expect to receive their continuations as their “last” input. Together the two let us turn our original S₀ example into

                                # ¹uint_constants ³print
>:ten₁ ~                        # ⁰ten ¹uint_constants ³print
>:three₁ ~                      # ⁰ten three ¹uint_constants ³print
>:lt₀ ~                         # ⁰is_lt ¹uint_constants ³print
>>:evaluate₁
₀to₁                            # ¹uint_constants is_lt ³print
&true {                         # ¹uint_constants ³print
  >:drop₁ ~                     # ³print
  >:print₃ ("Math is broken")₁
}
&false {                        # ¹uint_constants ³print
  >:drop₁ ~                     # ³print
  >:print₃ ("Math is good")₁
}

There's still extra complexity relative to the Factor example, since everything is linear, and because we don't have integer literals built into the language. But it's much nicer with slip and slurp than without!

Swanson

Concatenative programming languages