A linear basis for Swanson

Brent Kerby calls out a couple of linear bases for a concatenative language.

The two-element linear concatenative basis

The three-element linear concatenative basis

How would that look in Swanson?

We really do need stack values

To start, I think we finally resolve the question of whether we model stack values using quotations, by having them be a separate entity in the formalism. And the Kerby instructions would only operate on stack values.

Stack values can be quotations

Stack values cannot be quotations

Choosing which basis

Which basis to use? Pros and cons analysis:

  • The main benefit of the two-element basis is that it would have fewer core instructions.
  • But ‘dig’ and ‘bury’ seem like they'll be very common instructions, and they both have nice intuitive definitions using ‘take’ and ‘i’. And three core instructions is not that much more than two.

Given that, at least for now, I'm going to go with the three-element basis ‘take’, ‘cat’, and ‘i’. In Kerby's post, the name ‘i’ is chosen because it's the equivalent of the ‘I’ combinator from the classical combinatory SKI calculus, and he made sure that every concatenative combinator had the same name as its classical counterpart. In S₀, I'm going to call the instruction ‘unpack’, since that seems more descriptive of what it does.

Defining the structure of stacks

v ≜ [ξ] | ...
ξ ≜ ε | v | ξ⋅ξ

A value ‘v’ is a stack value, or any of the other kinds of values we care about (described elsewhere).

A stack ‘ξ’ is either the “empty stack” ε, a “singleton stack” containing a single value, or is the concatenation of two stacks.

An alternative definition would be ε | ξ⋅v, only allowing values to be added/removed from the right side of the stack. I think I need the ξ⋅ξ definition to make the instructions work, because we don't know the size of the stack values being consumed. I worry that will make the eventual type system more difficult though, because of how [Kutsia2007] shows that sequence unification is unitary if “sequence variables occur only in the last argument positions in terms”.

[Kutsia2007] Solving equations with sequence variables and sequence functions

Definitions of the primitive instructions

I'm going to treat the empty stack value separate from the instruction that pushes a new empty stack value onto the stack. The empty stack value is ‘[]’, and the instruction is ‘nil’.

⟦take⟧   ξ⋅ v₁ ⋅[ξ₀] ≜ ξ⋅[ξ₀⋅v₁]
⟦cat⟧    ξ⋅[ξ₁]⋅[ξ₀] ≜ ξ⋅[ξ₁⋅ξ₀]
⟦unpack⟧ ξ     ⋅[ξ₀] ≜ ξ⋅ξ₀
⟦nil⟧    ξ           ≜ ξ⋅[]

Note that ‘take’ requires top-of-stack to be a stack value, but the value below that can be anything.