while
and sa-decl
While
In this approach, a practical language is translated into a kernel language that consists of a small number of programmer-significant elements. The rich set of abstractions and syntax is encoded into the small kernel language. This gives both programmer and student a clear insight into what the language does.
— Concepts, Techniques, and Models of Computer Programming
Some of these kernel languages will be proper subsets of the languages we use in the course (especially Oz), but some will be theoretical languages.
In midterms and exams, you will generally not be expected to have memorised the syntax of such theoretical languages.
In fact (unless it is announced otherwise before the tests), in questions regarding these languages we will provide you (the relevant portions of) the CFG defining the language syntax.
For a kernel language to be “useful”, we must convince ourselves that it includes
Almost any general purpose language serves as an example that violates the second condition.
An example of a language which violates the first is the “Goto” language. In this language,
⟨stmt⟩ ∷= [label] ⟨command⟩
| ⟨stmt⟩ ; ⟨stmt⟩
⟨command⟩ ∷= skip
| var ++
| var --
| if-zero var goto label
This language is useful for studying computability, but not so much for studying language design.
There are two ways we will extend our kernel languages:
Syntactic sugar
Whereas a linguistic abstraction
for
loops to a language with only while
loops.For simplicity's sake, we will (for the moment) restrict our attention to languages with
and we will not assume any type checking.
The abstract syntax of these expressions is given by the grammar
⟨expr⟩ ∷= ⟨bexpr⟩ | ⟨iexpr⟩
⟨bexpr⟩ ∷= true | false
| ⟨expr⟩ == ⟨expr⟩ | ⟨expr⟩ \= ⟨expr⟩
| ⟨expr⟩ =< ⟨expr⟩ | ⟨expr⟩ < ⟨expr⟩
| ⟨expr⟩ >= ⟨expr⟩ | ⟨expr⟩ > ⟨expr⟩
⟨iexpr⟩ ∷= number | var
| ⟨iexpr⟩ + ⟨iexpr⟩ | ⟨iexpr⟩ - ⟨iexpr⟩
| ⟨iexpr⟩ * ⟨iexpr⟩ | ⟨iexpr⟩ div ⟨iexpr⟩ | ⟨iexpr⟩ mod ⟨iexpr⟩
We will later introduce more basic types than just integer and boolean (as well as discussing type checking).
So we call this simple language of expressions Expr₀.
We are going to define two kernel languages; one imperative, and one declarative.
Before we get to the statements of the languages, we need to explain the memory model underlying each.
We will call these two memory models:
We begin though by discussing a simpler model, which we call explicit state without reference types.
A binding
In the simplest memory models, values are simply assigned to variables; memory is a mapping between variable names and values.
In general, things are more complex, and it is useful to distinguish the terms
We will generally denote the contents of memory using
the symbols σ
(sigma) and τ
(tau), possibly with
subscripts (σ₁
, τ₂
, etc.) or primes (σ′
, τ‴
, etc.).
We use notation similar to substitution to update memory states.
σ[x ≔ v]
is read as σ
with x
updated to be v
.
We will also need to discuss an environment
for variables and procedures/functions;
we will use Σ
(Sigma), adorned with subscripts or primes,
to denote environments.
In the explicit state without reference types model, we can view
So a state σ is a mapping between identifiers and values.
σ : Identifier → Value
This is a nice, simple mental model.
If σ
is this state, the update σ[x ≔ v]
changes the
arrow coming from x
to instead point to value v
.
When we add the concept of references or addresses to our memory model, we must update our notion of state; we now view
The addition of references to our model provides the abstractions
In this memory configuration, x
is a constant, y
is a (simple) variable,
and z
is a reference variable.
Assignment changes the arrows leaving references; the arrows leaving identifiers are fixed.
So if this state is σ
, σ[r₁ ≔ v]
would be a new state where
r₁
points to v
instead of 5
.
In a single-assignment store, variables can be bound to values at most once.
We consider a single-assignment store model which allows partial bindings.
In this memory configuration,
x
is (totally) bound to the value 5
,y
is (partially) bound to the value 2 + z
,z
and z′
are bound together, but neither is bound to a value.While
The basic operation in imperative languages is the assignment. To make a Turing-complete language, we additionally need sequencing and some control structures.
while
loop.As mentioned, most imperative languages include some ability to work with references to memory.
To introduce reference types to our language, we extend Expr₀ to Expr₀′, adding
⟨expr⟩ ∷= ⟨rexpr⟩
⟨rexpr⟩ ∷= & var | var
⟨expr⟩ ∷= ! var
&
operation obtains the reference to a variable.!
operation dereferences a reference variable,
returning the value stored at the reference.
var
is not of reference type, this results in a type error.
*
.while
So let us define a kernel for imperative languages
based on the while
loop and an if-then-else
.
We'll call this language While₀.
⟨stmt⟩ ∷=
skip
| var ≔ ⟨expr⟩
| ⟨stmt⟩ ⟨stmt⟩
| if ⟨bexpr⟩ then ⟨stmt⟩ else ⟨stmt⟩
| while ⟨bexpr⟩ do ⟨stmt⟩
Note that:
;
is used to sequence instructions, but this is
abstract syntax, so we omit it.
end
marker
for the body of if
's and while
loops.then
, else
and do
keywords,
but choose to keep them for clarity.≔
rather than =
.While₀ is a sufficient language in many ways, but it is missing (at least) two key abstractions.
Since we can encode subroutines as a linguistic abstraction, we do not address the first shortcoming, at least for the moment.
However, to make the kernel language useful, we must address the second shortcoming.
Most [imperative] programming languages have, among others, five constructs: assignment, variable declaration, sequence, test and loop. These constructs form the imperative core of the language.
— Principles of Programming Languages (Dowek)
We add the “do nothing” command skip
to this list of constructs
to obtain our language While.
⟨stmt⟩ ∷=
skip
| local var in ⟨stmt⟩
| var ≔ ⟨expr⟩
| ⟨stmt⟩ ⟨stmt⟩
| if ⟨bexpr⟩ then ⟨stmt⟩ else ⟨stmt⟩
| while ⟨bexpr⟩ do ⟨stmt⟩
In the kernel language approach,
The syntax of While we have given
For interest, let us investigate this embedding with a language we are familiar with:
Starting with expressions,
&
, we embed as the method object_id
.
& x
≈ x.object_id
!
, we embed as the function
ObjectSpace._id2ref
.
! x
≈ ObjectSpace._id2ref(x)
Considering each type of statement of While:
skip
skip
.local var in ⟨stmt⟩
var = nil; s
where s
is the embedding of the sub-statement.var = expr
⟨stmt⟩ ⟨stmt⟩
if ⟨bexpr⟩ then ⟨stmt⟩ else ⟨stmt⟩
end
after the second statement.while ⟨bexpr⟩ do ⟨stmt⟩
end
after the statement.By our embedding, the While program
local x in local y in x = 5 y = ! & x while y > 0 do y = y - 1
is embedded as
x = nil ; y = nil ; x = 5 ; y = ObjectSpace._id2ref(x.object_id) ;
while y > 0 do y = y - 1 end
The second kernel language we consider
⟨stmt⟩ ∷=
skip // Empty statement
| ⟨stmt⟩ ⟨stmt⟩ // Sequence
| local ⟨var⟩ in ⟨stmt⟩ end // Variable creation
| ⟨var⟩ = ⟨var⟩ // Binding
| ⟨var⟩ = ⟨value⟩ // Value creation
| if ⟨var⟩ then ⟨stmt⟩ else ⟨stmt⟩ end // Conditional
| case ⟨var⟩ of ⟨pattern⟩ then ⟨s⟩ else ⟨s⟩ end // Pattern match
| `{` {⟨var⟩}⁺ `}` // Procedure application
We will continue working with these kernel languages, beginning by
We will also