.sh 1 "Discussion"
.pp
SR lies between recent languages in which
a distributed program appears to execute on one virtual machine
(e.g., Ada, Linda [Gele85], and NIL [Parr83, Stro83])
and more conventional languages in which a distributed system is built from
distinct programs, one per machine.
In SR, the invoker of an operation need not be concerned
with where that operation is serviced,
but mechanisms are provided to enable
the programmer to exert some control over a program's
execution environment.
For example, the programmer can control where a resource is
created and can determine whether a resource or machine has failed.
In this respect, SR is similar to the V kernel [Cher84].
However, SR and the V kernel take quite different approaches
to constructing distributed programs.
SR is a strongly typed language with an integrated set
of mechanisms for sequential and distributed programming;
the V kernel is a type-less collection of message-passing
primitives that are invoked from a sequential language such as C.
The V kernel has been designed with efficiency as the
most important criterion; SR has been designed to balance
expressiveness, ease of use, and efficiency.
.pp
The remainder of this section discusses the most important
aspects of SR and relates them to other approaches
to distributed programming.
.sh 2 "Integration of Language Constructs"
.pp
There is a strong similarity between the sequential
and concurrent mechanisms in SR.
For example,
the %if, %do, and %in statements have similar appearances
and the same underlying non-deterministic semantics.
The %fa, %co, and %in statements all use quantifiers
to specify repetition.
Finally, the %exit and %next statements
are interpreted uniformly within iterative statements
(%do and %fa) and the %co statement.
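.pp
The shared non-deterministic semantics of these statements can be
sketched outside SR.
The following Python fragment is our own illustration (it is not part
of SR and the names are hypothetical): a selection is made at random
among exactly those guards whose conditions are true, in the style of
%if and %in.

```python
import random

def guarded_select(guards):
    """Evaluate (condition, action) pairs and execute one action
    whose condition holds, chosen non-deterministically."""
    ready = [action for cond, action in guards if cond()]
    if not ready:
        return None                   # no guard is true
    return random.choice(ready)()     # non-deterministic choice

# Both guards below are true, so either branch may be taken.
x = 4
result = guarded_select([
    (lambda: x % 2 == 0, lambda: "even"),
    (lambda: x >= 0,     lambda: "non-negative"),
])
```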
.pp
CSP [Hoar85] has a similar integration of mechanisms.
By way of contrast, Ada [Ada83] provides distinct mechanisms
for sequential and concurrent programming.
In Ada, tasks are used for concurrent modules, but
packages are used for sequential modules; also,
%select is used for selecting alternative entries, but
%if-%then-%else is used for selecting alternative conditions.
As a specific example, consider how one would program
a #Queue package and #BoundedBuffer task in Ada,
and compare them to our #Queue and #BoundedBuffer resources.
In Ada, the differences are marked.
In SR, the differences are minimal and in fact the interfaces
to the two resources are identical.
A similar difference between the sequential and concurrent
components of a language results whenever an existing sequential
language is extended with concurrency constructs
(e.g., Concurrent C [Geha86]).
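.pp
The contrast can be made concrete outside either language.
In the Python sketch below (our illustration, not drawn from the SR or
Ada examples themselves), a sequential queue and a concurrent bounded
buffer present the identical #insert/#remove interface; only the
internals differ, the concurrent version adding blocking
synchronization.

```python
import threading

class Queue:
    """Sequential queue: no synchronization."""
    def __init__(self, size):
        self.items = []
        self.size = size
    def insert(self, x):
        self.items.append(x)
    def remove(self):
        return self.items.pop(0)

class BoundedBuffer:
    """Concurrent buffer: same interface, adds blocking synchronization."""
    def __init__(self, size):
        self.items = []
        self.size = size
        self.cond = threading.Condition()
    def insert(self, x):
        with self.cond:
            while len(self.items) == self.size:
                self.cond.wait()      # block while full
            self.items.append(x)
            self.cond.notify_all()
    def remove(self):
        with self.cond:
            while not self.items:
                self.cond.wait()      # block while empty
            x = self.items.pop(0)
            self.cond.notify_all()
            return x
```

A client written against one class works unchanged against the other,
which is the property the identical SR resource interfaces provide.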
.pp
The SR mechanisms for communication and synchronization
are also well integrated.
Operations support all of local and remote procedure call,
rendezvous, dynamic process creation,
asynchronous message passing, multicast, and semaphores.
In addition,
there is just one way\(emcapabilities\(emin which an
operation is named in an invocation.
Moreover,
capabilities are used for
virtual machines, entire resources, and individual operations\(em\
another example of similar mechanisms for similar concepts\(em\
and are first class objects in the language.
Such integration and flexibility are not achieved in languages
like Ada and EPL [Blac84]
where many mechanisms,
each having special rules and restrictions,
are used to achieve the same effects that are realizable
with just a few SR mechanisms.
.pp
The integration of the various language mechanisms
plus the almost total lack of restrictions
make SR easy to learn and use.
Students who have used SR for term projects were
able to learn the language and design and code their
projects in about two weeks.
Although these projects were of modest size,
they supported multi-user interactions and used
most of the language features that would
be used in ``real'' concurrent programs.
These features\(emresource creation/destruction,
operations, capabilities, invocations\(emcaused the students
few conceptual difficulties.
.sh 2 "Resources"
.pp
The structure of the %resource construct
is similar to that of modular constructs
in procedure-based languages,
such as Euclid [Lamp77] and Modula-2,
and other distributed programming languages,
such as Distributed Processes (DP) [Brin78],
StarMod [Cook80], Argus [Lisk83], and EPL.
Like Modula-2 and Ada, SR allows the specification
of a resource to be compiled separately from its body.
This permits the interface to a resource to be separated
from its implementation.
It also permits construction of
programs, such as the file system in Sec. 3.2, in which resources
invoke each other's operations.
.pp
SR goes beyond the above languages in two ways.
First, a resource body can be parameterized.
This permits instances to have different internal
characteristics and external communication connections.
In this respect, SR is more like LYNX [Scot87].
Second, SR includes an inheritance mechanism,
the %extend phrase, that permits an interface to be
split into multiple parts and supports multiple implementations
of the same abstract interface.
In this respect, SR is more like Mesa [Mitc79] and Emerald [Blac87].
As with other aspects of the language, we have tried to
provide integrated mechanisms that support
functionality that has been found to be useful.
.pp
Resources provide the only data-abstraction mechanism in SR.
They are used to program sequential ``abstract data types''
such as #Queue as well
as concurrent data types such as #BoundedBuffer.
Having just one abstraction mechanism makes the language smaller
and hence easier to learn
than if two separate mechanisms
were provided, one for sequential types and one for
concurrent types.
There is a disadvantage though:  the implementation
of sequential types is not as efficient as it might be
since a resource that implements a type might be located
on a different  machine than its clients.
We are able to perform some optimizations when a resource and its
clients are located on the same (virtual) machine, but not as many
as would be possible if ``sequential'' resources
were distinguished as such in the language and were forced
to be located on the same machine as their clients.
A second potential shortcoming of resources is that they
are not polymorphic:  they may not have
types as parameters.
We have not, however, found many situations in which
a generic resource facility would justify its large
implementation cost.
.pp
We do not allow resources to be nested,
primarily because nesting is not needed.
If one resource needs the services provided by another,
it can either create an instance of the needed resource
or be passed a capability for it.
Precluding nesting also simplifies the implementation.
One disadvantage, though,
is that different resources cannot share variables,
although pointers can be passed between resources on
the same virtual machine.
We also do not allow processes to be nested, for essentially
the same reasons.
In contrast,
Ada allows arbitrary nesting of tasks, packages, and subprograms.
This makes the implementation of Ada
much more complicated and costly,
and makes many programs more difficult to understand [Clar80].
.pp
A resource can contain initialization and finalization code.
Initialization code gives the programmer a way to
control the order in which initialization is done; e.g.,
the programmer can ensure that resource variables are
initialized before processes are created.
Initialization code is executed as a process so it
can use any of the language mechanisms (another instance
of our aversion to imposing restrictions).
For example, initialization code can service operations,
create other resources, or do whatever else might be required.
.pp
Finalization code provides a means by which a resource
can ``clean up'' before it disappears.
For example,
if a resource has obtained a lock for a file,
it can record that it owns the lock;
its finalization code can then release that lock if the resource
is ever destroyed.
Finalization code is executed as a process, again so it
can use any of the language mechanisms.
Our approach is similar to that in NIL.
A different approach is used in Ada.
When an Ada task is aborted,
it does not get control\(emit is just destroyed.\**
.(f
\**Task destruction may not be immediate; e.g.,
a task is allowed to complete
servicing a rendezvous before the task is destroyed.
.)f
Thus, in the above example, there is no way the task itself
can release the lock;
such a release can only be done by another task that is monitoring the
task that was aborted.
.pp
SR supports multiple active processes within each resource instance;
a separate, potentially concurrent
thread of control is associated with each %proc invocation.
This is similar to the approaches taken in Ada, EPL, Linda, and NIL.
A different approach is taken in DP and LYNX
where threads execute as coroutines.
We prefer our approach since an SR process corresponds
to the usual conceptual notion of a process.
Also, this approach admits a multiprocessor-based implementation
in which processes in the same resource might truly execute concurrently.
Finally, this approach accommodates immediate processing
of operations that service interrupts.
A drawback of having concurrent threads
is that processes must synchronize access to shared variables
to avoid race conditions.
However, how to do so is now well-understood and SR's operations
can be used to simulate semaphores in an efficient way.
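.pp
The semaphore simulation is direct: a parameterless operation serves
as the semaphore, an asynchronous %send acts as V, and receiving an
invocation acts as P.
A Python analogue of this simulation (our sketch, with a message queue
standing in for an SR operation) is:

```python
import queue

class OpSemaphore:
    """Semaphore built from message passing:
    send() is V, receive() is P."""
    def __init__(self, initial=0):
        self.op = queue.Queue()
        for _ in range(initial):
            self.op.put(None)    # initial permits as pending invocations
    def send(self):
        self.op.put(None)        # V: append an empty invocation
    def receive(self):
        self.op.get()            # P: block until an invocation arrives

s = OpSemaphore(initial=1)
s.receive()    # acquire the initial permit
s.send()       # release it
```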
.sh 2 "Operations"
.pp
Operations in SR can be invoked either synchronously (%call)
or asynchronously (%send).
Many other languages (e.g., Ada and CSP)
provide only synchronous message passing.
While this is very useful, especially for programming
client/server interactions, asynchronous message
passing is also useful.
First, it can be used to avoid #remote #delay in which a server,
in processing a request,
invokes an operation in another server that might delay [Lisk86].
In particular, %send can be used to invoke the remote operation
whenever it is necessary for the first server to honor
other requests in order to remove the conditions
that led to remote delay.
This was shown in the #server process in the #Servant resource in Sec. 3.1.
In a language that provides only synchronous message passing,
extra processes must be employed to avoid remote delay;
this often complicates problem solutions.
Asynchronous message passing is also useful whenever
it is not necessary to delay the invoker of an operation.
For example, it
can be used to program pipelines of filter processes,
where it is most natural for the producer to continue after sending
a message to the consumer.
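.pp
The pipeline case can be sketched in Python (our illustration; a
thread-safe queue stands in for an operation invoked by %send): the
producer deposits each message and continues immediately, while the
consumer runs concurrently.

```python
import threading, queue

pipe = queue.Queue()
results = []

def producer():
    for i in range(3):
        pipe.put(i * 10)    # asynchronous: does not wait for consumer
    pipe.put(None)          # end-of-stream marker

def consumer():
    while True:
        item = pipe.get()
        if item is None:
            break
        results.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
```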
.pp
The %co statement provides additional flexibility in invoking operations.
It allows the invoker to call several operations at the same time
and to continue when an appropriate combination of replies
has been received.
In addition,
the post-processing block associated with each concurrent invocation
allows the programmer to handle the reply from each invocation
in a manner appropriate to that invocation.
%co #can be simulated using %send and %in.
However, such a simulation results in a much more complex program.
It also requires changing the interface between the invoking
and servicing processes since parameters and results have
to be sent as separate messages.
In addition to being useful, %co is relatively simple to
implement since its implementation can use the basic invoke and reply
primitives in the RTS.
Thus, %co illustrates how opening up the implementation provides
additional, useful flexibility.
Note that the %co statement is similar to Argus's %coenter statement
and to the V kernel's
multicast mechanisms [Cher85].
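.pp
The flavor of %co can be approximated in Python with a thread pool
(our sketch; the names are hypothetical and this captures only part of
%co, which can also terminate after a subset of replies): several
operations are invoked concurrently, and a post-processing step runs
on each reply as it arrives.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def query(server, delay):
    """Stand-in for a remote operation with a variable response time."""
    time.sleep(delay)
    return server, delay

replies = {}
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(query, name, d)
               for name, d in [("a", 0.02), ("b", 0.01)]]
    for fut in as_completed(futures):
        server, delay = fut.result()
        replies[server] = delay    # per-invocation post-processing
```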
.pp
All operations are invoked using capabilities.\**
.(f
\**An operation can also be invoked using just
its name if that name is declared in the current scope.
Such a name is treated as a capability constant
for the named operation.
.)f
In addition to capabilities for entire resources,
capabilities for individual operations are provided.
This makes some programming jobs easier since it overcomes the
limitations of Eden's capabilities,
which can only be bound to entire modules [Blac85].
For example, a command server in Saguaro is passed a record
of operation capabilities.
One field of the record is for standard input
and another is for standard output.
These fields can be bound to operations in different resources;
e.g.,
the capability for standard input might be bound to a read operation
in a file server,
while the capability for standard output might be bound to
a write operation in the terminal driver.
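.pp
Because operation capabilities are first class, this record is just
data whose fields happen to denote operations.
A rough Python analogue (our illustration; plain functions stand in
for operations exported by different resources) is:

```python
# A "capability record" in the style of the Saguaro command server:
# fields bound to operations implemented in different places.
lines = []

def file_read():                # stands in for a file server's read op
    return "data from file"

def terminal_write(s):          # stands in for a terminal driver's write op
    lines.append(s)

io = {"stdin": file_read, "stdout": terminal_write}

# The command server uses the capabilities without knowing
# where the underlying operations are implemented.
io["stdout"](io["stdin"]())
```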
.pp
Operations can be declared within a process (a #local operation)
or at the resource-level (a #resource operation).
Local operations support the programming of conversations,
as shown in Sec. 3.2.
Resource operations provide the most commonly used form.
Of importance is that resource operations, like resource variables,
may be shared;
i.e., they can be serviced by %in statements in more than one process.
Shared resource operations
are almost a necessity given that multiple instances
of a %proc can service the same resource operation.
They are also useful since they can be used to implement
conventional semaphores, ``data-containing'' semaphores,
and server work queues.
A data-containing semaphore is a semaphore that
contains data as well as a synchronization signal.
As an example,
we use such semaphores to implement buffer pools in Saguaro.
A buffer is produced by sending its address to a shared operation;
a buffer is consumed by receiving its address from the shared operation.
A shared operation can also be used to permit multiple
servers to service the same work queue.
Clients request service by invoking a shared operation.
Server processes (in the same resource) wait for
invocations of the shared operation;
which server actually receives
and services a particular invocation is transparent to the clients.
In addition to being useful, shared resource operations
can be implemented almost as efficiently as non-shared operations;
the only additional requirement is grouping operations
into classes, each of which has a lock (as discussed in Sec. 4.2.2).
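.pp
The work-queue use of a shared operation can be sketched in Python
(our illustration; a shared thread-safe queue stands in for the shared
operation, and #None is a hypothetical shutdown marker): several
server threads wait on one queue, and which server handles a given
request is invisible to clients.

```python
import threading, queue

work = queue.Queue()
handled = []
lock = threading.Lock()

def server(name):
    """One of several identical servers sharing the same operation."""
    while True:
        job = work.get()
        if job is None:          # shutdown marker
            break
        with lock:
            handled.append(job)

servers = [threading.Thread(target=server, args=(i,)) for i in range(2)]
for t in servers:
    t.start()
for job in range(5):
    work.put(job)                # clients invoke the shared operation
for _ in servers:
    work.put(None)               # one marker per server
for t in servers:
    t.join()
```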
.sh 2 "Issues Related to Program Distribution"
.pp
Many distributed programs have a hierarchical structure
in which resources provide operations that are invoked only by
higher-level resources.
This is not always the case, however.
In some programs, such as the file system example in Sec. 3.2,
two resources interact as equals, with each resource both
providing operations used by the other and using operations
provided by the other.
An additional interaction pattern that has been found
to be useful is the #upcall [Clar85]
in which data flows from a server back to a client.
All these interaction patterns are supported in SR
since a resource may contain more than one process that is servicing
invocations
and capabilities can be used to pass operations between resources.
(A set of experiments using SR to program different
upcall program structures is reported in [Atki87a]).
Ada supports such interaction patterns,
although each of its servers is limited to being a single task.
.pp
In distributed programs it is
important to be able to specify the machine on which
the different parts of a program are to execute;
for programming a distributed operating system, it is essential.
For example,
this allows device drivers to be placed on the appropriate machine
and provides a basic tool for load sharing.
This is one of the lessons learned from Eden [Blac85].
The Eden implementors found it valuable to be able to specify
the machine on which an object executes,
even though their overall philosophy is to provide an
environment in which objects are location independent.
SR supports programmer control over placement since
the location for a resource can be specified
when the resource is created;
Argus provides similar support.
By contrast, Ada provides no support for placement of tasks.
.pp
Related to controlling where a resource is placed is
recognizing that there is an inherent difference
in efficiency between invoking an
operation that is local and one that is remote.
Our implementation optimizes calls within a virtual
machine as much as possible.
Also, the language allows resources that are placed in the same virtual
machine to use pointers and reference parameters.
This necessitates run-time enforcement and can lead
to exceptions, but makes many programs much more efficient
than they would be if we insisted that all parameters be copied
and prohibited the use of pointers outside a resource.
.pp
A distributed programming language must also provide
support for detecting and handling machine or network crashes and
for dealing with local exceptions, which are inevitable even in
the most carefully designed program.
SR provides two failure handling mechanisms:
handlers and %when statements.\**
.(f
\**The design of the %when statement is based on ideas in [Schl87].
.)f
Handlers are used by clients on the invoking side of operations;
%when statements are used by servers of operations.
The difference between these two mechanisms reflects that,
in general, a client communicates with one server at a time
but a server has many potential clients.
By contrast,
LYNX provides a single exception handling
mechanism that uniformly handles failures of either
the receiving or sending side of a link;
this is possible because only one thread of control
can be bound to each end of a link.
SR's failure handling mechanisms
are higher level than a simple timeout mechanism,
such as that used to detect invoker failure in Ada %select/%accept statements,
and lower level than mechanisms like
atomic actions [Lisk83], fault-tolerant actions [Schl83],
and replicated procedure call [Coop84].
We feel that our approach is appropriate for the intended
application domain of SR.
Timeout intervals are used to implement failure detection,
but the programmer need not be concerned with such low-level details.
We expect that the SR mechanisms will be more efficient than higher-level
failure-handling mechanisms, and hence they are more appropriate
for a systems programming language.
In fact, the SR mechanisms can be used to implement high-level
mechanisms such as atomic actions.
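.pp
The relationship between the timeout and the handler can be sketched
in Python (our illustration, not SR's implementation; the names are
hypothetical): a timeout implements failure detection inside the call,
while the client sees only a failure handler.

```python
import queue

class InvocationFailure(Exception):
    """Raised to the client's handler when a server appears to have failed."""
    pass

def call(op_queue, timeout=0.05):
    """Invoke an operation; the timeout that detects failure is hidden."""
    try:
        return op_queue.get(timeout=timeout)
    except queue.Empty:
        raise InvocationFailure("server did not reply")

dead_server = queue.Queue()       # never receives a reply
try:
    call(dead_server)
    outcome = "replied"
except InvocationFailure:
    outcome = "handled failure"   # client-side handler runs
```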
.pp
The final requirements for a language that is used to
write distributed operating systems are the abilities
to execute user programs and to accommodate a changing hardware
configuration.
The only language we know of that currently comes close to meeting
these requirements is LYNX, in which a process can be compiled
after, and then connected to, an already executing program.
Although resources and communication links can be created
and destroyed dynamically in SR, the machine configuration
and collection of resources that comprise a program are
static input to the SR linker.
To overcome this limitation,
we are currently working on the design of two mechanisms.
The first is an ``execute'' facility to load and start execution
of an external program, which would interact with the host SR
program by being linked to a set of SR library routines.
The second mechanism is a generalization of operations that
would support group communication somewhat analogous
to that provided by V [Cher85].
