Description:

This is the beta release of the Prospero Resource Manager (PRM). It enables
users to run parallel applications on loosely coupled systems, such as a 
cluster of workstations connected by local or wide area networks. PRM provides
a uniform and scalable framework for scheduling jobs in parallel and
distributed environments while presenting a single system image to the user
or programmer. This is achieved through dynamic task scheduling and message
routing mechanisms operating in a manner transparent to the user. Further, the
components of a job may span multiple administrative domains and hardware
platforms, without imposing on the user the responsibility of mapping
individual components to nodes. Thus PRM presents the user an the execution
environment similar to that of many message-passing multicomputers.

Application programs are based on the message-passing model and are coded in
C. PRM provides a library of routines for sending and receiving tagged
messages, broadcasting, and global synchronization. Standard interfaces to the
message passing library make it easy to use PRM as a platform for development
of applications for multiprocessors that may not be easily accessible. This 
release provides a set of macros and routines that enable applications written
for the Connection Machine (CM-5) using CMMD library calls to be directly
compiled for the PRM environment without any modifications to the code. The
PRM environment also supports terminal and file I/O activity by its tasks,
enabling a task to print to a terminal or access files that may be on a
filesystem not mounted on the host on which the task is running. 

The Prospero Resource Manager is intended as a tool for managing resources in
large systems in which multiprocessors exist in the broader context of
distributed systems. PRM applies the concepts of the Virtual System Model 
to processor allocation in such environments. This model provides abstractions
to organize objects and resources in large distributed systems into virtual
systems in which resources of interest are readily accessible, and those of
less interest are out of the way. Such organization is based on the conceptual
relationship between resources, and the details of administration and mapping
to physical locations are hidden from the user. PRM divides its resource
management functions into three levels of abstraction, each handled by a
different
type of manager. Further, large systems may be configured with multiple 
managers to enhance scalability.


Mechanism:	

PRM's resource allocation functions are divided across three entities:
the system manager, the job manager, and the node manager. The system manager
controls access to a collection of physical resources and allocates them to
jobs as required. Large systems may employ multiple system managers, each 
managing a subset of resources. The job manager is the principal interface
through which a job acquires resources to execute its tasks. The job manager
acquires resources from the system manager and schedules tasks through the 
node manager interfaces. A node manager initiates and monitors tasks on the
node on which it is running. The mapping of tasks to nodes is transparent to
the application. This location transparency is achieved through dynamic 
mechanisms for translating logical task identifiers to physical host addresses.


Status and Supported Platforms:

PRM has been tested on Sun-3, SPARC, and HP 9000/700 workstations connected by
local and wide area networks. It is possible to start a job on a workstation,
but have some or all of its components execute on remote hosts which do not
share a common filesystem with the workstation. The job and node managers 
cooperate in transferring the executable to the remote site and loading it on
the remote node, transparently to the user. An auxiliary task created by the job
manager handles requests to read and write files not accessible to a task 
through Unix system calls.

A PRM environment may be configured by unprivileged users. A one-time
configuration process involves starting a nodemngr process on each 
workstation-node and a sysmngr process on one of them. Configuration options
include specification of time windows during which a node (workstation) is
made available to run remote jobs. Once configured, a user only needs to start
a job manager to run a job. The job manager reads the job's resource
requirements from a user-specified input file and initiates the job.
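This release does not fix the input file's format here; purely as an
illustration, a resource-requirements file might name the executable, the
number of tasks, and candidate hosts. Every field name below is hypothetical,
not PRM's actual syntax:

```
# hypothetical resource-requirements file -- field names illustrative only
executable  a.out
tasks       4
hosts       node-a node-b
```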

In future releases, we plan to support time-sharing of nodes across jobs, task
migration, and job manager-assisted debugging and performance tuning of
parallel applications.

Questions concerning this software should be directed to INFO-PROSPERO@ISI.EDU
and should specifically mention the Prospero Resource Manager (PRM).

--------

This project was supported in part by the Advanced Research Projects Agency 
under NASA Cooperative Agreement NCC-2-539. Copyrights apply. Please read the
files usc-copyr.h and uw-copyright.h.
