This paper attempts to answer the
question "Why STREAMS?" The question
can be asked in several more
specific forms. For example,
"Why does Gcom use STREAMS?" Or, "Why is STREAMS
preferable to other methods of protocol stack
implementation?" Or, "Why is STREAMS preferable
to some particular method of protocol stack
implementation?" Or, "Why should operating
systems that do not incorporate STREAMS do
so?"
We will give a definitive answer to the first
variation of the question and hope to at least
give some hints as to the rest.

History
STREAMS is not something that
sprang forth in an act of divine
creation. There was a history
leading up to its design and
implementation. Dennis Ritchie
published a paper entitled A
Stream Input-Output System
outlining the essentials of
STREAMS in 1984. This paper
was written from the perspective
of an operating system implementer
struggling with the requirements
of terminal-oriented input/output
systems in the transition to
networking.
At Gcom, we come at the problem
from a different perspective:
that of protocol stack implementers.
Our history goes back to the
late 1960s at the University
of Illinois.
In the latter half of the 1960s,
the U of I was engaged in the
design of a highly parallel
computer system called the ILLIAC
IV. This was an ARPA-funded
research project. At the same
time, ARPA was funding a networking
project involving a number of
universities and research institutions.
This project became known as
the ARPA Network. The U of I
was also an active participant
in this effort. The author was
involved in both projects at
the time.
ILLIAC IV was intended to be a
resource available on the ARPA
Network, much as NCSA is a
present-day supercomputer
resource. When it was decided
that the ILLIAC IV would be
installed at NASA Ames in Mountain
View, California rather than
in Urbana, Illinois, the focus
at the U of I turned to network
access.
Equipped with an ARPA Network
IMP (Interface Message Processor)
and a PDP-11/20 with a host interface
to the IMP, we set out, in 1969,
to develop an ARPA Network Terminal
System (ANTS), which was to serve
as a terminal concentrator and
printer controller, allowing
us to access computers remotely
via the ARPA Network. Two programmers
completed this project in a mere
7 months by building a
highly targeted system that
was designed to solve that,
and only that, problem. It was
the first system of its kind
and interested ARPA enough that
they funded a follow-on project
to design a "more general" version
that might be useful to others.
We then proceeded to design a
second version of ANTS (ANTS
II) around a semaphore-driven
multi-tasking model. In this
model each stage of protocol
processing would be represented
by a process, or task, in more
modern nomenclature. Each task
had its own stack to run on
and would access data streams
via interprocess queuing primitives
that, at bottom, would utilize
semaphores to wake up the next
task when data arrived for it.
Full duplex data streams were
achieved by associating two
processes, one for each direction
of flow.
Network protocol processing could
sometimes be a fairly lengthy
chain of such processes. The
individual tasks were designed
to perform limited functions
so that one could construct
chains of them to tailor the
processing of any given data
stream. Taking terminals, printers,
plotters and file transfers
into account, the library of
such tasks grew to include quite
a number of individual processing
modules.
It was a very elegant design,
but was, in practice, unworkable.
The context switch time together
with the task scheduling overhead
resulted in a prohibitive amount
of system overhead. It was so
bad that on a machine of modest
power, such as a PDP-11/45, one
could actually perceive processing
delays on a human scale in performing
such seemingly simple functions
as echoing characters typed
in from a terminal.
In 1976 this project, then six
years old, was abandoned. The
lead designer was then given
a year by the U of I to study
the system and try to figure
out what had gone wrong.

Lessons Learned
The primary lesson learned first
hand from that experience was
that the multi-tasking model
is not appropriate for data
communications processing. This
becomes more and more the case
as protocol processing chains
get longer. In hindsight we
could all see that in a data
communication system, each stage
of processing of the data stream
simply must deal with whatever
shows up next in its input stream.
There is no useful paradigm
in which a task decides to read the
data stream from a point in its logic
five levels deep in subroutine
calls. In other words, protocol
processes are best realized
as finite state machines.
In a multi-tasking environment,
a finite state machine process
would have a main program that
would look something like the
following in pseudo-code:
    for (;;)
    {
        read message
        invoke finite state machine (state, message)
    }
Modules organized in this fashion do not need any calling
history saved if they are allowed to complete one cycle of
execution before giving up the CPU. This also means that the
underlying operating system would not have to save and restore
registers or switch to a different stack when executing the
next processing module. Choosing the next process to run is
typically the most expensive process-scheduling operation
that a multi-tasking operating system performs.
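As a minimal sketch of the loop above, the following C fragment models one protocol module as a finite state machine that handles one message per call and then returns. The states, message kinds and function names are all hypothetical, chosen only for illustration; they are not taken from ANTS or from any Gcom product.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states and message kinds for a toy link protocol. */
enum state { ST_IDLE, ST_CONNECTING, ST_CONNECTED };
enum msg_kind { MSG_CONNECT_REQ, MSG_CONNECT_ACK, MSG_DATA, MSG_DISCONNECT };

struct message {
    enum msg_kind kind;
};

struct fsm {
    enum state state;
    unsigned   delivered;   /* data messages accepted while connected */
    unsigned   discarded;   /* messages that were invalid in the current state */
};

/*
 * One cycle of execution: look at the current state and the message,
 * update the state, and return.  Nothing is left on the stack between
 * calls, so no calling history needs to be saved for this module.
 */
void fsm_step(struct fsm *f, const struct message *m)
{
    assert(f != NULL && m != NULL);

    switch (f->state) {
    case ST_IDLE:
        if (m->kind == MSG_CONNECT_REQ)
            f->state = ST_CONNECTING;
        else
            f->discarded++;
        break;
    case ST_CONNECTING:
        if (m->kind == MSG_CONNECT_ACK)
            f->state = ST_CONNECTED;
        else if (m->kind == MSG_DISCONNECT)
            f->state = ST_IDLE;
        else
            f->discarded++;
        break;
    case ST_CONNECTED:
        if (m->kind == MSG_DATA)
            f->delivered++;
        else if (m->kind == MSG_DISCONNECT)
            f->state = ST_IDLE;
        else
            f->discarded++;
        break;
    }
}

/* The "main program" of the module: read a message, invoke the FSM. */
void fsm_run(struct fsm *f, const struct message *msgs, size_t count)
{
    for (size_t i = 0; i < count; i++)
        fsm_step(f, &msgs[i]);
}
```

Because fsm_step() always returns after one message, a scheduler could call it for a different module each time without saving registers or switching stacks on the module's behalf.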
In retrospect, it seemed to us
that there was nothing useful
to be gained by having the ability
of process A to preempt the
execution of process B in a
data communication system. The
data streams need to be processed
and there are natural queuing
points due to flow control considerations
and device hardware. The queuing
points, in turn, are natural
return points for the finite
state machines. Thus, a non-preemptive
scheduler seems most appropriate
for data communications.
If the amount of time that it
takes a finite state machine
to process an incoming message
and either forward it, discard
it or queue it is approximately
equal to the amount of time
that it takes the operating
system's task scheduler to preempt
a task then there is no point
in doing anything other than
allowing the FSM to run to completion.
Measurements made by Gcom in
the 1980s with our Ring System
(see below) showed this to be
close to the case.
Some protocol modules ran
in fewer CPU cycles
than an operating system task
switch, and some of the longer-executing
ones took perhaps 4 or 5 times
the instructions of a task switch.
The task switch thus becomes significant
overhead for little or no discernible
gain.
The recommendation of this study
was to organize the underlying
system in such a way that the
execution flow of the CPU paralleled
that of the data flow through
the system. The processing modules
would become passive elements
and would be invoked when data
arrived for them, rather than
being active elements which
would solicit data to process.
The study recommended that data
communications systems be organized
as message passing systems and
suggested an architecture in
which fixed format messages
would be queued in a ring buffer
with the executive removing
the elements one at a time for
processing. Each message would
contain an indication of the
sending and receiving process
identity together with a command
code and a pointer to a buffer
that would contain any data
accompanying the command. This
system was named the "Hub System,"
a term that was later trademarked
by a spin-off company. It also
forms the basis for the Gcom
Ring System, or Rsystem for short, also a registered
trademark.
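The message format and executive loop described above might be sketched as follows. This is an illustration under stated assumptions, not the Hub System or Rsystem code: the structure layout, the ring size, the process table and every function name here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 8      /* hypothetical ring capacity */
#define MAX_PROCS  4      /* hypothetical number of processes */

/* Fixed-format message: sender, receiver, command code, data buffer. */
struct ring_msg {
    int   sender;
    int   receiver;
    int   command;
    void *buffer;
};

struct ring {
    struct ring_msg slot[RING_SLOTS];
    size_t head;           /* next slot to remove */
    size_t tail;           /* next slot to fill */
    size_t count;          /* messages currently queued */
};

/* A process is a passive handler that is invoked when a message arrives. */
typedef void (*handler_fn)(struct ring *r, const struct ring_msg *m);

static handler_fn process_table[MAX_PROCS];

void ring_register(int process_number, handler_fn fn)
{
    assert(process_number >= 0 && process_number < MAX_PROCS);
    process_table[process_number] = fn;
}

/* Queue a message; returns 0 on success, -1 if the ring is full. */
int ring_send(struct ring *r, struct ring_msg m)
{
    if (r->count == RING_SLOTS)
        return -1;
    r->slot[r->tail] = m;
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count++;
    return 0;
}

/*
 * The executive: remove messages one at a time and call the receiving
 * process, which runs to completion before the next message is examined.
 * Returns the number of messages delivered.
 */
int ring_dispatch(struct ring *r)
{
    int delivered = 0;

    while (r->count > 0) {
        struct ring_msg m = r->slot[r->head];
        r->head = (r->head + 1) % RING_SLOTS;
        r->count--;

        assert(m.receiver >= 0 && m.receiver < MAX_PROCS);
        if (process_table[m.receiver] != NULL) {
            process_table[m.receiver](r, &m);
            delivered++;
        }
    }
    return delivered;
}

/* Two example processes: 1 forwards each message to 2, which counts them. */
int sink_count;

void forwarder(struct ring *r, const struct ring_msg *m)
{
    struct ring_msg out = { 1, 2, m->command, m->buffer };
    (void)ring_send(r, out);
}

void sink(struct ring *r, const struct ring_msg *m)
{
    (void)r;
    (void)m;
    sink_count++;
}
```

Note that the forwarder knows its neighbor only by process number, and that the only scheduling decision the executive makes is "take the next message off the ring."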
Experiments showed this system
to be extremely fast. The overhead
to deliver a message to the
next processing stage is very
small so that virtually the
entire CPU bandwidth can be
utilized to actually process
the data, rather than deciding
which process will next be eligible
to do so.
Another desirable side-effect
of the run-to-completion model
is that there is no longer any
need for locks to protect multiple
accesses to data structures
since no process can preempt
another. This saves additional
CPU time relative
to a multi-tasking architecture.
Gcom's implementation also imposed
the conventional restriction
that interrupt routines were
not allowed to call the buffer
allocator or memory allocator,
which meant that these two allocators
could run with interrupts completely
enabled and with no locking.

Topologies of Protocol Processing
One of the goals of both the failed
ANTS II project and the subsequent
Ring System designs was to be
able to configure protocol processing
topologies that were not preconceived
by the designers of the data
communications executive. In
the Ring System each individual
process had a unique process
number that identified it. Neighboring
processes had their own numbers.
Each instance of the same protocol
process (as in multiple connections
of the same type) had its own
unique number.
Thus, in the Ring System, a process
was written in terms of message
arrivals and sending to neighboring
processes. No process needed
to know the name of any symbol
in any other process. Each process
needed only to know the process
number of its neighbors.
Some agent needed to start up
the processes when appropriate
and communicate the process
numbers to the affected parties.
In the Ring System, because
it was most often used in embedded
applications, this agent would
often be a custom programmed
system startup module into which
was built the "knowledge" of
the topology of the system at
hand. So whereas the Ring System
itself had no preconceived notions
about which protocol modules
were connected to which other
modules, something fairly static
did know what the intended topology
was to be.

Whence STREAMS
It is clear from reading Dennis
Ritchie's paper
on STREAMS that AT&T was
grappling with these same issues,
only from a different perspective.
UNIX is a fully featured operating
system with a disk based file
system and the ability to run
arbitrary programs in their
own address spaces. The need
for kernel protocol processing
was becoming apparent. The simple
driver model meant that a complete
protocol stack must be incorporated
into a single driver, which
looked like a monolithic object
to the operating system. Implementing
protocol stacks in configurations
not anticipated by the kernel
designers was very difficult
to do.
The UNIX kernel already embodied
the notion of "run to completion"
in the management of kernel
processes. So AT&T did not
have to learn the multi-tasking
lesson the hard way as we did,
or they learned it longer ago
than we did. However, the sleep/wakeup
mechanism within the kernel
still had the sense of the process
actively going to fetch the
data rather than processing
the data upon its arrival. Ritchie's
STREAMS idea turned that technique
inside out and he arrived at
a design that shared many attributes
with the Ring System design
that we had arrived at just
a few years earlier.
There was, however, one major
difference. The STREAMS design
had a user level program responsible
for opening drivers and building
the processing stages, or protocol
stacks, on top of the drivers.
The STREAMS design solved the
problem of static configuration,
making the configuration completely
dynamic. This is a natural consequence
of working with a fully featured
operating system rather than
an embedded system.
So what we have with STREAMS is
a data communications subsystem
that has the following properties:
- Execution flow follows data flow
- Protocol processing modules are passive, not active
- Protocol modules are implemented most naturally as finite state machines
- Protocol processing modules run to completion, removing the need for locks
- Flow control mechanisms are built into the queue management
- Arbitrary protocol stacks can be constructed with no foreknowledge on the part of the kernel

What is STREAMS
With the kind permission of Dennis
Ritchie, I would like to quote
from his 1984 paper in which
he outlines the STREAMS mechanism.
Even today, this high-level
description is quite accurate.
[I have included a very few
editorial comments
inside brackets.]
Streams
"A stream is a full-duplex
connection between a user's
process and a device or
pseudo-device. It consists
of several linearly connected
processing modules, and
is analogous to a Shell
pipeline, except that data
flows in both directions.
The modules in a stream
communicate almost exclusively
by passing messages to
their neighbors. Except
for some conventional variables
used for flow control,
modules do not require
access to the storage of
their neighbors. Moreover,
a module provides only
one entry point to each
neighbor, namely a routine
that accepts messages.
At the end of the stream
closest to the process
is a set of routines that
provide the interface to
the rest of the system.
A user's write and I/O
control requests are turned
into messages sent to the
stream, and read
requests take data from
the stream and pass
it to the user. At the
other end of the stream
is a device driver module.
Here, data arriving from
the stream is sent
to the device; characters
and state transitions detected
by the device are composed
into messages and sent
into the stream
towards the user program.
Intermediate modules process
the messages in various
ways. The two end modules
in a stream become
connected automatically
when the device is opened;
intermediate modules are
attached dynamically by
request of the user's program.
Stream processing
modules are symmetrical;
their read and write interfaces
are identical."
Queues
"Each stream processing
module consists of a pair
of queues, one for each
direction. A queue comprises
not only a data queue proper,
but also two routines and
some status information.
One routine is the put
procedure, which is called
by its neighbor to place
messages on the data queue.
The other, the service
procedure, is scheduled
to execute whenever there
is work for it to do. The
status information includes
a pointer to the next queue
downstream, various flags,
and a pointer to additional
state information required
by the instantiation of
the queue. Queues are allocated
in such a way that the
routines associated with
one half of a stream
module may find the queue
associated with the other
half. (This is used, for
example, in generating
echoes for terminal input.)"
Message blocks
"The objects passed between
queues are blocks obtained
from an allocator. Each
contains a read pointer,
a write pointer, and a
limit pointer, which specify
respectively the beginning
of information being passed,
its end, and a bound on
the extent to which the
write pointer may be increased.
The header of a block specifies
its type; the most common
blocks contain data. There
are also control blocks
of various kinds, all with
the same form as data blocks
and obtained from the same
allocator. For example,
there are control blocks
to introduce delimiters
into the data stream, to
pass user I/O control requests,
and to announce special
conditions such as line
break and carrier loss
on terminal devices. Although
data blocks arrive in discrete
units at the processing
modules, boundaries between
them are semantically insignificant;
standard subroutines may
try to coalesce adjacent
data blocks in the same
queue. Control blocks,
however, are never coalesced.
[This may still be true
of STREAMS based TTY drivers
but is not true of other
STREAMS based protocol
drivers in which messages
are not to be coalesced.]"
Scheduling
"Although each queue module
behaves in some ways like
a separate process, it
is not a real process;
the system saves no state
information for a queue
module that is not running.
In particular queue processing
routines do not block when
they cannot proceed, but
must explicitly return
control. A queue may be
enabled by mechanisms described
below. When a queue becomes
enabled, the system will,
as soon as convenient,
call its service procedure
entry, which removes successive
blocks from the associated
data queue, processes them,
and places them on the
next queue by calling its
put procedure. When there
are no more blocks to process,
or when the next queue
becomes full, the service
procedure returns to the
system. Any special state
information must be saved
explicitly. Standard routines
make enabling of queue
modules largely automatic.
For example, the routine
that puts a block on a
queue enables the queue
service routine if the
queue was empty."
Flow Control
"Associated with each queue
is a pair of numbers used
for flow control. A high-water
mark limits the amount
of data that may be outstanding
in the queue; by convention,
modules do not place data
on a queue above its limit.
A low-water mark is used
for scheduling in this
way: when a queue has exceeded
its high-water mark, a
flag is set. Then, when
the routine that takes
blocks from a data queue
notices that this flag
is set and that the queue
has dropped below the low-water
mark, the queue upstream
of this one is enabled."

How it Works
The fundamental data structure
of STREAMS is the Queue structure.
Each Queue structure controls
operations for a STREAMS driver
in one direction of data flow.
Queues are always allocated
in pairs so that there is one
Queue structure controlling
the upstream data flow, called
the "read" queue, and another
controlling the downstream data
flow, called the "write" queue.
The illustration below depicts
a pair of queues.
Among other things, each queue
contains a pointer to the driver's
"Put" procedure, its "Service"
procedure, its "Open" procedure
and its "Close" procedure. By
convention, only the read queue's
Open and Close procedures are
used.
Each queue contains a linked list
head for a list of messages,
a traditional FIFO queue. Among
the kernel routines that manipulate
queues are the routines putq(queue,message)
and getq(queue), which
are used to insert a message
at the tail of the queue and
to remove the message at the
head of the queue, respectively.
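The FIFO behavior of these two routines can be sketched with a small user-space model. The names toy_putq() and toy_getq() and both structures are hypothetical stand-ins; the real kernel routines operate on the kernel's own queue and message structures, which carry considerably more state than is shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy message: only the link and a payload size matter here. */
struct toy_msg {
    struct toy_msg *next;
    size_t          size;
};

/* Toy queue: list head and tail plus a count of queued bytes. */
struct toy_queue {
    struct toy_msg *first;
    struct toy_msg *last;
    size_t          bytes;
};

/* Insert a message at the tail of the queue. */
void toy_putq(struct toy_queue *q, struct toy_msg *m)
{
    assert(q != NULL && m != NULL);
    m->next = NULL;
    if (q->last != NULL)
        q->last->next = m;
    else
        q->first = m;
    q->last = m;
    q->bytes += m->size;
}

/* Remove and return the message at the head, or NULL if the queue is empty. */
struct toy_msg *toy_getq(struct toy_queue *q)
{
    assert(q != NULL);
    struct toy_msg *m = q->first;
    if (m == NULL)
        return NULL;
    q->first = m->next;
    if (q->first == NULL)
        q->last = NULL;
    m->next = NULL;
    q->bytes -= m->size;
    return m;
}
```

Keeping a running byte count alongside the list is what later makes a cheap "is this queue full?" test possible.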
Each queue also contains a link
that points to the next queue
in a chain of queues, both upstream
and downstream. By linking queues
together in this manner, one
forms protocol stacks. Each
queue pair controls the protocol
processing module for one layer
of the stack.
Messages are passed from one driver
to the next using the kernel
routine putnext(queue, message).
This routine follows the pointer
to the "next" queue in the chain
and calls the Put procedure
pointed to by the "next" queue.
This mechanism combines complete
flexibility in building protocol
stacks, since the queues are
linked together dynamically
under control of user-space
programs, with extremely fast
message passing from one protocol
layer to the next.
The Put procedure is called to
pass a message to the driver
from the previous STREAMS driver.
The Put procedure is passed
a pointer to the queue structure
and a pointer to the message
being passed. The Put procedure
must do exactly one of three things
with the message: forward
it, queue it, or
free it.
When a driver queues a message,
the message is linked into the
list whose head resides in the
queue structure. When the first
message is inserted into the
queue, STREAMS schedules the
Service procedure associated
with the queue for execution.
Subsequent insertions into the
same queue do not cause a rescheduling
of the Service procedure.
The Service procedure is passed
a pointer to the queue for which
it is providing service. The
idea is that the Service procedure
should check for conditions
to see if one or more messages
can be removed from the queue
and processed or forwarded to
the next module. The Service
procedure will not be automatically
scheduled for execution again
upon message insertion unless
it attempts to remove a message
from an empty queue via the
getq() routine.
The Service procedure can be explicitly
scheduled to run by calling
the kernel routine qenable(queue).
This routine simply adds the
indicated Service procedure
to the list of those that are
scheduled to be executed.
The Queue structure also contains
counters that indicate the amount
of memory that is queued in
the linked list of the queue.
Each queue also contains high
and low water mark thresholds
against which the counter can
be compared. A STREAMS driver
can interrogate the memory usage
status of the next queue in
the chain of queues by using
the kernel routine canputnext(queue).
This routine compares the memory
usage counter against the high
and low water marks and returns
"true" if the queue is not considered
"full" and "false" if the queue
is considered "full."
When passing a message to the
next driver in the chain of
drivers, one first calls canputnext()
to see whether the receiving
queue is full. If it is full,
one calls putq() to queue
the message, otherwise, one
calls putnext() to pass
the message to the next driver.
The Service procedure calls
canputnext() before removing
any messages from the queue
to ascertain whether or not
a message can be passed along
to the next driver. When a Service
procedure removes a message
from its queue and that causes
the queue level to fall to the
low water mark, any queue
for which canputnext()
has returned "false" for this
particular queue has its Service
procedure scheduled for execution.
That is, the driver which could
not pass the message has its
Service procedure scheduled
so that it can try again.
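The interplay of Put procedures, Service procedures and water marks can be sketched in a user-space model. Everything here is a simplifying assumption made for illustration: the toy_ names are hypothetical and are not the kernel's signatures, messages are reduced to a size, the queue is a small fixed array, and the "prev" pointer stands in for the kernel's search for the upstream queue to re-enable.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 16            /* hypothetical fixed capacity of the toy FIFO */

struct toy_queue {
    struct toy_queue *next;     /* next queue in the chain; NULL means the device */
    struct toy_queue *prev;     /* queue to re-enable when this one drains */
    size_t msgs[TOY_SLOTS];     /* sizes of queued messages, oldest first */
    size_t nmsgs;
    size_t bytes;               /* amount of memory currently queued */
    size_t hiwat, lowat;        /* high and low water marks */
    int    full;                /* set once bytes reaches hiwat */
    int    want_write;          /* an upstream neighbor was refused */
    int    enabled;             /* Service procedure is scheduled */
    int    device_ready;        /* only meaningful when next == NULL */
    size_t delivered;           /* messages handed to the device */
};

/* May the owner of q pass a message to the next stage? */
int toy_canputnext(struct toy_queue *q)
{
    if (q->next == NULL)
        return q->device_ready;
    if (q->next->full) {
        q->next->want_write = 1;    /* remember that someone is waiting */
        return 0;
    }
    return 1;
}

void toy_putq(struct toy_queue *q, size_t size)
{
    assert(q->nmsgs < TOY_SLOTS);
    if (q->nmsgs == 0)
        q->enabled = 1;             /* first message schedules the Service procedure */
    q->msgs[q->nmsgs++] = size;
    q->bytes += size;
    if (q->bytes >= q->hiwat)
        q->full = 1;
}

size_t toy_getq(struct toy_queue *q)
{
    assert(q->nmsgs > 0);
    size_t size = q->msgs[0];
    for (size_t i = 1; i < q->nmsgs; i++)
        q->msgs[i - 1] = q->msgs[i];
    q->nmsgs--;
    q->bytes -= size;
    if (q->full && q->bytes <= q->lowat) {
        q->full = 0;
        if (q->want_write && q->prev != NULL) {
            q->want_write = 0;
            q->prev->enabled = 1;   /* let the refused neighbor try again */
        }
    }
    return size;
}

void toy_put(struct toy_queue *q, size_t size);

/* Hand a message to whatever follows q: the next module or the device. */
static void toy_putnext(struct toy_queue *q, size_t size)
{
    if (q->next != NULL)
        toy_put(q->next, size);
    else
        q->delivered++;
}

/* Put procedure: forward if possible, otherwise queue. */
void toy_put(struct toy_queue *q, size_t size)
{
    if (q->nmsgs == 0 && toy_canputnext(q))
        toy_putnext(q, size);
    else
        toy_putq(q, size);
}

/* Service procedure: drain the queue for as long as the next stage accepts. */
void toy_service(struct toy_queue *q)
{
    q->enabled = 0;
    while (q->nmsgs > 0 && toy_canputnext(q))
        toy_putnext(q, toy_getq(q));
}
```

In this model the Put procedure forwards only when its own queue is empty, so that a new message never overtakes one that is already waiting. The gap between the two water marks gives hysteresis: a queue that has filled does not accept traffic again until it has drained well below the point at which it stopped.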

Types of STREAMS Drivers
There are three distinct types
of STREAMS drivers, depending
upon how they are utilized to
build protocol stacks. The three
types of drivers are illustrated
below.

A STREAMS Driver
A STREAMS Driver has a queue pair
representing data flow into
and out of the Driver from upstream
clients. There is no data flow
downstream from a Driver. A
STREAMS Driver is typically
a device driver which directly
operates hardware. It serves
as the base of a protocol stack.
Other types of STREAMS drivers
may be configured above the
Driver by way of its queue linkage.
If a driver operates multiple
devices then it will have multiple
queue pairs coming into it.

A STREAMS Pushable Module
A STREAMS Pushable Module has
a single queue pair associated
with it. The two queues are
linked to upstream and downstream
neighbors. The module receives
messages via its write put procedure
from above and via its read
put procedure from below. It
can queue messages in either
queue for flow control purposes.
A Pushable Module has no choice
as to the routing of messages.
It can only act upon them "as
they go by."

A STREAMS Multiplexor
A STREAMS Multiplexor has multiple
queue pairs on both the top
and the bottom of the driver.
The routing of messages through
the multiplexor is determined
by computations within the driver
code itself. The example shows
one queue pair on the bottom
and two queue pairs on the top.
Such a configuration might correspond
to an SDLC configuration with
two logical stations.

TTY Driver Architecture
The TTY driver architecture in
STREAMS involves three layers
of message processing as illustrated
by the following.
The Stream Head is a kernel module
which communicates with the
user via the file system above
and communicates with a STREAMS
driver below.
The STREAMS driver immediately
below the Stream Head is the
Line Discipline (ldterm)
driver. It is responsible for
all of the canonical form character
processing for input and output.
The STREAMS driver below ldterm
is the driver for the actual
serial hardware. This driver
focuses on the details of handling
the hardware and passing characters
back and forth to the ldterm
module above it.
This architecture allows multiple
TTY drivers for different kinds
of hardware to co-exist in the
same system. The TTY driver
is a much simpler driver than
ldterm. Its processing
of terminal attributes is much
more straightforward.
This was the operating example
that Ritchie was trying to achieve
with the STREAMS subsystem design
for the UNIX kernel. As it happens,
the design is extremely general
and capable of solving much
more complex protocol architecture
situations than TTY handling.

STREAMS Interfacing Protocols
In their design and implementation
of STREAMS, AT&T defined
interfacing protocols for each
of the first four layers of
the ISO protocol model. Bearing
in mind that STREAMS drivers
and modules communicate with
each other via message passing,
it is no surprise that these
interfacing protocol definitions
are specified in terms of formats
of messages exchanged between
STREAMS drivers configured at
different layers of a protocol
stack.
It is worth noting that STREAMS
itself is entirely unaware of
these interfacing protocols.
The STREAMS executive in the
kernel simply provides the operating
system tools to allow the user
to build protocol stacks. How
the protocol stacks communicate
with one another is a matter
of convention. The interfacing
protocols constitute a set of
conventions established by AT&T
for this purpose.
STREAMS messages, whose detailed
format we do not delve into
in this monograph, contain a
type field that identifies
the kind of message.
STREAMS defines a number of
different message types, but
there are two types that are
the most important from an interfacing
protocol standpoint. An M_PROTO
type message is, by convention,
a message which contains an
inter-driver protocol header.
An M_DATA message is, by convention,
a message which contains only
data. Multiple STREAMS messages
can be chained together to form
a larger whole, and an M_DATA
can be chained to an M_PROTO
to form a message that contains
both inter-driver protocol content
and data.
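The chaining convention can be sketched with a simplified model of a message block. The structure below is a hypothetical reduction, not the kernel's message block layout, and the header format, the primitive code and all toy_ names are invented for illustration; real interfacing protocols such as DLPI define their own header structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_type { TOY_M_DATA, TOY_M_PROTO };

struct toy_mblk {
    enum toy_type        type;
    struct toy_mblk     *cont;      /* next block of the same message */
    const unsigned char *rptr;      /* first byte of content */
    const unsigned char *wptr;      /* one past the last byte of content */
};

/* Hypothetical inter-driver header carried in the M_PROTO block. */
struct toy_data_req {
    int primitive;                  /* which request this is */
    int more;                       /* "more data" indication */
};

#define TOY_DATA_REQ 1              /* hypothetical primitive code */

/* Chain a data block behind a protocol header block. */
void toy_attach_data(struct toy_mblk *proto, struct toy_mblk *data)
{
    assert(proto != NULL && proto->type == TOY_M_PROTO);
    assert(data == NULL || data->type == TOY_M_DATA);
    proto->cont = data;
}

/* Total number of data bytes in a message, skipping any protocol header. */
size_t toy_data_size(const struct toy_mblk *m)
{
    size_t total = 0;
    for (; m != NULL; m = m->cont)
        if (m->type == TOY_M_DATA)
            total += (size_t)(m->wptr - m->rptr);
    return total;
}

/* Read the primitive code from the header, or -1 if there is none. */
int toy_primitive(const struct toy_mblk *m)
{
    struct toy_data_req hdr;

    if (m == NULL || m->type != TOY_M_PROTO ||
        (size_t)(m->wptr - m->rptr) < sizeof hdr)
        return -1;
    memcpy(&hdr, m->rptr, sizeof hdr);   /* avoid alignment assumptions */
    return hdr.primitive;
}
```

A receiving driver in this model would switch on toy_primitive() to decide what the sender is asking for, and then walk the rest of the chain for any accompanying data.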
The STREAMS inter-driver interfacing
protocols are defined in terms
of the formats of M_PROTO messages
and any M_DATA message that
may accompany them.
The Communications Device Interface
(CDI) protocol defines a set
of STREAMS message formats and
procedures for interfacing to
a raw line driver. It is not
intended for use with TTY drivers.
The kind of line driver involved
here is more likely a synchronous
driver for an HDLC or Bisync
line. The M_PROTO messages allow
the user to attach to a particular
line and to control the enabling
of data traffic on the line.
Some messages allow the user
to manage a half-duplex line.
Gcom has added extensions to
this protocol to manage modem
signals.
The Data Link Provider Interface
(DLPI) protocol defines a set
of STREAMS message formats and
procedures for interfacing to
a link layer or MAC layer entity.
In the case of a link layer
entity, the STREAMS driver which
contains the link layer protocol
code may use the services
of a CDI driver below it to
actually communicate with the
physical line. In the case of
a MAC layer entity, the STREAMS
driver usually controls the
network card directly. The M_PROTO
messages in the DLPI protocol
allow the user to attach to
a particular physical line (CDI
driver stream below the DLPI
driver) and to bind Service
Access Point (SAP) addresses
to a STREAM. There are also
messages that allow the user
to manage link level functions
such as link setup, disconnect
and reset. The DLPI protocol
also contains special message
types for sending and receiving
HDLC TEST and XID frames. The
DLPI protocol is used to interface
to such protocols as LAPB, SDLC,
HDLC, LLC, QLLC and Frame Relay.
The Network Provider Interface
(NPI) protocol defines a set
of STREAMS message formats and
procedures for interfacing to
a network layer entity. The
NPI protocol allows for the
user to bind network layer SAP
addresses to a connection or
to issue network connection
requests to remote hosts on
the network. The M_PROTO vocabulary
contains elements for connecting,
disconnecting and resetting
virtual circuits. In the data
transfer phase of a connection,
there are M_PROTO messages that
can be prepended to M_DATA to
give the user access to network
layer services such as delivery
confirmation, "more data" indications
and expedited data. Gcom has
extended the use of these M_PROTO
messages to include the X.25
Q-bit and some SNA control bits.
The NPI protocol is used to
interface to the X.25 packet
level protocol. Gcom also uses
NPI to interface to SNA and
Bisync.
The Transport Layer Interface
(TLI) protocol defines a set
of STREAMS message formats and
procedures for interfacing to
a transport layer entity. The
TLI protocol allows for the
user to bind transport layer
SAP addresses to a connection
or to issue network connection
requests to remote hosts on
the network. The M_PROTO vocabulary
contains elements for connecting
and disconnecting virtual circuits.
It also contains constructs
that allow the user to send
datagrams over a connectionless
service. The TLI protocol is
used to interface to TCP/IP,
UDP and ISO Transport.
For more information on this subject,
view the following
link.

Users and Providers
An architectural consequence of
AT&T's interfacing protocol
definitions is that the four
protocols, when implemented,
come in user/provider pairs.
That is, the implementation
of, say, DLPI in a MAC driver
is a "provider" implementation
since it is providing DLPI services
to a STREAMS driver located
above it (or a user program).
The implementation of DLPI in,
say, the IP protocol is a "user"
implementation since it utilizes
the services of a DLPI provider
located below it.
This is a central concept of protocol
stack building in STREAMS. Each
layer of protocol (except for
the bottom layer which controls
hardware) contains a provider
module of some type in its "upper"
portion, a protocol engine of
some sort in the middle and
a user module of some type in
its "lower" portion to interface
to the next driver downstream.

TCP/IP Protocol Architecture
In UNIX kernels that follow the
AT&T System V design, TCP/IP is implemented
in STREAMS. The following is
an illustration of a TCP/IP
protocol stack in STREAMS.
TCP connections are accessed as
files from user space via the
Stream Head. The illustration
at the left shows two active
TCP streams. The TCP/IP protocol
module contains a TLI Provider
module at its upper edge. Thus,
the user program must implement
the user side of the TLI protocol
interchange.
The IP module is, logically speaking,
the lower half of the TCP/IP
driver. It uses the DLPI protocol
to communicate with drivers
below it. IP has no preconceived
ideas about which or how many
drivers are below it. Protocol
management software running
in user space is responsible
for connecting DLPI capable
drivers below TCP/IP. IP is
told about each lower driver
as it is connected. It is given
the driver's interface address
which it uses in constructing
a "bind request" M_PROTO to
associate a SAP with each lower
stream. The SAP in this case
is an IP address. IP then chooses
the interface on which to
send a packet based upon these
addresses and routing criteria.
The TCP/IP driver is a STREAMS
Multiplexor. It is really the
most general case of a multiplexor
in that it can have an arbitrary
number of TCP (and UDP) connections
above and an arbitrary number
of DLPI driver interfaces below.
The driver below IP delivers incoming
IP packets upstream based upon
the SAPs that are bound to it.
The MAC driver may have multiple
SAPs bound to it on multiple
streams allowing IP packets,
ARP packets, IPX packets and
OSI packets all to go to different
service providers.
The interesting thing about this
architecture is that the kernel
has absolutely no foreknowledge
of how these protocol arrangements
will be set up. The STREAMS
mechanism provides the tools
and then driver implementations
and conventions beyond the knowledge
of the kernel provide the rest.

Adding More Drivers to TCP/IP
The drawing at the left illustrates
how one can construct a TCP/IP
protocol stack with Frame Relay
as one of the interfaces.
The Frame Relay protocol module
is linked below IP. The interface
is DLPI, so IP's interfacing
requirements are met. In the
illustration, just one Frame
Relay circuit is shown connected
to IP, but in practice many
virtual circuits from a single
Frame Relay access line could
be fed into IP. Each virtual
circuit would represent a link
to a different host. In this
manner, multiple TCP/IP LANs
can be connected together via
a Frame Relay network.
As will be emphasized many times
in this document, the STREAMS
mechanism allows protocol configurations
such as this to be built without
any explicit knowledge on the
part of the STREAMS executive,
the kernel or even the TCP/IP
protocol module itself. These
are simply protocol stacks that
use STREAMS mechanisms to hook
drivers together. The drivers
then communicate using conventions
established between themselves,
and in this case, they use standard
interfacing conventions agreed
upon industry-wide.
|
|
An X.25 Protocol Stack
The drawing at the left illustrates
how Gcom constructs an X.25
protocol stack using STREAMS
drivers. For simplicity's sake,
only one X.25 virtual circuit
is shown coming up out of the
NPI driver to a user process
(not shown) above the Stream
Head.
The NPI Provider driver is a STREAMS
multiplexor which can have multiple
NPI connections above and multiple
link layer streams below. The
top portion of the driver implements
the NPI Provider STREAMS interfacing
protocol. The bottom portion
implements the DLPI User STREAMS
interfacing protocol.
In between the NPI Provider and
the DLPI User is the X.25 packet
level protocol implementation.
It processes LAPB I-frames which
it receives from LAPB (the X.25
Frame Level) via the DLPI interface
to the driver below.
The DLPI Provider is also a STREAMS
multiplexor. It can have multiple
link layer streams above and
multiple physical connections
below. In the case of the X.25
Frame Level (LAPB) there is
one upper stream for each lower
physical connection. As we will
see shortly, other link layer
protocols employ multiplexing
to allow for multiple upper
streams over a single physical
line.
The CDI Provider provides line
driver services for the DLPI
driver above. The lower portion
of the DLPI driver contains
a CDI User module which communicates
with the CDI Provider STREAMS
driver at the bottom of the
protocol stack.
This protocol stack, and others
similar to it, is constructed
using only the mechanisms of
connecting streams together
and message passing provided
by STREAMS plus inter-driver
interfacing conventions that
follow an industry standard.
No kernel in which this protocol
stack runs has any concept of
"X.25" built into it.
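The two mechanisms named above, connecting streams together and passing
messages, can be imitated in Python. This is a toy model only: the Driver
class, the link() helper and the one-byte headers are invented for the
sketch and do not reflect the real X.25, LAPB or CDI formats.

```python
class Driver:
    """Toy STREAMS driver: knows its own header, not its neighbours."""

    def __init__(self, name, header):
        self.name = name
        self.header = header    # bytes this layer adds and removes
        self.upper = None       # driver above, set by link()
        self.lower = None       # driver below, set by link()
        self.sent = []          # messages leaving the bottom of the stack
        self.delivered = []     # messages arriving at the top of the stack

    def put_down(self, message):
        """Add this layer's header and pass the message downstream."""
        framed = self.header + message
        if self.lower is None:
            self.sent.append(framed)
        else:
            self.lower.put_down(framed)

    def put_up(self, message):
        """Strip this layer's header and pass the rest upstream."""
        if not message.startswith(self.header):
            raise ValueError(f"{self.name}: unexpected header")
        body = message[len(self.header):]
        if self.upper is None:
            self.delivered.append(body)
        else:
            self.upper.put_up(body)


def link(upper, lower):
    """Generic linking step: no protocol knowledge is involved."""
    upper.lower = lower
    lower.upper = upper


x25 = Driver("X.25 packet level", b"X")
lapb = Driver("LAPB", b"L")
cdi = Driver("CDI line driver", b"C")
link(x25, lapb)
link(lapb, cdi)

x25.put_down(b"hello")       # travels down through LAPB to the line
cdi.put_up(cdi.sent[0])      # the same bytes arriving from the line
```

Nothing in link() mentions X.25. The same helper would connect any other
pair of drivers that agree on an interfacing convention.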
|
|
An SNA Protocol Stack
The drawing on the left illustrates
how Gcom configures its SNA
protocol stack in STREAMS. The
streams that extend upward to
the Stream Head represent SNA
LU sessions. A user level application
is attached to each such stream
via the file system.
The NPI Provider is the same as
is used in X.25, that is, the
code is shared between X.25
and SNA. The DLPI User module
in this driver is also shared.
In the middle of the NPI driver
is the SNA protocol engine itself.
It processes incoming I-frames
from its Data Link Layer and
interprets the SNA header information.
It de-multiplexes LU sessions
to upper streams as illustrated
here.
Below the NPI driver is the DLPI
Provider. Its lower interface
is the CDI User. Both the provider
and user code are shared among
all forms of Gcom link layer
drivers. In the center of the
DLPI driver is the code for
SDLC. The SDLC code is also
capable of de-multiplexing incoming
data into multiple data link
stations. The illustration shows
only one such station feeding
up into the bottom of the NPI/SNA
driver, but there could just
as easily be more link stations
and more SNA PUs configured.
The CDI Provider is the same code
as is used for X.25. For an
SNA/SDLC line this interface
is usually operated in half
duplex mode.
Note that there is considerable
re-use of code between X.25
and SNA. Both stacks use the
same STREAMS interfacing
modules. Only the
protocol processing modules
are different. We wish to emphasize,
yet again, that nothing in the
kernel has any a priori knowledge
concerning these SNA protocol
stacks. The mechanism of STREAMS
alone is general enough and
sufficient to build these protocol
stacks.
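The division between shared interfacing code and protocol-specific code
can be illustrated with a small Python sketch. The NpiDriver class, the
two engine functions and the one-byte frame layout are hypothetical. They
show the shape of the reuse, not Gcom's actual code.

```python
class NpiDriver:
    """Toy NPI driver: shared interface code around a protocol engine."""

    def __init__(self, engine):
        self.engine = engine  # the only protocol-specific part

    def from_below(self, frame):
        """Shared DLPI User side: hand an incoming I-frame to the engine."""
        stream_id, data = self.engine(frame)
        return self.to_user(stream_id, data)

    def to_user(self, stream_id, data):
        """Shared NPI Provider side: deliver data on an upper stream."""
        return {"stream": stream_id, "data": data}


def x25_engine(frame):
    # Invented format: the first byte selects the virtual circuit.
    return ("vc", frame[0]), frame[1:]


def sna_engine(frame):
    # Invented format: the first byte selects the LU session.
    return ("lu", frame[0]), frame[1:]


x25_driver = NpiDriver(x25_engine)
sna_driver = NpiDriver(sna_engine)
```

Both drivers run the same from_below() and to_user() code. Swapping the
engine is the only change needed to move from X.25 to SNA.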
|
|
A Deeper SNA Protocol Stack
The drawing at the left illustrates
a much more complicated SNA
protocol stack. In this example,
SNA is running on top of X.25,
using an X.25 virtual circuit
in place of the Data Link Layer.
From the top down, the data streams
coming out the top of the NPI
driver and going to the Stream
Head are LU sessions that terminate
in a user application program.
The user level application does
not have to process any SNA
protocol headers; that is all
done by the driver level code.
The NPI Provider contains the
SNA protocol processing and
an interface to the Data Link
Layer via the DLPI User interfacing
module. This arrangement is
exactly the same as was shown
above in the SDLC example. In
point of fact, SNA has no knowledge
that the DLC module below it
is QLLC rather than SDLC.
The DLPI Provider module below
SNA contains the QLLC protocol
module. This module operates
one X.25 virtual circuit and
sends "frames" from SNA over
the virtual circuit. Its lower
module is an NPI User which
interfaces to the NPI Provider
for the X.25 virtual circuit
interface.
Below QLLC is the NPI Provider
again. This is the same code
as above the SNA protocol module,
only this time the interface
is going to X.25 rather than
SNA. The two protocols co-exist
inside the same STREAMS driver
and are reached via NPI connect
request M_PROTO messages whose
addresses resolve either to
an SNA LU session or an X.25
virtual circuit. The DLPI User
module in the lower part of
this driver is used to interface
to LAPB below.
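How a single NPI driver might decide which protocol a connect request is
meant for can be sketched as a table lookup. The function name, the two
tables and the sample addresses below are hypothetical. The text states
only that the address resolves to either an SNA LU session or an X.25
virtual circuit.

```python
def resolve_connect_request(address, sna_lus, x25_circuits):
    """Return the protocol engine and endpoint for a connect request.

    `address` is the destination carried in the connect request. The
    two tables stand in for configured endpoints (an assumption).
    """
    if address in sna_lus:
        return ("SNA", sna_lus[address])
    if address in x25_circuits:
        return ("X.25", x25_circuits[address])
    raise LookupError(f"no endpoint for NPI address {address!r}")


# Sample endpoints, invented for the sketch.
sna_lus = {"LU0001": "lu-session-1"}
x25_circuits = {"31104150012345": "virtual-circuit-7"}
```

The caller above the Stream Head issues the same kind of connect request
in both cases. Only the address differs.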
The DLPI Provider below X.25 is
the standard X.25 LAPB frame
level. The DLPI Provider code
is shared between LAPB, SDLC
and QLLC. Notice that the lower
interface here is CDI User and
that for QLLC it is NPI User.
Thus, the Gcom DLPI driver can
communicate with different types
of downstream drivers using
different STREAMS interfacing
protocol user modules.
There could easily be much more
"fanout" in this protocol stack
since X.25 could de-multiplex
a number of virtual circuits,
each of which could lead via
QLLC to an SNA PU, each of which
could have LU sessions operating.
And this could be occurring
over multiple physical lines.
The STREAMS subsystem in the kernel
allows all of this to occur
by virtue of the fact that it
is a general purpose tool and
imposes no restrictions on the
user's ability to build protocol
stacks.
Gcom has followed that model in
configuring its own protocol
stacks by implementing a suite
of protocols with User/Provider
modules to go with them and
then letting the configuration
files determine the exact configuration.
This takes maximal advantage
of the flexibility of STREAMS
in building protocol stacks.
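The idea of letting configuration decide the shape of the stack can be
sketched as follows. The list-of-pairs format and the build_stack()
function are hypothetical and are not Gcom's configuration file syntax.
Each pair names an upper protocol module and the module linked below it.

```python
# Hypothetical configurations: (upper module, lower module) pairs.
SNA_OVER_SDLC = [("sna", "sdlc"), ("sdlc", "cdi")]
SNA_OVER_X25 = [
    ("sna", "qllc"),
    ("qllc", "x25"),
    ("x25", "lapb"),
    ("lapb", "cdi"),
]


def build_stack(links):
    """Return module names from top to bottom by following the links."""
    below = dict(links)
    lowers = set(below.values())
    tops = [upper for upper in below if upper not in lowers]
    if len(tops) != 1:
        raise ValueError("expected exactly one top-level module")
    stack = [tops[0]]
    while stack[-1] in below:
        nxt = below[stack[-1]]
        if nxt in stack:
            raise ValueError(f"configuration loops back to {nxt!r}")
        stack.append(nxt)
    return stack
```

The same function yields the shallow SDLC stack or the deeper QLLC over
X.25 stack. The difference lives entirely in the configuration data.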
|
|