
Array proposal (long msg)



At the 8/21 meeting, we more or less decided on a general sort of
array/vector scheme and I volunteered to work out a detailed proposal.
The reason this has taken so long is that when I tried to work out the
details of what we had discussed (or what I THOUGHT we had discussed), I
ran into some inconsistencies.  The following proposal, then, is based
rather loosely on the 8/21 discussion.

I propose the following:

Arrays can be 1-D or multi-D.  All arrays can be created by MAKE-ARRAY
and can be accessed with AREF.  Storage is done via SETF of an AREF.
1-D arrays are special, in that they are also sequences, and can be
referenced by ELT.  Also, only 1-D arrays can have fill pointers.
Suppose we use the term VECTOR to refer to all 1-D arrays, since that is
what "vector" means in the vernacular.
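
As a minimal sketch of the basic scheme, here is how creation, access,
and storage might look.  It is written against the MAKE-ARRAY keywords
that Common Lisp later standardized (:INITIAL-ELEMENT, :FILL-POINTER),
which is an assumption relative to this proposal; the variable names
*GRID* and *VEC* are purely illustrative.

```lisp
;; Illustrative only: :INITIAL-ELEMENT and :FILL-POINTER are taken from
;; the later standard, not from the proposal text.
(defparameter *grid* (make-array '(2 3) :initial-element 0))  ; multi-D array
(defparameter *vec*  (make-array 5 :initial-element 0
                                   :fill-pointer 3))          ; 1-D, fill pointer

(setf (aref *grid* 1 2) 42)   ; storage is SETF of an AREF
(setf (aref *vec* 0) 7)       ; AREF works on vectors too

(aref *grid* 1 2)             ; read the multi-D element back
(elt *vec* 0)                 ; a vector is also a sequence, so ELT applies
(length *vec*)                ; the fill pointer sets the active length
(array-dimension *vec* 0)     ; the allocated size is unchanged
```

Only the vector can take ELT or a fill pointer here; *GRID* is reachable
through AREF alone.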

Vectors can be specialized along several distinct axes.  The first is by
the type of the elements, as specified by the :TYPE keyword to
MAKE-ARRAY (actually, I would much prefer :ELEMENT-TYPE as the keyword
for this option, since :TYPE is confusing here).  A vector whose
element-type is STRING-CHAR is referred to as a STRING.  Strings, when
they print, use the "..." syntax; they also are the legal inputs to a
family of string-functions, as defined in the manual.  A vector whose
element-type is BIT (alias (MOD 2)) is a BIT-VECTOR.  These are special
because they form the set of legal inputs to the boolean bit-vector
functions.  (We might also want to print them in a strange way -- see
below.)
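
A sketch of specialization by element type, assuming the :ELEMENT-TYPE
spelling preferred above.  CHARACTER stands in for STRING-CHAR here
(an assumption, since the later standard has no STRING-CHAR), and the
names *STR*, *BITS*, and *MASK* are hypothetical.

```lisp
;; Illustrative only: CHARACTER is a stand-in for STRING-CHAR.
(defparameter *str*
  (make-array 3 :element-type 'character :initial-contents "abc"))
(defparameter *bits*
  (make-array 4 :element-type 'bit :initial-contents '(1 0 1 1)))
(defparameter *mask*
  (make-array 4 :element-type '(mod 2) :initial-contents '(1 1 0 0)))

(stringp *str*)           ; a vector of characters is a string
(bit-vector-p *bits*)     ; a vector of bits is a bit-vector
(bit-vector-p *mask*)     ; (MOD 2) names the same element type as BIT
(string-upcase *str*)     ; strings are legal inputs to the string functions
(bit-and *bits* *mask*)   ; bit-vectors are legal inputs to the boolean functions
```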

Some implementations may provide a special, highly efficient
representation for simple vectors.  (These are the things we were
tentatively calling "quick arrays" at the meeting -- I think "simple
vector" is a much better name.)  A simple vector is (of course) 1-D,
cannot have a fill pointer, cannot be displaced, and cannot be altered
in size after its creation.  To get a simple vector, you use the :SIMPLE
keyword to MAKE-ARRAY (or MAKE-STRING, etc.) with a non-null value.  If
there are any conflicting options specified, an error is signalled.  If
an implementation does not support simple vectors, this keyword/value is
ignored except that the error is still signalled on inconsistent cases.

We need a new set of type specifiers for simple things: SIMPLE-VECTOR,
SIMPLE-STRING, and SIMPLE-BIT-VECTOR, with the corresponding
type-predicate functions.  Simple vectors are referenced by the usual
forms (AREF, CHAR, BIT), but the user may use THE or DECLARE to indicate
at compile-time that the argument is simple, with a corresponding
increase in efficiency.  Implementations that do not support simple
vectors ignore the "simple" part of these declarations.
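
A sketch of how such declarations might be written, using DECLARE in
one function and THE in another.  The function names SUM-SIMPLE and
FIRST-CHAR are hypothetical, and whether a given compiler turns the
declaration into faster code is implementation-dependent.

```lisp
(defun sum-simple (v)
  "Sum the elements of V, which the caller promises is a simple vector."
  (declare (type simple-vector v))
  (let ((total 0))
    (dotimes (i (length v) total)
      ;; The usual accessor is still used; the declaration only tells
      ;; the compiler that V is simple.
      (incf total (aref v i)))))

(defun first-char (s)
  "Return the first character of S, asserted in place to be a simple string."
  (char (the simple-string s) 0))

(sum-simple (vector 1 2 3 4))   ; VECTOR returns a fresh simple vector
(first-char "hello")
```

The call sites do not change; only the declaration distinguishes the
simple case from the general one.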

Strings (simple or non-simple) would self-eval; all other arrays would
cause an error when passed to EVAL.  EQUAL would descend into strings,
but not into any other arrays.  EQUALP would descend into arrays of all
kinds, comparing the corresponding elements with EQUALP.  EQUALP would
be false if the array dimensions are not the same, but would not be
sensitive to the element-type of the array.
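
The equality rules can be sketched as follows.  Note one assumption:
EQUAL as later standardized also descends into bit-vectors, which goes
beyond what is proposed here, so this sketch sticks to strings and
general vectors.

```lisp
;; EQUAL descends into strings but not into other arrays.
(equal "abc" (copy-seq "abc"))            ; true
(equal (vector 1 2 3) (vector 1 2 3))     ; false: two distinct vectors

;; EQUALP descends into arrays of all kinds.
(equalp (vector 1 2 3) (vector 1 2 3))    ; true

;; EQUALP is not sensitive to the element-type of the array ...
(equalp (make-array 3 :element-type 'bit :initial-element 1)
        (vector 1 1 1))                   ; true

;; ... but it is false when the dimensions differ.
(equalp (make-array '(2 3) :initial-element 0)
        (make-array '(3 2) :initial-element 0))   ; false
```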

Completely independent of the above classifications is the question of
whether or not an array is normally printed.  If the :PRINT keyword to
MAKE-ARRAY has a non-null value, the array will try to print its
contents, subject to PRINLEVEL and PRINLENGTH-type constraints;
otherwise, the array would print as a non-readable object: #<array-...>.
I would suggest that if :PRINT is not specified, all vectors should
default to printing and all other arrays should default to non-printing.

Now the only problem is how to print these arrays.  If we want this
printing to preserve all features of the array (do we?) I think that the
only reasonable solution is to make the common cases print in a
nice-looking format and use #.(make-array...) for the rest.  Simple
strings could print in the double-quote syntax, simple-bit-vectors in
the #"..." format, simple vectors of element-type T could print as
#(...).  For arrays of element-type T, we could resurrect the #nA(...)
format, where n is the number of dimensions and the list contains the
elements, nested down n levels.  (I would not allow arbitrary sequences
here -- useless and confusing.)  The vector and array representations,
but not the string or bit-vector representations, would observe PRINLEVEL
and PRINLENGTH.  Everything else would have to use #.(make-array ...),
unless we want to make up some really horrible new notation.

Alternatively, we could print everything in the nice form, but lose the
information on whether the original was simple and whether its
element-type is T or something more restrictive.  All strings, simple or
not, would print as "...", all bit vectors as #"...", all other vectors
as #(...), and all other arrays as #nA(...).  I would prefer this, but
it might turn out to be a big screw for some applications if these
notations did not preserve all of the state of the original object.
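
For comparison, here is a sketch of the "print everything in the nice
form" alternative as it can be tried in Common Lisp as later
standardized.  Several things here are assumptions relative to this
message: the printer variables *PRINT-ARRAY* and *PRINT-LENGTH* stand in
for the :PRINT option and PRINLENGTH, bit-vectors come out as #*...
rather than #"...", and SHOW is a hypothetical helper.

```lisp
(defun show (object &key length)
  "Return OBJECT's printed form with array printing turned on.
LENGTH, if given, plays the role of PRINLENGTH."
  (let ((*print-array* t)
        (*print-pretty* nil)
        (*print-readably* nil)
        (*print-level* nil)
        (*print-length* length))
    (prin1-to-string object)))

(show "abc")                                       ; double-quote syntax
(show (vector 1 2 3))                              ; "#(1 2 3)"
(show (make-array '(2 2)
                  :initial-contents '((1 2) (3 4)))) ; "#2A((1 2) (3 4))"
(show (make-array 3 :element-type 'bit
                    :initial-contents '(1 0 1)))   ; "#*101"

;; A vector with a fill pointer prints exactly like a simple one, so
;; that part of its state is lost in the printed form.
(show (make-array 3 :fill-pointer 3
                    :initial-contents '(1 2 3)))   ; "#(1 2 3)"

;; The length limit applies to the vector form but not to bit-vectors.
(show (vector 1 2 3 4) :length 2)                  ; "#(1 2 ...)"
(show (make-array 3 :element-type 'bit
                    :initial-contents '(1 0 1))
      :length 2)                                   ; "#*101"
```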

Opinions?