[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
EQUALP hash tables (and more ...)
- To: tim@MIT-MC.ARPA
- Subject: EQUALP hash tables (and more ...)
- From: Jon White <JLW@SU-AI.ARPA>
- Date: Fri, 10 May 1985 03:56:00 -0000
- Cc: common-lisp@SU-AI.ARPA
Not too long ago, some of the LOOPS people at Xerox PARC needed a more
generalized form of hashing; the solution was to admit two more "properties"
of hashtables -- the "equivalenceing" function, and the "numericalizing"
function. Both of these are fully general arguments -- the user may supply
any reasonable function there -- and I don't see why such an extended definition
hasn't been accepted into CL; can anyone remember a reason for the restriction
to such a limited set? [current Interlisp-D release documents changes made
to hashing almost a year ago -- this loops-inspired extension was done only
at the end of 1984, and wasn't even fully integrated into the system by the
time I left Xerox in January 1985].
For "equivalence" function, the user may supply #'eq, #'string=, or even
something random like #'=-mod-5; the "numericalizing" function is simply
how to reduce a general s-expression into a reaonable sized fixnum; for EQ
type tables, this is usually something like MacLisp's MAKNUM (Interlisp's LOC),
and for EQUAL type tables it is usually SXHASH.
One very interesting point to note is that the CLM doesn't require that EQ
type tables actually use the pointer address (ie. numericalizing by MAKNUM);
apparently in an effort to maintain generality and portability, the only
specification for such tables is that the equivalencing test be EQ. I and
several of my colleagues "believe" that everyone thinks hashing on EQ means
using the pointer address, rather than the sxhash, as the numerical start --
but nothing in the CLM would prevent an implementation from using sxhash
instead. And indeed, the only way one would ever notice the difference
(apart from an unreasonably slow implementation of SXHASH!), is that
using MAKNUM would insulate against re-orderings of the "chains" inside
the table [I'm assuming everyone knows the problem with using sxhash
for the numericalizing -- that updates to a hashed structure will change
its sxhash value, and thereby cause it to be in a differnt chain in
the hash table.]
-- JonL --