[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
SXHASH on circular structures?
- To: Hoey@NRL-AIC.ARPA
- Subject: SXHASH on circular structures?
- From: Jon White <JLW@SU-AI.ARPA>
- Date: Fri, 17 May 1985 11:35:00 -0000
- Cc: common-lisp@SU-AI.ARPA
In-reply-to: your message of 10-May-85 3:54:51 EDT
Yes, one might implement SXHASH in a manner that would loop on circular
structures, but there is no requirement to do so. In fact, the CLM is
so vague about sxhash that (defun sxhash (x) 1) is a perfectly legitimate
definition!
I think there has been a common misconception about sxhash that it must
descend into every node of a structure; indeed, the first implementation
in pdp10 MacLisp, over a decade ago, did just that (and I'm no doubt the
guilty party). But more recently, I (and others) have experimented with
definitions which are depth-limited. [Similarly, during the NIL development,
both NIL and pdp10 MacLisp got definitions for a number of functions which were depth-limited, solely as a (slow) way to stop
depth-limiting; this was done primarily as a low-overhead, fail-safe way to
detect circularities. However, a fairly large limit has to be allowed, and thus
this idea has a loop hole on "blam" lists given to EQUAL; but it effectively
and somewhat quickly stop circularites on LENGTH, NTH, and a few others.]
But the purpose of SXHASH is to provide a user interface to the "numericalizing"
function needed for hash arrays; there is no need to make it more discriminatory
that would be useful in providing a moderately random distribution of collision
chains in hash tables. SXHASH is a many-to-one mapping, and there is no reason
to assume that, statistically speaking, the depth cutoff is more damaging than
is the accidental coinciding of a particular dataset -- one might think of this
"accidental coinciding" as the inherent noise in the information of the problem.
Of course, too small a limit on the depth would certainly be noticeable in
terms of the *average* speed of hashing functions, but just exactly where
that boundary is may be difficult to determine. [I wouldn't seriously
suggest using a constant function, or even one that merely discriminated
on the data type -- but can anyone seriously suggest that depth 1000 will
be statistically different from depth 100? I would like to know.]
For some months now, Lucid Common Lisp has been running with variations on
the limited-depth sxhash, and not only does it meet the CLM requirements,
but it seems to meet our day-to-day needs.
-- JonL --