Closed Bug 729940 Opened 12 years ago Closed 12 years ago

Stop using crappy hashing functions in Gecko

Categories

(Core :: General, defect)

Platform: x86 macOS
Type: defect
Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla13
Tracking Status
firefox13 --- fixed

People

(Reporter: bzbarsky, Assigned: justin.lebar+bug)

References

(Depends on 2 open bugs, Blocks 2 open bugs)

Details

(Whiteboard: [qa-])

Attachments

(4 files, 4 obsolete files)

We seem to have a lot of these around...  I'll file bugs as I go.
Depends on: 290032
Depends on: 729952
Depends on: 729953
> I'll file bugs as I go.

I'm working on a big roll-up patch which I'll try to split up into manageable pieces.  There are a lot of crappy hash functions.
Component: Tracking → General
QA Contact: chofmann → general
jorendorff is blamed for this comment jsscopeinlines.h:

inline JSDHashNumber
StackShape::hash() const
{
    JSDHashNumber hash = uintptr_t(base);

    /* Accumulate from least to most random so the low bits are most random. */
    hash = JS_ROTATE_LEFT32(hash, 4) ^ (flags & Shape::PUBLIC_FLAGS);
    hash = JS_ROTATE_LEFT32(hash, 4) ^ attrs;
    hash = JS_ROTATE_LEFT32(hash, 4) ^ shortid;
    hash = JS_ROTATE_LEFT32(hash, 4) ^ slot_;
    hash = JS_ROTATE_LEFT32(hash, 4) ^ JSID_BITS(propid);
    return hash;
}

Any objection to using a hash here which properly mixes the bits?  Or is something special going on with the low-order bits of this hash value?
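For concreteness, a properly-mixing replacement might look like the following sketch. AddToHash here models the multiply-and-rotate mixing step discussed later in this bug (the golden-ratio constant is assumed), and HashStackShape with its uint32_t parameters is an illustrative stand-in for the real fields, not the patch that landed.

```cpp
#include <cstdint>

// Stand-in mixing step: rotate, xor, then multiply by the golden ratio,
// so entropy spreads into both the high and the low bits.
static inline uint32_t
AddToHash(uint32_t hash, uint32_t value)
{
  return 0x9E3779B9U * (((hash << 5) | (hash >> 27)) ^ value);
}

// Sketch of StackShape::hash with proper mixing; the parameters stand
// in for base, flags, attrs, shortid, slot_ and propid.
static inline uint32_t
HashStackShape(uint32_t base, uint32_t flags, uint32_t attrs,
               uint32_t shortid, uint32_t slot, uint32_t propid)
{
  uint32_t hash = AddToHash(0, base);
  hash = AddToHash(hash, flags);
  hash = AddToHash(hash, attrs);
  hash = AddToHash(hash, shortid);
  hash = AddToHash(hash, slot);
  return AddToHash(hash, propid);
}
```

Because every field passes through the multiplier, no field's bits sit unmixed in the low-order positions the way they do with rotate-xor alone.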
Attached patch WIP v1 (obsolete) — Splinter Review
I'm kind of amazed the browser stands up with this patch.  I need to split the patch up, do some housekeeping, and push to try.
Attachment #601263 - Attachment is obsolete: true
Assignee: nobody → justin.lebar+bug
Try run for ba6d140a923a is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=ba6d140a923a
Results (out of 77 total builds):
    success: 50
    warnings: 12
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-ba6d140a923a
Of course we have tests which rely on the iteration order of hashtables.

At least test_localStorageBase.html is honest about it:

  localStorage.setItem("key2", "value2");
  is(localStorage.length, 2, "The storage has two key-value pairs");
  is(localStorage.key(1), "key1"); // This test might not be accurate because order is not preserved
  is(localStorage.key(0), "key2");
Try run for b19b39a2ee51 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b19b39a2ee51
Results (out of 1 total builds):
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-b19b39a2ee51
Try run for ccc37aa97300 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=ccc37aa97300
Results (out of 218 total builds):
    success: 186
    warnings: 20
    failure: 12
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-ccc37aa97300
(In reply to comment 12)

This is all green modulo a Windows build error which seems to be a result of using stdint types in an mfbt cpp file.  Waldo and I are investigating.
No clear movement on our benchmarks.
Depends on: 731789
Green try run on all platforms with the fix in bug 731789: https://tbpl.mozilla.org/?tree=Try&rev=b8ebf3c4ca59

I'll post updated patches for review in a moment.
I'm not sure who to ask for the JS review, but this patch touches some TI code, so tossing it to bhackett.
Attachment #601859 - Flags: review?(bhackett1024)
Attachment #601438 - Attachment is obsolete: true
Attachment #601439 - Attachment is obsolete: true
Attachment #601440 - Attachment is obsolete: true
Attachment #601857 - Flags: review? → review?(honzab.moz)
These patches do not obliterate *all* the crummy hash functions in the tree.  In particular, a few callsites still use PL_HashString, which uses the old, crappy algorithm.

I grepped for '^' and 'ROTATE' to find the callsites I changed.
Try run for b8ebf3c4ca59 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b8ebf3c4ca59
Results (out of 279 total builds):
    exception: 1
    success: 242
    warnings: 22
    failure: 14
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-b8ebf3c4ca59
Comment on attachment 601859 [details] [diff] [review]
Part 2: Stop using crappy hash functions in the js engine.

Review of attachment 601859 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsobjinlines.h
@@ +1516,5 @@
>  
>  inline bool
>  NewObjectCache::lookup(Class *clasp, gc::Cell *key, gc::AllocKind kind, EntryIndex *pentry)
>  {
> +    uintptr_t hash = mozilla::GenericHash(clasp, key, kind);

I don't think this preserves the correctness of the N.B. below.  GenericHash(clasp, key) + kind should work, or just test for entry->kind == kind below.
Attachment #601859 - Flags: review?(bhackett1024) → review+
Comment on attachment 601861 [details] [diff] [review]
Part 3: Stop using crappy hash functions in Gecko

>+++ b/content/base/src/nsAttrValue.cpp

Add a "using namespace mozilla" here, and drop the explicit prefixes, please.

>+++ b/content/base/src/nsNameSpaceManager.cpp

Likewise.

>+++ b/content/base/src/nsNodeInfoManager.cpp

Likewise.

>+    return mozilla::HashString(nsDependentAtomString(node->mName));

I wish we could use node->mName->hash() here, but the mName and mNameString cases need to produce the same hash when they represent the same string.  Perhaps document that here?

We should consider having nsIAtom::hash() return the HashString of the string instead of what it returns right now.  Followup bug on that, maybe?

>+++ b/gfx/thebes/gfxFont.h
>+            return mozilla::GenericHash(aKey->mFontEntry, aKey->mStyle->Hash());

Should this maybe be mozilla::AddToHash(aKey->mStyle->Hash(), aKey->mFontEntry) ?

Either way, I guess.

>+++ b/xpcom/ds/nsHashtable.cpp
> nsCStringKey::HashCode(void) const
>-    return nsCRT::HashCode(mStr, (PRUint32*)&mStrLen);
>+    return HashString(mStr, mStrLen);

So this is actually changing behavior.  The old code only hashed the bytes up to the first null byte of mStr and stored the number of such bytes in mStrLen.  The new code hashes all bytes in mStr and doesn't change mStrLen.

It looks like the old code was introduced in:

  3.35 <warren@netscape.com> 2000-08-20 14:29
  Fix for hash code performance problem discovered by bienvenu. 'Sampling' hash
  code was statistically evil.

That change also commented out the strlen calls in the constructors.  Then those commented out calls got removed at some point, then readded in:

  3.45 <brendan@mozilla.org> 2001-07-31 12:05
  FASTLOAD_20010703_BRANCH landing, r=dbaron, sr=shaver.

I think the old behavior was nuts and the new one is fine, fwiw.  I have no idea what warren was trying to get at there.  We can try asking bienvenu, in case he recalls, if we care enough.  Just something to watch out for post-landing, mostly.

> nsStringKey::HashCode(void) const

Same here.

>+++ b/xpcom/ds/nsStaticNameTable.cpp

This is at least the second consumer I've seen for something like a "case-insensitive hash key".  Worth considering a followup bug on consolidating those a bit too.  Might need two variants (Unicode-aware and ASCII-lowercasing).

>+++ b/xpcom/glue/nsHashKeys.h

>+class nsStringCaseInsensitiveHashKey : public PLDHashEntryHdr

This thing is a footgun, sorta, because of the copy.  May be worth a followup to avoid that, maybe.

r=me with the nits dealt with
Attachment #601861 - Flags: review?(bzbarsky) → review+
> The old code only hashed the bytes up to the first null byte of mStr and stored the 
> number of such bytes in mStrLen.  The new code hashes all bytes in mStr and doesn't 
> change mStrLen.

Yeah, I have no idea what this code was trying to do, calculating strlen twice.  It looks like it was just buggy.  Hopefully it's not a change in behavior in practice, because that would mean that string lengths are changing while strings are in the hashtable!
Waldo suggested I use "**" instead of "^" for exponentiation in the AddU32ToHash comment, in bug 729952 comment 62, and because I forgot to push that change, I'm following up here.

I changed it locally, but "2**32" just doesn't read right to me. I think it's pretty clear that "2^32" is exponentiation, in context.  But I agree that we're mixing metaphors.  I could change the "^" for xor earlier to "xor"; maybe that would be better?

What do you think?
(In reply to Justin Lebar [:jlebar] from comment #25)
> I changed it locally, but "2**32" just doesn't read right to me. I think

To anyone who knows Fortran, ** for exponentiation looks perfectly natural. That may be a minority of programmers these days. However, I used this same operator for the same disambiguation reasons through the entire Opus spec, so that should tell you what my opinion is.
> To anyone who knows Fortran, ** for exponentiation looks perfectly natural.

That sounds like a reason /not/ to use **, if anything!  :-p

More to the point, Python also uses ** for exponentiation, and I'm already using a Python-inspired list comprehension with "[x * m (mod 2^32) for 0 <= x < 2^32]".
Comment on attachment 601857 [details] [diff] [review]
Part 0, v1: Fix {local,session}Storage tests which rely on hashtable iteration order.

Review of attachment 601857 [details] [diff] [review]:
-----------------------------------------------------------------

r=honzab

::: dom/tests/mochitest/localstorage/test_localStorageBase.html
@@ +99,5 @@
> +  var firstKey = localStorage.key(0);
> +  var secondKey = localStorage.key(1);
> +  ok((firstKey == 'key1' && secondKey == 'key2') ||
> +     (firstKey == 'key2' && secondKey == 'key1'),
> +     'Both keys should be present.');

The message should be "key() API works" or so.
Attachment #601857 - Flags: review?(honzab.moz) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/c228a0a6f2ce

I didn't pick up the review comment, sorry. I'm sure jlebar will handle it at some point.
The pushes in comment 30 are for bug 729952.  Comment 29 is patch 0 here.

Sorry for the confusion; I didn't do a good job pushing this series of dependencies to try, so I didn't realize I needed patch 0 for bug 729952.
Comment on attachment 601858 [details] [diff] [review]
Part 1, v1: Add hash function utilities to mfbt.

Review of attachment 601858 [details] [diff] [review]:
-----------------------------------------------------------------

Holy helper functions, Batman!

::: mfbt/HashFunctions.cpp
@@ +35,5 @@
> + * and other provisions required by the GPL or the LGPL. If you do not delete
> + * the provisions above, a recipient may use your version of this file under
> + * the terms of any one of the MPL, the GPL or the LGPL.
> + *
> + * ***** END LICENSE BLOCK ***** */

MPL2

@@ +54,5 @@
> +  // Walk word by word.
> +  size_t i = 0;
> +  for (; i < length - (length % sizeof(size_t)); i += sizeof(size_t)) {
> +
> +    // Do an explicitly unaligned load of the data.

Remove the blank line above this.

@@ +62,5 @@
> +    hash = AddToHash(hash, data, sizeof(data));
> +  }
> +
> +  // Get the remaining bytes.
> +  while(i < length) {

while (

::: mfbt/HashFunctions.h
@@ +48,5 @@
> + *                  length.
> + *
> + *  - HashBytes     Hash a byte array of known length.
> + *
> + *  - GenericHash   Hash one or more values.  Currently, we support uint32_t,

Consistency with HashString and HashBytes suggests this should be HashGeneric.

@@ +82,3 @@
>  #include "mozilla/Attributes.h"
>  #include "mozilla/Types.h"
> +#include "mozilla/Util.h"

What are you using from Util.h?  I don't think you're actually using anything from it.  And we'd like to remove Util.h eventually (and move stuff in it into separate headers), so it's best not to add new uses of it if possible.

@@ +182,5 @@
> +AddToHash(uint32_t hash, A* a)
> +{
> +  /*
> +   * You might think this function should just take a void*.  But then we'd only
> +   * catch data pointers and couldn't handle function pointers.

I assume we have a use case for hashing function pointers?

@@ +197,5 @@
> +  }
> +  else {
> +    MOZ_ASSERT(false, "Unknown pointer size.");
> +    return uint32_t(-1);
> +  }

Hmm.  So this contemplates a runtime assertion in the unsupported case, rather than a compilation failure.  And (rather more importantly) it means that all the time we're going to have known-false comparisons, tempting compilers to emit warnings.  I think we can extend the templates another level to avoid this possibility, basically like so:

template<size_t PointerSize = sizeof(uintptr_t)>
MOZ_WARN_UNUSED_RESULT inline uint32_t
AddUintptrToHash(uint32_t hash, uintptr_t u)
{
  MOZ_STATIC_ASSERT(PointerSize == 4 || PointerSize == 8,
                    "unimplemented for bizarro architectures");
}

template<>
MOZ_WARN_UNUSED_RESULT inline uint32_t
AddUintptrToHash<4>(uint32_t hash, uintptr_t u)
{
  return AddToHash(hash, static_cast<uint32_t>(u));
}

template<>
MOZ_WARN_UNUSED_RESULT inline uint32_t
AddUintptrToHash<8>(uint32_t hash, uintptr_t u)
{
  return detail::AddU64ToHash(hash, static_cast<uint64_t>(u));
}

Then replace the if-else if-else with just |return detail::AddUintptrToHash(hash, p);|.  It's probably worth tryservering this to verify compilers won't do something stupid here.

@@ +291,5 @@
> +  T c;
> +  while ((c = *str)) {
> +    hash = AddToHash(hash, c);
> +    str++;
> +  }

A for-loop seems rather more appropriate to this purpose:

for (T c; (c = *str); str++)
  hash = AddToHash(hash, c);

@@ +303,5 @@
> +{
> +  uint32_t hash = 0;
> +  for (size_t i = 0; i < length; i++) {
> +    hash = AddToHash(hash, str[i]);
> +  }

mfbt doesn't brace single-line bodies.

@@ +319,5 @@
> +MOZ_WARN_UNUSED_RESULT
> +inline uint32_t
> +HashString(const char* str)
> +{
> +  return detail::HashUntilZero(str);

Maybe just call HashString(str, strlen(str))?  Dunno how much the tradeoffs of traversing twice really matter...

@@ +345,5 @@
>  }
>  
> +/*
> + * On Windows, wchar_t (PRUnichar) is not the same as uint16_t, even though it's
> + * the same width!

Bug 726888 wants a Unicode character type too.  I guess this is added incentive to do the work to provide one soon...

@@ +355,2 @@
>  {
> +  return detail::HashUntilZero(str);

Is it the case that we'll never be hashing something containing an embedded null, for sure?

@@ +369,5 @@
> + *
> + * This hash walks word-by-word, rather than byte-by-byte, so you won't get the
> + * same result out of HashBytes as you would out of HashString.
> + */
> +MOZ_WARN_UNUSED_RESULT

extern
Attachment #601858 - Flags: review?(jwalden+bmo) → review+
I was going for Python with ** rather than Fortran, btw.  :-P  (And I guess Perl and Ruby too, not that I knew that before web searches just now.)
> Consistency with HashString and HashBytes suggests this should be HashGeneric.

HashGeneric is better, but I don't really like GenericHash or HashGeneric.  Accepting suggestions for alternatives.

> I assume we have a use case for hashing function pointers?

This happens surprisingly often.  The JS engine uses it in a few places, for instance.

> Maybe just call HashString(str, strlen(str))?  Dunno how much the tradeoffs of traversing twice 
> really matter...

Calling strlen is also an indirect jump, unless libc is compiled in statically.  Anyway, I wanted to write code without perf surprises.

> Is it the case that we'll never be hashing something containing an embedded null, for sure?

If you have an embedded zero, you have to call one of the known-length hash functions.  Otherwise, there's very little we can do!
Well, chalk one up for helpful template error messages:

> error: default template arguments may not be used in function templates without -std=c++0x or 
> -std=gnu++0x"

I can explicitly send the size in the call; is that OK with you?

return AddUintptrToHash<sizeof(uintptr_t)>(hash, p)
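Put together, the workaround compiles without C++11 default template arguments like so. This is a sketch: AddToHash32, AddToHash64, and AddPointerToHash are illustrative stand-ins for the real mfbt helpers, and the golden-ratio multiplier is assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative 32-bit mixing step (multiplier assumed).
static inline uint32_t
AddToHash32(uint32_t hash, uint32_t value)
{
  return 0x9E3779B9U * (((hash << 5) | (hash >> 27)) ^ value);
}

// Illustrative 64-bit variant: mix the two halves in turn.
static inline uint32_t
AddToHash64(uint32_t hash, uint64_t value)
{
  uint32_t lo = static_cast<uint32_t>(value);
  uint32_t hi = static_cast<uint32_t>(value >> 32);
  return AddToHash32(AddToHash32(hash, lo), hi);
}

// No default template argument, so this builds without -std=c++0x; the
// unspecialized version is declared but never defined, forcing a link
// error on bizarro pointer sizes.
template<size_t PointerSize>
uint32_t AddUintptrToHash(uint32_t hash, uintptr_t u);

template<>
inline uint32_t
AddUintptrToHash<4>(uint32_t hash, uintptr_t u)
{
  return AddToHash32(hash, static_cast<uint32_t>(u));
}

template<>
inline uint32_t
AddUintptrToHash<8>(uint32_t hash, uintptr_t u)
{
  return AddToHash64(hash, static_cast<uint64_t>(u));
}

static inline uint32_t
AddPointerToHash(uint32_t hash, const void* p)
{
  // The size is passed explicitly at the call site instead of via a
  // default template argument.
  return AddUintptrToHash<sizeof(uintptr_t)>(
      hash, reinterpret_cast<uintptr_t>(p));
}
```

Since only the matching specialization is instantiated, there are no known-false size comparisons for the compiler to warn about.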
Heh, I was testing everything with -std=c++11 because I originally tried using = delete on the unspecialized version, but clang won't compile it (not sure if it's a bug or not, need to talk to espindola about it).  Sure, sending it explicitly is good -- or I think you could use <> as being marginally better in that it invokes the default argument.

I don't particularly like or dislike HashString and HashBytes, myself.  Best to be consistent whatever unsatisfactory name gets used, methinks.  :-)

Catch me on IRC sometime and point out a place or two where the JS engine does this.  I suspect those are places where perhaps we should be using a union of some sort.
https://hg.mozilla.org/mozilla-central/rev/c228a0a6f2ce
Status: NEW → ASSIGNED
Whiteboard: [leave open]
Target Milestone: --- → mozilla13
Whiteboard: [leave open]
Attachment #601859 - Attachment description: Bug 729940 - Part 2: Stop using crappy hash functions in the js engine. → Part 2: Stop using crappy hash functions in the js engine.
> I wish we could use node->mName->hash() here, but the mName and mNameString cases need to produce 
> the same hash when they represent the same string.  Perhaps document that here?
>
> We should consider having nsIAtom::hash() return the HashString of the string instead of what it 
> returns right now.  Followup bug on that, maybe?

For my reference, this is kind of complicated right now because atoms store the pldhashtable's hashcode, which is

 (hash_we_calculate * golden_ratio) >> 1

plus some complication if hash_we_calculate * golden_ratio == 0 or 1.  The solution would probably be to store hash_we_calculate in the atom and then do the extra computation on it when needed.
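A sketch of that stored-hash computation follows. The golden-ratio constant is the usual one; StoredAtomHash is a hypothetical name, and the exact fixup for the reserved values is an assumption, since the comment above only says "some complication".

```cpp
#include <cstdint>

// Sketch of what an atom stores today, per the formula above.  The
// fixup for reserved hash codes is an assumption about PLDHashTable's
// internals.
static inline uint32_t
StoredAtomHash(uint32_t computedHash)
{
  const uint32_t goldenRatio = 0x9E3779B9U;
  uint32_t stored = (computedHash * goldenRatio) >> 1;
  // PLDHashTable reserves small hash codes for free/removed entries,
  // so nudge those out of the way (assumed fixup).
  if (stored < 2)
    stored = 2;
  return stored;
}
```

Storing the raw hash_we_calculate in the atom instead would let nsIAtom::hash() return it directly and apply this transformation only when inserting into a PLDHashTable.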
Blocks: 732814
Blocks: 732815
Try run for 5cc85d858fe8 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=5cc85d858fe8
Results (out of 215 total builds):
    exception: 1
    success: 172
    warnings: 40
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-5cc85d858fe8
> Was this expected?

No, and these were too small to have been seen when I pushed to try.

The non-PGO ones don't faze me as much as

  Regression :( V8 increase 1.9% on MacOSX 10.7 Mozilla-Inbound

I'll back out until I can get a handle on these.
(In reply to Justin Lebar [:jlebar] from comment #40)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/f9f51c726fa7
> https://hg.mozilla.org/integration/mozilla-inbound/rev/dfae37833c9b
> https://hg.mozilla.org/integration/mozilla-inbound/rev/afeafc02c1de

Backed out parts 1-3 in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8503cd3aac4

Since comment 41/comment 42, there has also been another pgo regression:

Regression :( V8 increase 1.6% on XP Mozilla-Inbound
----------------------------------------------------
    Previous: avg 9.350 stddev 0.051 of 30 runs up to revision b69617debd8d
    New     : avg 9.500 stddev 0.000 of 5 runs since revision 9b65b4ac15cd
    Change  : +0.150 (1.6% / z=2.950)
    Graph   : http://mzl.la/zkIuFW
Part 0a landed in mozilla-central for 13.0:
https://hg.mozilla.org/mozilla-central/rev/cf668a3984fe
I tried force-inlining and redoing the dicey hash function from comment 22, but that did not fix the regressions.

https://tbpl.mozilla.org/?tree=Try&rev=64c842aed6d6
Try run for 64c842aed6d6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=64c842aed6d6
Results (out of 261 total builds):
    exception: 2
    success: 219
    warnings: 36
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-64c842aed6d6
 Timed out after 06 hours without completing.
Two nits:

1. HashBytes:
In HashBytes, a memcpy is used to do an 'explicitly unaligned load of the data'.
When it is known that the array is uint32_t-aligned, a direct load of the uint32 will be faster.

Also, inside the loop of HashBytes, 'hash = AddToHash(hash, data, sizeof(data))'
is not right.  AddToHash with 3 args is translated into two AddToHash's, adding both data and 'sizeof(data)' (which is 4) to the hash.
This should be just 'hash = AddToHash(hash, data)'.

2. HashChars in jsatom.h:
Why manually inline HashString in jsatom.h?  There is no benefit, as a function is still called inside the loop.  Instead, just use HashString there, so there is one function call and then the hash loop inside it (without a function call inside the loop).  Or, even better, instead of defining HashChars inside jsatom.h, directly call HashString.  This will need an assembly check to verify which is faster...
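On nit 1, the word-by-word walk with the corrected two-argument AddToHash call looks roughly like this. A sketch only: it walks 32-bit words for brevity where the real code walks size_t-sized words, and AddToHash is an illustrative stand-in for the real mixing step.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the real mixing step.
static inline uint32_t
AddToHash(uint32_t hash, uint32_t value)
{
  return 0x9E3779B9U * (((hash << 5) | (hash >> 27)) ^ value);
}

// Sketch of HashBytes: memcpy expresses an explicitly unaligned load
// portably; compilers lower it to a single load where that is safe.
uint32_t
HashBytes(const void* bytes, size_t length)
{
  const unsigned char* b = static_cast<const unsigned char*>(bytes);
  uint32_t hash = 0;

  size_t i = 0;
  for (; i + sizeof(uint32_t) <= length; i += sizeof(uint32_t)) {
    uint32_t data;
    memcpy(&data, b + i, sizeof(data));
    hash = AddToHash(hash, data);  // two args, not (hash, data, sizeof(data))
  }

  // Hash the remaining tail bytes one at a time.
  for (; i < length; i++)
    hash = AddToHash(hash, b[i]);

  return hash;
}
```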
About 2:
Replace 
224     static HashNumber hash(const Lookup &l) { return HashChars(l.chars, l.length); }
with:
224     static HashNumber hash(const Lookup &l) { return mozilla::HashString(l.chars, l.length); }

And remove the HashChars function.
> 1. HashBytes:
> In the HashBytes a memcpy is used to 'explicitly unaligned load of the data'.
> When it is known that the array is uint32_t aligned, a direct load of the uint32 will 
> faster.

...on some architectures, at the cost of a branch, which will slow us down for small inputs.

Cityhash, which is by far the fastest commercial hash function I looked at, unconditionally uses memcpy for loading data, which is why I did the same.  If you can demonstrate that it's actually worse, I'd consider changing it.

> +    hash = AddToHash(hash, data, sizeof(data));

This is a bug; thanks.

> +    // We could call mozilla::HashString here, but that isn't inlined.

This comment was true at one point, but not anymore.  Good catch!
> The solution would probably be to store hash_we_calculate in the atom

Yes.  You wouldn't need to do the extra computation on it, imo.
Try run for b9895e83aa6c is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b9895e83aa6c
Results (out of 16 total builds):
    success: 16
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-b9895e83aa6c
All or at least a sizable part of the Dromaeo jslib regression is due to the JS engine changes.  (Not particularly surprising.)  I've pushed the changes to each file individually to see if I can narrow things down further.
It looks like the changes in jsscope.cpp (and possibly jswatchpoint.cpp) are causing the slowdown.  I pushed to try once for each file modified in the JS patch and ran Dromaeo five times with just that file's modifications.  Here are the Dromaeo jslib averages over those five runs:

jsscope.cpp		144.31
jswatchpoint.cpp	145.04
jshash.cpp		145.89
jsscopeinlines.h	145.90
jsdhash.cpp		146.24
jsatom.h		146.48
jsinferinlines.h	146.50
(No changes)		146.64
XPCMaps.cpp		147.50
jspropertycache.h	147.79
glandium, is it possible that MFBT_API-linkage for HashBytes is causing the function call to be slow in the JS engine?  Semi-random speculation here.
(In reply to Jeff Walden (:Waldo) (remove +bmo to email) from comment #54)
> glandium, is it possible that MFBT_API-linkage for HashBytes is causing the
> function call to be slow in the JS engine?  Semi-random speculation here.

We don't call HashBytes from JS.
The two possibilities for the jsscope slowdown are:

 1) The new hash function is slower, and/or
 2) The new hash function is causing some hashtables to collide more often.

I tried to test (2) by having HashTable output its |steps| statistic on destruction and running v8.  But it turns out that the total |steps| across all HashTables varies by a factor of three (!) between different runs of the same build, even though other stats (e.g. |searches|) stay constant.  So...no dice.

Since it seems that maybe collisions in HashTable aren't particularly expensive, I'm going to try testing (1) now.  This is kind of tricky, because the hash function's speed depends on the microarchitecture.  On my i7 box, I see no speed difference in microbenchmarks with or without the multiply in the hash.  But maybe the talos machines see a difference.
Wow, it looks like HashGeneric is not optimized as I expected.  There isn't a single rol instruction emitted in jsscope.cpp.  I wonder what's going on here...
Waldo, to answer a question from a while ago:

> Catch me on IRC sometime and point out a place or two where the JS engine
> [hashes function pointers].

See StackBaseShape::hash.  base->rawGetter and base->rawSetter are both function pointers.

(In reply to Justin Lebar [:jlebar] from comment #57)
> Wow, it looks like HashGeneric is not optimized as I expected.  There isn't
> a single rol instruction emitted in jsscope.cpp.  I wonder what's going on
> here...

I see...gcc is being clever and rolling the rotate into the multiply.  Sometimes we multiply by the golden ratio, and sometimes we multiply by a different value.

In any case, if the hash function is actually a bottleneck here, we're going to have a tough time going faster than rol & xor.  If I cheat and hash only the bottom 32 bits of the pointers (as the code currently does) then I'm at 25 instructions instead of 15, which seems pretty good!  But we saw the performance regression on 32-bit Windows, so this isn't the issue.

I'm finding it difficult to believe that 10 instructions in this path is showing up as a >1% regression in Dromaeo jslib.  On the other hand, it also doesn't seem that this hash function causes more collisions than the original.  So that leaves me...basically nowhere.  :(
At least on my Linux box, the new hash function has fewer collisions than the old one.  Not a lot fewer, but some fewer.  It's looking increasingly unlikely that the problem is the output of the hash function.

I counted the number of unique hash values coming out of InitialShapeEntry::hash and StackBaseShape::hash in one run of v8.

InitialShapeEntry::hash
  original:  7995 unique hashes
  new:       7996 unique hashes

StackBaseShape::hash
  original: 15065 unique hashes
  new:      14980 unique hashes
> StackBaseShape::hash
>  original: 15065 unique hashes
>  new:      14980 unique hashes

Sorry, this is reversed.  The new one has more unique hashes.
The (much) simpler hash function of

  hash = 33 * (hash ^ value)

seems to work ok, at least for the JS engine.  Pushed to try to see how fast it is.
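Spelled out, the candidate is a Bernstein-style (djb2-with-xor) step; a sketch, with the name SimpleAddToHash chosen here purely for illustration:

```cpp
#include <cstdint>

// hash = 33 * (hash ^ value), written as shift-and-add: cheap on every
// architecture, with no rotate and no large multiplier.
static inline uint32_t
SimpleAddToHash(uint32_t hash, uint32_t value)
{
  uint32_t mixed = hash ^ value;
  return (mixed << 5) + mixed;  // 33 * mixed
}
```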
(In reply to Justin Lebar [:jlebar] from comment #58)
> I'm finding it difficult to believe that 10 instructions in this path is
> showing up as a >1% regression in Dromaeo jslib.  On the other hand, it also
> doesn't seem that this hash function causes more collisions than the
> original.  So that leaves me...basically nowhere.  :(

Talos has produced junk regression reports before, e.g. bug 712714 had 10% V8 regressions on talos which did not reproduce at all locally.  If you can't reproduce the regressions with your own builds, I would just ignore the Talos numbers you get.
If you are looking at jsscope, and the StackBaseShape::hash method:
In the old code the hash is initialized from base->flags directly,
but the new way does a GenericHash on it.
Probably you could init the hash from base->flags directly as well (saving an IMUL).
In fact, there are more places where in the 'old' code the hash is initted from the first word, while the new code uses GenericHash, causing an extra multiply-and-rotate.

For example:
-    uintptr_t hash = (uintptr_t(clasp) ^ uintptr_t(key)) + kind;
+    uintptr_t hash = mozilla::GenericHash(clasp, key, kind);

The GenericHash(a,b) is translated into AddToHash(0, a, b), which is AddToHash(AddToHash(0, a), b).
But for the above cases a plain 'AddToHash(a, b)' is sufficient.

Another example (XPCMaps.cpp)
-    h = (JSDHashNumber) obj->GetFlags();
-    for (s = (const unsigned char*) obj->GetJSClass()->name; *s != '\0'; s++)
-        h = JS_ROTATE_LEFT32(h, 4) ^ *s;
-    return h;
+    h = GenericHash(obj->GetFlags());
+    return AddToHash(h, HashString(obj->GetJSClass()->name));
could be just:
+    return AddToHash(obj->GetFlags(), HashString(obj->GetJSClass()->name));

And please remove the const unsigned char *s, it is no longer used.
> In fact, there are more places where in the 'old' code, the hash is initted by the first word, and 
> in the new with GenericHsah causing an extra multiply&rotate.
>
> The GenericHash(a,b) is translated into AddToHash(0, a, b) which is AddToHash(AddToHash(0, a), b).
> But for above cases an plain 'AddToHash(a, b)' is sufficient.

The way it works is intentional.

Note that AddToHash(a, b) translates into

 Multiplier * (rol(a, 5) ^ b).

But I'm explicitly trying to avoid rol(a, 5) ^ b, because that's at the heart of what's problematic about the old hash.
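In code, the distinction looks like this sketch (the multiplier value is assumed; HashTwo is an illustrative name for the two-argument GenericHash expansion):

```cpp
#include <cstdint>

// AddToHash(a, b) = Multiplier * (rol(a, 5) ^ b), per the formula above.
static inline uint32_t
AddToHash(uint32_t hash, uint32_t value)
{
  const uint32_t multiplier = 0x9E3779B9U;  // assumed golden-ratio value
  return multiplier * (((hash << 5) | (hash >> 27)) ^ value);
}

// GenericHash(a, b) expands to AddToHash(AddToHash(0, a), b): the first
// argument is itself run through the multiplier before being combined,
// which is exactly what a plain AddToHash(a, b) seed would skip.
static inline uint32_t
HashTwo(uint32_t a, uint32_t b)
{
  return AddToHash(AddToHash(0, a), b);
}
```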
> The (much) simpler hash function of
>
>  hash = 33 * (hash ^ value)
>
> seems to work ok, at least for the JS engine.  Pushed to try to see how fast it is.

It's slow.  Which is odd, considering that it evaluates to

  ((hash ^ value) << 5) + (hash ^ value)

which shouldn't be a ton more expensive than the rol(4).  Indeed, I made the hash functions MOZ_NEVER_INLINE and they showed up as a fat 0.00% in my profiler, so this code doesn't even seem to be hot.  Which is just bizarre...

> Talos has produced junk regression reports before, e.g. bug 712714 had 10% V8 regressions on talos 
> which did not reproduce at all locally.  If you can't reproduce the regressions with your own 
> builds, I would just ignore the Talos numbers you get.

The fact that we saw this regression in two benchmarks (Dromaeo jslib and V8) and on multiple platforms makes me think that we're seeing a real effect here.  Unfortunately the variance in the tests on all the machines I own is way too high for me to observe 1.5% changes.

A better-distributed hash function is not virtuous in itself -- the point is to get better performance, or at least to avoid pathological hash collisions.  If I can't demonstrate that these changes are at least as good as the old function, then there's no point in making the changes.

One possibility is to use rotate-and-xor for HashGeneric and use multiplication for HashString.  I'll see if this is an improvement.
Try run for c7226d03f13f is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=c7226d03f13f
Results (out of 24 total builds):
    exception: 3
    success: 2
    failure: 19
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-c7226d03f13f
In an unpatched build, I occasionally see the following stats (I believe for the compartment->initialShapes hashtable) after running v8:

> searches=5,903,705 steps=10,666,729 hits=5903285 misses=420 addOverRemoved=43 removes=140 
> removeFrees=71 grows=2 shrinks=2 compresses=0

The searches-to-steps ratio here is pretty abysmal; it takes on average (10,666,729 + 5,903,705) / 5,903,705 ≈ 2.8 probes to complete a lookup.  In other runs of the same build, I see a ratio closer to 1.8.  This is by far the hottest hashtable in the run.

It's not clear whether these extra probes are due to full hashcode collisions or because a hot table entry is getting stuck behind some other entry, but I suspect it's the latter because I've seen this high number of probes often, while the "improved" hash function here reduces hashcode collisions by only 1% or so.  (That is, I haven't counted the absolute number of collisions, but I suspect it's low.)

So what could be happening is: My system, for whatever reason, has higher variance on this hashtable's layout than we see on Talos.  And on Talos, the "improved" hash function happens to tickle the hashtable into a worse layout.

Anyway, it seems that this hashtable may be too small.
Rather than continue to fight against these spidermonkey changes, I'm going to try to push the Gecko changes and worry about the js changes separately.

https://tbpl.mozilla.org/?tree=Try&rev=195844a22ab7
>> The (much) simpler hash function of
>>
>>  hash = 33 * (hash ^ value)
>>
>> seems to work ok, at least for the JS engine.  Pushed to try to see how fast it is.
>
> It's slow.  Which is odd...

I may have prematurely dismissed this one on the basis of one low (but perhaps normal) score.  Follow along at https://tbpl.mozilla.org/?tree=Try&rev=284fb758b848 (the orange is from tests which rely on hashtable ordering).
Blocks: 735088
Dromaeo jslib looks good on try with Gecko changes alone.  I pushed to m-i.  Hopefully this won't have a surprise V8 regression (V8 does not run on try, bug 733490).

https://hg.mozilla.org/integration/mozilla-inbound/rev/f9019988b9e4
https://hg.mozilla.org/integration/mozilla-inbound/rev/12813323739a

Filed bug 735088 for the JS engine changes.
Summary: Stop using crappy hashing functions → Stop using crappy hashing functions in Gecko
Try run for 195844a22ab7 is complete.
Detailed breakdown of the results available here:
	https://tbpl.mozilla.org/?tree=Try&rev=195844a22ab7
Results (out of 281 total builds):
    exception: 2
    success: 238
    warnings: 40
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jlebar@mozilla.com-195844a22ab7
https://hg.mozilla.org/mozilla-central/rev/f9019988b9e4
https://hg.mozilla.org/mozilla-central/rev/12813323739a
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
{
[...]/mailnews/compose/src/nsMsgCompose.cpp:155: error: 'nsCRT' has not been declared
}

http://hg.mozilla.org/comm-central/rev/027b16dd9b5e
"Fix bustage due to missing includes caused by bug 729940 re-arranging how some files were included"
{
[...]/mailnews/import/eudora/src/nsEudoraWin32.cpp(408) : error C2653: 'nsCRT' : is not a class or namespace name
}

http://hg.mozilla.org/comm-central/rev/e7b46f37c5c1
"Fix windows bustage due to missing includes caused by bug 729940 re-arranging how some files were included"
{
[...]/mailnews/import/oexpress/nsOEMailbox.cpp(398) : error C2653: 'nsCRT' : is not a class or namespace name
}

http://hg.mozilla.org/comm-central/rev/a1dfd5790422
"Fix more windows bustage due to missing includes caused by bug 729940 re-arranging how some files were included"
Depends on: 736564
Whiteboard: [qa-]
Depends on: 780408