From Algorithmist
Jump to navigation Jump to search

Will probably move to a language-specific section at some point. Larry 08:32, 31 Mar 2005 (EST)

How about a quick tutorial on how to actually use them? I get errors when I try to use the find and insert functions on a hash_set. My code is here. Thanks! Sartak 03:20, 7 Feb 2006 (EST)

You need to change the declaration to

hash_set<std::string, hash_string> dict;

Or if you want a hash_map (but then the insert is different)...

hash_map<std::string, int, hash_string> dict;

Also note that your hash_string func is really bad, lots of strings have the same length so you will have lots of collisions, and you are passing by value rather than reference, so a copy of the string will be made before each call to hash.

I'd be glad if you posted some working example code ;).

--Rrenaud 14:26, 7 Feb 2006 (EST)

I got the hash set working fine, following your first suggestion. Unfortunately, it's still far too slow. The C++ takes 1m29.891s while the (completely unoptimized, first attempt) Perl took 0m37.047s. I reckon with a better hash function the former will perform better. View the updated code here. It needs to run in less than one second on a computer that is only about as third as fast (thus I need to cut the runtime to about 0.3% of what it is now). There's probably some clever optimization I'm missing; is it really necessary to enumerate the substrings? Sartak 01:41, 8 Feb 2006 (EST)

It's not neccesary, but it is sufficient, to enumerate substrings. But you need to spend O(1) time per substring, rather than O(n). Maybe I'll solve that problem ;). I guess the fast solutions are based on suffix trees which don't require enumerating the substrings.. --Rrenaud 07:08, 8 Feb 2006 (EST)

Suffix Arrays! =) Larry 09:25, 8 Feb 2006 (EST)