Edit: Never mind, you can disregard this whole thing.
I downloaded the data, but it comes in 17770 separate files. Eighteen THOUSAND separate files. I have no particular interest in hacking together a macro to turn those files into something useful.
Of course, if someone else already HAS, I'd be very willing to download their already-compiled list...
Call me lazy, but it would take me hours and hours just to recombine these files.
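For anyone who does want to try, here is a rough sketch of how the recombining could go in Python. It rests on an assumption about the file layout that I haven't spelled out above: each file starts with a line holding the movie ID followed by a colon, and every line after that is a comma-separated customer ID, rating, and date. The function name, the output column names, and the `*.txt` pattern are all made up for illustration.

```python
import csv
from pathlib import Path


def combine_rating_files(src_dir, out_path):
    """Merge every per-movie file in src_dir into one CSV; return rows written."""
    rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        # Hypothetical column names, chosen for this sketch only.
        writer.writerow(["movie_id", "customer_id", "rating", "date"])
        for path in sorted(Path(src_dir).glob("*.txt")):
            with open(path) as src:
                # Assumed layout: the first line is the movie ID plus a colon.
                movie_id = src.readline().strip().rstrip(":")
                for line in src:
                    line = line.strip()
                    if not line:
                        continue  # skip blank lines
                    # Assumed layout: customer_id,rating,date
                    writer.writerow([movie_id] + line.split(","))
                    rows += 1
    return rows
```

Calling something like `combine_rating_files("training_set", "ratings.csv")` (both names hypothetical) would leave one flat file that a database or spreadsheet import could swallow in a single pass. If the real files are laid out differently, the two lines marked "Assumed layout" are the ones to change.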
I'm downloading the data set for the Netflix prize. It will be the largest data set I've ever tried to run my user preference algorithm on.
I don't think I'll win the prize - ever - for a few reasons. First, it's not a "fire and forget" thing. They'll keep mumbling along for FIVE YEARS before passing out the prize, and I'm not willing to work gratis for quite that long.
Second, their data is bunged: some of it is missing.
They did that on purpose, to keep people from using the data to "make certain inferences about Netflix customers". Hrrrrmmmm... what, exactly, are we supposed to use the data for, then?
They say that the missing data doesn't affect THEIR algorithm's accuracy. But, you know, they just admitted their algorithm's accuracy is 8.5%.
Anyhow, we'll see whether I can do anything with it. The biggest problem, at the moment, is that I don't actually have a database program installed on this computer. Ha! I'll have to get one.