Monday, October 30, 2006

Big Batch of Splogs

Here is a big batch of splogs that I extracted using some new code I've been working on. The combined total is over 150,000 splogs. Some of these splogs are quite tricky. Many of them are cloaked, meaning they appear to be a dead 404 blog page, but underneath is a link farm intended to jack up Google search rankings. I'm also seeing a rise in redirected splogs, and I'm working on code to identify those in bulk.

I see that Google has been deleting quite a large number of splogs, but even so its removal is only about 20% effective on average. That means if a single spammer creates 1000 splogs, Google will eventually delete about 200 of them at most, leaving 800 untouched. Obviously this is a rather poor percentage, and hopefully my efforts will bump that figure up to 90% or higher.
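
To put the numbers above side by side, here is a tiny sketch of the arithmetic. The function name surviving_splogs is made up for illustration, and the 20% and 90% figures are just the rough estimate and the target mentioned above, not measured values.

```python
def surviving_splogs(created: int, removal_percent: int) -> int:
    """Return how many splogs are left after removal_percent of them are deleted."""
    removed = created * removal_percent // 100  # integer math avoids float rounding
    return created - removed


# The 1000-splog spammer from the example above.
print(surviving_splogs(1000, 20))  # at the current ~20% removal rate: 800
print(surviving_splogs(1000, 90))  # at the hoped-for 90% removal rate: 100
```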

20061030_1.txt - 19401 splogs
20061030_2.txt - 4332 splogs
20061030_3.txt - 8936 splogs
20061030_4.txt - 8794 splogs
20061030_5.txt - 18912 splogs
20061030_6.txt - 5158 splogs
20061030_7.txt - 70755 splogs
20061030_8.txt - 1182 splogs
20061030_9.txt - 11410 splogs
20061030_10.txt - 968 splogs
20061030_11.txt - 1584 splogs

Here is a tarball of all splog list files listed above: 20061030.tar.gz
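
For anyone who wants to double-check the combined total, the listing above can be tallied with a few lines of Python. This is only a quick sketch: the listing is pasted in as a string exactly as it appears in this post, and the helper name parse_counts is made up for illustration. It does not read the actual list files.

```python
import re

# The per-file counts, copied verbatim from the listing in this post.
LISTING = """\
20061030_1.txt - 19401 splogs
20061030_2.txt - 4332 splogs
20061030_3.txt - 8936 splogs
20061030_4.txt - 8794 splogs
20061030_5.txt - 18912 splogs
20061030_6.txt - 5158 splogs
20061030_7.txt - 70755 splogs
20061030_8.txt - 1182 splogs
20061030_9.txt - 11410 splogs
20061030_10.txt - 968 splogs
20061030_11.txt - 1584 splogs
"""

# Matches lines of the form "<filename> - <count> splogs".
LINE_RE = re.compile(r"^(\S+) - (\d+) splogs$")


def parse_counts(listing: str) -> dict[str, int]:
    """Map each file name in the listing to its splog count."""
    counts = {}
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_counts(LISTING)
total = sum(counts.values())
largest = max(counts, key=counts.get)

print(f"{len(counts)} files, {total} splogs in total")
print(f"largest file: {largest} ({counts[largest]} splogs)")
```

The total comes out to 151,432 across the 11 files, which matches the "over 150,000" figure, and nearly half of that is in 20061030_7.txt alone.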

