SPLOG = SPAM + BLOG

Monday, October 16, 2006

Visualizing Splogs

I decided to have some fun with the GD library and generate a visual representation to gauge daily splog activity. The images below may look like corrupted image files, but each one is really a graph of one day's splog activity. Every pixel represents a blog ping during that day. Blogs are sorted alphabetically from top left to bottom right. A black background pixel represents a fairly normal blog. A red pixel represents a potential splog: the brighter the red, the more likely it's a splog. Horizontal streaks represent a block of splogs generated by one spammer. White pixels are blogs that show splog characteristics far in excess of even the bright red ones. Of course, this is a visual representation of just one algorithm I'm working on.
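For anyone who wants to try something similar, here is a minimal sketch of the pixel layout described above. It is written in plain Python rather than GD, and it is not the actual code or scoring algorithm behind these images: the score scale (0 to `red_max`), the function names, and the PPM output format are all assumptions made for illustration.

```python
import math

def score_to_rgb(score, red_max=1.0):
    """Map a (hypothetical) splog score to an RGB pixel.

    score <= 0          -> black (fairly normal blog)
    0 < score <= red_max -> red, brighter as the score rises
    score > red_max     -> white (far beyond bright red)
    """
    if score <= 0:
        return (0, 0, 0)
    if score > red_max:
        return (255, 255, 255)
    return (round(255 * score / red_max), 0, 0)

def render_day(pings, width):
    """Lay out one day's pings as raw RGB bytes.

    pings is a list of (blog_url, score) pairs. Blogs are sorted
    alphabetically and placed row by row, top left to bottom right.
    Unused pixels in the last row stay black.
    """
    ordered = sorted(pings, key=lambda p: p[0].lower())
    height = max(1, math.ceil(len(ordered) / width))
    pixels = bytearray(width * height * 3)  # zero-filled, i.e. black
    for i, (_, score) in enumerate(ordered):
        pixels[i * 3:i * 3 + 3] = bytes(score_to_rgb(score))
    return width, height, bytes(pixels)

def write_ppm(path, width, height, pixels):
    """Write the pixel data as a binary PPM (P6) image file."""
    with open(path, "wb") as f:
        f.write(b"P6\n%d %d\n255\n" % (width, height))
        f.write(pixels)
```

Calling `write_ppm("2006-10-15.ppm", *render_day(pings, 400))` would produce one image per day. Because neighbouring pixels are neighbouring names in the sorted list, a batch of similarly named blogs from one source ends up as a horizontal streak.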


Sunday, Oct. 15, 2006
[image: 2006-10-15]

Saturday, Oct. 14, 2006
[image: 2006-10-14]

Friday, Oct. 13, 2006
[image: 2006-10-13]

Thursday, Oct. 12, 2006
[image: 2006-10-12]

Wednesday, Oct. 11, 2006
[image: 2006-10-11]


I can see that Friday was a relatively light day compared to Saturday. On Friday, I see exceptionally long red streaks, which means one spammer just went all out that day. I also see that Thursday's splogs were much more scattered than on other days. The sheer amount of data is far too much to make sense of by looking at pages of numbers, so this sort of visualization is really the only way to do it effectively.
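The "long red streak" observation can also be checked numerically. The sketch below assumes the day's scores are already in the same alphabetical order as the pixels, and that anything above a cut-off counts as red; the `threshold` value and the function name are hypothetical, chosen only for illustration.

```python
from itertools import groupby

def longest_streaks(scores, threshold=0.5, top=3):
    """Find the longest runs of consecutive scores above threshold.

    Returns up to `top` (length, start_index) pairs, longest first.
    Ties are broken by the earlier start index.
    """
    runs = []
    index = 0
    for flagged, group in groupby(scores, key=lambda s: s > threshold):
        length = sum(1 for _ in group)
        if flagged:
            runs.append((length, index))
        index += length
    return sorted(runs, key=lambda r: (-r[0], r[1]))[:top]
```

Comparing the longest run from each day would put a number on how much longer Friday's streaks were than Thursday's scattered ones.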

1 comment:

JoeC said...

Not that useful probably, but graphical representations are always interesting.

Any idea why there appears to be an alternating pattern of significant bright reds one day and then less red and more whites the next? It's likely just chance since the sample is so small. I would be interested in seeing how these look over some more days at least.

Any interesting patterns by time of day?