SPLOG = SPAM + BLOG

Wednesday, December 28, 2005

Returning After a Long Break

I've taken some time off from fighting splogs but now I'm slowly resuming operations. I've been catching up on accumulated splog analysis and I'm about 75% complete after two straight days of script runs. My raw data is now taking up approximately 50 GB and growing at a rate of 300 MB per day. I'm in the middle of a hardware upgrade to handle this growth. Unfortunately one of my file servers took a dive in the shallow end of the pool last week and I'll have to somehow manage without it.

Not surprisingly, the splog situation has not improved. Now I'm seeing some annoying redirect splogs that take you right to a gambling site. Adding "127.0.0.1 www.webs-search.com" to /etc/hosts or to the c:\windows\system32\drivers\etc\hosts file will stop this annoyance for now. I'll be going after these guys very soon.
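
As a rough illustration of the hosts-file workaround, here is a small Python sketch that adds the blocking entry to the text of a hosts file only if the domain is not already mapped. The function name block_domain is hypothetical, and the sketch works on a string rather than editing the real file, which normally needs administrator rights.

```python
def block_domain(hosts_text: str, domain: str, sink_ip: str = "127.0.0.1") -> str:
    """Return hosts_text with an entry mapping domain to sink_ip, added at most once."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into "address name [alias...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and domain in fields[1:]:
            return hosts_text  # already mapped, leave the text untouched
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{sink_ip} {domain}\n"
```

Calling block_domain(text, "www.webs-search.com") on the current contents of the hosts file yields the text to write back.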

Yes. I'm back.

Wednesday, November 23, 2005

Current Status

It's been a while since I posted anything and that's because I haven't had much to report. Google has effectively stopped deleting the splogs I report. That would seem to mean going back to targeting the AdSense accounts of spammers, although that's not entirely correct: I haven't been reporting any of the results, but I've been targeting AdSense accounts all along. I've had a handful of successes during the past two months.

The splog landscape has changed quite a bit since this blog began. I would say that there are now quite a number of splogs that don't depend on AdSense at all. For example, all the porn and gambling related splogs are not tied to AdSense. Another new trend I've observed is splogs being promoted with email spam. I've gotten a couple of email spams that point to a blogspot.com subdomain.

The number of splogs dropped quite a bit immediately after Blogger's implementation of captcha on spammy blogs. Blogger's captcha does work to thwart certain blog spammers but not all. The reason is that spammers are no longer interested in creating splogs and maintaining them. They simply create splogs and dump them. Essentially they are creating disposable, one-time-use splogs. It really doesn't matter if Blogger identifies them as splogs, because the spammers have already moved on to creating more. Anyway, after that sharp drop the numbers are starting to creep back up since Google isn't deleting splogs. This recent lack of action by Google encourages spammers, and now it appears they are making a comeback.

Wednesday, November 09, 2005

1647 Splogs Deleted

After four days of zero splogs deleted by Google, they've resumed deleting splogs. I actually reported about twice that number of splogs but apparently this is the best Google can do today. We'll see what happens tomorrow.

1519 Splogs Deleted

This is yet another belated report, this time of splogs deleted last Wednesday. This brings the week's total splog deletion count to 3294, which is quite dismal compared to prior weeks. It appears Google has slowed down their rate of splog deletion as of last Wednesday. No splogs were deleted on Thursday, Friday, Monday or Tuesday.

Saturday, November 05, 2005

657 Splogs Deleted

This is a belated report of splogs deleted this Tuesday. The number is low because I've been trying out a different tactic.

Monday, October 31, 2005

1118 Splogs Deleted

Today's deleted splog count is 1118. Splog deletion numbers will most likely jump up and down all week, as expected. I don't have any hard numbers, but just looking at the trend, new splog creation has slowed down somewhat since last week. I believe this has a lot to do with the new captcha barrier Google rolled out last week.

Saturday, October 29, 2005

1521 Splogs Deleted

To end the week, Google deleted 1521 splogs. The total for this week is 13738, an average of 2747.6 per day. I believe the tactics I am applying are actually working. I've essentially forced spammers to jump through a lot more hoops, increasing their basic "cost" of spamming several times over. Although I can't go into detail, I plan to once again double or triple their "cost". In the coming weeks my daily numbers may actually drop considerably but that's part of the plan.

Friday, October 28, 2005

3044 Splogs Deleted

3044 splogs have been deleted for Thursday. I've already met my 2000 splog deletions per day goal for this week with one more day to go.

Wednesday, October 26, 2005

5860 Splogs Deleted

So much for easing off on the number of splogs being reported to Google. I believe today's count is a personal record for the number of splogs reported and deleted in a single day since I began this blog.

Two Wrongs Don't Make a Right

Today I heard an idea from Mitch Ratcliffe for combating the splog issue. It is by far the most idiotic idea I've heard to date. He proposes massive AdSense click fraud in an effort to punish Google and AdWords advertisers. He thinks it's a good idea to essentially punish advertisers financially so that Google is forced to fix blogspot. Of course he doesn't mention how exactly Google would fix the splog problem. He even admits that this will put money into spammers' pockets. I just don't understand how rewarding spammers, and thus encouraging them to create even more splogs, will actually help the situation. This kind of rally for vigilante action is really unhelpful and extremely counterproductive. In general I believe this sort of end-justifies-the-means way of thinking is flawed in more ways than one. I can only hope that people don't stoop to the level of spammers in an effort to beat them.

Update: As usual, Joe has written a very good blog post regarding Mitch's idea. There is also some discussion brewing at FightSplog.com.

More Update: Mitch replies to the many criticisms he drew over his idea. He continues to assert that what he is proposing isn't click fraud because his motive is to take "political action". I suppose by the same line of reasoning it's perfectly ethical to launch a denial of service attack on a spammer, or in this case on advertisers and Google, as long as the motive behind the act is to make a political statement. I just can't understand how someone can justify causing harm to innocent third parties just to make a political statement based on some twisted notion of vigilante justice.

Tuesday, October 25, 2005

3313 Splogs Deleted

Today was an odd and confusing day. I thought Google had completely ignored yesterday's batch of splog reports, but apparently they deleted some of them today, as well as some from today's reports, though certainly not all of them. It looks like they are a bit backed up from the weekend's support emails and are barely keeping up with the amount of splog reports I'm submitting. I guess I'll ease off just a bit so they can catch up.

Friday, October 21, 2005

4364 Splogs Deleted

Google deleted 4364 splogs I've reported. For some reason Google just ignored a batch of 500 splogs I submitted, so I'll have to resubmit them on Monday. The total number of splogs deleted this week is 11976. That is an average of 2395.2 per day, so I've exceeded my goal of 2000 per day by about 20%.
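
The weekly figures can be checked with a few lines of Python. The five-day reporting week is an assumption inferred from the quoted average (11976 / 5 = 2395.2), and the function name weekly_summary is made up for this sketch; the goal of 2000 deletions per day comes from the other posts.

```python
def weekly_summary(total_deleted: int, days: int = 5, daily_goal: int = 2000):
    """Return (average deletions per day, percent above or below the daily goal)."""
    average = total_deleted / days
    percent_vs_goal = (average / daily_goal - 1) * 100
    return average, percent_vs_goal


# Figures from the post; `days` and `daily_goal` are the assumptions noted above.
average, percent_vs_goal = weekly_summary(11976)
```

With these inputs the average is 2395.2 per day, which is roughly 19.76% over the goal, in line with the 20% quoted.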

Thursday, October 20, 2005

3293 Splogs Deleted

I would say that today was a good day. There are 3293 fewer splogs in the world. Google actually deleted 100% of the splogs I reported. Although I can't go into the specifics of what I'm doing, what I can say is that I'm targeting a specific kind of spammer in an effort to change their routine and thus make it harder for them to be prolific. It will take some time, but eventually I think they will be quite frustrated with how they have to go about generating splogs.

Wednesday, October 19, 2005

1587 Splogs Deleted

I was quite disappointed to see that I didn't get any confirmation email from Google regarding the daily batch of splogs I reported, but it turns out they were still deleting them. They simply didn't send me any email. It appears they deleted all the reported splogs except nine. I guess they just missed them. Anyway, I'm off to send out more splog reports to Google. I realize I'm quite behind my goal of 2000 splog deletions per day so I'm going to try harder and see if I can catch up to that goal tomorrow.

Splog Expansion

Until now, every splog I came across was in English. Today was the first time I came across non-English splogs. I saw a Brazilian porn splog for the first time as well as some Spanish language splogs. It was somewhat confusing at first because it had become almost automatic for me to skip over all non-English blogs during my visual examination. It's not a big surprise, but it appears splogs have finally gone international.

On a related note, Russia has eclipsed the UK and is now a distant second as a source of splogs, while the US still holds the lead in the number of splogs produced by a wide margin. As an example, a splog such as this one has image links pointing to a .ru domain with a unique identifier, just like your typical email spam. As expected, the traditional email spammers have caught on to blog spam as just an alternate medium to spread their junk. I expect to see a lot more nasty stuff in the near future as web pages provide vastly more possibilities for abuse. I'll write more extensively about what may be ahead of us in another post.

Yet another trend I'm starting to see is spammers spreading their splogs to other blog providers. For example, these two splogs and many more were created by one spammer:

http://translationservices4u.blogspot.com/
http://sm-select1-1.blogspot.com/

The reason I know this is that the translationservices4u.blogspot.com splog has links like these to splogs hosted on MSN Spaces:

http://spaces.msn.com/members/sm-select-6/
http://spaces.msn.com/members/sm-select-44/
http://spaces.msn.com/members/sm-select-91/
http://spaces.msn.com/members/sm-select-92/

He started out on blogspot but now he has created splogs on MSN Spaces and he is cross-linking them.

There are other examples of this across other blog service providers. This particular splog exists on blogspot:

http://healthychineserecipe.blogspot.com/

This splog has links to typepad:

http://dreamretirement.typepad.com/save_money_on_gas/
http://dreamretirement.typepad.com/healthy_eating/

Perhaps I wasn't paying close attention, but this is pretty new to me. It appears spammers are starting to diversify their spamming activities across multiple blog providers. The optimistic part of me is thinking perhaps this is a sign that they are starting to feel the squeeze from Blogger and have decided to move elsewhere. But then I'm guessing this is somehow a better means to trick Google into ranking these pages higher, since links are going to and coming from different domains.

Clearly the splog problem is evolving beyond the scope of one organization's control. Some level of cooperation and communication between everyone in the blogosphere is required to combat this problem effectively. I'm still quite cynical about this. I just have a hard time imagining Google and Microsoft openly sharing information to address this problem to their mutual benefit.

Update: Here is a spammer who created five splogs. Do you know what language it is in? I didn't either, but after a little bit of digging I figured it out. I'll post the answer tomorrow.

Update: Those five splogs originated from Prague, Czech Republic.

Tuesday, October 18, 2005

Google's New Captcha Barrier

To my surprise Google has actually done something about the current splog issue. They've implemented a new captcha barrier that suspected splogs must pass when posting. Although this is a step in the right direction, it may be too little too late. If they are determining which blogs are splogs with the same algorithm that weeds splogs out of the Next Blog button ring, then this new captcha barrier does nothing for splogs that have already beaten that initial barrier. There are still quite a number of splogs, including rather obscene porn splogs, showing up on the Next Blog ring as I'm writing this. Having said all this, I'm going to reserve my judgement and see how much impact this will have on the growth of splogs. It's only been hours since this new barrier was instituted, so once again I'll give Google the benefit of the doubt.

1736 Splogs Deleted

Today's effort resulted in 1736 splogs being deleted. Google seems to have selectively ignored about one fifth of my splog reports for some unknown reason. I'll try to resubmit the ignored splogs along with 5000+ ignored splogs from last Friday.

Google Responds to Splog Issue

Google has publicly responded to the current splog issue. Actually they haven't done anything except make a statement on the problem at hand. The only thing they've done differently from two days ago is publish a list of deleted subdomains. Of course that doesn't do anything to prevent or deter splogs. As I read what Google had to say, I couldn't help but feel unsatisfied with it. Obviously the splog problem has reached some level of critical mass and it is something they can no longer ignore or sweep under the rug as they've been doing all along. I get the impression that this statement was more of a public relations move to show that they understand the scope of the problem. However, the words they've used are very wishy-washy and they've failed to convince me that they are taking serious measures to address the problem head on. I quote, "We can also make it more difficult for suspected spammers to create content." This is far different from stating that they are planning to do just that. I think some real action on the part of Google is long overdue. It's pretty clear the Flag as Objectionable button doesn't work; it has failed miserably. Depending on the time of day, there are just as many splogs in the Next Blog ring as before flagging was implemented. Well, it does work great as a placebo but not much else. I think Google's response is yet another placebo to calm people down, and it lacks the real substance I would need to be satisfied.

In the meantime I will back up my harsh criticisms of late with whatever I can do to help the situation. I will be submitting thousands of splogs in the coming days for Google to delete.

Monday, October 17, 2005

996 Splogs Deleted

As I noted in my last entry, about 5000 splogs I reported to Google have gone unanswered, and I now assume they were simply ignored and discarded. Despite this lack of response I have not given up. I've reported more splogs to Google and it appears they've responded. They deleted 996 splogs today. It isn't much but it's still better than nothing.

Splogs Reach Critical Mass?

With the exception of Mark Cuban and maybe Doc Searls, just about every high-profile blogger has been largely ignoring the splog problem, or at least hasn't commented on it. Today is the first day many of them spoke out against it and urged Google to do something about it. IceRocket has even stopped indexing blogspot blogs for the time being. Why did it seem like everyone took notice of the splog problem all at once? It was because of one spammer. It appears this one blog spammer went a bit overboard on Sunday and created tens of thousands of splogs all by himself. There was a definite spike in the number of splogs on Sunday, starting in the early morning. He created about three times the daily average of all splogs created per day. He essentially buried all blogs and all other splogs under his massive number of splogs. His splogs were also designed to avoid the Next Blog button ring filter, so for about half the day his splogs were the only thing showing up on the Next Blog button.

I wonder if Google will finally do something about the splog problem now. Honestly I really doubt it, and here is the reason why. Currently Google makes money whenever spammers make money. They get a cut whenever money flows from advertisers to publishers. No splogs, no money. It's that simple. Also, having all these splogs clogging up blogspot and, more importantly, search engines actually helps Google. For small search engines like IceRocket or Technorati, splogs are a big problem since it takes a lot of resources to filter them out. Google has the money and the resources, so for them it's not such a big deal. Having these splogs hurts the little search engines a lot more than Google.

I've had this very cynical view of Google with regard to splogs for some time now, but so far I gave them the benefit of the doubt. When Google created Blog Search, though, it started to make sense. I do believe that this splog problem will grow to the point where even Google has no choice but to simply remove blogspot and perhaps other blogs from the main Google search index. I see Google's Blog Search as an eventual dumping ground for all blogs, which essentially ghettoizes the blogosphere. I sure hope it doesn't come to that, but when my report of over 5000 splogs is simply ignored by Blogger support, I can't help but think this is inevitable. It's only a matter of time.

If anyone wants the list of splogs created yesterday by that one spammer, please feel free to email me at fightsplog@gmail.com.

Thursday, October 13, 2005

2529 Splogs Deleted

Today's effort in fighting splog resulted in deletion of 2529 splogs. I believe I can sustain a rate of 2000 or more splogs being deleted daily for the next two weeks. Afterwards I will be making some adjustments to how I target splogs and spammers who create them.

Wednesday, October 12, 2005

697 Splogs Deleted

Today I've been able to shut down 697 splogs by reporting them to Google. I've quadrupled my efforts tonight, and if Google removes every one of the reported splogs there should be several thousand fewer splogs in the world tomorrow.

I've been tweaking my set of splog processing scripts and they are finding massive numbers of splogs with much greater efficiency. Perhaps they're working a little too well, because they found more than 18000 gambling related splogs. About three weeks ago I reported a mixed batch of about 2000 splogs but most of them weren't deleted. I think Google support staff freaked out over the size of the list and decided to simply ignore it. I found out today that the trick to getting Google to delete these splogs is to report them in small, convenient chunks. Trying to figure out how to separate 18000 splogs into some logical chunks is actually harder than finding the splogs in the first place.
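
The simplest way to break a large report into convenient chunks is fixed-size batching, sketched below in Python. This is only an illustration: the function name chunked, the placeholder host names and the batch size of 500 are all assumptions, not the method actually used for these reports, and fixed-size batching does nothing to make the chunks "logical" (for example one chunk per spammer), which is the harder part.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical host names standing in for a real list of 18000 splogs.
splogs = [f"example-splog-{n}.blogspot.com" for n in range(18000)]
batches = list(chunked(splogs, 500))  # 500 per report is an assumed size
```

Each element of batches could then be sent as its own report.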

Friday, October 07, 2005

Is Google Listening?

Apparently googling "blogburner" no longer shows Rick Butts' spamming software in the blue sponsored links at the top of the search results. I started to wonder if Google had actually stopped doing business with this spammer, but after some digging it turns out that is not the case. Googling for "articleburner" reveals his other spamming tool still being advertised. Perhaps Rick Butts voluntarily pulled his advertisement from Google. Then again, maybe Google is the one who removed the sponsored link, but then why did they leave his other spam tools? Did Rick Butts have multiple AdWords accounts and Google just happened to miss one? Perhaps Google has a problem with blogburner but not articleburner and therefore decided to keep Rick Butts as an advertiser despite the fact that he is still a spammer? I don't think I'll ever know for sure what happened. I have yet to get a clear indication of where Google currently stands with regard to spammers on Blogspot, AdWords, and AdSense.

Speaking of Google doing business with spammers, if you google for "rss2blog" you'll see more advertisements for spamming software. rss2blog is software that allows anyone to dump RSS feeds into blogspot blogs en masse. It tends to create splogs with a predictable pattern of a link and a block of text below, and this tends to bypass the recently implemented splog filter on the Next Blog button ring.

Google was very quick to remove all 2763 porn related splogs I submitted days earlier. Having said that, I'm not entirely certain I had anything to do with it. I never received the semi human/machine generated email telling me that they had received my email. That confirmation email tends to be somewhat funny as well as a bit annoying, since it recommends that I use the "Flagging as Objectionable" feature. I can't imagine myself or anyone else visiting thousands of splogs and clicking on the flag button.

Update: I did eventually get a confirmation email from Blogger support.

Thursday, October 06, 2005

Stupid Spammer

Apparently the spammer who created those 2763 porn related splogs isn't very smart. He left the web front end to his spamming tool wide open to the public. It also appears he is Russian, which I didn't know. His web front end is accessible at www.oocasino.info.

Update: The spammer finally figured out that people were going into his web front end and messing with it, so he put up authentication. It doesn't really matter because Google deleted all his splogs.

Wednesday, October 05, 2005

Current Status

My current list of splogs has jumped to 7520 entries. I am still accumulating blogspot blogs at a steady pace. I have more than 232,000 blogs archived, which is about 28 GB of raw data that will need to be parsed and analyzed. By my rough estimate between half and two thirds of those blogs are splogs. I recently began noticing some trends in the growth of splogs. About a month ago blog spammers were mostly small timers trying to generate traffic and make money from AdSense ad clicks. Splogging has slowly attracted more serious professional spammers, with refined skills for thwarting filters and with JavaScript driven, dynamically generated links. Now I'm seeing splogs with links to some site in Russia with a unique identifier in the links, just like email spam. I guess what they say is true. When you have one broken window you have to fix it as quickly as possible. If you don't, you end up with a whole lot of broken windows. They tend to spread like wildfire. Once the squalor is tolerated, it becomes the norm. Right now blog space is quickly turning into a slum of the internet. Google needs to take drastic action soon, hopefully before it's too late.

2763 Splogs

I just reported a set of 2763 porn related splogs created by one spammer. Here are the first ten splogs as a sample:

00-free-adult-1i.blogspot.com
100-ameture-web-2k.blogspot.com
100-free-adult-1e.blogspot.com
100-free-adult-3p.blogspot.com
100-free-adult-4c.blogspot.com
100-free-adult-7a.blogspot.com
100-free-adult-8s.blogspot.com
100-free-adult-8t.blogspot.com
100-free-live-5z.blogspot.com
100-free-live-7p.blogspot.com

Thursday, September 29, 2005

My Recommendations for Google

Most of these apply specifically to Google but a few apply to other blog service providers as well.

1. Advertisers and online marketing companies should stop doing business with spammers. The motivations behind blog spammers are no different than those of any other spammers. It's all about money. If you reduce the money for spammers you reduce the spam. Google needs to be much more proactive on this front. I don't believe that Google is doing enough to cut off the flow of AdSense revenue to spammers. From my experience Google hasn't shut down many AdSense accounts of spammers. There is another side of this as well. Google is currently letting spammers advertise their blog spamming software via AdWords. Just google for "blogburner" and you'll see that Rick Butts' blog spamming software is being advertised through Google's AdWords. I think it's about time Google makes its position clear on where it stands when it comes to blog spam.

2. Blogger could put limits on various activities. If the limit is high enough it should not affect the blogging activity of normal or even highly active bloggers but it should prevent spammers from going about their daily spamming.

  • Limit on number of accounts a person can create in one day
  • Limit on number of blogs a person can create in one day
  • Limit on number of blogs a person can create per account
  • Limit on number of blog posts in one day
  • Limit on number of comments in one day

3. Put a timed moratorium on newly created blogs. Make it so that newly created blogs will not be indexed by search engines or show up on the Next Blog ring for the first 30 days or whatever the appropriate time may be. This will be enough time to identify and delete spam blogs before they see the light of day and pollute the search indexes. Currently a spammer can create dozens if not hundreds of blogs and they will just show up on the Next Blog ring and get indexed immediately. This sort of immediate gratification is what needs to be addressed. Spammers are by nature not patient. They are out to make a quick and easy buck, and slowing them down should frustrate them enough to make most of them go away.

4. Template tampering prevention is needed. Spammers are working around Blogger's metatags by stripping the <$BlogMetaData$> tag from the template and replacing it with their own metatags, tricking search engines into crawling and indexing their spam blogs. This sort of tampering is well documented here. It should be very simple to put a check in the system to prevent such tampering.

5. A "Mark this comment as spam" feature is needed. This is analogous to the flagging feature but applies to individual comments. The information gathered from this kind of feature can make current antispam measures more effective and less intrusive. The current implementation of the word verification feature to block comment spam is a serious problem for visually impaired users, and I think most bloggers aren't aware of its implications. I believe there is a smarter method that can solve these two problems at once. Instead of word verification being all or nothing, it can be applied only when it's necessary. There is already one pattern that exists in just about every comment spam: every comment spam I've seen has a link for the user to click on. A server can keep track of comments with links, as well as comment flagging numbers, and apply word verification to suspicious comments. For example, when a lot of comments are being flagged as spam and those comments link to some domain like buyfakewatches.info, then word verification would kick in. There are other potentially effective means of curbing comment spam that can be implemented once data collection is in place. If a Blogger user is flagged as making numerous spam comments, they can be throttled so that they cannot comment for a set amount of time. Perhaps, combining this idea with #2, word verification could be mandatory for newly created accounts. I'm sure there are plenty of other technical means to curb comment spam and they should be discussed further.
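
To make recommendation 5 a little more concrete, here is a minimal Python sketch of word verification that only kicks in for comments linking to a domain that has already collected enough spam flags. Everything in it is hypothetical: the class name CommentScreen, the threshold of 3 flags and the in-memory counter are assumptions for illustration, not a description of how Blogger works.

```python
import re
from collections import Counter
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def linked_domains(comment_text: str) -> set:
    """Return the lowercased host names of all http(s) links in a comment."""
    hosts = set()
    for url in URL_RE.findall(comment_text):
        try:
            host = urlparse(url).hostname
        except ValueError:  # malformed URL, ignore it
            continue
        if host:
            hosts.add(host)
    return hosts


class CommentScreen:
    """Ask for word verification only when a comment links to a flagged domain."""

    def __init__(self, flag_threshold: int = 3):  # threshold is an assumed value
        self.flag_threshold = flag_threshold
        self.flags_by_domain = Counter()

    def record_spam_flag(self, comment_text: str) -> None:
        # Called when a reader marks a comment as spam: count each linked domain once.
        self.flags_by_domain.update(linked_domains(comment_text))

    def needs_word_verification(self, comment_text: str) -> bool:
        return any(self.flags_by_domain[domain] >= self.flag_threshold
                   for domain in linked_domains(comment_text))
```

Comments without links never trigger the check in this sketch, which is what keeps word verification out of the way of ordinary commenters.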

Ultimately spammers depend on their ability to create massive numbers of accounts, blogs and comments. There are technical means to hinder this unrestricted creation of junk. Unlike with email, Google does have full control over its own infrastructure, and therefore I believe blog spammers' days are numbered.
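
As a rough illustration of recommendations 2 to 4, the Python sketch below shows a per-user daily limit, a moratorium check for new blogs, and a check that a template still carries the <$BlogMetaData$> tag. All names (DailyLimiter, is_publicly_listed, template_is_intact), the rolling 24-hour window and the idea that the tag must sit inside the template's head section are assumptions made for the example; nothing here reflects Blogger's actual implementation.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

DAY_SECONDS = 24 * 60 * 60
MORATORIUM = timedelta(days=30)    # "30 days or whatever" from recommendation 3
REQUIRED_TAG = "<$BlogMetaData$>"  # the tag named in recommendation 4
HEAD_RE = re.compile(r"<head\b[^>]*>(.*?)</head>", re.IGNORECASE | re.DOTALL)


class DailyLimiter:
    """Allow at most `limit` actions per key within any rolling 24-hour window."""

    def __init__(self, limit: int, window: float = DAY_SECONDS):
        self.limit = limit
        self.window = window
        self._events = defaultdict(deque)  # key -> timestamps of allowed actions

    def allow(self, key: str, now: float) -> bool:
        events = self._events[key]
        # Forget actions that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


def is_publicly_listed(created_at: datetime, now: datetime,
                       flagged_as_spam: bool = False) -> bool:
    """True once a blog may be indexed or shown in the Next Blog ring."""
    if flagged_as_spam:
        return False
    return now - created_at >= MORATORIUM


def template_is_intact(template: str) -> bool:
    """True if the required tag is still present inside the <head> section."""
    match = HEAD_RE.search(template)
    return bool(match) and REQUIRED_TAG in match.group(1)
```

One limiter instance per activity (accounts, blogs, posts, comments) would cover the bullet list in #2, with limits set high enough that normal bloggers never hit them.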

Wednesday, September 28, 2005

Current Status

Returning from a short vacation of sorts, I see that my trusty script has spidered over 120,000 blogspot blog pages while I was gone. The total number of blog pages I need to sift through now stands at over 180,000, totaling about 15 GB of raw data. My useful little Perl script that extracts links from HTML is no longer all that useful. It used to take about a minute to process 2000 pages, but now that I'm dealing with a much larger amount of data I'll need to figure out a more efficient means to process it all at greater speed.
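
For readers who want to try this kind of link extraction themselves, here is a minimal Python stand-in using only the standard library. It is not the Perl script mentioned above and makes no claim about its speed; the names LinkExtractor, extract_links and count_link_domains are made up for this sketch.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html: str) -> list:
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def count_link_domains(pages) -> Counter:
    """Count how many absolute links point at each host across all pages."""
    counts = Counter()
    for html in pages:
        for link in extract_links(html):
            try:
                host = urlparse(link).hostname
            except ValueError:  # malformed URL, skip it
                continue
            if host:  # relative links have no host and are ignored
                counts[host] += 1
    return counts
```

Feeding saved pages through count_link_domains and looking at most_common() shows which domains a set of blogs points to most often.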

Tuesday, September 20, 2005

Scope of the Splog Problem

I've been monitoring and processing about 27000 blogs daily looking for splogs, and it appears at least 15000 new splogs are being created daily, give or take a thousand. Of course this is only from blogspot; I haven't even begun to identify splogs on other free blog services. The splog problem is bigger than I imagined and it's growing at a rapid pace. It needs immediate attention from industry leaders like Google and Yahoo, who must address it head on. Hopefully something will come out of the second web spam summit hosted by Technorati.

Friday, September 16, 2005

Lost and Found

Recently I've seen the number of visible splogs recede like a tide. Even though they are trying to make a comeback on the Next Blog ring, they are nowhere near as visible as they once were. I began wondering where they were. Did spammers just give up? I didn't think so. After poking around I found a way to efficiently retrieve a list of potential splogs, and it's huge. It's much larger than I ever expected. Currently I have a list of 22000 blogs, and by my rough estimate about 15000 of them are splogs. I've been able to definitively confirm only about a thousand of them today. Obviously I will be exploring various ways to identify splogs. The growth rate of the splog count is not yet known since today was the first day I've taken advantage of this new information source. As a result, my list has suddenly jumped to 3021 splogs.

Wednesday, September 14, 2005

2085 Splogs

I've submitted my list of 2085 splogs to Blogger for their review and hopefully prompt removal. We'll see if this will bring up the number of splogs being deleted daily.

Sunday, September 11, 2005

Post Flag Day Status

Splog flag day came and went without much hoopla. I don't know how many people participated, and its effectiveness is not yet apparent. Blogger doesn't automatically remove splogs from the next button ring, so obviously we're not going to see the result for a few days. What I have noticed, however, is that there seem to be more splogs in the next button ring. It looks as though spammers have stepped up their spamming activities and figured out Blogger's filtering heuristics. I'm actually a bit surprised by this because I thought their primary motivation for creating splogs was to be indexed by search engines, and they can achieve that goal regardless of being in the next button ring. I may have had this wrong. Maybe they do need to be visible because Google is doing a better job of ignoring the spammers' obvious attempts at PageRank manipulation. Anyway, as a result of splogs being more visible on the next button ring, my splog list has grown by about 500 during the past three days. It now stands at 1867 as of this writing. Sadly, Blogger deleted only nine splogs on Friday.

Something I had not really thought about came to my attention today. About three weeks ago Blogger implemented the word verification feature, also known as captcha, to prevent comment spamming on blogs. To me this seemed like a pretty good idea for preventing spam since most comment spam is posted by automated programs. Now I see how this has the unintended consequence of preventing blind people from using blogs. Here is a blog by a blind person voicing his opinion about the word verification requirement during blog creation back in April. Obviously things have gotten even worse for this guy since Blogger added the comment word verification feature. I see this as a very bad thing. Blind bloggers are now cut off because of spammers. Because of this, I have decided to turn off word verification on comments and I urge others to do the same. I imagine I will get spammed now but that's ok. I will turn this to my advantage. This blog will serve as a honeypot of sorts to collect more information about spammers.

Friday, September 09, 2005

Flag Day Tomorrow

I've just posted a list of 350 splogs on the flag day wiki for all to see and flag if they wish. Having done that, I have some doubt as to whether flagging actually works. Something I've observed lately is that Google doesn't seem to be deleting splogs based on high flag counts. I think others have noticed as well. I think Blogger is using flag information to fine tune their splog identification heuristics to filter splogs out of their next blog button ring, but that's about it. For some unknown reason they are not deleting the blogs that they've identified as splogs. At the same time I've noticed that the splogs I've listed on this blog are all gone. I don't know if Google is looking at this blog or if others have taken the list and submitted it to Google for removal. I guess I'll wait and see what Google does with all the new flags tomorrow. Will they remove all the splogs as they should?

Tuesday, September 06, 2005

Case #14 - e5y3461.blogspot.com and more

I've been working on some Perl and shell scripts to identify and extract data from splogs. This is the first fruit of my labor. Here is a list of 85 splogs which all point to the domain talk-stuff.com. The total number of links pointing to this domain is a whopping 60098. That's right, it's over 60 thousand links to one domain! Obviously this guy is really trying to pump up his PageRank on Google.

e5y3461.blogspot.com
e5y34610.blogspot.com
e5y34611.blogspot.com
e5y34612.blogspot.com
e5y34613.blogspot.com
e5y34614.blogspot.com
e5y34615.blogspot.com
e5y34616.blogspot.com
e5y34617.blogspot.com
e5y34618.blogspot.com
e5y34619.blogspot.com
e5y3462.blogspot.com
e5y34620.blogspot.com
e5y3463.blogspot.com
e5y3464.blogspot.com
e5y3465.blogspot.com
e5y3466.blogspot.com
e5y3467.blogspot.com
e5y3468.blogspot.com
e5y3469.blogspot.com
h435yt1.blogspot.com
h435yt10.blogspot.com
h435yt11.blogspot.com
h435yt12.blogspot.com
h435yt13.blogspot.com
h435yt14.blogspot.com
h435yt15.blogspot.com
h435yt16.blogspot.com
h435yt17.blogspot.com
h435yt18.blogspot.com
h435yt19.blogspot.com
h435yt2.blogspot.com
h435yt20.blogspot.com
h435yt3.blogspot.com
h435yt4.blogspot.com
h435yt5.blogspot.com
h435yt6.blogspot.com
h435yt7.blogspot.com
h435yt8.blogspot.com
h435yt9.blogspot.com
hudfghdf1.blogspot.com
hudfghdf10.blogspot.com
hudfghdf2.blogspot.com
hudfghdf3.blogspot.com
hudfghdf4.blogspot.com
hudfghdf5.blogspot.com
hudfghdf6.blogspot.com
hudfghdf7.blogspot.com
hudfghdf8.blogspot.com
hudfghdf9.blogspot.com
jterer10.blogspot.com
jterer3.blogspot.com
jterer7.blogspot.com
jterer9.blogspot.com
tetey1.blogspot.com
tetey2.blogspot.com
tetey3.blogspot.com
tetey4.blogspot.com
tetey5.blogspot.com
tetey6.blogspot.com
tetey7.blogspot.com
treye41.blogspot.com
treye411.blogspot.com
treye412.blogspot.com
treye413.blogspot.com
treye414.blogspot.com
treye415.blogspot.com
treye416.blogspot.com
treye417.blogspot.com
treye418.blogspot.com
treye419.blogspot.com
treye42.blogspot.com
treye43.blogspot.com
treye44.blogspot.com
treye45.blogspot.com
treye46.blogspot.com
treye47.blogspot.com
treye48.blogspot.com
treye49.blogspot.com
trwerw1.blogspot.com
uywetwe1.blogspot.com
uywetwe3.blogspot.com
yertertyr1.blogspot.com
yterwerte2.blogspot.com
yterwerte5.blogspot.com
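
A link tally like the 60098 figure above could be produced along these lines. This is only a minimal sketch in Python, not the Perl and shell scripts mentioned in this post, which aren't shown here. The names `LinkCollector`, `link_host` and `count_links_by_host` and the sample page are made up for illustration, and a host is matched exactly apart from a leading "www.".

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_host(url):
    """Return the lowercased host of a URL with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def count_links_by_host(pages):
    """Take an iterable of HTML strings and return a Counter of host -> link count."""
    counts = Counter()
    for html in pages:
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            host = link_host(href)
            if host:  # relative links have no host and are skipped
                counts[host] += 1
    return counts


# Hypothetical sample page, not real splog content.
sample = (
    '<a href="http://www.talk-stuff.com/a.html">x</a>'
    '<a href="http://talk-stuff.com/b">y</a>'
    '<a href="/local">z</a>'
)
print(count_links_by_host([sample]).most_common(1))  # prints [('talk-stuff.com', 2)]
```

Feeding it the saved HTML of every splog in a set would give a per-domain total, with the most linked domain at the top of `most_common()`.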

Case #13 - corincent.blogspot.com

Initially splogs were nothing more than a page full of links. However, I'm noticing that more and more of them are trying to look like they have content. Some do a better job of that than others, but this one doesn't. Anyway, I'm still going after them one by one.

Case #12 - digitalcameracorner.blogspot.com

This is a typical splog picked at random. I realize that just targeting large spammers may not be the best way to go about it, hence the randomness. I've reported this splog to AdSense as with all the others.

Monday, September 05, 2005

Case #11 - ioanaani.blogspot.com

At first glance this is yet another splog, but it's really a precursor of a lot more to come. What's different about this splog is that it has a link to sex animation web pages. AdSense has a policy of not allowing AdSense on "Pornography, adult, or mature content." Obviously I've alerted the AdSense people about this. What's becoming clear is that there is a definite rise in pornography related splogs.

Friday, September 02, 2005

Current Status Updated

It appears spammers have changed their tactics. They seemingly have gone away but they are still there. They're just hidden. They know that people like me are coming after them vigorously. They are now creating splogs that will not show up in the Blogger directory, so clicking on the Next Blog button will not show the splogs. Google will still index them, thus fulfilling the spammer's goal of manipulating Google search results to generate revenue. I guess it's time for me to read up on the Google API to ferret out these spammers. This may actually be a good thing since I can now use Google to find splogs instead of clicking on the Next Blog button. They can run but they can't hide.

Current Status

I admit I haven't posted much in here in the last couple of days. Just so you know, I haven't given up. I have four cases that I've worked on but not posted anything about. Having said that, four cases for four days isn't much. I've been busy working on a set of Perl and shell scripts to automate many analysis tasks. With this new tool I've found a set of splogs that has over 60000 links pointing to a spammer's website. I believe it will accelerate many other tasks greatly. The current number of splogs in my database is 1292. The number tends to fluctuate between 1200 and 1500 depending on how many new splogs are being created each day and how many are being shut down. I've noticed that on average Blogger deletes 75 to 150 splogs on my list. If anyone wants to report splogs you've found, please send them to fightsplog@gmail.com.

I noticed a sudden surge of pornography splogs yesterday. At the same time I think Blogger is starting to step up their efforts to curb the growth of splogs. So far we are winning the battle. I was able to click through about twenty consecutive legitimate blogs via the Next Blog button and that's pretty remarkable.

Monday, August 29, 2005

New Procedure

I've decided to change my splog reporting procedures slightly. It appears splogs are getting shut down before Google gets to see them and evaluate AdSense policy violations. Ideally it would work like what happened in Case #3. Spammers should get their AdSense account suspended and then Blogger will clean up the remaining mess. The reason why I'm aiming for AdSense account suspension is to really pull the weed out by the roots, so to speak. Just removing the splogs is not a deterrent since they will just create more splogs to replace the ones that were shut down. The new procedure is to report the splogs to AdSense and give them time to take appropriate action first, then report the case to the public after a few days. I believe this is a more effective means to deter sploggers.

If you are wondering how you can help fight splogs, there is plenty you can do. Probably the most time consuming part of fighting splogs is collecting and identifying them. You can help me by submitting your splog lists to my email address fightsplog@gmail.com. If you are a spammer and reading this, I suggest you delete your spam blogs immediately and stop any further spamming or else you risk getting banned by AdSense.

Saturday, August 27, 2005

Case #10 - 1forless.info and others

This spammer created 19 sets of 20 splogs each, totaling 380 splogs. This just goes to prove that the mentality that drives a blog spammer is no different from that of an email spammer. It's driven by volume of spam and ultimately greed.

* N is 1 through 20
eepskidkN.blogspot.com
ertgetgeegN.blogspot.com
ewddwweN.blogspot.com
fsdfggtrrefN.blogspot.com
greterwseqN.blogspot.com
gsasdfrereN.blogspot.com
kitsegeN.blogspot.com
ksdjhjsjhdeweN.blogspot.com
mnxzmbxbmN.blogspot.com
ndnfvdfrN.blogspot.com
nsdbadwwN.blogspot.com
nfhgfhghgN.blogspot.com
ooiweiuwqN.blogspot.com
opweiweuhhgvN.blogspot.com
owoiiewwdN.blogspot.com
oyutrtyrtN.blogspot.com
pocjjejsafN.blogspot.com
pwioeiwerrN.blogspot.com
xbxbbbfsdfN.blogspot.com
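
The shorthand above (a stem followed by N) can be expanded back into the full list of hostnames for reporting. Here is a minimal Python sketch of that expansion. The function name `expand` is made up for illustration, and the stems are the 19 listed above.

```python
def expand(stems, first, last, suffix=".blogspot.com"):
    """Build stem + N + suffix for every stem and every N in first..last inclusive."""
    return [f"{stem}{n}{suffix}" for stem in stems for n in range(first, last + 1)]


# The 19 stems from the list above.
stems = [
    "eepskidk", "ertgetgeeg", "ewddwwe", "fsdfggtrref", "greterwseq",
    "gsasdfrere", "kitsege", "ksdjhjsjhdewe", "mnxzmbxbm", "ndnfvdfr",
    "nsdbadww", "nfhgfhghg", "ooiweiuwq", "opweiweuhhgv", "owoiiewwd",
    "oyutrtyrt", "pocjjejsaf", "pwioeiwerr", "xbxbbbfsdf",
]

hosts = expand(stems, 1, 20)
print(len(hosts))  # prints 380
print(hosts[0])    # prints eepskidk1.blogspot.com
```

The count of 380 matches 19 sets of 20.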

These splogs as usual have no content whatsoever and point back to web pages under the following domain names:

1forless.info
1thebest.info
247sale.info
forless1.info
getcheaper1.info
getforless.info
health7.info
sale3.info
save1.info
totravel.info

Friday, August 26, 2005

Case #3 is a success!

Apparently Google has banned the AdSense account of the splogger who owns p50.info. All the ads on the domain have stopped. I will be submitting his 218 splogs to Blogger for immediate purging.

Case #9 - 101-online-reference.info and others

A very predictable set of splogs has been identified. This spammer created 136 splogs following a simple numbering pattern.

* NNNN is 1234 through 1369 = 136 splogs
referenceNNNN.blogspot.com

Every one of these splogs contains links back to his 94 .info domain sites, all with the word "reference" in the name. Here are the first five as a sample:

101-online-reference.info
101-reference-online.info
101-reference.info
101-web-reference.info
101onlinereference.info

Thursday, August 25, 2005

Status Report

The following actions have been taken so far:

8 policy violation notices submitted to Google
1 policy violation notice submitted to Searchfeed
1 spam complaint filed to a domain registrar
1 spam complaint filed to a web hosting provider Pair Networks
504 splogs are currently being targeted

1072 identified splogs remain for analysis

Case #8 - custombaseballhats9ur.blogspot.com

This splog wins the award for the lamest splog so far. It's just a page with one AdSense ad unit and nothing else. I'll let the AdSense folks go at it with this one.

Wednesday, August 24, 2005

Case #7 - laserhairsremoval.com and others

I've identified a block of 40 splogs which point back to 20 websites. Here they are:

* N is 1 through 40 except 23. It appears 23 has been removed by Blogger
N-baby-boy-nursery-1.blogspot.com
palm-tree-nursery-1.blogspot.com

The splogs above point back to the following domains:

laserhairsremoval.com
laptopmonth.com
lancastervisit.com
laptopcomputerdeal.com
laptopbatteryshop.com
laptopfast.com
laketahoeconnection.com
landscapelightinguk.com
labelwear.com
laminateflooringsales.com
laminatefloorspecialists.com
laptopmemorysales.com
labeladd.com
labelbot.com
juicerbuy.com
sofanut.com
cheapticketscenter.info
flashgroupware.com
mytopblogs.com
aap-very-best.info

It appears every domain from laserhairsremoval.com to juicerbuy.com has been registered by the same person through godaddy and hosted by Pair Networks. It appears AdSense has caught onto the splogs and disabled the AdSense account on the splogs but not on the targeted web pages. I've contacted AdSense about this, and this time I've also contacted Pair Networks about this spamming activity.

Case #6 - newgolftechniques.blogspot.com

I thought having 221 ad units on a page was bad. This one has 679 AdSense ad units on one page and 336 links to various pages on www.sewingmachine4u.info. Also I'm not sure what "New Golf Techniques" has to do with sewing machines. AdSense has been notified of this splog.

Case #5 - locator.w500.info

This spammer has created 170 splogs to draw traffic to pages on locator.w500.info. He created at least 17 sets of 10 splogs, each name starting with some gibberish followed by a number 1 through 10.

* N is 1 through 10
ccbuttuygtN.blogspot.com
cnryiueyoiuysN.blogspot.com
drritypitubytN.blogspot.com
erntruyobN.blogspot.com
etryetueytryN.blogspot.com
eybrtyuyrbN.blogspot.com
kcnbtttertvrN.blogspot.com
msiuytoiurytrN.blogspot.com
mswieuyrerytN.blogspot.com
mxryeuyetrN.blogspot.com
ncueryurytirN.blogspot.com
ncuyrtryturN.blogspot.com
sedytertvbtrrN.blogspot.com
siryiruytrbN.blogspot.com
smuytrtybirybN.blogspot.com
snurtuvrbyN.blogspot.com
swerutyurytbN.blogspot.com

AdSense has been notified of this policy violation.
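
Sets like the ones in this case (a gibberish stem plus a trailing number) can be spotted in a raw list of hostnames by grouping on the stem. The following is a rough Python sketch of that idea, not the scripts I actually use. The name `group_by_stem` and the four sample hostnames are for illustration only. It assumes the stem ends in a non-digit, so a stem that itself ends in digits (as in Case #14) would be split in the wrong place.

```python
import re
from collections import defaultdict

# Stem must end in a non-digit; the trailing digits are treated as the set number.
PATTERN = re.compile(r"^(?P<stem>.*\D)(?P<num>\d+)\.blogspot\.com$")


def group_by_stem(hosts):
    """Map each stem to the sorted list of numbers seen with it."""
    groups = defaultdict(list)
    for host in hosts:
        match = PATTERN.match(host.strip().lower())
        if match:
            groups[match.group("stem")].append(int(match.group("num")))
    return {stem: sorted(nums) for stem, nums in groups.items()}


# A few hostnames following the pattern listed above.
hosts = [
    "ccbuttuygt1.blogspot.com",
    "ccbuttuygt2.blogspot.com",
    "ccbuttuygt10.blogspot.com",
    "erntruyob3.blogspot.com",
]
groups = group_by_stem(hosts)
print(groups["ccbuttuygt"])  # prints [1, 2, 10]
```

Stems that show up with many consecutive numbers are good candidates for a single spammer's set.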

Case #4 - structured-settlements-4u.blogspot.com

structured-settlements-4u.blogspot.com is a rather "special" splog. It contains 221 AdSense ad units. Google AdSense apparently does have some sense since only three ad units are being rendered. Like any other splog, it links to the spammer's website in an attempt to draw traffic to his site. Anyway, Google has been notified of this splog.

Case #3 - p50.info

This one so far holds the record for the largest number of splogs created for a website. 218 splogs have been created by one person and they all point to some website under the p50.info domain.

* N is 1 through 57 = 57 splogs
229ik0N.blogspot.com

* N is 1 through 7 = 7 splogs
3334red3N.blogspot.com

* N is 1 through 50 = 50 splogs
3334rfe3N.blogspot.com

* N is 1 through 57 = 57 splogs
33rf56gtN.blogspot.com

* N is 10 through 56 = 47 splogs
er36dg4N.blogspot.com

57 + 7 + 50 + 57 + 47 = 218 splogs!
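
The per-set counts above come from inclusive ranges, so each one is last minus first plus one. As a quick check, here is the same arithmetic as a short Python sketch. The dictionary name `ranges` is just for illustration and the stems and ranges are the ones listed above.

```python
# Stem -> (first N, last N), both inclusive, as listed above.
ranges = {
    "229ik0": (1, 57),
    "3334red3": (1, 7),
    "3334rfe3": (1, 50),
    "33rf56gt": (1, 57),
    "er36dg4": (10, 56),
}

# An inclusive range first..last holds last - first + 1 values.
counts = {stem: last - first + 1 for stem, (first, last) in ranges.items()}
total = sum(counts.values())

print(counts["er36dg4"])  # prints 47
print(total)              # prints 218
```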

Needless to say, AdSense has been contacted about this blatant policy violation.

I believe that this spammer has many more sites with this type of splog in other places. In an attempt to figure out the registrar of the p50.info domain, I ran across 30host.com, which is a WordPress based splog. That splog also links to pages on the b60.info domain. More digging is definitely needed.

Case #2 - betteringtones.com

betteringtones.blogspot.com is a splog which contains nothing but links and an AdSense ad. Every link on that page points to the betteringtones.com website. This is a clear violation of AdSense policy. I have submitted this policy violation to AdSense for review and possible action.

Case #1 - in1hit.com

I have identified a blog spammer who created 72 splogs in an attempt to draw traffic to his website.

* NN is 01 through 72 = 72 splogs
asd0NN.blogspot.com

All these splogs contain absolutely no content and they are just pages of links pointing to http://www.in1hit.com

I've noticed that just about every link on in1hit.com's website is a clickthrough for the advertiser searchfeed.com. The page also contains AdSense ads. I've submitted complaints for violation of their spam policies. I have also contacted enom.com, the registrar of in1hit.com, and submitted a complaint there as well.

Tuesday, August 23, 2005

Fighting Splog!

I got so fed up with spamming in blogs that I've decided to do something about it. By the way, splog is short for spam blog in case you didn't know. I've begun collecting information about this new type of spamming and I've already identified over 1000 blogs which are nothing more than pages of spammed links to a spammer's website. I was a bit surprised at what I've found so far. It seems that the majority of splogs are created to draw traffic to a website for the purpose of generating ad revenue. Obviously this is the Achilles heel of their plan and I will be exploiting this weakness.

I don't believe that Blogger's "Flag" feature and keyword commenting will be enough to curb the growth of splogs. If a splog has been flagged and no longer shows up in searches or gets indexed, they will simply create more splogs in its place. Also, keyword commenting is turned off by default. Many blog writers are not even aware of this new feature. I believe fighting splogs has to be a much more proactive effort to be effective.

It appears the vast majority of splogs are created in violation of Google's AdSense terms of service. I will be submitting multiple TOS violation findings to AdSense support, starting with the most egregious violators. This is just the first step in many fights to come.