Author Archives: [XTS] Jeremy

About [XTS] Jeremy

XTS Server and Hardware Guru

Blocking Site Rippers, Crawlers and Email Scrapers

May 1, 2012 in Website Security by [XTS] Jeremy  |  No Comments

Website security comes in many forms. This article presents one method that can be used for sites running under Apache2 with mod_rewrite.

Apache2 makes use of a special file (.htaccess) that can be placed in the root of your site or in individual folders. .htaccess has a multitude of options to help combat the darker side of the web we live with. For more background, we highly recommend reading http://perishablepress.com/stupid-htaccess-tricks/#sec9 .

.htaccess can inspect the user-agent behind each request to your website and either redirect unwanted ones to another URL or answer with an error status such as 403 or 500. This requires that the Apache2 module mod_rewrite is enabled.

Add the following to your .htaccess file to enable this feature:

#Enable RewriteEngine
RewriteEngine On 

# Stop the Nasties!!!
 RewriteBase /
 RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
 RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
 RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
 RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Zeus
 RewriteRule ^.* - [F,L]
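
For readers new to mod_rewrite, here is a shortened, commented sketch of the same kind of block. The <IfModule> wrapper is our own addition and not part of the listing above. Without it, a host where mod_rewrite is not loaded will normally answer every request with a 500 error, because the directives are unknown. With it, the rules are silently skipped instead, so treat it as a trade-off rather than a requirement.

```apache
# Sketch only: a shortened version of the block above, with explanatory comments.
# Assumption: the <IfModule> guard is an optional extra, not part of the original listing.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /

    # [OR] joins a condition to the one below it; without it, conditions are ANDed.
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    # The last condition carries no [OR], since no further condition follows it.
    RewriteCond %{HTTP_USER_AGENT} ^Zeus

    # "-" leaves the URL untouched, [F] answers 403 Forbidden, [L] stops further rewriting.
    RewriteRule ^.* - [F,L]
</IfModule>
```

Comments must sit on their own lines, as they do here. Apache does not treat a # at the end of a directive line as a comment.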

To get a better idea of which user-agents might be affecting your site hosted with Xtreme Services, open your browser to http://yoursite/stats and scroll down to the section that looks like this:

In this example, a user-agent named ZmEu has been bombarding the site, followed by the Baidu spider.

To learn more about the ZmEu user-agent, search Google to see what others are saying, or go to www.botsvsbrowsers.com and search to see if it has been classified.

The ZmEu user-agent can be added to the block list as follows:

#Enable RewriteEngine
RewriteEngine On 

# Stop the Nasties!!!
 RewriteBase /
 RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
 RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
 RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
 RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
 RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
 RewriteCond %{HTTP_USER_AGENT} ZmEu [NC]
 RewriteRule ^.* - [F,L]
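
Two details in that listing are easy to miss, so here is a small sketch that isolates them. The previous last line (^Zeus) gained an [OR] because it is no longer last. The new ZmEu line has no leading ^ and uses [NC], so it matches anywhere in the user-agent string regardless of case. The ExampleBadBot entry is purely hypothetical and only shows where a further entry would go.

```apache
RewriteEngine On
RewriteBase /

# Anchored and case-sensitive: matches only if the user-agent starts with "Zeus".
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
# Unanchored and case-insensitive: matches "ZmEu", "zmeu", etc. anywhere in the string.
RewriteCond %{HTTP_USER_AGENT} ZmEu [NC,OR]
# Hypothetical extra entry, for illustration only. It is now the last condition,
# so it is the one line without OR.
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC]
RewriteRule ^.* - [F,L]
```

When appending entries, check the flags on the last two conditions. As far as we know, a stray [OR] left on the final condition can cause the rule to match every visitor.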

To test that this additional condition is effective, visit http://www.botsvsbrowsers.com/SimulateUserAgent.asp, paste ZmEu into the User Agent text box, add your website address into the URL box below it and then click GO. If the new condition worked, you should see a Forbidden message:

A comprehensive example:

#Enable RewriteEngine
RewriteEngine On

# Stop the Nasties!!
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ZmEu [NC]
RewriteRule ^.* - [F,L]
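
If a list this long becomes hard to maintain, several names can be grouped into one condition using regular expression alternation. The sketch below covers only a handful of the entries above to show the pattern. It is not a drop-in replacement for the full list, and the grouping is our own choice.

```apache
RewriteEngine On
RewriteBase /

# "|" separates alternatives inside the parentheses; "\ " is an escaped space.
# The leading ^ applies to every name in the group.
RewriteCond %{HTTP_USER_AGENT} ^(Anarchie|ASPSeek|attach|autoemailspider|BlackWidow) [OR]
RewriteCond %{HTTP_USER_AGENT} ^(Web\ Sucker|WebCopier|WebZIP|Wget|Widow) [OR]
# Unanchored, case-insensitive names kept in their own group.
RewriteCond %{HTTP_USER_AGENT} (HTTrack|Indy\ Library|ZmEu) [NC]
RewriteRule ^.* - [F,L]
```

One line per user-agent, as in the full listing, is easier to read and to comment out while testing. Grouping mainly saves space.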

If you would rather not be so friendly, you can redirect specific user-agents to a website of your choice by using a rule like this in place of the Forbidden rule.

# Redirect to a site that gets the message across
RewriteRule ^.*$ https://www.google.com/search?q=getlost [R,L]
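
The redirect rule needs its own conditions directly above it, just like the Forbidden rule. Here is a minimal sketch, assuming you want to redirect only two of the user-agents from the list above. The two names are picked purely for illustration.

```apache
RewriteEngine On
RewriteBase /

# Only these user-agents are redirected; everyone else is unaffected.
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro
# [R] sends an external redirect (302 by default); [L] stops further rewriting.
RewriteRule ^.*$ https://www.google.com/search?q=getlost [R,L]
```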

If you know a user-agent is a spam bot or mail scraper, you can use the following rule to keep it busy:

# Redirect to a Spam Poison site : )
RewriteRule ^.*$ http://english-1335903213.spampoison.com [R,L]
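
Redirects and blocks can live in the same .htaccess file, because each RewriteRule only uses the RewriteCond lines directly above it. The sketch below sends two agents to the spam poison site and forbids two others. Treating EmailSiphon and EmailWolf as mail scrapers is an assumption based on their names, so check your own stats before copying this.

```apache
RewriteEngine On
RewriteBase /

# Rule set 1: assumed mail scrapers are redirected to the spam poison site.
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf
RewriteRule ^.*$ http://english-1335903213.spampoison.com [R,L]

# Rule set 2: other unwanted agents simply receive 403 Forbidden.
# These conditions belong only to the rule below, not to the one above.
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ZmEu [NC]
RewriteRule ^.* - [F,L]
```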

 

Disclaimer: this article is a helpful how-to only. Implement the content at your own risk. Be sure to test all changes you make to .htaccess thoroughly so that valid user-agents do not receive unwanted error messages.

XTS Opposes C.I.S.P.A (HR3523)

April 27, 2012 in News by [XTS] Jeremy  |  No Comments

We in the IT and Web industry deal with Cyber threats on a daily basis. However, we do not believe that giving up our Privacy will make us any safer. Please join us in encouraging your Representatives and Senators to NOT support this bill. Take action by CLICKING HERE

Learn more about CISPA

Who else opposes CISPA?

SOPA Weather Black Out

January 18, 2012 in News by [XTS] Jeremy  |  No Comments

Today is the official SOPA/PIPA blackout day recognized by Google, Wikipedia and many other web giants. Between 10:45am and 2:30pm PST we had a blackout ourselves. The Xtreme winter weather in the western Washington area caused a chain of events that resulted in a severed Fiber Line. Everything is back online thanks to our ISP technicians. We apologize for the outage and appreciate your understanding with service blips that are out of our hands. ~ XTS team.

XTS Opposes S.O.P.A. (H.R. 3261)

January 17, 2012 in News by [XTS] Jeremy  |  No Comments

We would like to say something about a controversial piece of legislation named S.O.P.A. (Stop Online Piracy Act).

This bill in its current form might have good intentions to fight Piracy, but as the Web Industry unanimously agrees, we can do better.

H.R. 3261 was introduced on October 26th, 2011 by Lamar Smith (R-TX) to the House of Representatives. It will be debated in the Senate January 23rd, 2012.

You can read it for yourself :

http://thomas.loc.gov/cgi-bin/query/z?c112:H.R.3261:

Xtreme Services stands with Google, Facebook, Foursquare, Wikipedia, eBay, Paypal, LinkedIn, Mozilla, Twitter, Yahoo!, Zynga and many others in opposing this bill.

You can look up your State Senators to see where they stand on the bill.

http://projects.propublica.org/sopa/

Sincerely,

~XTS Team

Happy New Year 2012!

January 1, 2012 in News by [XTS] Jeremy  |  No Comments

We would like to wish all of our members a very happy New Year in 2012! We look forward to what the new year will bring as we continue to do our best in providing Xcellent services to our members. Sincerely ~XTS team

Merry Christmas!

December 23, 2011 in News by [XTS] Jeremy  |  No Comments

We would like to wish all of our Members a very Merry Christmas! To celebrate this wonderful season, we are giving the gift of Bandwidth! This Christmas morning you will wake up to a major increase in our Internet Pipe. We hope this fills you with good Pingings of great Joy. Sincerely ~XTS Team

XTS is Green

November 18, 2011 in News by [XTS] Jeremy  |  No Comments

Be assured that your hosting with Xtreme Services is leaving little to no footprint on the environment.
Our servers are powered by electricity distributed by Snohomish PUD, broken down as follows:

XTS uses Green Power

ISP Outage 4:15pm PST 11/11/2011

November 11, 2011 in Outage by [XTS] Jeremy  |  No Comments

Our ISP services were offline between 4:15pm and 5:15pm PST due to the wind storm today. Sorry for the inconvenience. ~Support

High Availability

October 7, 2011 in News by [XTS] Jeremy  |  No Comments

Fun Facts: Xtreme Services uses Heartbeat and DRBD coupled with RAID 10 data volumes for an Xtremely reliable, High Availability hosting environment.