I have compiled several lists of names of bots, spiders and crawlers so they can easily be pasted into the filter of a traffic counter, e.g. Count per Day for WordPress.
According to a report by the internet security company Incapsula, 61.5% of website visitors are now non-human. Good bots, such as those indexing your pages for search engines, account for 31% of all visitors. The remaining 30.5% are impersonators (20.5%), scrapers (5%), hacking tools (4.5%) and spammers (0.5%).
Check out the link to the original report; it has a cool infographic explaining what the different bots do. Of course the company benefits from high numbers of potentially bad non-humans, so a confirmation of these numbers from the “other side” (?) would be helpful. Table 3 lists perhaps the most common non-human visitors, meaning those that appear in several of the lists I found (“best known” would be a better term).
Some bots are specifically designed to generate fake page views which can be useful if you are selling advertising space.
I had already suspected that my page counter does not filter out enough robots, since the number of comments is relatively low. I recently saw a spike in page views and noticed that it was created by only a few visitors who appeared to view more than 50 pages each.
I tried to find a list of names of the most common bots which I could use as a filter for my page counter. I found various sites, but none of them offered a handy list with just the names which I could paste into my WordPress plugin, e.g. “Count per Day”.
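A spike like this can be checked with a few lines of Python. This is only a minimal sketch: it assumes the page views are available as (visitor, page) pairs, which is a made-up format and not what any particular counter plugin exports, and the function name `heavy_visitors` is hypothetical.

```python
from collections import Counter

def heavy_visitors(page_views, threshold=50):
    """Return {visitor: views} for visitors with more than `threshold` page views."""
    counts = Counter(visitor for visitor, _page in page_views)
    return {visitor: n for visitor, n in counts.items() if n > threshold}

# Hypothetical log: (visitor id, page) pairs, e.g. an IP address and a URL path.
log = [("10.0.0.1", f"/post-{i}") for i in range(60)]
log += [("10.0.0.2", "/about"), ("10.0.0.2", "/contact")]

print(heavy_visitors(log))  # {'10.0.0.1': 60}
```

The threshold of 50 simply mirrors the number I noticed; a real human on a binge could exceed it too.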
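To illustrate what such a name filter does, here is a rough Python sketch. I do not know how Count per Day matches names internally, so the case-insensitive substring test below is an assumption, and `is_bot` and `min_length` are hypothetical names of my own.

```python
def is_bot(user_agent, bot_names, min_length=3):
    """True if any sufficiently long bot name occurs in the user agent.

    Matching is a case-insensitive substring test (an assumption, not
    the documented behaviour of any plugin). Names shorter than
    `min_length` are skipped because they match far too much.
    """
    ua = user_agent.lower()
    return any(
        len(name) >= min_length and name.lower() in ua
        for name in bot_names
    )

names = ["Googlebot", "AhrefsBot", "8"]
crawler = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (X11; Linux x86_64; rv:118.0) Gecko/20100101 Firefox/118.0"

print(is_bot(crawler, names))                 # True
print(is_bot(browser, names))                 # False
print(is_bot(browser, names, min_length=1))   # True, because of the name "8"
```

The last line shows why very short entries such as `8` are risky with substring matching: they also hit ordinary browsers.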
So I did some formatting and collecting and thought this might be useful for others too – so here it is. Just copy the whole tables into Excel and you will get the names in separate columns, ready to paste into your filter.
It is possible though that there are many thousands of bots and the most evil ones are likely to change names. So the list may or may not be useful. I am trying it out now and will report back after a while if it made any difference in my traffic count.
The different sources have little overlap in their names, which also suggests a much larger number of bots than any one list contains. The lists hold 1561 names in total, and 1374 remain after removing the duplicates. I also don’t know if such a long filter list slows the site down.
Warning: If you use the list not for traffic count filtering but to block bots from your site completely, be aware that some columns contain not only bad bots but also “good” ones like Googlebot or Bingbot. So don’t use the lists for banning (e.g. via robots.txt or .htaccess) unless you have checked for and removed the few good ones :) and have had a quick look at the source. Some sources only list bad ones (e.g. Table 2, last column). For blocking it may be better to whitelist the few good ones.
Update 1: After I cleaned my counter using a combined list of all bots from the tables below, the traffic count dropped by about 50%, consistently over the past 12 months. Since I had already been filtering some bots before, it is indeed possible that about 60% of all traffic to my site is non-human.
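Merging the lists and dropping duplicates can be sketched in Python as follows. The function `merge_unique` is a hypothetical helper, and treating names that differ only in upper/lower case as duplicates is my assumption here, so the result will not necessarily reproduce the 1374 figure.

```python
def merge_unique(*lists):
    """Merge several name lists, keeping the first spelling of each name.

    Names are compared case-insensitively after stripping whitespace
    (an assumption about what counts as a duplicate).
    """
    seen = set()
    merged = []
    for names in lists:
        for name in names:
            cleaned = name.strip()
            key = cleaned.lower()
            if key and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged

bad_bots = ["AhrefsBot", "asterias", "asterias", "Wget", "wget"]
crawlers = ["BaiduSpider", "AhrefsBot"]

print(merge_unique(bad_bots, crawlers))
# ['AhrefsBot', 'asterias', 'Wget', 'BaiduSpider']
```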
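If you do want a block list, the good bots can be stripped out first. The following Python sketch assumes a tiny allowlist of my own choosing (`GOOD_BOTS` is hypothetical and far from complete) and removes every name that contains one of its entries.

```python
# Hypothetical allowlist of "good" bots; extend it to suit your own site.
GOOD_BOTS = ("googlebot", "bingbot")

def without_good_bots(names, good_bots=GOOD_BOTS):
    """Drop every name that contains an allowlisted bot name (case-insensitive)."""
    return [
        name for name in names
        if not any(good in name.lower() for good in good_bots)
    ]

candidates = ["Googlebot", "Googlebot-Image", "Bingbot", "EmailSiphon", "WebZIP"]
print(without_good_bots(candidates))  # ['EmailSiphon', 'WebZIP']
```

Anything not covered by the allowlist stays in, so the result still needs a manual look before it is used for banning.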
Unfortunately, the long list of names (over 1300 after removing duplicates) in the filter caused my counter plugin (Count per day) to slow down both the back and front end of the site such that it would not connect even after a minute.
I have now added a third table that only contains names that appear at least twice, which brings the number of names down to 140 (from over 1300): the best known bots. However, I still had connection problems.
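The "appears at least twice" selection can be approximated in Python like this. It is a sketch under assumptions: each name is counted once per source list and compared case-insensitively, which may differ from how I built the third table by hand, and `names_in_multiple_lists` is a hypothetical helper.

```python
from collections import Counter

def names_in_multiple_lists(lists, min_lists=2):
    """Return names found in at least `min_lists` of the given lists.

    Each list contributes at most one count per name; the first
    spelling encountered is the one returned.
    """
    counts = Counter()
    first_spelling = {}
    for names in lists:
        seen_here = set()
        for name in names:
            cleaned = name.strip()
            key = cleaned.lower()
            if not key or key in seen_here:
                continue
            seen_here.add(key)
            counts[key] += 1
            first_spelling.setdefault(key, cleaned)
    common = [first_spelling[k] for k, c in counts.items() if c >= min_lists]
    return sorted(common, key=str.lower)

lists = [
    ["AhrefsBot", "asterias", "asterias"],
    ["Bingbot", "Googlebot"],
    ["ahrefsbot", "Googlebot", "Wget"],
]
print(names_in_multiple_lists(lists))  # ['AhrefsBot', 'Googlebot']
```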
Update 2: It turned out my host had a (hopefully independent!) connection problem around the same time. It may be that the list of 1300+ bots is not a problem after all. Things are working fine now with the 140 bots (+ my old list of about 20) filtered out. No delay and no spike in the views so far.
Table 1 of 3: Four lists of names of bots, crawlers and spiders
bad bots http://www.botreports.com/badbots/index.shtml | advertising http://www.botreports.com/advertising/index.shtml | crawlers http://www.botreports.com/crawlers/index.shtml | scrapers http://www.botreports.com/scrapers/index.shtml |
AhrefsBot | adidxbot | 8 | AhrefsBot |
AITCSRobot/1.1 | AdsBot-Google | 200PleaseBot | Huaweisymantecspider |
Alexibot | AMZNKAssocBot | 360Spider | Offline Explorer |
Aqua_Products | grapeshot | 4seohuntBot | SiteSnagger |
Arachnophilia | Mediapartners-Google | A1 Sitemap Generator | TeleportPro |
ASpider/0.09 | MSR-ISRCCrawler | ABACHOBot | WebCopier |
asterias | YandexDirect | ABCdatos BotLink | WebReaper |
asterias | | Aboundexbot | WebStripper |
AURESYS/1.0 | | AboutUsBot | WebZIP |
b2w/0.1 | | Accoona-AI-Agent | Xaldon_WebSpider |
BackDoorBot | | AddSugarSpiderBot | |
BackDoorBot/1.0 | | Ahoy! The Homepage Finder | |
BackRub/. | | ArchitextSpider | |
Baiduspider-video | | archive.org_bot | |
Big Brother | | BaiduSpider | |
Bizbot003 | | Baiduspider-image | |
BizBot04 kirk.overleaf.com | | Baiduspider-news | |
Black Hole | | BecomeBot | |
Black.Hole | | BeslistBot | |
BlackWidow | | Bingbot | |
BLEXBot | | BingPreview | |
BlowFish | | CatchBot | |
BlowFish/1.0 | | ccbot | |
Bookmark search tool | | ChangeDetection | |
Bot mailto:craftbot@yahoo.com | | city review | |
BotALot | | Daumoa | |
BotRightHere | | envolk | |
BSpider/1.0 libwww-perl/0.40 | | ExaBot | |
BuiltBotTough | | facebookexternalhit | |
Bullseye | | FDSE robot | |
CheeseBot | | Feedfetcher-Google | |
CherryPicker | | Genieo | |
CherryPickerElite/1.0 | | Gigabot | |
CherryPickerSE/1.0 | | Girafabot | |
ChinaClaw | | Googlebot | |
Copernic | | Googlebot-Image | |
CopyRightCheck | | Googlebot-News | |
cosmos | | grapeshot | |
Crescent | | grub-client | |
Crescent Internet ToolPak HTTP OLE Control v.1.0 | | gsa-crawler | |
Custo | | ia_archiver | |
CyberPatrol SiteCat Webbot | | IRLbot | |
Daumoa | | Linguee Bot | |
DISCo | | linkdexbot/2.0 | |
DISCo Pump 3.0 | | linkdexbot/2.1 | |
DISCo Pump 3.2 | | LinkedInBot | |
DISCoFinder | | magpie-crawler | |
DittoSpyder | | MJ12bot | |
Download Demon | | Mnogosearch | |
Download Demon/3.2.0.8 | | msnbot | |
Download Demon/3.5.0.11 | | MSRBot | |
dumbot | | NaverBot | |
eCatch | | oBot | |
eCatch/3.0 | | PagePeeker | |
EirGrabber | | Psbot | |
EmailCollector | | ScoutJet | |
EmailSiphon | | SeznamBot | |
EmailWolf | | Sosospider | |
EroCrawler | | Sougou+web+spider | |
es | | Speedy Spider | |
Express WebPictures | | Twiceler | |
Express WebPictures (www.express-soft.com) | | UnwindFetchor/1.0 | |
ExtractorPro | | VoilaBot | |
EyeNetIE | | WebDataCentreBot | |
FairAd Client | | Yahoo Pipes 1.0 | |
Flaming AttackBot | | Yahoo! Slurp | |
FlashGet | | Yahoo! Slurp China | |
FlashGet WebWasher 3.2 | | Yandex | |
Foobot | | YandexAntivirus | |
FrontPage | | YandexBlogs | |
FrontPage [NC,OR] | | YandexBot | |
Gaisbot | | YandexCatalog | |
GetRight | | YandexDirect | |
GetRight/2.11 | | YandexDirect | |
GetRight/3.1 | | YandexFavicons | |
GetRight/3.2 | | YandexImages | |
GetRight/3.3 | | YandexMarket | |
GetRight/3.3.3 | | YandexMedia | |
GetRight/3.3.4 | | YandexNews | |
GetRight/4.0.0 | | YandexPagechecker | |
GetRight/4.1.0 | | YandexVideo | |
GetRight/4.1.1 | | YandexWebmaster | |
GetRight/4.1.2 | | YandexZakladki | |
GetRight/4.2 | | Yodaobot | |
GetRight/4.2b (Portuguxeas) | | zibber-v0.1(www.zibb.com/crawler/) | |
GetRight/4.2c | | ZyBorg | |
GetRight/4.3 | |||
GetRight/4.5 | |||
GetRight/4.5a | |||
GetRight/4.5b | |||
GetRight/4.5b1 | |||
GetRight/4.5b2 | |||
GetRight/4.5b3 | |||
GetRight/4.5b6 | |||
GetRight/4.5b7 | |||
GetRight/4.5c | |||
GetRight/4.5d | |||
GetRight/4.5e | |||
GetRight/5.0beta1 | |||
GetRight/5.0beta2 | |||
GetURL.rexx v1.05 | |||
GetWeb! | |||
Go!Zilla | |||
Go!Zilla (www.gozilla.com) | |||
Go!Zilla 3.3 (www.gozilla.com) | |||
Go!Zilla 3.5 (www.gozilla.com) | |||
Go-Ahead-Got-It | |||
Golem/1.1 | |||
GrabNet | |||
Grafula | |||
Gromit/1.0 | |||
grub | |||
HappyFunBot | |||
Harvest | |||
Harvest/1.5 | |||
Hatena Antenna | |||
hloader | |||
HMView | |||
httplib | |||
HTTrack | |||
HTTrack 3.0 | |||
HTTrack [NC,OR] | |||
Huaweisymantecspider | |||
humanlinks | |||
Image Stripper | |||
Image Sucker | |||
inagist.com url crawler | |||
Indy Library | |||
Indy Library [NC,OR] | |||
InfoNaviRobot | |||
Informant | |||
InterGET | |||
Internet Ninja | |||
Internet Ninja 4.0 | |||
Internet Ninja 5.0 | |||
Internet Ninja 6.0 | |||
Iron33/1.0.2 | |||
JennyBot | |||
JetCar | |||
JOC Web Spider | |||
Kenjin Spider | |||
Kenjin.Spider | |||
Keyword Density/0.9 | |||
Keyword.Density | |||
larbin | |||
larbin (samualt9@bigfoot.com) | |||
larbin samualt9@bigfoot.com | |||
larbin_2.6.2 (kabura@sushi.com) | |||
larbin_2.6.2 (larbin2.6.2@unspecified.mail) | |||
larbin_2.6.2 (listonATccDOTgatechDOTedu) | |||
larbin_2.6.2 (vitalbox1@hotmail.com) | |||
larbin_2.6.2 kabura@sushi.com | |||
larbin_2.6.2 larbin2.6.2@unspecified.mail | |||
larbin_2.6.2 larbin@correa.org | |||
larbin_2.6.2 listonATccDOTgatechDOTedu | |||
larbin_2.6.2 vitalbox1@hotmail.com | |||
LeechFTP | |||
LexiBot | |||
libWeb/clsHTTP | |||
LinkextractorPro | |||
LinkScan/8.1a Unix | |||
LinkScan/8.1a.Unix | |||
LinkWalker | |||
LNSpiderguy | |||
lwp-trivial | |||
lwp-trivial/1.34 | |||
Mass Downloader | |||
Mass Downloader/2.2 | |||
Mata Hari | |||
Mata.Hari | |||
Microsoft URL Control | |||
Microsoft URL Control – 5.01.4511 | |||
Microsoft URL Control – 6.00.8169 | |||
Microsoft.URL | |||
MIDown tool | |||
MIIxpc | |||
MIIxpc/4.2 | |||
Mister PiX | |||
Mister Pix II 2.01 | |||
Mister Pix II 2.02a | |||
Mister PiX version.dll | |||
Mister.PiX | |||
moget | |||
moget/2.1 | |||
MSIECrawler | |||
naver | |||
Navroad | |||
NearSite | |||
NeoScioCrawler | |||
Net Vampire | |||
Net Vampire/3.0 | |||
NetAnts | |||
NetAnts/1.10 | |||
NetAnts/1.23 | |||
NetAnts/1.24 | |||
NetAnts/1.25 | |||
NetCarta CyberPilot Pro | |||
NetMechanic | |||
NetSpider | |||
NetZIP | |||
NetZip Downloader 1.0 Win32(Nov 12 1998) | |||
NetZip-Downloader/1.0.62 (Win32; Dec 7 1998) | |||
NetZippy+(http://www.innerprise.net/usp-spider.asp) | |||
NICErsPRO | |||
NPbot | |||
Octopus | |||
Offline Explorer | |||
Offline Explorer/1.2 | |||
Offline Explorer/1.4 | |||
Offline Explorer/1.6 | |||
Offline Explorer/1.7 | |||
Offline Explorer/1.9 | |||
Offline Explorer/2.0 | |||
Offline Explorer/2.1 | |||
Offline Explorer/2.3 | |||
Offline Explorer/2.4 | |||
Offline Explorer/2.5 | |||
Offline Navigator | |||
Offline.Explorer | |||
OGspider | |||
Openbot | |||
Openfind | |||
Openfind data gatherer | |||
Oracle Ultra Search | |||
PageGrabber | |||
Papa Foto | |||
pavuk | |||
pcBrowser | |||
PerMan | |||
ProPowerBot/2.14 | |||
ProWebWalker | |||
psbot | |||
Python-urllib | |||
QueryN Metasearch | |||
QueryN.Metasearch | |||
R6_CommentReader | |||
R6_FeedFetcher | |||
Radiation Retriever 1.1 | |||
RealDownload | |||
RealDownload/4.0.0.40 | |||
RealDownload/4.0.0.41 | |||
RealDownload/4.0.0.42 | |||
ReGet | |||
RepoMonkey | |||
RepoMonkey Bait & Tackle/v1.01 | |||
RMA | |||
Roverbot | |||
searchpreview | |||
SiteSnagger | |||
SlySearch | |||
SmartDownload | |||
SmartDownload/1.2.76 (Win32; Apr 1 1999) | |||
SmartDownload/1.2.77 (Win32; Aug 17 1999) | |||
SmartDownload/1.2.77 (Win32; Feb 1 2000) | |||
SmartDownload/1.2.77 (Win32; Jun 19 2001) | |||
Snooper/b97_01 | |||
Solbot/1.0 LWP/5.07 | |||
sootle | |||
SpankBot | |||
spanner | |||
Spanner/1.0 (Linux 2.0.27 i586) | |||
spyder3.microsys.com | |||
Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux | |||
SuperBot | |||
SuperBot/3.0 (Win32) | |||
SuperBot/3.1 (Win32) | |||
SuperHTTP | |||
SuperHTTP/1.0 | |||
Surfbot | |||
suzuran | |||
Szukacz/1.4 | |||
tAkeOut | |||
Teleport | |||
Teleport Pro | |||
Teleport Pro/1.29 | |||
Teleport Pro/1.29.1590 | |||
Teleport Pro/1.29.1634 | |||
Teleport Pro/1.29.1718 | |||
Teleport Pro/1.29.1820 | |||
Teleport Pro/1.29.1847 | |||
TeleportPro | |||
Telesoft | |||
The Intraformant | |||
The.Intraformant | |||
TheNomad | |||
TightTwatBot | |||
Titan | |||
toCrawl/UrlDispatcher | |||
True_Robot | |||
True_Robot/1.0 | |||
turingos | |||
TurnitinBot | |||
UnisterBot | |||
UnwindFetchor/1.0 | |||
URL Control | |||
URLSpiderPro | |||
urlck/1.2.3 | |||
URLy Warning | |||
URLy.Warning | |||
Valkyrie/1.0 libwww-perl/0.40 | |||
vBSEO | |||
VCI | |||
VCI WebViewer VCI WebViewer Win32 | |||
VoidEYE | |||
Web Image Collector | |||
Web Sucker | |||
Web.Image.Collector | |||
WebAuto | |||
WebAuto/3.40 (Win98; I) | |||
WebBandit | |||
WebBandit/3.50 | |||
WebCapture 2.0 | |||
WebCopier | |||
WebCopier v.2.2 | |||
WebCopier v2.5 | |||
WebCopier v2.6 | |||
WebCopier v2.7a | |||
WebCopier v2.8 | |||
WebCopier v3.0 | |||
WebCopier v3.0.1 | |||
WebCopier v3.2 | |||
WebCopier v3.2a | |||
WebCopy/ | |||
WebCrawler/3.0 Robot libwww/5.0a | |||
WebEMailExtrac.* | |||
WebEnhancer | |||
WebFerret | |||
WebFetch | |||
webfetch/2.1.0 | |||
WebFetcher/0.8, | |||
WebGo IS | |||
weblayers/0.0 | |||
WebLeacher | |||
WebLinker/0.0 libwww-perl/0.1 | |||
WebmasterWorld Extractor | |||
WebmasterWorld Extractor | |||
WebmasterWorldForumBot | |||
WebmasterWorldForumBot | |||
WebMoose/0.0.0000 | |||
WebReaper | |||
WebReaper [info@webreaper.net] | |||
WebReaper [webreaper@otway.com] | |||
WebReaper v9.1 – www.otway.com/webreaper | |||
WebReaper v9.7 – www.webreaper.net | |||
WebReaper v9.8 – www.webreaper.net | |||
WebReaper vWebReaper v7.3 – www,otway.com/webreaper | |||
webs@recruit.co.jp | |||
WebSauger | |||
WebSauger 1.20b | |||
WebSauger 1.20j | |||
WebSauger 1.20k | |||
Website eXtractor | |||
Website Quester | |||
Website Quester – www.asona.org | |||
Website Quester – www.esalesbiz.com/extra/ | |||
Website.Quester | |||
Webster Pro | |||
Webster.Pro | |||
WebStripper | |||
WebStripper/2.03 | |||
WebStripper/2.10 | |||
WebStripper/2.12 | |||
WebStripper/2.13 | |||
WebStripper/2.15 | |||
WebStripper/2.16 | |||
WebStripper/2.19 | |||
WebVac | |||
webvac/1.0 | |||
webwalk | |||
WebWalker | |||
WebWalker/1.10 | |||
WebWatch | |||
WebWhacker | |||
WebZIP | |||
WebZIP/2.75 (http://www.spidersoft.com) | |||
WebZIP/3.65 (http://www.spidersoft.com) | |||
WebZIP/3.80 (http://www.spidersoft.com) | |||
WebZip/4.0 | |||
WebZIP/4.0 (http://www.spidersoft.com) | |||
WebZIP/4.1 (http://www.spidersoft.com) | |||
WebZIP/4.21 | |||
WebZIP/4.21 (http://www.spidersoft.com) | |||
WebZIP/5.0 | |||
WebZIP/5.0 (http://www.spidersoft.com) | |||
WebZIP/5.0 PR1 (http://www.spidersoft.com) | |||
Wget | |||
wget | |||
Wget/1.4.0 | |||
Wget/1.5.2 | |||
Wget/1.5.3 | |||
Wget/1.6 | |||
Wget/1.7 | |||
Wget/1.8 | |||
Wget/1.8.1 | |||
Wget/1.8.1+cvs | |||
Wget/1.8.2 | |||
Wget/1.9-beta | |||
WhoWhere Robot | |||
Widow | |||
wired-digital-newsbot/1.5 | |||
WWW Collector | |||
WWW-Collector-E | |||
www.freeloader.com. | |||
WWWOFFLE | |||
WWWWanderer v3.0 | |||
Xaldon WebSpider | |||
Xaldon WebSpider 2.5.b3 | |||
Xaldon_WebSpider | |||
Xenu’s | |||
Xenu’s Link Sleuth 1.1c | |||
XGET/0.7 | |||
Yasaklibot | |||
yes | |||
YesupBot | |||
Yeti | |||
Zeus | |||
Zeus 11389 Webster Pro V2.9 Win32 | |||
Zeus 11652 Webster Pro V2.9 Win32 | |||
Zeus 18018 Webster Pro V2.9 Win32 | |||
Zeus 26378 Webster Pro V2.9 Win32 | |||
Zeus 30747 Webster Pro V2.9 Win32 | |||
Zeus 32297 Webster Pro V2.9 Win32 | |||
Zeus 39206 Webster Pro V2.9 Win32 | |||
Zeus 41641 Webster Pro V2.9 Win32 | |||
Zeus 44238 Webster Pro V2.9 Win32 | |||
Zeus 51070 Webster Pro V2.9 Win32 | |||
Zeus 51674 Webster Pro V2.9 Win32 | |||
Zeus 51837 Webster Pro V2.9 Win32 | |||
Zeus 63567 Webster Pro V2.9 Win32 | |||
Zeus 6694 Webster Pro V2.9 Win32 | |||
Zeus 82016 Webster Pro V2.9 Win32 | |||
Zeus 82900 Webster Pro V2.9 Win32 | |||
Zeus 84842 Webster Pro V2.9 Win32 | |||
Zeus 90872 Webster Pro V2.9 Win32 | |||
Zeus 94934 Webster Pro V2.9 Win32 | |||
Zeus 95245 Webster Pro V2.9 Win32 | |||
Zeus 95351 Webster Pro V2.9 Win32 | |||
Zeus 97371 Webster Pro V2.9 Win32 | |||
Zeus Link Scout | |||
ZyBorg | |||
Table 2 of 3: Four more independent (?) lists of bots, crawlers, spiders
http://user-agent-string.info/de/list-of-ua/bots | http://www.projectcounter.org/r4/COUNTER_Robots_list_Jan2014.txt | http://www.robotstxt.org/db.html | http://www.rhyolite.com/anti-spam/badbots.html |
AddThis.com | [^a]fish | ABCdatos BotLink | purebot |
MSNBot | [+:,\.\;\/\\-]bot | Acme.Spider | Ezooms |
DotBot | ^$ | Ahoy! The Homepage Finder | MJ12bot |
WeSEE | ^IDA$ | Alkaline | SurveyBot |
bingbot | ^ruby$ | Anthill | sitebot |
proximic | ^voyager\/ | Walhello appie | dotnetdotcom |
Mail.Ru bot | acme\.spider | Arachnophilia | dotbot |
Googlebot | alexa | Arale | SolomonoBot |
Exabot | Alexandria(\s|\+)prototype(\s|\+)project | Araneo | ZmEu |
YandexBot | AllenTrack | AraybOt | Morfeus |
MJ12bot | almaden | ArchitextSpider | Snoopy |
NetcraftSurveyAgent | appie | Aretha | WBSearchBot |
Genieo Web filter | Arachmo | ARIADNE | Exabot |
seegnifybot | architext | arks | findlinks |
Yahoo! | archive\.org_bot | AskJeeves | aiHitBot |
EasouSpider | arks | ASpider (Associative Spider) | AhrefsBot |
ia_archiver | asterias | ATN Worldwide | DinoPing |
Baiduspider | atomz | Atomz.com Search Robot | panopta.com |
NaverBot | autoemailspider | AURESYS | linkchecker.sourceforge.net |
sogou spider | awbot | BackRub | linkcheck |
sistrix | baiduspider | Bay Spider | Searchmetrics |
archive.org_bot | bbot | Big Brother | lipperhey |
ShopWiki | BDFetch | Bjaaland | dataprovider.com |
coccoc | biadu | BlackWidow | SemrushBot |
omgilibot | biglotron | Die Blinde Kuh | Sosospider |
magpie-crawler | bjaaland | Bloodhound | discoverybot |
Vagabondo | blaiz\-bee | Borg-Bot | Yandex |
BLEXBot | bloglines | BoxSeaBot | www.integromedb.org/Crawler |
FlipboardProxy | blogpulse | bright.net caching robot | 360Spider |
SeznamBot | boitho\.com\-dc | BSpider | 80legs |
psbot | bookmark\-manager | CACTVS Chemistry Spider | YamanaLab-Robot |
Woko | bot | Calif | ip-web-crawler.com |
URLAppendBot | Brutus\/AET | Cassandra | Aboundex |
AhrefsBot | bspider | Digimarc Marcspider/CGI | Aboundex |
Infohelfer | bwh3_user_agent | Checkbot | |
GrapeshotCrawler | celestial | ChristCrawler.com | |
CareerBot | cfnetwork|checkbot | churl | |
rogerbot | checkprivacy | cIeNcIaFiCcIoN.nEt | |
bitlybot | China\sLocal\sBrowse\s2\.6 | CMC/0.01 | |
ShowyouBot | cloakDetect | Collective | |
MetaJobBot | Code\sSample\sWeb\sClient | Combine System | |
ChangeDetection | combine | Conceptbot | |
TurnitinBot | commons\-httpclient | ConfuzzledBot | |
Netseer | contentmatch | CoolBot | |
Wotbox | ContentSmartz | Web Core / Roots | |
Blekkobot | core | XYLEME Robot | |
Daumoa | CoverScout | Internet Cruiser Robot | |
aiHitBot | crawl | Cusco | |
SemrushBot | crawler | CyberSpyder Link Test | |
spbot | cursor | CydralSpider | |
linkdexbot | custo | Desert Realm Spider | |
MojeekBot | DataCha0s\/2\.0 | DeWeb(c) Katalog/Index | |
uMBot | daumoa | DienstSpider | |
SEOENGBot | Demo\sBot | Digger | |
A6-Indexer | docomo | Digital Integrity Robot | |
Crawler4j | Download\+Master | Direct Hit Grabber | |
ZumBot | DSurf | DNAbot | |
OpenWebSpider | dtSearchSpider | DownLoad Express | |
WBSearchBot | dumbot | DragonBot | |
IntegromeDB | easydl | DWCP (Dridus’ Web Cataloging Project) | |
Spinn3r | EmailSiphon | e-collector | |
WebCorp | EmailWolf | EbiNess | |
IstellaBot | exabot | EIT Link Verifier Robot | |
BUbiNG | fast-webcrawler | ELFINBOT | |
| favorg | Emacs-w3 Search Engine | |
Jyxobot | FDM(\s|\+)1 | ananzi | |
BingPreview | feedburner | esculapio | |
SEOkicks-Robot | FeedFetcher | Esther | |
BacklinkCrawler | feedfetcher\-google | Evliya Celebi | |
SearchmetricsBot | ferret | FastCrawler | |
NalezenCzBot | Fetch(\s|\+)API(\s|\+)Request | Fluid Dynamics Search Engine robot | |
Motoricerca-Robots.txt-Checker | findlinks | Felix IDE | |
VoilaBot | Fulltext | Wild Ferret Web Hopper #1, #2, #3 | |
MetaGeneratorCrawler | Funnelback | FetchRover | |
socialbm_bot | gaisbot | fido | |
Aboundexbot | GetRight | Hämähäkki | |
ichiro | geturl | KIT-Fireball | |
CCBot | gigabot | Fish search | |
gocrawl | girafabot | Fouineur | |
HubSpot Connect | gnodspider | Robot Francoroute | |
iCjobs | Goldfire(\s|\+)Server | Freecrawl | |
Company News Search engine | | FunnelWeb | |
x28-job-bot | grub | gammaSpider, FocusedCrawler | |
EveryoneSocialBot | gulliver | gazz | |
Seobility | harvest | GCreep | |
Ezooms | heritrix | GetBot | |
Symfony Spider | hl_ftien_spider | GetURL | |
Iframely | holmes | Golem | |
Lipperhey Spider | htdig | Googlebot | |
NextGenSearchBot | htmlparser | Grapnel/0.01 Experiment | |
KrOWLer | HttpComponents\/1.1 | Griffon | |
netEstate Crawler | HTTPFetcher | Gromit | |
Twingly Recon | httpget\?5\.2\.2 | Northern Light Gulliver | |
Robots_Tester | httpget\-5\.2\.2 | Gulper Bot | |
BDCbot | httrack | HamBot | |
FacebookExternalHit | ia_archiver | Harvest | |
meanpathbot | ichiro | havIndex | |
oBot | iktomi | HI (HTML Index) Search | |
Arachnophilia | ilse | Hometown Spider Pro | |
bixocrawler | internetseer | ht://Dig | |
eCommerceBot | intute | HTMLgobble | |
Alexabot | iSiloX | Hyper-Decontextualizer | |
emefgebot | Jakarta\+Commons\-HttpClient | iajaBot | |
AntBot | java | IBM_Planetwide | |
UASlinkChecker | jeeves | Popular Iconoclast | |
Kraken | jobo | Ingrid | |
Nuhk | kyluka | Imagelock | |
Panscient web crawler | larbin | IncyWincy | |
Najdi.si | libcurl | Informant | |
SecurityResearchBot | libwww | InfoSeek Robot 1.0 | |
yacybot | libwww\-perl | Infoseek Sidewinder | |
CloudServerMarketSpider | lilina | InfoSpiders | |
YYSpider | linkbot | Inspector Web | |
UnisterBot | linkcheck | IntelliAgent | |
200PleaseBot | linkchecker | I, Robot | |
ICC-Crawler | LinkLint-checkonly | Iron33 | |
trendictionbot | linkscan | Israeli-search | |
Peeplo Screenshot Bot | linkwalker | JavaBee | |
Steeler | livejournal\.com | JBot Java Web Robot | |
nekstbot | lmspider | JCrawler | |
360Spider | LOCKSS | Jeeves | |
LoadTimeBot | lwp | JoBo Java Web Robot | |
SpiderLing | LWP\:\:Simple | Jobot | |
webinatorbot | lwp\-request | JoeBot | |
Cliqzbot | lwp\-tivial | The Jubii Indexing Robot | |
Leikibot | lwp\-trivial | JumpStation | |
AboutUsBot | lwp-request | image.kapsi.net | |
TinEye | lycos[_+] | Katipo | |
musobot | mail.ru | KDD-Explorer | |
search.KumKie.com | MarcEdit.5.2.Web.Client | Kilroy | |
Nigma.ru | mediapartners\-google | KO_Yappo_Robot | |
CompSpyBot | Mediapartners-Google | LabelGrabber | |
SeoCheckBot | megite | larbin | |
hawkReader | Microsoft(\s|\+)URL(\s|\+)Control | legs | |
PercolateCrawler | milbot | Link Validator | |
Butterfly | mimas | LinkScan | |
8 | mj12bot | LinkWalker | |
Plukkie | mnogosearch | Lockon | |
WebThumbnail | moget | logo.gif Crawler | |
Falconsbot | mojeekbot | Lycos | |
thumbshots-de-Bot | momspider | Mac WWWWorm | |
SSL-Crawler | motor | Magpie | |
ThumbSniper | msiecrawler | marvin/infoseek | |
Embedly | msnbot | Mattie | |
linguatools | MuscatFerre | MediaFox | |
backlink-check.de | myweb | MerzScope | |
PayPal IPN | NABOT | NEC-MeshExplorer | |
adressendeutschland.de | nagios | MindCrawler | |
XRL | NaverBot | mnoGoSearch search engine software | |
IdeelaborPlagiaat | netcraft | moget | |
SiteCondor | netluchs | MOMspider | |
Web-Monitoring | ng\/2\. | Monster | |
Vedma | Ning | Motor | |
parsijoo | no_user_agent | MSNBot | |
GarlikCrawler | nomad | Muncher | |
Browsershots | nutch | Muninn | |
LoadImpactPageAnalyzer | ocelli | Muscat Ferret | |
FyberSpider | Offline(\s|\+)Navigator | Mwd.Search | |
classbot | onetszukaj | Internet Shinchakubin | |
ZeerchBot | OurBrowser | NDSpider | |
Feedly | parsijoo | Nederland.zoek | |
WebCookies | pear.php.net | NetCarta WebMap Engine | |
LinkedInBot | perman | NetMechanic | |
TomTom places company search | PHP\/ | NetScoop | |
CloudFlare-AlwaysOnline | pioneer | newscan-online | |
Readability | playmusic\.com | NHSE Web Forager | |
suggybot | playstarmusic\.com | Nomad | |
CatchBot | powermarks | The NorthStar Robot | |
Jabse.com Crawler | psbot | nzexplorer | |
woriobot | PycURL | ObjectsSearch | |
ExB Language Crawler | python | Occam | |
kulturarw | qihoobot | HKU WWW Octopus | |
BrainbruBot | rambler | OntoSpider | |
KomodiaBot | Readpaper | Openfind data gatherer | |
Qualidator.com Bot | redalert|robozilla | Orb Search | |
IXEbot | RePEc.link.checker | Pack Rat | |
CMS Crawler | robot | PageBoy | |
immediatenet thumbnails | robots | ParaSite | |
Shareaholicbot | RPT\-HTTPClient\/0.3-3E | Patric | |
YioopBot | rss | pegasus | |
Qualidator.com SiteAnalyzer 1.0 | scan4mail | The Peregrinator | |
Qirina Hurdler | scientificcommons | PerlCrawler 1.0 | |
BegunAdvertising | scirus | Phantom | |
LuminateBot | scooter | PhpDig | |
linkdex.com | seekbot | PiltdownMan | |
Curious George | seznambot | Pimptrain.com’s robot | |
Fetch-Guess | shoutcast | Pioneer | |
SBSearch | slurp | html_analyzer | |
alexa site audit | sogou | Portal Juice Spider | |
AraBot | speedy | PGP Key Agent | |
AMZNKAssocBot | spider | PlumtreeWebAccessor | |
Speedy | spiderman | Poppi | |
HostTracker | spiderview | PortalB Spider | |
CliqzBot | Strider | psbot | |
findlinks | sunrise | GetterroboPlus Puu | |
CCResearchBot | superbot | The Python Robot | |
Semantifire | surveybot | Raven Search | |
LinkAider | T\-H\-U\-N\-D\-E\-R\-S\-T\-O\-N\-E | RBSE Spider | |
Zookabot | tailrank | Resume Robot | |
ScreenerBot Crawler | technoratibot | RoadHouse Crawling System | |
webmastercoffee | Teleport(\s|\+)Pro | RixBot | |
PaperLiBot | Teoma | Road Runner: The ImageScape Robot | |
QuerySeekerSpider | titan | Robbie the Robot | |
Crowsnest | turnitinbot | ComputingSite Robi/1.0 | |
UnwindFetchor | twiceler | RoboCrawl Spider | |
MetaURI API | ucsd | RoboFox | |
MiaDev | ultraseek | Robozilla | |
AcoonBot | URL2File | Roverbot | |
Gigabot | urlaliasbuilder | RuLeS | |
firmilybot | urllib | SafetyNet Robot | |
Sosospider | validator | Scooter | |
OpenindexSpider | virus[_+]detector | Sleek | |
MetaHeadersBot | voila | Search.Aus-AU.COM | |
Strokebot | w3c\-checklink | SearchProcess | |
GeliyooBot | Wanadoo | Senrigan | |
bot-pge.chlooe.com | Web(\s|\+)Downloader | SG-Scout | |
ownCloud Server Crawler | WebCloner | ShagSeeker | |
CirrusExplorer | webcollage | Shai’Hulud | |
ProCogSEOBot | WebCopier | Sift | |
Dlvr.it/1.0 | Webinator | Simmany Robot Ver1.0 | |
Open Web Analytics Bot | weblayers | Site Valet | |
RyzeCrawler | Webmetrics | Open Text Index Robot | |
discoverybot | webmirror | SiteTech-Rover | |
crawler for netopian | webreaper | Skymob.com | |
ADmantX Platform Semantic Analyzer | WebStripper | SLCrawler | |
R6 bot | WebZIP | Inktomi Slurp | |
bl.uk_lddc_bot | Wget | Smart Spider | |
Linguee Bot | wordpress | Snooper | |
SolomonoBot | worm | Solbot | |
Grahambot | www.gnip.com | Spanner | |
Automattic Analytics Crawler | WWW\-Mechanize | Speedy Spider | |
YoudaoBot | xenu | spider_monkey | |
PiplBot | Xenu(\s|\+)Link(\s|\+)Sleuth | SpiderBot | |
FlightDeckReportsBot | y!j | Spiderline Crawler | |
fastbot crawler | yacy | SpiderMan | |
4seohuntBot | yahoo | SpiderView(tm) | |
Updownerbot | yandex | Spry Wizard Robot | |
JikeSpider | yodaobot | Site Searcher | |
NLNZ_IAHarvester2013 | zealbot | Suke | |
wsAnalyzer | zeus | suntek search engine | |
YodaoBot | zyborg | Sven | |
Esribot | | Sygol | |
Thumbshots.ru | | TACH Black Widow | |
BlogPulse | | Tarantula | |
bot.wsowner.com | | tarspider | |
wscheck.com | | Tcl W3 Robot | |
Qseero | | TechBOT | |
drupact | | Templeton | |
HuaweiSymantecSpider | | TeomaTechnologies | |
PagePeeker | | TITAN | |
HomeTags | | TitIn | |
facebookplatform | | The TkWWW Robot | |
Pixray-Seeker | | TLSpider | |
BDFetch | | UCSD Crawl | |
MeMoNewsBot | | UdmSearch | |
ProCogBot | | UptimeBot | |
WillyBot | | URL Check | |
peerindex | | URL Spider Pro | |
Job Roboter Spider | | Valkyrie | |
MLBot | | Verticrawl | |
WebNL | | Victoria | |
Peepowbot | | vision-search | |
Semager | | void-bot | |
MIA Bot | | Voyager | |
heritrix | | VWbot | |
Eurobot | | The NWI Robot | |
DripfeedBot | | W3M2 | |
Whoismindbot | | WallPaper (alias crawlpaper) | |
Bad-Neighborhood | | the World Wide Web Wanderer | |
Hailoobot | | w@pSpider by wap4.com | |
akula | | WebBandit Web Spider | |
MetamojiCrawler | | WebCatcher | |
Page2RSS | | WebCopy | |
EasyBib AutoCite | | webfetcher | |
NerdByNature.Bot | | The Webfoot Robot | |
EventGuruBot | | Webinator | |
quickobot | | weblayers | |
gonzo | | WebLinker | |
bnf.fr_bot | | WebMirror | |
UptimeRobot | | The Web Moose | |
Influencebot | | WebQuest | |
MSRBOT | | Digimarc MarcSpider | |
KeywordDensityRobot | | WebReaper | |
Ronzoobot | | webs | |
ScoutJet | | Websnarf | |
Twikle | | WebSpider | |
SWEBot | | WebVac | |
RADaR-Bot | | webwalk | |
DCPbot | | WebWalker | |
Castabot | | WebWatch | |
imbot | | Wget | |
EdisterBot | | whatUseek Winona | |
WASALive-Bot | | WhoWhere Robot | |
Accelobot | | Wired Digital | |
PostPost | | Weblog Monitor | |
factbot | | w3mir | |
Setoozbot | | WebStolperer | |
biwec | | The Web Wombat | |
Search17Bot | | The World Wide Web Worm | |
Lijit | | WWWC Ver 0.2.5 | |
JUST-CRAWLER | | WebZinger | |
Apercite | | XGET | |
pmoz.info ODP link checker | |||
LemurWebCrawler | |||
Covario-IDS | |||
Holmes | |||
RankurBot | |||
AdsBot-Google | |||
envolk | |||
Ask Jeeves/Teoma | |||
LexxeBot | |||
StackRambler | |||
Abrave Spider | |||
EvriNid | |||
arachnode.net | |||
CamontSpider | |||
wikiwix-bot | |||
Nymesis | |||
trendictionbot | |||
Sitedomain-Bot | |||
SEODat | |||
SygolBot | |||
Snapbot | |||
OpenCalaisSemanticProxy | |||
ZookaBot | |||
CligooRobot | |||
cityreview | |||
nworm | |||
SBIder | |||
TwengaBot | |||
Dot TK – spider | |||
EuripBot | |||
ParchBot | |||
Peew | |||
YRSpider | |||
Urlfilebot (Urlbot) | |||
Gaisbot | |||
WatchMouse | |||
Tagoobot | |||
WebWatch/Robot_txtChecker | |||
urlfan-bot | |||
StatoolsBot | |||
page_verifier | |||
SSLBot | |||
SAI Crawler | |||
DomainDB | |||
LinkWalker | |||
WMCAI_robot | |||
voyager | |||
copyright sheriff | |||
Ocelli | |||
Twiceler | |||
amibot | |||
abby | |||
NetResearchServer | |||
VideoSurf_bot | |||
XML Sitemaps Generator | |||
BlinkaCrawler | |||
nodestackbot | |||
Pompos | |||
taptubot | |||
BabalooSpider | |||
Yaanb | |||
Girafabot | |||
livedoor ScreenShot | |||
eCairn-Grabber | |||
FauBot | |||
Toread-Crawler | |||
Setoozbot | |||
MetaURI | |||
L.webis | |||
Web-sniffer | |||
FairShare | |||
Ruky-Roboter | |||
ThumbShots-Bot | |||
BotOnParade | |||
Amagit.COM | |||
HatenaScreenshot | |||
HolmesBot | |||
dotSemantic | |||
Karneval-Bot | |||
HostTracker.com | |||
AportWorm | |||
XmarksFetch | |||
FeedFinder/bloggz.se | |||
CorpusCrawler | |||
Willow Internet Crawler | |||
OrgbyBot | |||
GingerCrawler | |||
pingdom.com_bot | |||
Nutch | |||
baypup | |||
Mp3Bot | |||
192.comAgent | |||
Surphace Scout | |||
WikioFeedBot | |||
Szukacz | |||
DBLBot | |||
Thumbnail.CZ robot | |||
LinguaBot | |||
GurujiBot | |||
Charlotte | |||
50.nu | |||
SanszBot | |||
moba-crawler | |||
HeartRails_Capture | |||
SurveyBot | |||
MnoGoSearch | |||
smart.apnoti.com Robot | |||
Topicbot | |||
JadynAveBot | |||
OsObot | |||
WebImages | |||
WinWebBot | |||
Scooter | |||
Scarlett | |||
GOFORITBOT | |||
DKIMRepBot | |||
Yanga | |||
DNS-Digger-Explorer | |||
Robozilla | |||
YowedoBot | |||
botmobi | |||
Fooooo_Web_Video_Crawl | |||
UptimeDog | |||
^Nail | |||
Metaspinner/0.01 | |||
Touche | |||
RSSMicro.com RSS/Atom Feed Robot | |||
SniffRSS | |||
Kalooga | |||
FeedCatBot | |||
WebRankSpider | |||
Flatland Industries Web Spider | |||
DealGates Bot | |||
Link Valet Online | |||
Shelob | |||
Technoratibot | |||
Flocke bot | |||
FollowSite Bot | |||
Visbot |
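Note that many entries in the second column of Table 2 look like regular expressions rather than plain names (e.g. `Teleport(\s|\+)Pro` or `^ruby$`), so a filter that only does plain text matching will not handle them as intended. Here is a short Python sketch of how such patterns could be applied, assuming they are meant as case-insensitive regular expressions; `compile_patterns` and `matches_any` are hypothetical helper names.

```python
import re

def compile_patterns(patterns):
    """Compile each pattern as a case-insensitive regular expression."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]

def matches_any(user_agent, compiled):
    """True if any compiled pattern is found anywhere in the user agent."""
    return any(rx.search(user_agent) for rx in compiled)

# A few patterns copied from the second column of Table 2.
patterns = [r"Teleport(\s|\+)Pro", r"^ruby$", r"libwww\-perl"]
compiled = compile_patterns(patterns)

print(matches_any("Teleport Pro/1.29", compiled))   # True
print(matches_any("Teleport+Pro/1.29", compiled))   # True
print(matches_any("Ruby", compiled))                # True
print(matches_any("Ruby on Rails", compiled))       # False
```

Very generic patterns in that column such as `bot`, `java` or `python` will match a lot, which may or may not be what you want in a traffic counter.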
Table 3 of 3: The perhaps best known bots, spiders and crawlers which appear at least twice in the tables above
name of bot (140 in total) |
8 |
200PleaseBot |
360Spider |
4seohuntBot |
ABCdatos BotLink |
Aboundex |
Aboundexbot |
AboutUsBot |
AdsBot-Google |
Ahoy! The Homepage Finder |
AhrefsBot |
aiHitBot |
AMZNKAssocBot |
Arachnophilia |
ArchitextSpider |
archive.org_bot |
arks |
asterias |
BaiduSpider |
BDFetch |
Big Brother |
Bingbot |
BingPreview |
bjaaland |
BlackWidow |
BLEXBot |
BlogPulse |
bspider |
CatchBot |
ccbot |
ChangeDetection |
Cliqzbot |
Custo |
Daumoa |
discoverybot |
DotBot |
dumbot |
EmailSiphon |
EmailWolf |
envolk |
ExaBot |
Ezooms |
facebookexternalhit |
findlinks |
Gaisbot |
GetRight |
geturl |
Gigabot |
Girafabot |
Googlebot |
grapeshot |
grub |
Harvest |
heritrix |
Holmes |
httpget\?5\.2\.2 |
HTTrack |
Huaweisymantecspider |
ia_archiver |
ichiro |
Informant |
jeeves |
larbin |
Linguee Bot |
linkcheck |
LinkedInBot |
linkscan |
LinkWalker |
magpie-crawler |
Mediapartners-Google |
MJ12bot |
Mnogosearch |
moget |
MojeekBot |
momspider |
motor |
MSIECrawler |
msnbot |
MSRBot |
NaverBot |
NetMechanic |
nomad |
Nutch |
oBot |
Ocelli |
Offline Explorer |
Openfind data gatherer |
PagePeeker |
parsijoo |
PerMan |
pioneer |
psbot |
Robozilla |
Roverbot |
Scooter |
ScoutJet |
SemrushBot |
SeznamBot |
SiteSnagger |
SolomonoBot |
Sosospider |
spanner |
Speedy |
Speedy Spider |
spiderman |
SuperBot |
SurveyBot |
Technoratibot |
TeleportPro |
Titan |
TurnitinBot |
Twiceler |
UnisterBot |
UnwindFetchor/1.0 |
VoilaBot |
voyager |
WBSearchBot |
WebCopier |
Webinator |
weblayers |
WebmasterWorld Extractor |
WebmasterWorldForumBot |
webmirror |
WebReaper |
WebStripper |
WebVac |
webwalk |
WebWalker |
WebWatch |
WebZIP |
Wget |
WhoWhere Robot |
Xaldon_WebSpider |
Yandex |
YandexBot |
YandexDirect |
Yodaobot |
Zeus |
Zookabot |
ZyBorg |
keywords: best known names of bots, list of names, common bots, spiders, crawlers, WordPress, visitor stats, count per day, internet traffic, traffic filter