Internet Security by novis itineribus - Cottage 'zantbaŋk

🕷 Nice and Devil - the Spider Trap

This is a trap for unwanted crawlers and spiders.

NICE.php & EVIL.php

Bots that visit our site can be found under EVIL.php

Amazonbot

2024-10-15 19:00:43+02:00

xxx.xxx.xxx.xxx.crawl.amazonbot.amazon

Mozilla/5.0 (Macintosh Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1 +https://developer.amazon.com/support/amazonbot)

Amazonbot is Amazon's web crawler, used to improve services such as enabling Alexa to answer even more questions for customers. Amazonbot respects standard robots.txt rules.

AhrefsBot

2024-05-11 03:49:18+02:00

164.90.225.193

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

AppleBot

17-241-xxx-xxx.applebot.apple.com

Search Engine: desired, everything okay

Akkoma

2024-03-30 12:17:55

loquat.unboiled.info // vps-aefd39af.vps.ovh.net

Pages shared on social media must be scanned by the service.

AwarioBot

2024-05-05 22:53:17+02:00

pot34.webmeup.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

amazonaws

Amazon Web Services (AWS) is used by many bots. See: AWS IP address ranges

BaiduSpider

2024-04-17 01:41:40+02:00

xxx.xxx.xxx.xxx

Crawler for the Chinese search engine Baidu: desired, everything okay

bingBot

msnbot-xx-xxx-xxx-xxx.search.msn.com

Search Engine: desired, everything okay

BacklinksExtendedBot

2024-07-24 13:56:41+02:00

185.170.167.18

UNKNOWN: There is no information about the crawler. (MNT By: Semrush_Net)

Barkrowler [babbar.tech]

2024-03-27 23:21:41

154.54.249.162

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

Bytespider

2024-03-24 21:29:01

ec2-xx-xxx-xx-xxx.ap-southeast-1.compute.amazonaws.com

ByteDance, products and services such as TikTok, CapCut, TikTok Shop, Lark, Pico. Could not verify why data is being collected.

BLEXBot

2024-04-01 06:42:42+02:00

ninja-crawler96.webmeup.com

Free backlink checker / crawler, by SEO SpyGlass, Data for Free

CensysInspect

2024-11-15 06:22:35

162.142.125.198

Mozilla/5.0 (compatible CensysInspect/1.1 +https://about.censys.io/)

Collects security-relevant data!

companyspotter

2024-10-02 12:48:14+02:00

83.149.81.165, itbe.nl

companyspotter/2.0.0.0 (robot@companyspotter.com)

Website analysis: Find out how websites are built and what software is used. Lead generation: Find your ideal business target group through various intelligent and creative search methods.

contaboserver

2024-09-11 10:51:59+02:00

173.212.226.212, 84.247.151.19, vmi….contaboserver.net

Mozilla/5.0 (Windows NT 10.0 Win64 x64 rv:88.0) Gecko/20100101 Firefox/88.0

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Net Name: CONTABO

curl/8.3.0, curl/8.6.0

2024-09-24 05:50:05+02:00

internettl.org, ec2-x-x-x-x.us-east-2.compute.amazonaws.com, pool-x-x-x-x.nycmny.fios.verizon.net

Could not verify why data is being collected. Scans only index.php. IP-Lookup 13.95.133.245: Handle: MSFT - Name: Microsoft Corporation

Clickagy Intelligence Bot

2024-07-12 21:35:22+02:00

ec2-44-215-105-52.compute-1.amazonaws.com

Advertising: Tracking websites and customers. „Data intelligence company that works with leading marketers to anonymously identify and segment audiences based on their digital behavior in real time.“

CCBot

2024-04-03 03:42:33+02:00

ec2-….compute-1.amazonaws.com

CCBot/2.0 (https://commoncrawl.org/faq/)

Common Crawl is a 501(c)(3) nonprofit organization whose mission is to provide Internet researchers, companies, and individuals with a copy of the Internet, free of charge, for research and analysis purposes. Does not follow its own robots.txt rules

CookieBot

2024-03-21 16:56:14

x.x.x.x.bc.googleusercontent.com

Self-started / queried, everything okay

ClaudeBot

2024-11-24 00:11:56+01:00

ec2-x-x-x-x.us-east-2.compute.amazonaws.com

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible ClaudeBot/1.0 +claudebot@anthropic.com)

AI agent from Anthropic; could not verify why data is being collected. Does not follow its own robots.txt rules

ChatGPT-User

….….….…

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

ChatGPT users may also interact with external applications via GPT Actions. ChatGPT-User governs which sites these user requests can be made to. It is not used for crawling the web in any automatic fashion, nor to crawl content for generative AI training. Desired; for me everything is okay.

comscore.com/Web-Crawler

2024-04-23 13:22:49+02:00

ec2-52-90-218-67.compute-1.amazonaws.com

Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“

dataproviderBot

2024-03-11 23:11:04

crawl-149-56-160-221.dataproviderbot.com

Paid search engine. Speaks against a free internet.

DomainStatsBot

2024-06-24 16:17:29+02:00

bot.domainstats.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

DataForSeoBot

2024-04-16 19:19:52+02:00

crawling-gateway-136-243-228-179.dataforseo.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

DENIC-Crawler

2024-05-02 10:01:36+02:00

81.91.173.172

IP address of DENIC eG, Netherlands. Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.

DmBot

2024-03-07 03:15:59

UNKNOWN: There is no information about the crawler.

Download Demon/3.5.0.11

2024-08-23 09:28:59+02:00

unn-185-24-11-164.datapacket.com

UNKNOWN: There is no information about the crawler.

dynamic-*-*-*-*.*.*.pool.telefonica.de

2024-02-22 16:48:03

Test Bot / Crawler from me

🦆 DuckDuckGo Favicons-Bot

2024-03-22 05:38:31

20.191.45.212 | 40.88.21.235

Search Engine: desired, everything okay

exit-01.tor.r0cket.net

2024-10-11 16:57:22+02:00

45.84.107.198

Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML like Gecko) Chrome/48.0.2564.109 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Does not follow its own robots.txt rules

Embarcadero

2024-10-04 19:28:10+02:00

165.154.201.75

Embarcadero URI Client/1.0

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Is this a client or a bot server?

IP-Lookup 165.154.201.75 Name: Scloud Pte Ltd. Country: SG-Singapore

Exabot

2024-09-28 10:56:27+02:00

185.65.135.162

Mozilla/5.0 (compatible Exabot/3.0 http://www.exabot.com/go/robot)

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name exabot.com is for sale.

everyfeed-spider

2024-04-04 17:27:09+02:00

193.222.96.142

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name EVERYFEED.COM is for sale.

ev-crawler

2024-04-25 11:16:15+02:00

holmavik.core.headline.com

No information, could not verify why data is being collected.

fastwebserver

2024-09-07 04:51:06+02:00

vps2406078.fastwebserver.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 62.141.44.236 Name: Internet No DNS data found. Country: ./.

FiliBot

filiBot is the generic name for Fili's SEO crawler - Desired, everything okay. Own website SEO.

Friendica

2024-06-11 11:42:17+02:00

anonsys.net

Pages shared on social media must be scanned by the service.

Flipboard

2024-03-30 12:17:55

ec2.….compute-1.amazonaws.com

Is a content discovery app that indexes news web content.

Facebook-Crawler

2024-08-18 09:22:51+02:00

facebookexternalhit, facebookcatalog, meta-externalagent

fwdproxy-ncg-….fbsv.net

Pages shared on social media must be scanned by the service.

Friendly_Crawler // FriendlyCrawler

2024-05-10 18:23:37+02:00

Friendly_Crawler/Nutch-1.20-SNAPSHOT // FriendlyCrawler/1.0

...us-west-2.compute.amazonaws.com

No information, could not verify why data is being collected. (Two crawlers: Note underscore)

fedistatsCrawler/1.0

2024-06-24 07:48:38+02:00

164.92.220.26

Search for hashtags of trending topics and articles on Mastodon. It's a shame there's no information on the web crawler.

faviconkit.com

2024-07-28 19:40:49+02:00

ec2-…-…-…-….us-west-1.compute.amazonaws.com

Crawls for favicon to offer it for a fee on your website.

GeedoProductSearch

2024-03-20 13:17:32

product-search-83-99-151-68.geedo.com

Scans online stores to find products.

GoogleBot

crawl-xx-xxx-xx-x.googlebot.com

Search Engine: desired, everything okay

Googlebot-Video/1.0

2024-11-04 22:57:37

crawl-xx-xxx-xx-x.googlebot.com

Search Engine: desired, everything okay

Googlebot-Image

crawl-xx-xxx-xx-x.googlebot.com

Search Engine: desired, everything okay

Google-InspectionTool

2024-03-25 14:32:59

Self-started / queried, everything okay. Own website SEO.

Google-Safety

2024-06-17 14:50:25+02:00

rate-limited-proxy-66-249-92-23.google.com

Self-started / queried, everything okay. Link Check for YouTube.

Google-Structured-Data-Testing-Tool

2024-04-18 14:31:37+02:00

google-proxy-66-249-83-118.google.com

Self-started / queried, everything okay. Own website SEO.

GrapeshotCrawler (Oracle) [UK]

2024-02-21 08:38:25

152.67.137.35

Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites

GPTBot

2024-05-11 03:31:49+02:00

20.171.206.41

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot

GPTBot is OpenAI's web crawler. Desired; for me everything is okay.

Go-http-client

2024-09-07 04:51:06+02:00

36.182.49.83

Go-http-client/1.1

UNKNOWN: Causes a lot of 404 Not Found errors. There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 36.182.49.83 Net Name: CMNET. Country: CN

IANA-Crawler

2024-11-04 22:39:19

191.107.250.11

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36

IP address of IANA.org, EU. The IANA (Internet Assigned Numbers Authority) functions coordinate the Internet's globally unique identifiers, and are provided by Public Technical Identifiers, an affiliate of ICANN. It's a shame there's no information on the web crawler. Scans only index.php. Does not follow its own robots.txt rules

InternetMeasurement

2024-08-26 10:34:02+02:00

congratulated.monitoring.internet-measurement.com, lionhearted.monitoring.internet-measurement.com

Security services, simple data free

IonCrawl

2024-02-19 11:54:02

Collects data only for its own company.

internet-census

2024-07-15 21:17:53+02:00

zl-ams-nl-gp1-wk117b.internet-census.org

No imprint, no address. Collects security-relevant data!

LinkedInBot

2024-04-08 16:13:52+02:00

108-174-5-113.fwd.linkedin.com

Pages shared on social media must be scanned by the service.

LivelapBot

2024-03-30 12:17:55

Is a content discovery app that indexes news web content.

linkfluence

2024-06-11 12:29:07+02:00

54.39.177.173

Tracking news, social media, SEO projects

MetaJobBot

2024-10-15 02:17:32+02:00

crc.metajobs.de

Mozilla/5.0 (compatible MetaJobBot https://www.metajob.de/crawler)

Search Engine for Jobs: desired, everything okay

Mediumbot-MetaTagFetcher

2024-06-17 14:53:06+02:00

ec2-…-…-…-….eu-west-1.compute.amazonaws.com

Pages shared on social media must be scanned by the service.

MJ12bot

2024-04-08 12:40:47+02:00

ns3088854.ip-217-182-175.eu

Blocked due to too frequent visits. SEO Company that only collects data for its own customers. You can only use your own data for a fee. According to its own statement, it does not store any web content or personal data. Only link relationships between websites are shown.

🐘 Mastodon

2024-03-30 12:17:54

static.254.9.130.94.clients.your-server.de // ip250.ip-51-68-203.eu // mail.mls20.de // mx.zvcdn.de // neuland.social // gamma.ohai.is // ns31628207.ip-57-128-95.eu // vps-a39c1e80.vps.ovh.net

Pages shared on social media must be scanned by the service.

msnBot

msnbot-xx-xxx-xxx-xxx.search.msn.com

Search Engine: desired, everything okay

Nextdoorbot

2024-06-17 15:02:41+02:00

ec2-…-…-…-….compute-1.amazonaws.com

Pages shared on social media must be scanned by the service.

netEstate NE Crawler

2024-06-08 00:06:04+02:00

81.209.177…

Website directory. Does not follow rules!

NAVER

2024-08-18 09:22:08+02:00

crawl.xxx-xxx-xx-xxx.web.naver.com

Yeti/1.1 +https://naver.me/spd

Search Engine: desired, everything okay

OAI-SearchBot

2024-09-17 05:29:13+02:00

20.42.10.179

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot

OAI-SearchBot is used to link to and surface websites in search results in the SearchGPT prototype and in OpenAI search features. Desired; for me everything is okay.

Pleroma

2024-03-30 12:17:55

static.100.2.21.65.clients.your-server.de

Pages shared on social media must be scanned by the service.

paloaltonetworks

2024-07-20 11:24:04+02:00

198.235.24.122

Expanse a Palo Alto Networks company searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans please send IP addresses/domains to: scaninfo@paloaltonetworks.com

pbiaas.com

2024-02-15 17:04:12

ip85.215.186.x.pbiaas.com

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

No website, could not verify why data is being collected.

peer39_crawler [US]

2024-02-21 08:38:25

204.15.208.26

Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites

Pinterest

2024-03-22 10:08:40

crawl-54-236-1-13.pinterest.com

Pages shared on social media must be scanned by the service.

proximic

2024-04-23 13:22:49+02:00

ec2-52-90-218-67.compute-1.amazonaws.com

Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“

SiteCheckerBotCrawler

2024-10-17 12:23:39+02:00

static.88-99-92-200.clients.your-server.de

SiteCheckerBotCrawler/1.0 (+http://sitechecker.pro)

Self-started / queried, everything okay. Own website SEO.

Scrapy

2024-09-26 16:54:38+02:00

ec2-18-225-11-90.us-east-2.compute.amazonaws.com

Scrapy/2.5.1 (+https://scrapy.org)

Could not verify why data is being collected. Does not follow robots.txt rules. An open source and collaborative framework for extracting the data you need from websites.

Screaming Frog SEO Spider

2024-06-17 09:01:45+02:00

unn-185-156-…-….datapacket.com

SEO Company that only collects data for its own customers. There is free access for hobby users and beginners. This way you can check your own data.

🔍 SeznamBot

fulltextrobot-77-75-78-164.seznam.cz // 74.114.154.xxx

Search Engine: desired, everything okay

Schema-Markup-Validator

2024-04-18 14:31:33+02:00

google-proxy-66-249-83-116.google.com

Self-started / queried, everything okay. Own website SEO.

serpstatbot

2024-04-24 04:03:05+02:00

static.25.67.76.144.clients.your-server.de

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

SemrushBot

2024-02-16 16:48:35

xxx.bl.bot.semrush.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

Seobility

SEO Company: Three queries are permitted free of charge each day. This way you can check your own data.

SerendeputyBot

2024-06-11 11:53:32+02:00

23-239-8-56.ip.linodeusercontent.com

Pages shared on the news portal must be scanned by the service.

Spider_Bot

2024-05-18 07:19:41+02:00

li965-236.members.linode.com

UNKNOWN: There is no information about the crawler.

TrendictionBot

2024-03-30 12:18:20

p15n14

Tracking news, social media, SEO projects

t3versions

2024-03-25 02:56:52

The t3versions bot will save the domain name of a website to a database, if the website has been identified to use TYPO3 CMS. So you don't need to scan our website.

Twitter

2024-03-24 23:14:22

r-xxx-xx-xxx-xxx.twttr.com

Pages shared on social media must be scanned by the service.

Twingly Recon-Klondike

2024-03-30 12:17:56

ec2-34-241-108-107.eu-west-1.compute.amazonaws.com

Pages shared on social media must be scanned by the service.

URLSuMa

2024-02-15 17:04:12

ip85.215.186.233.pbiaas.com

No SSL website, no idea what data is collected and for what.

versanet

2024-02-15 17:04:12

i5387BC22.versanet.de

Mozilla/5.0 (Macintosh Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0

There is no information about the crawler. Could not verify why data is being collected.

WebSuseBot

2024-11-14 05:01:48

85.215.99.19

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko compatible WebSuseBot/1.0 +https://www.WebSuse.de/bot.html) Chrome/70.0.3538.77 Safari/537.36

UNKNOWN: There is no information about the crawler. 1&1 IONOS Strato! As long as only the index.php is crawled, this is ok.

WebwikiBot

2024-06-17 22:38:36+02:00

de-bot.webwiki.com

Desired, everything okay

Private project: Information can be found here.

wpbot

2024-04-09 21:43:41+02:00

ec2-xx-xx-xx-xxx.us-west-2.compute.amazonaws.com

Mozilla/5.0 (compatible wpbot/1.0 +https://forms.gle/ajBaxygz9jSR8p8G9)

No website, no idea what data is collected and for what.

Xing Bot

2024-04-08 17:18:36+02:00

185.169.xxx.225

Pages shared on social media must be scanned by the service.

📂 XML Sitemaps Generator

2024-04-30 12:54:23+02:00

new1.xml-sitemaps.com // pro….pro-sitemaps.com

Self-started / queried, everything okay. Creation of sitemap.xml / sitemap.html / sitemap.xml.gz.

🔍 YaCy - Bot

Search Engine: desired, everything okay

Search Engine Software: Information can be found here.

🔍 Yandex

x-xxx-xxx-xxx.spider.yandex.com

Search Engine: desired, everything okay

IP-Addresses

81.91.173.172

2024-05-02 10:01:36+02:00

81.91.173.172

IP address of DENIC eG, Netherlands (DENIC-Crawler). Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.

45.129.35.105

2024-07-14 11:44:23+02:00

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the Google user agent. See Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 45.129.35.105 Name: Packethub S.A.; Country: PANAMA

102.129.145.83

2024-09-06 22:34:25+02:00

Mozilla/5.0 (X11 Linux x86_64) AppleWebKit/537.36 (KHTML like Gecko) HeadlessChrome/122.0.6261.94 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 102.129.145.83 Name: Internet Utilities Africa (PTY) LTD Country: ZA

108.165.237.77

2024-09-06 09:43:20+02:00

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 108.165.237.77 Name: IPXO Limited Country: US

119.29.53.223

2024-09-08 01:38:34+02:00

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0.0 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 119.29.53.223 Name: TencentCloud Country: HK

151.248.3.19

2024-10-04 07:18:38+02:00

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the Google user agent. See Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 151.248.3.19 Name: Internetbolaget; Country: SWEDEN - Change Name to internetnord.se

151.248.4.7

2024-07-14 11:44:02+02:00

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the Google user agent. See Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 151.248.4.7 Name: Internetbolaget; Country: SWEDEN

194.110.115.43

2024-07-14 11:44:02+02:00

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the Google user agent. See Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 194.110.115.43 Name: M247 Brussels NOC; Country: BELGIUM
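
The entries above announce themselves as Googlebot but come from unrelated networks. One common way to check such a claim is a reverse DNS lookup followed by a forward lookup. The following is only a minimal sketch under assumptions: the function name `isGenuineGooglebot` and the two replaceable resolver parameters are illustrative and not part of Spider-Trap, and the accepted host suffixes are taken from the host names seen in this list (googlebot.com, google.com).

```php
<?php
/**
 * Forward-confirmed reverse DNS check (sketch, IPv4 only).
 * $reverse and $forward default to PHP's resolver functions and can be
 * replaced with stubs, for example in tests.
 */
function isGenuineGooglebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse lookup, IP -> host name.
    // gethostbyaddr() returns the unchanged IP when no host name is found.
    $host = $reverse($ip);
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // Step 2: the host name must end in one of the expected domains.
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }

    // Step 3: forward lookup, host name -> IP list, must contain the original IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

Called as `isGenuineGooglebot($_SERVER['REMOTE_ADDR'])`, the function performs two DNS queries per request, so a real page would probably cache the result per IP address.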

spider - crawler - bot

1and1.org

2024-02-19 11:54:02

Collects data only for its own company.

2ip bot/1.1 (+http://2ip.io)

2024-04-07 13:26:38+02:00

vm1564919.stark-industries.solution

Tools for "Speed test", "IP address" and much more. Normally only the home page is requested.


Crawler, Spider, Bot
  • Identifies itself

    Example: +https://example.com/bot.html spider@example.com

  • Only downloads the static, textual content
  • Honors the rules of a robots.txt
  • Doesn't execute JavaScript to generate Ad Impressions or Views
  • Crawls at a slow rate by default

    robots.txt ⇒ Crawl-delay: 60

    Note that this also limits the number of pages a search engine can index or update. A crawl delay of 60 seconds, for example, means that each bot can fetch at most 1,440 pages per day (86,400 seconds / 60 seconds).

  • How it works!?

    A link to evil.php is placed among the legitimate links of the page. The link carries rel="nofollow", and its content carries style="display:none!important;". rel="nofollow" tells search engines not to follow the link. style="display:none!important;" means that the element is not displayed, so human visitors never see it. However, some search engines do not adhere to this! Crawlers that follow the link anyway are recorded by evil.php.

    Source code

    add to robots.txt

    User-agent: *
    Allow: /
    Disallow: /evil.php

    add to footer

    Variant 1: the link is not visible on your page

    <p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/spinne.svg" alt="spinne" style="display:none!important;" loading="lazy" width="48" height="48"></a></p>

    OR variant 2: the link is visible on your page

    <p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/awards/no-bad-bot.svg" alt="spinne" loading="lazy" width="40" height="13"></a></p>

    NEW PAGE: "evil.php"

    <?php
    // Send headers before any output; otherwise PHP reports "headers already sent".
    header('Content-Type: text/html; charset=utf-8');
    ini_set('display_errors', '0'); // do not show warnings to visitors
    error_reporting(E_ALL & ~E_NOTICE);

    $root     = 'https://' . $_SERVER['SERVER_NAME'];
    $filename = $_SERVER['DOCUMENT_ROOT'] . '/data/bot.csv';

    // One record per request: time, IP, host name, agent, query string, referrer, cookie names.
    $ip     = $_SERVER['REMOTE_ADDR'];
    $record = array(
        date('Y-m-d H:i:s'),
        $ip,
        gethostbyaddr($ip) ?: $ip,
        $_SERVER['HTTP_USER_AGENT'] ?? '',
        substr($_SERVER['QUERY_STRING'] ?? '', 0, 20),
        $_SERVER['HTTP_REFERER'] ?? 'false',
        substr(implode(' ', array_keys($_COOKIE)), 0, 30),
    );
    // "~" separates the fields, so remove it (and line breaks) from the values.
    $record = str_replace(array('~', "\r", "\n"), ' ', $record);
    // The original guarded this write with !isset($cookiesSet), which is never true,
    // so nothing was ever logged. Here every visit is logged, one record per line.
    file_put_contents($filename, implode('~', $record) . "\n", FILE_APPEND | LOCK_EX);
    ?>
    <!DOCTYPE HTML>
    <html>
    <head>
    <meta name="robots" content="none">
    </head>
    <body>
    <?php
    echo '<a href="' . $root . '/">HOME</a>';
    if (file_exists($filename)) {
        echo '<table border="0" cellspacing="0" cellpadding="5" width="100%" class="csvTable">';
        echo '<tr style="color:#f7f6f5; background-color:#2196F3;"><td>TIME</td><td>IP REMOTE</td><td>IP HOST</td><td>AGENT</td><td>QUERY STRING</td><td>REFERRER</td><td>COOKIES</td></tr>';
        foreach (file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
            echo '<tr>';
            foreach (explode('~', $line) as $field) {
                // Escape logged values: user agent and referrer are controlled by the visitor.
                echo '<td>' . htmlspecialchars($field) . '</td>';
            }
            echo "</tr>\n";
        }
        echo '</table>';
    } else {
        echo '<p>The spider hasn&apos;t caught anything yet!</p>';
    }
    ?>
    </body>
    </html>
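
The logging step of evil.php can also be written as a small pure function, which makes the record format testable without a web server. This is only a sketch under assumptions: the name `buildTrapRecord` is hypothetical, the function writes one "~"-separated record per line, and it leaves out the host name lookup because that needs DNS.

```php
<?php
/**
 * Build one "~"-separated log line from request data (sketch).
 * $server mirrors $_SERVER, $cookies mirrors $_COOKIE.
 */
function buildTrapRecord(array $server, array $cookies, string $time): string
{
    $fields = [
        $time,
        $server['REMOTE_ADDR'] ?? '',
        $server['HTTP_USER_AGENT'] ?? '',
        substr($server['QUERY_STRING'] ?? '', 0, 20),
        $server['HTTP_REFERER'] ?? 'false',
        substr(implode(' ', array_keys($cookies)), 0, 30),
    ];
    // Remove the separator and line breaks so that a crafted user agent
    // cannot inject extra fields or extra records.
    $fields = str_replace(['~', "\r", "\n"], ' ', $fields);
    return implode('~', $fields) . "\n";
}
```

A page could then append the record with `file_put_contents($filename, buildTrapRecord($_SERVER, $_COOKIE, date('Y-m-d H:i:s')), FILE_APPEND | LOCK_EX);`, where `$filename` is the log file used by evil.php.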

    Ban Code

    PHP

    On all Pages

    <?php
    // This block must come before any output (including <!DOCTYPE HTML>); otherwise the headers cannot be sent.
    if (!empty($_SERVER['HTTP_USER_AGENT']) && preg_match('/Mb2345Browser|peer39_crawler|dataprovider|Dmbot|Grapeshot|IonCrawl|URLSuMaBot|Semrush|LieBaoFast|zh-CN|MicroMessenger|zh_CN|Kinza|MJ12bot|AhrefsBot|Bytespider/i', $_SERVER['HTTP_USER_AGENT'])) {
        header('HTTP/1.0 403 Forbidden');
        header('X-Robots-Tag: none'); // moved before die(); placed after it, the call was never reached
        die('<h1>Error 403 Forbidden</h1><h2><a href="/evil.php">Spider Trap - Spinnenfalle</a></h2><p>[EN] Access not allowed</p><p>[DE] Zugriff nicht erlaubt</p><p>[DA] Adgang ikke tilladt</p>');
    }
    ?>
    <!DOCTYPE HTML>
    <html>
    <head> ... </head>
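
The user agent test in the ban code can be kept in a small function so that the list of banned names lives in an array rather than inside one long regular expression. A minimal sketch, assuming a hypothetical helper name `isBannedAgent`; the tokens in the usage note are a subset of the names used in the ban code.

```php
<?php
/** Return true if the user agent contains one of the banned tokens (case-insensitive). */
function isBannedAgent(?string $userAgent, array $tokens): bool
{
    if ($userAgent === null || $userAgent === '' || $tokens === []) {
        return false;
    }
    // preg_quote() escapes regex metacharacters, so tokens are matched literally.
    $quoted  = array_map(fn(string $t): string => preg_quote($t, '/'), $tokens);
    $pattern = '/' . implode('|', $quoted) . '/i';
    return preg_match($pattern, $userAgent) === 1;
}
```

A page would call it as `isBannedAgent($_SERVER['HTTP_USER_AGENT'] ?? null, ['MJ12bot', 'AhrefsBot', 'Bytespider', 'Semrush'])` and send the 403 response when it returns true.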

    OR (use the PHP code or the .htaccess rules, not both)

    .htaccess

    # ban code [USER AGENT]
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (base64_decode|eval) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler|paloaltonetworks|dataprovider|Dmbot|Grapeshot) [NC]
    RewriteRule .* - [F]
    </IfModule>

    [HTTP_USER_AGENT] which are blocked via .htaccess

    Some [HTTP_USER_AGENT] are automatically blocked by the 8G firewall

    base64_decode || bin/bash || disconnect || eval || unserializ || ahrefs || archiver || curl || libwww-perl || pycurl || scan || wget || acapbot || acoonbot || alexibot || asterias || attackbot || backdorbot || becomebot || binlar || blackwidow || blekkobot || blexbot || blowfish || bullseye || bunnys || butterfly || careerbot || casper || checkpriv || cheesebot || cherrypick || chinaclaw || choppy || clshttp || cmsworld || copernic || copyrightcheck || cosmos || crescent || cy_cho || datacha || demon || diavol || discobot || dittospyder || dotbot || dotnetdotcom || dumbot || econtext || emailcollector || emailsiphon || emailwolf || eolasbot || eventures || extract || eyenetie || feedfinder || flaming || flashget || flicky || foobot || fuck || g00g1e || getright || gigabot || go-ahead-got || gozilla || grabnet || grafula || harvest || heritrix || httracks? || icarus6j || jetbot || jetcar || jikespider || kmccrew || leechftp || libweb || liebaofast || linkscan || linkwalker || loader || lwp-download || majestic || masscan || miner || mechanize || mj12bot || morfeus || moveoverbot || netmechanic || netspider || nicerspro || nikto || ninja || nominet || nutch || octopus || pagegrabber || petalbot || planetwork || postrank || proximic || purebot || queryn || queryseeker || radian6 || radiation || realdownload || remoteview || rogerbot || scan || scooter || seekerspid || semalt || siclab || sindice || sistrix || sitebot || siteexplorer || sitesnagger || skygrid || smartdownload || snoopy || sosospider || spankbot || spbot || sqlmap || stackrambler || stripper || sucker || surftbot || sux0r || suzukacz || suzuran || takeout || teleport || telesoft || true_robots || turingos || turnit || vampire || vikspider || voideye || webleacher || webreaper || webstripper || webvac || webviewer || webwhacker || winhttp || wwwoffle || woxbot || xaldon || xxxyy || yamanalab || yioopbot || youda || zeus || zmeu || zune || zyborg
     

    Images for you

    [EN] Right click ➛ save image as...

    Images: No Bad-Bot | Spider | empty
    Unicode® symbols with HTML code: 🕷 &#128375; | 🕸 &#128376;


    License and copyright notice for SPIDER-TRAP


    🕸 Websites with Spider-Trap

    You have installed Spider-Trap and would like to be listed here? Write your information in our discussion forum.

    Orthopädie Schutechnik Meisterbetrieb Risse e.K. - Inhaber Kuin Hasasov

    SpassAmVerreisen.de :: Hier beginnt der Urlaub schon beim Buchen

    Business View Photo Ag, Ihre Digital Marketing Agentur

    GOOGLE Street View | trusted

    Bernhard Mennemeier - Fahrradgeschäft in Waltrop

    mtandao ist Suaheli und bedeutet Netzwerk






