
Internet Security by novis itineribus - Cottage 'zantbaŋk

🕷 Nice and Devil - the Spider Trap

This is a trap for unwanted crawlers and spiders.

NICE.php & EVIL.php

Bots that visit our site can be found under EVIL.php

Captured bots, crawlers, and spiders caught by the Spider Trap
 
EVIL.php

avast.com

2025-04-29 08:06:03

vpn-gw-prod-007.sin0-gcl.ff.avast.com

92.223.86.0/24 
Start IP: 92.223.86.0
End IP: 92.223.86.255
 

python-httpx/0.28.1

UNKNOWN: There is no information about the crawler. G-Core Labs S.A.. 2 Rue Edmond Reuter, 5326 Contern, Luxembourg

Alibaba NOC

2025-04-23 00:47:08

47.74.0.0/15, 47.76.0.0/14, 47.80.0.0/13 
Start IP: 47.74.0.0
End IP: 47.87.255.255
 

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.43

UNKNOWN: There is no information about the crawler.
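
The "Start IP" and "End IP" values shown under each CIDR list can be recomputed with Python's standard `ipaddress` module. The sketch below is only an illustration using the Alibaba ranges listed above; the helper names `span` and `contains` are hypothetical and are not part of this site. Note that the span is the lowest and highest address across all listed blocks, so it is only gap-free when the blocks are adjacent, as they are here.

```python
import ipaddress


def span(cidrs):
    """Return (first, last) address covered by a comma-separated CIDR list."""
    nets = [ipaddress.ip_network(c.strip()) for c in cidrs.split(",")]
    first = min(n.network_address for n in nets)
    last = max(n.broadcast_address for n in nets)
    return str(first), str(last)


def contains(cidrs, ip):
    """Return True if the address falls inside any of the listed blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c.strip()) for c in cidrs.split(","))


# CIDR list copied from the Alibaba NOC entry above.
ALIBABA = "47.74.0.0/15, 47.76.0.0/14, 47.80.0.0/13"
print(span(ALIBABA))
```

For blocking decisions, `contains` is the safer check, because it tests each block separately instead of assuming the span has no gaps.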

Automattic Analytics Crawler

2025-02-24 03:44:56

192.0.64.0/18

Automattic Analytics Crawler/0.2 http://wordpress.com/crawler/

Only index.php: The crawler loaded the public homepages, and checked which CMS was in use to generate statistics for comparison. The project was retired in 2014 and no longer runs. ❓

alfahosting-server.de

2025-01-02 14:35:52

185.3.235.245

web6.alfahosting-server.de

Website crawler run by the hosting provider to check whether malware has been installed anywhere.

archive.org_bot

2024-12-20 11:37:22

crawl346.us.archive.org

Mozilla/5.0 (compatible archive.org_bot +http://archive.org/details/archive.org_bot) Zeno/76f39f7 warc/v0.8.53

The Internet Archive is a nonprofit digital library that preserves web data and makes it available for research purposes through the Wayback Machine. We began archiving the web in 1996, and currently have preserved over 150 billion web documents.

Amazonbot

2025-04-11 05:50:21

xxx.xxx.xxx.xxx.crawl.amazonbot.amazon

34.192.0.0/10 
Start IP: 34.192.0.0
End IP: 34.255.255.255
 

Mozilla/5.0 (Macintosh Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1 +https://developer.amazon.com/support/amazonbot)
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible Amazonbot/0.1 +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36

Amazonbot is Amazon's web crawler, used to improve services such as enabling Alexa to answer even more questions for customers. Amazonbot respects standard robots.txt rules.

AhrefsBot

2024-05-11 03:49:18

164.90.225.193

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

AppleBot

2025-03-20 03:48:40

17.0.0.0/8 
Start IP: 17.0.0.0
End IP: 17.255.255.255
 

17-241-xxx-xxx.applebot.apple.com

Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1 +http://www.apple.com/go/applebot)

Search Engine: desired, everything okay

Akkoma

2025-02-24 14:00:23

loquat.unboiled.info // vps-aefd39af.vps.ovh.net

Akkoma 3.13.1 https://social.anthropi.st Akkoma 3.13.1 https://social.anthropi.st Bot

Pages shared on social media must be scanned by the service.

AwarioBot

2024-05-05 22:53:17

pot34.webmeup.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

amazonaws

Amazon Web Services (AWS) is used by many operators; see the AWS IP address ranges.

BitSightBot

2025-06-23 10:05:38

185.117.225.0/24 
Start IP: 185.117.225.0
End IP: 185.117.225.255
 

Mozilla/5.0 (compatible BitSightBot/1.0)

BitSightBot is a web crawler from BitSight Technologies, a company specializing in cybersecurity ratings. The bot scans and collects data from websites to assess their security posture. USA

BaiduSpider

2024-04-17 01:41:40

xxx.xxx.xxx.xxx

Crawler for the Chinese search engine Baidu: desired, everything okay

bingBot

2025-03-19 19:23:56

157.54.0.0/15, 157.56.0.0/14, 157.60.0.0/16 
Start IP: 157.54.0.0
End IP: 157.60.255.255
 

msnbot-xx-xxx-xxx-xxx.search.msn.com

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible bingbot/2.0 +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36

Search Engine: desired, everything okay

BacklinksExtendedBot

2024-07-24 13:56:41

185.170.167.18

UNKNOWN: There is no information about the crawler. (MNT By: Semrush_Net)

Barkrowler

2025-09-26 03:36:21

217.113.194.0/24, 154.54.249.0/24 
Start IP: 217.113.194.0
End IP: 217.113.194.255
Start IP: 154.54.249.0
End IP: 154.54.249.255
 

c016.babbar.eu

Mozilla/5.0+(compatible;+Barkrowler/0.9;++https://babbar.tech/crawler)

You can only use your own data for a fee. Is the same as IbouBot? Babbar.tech is operating a crawler service named Barkrowler which fuels and updates our graph representation of the world wide web. This database and all the metrics we compute with it are used to provide a set of online marketing and referencing tools for the SEO community.

Bytespider

2025-02-17 11:19:02

ec2-xx-xxx-xx-xxx.ap-southeast-1.compute.amazonaws.com

Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com) [Bytespider]

Crawler from ByteDance, whose products and services include TikTok, CapCut, TikTok Shop, Lark, and Pico. Could not verify why data is being collected.

BLEXBot

2025-03-20 11:39:08

37.27.0.0/16 
Start IP: 37.27.0.0
End IP: 37.27.255.255
 

ninja-crawler96.webmeup.com
rondo-crawler10.blex.seranking.com
rondo-crawler12.blex.seranking.com

Mozilla/5.0 (compatible BLEXBot/1.0 +http://webmeup-crawler.com/)
Mozilla/5.0 (compatible BLEXBot/1.0 +https://help.seranking.com/en/blex-crawler)

Free backlink checker / crawler, by SEO SpyGlass, Data for Free

CMS-Checker

2025-07-14 11:09:12

79.236.168.34.bc.googleusercontent.com

34.128.0.0/10 
Start IP: 34.128.0.0
End IP: 34.191.255.255
 

Mozilla/5.0 (compatible CMS-Checker/1.0 +https://example.com)

Google LLC (GOOGL-2), 1600 Amphitheatre Parkway, Mountain View, CA, 94043, US: desired, everything okay

CodaBot/1.0

2025-03-12 08:51:45

89.249.193.0/24
104.233.0.0/18
146.100.0.0/14
hyperaesthesic.gsshoppingmall.com

CodaBot/1.0

(a) Does not follow the rules in robots.txt. (b) Absolutely no information about the bot could be found. IP address: probably RIPE, netutils.io. It's a shame there's no information on the web crawler.

CheckMarkNetwork

2025-01-01 15:43:22

ec2-52-59-196-8.eu-central-1.compute.amazonaws.com

CheckMarkNetwork/1.0 (+http://www.checkmarknetwork.com/spider.html)

CheckMark Network – Complete Brand Protection. I don't need it.

CensysInspect

2024-11-15 06:22:35

162.142.125.198

Mozilla/5.0 (compatible CensysInspect/1.1 +https://about.censys.io/)

Collects security-relevant data!

companyspotter

2024-10-02 12:48:14

83.149.81.165, itbe.nl

companyspotter/2.0.0.0 (robot@companyspotter.com)

Website analysis: Find out how websites are built and what software is used. Lead generation: Find your ideal business target group through various intelligent and creative search methods.

contaboserver

2025-04-09 02:58:13

81.0.218.0/23
144.91.64.0/18
147.93.0.0/16, 147.94.0.0/15, 147.96.0.0/16 
Start IP: 147.93.0.0
End IP: 147.96.255.255
 

vmi….contaboserver.net // vmd….contaboserver.net

(A) fasthttp
(B) Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54
(C) python-requests/2.32.3
(D) Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Net Name: RIPE Network Coordination Centre OR Net Name: CONTABO

curl/8.3.0, curl/8.6.0

2024-09-24 05:50:05

internettl.org, ec2-x-x-x-x.us-east-2.compute.amazonaws.com, pool-x-x-x-x.nycmny.fios.verizon.net

Could not verify why data is being collected. Scans only index.php. IP-Lookup 13.95.133.245: Handle: MSFT - Name: Microsoft Corporation

Clickagy Intelligence Bot

2024-07-12 21:35:22

ec2-44-215-105-52.compute-1.amazonaws.com

Advertising: Tracking websites and customers. „Data intelligence company that works with leading marketers to anonymously identify and segment audiences based on their digital behavior in real time.“

CCBot

2024-04-03 03:42:33

ec2-….compute-1.amazonaws.com
18-97-14-….crawl.commoncrawl.org

CCBot/2.0 (https://commoncrawl.org/faq/)

Common Crawl is a 501(c)(3) nonprofit organization whose mission is to provide Internet researchers, companies, and individuals with a copy of the Internet, free of charge, for research and analysis purposes. Does not follow its own robots.txt rules

CookieBot

2024-03-21 16:56:14

x.x.x.x.bc.googleusercontent.com

Self-started / queried, everything okay

ClaudeBot

2024-11-24 00:11:56

ec2-x-x-x-x.us-east-2.compute.amazonaws.com

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible ClaudeBot/1.0 +claudebot@anthropic.com)

AI (artificial intelligence) agent from Anthropic. Could not verify why data is being collected. Does not follow its own robots.txt rules

ChatGPT-User

2025-03-19 15:58:59

20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.40.0.0/13, 20.48.0.0/12, 20.64.0.0/10, 20.128.0.0/16 
Start IP: 20.33.0.0
End IP: 20.128.255.255
 
23.96.0.0/13, 52.224.0.0/11
Start IP: 52.224.0.0
End IP: 52.255.255.255
 
57.150.0.0/15, 57.152.0.0/13, 57.160.0.0/12, 135.232.0.0/14, 135.236.0.0/15
Start IP: 135.232.0.0
End IP: 135.237.255.255
 
172.200.0.0/13, 172.208.0.0/13 
Start IP: 172.200.0.0
End IP: 172.215.255.255
 

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

ChatGPT users may also interact with external applications via GPT Actions. ChatGPT-User governs which sites these user requests can be made to. It is not used for crawling the web in any automatic fashion, nor to crawl content for generative AI training. Desired; for me everything is okay. See also GPTBot, OAI-SearchBot…

comscore.com/Web-Crawler

2024-04-23 13:22:49

ec2-52-90-218-67.compute-1.amazonaws.com

Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“

☕ CoffeeCup Sitemapper

2024-12-07 13:20:39

Hostname Peer

Sitemapper: Self-started / queried, everything okay.

Dazzle BlueSky Bot

2025-05-03 14:05:10

172.64.0.0/13 
Start IP: 172.64.0.0
End IP: 172.71.255.255
 

Dazzle BlueSky Bot/1.0

Pages shared on social media must be scanned by the service. (Cloudflare)

DnBCrawler-Analytics

2025-04-26 02:45:07

35.192.0.0/12 
Start IP: 35.192.0.0
End IP: 35.207.255.255
 

34.64.0.0/10 
Start IP: 34.64.0.0
End IP: 34.127.255.255
 

42.252.195.35.bc.googleusercontent.com

GOOGLE-CLOUD Engine: desired, everything okay

dataproviderBot

2024-03-11 23:11:04

crawl-149-56-160-221.dataproviderbot.com

Paid search engine. Runs counter to a free internet.

DomainStatsBot

2024-06-24 16:17:29

bot.domainstats.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

DataForSeoBot

2024-04-16 19:19:52

crawling-gateway-136-243-228-179.dataforseo.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

DENIC-Crawler

2024-05-02 10:01:36

81.91.173.172

IP address: DENIC eG, Netherlands. Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.

DmBot

2024-03-07 03:15:59

UNKNOWN: There is no information about the crawler.

Download Demon/3.5.0.11

2024-08-23 09:28:59

unn-185-24-11-164.datapacket.com

UNKNOWN: There is no information about the crawler.

dynamic-*-*-*-*.*.*.pool.telefonica.de

2024-02-22 16:48:03

My own test bot / crawler

🦆 DuckDuckGo Favicons-Bot

2024-03-22 05:38:31

20.191.45.212 | 40.88.21.235

Search Engine: desired, everything okay

🦆 DuckDuckGo

2025-09-24 15:40:21

14.160.0.0/11 
Start IP: 14.160.0.0
End IP: 14.191.255.255
 

static.vnpt.vn

Mozilla/5.0 (Macintosh Intel Mac OS X 10_15) AppleWebKit/605.1.15 (KHTML like Gecko) Version/17.0 DuckDuckGo/7 Safari/605.1.15

VIETNAM INTERNET CENTER - VNNIC, Vietnam Posts and Telecommunications Group. Search Engine?: desired, everything okay

Eancenter Telecom LLC

2025-03-25 22:01:40

178.218.128.0/21 
Start IP: 178.218.128.0
End IP: 178.218.135.255
 

python-requests/2.32.3

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Caught in the spider trap; doesn't follow any rules. Name: EANCENTER-TELECOM-LLC. Country: US

Example3

2025-01-07 00:04:15

nl.proxy.tntcode.net

Mozilla/5.0 (compatible Example3/1.0 +https://www.example3.com/domain/novis-itineribus.de)

Web directory without links, only the texts are cloned. Links cost money. What's the point?

exit-01.tor.r0cket.net

2024-10-11 16:57:22

45.84.107.198

Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML like Gecko) Chrome/48.0.2564.109 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Does not follow its own robots.txt rules

Embarcadero

2024-10-04 19:28:10

165.154.201.75

Embarcadero URI Client/1.0

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Is this a client or a bot server?

IP-Lookup 165.154.201.75 Name: Scloud Pte Ltd. Country: SG-Singapore

Exabot

2024-09-28 10:56:27

185.65.135.162

Mozilla/5.0 (compatible Exabot/3.0 http://www.exabot.com/go/robot)

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name exabot.com is for sale.

everyfeed-spider

2024-04-04 17:27:09

193.222.96.142

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name EVERYFEED.COM is for sale.

ev-crawler

2025-04-23 12:05:07

holmavik.core.headline.com

64.71.128.0/18 
Start IP: 64.71.128.0
End IP: 64.71.191.255
 

Mozilla/5.0 (compatible ev-crawler/1.0 +https://headline.com/legal/crawler)

No information, could not verify why data is being collected. „Every so often, we cruise websites for publicly available information like home page content, job postings, team pages, location references. We do this because we're constantly searching the world for interesting companies, and we use technology to discover them.“ As long as only the homepage is scanned, ok

fastwebserver

2024-09-07 04:51:06

vps2406078.fastwebserver.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 62.141.44.236 Name: Internet No DNS data found. Country: ./.

flyriverbot

2025-06-21 03:18:10

82.165.208.0/21 
Start IP: 82.165.208.0
End IP: 82.165.215.255
 

ns2.cacheability.com

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.124 Safari/537.36 Flyriverbot/1.1 (+https://www.flyriver.com/crawler AI Content Source Check)

Web directory with ready-made AI answers? I still don't quite understand.

FiliBot

filiBot is the generic name for Fili's SEO crawler - Desired, everything okay. Own website SEO.

Friendica

2025-02-24 13:59:24

anonsys.net // ip5b42e95a.dynamic.kabel-deutschland.de // nobody.yourvserver.net // static.58.220.201.138.clients.your-server.de // anantes-655-1-3-127.w2-9.abo.wanadoo.fr

Friendica/2024.09-rc DatabaseVersion/1576 Request/SiteInfo/1 +https://base.nospy.net
Friendica/2024.12 DatabaseVersion/1576 Request/SiteInfo/1 +https://anonsys.net
Friendica… +https://friendica.world // +https://theprancingpony.in

A decentralized social network. Pages shared on social media must be scanned by the service. Status changed from okay to under observation. You cannot see any data without registration!

FlipboardProxy

2024-03-30 12:17:55

ec2.….compute-1.amazonaws.com, proxy.flipboard.com

Mozilla/5.0 (Macintosh Intel Mac OS X 10.11 rv:49.0) Gecko/20100101 Firefox/49.0 (FlipboardProxy/1.2 +http://flipboard.com/browserproxy)

Is a content discovery app that indexes news web content. Pages shared on social media must be scanned by the service.

Facebook-Crawler

2025-02-13 13:08:15

57.141.0.0/16, 57.142.0.0/15, 57.144.0.0/14, 57.148.0.0/15, facebookexternalhit, facebookcatalog, meta-externalagent, fwdproxy-ncg-….fbsv.net

meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

Pages shared on social media must be scanned by the service.

Friendly_Crawler // FriendlyCrawler

2024-05-10 18:23:37

Friendly_Crawler/Nutch-1.20-SNAPSHOT // FriendlyCrawler/1.0

…us-west-2.compute.amazonaws.com

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/605.1.15 (KHTML like Gecko compatible FriendlyCrawler/1.0) Chrome/120.0.6099.216 Safari/605.1.15

No information, could not verify why data is being collected. (Two crawlers: Note underscore)

fedistatsCrawler/1.0

2024-12-05 19:19:11

164.92.220.26

fedistatsCrawler/1.0

Search for hashtags of trending topics and articles on Mastodon. It's a shame there's no information on the web crawler.

faviconkit.com

2024-07-28 19:40:49

ec2-…-…-…-….us-west-1.compute.amazonaws.com

Crawls for favicon to offer it for a fee on your website.

GeedoProductSearch

2025-03-17 09:20:20

83.99.151.64/29 

Start IP: 83.99.151.64
End IP: 83.99.151.71
 

product-search-83-99-151-….geedo.com

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible GeedoProductSearch +http://www.geedo.com/product-search.html) Chrome/79.0.3945.88 Safari/537.36

Scans online stores to find products.

GoogleBot

66.249.64.0/19 
Start IP: 66.249.64.0
End IP: 66.249.95.255
 

crawl-xx-xxx-xx-x.googlebot.com

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
Mozilla/5.0 (Linux Android 6.0.1 Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML like Gecko) Chrome/133.0.6943.53 Mobile Safari/537.36 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

Search Engine: desired, everything okay

GoogleBot

2025-02-13 04:40:18

202.62.58.0/24 - headquarter.online.com.kh

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! If a company needs to conceal its identity, we also need to block it.
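
A claimed Googlebot can be checked with a reverse DNS lookup followed by a forward lookup, which is the procedure Google documents for verifying its crawlers: the IP must resolve to a hostname under googlebot.com, google.com, or googleusercontent.com, and that hostname must resolve back to the same IP. The following is a minimal sketch under those assumptions. The function names and the injectable `reverse`/`forward` parameters are hypothetical and exist only so the logic can be tried without network access; it handles IPv4 only.

```python
import socket

# Domain suffixes Google documents for its crawlers and fetchers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def looks_like_google_host(hostname):
    """Check only the hostname suffix; this alone is not proof."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def is_real_googlebot(ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for an IPv4 address.

    `reverse` maps an IP to a hostname and `forward` maps a hostname to a
    list of IPs. Both default to real DNS lookups via the socket module.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        if not looks_like_google_host(host):
            return False
        # The PTR record can be set by whoever controls the IP block,
        # so the hostname must also resolve back to the same address.
        return ip in forward(host)
    except OSError:
        # No PTR record or failed lookup: treat as not verified.
        return False
```

A visitor that sends the Googlebot user agent but resolves to a hostname such as headquarter.online.com.kh fails the suffix check, so no forward lookup is needed.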

GoogleBot

2025-09-24 22:19:18

172.224.0.0/12 
Start IP: 172.224.0.0
End IP: 172.239.255.255
 

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! If a company needs to conceal its identity, we also need to block it. Akamai Technologies, Inc., NOC United States

GoogleBot

2025-03-18 19:46:33

85.215.96.0/19 
Start IP: 85.215.96.0
End IP: 85.215.127.255
 

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! If a company needs to conceal its identity, we also need to block it. IONOS SE/Strato GmbH, Germany

GoogleBot

2025-03-18 19:46:33

197.184.0.0/16 
Start IP: 197.184.0.0
End IP: 197.184.255.255
 

rain-197-184-98-1.rain.network

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! If a company needs to conceal its identity, we also need to block it. RAIN GROUP HOLDINGS (PTY) LTD, AFRINIC

Google-Display-Ads-Bot

2025-01-27 05:14:11

google-proxy-74-125-215-201.google.com

Ads Engine: desired, everything okay

Googlebot-Video/1.0

2024-11-04 22:57:37

crawl-xx-xxx-xx-x.googlebot.com

Search Engine: desired, everything okay

Googlebot-Image

crawl-xx-xxx-xx-x.googlebot.com

Search Engine: desired, everything okay

Google-InspectionTool

2024-03-25 14:32:59

Self-started / queried, everything okay. Own website SEO.

Google-UserContent

2025-03-19 13:08:58

34.64.0.0/10 
Start IP: 34.64.0.0
End IP: 34.127.255.255
 
35.192.0.0/12, 130.211.0.0/16
Start IP: 130.211.0.0
End IP: 130.211.255.255
 

.bc.googleusercontent.com

Mozilla/5.0 (Macintosh Intel Mac OS X 10.11 rv:52.0) Gecko/20100101 Firefox/52.0
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/79.0.3945.130 Safari/537.36

Google Cloud Compliance, webcache…, translate…, adsense.googleusercontent.com: everything okay

Google-Safety

2025-06-17 14:49:27

rate-limited-proxy-66-249-…-….google.com

Mozilla/5.0 (Linux Android 6.0.1 Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible Google-Safety +http://www.google.com/bot.html)

Google Safe Browsing better protects users from dangerous websites or corrupt download files by providing warnings. Testing our websites

Google-Structured-Data-Testing-Tool

2024-04-18 14:31:37

google-proxy-66-249-83-118.google.com

Self-started / queried, everything okay. Own website SEO.

GrapeshotCrawler (Oracle) [UK]

2024-02-21 08:38:25

152.67.137.35

Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites

GPTBot

2024-05-11 03:31:49

20.160.0.0/12

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot

GPTBot is OpenAI's web crawler: desired, for me everything okay. See also ChatGPT-User, OAI-SearchBot…

Go-http-client

2024-09-07 04:51:06

36.182.49.83

Go-http-client/1.1

UNKNOWN: Causes a lot of 404 Not Found errors. There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 36.182.49.83 Net Name: CMNET. Country: CN

headquarter.online.com.kh

2025-01-17 10:03:52

124.248.190.47

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

Everything is obscured here. Bot is not Google.

howBot

2024-12-27 23:59:50

business-90-187-127-97.pool2.vodafone-ip.de

Mozilla/5.0 (compatible howBot/1.0)

There is no information about the crawler. Could not verify why data is being collected.

IbouBot

2025-09-26 03:36:21

c067.ibou.io

217.113.196.0/24 
Start IP: 217.113.196.0
End IP: 217.113.196.255
 

Mozilla/5.0 (compatible IbouBot/1.0 +bot@ibou.io +https://ibou.io/iboubot.html)

Could not verify why data is being collected. Is the same as Barkrowler? Ibou.io operates a crawler service named IbouBot which fuels and updates our graph representation of the World Wide Web. This database and all the metrics are used to provide a search engine. We do not train AI models with the data.

ImagesiftBot

2025-01-26 20:07:36

64.124.8.178.available.above.net

Mozilla/5.0 (compatible ImagesiftBot +imagesift.com)

„ImageSiftBot is a web crawler that scrapes the internet for publicly available images to support our suite of web intelligence products.“ Could not verify why data is being collected. What happens to the pictures? „Our web intelligence products use this index to enable search and retrieval of similar images.“

IANA-Crawler

2024-11-04 22:39:19

191.107.250.11

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36

IP address: IANA.org, EU. The IANA.org (Internet Assigned Numbers Authority) functions coordinate the Internet’s globally unique identifiers, and are provided by Public Technical Identifiers, an affiliate of ICANN. It's a shame there's no information on the web crawler. Scans only index.php. Does not follow its own robots.txt rules

internet-transparency.org

2025-09-16 11:34:46

CIDR: 185.151.154.0/23 
Start IP: 185.151.154.0
End IP: 185.151.155.255
 

scan01-rh.please-see-measurements.internet-transparency.org

Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/133.0.0.0 Safari/537.36

Universitaet Muenster, WWU-MUENSTER-2: Internet Transparency research project, a joint research project of the University of Münster (Westfälische Wilhelms-Universität Münster) and the University of Twente in the Netherlands

InternetMeasurement

2024-08-26 10:34:02

congratulated.monitoring.internet-measurement.com, lionhearted.monitoring.internet-measurement.com

Security services; basic data is free

IonCrawl

2024-02-19 11:54:02

Collects data only for its own company.

internet-census

2024-07-15 21:17:53

zl-ams-nl-gp1-wk117b.internet-census.org

No imprint, no address. Collects security-relevant data!

liquidtelecom.net

2025-05-09 11:17:05

41.72.216.106.liquidtelecom.net

41.72.216.0/24 
Start IP: 41.72.216.0
End IP: 41.72.216.255
 

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! Liquid Telecommunications Operations Limited, KENYA

LinkedInBot

2024-04-08 16:13:52

108-174-5-113.fwd.linkedin.com

LinkedInBot/1.0 (compatible Mozilla/5.0 Apache-HttpClient +http://www.linkedin.com)

Pages shared on social media must be scanned by the service.

LivelapBot

2024-12-05 18:12:48

ns367083.ip-188-165-235.eu

LivelapBot/0.2 (http://site.livelap.com/crawler)

LivelapBot can visit a page if it is shared on social media, and as part of its RSS/page crawling schedule. Livelap indexes web content and makes meta data and a link to your content available in livelap.com and in the Livelap app.

linkfluence

2024-06-11 12:29:07

Mozilla/5.0 (compatible YaK/1.0 http://linkfluence.com/ bot@linkfluence.com)

54.39.177.173

Tracking news, social media, SEO projects

MSIECrawler

2025-06-24 02:34:52

176.107.128.0/19 
Start IP: 176.107.128.0
End IP: 176.107.159.255
 

host38-147-107-176.static.arubacloud.pl

Mozilla/4.0 (compatible MSIE 5.01 Windows NT 5.0 YComp 5.0.2.6 MSIECrawler)

UNKNOWN: There is no information about the crawler. Aruba S.p.A., ITALY

UPDATE meta-externalagent

2025-04-25 10:27:04

57.141.0.0/16, 57.142.0.0/15, 57.144.0.0/14, 57.148.0.0/15 
Start IP: 57.141.0.0
End IP: 57.149.255.255
 

meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)

Search Engine for Jobs: desired, everything okay. Meta Platforms: fb-neteng, facebook-neteng

MojeekBot

2025-02-13 08:47:28

crawl-5-102-173-71.mojeek.com

Mozilla/5.0 (compatible MojeekBot/0.11 +https://www.mojeek.com/bot.html)

Search Engine: desired, everything okay

Mediatoolkitbot

2025-01-06 16:06:48

213.186.1.154

Mediatoolkitbot (complaints@mediatoolkit.com)

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

MetaJobBot

2024-10-15 02:17:32

crc.metajobs.de

Mozilla/5.0 (compatible MetaJobBot https://www.metajob.de/crawler)

Search Engine for Jobs: desired, everything okay

Mediumbot-MetaTagFetcher

2024-06-17 14:53:06

ec2-…-…-…-….eu-west-1.compute.amazonaws.com

Pages shared on social media must be scanned by the service.

MJ12bot

2024-04-08 12:40:47

194.247.172.0/23 
Start IP: 194.247.172.0
End IP: 194.247.173.255
 

ns3088854.ip-217-182-175.eu

Mozilla/5.0 (compatible MJ12bot/v1.4.8 http://mj12bot.com/)

Blocked due to too frequent visits. SEO Company that only collects data for its own customers. You can only use your own data for a fee. According to its own statement, it does not store any web content or personal data. Only link relationships between websites are shown.

🐘 Mastodon

2024-03-30 12:17:54

static.254.9.130.94.clients.your-server.de // ip250.ip-51-68-203.eu // mail.mls20.de // mx.zvcdn.de // neuland.social // gamma.ohai.is // ns31628207.ip-57-128-95.eu // vps-a39c1e80.vps.ovh.net

Pages shared on social media must be scanned by the service.

🐘 Mastodon/4.2.15 (UK)

2025-03-20 12:37:02

ip82.ip-51-77-122.eu

http.rb/5.1.1 (Mastodon/4.2.15 +https://mastodonapp.uk/) Bot

Pages shared on social media usually have to be scanned by the service, but it does not adhere to robots.txt in any way! Reactivated for testing purposes

msnBot

msnbot-xx-xxx-xxx-xxx.search.msn.com

Search Engine: desired, everything okay

monosmdom-crawler

2025-07-20 06:13:35

corvus.uberspace.de

95.143.172.0/24 
Start IP: 95.143.172.0
End IP: 95.143.172.255
 

monosmdom-crawler/0.0.1 (contact: monosm@uber.space) (codename: SuperTallSoupFleece)

UNKNOWN: There is no information about the crawler. rh-tec Business GmbH, Fuenfhausen 32, 32549 Bad Oeynhausen, Germany

Nicecrawler

2025-01-08 05:52:40

69.160.160.59

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible Nicecrawler/1.1 +http://www.nicecrawler.com/) Chrome/90.0.4430.97 Safari/537.36
Mozilla/5.0 (X11 Linux x86_64) AppleWebKit/537.36 (KHTML like Gecko) HeadlessChrome/92.0.4515.107 Safari/537.36
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/79.0.3945.130 Safari/537.36

IP address is from Intelium Corp. (US). The BOT info page states „Our goal is to create an image archive of the entire Internet, changing over time, to preserve it historically.“ But I can't query this.

Nextdoorbot

2024-06-17 15:02:41

ec2-…-…-…-….compute-1.amazonaws.com

Pages shared on social media must be scanned by the service.

netEstate NE Crawler

2025-02-10 02:07:53

vinsanto.netestate.de, bardolino.netestate.de

81.209.177.0/24

netEstate NE Crawler (+http://www.website-datenbank.de/)

Website directory. All outgoing links are nofollow (rel="nofollow"). Does not follow rules or robots.txt! Crawling of pages from other search engines is prohibited in the robots.txt file

NAVER

2024-08-18 09:22:08

crawl.xxx-xxx-xx-xxx.web.naver.com

Yeti/1.1 +https://naver.me/spd

Search Engine: desired, everything okay

Orbbot

2025-06-20 17:45:43

251.205.48.34.bc.googleusercontent.com

34.4.5.0/24, 34.4.6.0/23, 34.4.8.0/21, 34.4.16.0/20, 34.4.32.0/19, 34.4.64.0/18, 34.4.128.0/17, 34.5.0.0/16, 34.6.0.0/15, 34.8.0.0/13, 34.16.0.0/12, 34.32.0.0/11 
Start IP: 34.4.5.0
End IP: 34.63.255.255
 

Mozilla/5.0 (compatible; Orbbot/1.1;)

Google Crawler: I couldn't find any information about this crawler.

OAI-SearchBot

2025-03-27 00:18:02

51.8.0.0/16 
Start IP: 51.8.0.0
End IP: 51.8.255.255
 

Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/131.0.0.0 Safari/537.36 compatible OAI-SearchBot/1.0 +https://openai.com/searchbot

OAI-SearchBot is used to link to and surface websites in search results in the SearchGPT prototype, and OpenAI search features. Desired; for me everything is okay. See also ChatGPT-User, GPTBot…

PetalBot

2025-04-25 11:00:31

114.119.128.0/19 
Start IP: 114.119.128.0
End IP: 114.119.159.255
 

petalbot-114-119-…-….petalsearch.com

Mozilla/5.0 (Linux Android 7.0 ) AppleWebKit/537.36 (KHTML like Gecko) Mobile Safari/537.36 (compatible PetalBot +https://webmaster.petalsearch.com/site/petalbot)
Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)

Crawler for the Petal search engine; it also supplies content recommendations to users of the Huawei Assistant and AI Search services. The HUAWEI Petal Search app is only available as an APK from the Huawei UpToDown Store. Downloads: 752759

UPDATE Currently blocked in robots.txt (User-agent: PetalBot, Disallow: /) due to too many requests. The crawler adheres to the rule.

📂 Pro-Sitemaps/1.0

2025-02-04 11:01:04

new1.xml-sitemaps.com // pro2.pro-sitemaps.com

Mozilla/5.0 (compatible Pro Sitemaps Generator pro-sitemaps.com) Gecko Pro-Sitemaps/1.0

Self-started / queried, everything okay. Creation of sitemap.xml / sitemap.html / sitemap.xml.gz.

Pandalytics/2.0

2025-01-20 23:06:46

ec2-18-203-244-213.eu-west-1.compute.amazonaws.com

Pandalytics/2.0 (https://domainsbot.com/pandalytics/)

Sales and acquisitions service that only collects data for its own customers: brand monitoring, business intelligence, data provision, name suggestion.

Logo perplexity   PerplexityBot

2025-01-29 03:31:18

ISP

Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible PerplexityBot/1.0 +https://perplexity.ai/perplexitybot)

AI search engine, okay for me at first

Pleroma

2024-03-30 12:17:55

static.100.2.21.65.clients.your-server.de

Pages shared on social media must be scanned by the service.

paloaltonetworks

2024-07-20 11:24:04

198.235.24.122

Expanse a Palo Alto Networks company searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans please send IP addresses/domains to: scaninfo@paloaltonetworks.com

pbiaas.com

2025-05-02 21:52:18

ip85.215.186.xxx.pbiaas.com
ip217-160-3-xxx.pbiaas.com

217.160.3.0/24 
Start IP: 217.160.3.0
End IP: 217.160.3.255
 

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/99.0.4844.84 Safari/537.36

No website, no information, could not verify why data is being collected. IONOS SE

peer39_crawler [US]

2024-02-21 08:38:25

204.15.208.26

Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites

Pinterest

2025-05-02 22:28:12

crawl-54-236-1-13.pinterest.com

54.224.0.0/11 
Start IP: 54.224.0.0
End IP: 54.255.255.255
 

Mozilla/5.0 (compatible Pinterestbot/1.0 +http://www.pinterest.com/bot.html)

Pages shared on social media must be scanned by the service. (Amazon)

proximic

2024-04-23 13:22:49

ec2-52-90-218-67.compute-1.amazonaws.com

Advertising: Tracking websites and customers. "Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content."

ResearchBot

2025-03-20 10:36:26

116.202.0.0/15 
Start IP: 116.202.0.0
End IP: 116.203.255.255
 

static.30.52.202.116.clients.your-server.de

UNKNOWN: There is no information about the crawler. APNIC: Countries: ZZ, Email: no-email@apnic.net, Phone: +00 0000 0000, Remarks: No contact information for stub records. It's a shame there's no information on the web crawler.

SEOkicks

2025-02-18 08:58:20

cp2.seokicks.de, cp3.seokicks.de

Mozilla/5.0 (compatible SEOkicks +https://www.seokicks.de/robot.html)

Backlink analysis

star-finder.de Bot

2024-12-15 21:42:21

srv01.handelsvertreter-netzwerk.de

88.99.110.77

star-finder.de Bot

Open Graph Search Engine: desired, everything okay

SiteCheckerBotCrawler

2024-10-17 12:23:39

static.88-99-92-200.clients.your-server.de

SiteCheckerBotCrawler/1.0 (+http://sitechecker.pro)

Self-started / queried, everything okay. Own website SEO.

Scrapy

2024-09-26 16:54:38

ec2-18-225-11-90.us-east-2.compute.amazonaws.com

Scrapy/2.5.1 (+https://scrapy.org)

Could not verify why data is being collected. Does not follow robots.txt rules. An open source and collaborative framework for extracting the data you need from websites.

Screaming Frog SEO Spider

2025-01-03 00:16:48

unn-185-156-…-….datapacket.com

185.151.193.0/27 
Start IP: 185.151.193.0
End IP: 185.151.193.31
 

Screaming Frog SEO Spider/21.4

SEO Company that only collects data for its own customers. There is free access for hobby users and beginners. This way you can check your own data.

🔍 SeznamBot

2025-04-28 09:17:58

fulltextrobot-77-75-7x-xxx.seznam.cz

77.75.76.0/24 
Start IP: 77.75.76.0
End IP: 77.75.76.255
 

Mozilla/5.0 (compatible SeznamBot/4.0 +https://o-seznam.cz/napoveda/vyhledavani/en/seznambot-crawler/)

Search Engine: desired, everything okay

Logo Google G   Schema-Markup-Validator

2024-04-18 14:31:33

google-proxy-66-249-83-116.google.com

Self-started / queried, everything okay. Own website SEO.

serpstatbot

2024-04-24 04:05:05

static.25.67.76.144.clients.your-server.de

144.76.0.0/16 
Start IP: 144.76.0.0
End IP: 144.76.255.255
 

SEO Company that only collects data for its own customers. You can only use your own data for a fee. Rules are not followed, links with the attribute rel=nofollow are crawled.

SemrushBot

2024-02-16 16:48:35

xxx.bl.bot.semrush.com

SEO Company that only collects data for its own customers. You can only use your own data for a fee.

Seobility

SEO Company: Three queries are permitted free of charge each day. This way you can check your own data.

SerendeputyBot

2024-06-11 11:53:32

23-239-8-56.ip.linodeusercontent.com

Pages shared on the news portal must be scanned by the service.

Spider_Bot

2024-05-18 07:19:41

li965-236.members.linode.com

UNKNOWN: There is no information about the crawler.

startdedicated.de

2025-03-30 22:18:32

85.25.40.0/21 AS29066
Start IP: 85.25.40.0
End IP: 85.25.47.255
 

loft10038.startdedicated.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86

No website, no information, could not verify why data is being collected.

TrustedSite

2025-09-25 03:23:54

44.192.0.0/10 
Start IP: 44.192.0.0
End IP: 44.255.255.255
 

ec2-44-241-33-187.us-west-2.compute.amazonaws.com

Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/138.0.0.0 Safari/537.36 TrustedSite/1.0 (+https://www.trustedsite.com/crawler)

Security software: desired, everything okay

TigerBot

2025-09-22 08:04:27

spider25.tiger.ch

193.138.212.0/22 
Start IP: 193.138.212.0
End IP: 193.138.215.255
 

Mozilla/5.0 (Windows NT 10.0 Win64 x64 rv:142.0) Gecko/20100101 Firefox/142.0 TigerBot/11.33 (see tiger.ch)

Search Engine: desired, everything okay

Timpibot

2025-02-15 09:59:20

95.214.52.0/22

Mozilla/5.0 (compatible Timpibot/0.8 +http://www.timpi.io)

You can only use your own data for a fee!

thumbnail.ws

2024-12-06 21:20:10

server-55.thumbnail.ws

Mozilla/5.0 (Windows rv:81.0) Gecko/20100101 Firefox/81.0

Thumbnail service: generates small preview images of web pages.

TrendictionBot

2025-01-06 16:07:48

p8n13, p11n4, p15n14, p16n20

Mozilla/5.0 (Windows NT 10.0 Win64 x64 trendictionbot0.5.0 trendiction search http://www.trendiction.de/bot please let us know of any problems web at trendiction.com) Gecko/20100101 Firefox/125.0

This bot crawls public websites, including news sites, message boards and blogs, including their comments, and collects data only for its own clients, market researchers, agencies and other web applications. Does not comply with robots.txt rules

truemetrics.de

2025-03-29 00:25:00

89.58.0.0/18 AS197540
Start IP: 89.58.0.0
End IP: 89.58.63.255
 

webc.truemetrics.de

Mozilla/5.0 (Macintosh Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML like Gecko) Chrome/69.0.3497.100 Safari/537.36

No website, no information, could not verify why data is being collected.

t3versions

2024-03-25 02:56:52

The t3versions bot saves the domain name of a website to a database if the website has been identified as using TYPO3 CMS. So there is no need to scan our website.

Twitter

2024-03-24 23:14:22

199.16.156.0/22
r-xxx-xx-xxx-xxx.twttr.com

Twitterbot/1.0

Pages shared on social media must be scanned by the service.

Twingly Recon-Klondike

2025-05-03 14:06:26

ec2-34-241-108-107.eu-west-1.compute.amazonaws.com

34.192.0.0/10 
Start IP: 34.192.0.0
End IP: 34.255.255.255
 

Twingly Recon-Klondike/1.0 (+https://app.twingly.com/public-docs/crawler)
Mozilla/5.0 (compatible; Twingly Recon; twingly.com)

Pages shared on social media must be scanned by the service.

undefined.hostname.localhost

2025-05-12 14:40:11

185.118.79.0/24 
Start IP: 185.118.79.0
End IP: 185.118.79.255
 

WehostGroup (Airbnb management). There is no information about the crawler. Could not verify why data is being collected.

URLSuMa

2024-02-15 17:04:12

ip85.215.186.233.pbiaas.com

No SSL website, no idea what data is collected and for what.

umbrella.com

2024-12-10 01:56:41

185.104.184.0/24 
Start IP: 185.104.184.0
End IP: 185.104.184.255
 

pu13.purple-umbrella.com

Mozilla/5.0 (Windows NT 6.1 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/74.0.3729.169 Safari/537.36

DNS server from Cisco, should be fine

Logo Google G   VelenPublicWebCrawler

2025-05-15 10:01:57

34.64.0.0/10 
Start IP: 34.64.0.0
End IP: 34.127.255.255
 

x.x.x.x.bc.googleusercontent.com

Mozilla/5.0 (compatible VelenPublicWebCrawler/1.0 +https://velen.io)

IP address: Google LLC. I have not found out why a Velen web crawler is running on Google infrastructure, or what it is used for.

versanet

2024-02-15 17:04:12

i5387BC22.versanet.de

Mozilla/5.0 (Macintosh Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0

There is no information about the crawler. Could not verify why data is being collected.

W3C_CSS_Validator

2025-05-31 18:21:30

188.213.173.0/24 
Start IP: 188.213.173.0
End IP: 188.213.173.255
 

host251-173-213-188.serverdedicati.aruba.it
ec2-54-146-212-13.compute-1.amazonaws.com

Jigsaw/2.2.5 W3C_CSS_Validator_JFouffa/2.0

Self-started / queried, everything okay. CSS Validator.

WebSuseBot

2024-11-14 05:01:48

85.215.99.19

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko compatible WebSuseBot/1.0 +https://www.WebSuse.de/bot.html) Chrome/70.0.3538.77 Safari/537.36

UNKNOWN: There is no information about the crawler. 1&1 IONOS Strato! As long as only the index.php is crawled, this is ok.

WebwikiBot

2024-06-17 22:38:36

de-bot.webwiki.com

Desired, everything okay

Private project. Information can be found here.

wpbot

2024-04-09 21:43:41

ec2-xx-xx-xx-xxx.us-west-2.compute.amazonaws.com

Mozilla/5.0 (compatible wpbot/1.0 +https://forms.gle/ajBaxygz9jSR8p8G9)

No website, no idea what data is collected and for what.

Xing Bot

2024-04-08 17:18:36

185.169.112.225

Mozilla/5.0 (X11 Ubuntu Linux x86_64 rv:15.0) Xing Bot

Pages shared on social media must be scanned by the service.

🔍 YaCy - Bot

2024-11-20 12:47:13

HOSTNAME / IP PEER

yacybot (/global SYSTEMINFO PEER java VERSION Europe/de), yacybot (/global amd64 Windows 11 10.0 java 21.0.5 Europe/de) http://yacy.net/bot.html

Search Engine: desired, everything okay. Peer to peer Search Engine Software: Information can be found here.

Host Name: dynamic-080-171-237-160.80.171.pool.telefonica.de - This crawler is operated by us. Information: YaCy Crawler

🔍 YaDirectFetcher

2025-08-27 02:19:31

x-xxx-xxx-xxx.spider.yandex.com

213.180.203.0/24 
Start IP: 213.180.203.0
End IP: 213.180.203.255
 

Mozilla/5.0 (compatible YaDirectFetcher/1.0 +http://yandex.com/bots)

Search Engine: desired, everything okay

🔍 Yandex

2025-05-02 20:48:52

x-xxx-xxx-xxx.spider.yandex.com

213.180.203.0/24 
Start IP: 213.180.203.0
End IP: 213.180.203.255
 

Mozilla/5.0 (compatible YandexBot/3.0 +http://yandex.com/bots)

Search Engine: desired, everything okay

IP-Addresses

20.126.197.237

2025-03-26 23:27:32

20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.40.0.0/13, 20.48.0.0/12, 20.64.0.0/10, 20.128.0.0/16 
Start IP: 20.33.0.0
End IP: 20.128.255.255
 

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/129.0.0.0 Safari/537.36 Edg/129.0.0.0

IP address: Microsoft Corporation (MSFT). It's a shame there's no information on the web crawler.

45.129.35.105

2024-07-14 11:44:23

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the claimed Googlebot user agent; compare Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 45.129.35.105 Name: Packethub S.A.; Country: PANAMA
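
A check like the one noted for this entry (the user agent claims to be Googlebot, but the IP address does not belong to Google) can be sketched in PHP. All helper names are made up for illustration, and the trusted range 192.0.2.0/24 is a documentation-only placeholder that has to be replaced with the ranges Google actually publishes. IPv4 and a 64-bit PHP build are assumed.

```php
<?php
declare(strict_types=1);

/** True if the IPv4 address lies inside the CIDR block (hypothetical helper). */
function ipInCidr(string $ip, string $cidr): bool
{
    // A bare address without "/prefix" is treated as a /32.
    [$network, $prefix] = explode('/', $cidr, 2) + [1 => '32'];
    $ipLong  = ip2long($ip);
    $netLong = ip2long($network);
    if ($ipLong === false || $netLong === false
        || preg_match('/^\d{1,2}$/', $prefix) !== 1 || (int) $prefix > 32) {
        return false;
    }
    $mask = (0xFFFFFFFF << (32 - (int) $prefix)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

/**
 * True if the user agent claims to be Googlebot but the IP address is
 * outside every trusted range.
 */
function isSpoofedGooglebot(string $ip, string $userAgent, array $trustedRanges): bool
{
    if (stripos($userAgent, 'Googlebot') === false) {
        return false; // does not claim to be Googlebot at all
    }
    foreach ($trustedRanges as $cidr) {
        if (ipInCidr($ip, $cidr)) {
            return false; // claim is plausible
        }
    }
    return true;
}

// Placeholder range (RFC 5737 documentation block), NOT a real Google range.
$trustedRanges = ['192.0.2.0/24'];
$agent = 'Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)';
var_dump(isSpoofedGooglebot('45.129.35.105', $agent, $trustedRanges));
```

A range check only covers the IP side; whether a request should then be blocked or merely logged is a separate decision.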

46.38.240.0/22

2025-02-20 10:07:45

v2202411240578296028.bestsrv.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.9811.471 Safari/537.36

UNKNOWN: There is no information about the crawler. Looks like the RIPE Crawler

81.91.173.172

2024-05-02 10:01:36

81.91.173.172

IP address: DENIC eG, Netherlands (DENIC crawler). Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.

87.120.112.231

2025-01-20 08:18:46

Mozilla/5.0 (MSIE 10.0 Windows NT 6.1 Trident/5.0)
Mozilla/5.0 (Windows NT 6.1 WOW64 rv:33.0) Gecko/20120101 Firefox/33.0
Mozilla/5.0 (Macintosh Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML like Gecko) Version/7.0.3 Safari/7046A194A

87.120.112.0/24 - No information, no website. Classified as malicious by several portals.

89.58.0.0/18

2025-02-16 10:01:01

v2202411235230295174.ultrasrv.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.1554.306 Safari/537.36

UNKNOWN: There is no information about the crawler. Looks like the RIPE Crawler

102.129.145.83

2024-09-06 22:34:25

Mozilla/5.0 (X11 Linux x86_64) AppleWebKit/537.36 (KHTML like Gecko) HeadlessChrome/122.0.6261.94 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 102.129.145.83 Name: Internet Utilities Africa (PTY) LTD Country: ZA

108.165.237.77

2024-09-06 09:43:20

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 108.165.237.77 Name: IPXO Limited Country: US

119.29.53.223

2024-09-08 01:38:34

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0.0 Safari/537.36

UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.

IP-Lookup 119.29.53.223 Name: TencentCloud Country: HK

142.93.0.0/16, 143.110.128.0/17

2025-03-27 02:04:23

142.93.0.0/16 
Start IP: 142.93.0.0
End IP: 142.93.255.255
 

143.110.128.0/17
Start IP: 143.110.128.0
End IP: 143.110.255.255
 

Mozilla/5.0 (compatible)

IP address: NOC32014-ARIN DigitalOcean, LLC (DO-13). It's a shame there's no information on the web crawler.

151.248.4.7

2024-07-14 11:44:02

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the claimed Googlebot user agent; compare Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 151.248.4.7 Name: Internetbolaget; Country: SWEDEN

152.53.0.0/16

2025-02-14 17:14:43

152.53.0.0/16
202.61.192.0/18
v2202411235230296135.megasrv.de
v2202411235230295328.powersrv.de
v2202411240578296032.ultrasrv.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.5666.889 Safari/537.36

IP address: RIPE Network Coordination Centre, Netherlands. It's a shame there's no information on the web crawler.

164.90.175.221

2025-01-17 03:52:30

Mozilla/5.0 (compatible Googlebot/2.1)

The IP address does not match the claimed Googlebot user agent; compare Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 164.90.175.221 Name: DigitalOcean; Country: US

188.68.56.0/22

2025-02-16 10:01:01

v2202411235230295174.ultrasrv.de

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.1554.306 Safari/537.36

UNKNOWN: There is no information about the crawler. Looks like the RIPE Crawler

194.110.115.43

2024-07-14 11:44:02

Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)

The IP address does not match the claimed Googlebot user agent; compare Google's IP lists: Special Crawlers, User Triggered Fetchers, googlebot

IP-Lookup 164.90.175.221 Name: DigitalOcean; Country: US

196.251.73.0/24 
Start IP: 196.251.73.0
End IP: 196.251.73.255
 

2025-05-02 19:16:35

ns1432.ztomy.com

Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/58.0.3029.110 Safari/537.3
python-httpx/0.28.1

UNKNOWN: There is no information about the crawler.

spider - crawler - bot

1and1.org

2024-02-19 11:54:02

Collects data only for its own company.

2ip bot/1.1 (+http://2ip.io)

2024-04-07 13:26:38

vm1564919.stark-industries.solution

Tools for "speed test", "IP address" lookup and much more. Normally only the home page is requested.


Legend: Blocked | Everything okay | Under observation | Info

Crawler, Spider, Bot
  • Identifies itself

    Example: +https://example.com/bot.html spider@example.com

  • Only downloads the static, textual content
  • Honors the rules of a robots.txt
  • Doesn't execute JavaScript to generate Ad Impressions or Views
  • Crawls at a slow rate by default

    robots.txt ⇒ Crawl-delay: 60

    However, note that this also limits the number of pages a search engine can index or update. A crawl delay of 60 seconds, for example, means that each bot, spider or crawler can fetch at most 1,440 pages per day (86,400 seconds per day divided by 60).
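
The effect of a Crawl-delay value on the daily page budget is simple arithmetic. A small PHP sketch, assuming the crawler sends exactly one request per delay interval around the clock; the function name maxPagesPerDay is made up for illustration:

```php
<?php
declare(strict_types=1);

/**
 * Upper bound of pages one crawler can fetch per day for a given
 * Crawl-delay (in seconds). Hypothetical helper for illustration.
 */
function maxPagesPerDay(int $crawlDelaySeconds): int
{
    if ($crawlDelaySeconds <= 0) {
        throw new InvalidArgumentException('Crawl-delay must be a positive number of seconds.');
    }
    return intdiv(24 * 60 * 60, $crawlDelaySeconds); // 86,400 seconds per day
}

foreach ([10, 30, 60] as $delay) {
    printf("Crawl-delay: %d => at most %d pages per day\n", $delay, maxPagesPerDay($delay));
}
```

For a site with only a few hundred pages, a delay of 60 seconds still leaves enough budget for a full crawl per day; larger sites may prefer a smaller value.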


  • UPDATE  Dataset has been updated

    NEW  Data record was newly created

     

    Database: CC BY-SA 4.0 DEED 🔗 Detlev Molitor // www.molitor-eu.de 🔗

    How it works!?

    The trap link is incorporated into the legal links with two attributes: rel="nofollow" on the link and style="display:none!important;" on the link content. rel="nofollow" tells search engines not to follow the link; style="display:none!important;" keeps the link content from being displayed to visitors. However, some search engines do not adhere to this! A crawler that requests evil.php anyway is recorded there.

    Source code

    add to robots.txt

    User-agent: *
    Allow: /
    Disallow: /evil.php

    add to footer

    Variant 1: the link is not visible on your page

    <p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/spinne.svg" alt="spinne" style="display:none!important;" loading="lazy" width="48" height="48"></a></p>

    OR variant 2: the link is visible on your page

    <p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/awards/no-bad-bot.svg" alt="spinne" loading="lazy" width="40" height="13"></a></p>

    NEW PAGE: "evil.php"

    <?php
    // Must be the first bytes of the file: headers can only be sent before any output.
    header('Content-Type: text/html; charset=utf-8');
    header('X-Robots-Tag: none');

    $path     = $_SERVER['DOCUMENT_ROOT'];
    $root     = 'https://' . $_SERVER['SERVER_NAME'];
    $filename = $path . '/data/bot.csv'; // the directory /data must exist and be writable

    // All values below come from the client, so they are truncated before logging.
    $herkunft    = substr($_SERVER['HTTP_REFERER'] ?? 'false', 0, 255); // Herkunft = referrer
    $remoteAddr  = $_SERVER['REMOTE_ADDR'];
    $hostname    = gethostbyaddr($remoteAddr) ?: $remoteAddr;
    $userAgent   = substr($_SERVER['HTTP_USER_AGENT'] ?? '', 0, 255);
    $queryString = substr($_SERVER['QUERY_STRING'] ?? '', 0, 20);
    $cookiesSet  = substr(implode(' ', array_keys($_COOKIE)), 0, 30);
    $time        = date('Y-m-d H:i:s');

    // Append one "~"-separated record per visit.
    $handle = fopen($filename, 'a');
    if ($handle !== false) {
        fputcsv($handle, [$time, $remoteAddr, $hostname, $userAgent, $queryString, $herkunft, $cookiesSet], '~', '"', '\\');
        fclose($handle);
    }
    ?>
    <!DOCTYPE HTML>
    <html>
    <head>
    <meta name="robots" content="none">
    </head>
    <body>
    <?php
    echo '<a href="' . htmlspecialchars($root) . '/">HOME</a>';
    if (file_exists($filename)) {
        echo '<table border="0" cellspacing="0" cellpadding="5" width="100%" class="csvTable">';
        echo '<tr style="color:#f7f6f5; background-color:#2196F3;"><td>TIME</td><td>IP REMOTE</td><td>IP HOST</td><td>AGENT</td><td>QUERY STRING</td><td>REFERRER</td><td>COOKIES</td></tr>';
        $handle = fopen($filename, 'r');
        while (($data = fgetcsv($handle, 0, '~', '"', '\\')) !== false) {
            echo '<tr>';
            foreach ($data as $field) {
                // Escape every logged value: user agents and referrers are untrusted input.
                echo '<td>' . htmlspecialchars((string) $field) . '</td>' . "\n";
            }
            echo '</tr>' . "\n";
        }
        fclose($handle);
        echo '</table>';
    } else {
        echo '<p>The spider hasn&apos;t caught anything yet!</p>';
    }
    ?>
    </body>
    </html>
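
Once the trap has collected some visits, the log can be summarised per user agent. The following PHP sketch is only an illustration: it assumes a log with one "~"-separated record per line and the user agent in the fourth field, and the function name countHitsByAgent is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Counts trap hits per user agent, most frequent first.
 * Assumption: one "~"-separated record per line, user agent in field 4.
 */
function countHitsByAgent(string $filename): array
{
    $counts = [];
    if (!is_readable($filename)) {
        return $counts;
    }
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return $counts;
    }
    while (($fields = fgetcsv($handle, 0, '~', '"', '\\')) !== false) {
        if (count($fields) < 4) {
            continue; // skip blank or truncated lines
        }
        $agent = (string) $fields[3];
        $counts[$agent] = ($counts[$agent] ?? 0) + 1;
    }
    fclose($handle);
    arsort($counts); // sort by hit count, descending, keeping the agent as key
    return $counts;
}
```

A call such as countHitsByAgent($_SERVER['DOCUMENT_ROOT'] . '/data/bot.csv') returns an array that maps each user agent to its number of hits, which helps to decide which agents to add to a ban list.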

    Ban Code

    PHP

    On all Pages

    <?php
    // Must be the first bytes of every page: the status header can only be sent before any output.
    if (!empty($_SERVER['HTTP_USER_AGENT']) && preg_match('/Mb2345Browser|peer39_crawler|dataprovider|Dmbot|Grapeshot|IonCrawl|URLSuMaBot|Semrush|LieBaoFast|zh-CN|MicroMessenger|zh_CN|Kinza|MJ12bot|AhrefsBot|Bytespider/i', $_SERVER['HTTP_USER_AGENT'])) {
        http_response_code(403);
        header('X-Robots-Tag: none');
        die('<h1>Error 403 Forbidden</h1><h2><a href="/evil.php">Spider Trap - Spinnenfalle</a></h2><p>[EN] Access not allowed</p><p>[DE] Zugriff nicht erlaubt</p><p>[DA] Adgang ikke tilladt</p>');
    }
    ?>
    <!DOCTYPE HTML>
    <html> <head> … </head>
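
If the ban list grows, it can be easier to maintain as an array than as one long regular expression. A PHP sketch under that assumption; buildBanPattern and isBannedAgent are hypothetical names, and the token list is just a short excerpt of the agents named in the ban code:

```php
<?php
declare(strict_types=1);

/** Builds one case-insensitive pattern from a list of literal tokens. */
function buildBanPattern(array $tokens): string
{
    // preg_quote() escapes regex metacharacters, so every token is matched literally.
    $quoted = array_map(static fn(string $t): string => preg_quote($t, '/'), $tokens);
    return '/' . implode('|', $quoted) . '/i';
}

/** True if the user agent contains one of the banned tokens. */
function isBannedAgent(string $userAgent, string $pattern): bool
{
    return $userAgent !== '' && preg_match($pattern, $userAgent) === 1;
}

// Excerpt of the ban list above; extend as needed.
$banned  = ['MJ12bot', 'AhrefsBot', 'Bytespider', 'Semrush', 'peer39_crawler'];
$pattern = buildBanPattern($banned);

var_dump(isBannedAgent('Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)', $pattern));
```

Because the tokens are quoted, an entry such as zh-CN or a name containing a dot cannot accidentally change the meaning of the pattern.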

    OR (use either the PHP variant above or the .htaccess variant below, not both)

    .htaccess

    # ban code [USER AGENT]
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (base64_decode|eval) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler|paloaltonetworks|dataprovider|Dmbot|Grapeshot) [NC]
    RewriteRule .* - [F]
    </IfModule>

    [HTTP_USER_AGENT] which are blocked via .htaccess

    Some [HTTP_USER_AGENT] are automatically blocked by the 8G firewall

    base64_decode || bin/bash || disconnect || eval || unserializ || ahrefs || archiver || curl || libwww-perl || pycurl || scan || wget || acapbot || acoonbot || alexibot || asterias || attackbot || backdorbot || becomebot || binlar || blackwidow || blekkobot || blexbot || blowfish || bullseye || bunnys || butterfly || careerbot || casper || checkpriv || cheesebot || cherrypick || chinaclaw || choppy || clshttp || cmsworld || copernic || copyrightcheck || cosmos || crescent || cy_cho || datacha || demon || diavol || discobot || dittospyder || dotbot || dotnetdotcom || dumbot || econtext || emailcollector || emailsiphon || emailwolf || eolasbot || eventures || extract || eyenetie || feedfinder || flaming || flashget || flicky || foobot || fuck || g00g1e || getright || gigabot || go-ahead-got || gozilla || grabnet || grafula || harvest || heritrix || httracks? || icarus6j || jetbot || jetcar || jikespider || kmccrew || leechftp || libweb || liebaofast || linkscan || linkwalker || loader || lwp-download || majestic || masscan || miner || mechanize || mj12bot || morfeus || moveoverbot || netmechanic || netspider || nicerspro || nikto || ninja || nominet || nutch || octopus || pagegrabber || petalbot || planetwork || postrank || proximic || purebot || queryn || queryseeker || radian6 || radiation || realdownload || remoteview || rogerbot || scan || scooter || seekerspid || semalt || siclab || sindice || sistrix || sitebot || siteexplorer || sitesnagger || skygrid || smartdownload || snoopy || sosospider || spankbot || spbot || sqlmap || stackrambler || stripper || sucker || surftbot || sux0r || suzukacz || suzuran || takeout || teleport || telesoft || true_robots || turingos || turnit || vampire || vikspider || voideye || webleacher || webreaper || webstripper || webvac || webviewer || webwhacker || winhttp || wwwoffle || woxbot || xaldon || xxxyy || yamanalab || yioopbot || youda || zeus || zmeu || zune || zyborg
     

    Images for you

    Right click ➛ save image as…

    No Bad-Bot Spider empty Unicode® HTML-Code Symbol
    |No Bad-Bot| |Spider| |empty| 🕷 |&#128375;| 🕸 |&#128376;|


    License and copyright notice for SPIDER-TRAP


    🕸 Websites with Spider-Trap

    You have installed Spider-Trap and would like to be listed here? Write your information in our discussion forum.

    Ferienwohnung Tweer in Leutkirch im Allgäu

    Orthopädie Schutechnik Meisterbetrieb Risse e.K. - Inhaber Kuin Hasasov

    SpassAmVerreisen.de :: Hier beginnt der Urlaub schon beim Buchen

    Business View Photo Ag, Ihre Digital Marketing Agentur

    GOOGLE Street View | trusted

    Bernhard Mennemeier - Fahrradgeschäft in Waltrop

    mtandao ist Suaheli und bedeutet Netzwerk




    📧 We offer you a high level of security. As a rule, we use Hypertext Transfer Protocol Secure (HTTPS) to protect communication over the Internet. Please note, however, that communication over the Internet (including e-mail) is not completely secure, and that persons outside our organization could intercept and access transmitted information. In addition, our servers are regularly checked by Google and TrustedSite for irregularities such as viruses, phishing, dangerous downloads, etc.


    (4) Source code (HTML, Java, C, Batch …): For questions and help, see our discussion forum or FAQ. | Note: We reserve the right to make changes, add information, or delete existing information at any time. | No guarantee can be given; we exclude any claims for compensation!


    The information on this website reflects the state of the art and our experience. Given the variety of possible applications and technical conditions, it can only give guidance and cannot be fully transferred to every individual case; therefore no obligations, liability claims or warranty claims can be derived from it.


    (●) de Alle Angaben erfolgen ohne Gewähr. fr Toutes ces dates sont données sans garantie. en All information is provided without guarantee. dk Alle oplysninger gives uden garanti. se All information tillhandahålls utan garanti. no All informasjon gis uten garanti. nl Alle informatie wordt zonder garantie verstrekt. eo Ĉiuj informoj estas provizitaj sen garantio. es Toda la información se proporciona sin garantía. lb All Informatioun gëtt ouni Garantie geliwwert. fy Alle ynformaasje wurdt levere sûnder garânsje. gd Tha a h-uile fiosrachadh air a thoirt seachad gun ghealladh. pt Todas as informações são fornecidas sem garantia. pl Wszystkie informacje są dostarczane bez gwarancji.


