2025-02-24 03:44:56
192.0.64.0/18
Automattic Analytics Crawler/0.2 http://wordpress.com/crawler/
Only index.php: The crawler loaded the public homepages, and checked which CMS was in use to generate statistics for comparison. The project was retired in 2014 and no longer runs. ❓
2025-01-02 14:35:52
185.3.235.245
web6.alfahosting-server.de
website crawler, from the hosting provider, to check whether malware is installed somewhere
2024-12-20 11:37:22
crawl346.us.archive.org
Mozilla/5.0 (compatible archive.org_bot +http://archive.org/details/archive.org_bot) Zeno/76f39f7 warc/v0.8.53
The Internet Archive is a nonprofit digital library that preserves web data and makes it available for research purposes through the Wayback Machine. We began archiving the web in 1996, and currently have preserved over 150 billion web documents.
2024-10-15 19:00:43+02:00
xxx.xxx.xxx.xxx.crawl.amazonbot.amazon
Mozilla/5.0 (Macintosh Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1 +https://developer.amazon.com/support/amazonbot)
Amazonbot is Amazon's web crawler, used to improve Amazon's services, such as enabling Alexa to answer even more questions for customers. Amazonbot respects standard robots.txt rules.
2024-05-11 03:49:18+02:00
164.90.225.193
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2025-01-06 02:13:39
17-241-xxx-xxx.applebot.apple.com
Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1 +http://www.apple.com/go/applebot)
Search Engine: desired, everything okay
2025-02-24 14:00:23
loquat.unboiled.info // vps-aefd39af.vps.ovh.net
Akkoma 3.13.1 https://social.anthropi.st Akkoma 3.13.1 https://social.anthropi.st Bot
Pages shared on social media must be scanned by the service.
2024-05-05 22:53:17+02:00
pot34.webmeup.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
AWS Amazon Web Services, is used by many. AWS IP address ranges
2025-03-13 16:22:14
172.64.0.0/13
Dazzle BlueSky Bot/1.0
Pages shared on social media must be scanned by the service.
2024-04-17 01:41:40+02:00
xxx.xxx.xxx.xxx
Crawler for the Chinese search engine Baidu: desired, everything okay
2025-01-29 02:39:08
msnbot-xx-xxx-xxx-xxx.search.msn.com
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible bingbot/2.0 +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36
Search Engine: desired, everything okay
2024-07-24 13:56:41+02:00
185.170.167.18
UNKNOWN: There is no information about the crawler. (MNT By: Semrush_Net)
2024-03-27 23:21:41
154.54.249.162
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2025-02-17 11:19:02
ec2-xx-xxx-xx-xxx.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com) [Bytespider]
ByteDance, products and services such as TikTok, CapCut, TikTok Shop, Lark, Pico. Could not verify why data is being collected.
2024-12-18 13:19:09
ninja-crawler96.webmeup.com, rondo-crawler10.blex.seranking.com
Mozilla/5.0 (compatible BLEXBot/1.0 +http://webmeup-crawler.com/)
Free backlink checker / crawler, by SEO SpyGlass, Data for Free
2025-03-12 08:51:45
89.249.193.0/24
104.233.0.0/18
146.100.0.0/14
hyperaesthesic.gsshoppingmall.com
CodaBot/1.0
(a) Does not follow the rules of robots.txt. (b) Absolutely no information about the bot found. IP address probably RIPE netutils.io. It's a shame there's no information on the web crawler.
2025-01-01 15:43:22
ec2-52-59-196-8.eu-central-1.compute.amazonaws.com
CheckMarkNetwork/1.0 (+http://www.checkmarknetwork.com/spider.html)
CheckMark Network – Complete Brand Protection. I don't need this.
2024-11-15 06:22:35
162.142.125.198
Mozilla/5.0 (compatible CensysInspect/1.1 +https://about.censys.io/)
Collects security-relevant data!
2024-10-02 12:48:14+02:00
83.149.81.165, itbe.nl
companyspotter/2.0.0.0 (robot@companyspotter.com)
Website analysis: Find out how websites are built and what software is used. Lead generation: Find your ideal business target group through various intelligent and creative search methods.
2025-03-18 16:18:13
81.0.218.0/23
vmi….contaboserver.net // vmd….contaboserver.net
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54
python-requests/2.32.3
Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Net Name: RIPE Network Coordination Centre
2025-03-18 16:18:14
144.91.64.0/18
vmi….contaboserver.net // vmd….contaboserver.net
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54
python-requests/2.32.3
Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Net Name: CONTABO
2024-09-24 05:50:05+02:00
internettl.org, ec2-x-x-x-x.us-east-2.compute.amazonaws.com, pool-x-x-x-x.nycmny.fios.verizon.net
Could not verify why data is being collected. Scans only index.php. IP-Lookup 13.95.133.245: Handle: MSFT - Name: Microsoft Corporation
2024-07-12 21:35:22+02:00
ec2-44-215-105-52.compute-1.amazonaws.com
Advertising: Tracking websites and customers. „Data intelligence company that works with leading marketers to anonymously identify and segment audiences based on their digital behavior in real time.“
2024-04-03 03:42:33+02:00
ec2-….compute-1.amazonaws.com
18-97-14-....crawl.commoncrawl.org
CCBot/2.0 (https://commoncrawl.org/faq/)
Common Crawl is a 501(c)(3) nonprofit organization whose mission is to provide Internet researchers, companies, and individuals with a copy of the Internet, free of charge, for research and analysis purposes. Does not follow its own robots.txt rules
2024-03-21 16:56:14
x.x.x.x.bc.googleusercontent.com
Self-started / queried, everything okay
2024-11-24 00:11:56+01:00
ec2-x-x-x-x.us-east-2.compute.amazonaws.com
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible ClaudeBot/1.0 +claudebot@anthropic.com)
AI (artificial intelligence) agent from Anthropic; could not verify why data is being collected. Does not follow its own robots.txt rules.
2025-03-13 20:44:50
23.96.0.0/13, 57.150.0.0/15, 57.152.0.0/13, 57.160.0.0/12
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
ChatGPT users may also interact with external applications via GPT Actions. ChatGPT-User governs which sites these user requests can be made to. It is not used for crawling the web in any automatic fashion, nor to crawl content for generative AI training. Desired; for me everything is okay. See also GPTBot, OAI-SearchBot…
2024-04-23 13:22:49+02:00
ec2-52-90-218-67.compute-1.amazonaws.com
Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“
2024-12-07 13:20:39
Hostname Peer
Sitemapper: Self-started / queried, everything okay.
2024-03-11 23:11:04
crawl-149-56-160-221.dataproviderbot.com
Paid search engine. This runs counter to a free internet.
2024-06-24 16:17:29+02:00
bot.domainstats.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-04-16 19:19:52+02:00
crawling-gateway-136-243-228-179.dataforseo.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-05-02 10:01:36+02:00
81.91.173.172
IP address: DENIC eG, Netherlands. Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.
2024-03-07 03:15:59
UNKNOWN: There is no information about the crawler.
2024-08-23 09:28:59+02:00
unn-185-24-11-164.datapacket.com
UNKNOWN: There is no information about the crawler.
2024-02-22 16:48:03
Test Bot / Crawler from me
2024-03-22 05:38:31
20.191.45.212 | 40.88.21.235
Search Engine: desired, everything okay
2025-01-07 00:04:15+01:00
nl.proxy.tntcode.net
Mozilla/5.0 (compatible Example3/1.0 +https://www.example3.com/domain/novis-itineribus.de)
Web directory without links, only the texts are cloned. Links cost money. What's the point?
2024-10-11 16:57:22+02:00
45.84.107.198
Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML like Gecko) Chrome/48.0.2564.109 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Does not follow its own robots.txt rules
2024-10-04 19:28:10+02:00
165.154.201.75
Embarcadero URI Client/1.0
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. Is this a client or a bot server?
IP-Lookup 165.154.201.75 Name: Scloud Pte Ltd. Country: SG-Singapore
2024-09-28 10:56:27+02:00
185.65.135.162
Mozilla/5.0 (compatible Exabot/3.0 http://www.exabot.com/go/robot)
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name exabot.com is for sale.
2024-04-04 17:27:09+02:00
193.222.96.142
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected. The domain name EVERYFEED.COM is for sale.
2024-04-25 11:16:15+02:00
holmavik.core.headline.com
No information, could not verify why data is being collected.
2024-09-07 04:51:06+02:00
vps2406078.fastwebserver.de
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.114 Safari/537.36 Edg/91.0.864.54
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.
IP-Lookup 62.141.44.236 Name: Internet No DNS data found. Country: ./.
2025-01-06 16:37:50
ns2.cacheability.com
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/91.0.4472.124 Safari/537.36 flyriverbot/1.1 (+https://www.flyriver.com/crawler AI Content)
Web directory with ready-made AI answers? I still don't quite understand.
filiBot is the generic name for Fili's SEO crawler - Desired, everything okay. Own website SEO.
2025-02-24 13:59:24
anonsys.net // ip5b42e95a.dynamic.kabel-deutschland.de // nobody.yourvserver.net // static.58.220.201.138.clients.your-server.de // anantes-655-1-3-127.w2-9.abo.wanadoo.fr
Friendica/2024.09-rc DatabaseVersion/1576 Request/SiteInfo/1 +https://base.nospy.net
Friendica/2024.12 DatabaseVersion/1576 Request/SiteInfo/1 +https://anonsys.net
Friendica… +https://friendica.world // +https://theprancingpony.in
A decentralized social network. Pages shared on social media must be scanned by the service. Status changed from okay to under observation. You cannot see any data without registration!
2024-03-30 12:17:55
ec2.….compute-1.amazonaws.com, proxy.flipboard.com
Mozilla/5.0 (Macintosh Intel Mac OS X 10.11 rv:49.0) Gecko/20100101 Firefox/49.0 (FlipboardProxy/1.2 +http://flipboard.com/browserproxy)
Flipboard is a content discovery app that indexes news web content. Pages shared on social media must be scanned by the service.
2025-02-13 13:08:15
57.141.0.0/16, 57.142.0.0/15, 57.144.0.0/14, 57.148.0.0/15, facebookexternalhit, facebookcatalog, meta-externalagent, fwdproxy-ncg-….fbsv.net
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Pages shared on social media must be scanned by the service.
2024-05-10 18:23:37+02:00
Friendly_Crawler/Nutch-1.20-SNAPSHOT // FriendlyCrawler/1.0
…us-west-2.compute.amazonaws.com
No information, could not verify why data is being collected. (Two crawlers: note the underscore in the first name.)
2024-12-05 19:19:11
164.92.220.26
fedistatsCrawler/1.0
Search for hashtags of trending topics and articles on Mastodon. It's a shame there's no information on the web crawler.
2024-07-28 19:40:49+02:00
ec2-…-…-…-….us-west-1.compute.amazonaws.com
Crawls for favicon to offer it for a fee on your website.
2025-03-17 09:20:20
product-search-83-99-151-....geedo.com
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible GeedoProductSearch +http://www.geedo.com/product-search.html) Chrome/79.0.3945.88 Safari/537.36
Scans online stores to find products.
crawl-xx-xxx-xx-x.googlebot.com
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
Mozilla/5.0 (Linux Android 6.0.1 Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML like Gecko) Chrome/133.0.6943.53 Mobile Safari/537.36 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
Search Engine: desired, everything okay
2025-02-13 04:40:18
202.62.58.0/24-headquarter.online.com.kh
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware!
2025-03-18 19:46:33
85.215.96.0/19
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
It's not a Googlebot! Anyone who uses a false identity as a bot on the Internet has something to hide. So beware! Name: IONOS SE/Strato GmbH
2025-01-27 05:14:11
google-proxy-74-125-215-201.google.com
Ads Engine: desired, everything okay
2024-11-04 22:57:37
crawl-xx-xxx-xx-x.googlebot.com
Search Engine: desired, everything okay
crawl-xx-xxx-xx-x.googlebot.com
Search Engine: desired, everything okay
2024-03-25 14:32:59
Self-started / queried, everything okay. Own website SEO.
2024-06-17 14:50:25+02:00
rate-limited-proxy-66-249-92-23.google.com
Self-started / queried, everything okay. Link Check for YouTube.
2024-04-18 14:31:37+02:00
google-proxy-66-249-83-118.google.com
Self-started / queried, everything okay. Own website SEO.
2024-02-21 08:38:25
152.67.137.35
Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites
2024-05-11 03:31:49+02:00
20.160.0.0/12
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot
GPTBot is OpenAI's web crawler: desired, for me everything okay. See also ChatGPT-User, OAI-SearchBot…
2024-09-07 04:51:06+02:00
36.182.49.83
Go-http-client/1.1
UNKNOWN: Causes a lot of 404 Not Found errors. There is no information about the crawler. Could not verify why data is being collected.
IP-Lookup 36.182.49.83 Net Name: CMNET. Country: CN
2025-01-17 10:03:52
124.248.190.47
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
Everything is obscured here. Bot is not Google.
2024-12-27 23:59:50
business-90-187-127-97.pool2.vodafone-ip.de
Mozilla/5.0 (compatible howBot/1.0)
There is no information about the crawler. Could not verify why data is being collected.
2025-01-26 20:07:36
64.124.8.178.available.above.net
Mozilla/5.0 (compatible ImagesiftBot +imagesift.com)
„ImageSiftBot is a web crawler that scrapes the internet for publicly available images to support our suite of web intelligence products.“ Could not verify why data is being collected. What happens to the pictures? „Our web intelligence products use this index to enable search and retrieval of similar images.“
2024-11-04 22:39:19
191.107.250.11
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36
IP address: IANA.org, EU. The IANA.org (Internet Assigned Numbers Authority) functions coordinate the Internet's globally unique identifiers, and are provided by Public Technical Identifiers, an affiliate of ICANN. It's a shame there's no information on the web crawler. Scans only index.php. Does not follow its own robots.txt rules.
2024-08-26 10:34:02+02:00
congratulated.monitoring.internet-measurement.com, lionhearted.monitoring.internet-measurement.com
Security services, simple data free
2024-02-19 11:54:02
Collects data only for its own company.
2024-07-15 21:17:53+02:00
zl-ams-nl-gp1-wk117b.internet-census.org
No imprint, no address. Collects security-relevant data!
2024-04-08 16:13:52+02:00
108-174-5-113.fwd.linkedin.com
LinkedInBot/1.0 (compatible Mozilla/5.0 Apache-HttpClient +http://www.linkedin.com)
Pages shared on social media must be scanned by the service.
2024-12-05 18:12:48
ns367083.ip-188-165-235.eu
LivelapBot/0.2 (http://site.livelap.com/crawler)
LivelapBot can visit a page if it is shared on social media, and as part of its RSS/page crawling schedule. Livelap indexes web content and makes meta data and a link to your content available in livelap.com and in the Livelap app.
2024-06-11 12:29:07+02:00
Mozilla/5.0 (compatible YaK/1.0 http://linkfluence.com/ bot@linkfluence.com)
54.39.177.173
Tracking news, social media, SEO projects
2025-02-13 08:47:28
crawl-5-102-173-71.mojeek.com
Mozilla/5.0 (compatible MojeekBot/0.11 +https://www.mojeek.com/bot.html)
Search Engine: desired, everything okay
2025-01-06 16:06:48
213.186.1.154
Mediatoolkitbot (complaints@mediatoolkit.com)
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-10-15 02:17:32+02:00
crc.metajobs.de
Mozilla/5.0 (compatible MetaJobBot https://www.metajob.de/crawler)
Search Engine for Jobs: desired, everything okay
2024-06-17 14:53:06+02:00
ec2-…-…-…-….eu-west-1.compute.amazonaws.com
Pages shared on social media must be scanned by the service.
2024-04-08 12:40:47+02:00
ns3088854.ip-217-182-175.eu
Blocked due to too frequent visits. SEO Company that only collects data for its own customers. You can only use your own data for a fee. According to its own statement, it does not store any web content or personal data. Only link relationships between websites are shown.
2024-03-30 12:17:54
static.254.9.130.94.clients.your-server.de // ip250.ip-51-68-203.eu // mail.mls20.de // mx.zvcdn.de // neuland.social // gamma.ohai.is // ns31628207.ip-57-128-95.eu // vps-a39c1e80.vps.ovh.net
Pages shared on social media must be scanned by the service.
2025-02-02 13:15:17
ip82.ip-51-77-122.eu
http.rb/5.1.1 (Mastodon/4.2.15 +https://mastodonapp.uk/) Bot
Pages shared on social media usually have to be scanned by the service, but it does not adhere to robots.txt in any way!
msnbot-xx-xxx-xxx-xxx.search.msn.com
Search Engine: desired, everything okay
2025-01-08 05:52:40
69.160.160.59
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible Nicecrawler/1.1 +http://www.nicecrawler.com/) Chrome/90.0.4430.97 Safari/537.36
Mozilla/5.0 (X11 Linux x86_64) AppleWebKit/537.36 (KHTML like Gecko) HeadlessChrome/92.0.4515.107 Safari/537.36
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/79.0.3945.130 Safari/537.36
IP address is from Intelium Corp. (US). The bot info page states „Our goal is to create an image archive of the entire Internet, changing over time, to preserve it historically.“ But I can't query this archive.
2024-06-17 15:02:41+02:00
ec2-…-…-…-….compute-1.amazonaws.com
Pages shared on social media must be scanned by the service.
2025-02-10 02:07:53
vinsanto.netestate.de, bardolino.netestate.de
81.209.177.0/24
netEstate NE Crawler (+http://www.website-datenbank.de/)
Website directory. All outgoing links are nofollow (rel="nofollow"). Does not follow robots.txt rules! Yet crawling of its pages by other search engines is prohibited in its own robots.txt file.
2024-08-18 09:22:08+02:00
crawl.xxx-xxx-xx-xxx.web.naver.com
Yeti/1.1 +https://naver.me/spd
Search Engine: desired, everything okay
2025-02-26 07:47:46
51.8.0.0/16
Mozilla/5.0 (Macintosh Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML like Gecko) Chrome/131.0.0.0 Safari/537.36 compatible OAI-SearchBot/1.0 +https://openai.com/searchbot
OAI-SearchBot is used to link to and surface websites in search results in the SearchGPT prototype and OpenAI search features. Desired; for me everything is okay. See also ChatGPT-User, GPTBot…
2025-03-10 18:47:03
114.119.128.0/19
petalbot-114-119-…-….petalsearch.com
Mozilla/5.0 (Linux Android 7.0 ) AppleWebKit/537.36 (KHTML like Gecko) Mobile Safari/537.36 (compatible PetalBot +https://webmaster.petalsearch.com/site/petalbot)
Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)
Crawler for the Petal search engine; it also presents content recommendations to users in Huawei Assistant and AI Search services. Both websites, gopetal.com and petalsearch.com, are currently unavailable! The Huawei Petal Search app is only available as an APK from the Huawei UpToDown Store. Downloads: 752759
2025-02-04 11:01:04
new1.xml-sitemaps.com // pro2.pro-sitemaps.com
Mozilla/5.0 (compatible Pro Sitemaps Generator pro-sitemaps.com) Gecko Pro-Sitemaps/1.0
Self-started / queried, everything okay. Creation of sitemap.xml / sitemap.html / sitemap.xml.gz.
2025-01-20 23:06:46
ec2-18-203-244-213.eu-west-1.compute.amazonaws.com
Pandalytics/2.0 (https://domainsbot.com/pandalytics/)
Sales and acquisitions that only collects data for its own customers. Brand monitoring, business intelligence, data provision, name suggestion
2025-01-29 03:31:18
ISP
Mozilla/5.0 AppleWebKit/537.36 (KHTML like Gecko compatible PerplexityBot/1.0 +https://perplexity.ai/perplexitybot)
AI search engine, okay for me at first
2024-03-30 12:17:55
static.100.2.21.65.clients.your-server.de
Pages shared on social media must be scanned by the service.
2024-07-20 11:24:04+02:00
198.235.24.122
Expanse a Palo Alto Networks company searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans please send IP addresses/domains to: scaninfo@paloaltonetworks.com
2024-02-15 17:04:12
ip85.215.186.x.pbiaas.com
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
No website, could not verify why data is being collected.
2024-02-21 08:38:25
204.15.208.26
Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites
2024-03-22 10:08:40
crawl-54-236-1-13.pinterest.com
Pages shared on social media must be scanned by the service.
2025-02-18 08:58:20
cp2.seokicks.de, cp3.seokicks.de
Mozilla/5.0 (compatible SEOkicks +https://www.seokicks.de/robot.html)
Backlink analysis
2024-12-15 21:42:21
srv01.handelsvertreter-netzwerk.de
88.99.110.77
star-finder.de Bot
Open Graph Search Engine: desired, everything okay
2024-10-17 12:23:39+02:00
static.88-99-92-200.clients.your-server.de
SiteCheckerBotCrawler/1.0 (+http://sitechecker.pro)
Self-started / queried, everything okay. Own website SEO.
2024-09-26 16:54:38+02:00
ec2-18-225-11-90.us-east-2.compute.amazonaws.com
Scrapy/2.5.1 (+https://scrapy.org)
Could not verify why data is being collected. Does not follow robots.txt rules. An open source and collaborative framework for extracting the data you need from websites.
2025-01-03 00:16:48+01:00
unn-185-156-…-….datapacket.com
Screaming Frog SEO Spider/21.3
SEO Company that only collects data for its own customers. There is free access for hobby users and beginners. This way you can check your own data.
2025-01-27 05:19:49
fulltextrobot-77-75-7x-xxx.seznam.cz // 77.75.76.0/24
Mozilla/5.0 (compatible SeznamBot/4.0 +https://o-seznam.cz/napoveda/vyhledavani/en/seznambot-crawler/)
Search Engine: desired, everything okay
2024-04-18 14:31:33+02:00
google-proxy-66-249-83-116.google.com
Self-started / queried, everything okay. Own website SEO.
2024-04-24 04:03:05+02:00
static.25.67.76.144.clients.your-server.de
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-02-16 16:48:35
xxx.bl.bot.semrush.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
SEO Company: Three queries are permitted free of charge each day. This way you can check your own data.
2024-06-11 11:53:32+02:00
23-239-8-56.ip.linodeusercontent.com
Pages shared on news portals must be scanned by the service.
2024-05-18 07:19:41+02:00
li965-236.members.linode.com
UNKNOWN: There is no information about the crawler.
2025-02-15 09:59:20
95.214.52.0/22
Mozilla/5.0 (compatible Timpibot/0.8 +http://www.timpi.io)
You can only use your own data for a fee!
2024-12-06 21:20:10
server-55.thumbnail.ws
Mozilla/5.0 (Windows rv:81.0) Gecko/20100101 Firefox/81.0
Thumbnails (small preview images) for website previews.
2025-01-06 16:07:48
p8n13, p11n4, p15n14, p16n20
Mozilla/5.0 (Windows NT 10.0 Win64 x64 trendictionbot0.5.0 trendiction search http://www.trendiction.de/bot please let us know of any problems web at trendiction.com) Gecko/20100101 Firefox/125.0
This bot crawls public websites, including news sites, message boards and blogs, including their comments, and collects data only for its own clients, market researchers, agencies and other web applications. Does not comply with robots.txt rules
2024-03-25 02:56:52
The t3versions bot will save the domain name of a website to a database, if the website has been identified to use TYPO3 CMS. So you don't need to scan our website.
2024-03-24 23:14:22
199.16.156.0/22
r-xxx-xx-xxx-xxx.twttr.com
Twitterbot/1.0
Pages shared on social media must be scanned by the service.
2024-03-30 12:17:56
ec2-34-241-108-107.eu-west-1.compute.amazonaws.com
Pages shared on social media must be scanned by the service.
2024-02-15 17:04:12
ip85.215.186.233.pbiaas.com
No SSL website, no idea what data is collected and for what.
2024-12-10 01:56:41
pu13.purple-umbrella.com
Mozilla/5.0 (Windows NT 6.1 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/74.0.3729.169 Safari/537.36
DNS server from Cisco, should be fine
2024-02-15 17:04:12
i5387BC22.versanet.de
Mozilla/5.0 (Macintosh Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0
There is no information about the crawler. Could not verify why data is being collected.
2024-11-14 05:01:48
85.215.99.19
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko compatible WebSuseBot/1.0 +https://www.WebSuse.de/bot.html) Chrome/70.0.3538.77 Safari/537.36
UNKNOWN: There is no information about the crawler. 1&1 IONOS Strato! As long as only the index.php is crawled, this is ok.
2024-06-17 22:38:36+02:00
de-bot.webwiki.com
Desired, everything okay
Private project: e.g. information can be found here.
2024-04-09 21:43:41+02:00
ec2-xx-xx-xx-xxx.us-west-2.compute.amazonaws.com
Mozilla/5.0 (compatible wpbot/1.0 +https://forms.gle/ajBaxygz9jSR8p8G9)
No website, no idea what data is collected and for what.
2024-04-08 17:18:36+02:00
185.169.112.225
Mozilla/5.0 (X11 Ubuntu Linux x86_64 rv:15.0) Xing Bot
Pages shared on social media must be scanned by the service.
2024-11-20 12:47:13
HOSTNAME / IP PEER
yacybot (/global SYSTEMINFO PEER java VERSION Europe/de), yacybot (/global amd64 Windows 11 10.0 java 21.0.5 Europe/de) http://yacy.net/bot.html
Search Engine: desired, everything okay. Peer to peer Search Engine Software: Information can be found here.
Host Name: dynamic-080-171-237-160.80.171.pool.telefonica.de - The crawler is operated by us. Information: YaCy Crawler
2025-02-13 17:16:20
x-xxx-xxx-xxx.spider.yandex.com
Mozilla/5.0 (compatible YandexBot/3.0 +http://yandex.com/bots)
Search Engine: desired, everything okay
2024-07-14 11:44:23+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
IP address does not match the Google user agent; see Google's IP lists: special crawlers, user-triggered fetchers, Googlebot
IP-Lookup 45.129.35.105 Name: Packethub S.A.; Country: PANAMA
2025-02-20 10:07:45
v2202411240578296028.bestsrv.de
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.9811.471 Safari/537.36
UNKNOWN: There is no information about the crawler. Looks like the RIPE Crawler
2024-05-02 10:01:36+02:00
81.91.173.172
IP address: DENIC eG, Netherlands (DENIC crawler). Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.
2025-01-20 08:18:46
Mozilla/5.0 (MSIE 10.0 Windows NT 6.1 Trident/5.0)
Mozilla/5.0 (Windows NT 6.1 WOW64 rv:33.0) Gecko/20120101 Firefox/33.0
Mozilla/5.0 (Macintosh Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML like Gecko) Version/7.0.3 Safari/7046A194A
87.120.112.0/24 - No information, no website. Classified as malicious by several portals.
2025-02-16 10:01:01
v2202411235230295174.ultrasrv.de
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.1554.306 Safari/537.36
UNKNOWN: There is no information about the crawler. Looks like the RIPE Crawler
2024-09-06 22:34:25+02:00
Mozilla/5.0 (X11 Linux x86_64) AppleWebKit/537.36 (KHTML like Gecko) HeadlessChrome/122.0.6261.94 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.
IP-Lookup 102.129.145.83 Name: Internet Utilities Africa (PTY) LTD Country: ZA
2024-09-06 09:43:20+02:00
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/126.0.0.0 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.
IP-Lookup 108.165.237.77 Name: IPXO Limited Country: US
2024-09-08 01:38:34+02:00
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/127.0.0.0 Safari/537.36
UNKNOWN: There is no information about the crawler. Could not verify why data is being collected.
IP-Lookup 119.29.53.223 Name: TencentCloud Country: HK
2024-10-04 07:18:38+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
IP address does not match the Google user agent; see Google's IP lists: special crawlers, user-triggered fetchers, Googlebot
IP-Lookup 151.248.3.19 Name: Internetbolaget; Country: SWEDEN - Change Name to internetnord.se
2024-07-14 11:44:02+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
IP address does not match the Google user agent; see Google's IP lists: special crawlers, user-triggered fetchers, Googlebot
IP-Lookup 151.248.4.7 Name: Internetbolaget; Country: SWEDEN
2025-02-14 17:14:43
152.53.0.0/16
202.61.192.0/18
v2202411235230296135.megasrv.de
v2202411235230295328.powersrv.de
v2202411240578296032.ultrasrv.de
Mozilla/5.0 (Windows NT 10.0 Win64 x64) AppleWebKit/537.36 (KHTML like Gecko) Chrome/100.0.5666.889 Safari/537.36
IP address: RIPE Network Coordination Centre, Netherlands. It's a shame there's no information on the web crawler.
2025-01-17 03:52:30
Mozilla/5.0 (compatible Googlebot/2.1)
IP address does not match the Google user agent; see Google's IP lists: special crawlers, user-triggered fetchers, Googlebot
IP-Lookup 151.248.4.7 Name: Internetbolaget; Country: SWEDEN
2024-07-14 11:44:02+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
IP address does not match the Google user agent; see Google's IP lists: special crawlers, user-triggered fetchers, Googlebot
IP-Lookup 164.90.175.221 Name: DigitalOcean; Country: US
2024-04-07 13:26:38+02:00
vm1564919.stark-industries.solution
Tools for „Speed test“, „IP address“ and much more. Normally only the home page is requested.
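Several entries above flag requests that send a Googlebot user agent from an IP address that does not belong to Google. A minimal sketch of how such a check could be automated follows, assuming the usual two-step test: the reverse DNS name must end in a Google domain, and that name must resolve back to the same IP. The suffix list and the function names are assumptions for illustration, not part of the database; the resolvers are injectable so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes, matching the Google hostnames seen in the listing
# (crawl-….googlebot.com, google-proxy-….google.com, ….bc.googleusercontent.com).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def reverse_lookup(ip: str) -> str:
    """Return the reverse DNS (PTR) hostname for an IP, or '' if none exists."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""


def forward_lookup(host: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to, or [] on failure."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def is_genuine_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Check a claimed Googlebot IP by reverse lookup plus forward confirmation."""
    host = reverse(ip).rstrip(".").lower()
    if not host.endswith(GOOGLE_SUFFIXES):
        return False  # no PTR record, or a hostname outside Google's domains
    # The PTR record alone can be forged by whoever controls the IP range,
    # so the hostname must also resolve back to the same address.
    return ip in forward(host)
```

Calling `is_genuine_googlebot(ip)` with the default resolvers performs real DNS queries, so the result depends on the network at the time of the call.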
Legend: Blocked / Everything okay / Under observation / Info
Example: +https://example.com/bot.html spider@example.com
robots.txt ⇒ Crawl-delay: 60
However, you should note that this also limits the number of pages that the search engines can index or update. A crawl delay of 60 seconds, for example, means that only 1,440 pages (86,400 seconds per day ÷ 60) can be indexed per day by each bot, spider or crawler.
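The arithmetic behind that limit can be sketched in a few lines. This assumes the crawler waits the full delay between two requests and ignores the time a request itself takes; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_pages_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on the pages one crawler can fetch per day at a given Crawl-delay."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be a positive number of seconds")
    return SECONDS_PER_DAY // crawl_delay_seconds


if __name__ == "__main__":
    for delay in (1, 10, 60):
        print(f"Crawl-delay: {delay:>2} -> at most {max_pages_per_day(delay)} pages per day")
```

The bound applies per crawler, so a site with more pages than this is refreshed over several days by any bot that honors the delay.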
Dataset has been updated
Data record was newly created
Database: CC BY-SA 4.0 DEED 🔗 Detlev Molitor // www.molitor-eu.de 🔗
The link to the trap is placed among the legitimate links, with rel="nofollow" on the link and style="display:none!important;" on its content. rel="nofollow" means that search engines should not follow this link. style="display:none!important;" means that the element is not displayed to visitors. However, some search engines do not adhere to this!
User-agent: *
Allow: /
Disallow: /evil.php
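A well-behaved crawler reads these rules and never requests the trap page, so anything that does show up in the trap has ignored robots.txt. The Python sketch below checks this with the standard library's `urllib.robotparser`. It is an illustration under two assumptions: the paths are written with a leading slash, and the Disallow line comes first, because Python's parser applies the first matching rule (other parsers use the longest match instead). `ExampleBot` and `may_fetch` are made-up names.

```python
from urllib.robotparser import RobotFileParser

# Rules as a compliant parser would see them (leading slashes, Disallow first).
ROBOTS_TXT = """\
User-agent: *
Disallow: /evil.php
Allow: /
"""

def may_fetch(path, user_agent="ExampleBot"):
    """Return True if robots.txt permits user_agent to request path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

print(may_fetch("/index.php"))  # True
print(may_fetch("/evil.php"))   # False
```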
Invisible on your page:
<p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/spinne.svg" alt="spinne" style="display:none!important;" loading="lazy" width="48" height="48"></a></p>
Or visible on your page:
<p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/awards/no-bad-bot.svg" alt="spinne" loading="lazy" width="40" height="13"></a></p>
<?php
// Headers must be sent before any output, so this block comes before the DOCTYPE.
header('HTTP/1.1 200 OK');
header('Content-Type: text/html; charset=utf-8');
ini_set('display_errors', 1);
error_reporting(E_ALL & ~E_NOTICE);
?>
<!DOCTYPE HTML>
<html>
<head>
<meta name="robots" content="none">
<?php
$path=$_SERVER['DOCUMENT_ROOT'];
$root='https://'.$_SERVER['SERVER_NAME'];
$filename=$path.'/data/bot.csv';
$herkunft = $_SERVER['HTTP_REFERER'] ?? 'false';   // referrer, or 'false' if none was sent
// gethostbyaddr() expects an IP address, so pass REMOTE_ADDR directly.
$hostname = gethostbyaddr($_SERVER['REMOTE_ADDR']);
$queryString = substr($_SERVER['QUERY_STRING'] ?? '', 0, 20);
// Names of the cookies sent with the request, truncated to 30 characters.
$cookiesSet = substr(implode("~", array_keys($_COOKIE)), 0, 30);
$cookiesSet = str_replace("~", ", ", $cookiesSet);  // "~" is reserved as the field delimiter
$var = date('Y-m-d H:i:s');                         // local time without "T" and UTC offset
$array = array(
    $var,
    $_SERVER['REMOTE_ADDR'],
    $hostname,
    $_SERVER['HTTP_USER_AGENT'] ?? '',              // the header may be missing
    $queryString,
    $herkunft,
    $cookiesSet,
    'new_line~'                                     // row marker read back by the table below
);
$evil = implode("~", $array);
// isset($cookiesSet) is always true here, so the original test never wrote anything.
// Assumed intent: log only requests that send no cookies (typical for bots).
if ($cookiesSet === '') file_put_contents($filename, $evil, FILE_APPEND | LOCK_EX);
?>
</head>
<body>
<?php
echo'<a href="'.$root.'/">HOME</a>';
if (file_exists($filename)){
echo '<table border="0" cellspacing="0" cellpadding="5" width="100%" class="csvTable">';
echo '<tr style="color:#f7f6f5; background-color:#2196F3;"><td>TIME</td><td>IP REMOTE</td><td>IP HOST</td><td>AGENT</td><td>QUERY STRING</td><td>REFERRER</td><td>COOKIES</td></tr>';
$handle = fopen($filename, 'r');
$start = 0;
// Length 0 = no line-length limit; the log is one long "~"-separated line.
while (($data = fgetcsv($handle, 0, "~")) !== FALSE)
{
echo '<tr>';
for ( $x = 0; $x < count($data); $x++)
{
if ($data[$x]=='new_line')echo'</tr><tr>';
// Logged values come from the client, so escape them before output.
else echo '<td>'.htmlspecialchars((string)$data[$x]).'</td>' . "\n";
}
$start++;
echo '</tr>' . "\n";
}
fclose($handle);
echo '</table>';
}
else echo "<p>The spider hasn't caught anything yet!</p>";
?>
</body>
</html>
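The trap script above stores each hit as "~"-separated fields (time, IP, host, agent, query string, referrer, cookies) followed by the marker `new_line~`. For offline analysis, a log in that layout could be read with a short Python sketch like the one below. The field names, function names, and the sample records are invented for illustration, and the parser assumes that no logged value itself contains a "~".

```python
from collections import Counter

# Column order as written by the trap script; the names are chosen for this sketch.
FIELDS = ("time", "ip", "host", "agent", "query", "referrer", "cookies")

def parse_trap_log(text):
    """Split the raw log text into one dict per logged request."""
    records = []
    for chunk in text.split("new_line~"):
        if chunk.endswith("~"):
            chunk = chunk[:-1]            # drop the delimiter before the marker
        values = chunk.split("~")
        if len(values) == len(FIELDS):    # skip empty or damaged records
            records.append(dict(zip(FIELDS, values)))
    return records

def hits_per_agent(records):
    """Count trap hits per user-agent string."""
    return Counter(r["agent"] for r in records)

# Invented sample data using documentation IP ranges.
SAMPLE = (
    "2025-01-17 03:52:30~192.0.2.10~crawler.example.net~ExampleBot/1.0~~false~~new_line~"
    "2025-01-17 03:55:01~192.0.2.11~other.example.net~ExampleBot/1.0~id=1~false~~new_line~"
    "2025-01-17 04:00:00~192.0.2.12~scan.example.org~OtherBot/2.0~~false~~new_line~"
)

for agent, count in hits_per_agent(parse_trap_log(SAMPLE)).most_common():
    print(agent, count)
```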
On all pages:
<?php
// This block must run before any output (including the DOCTYPE) so the headers can still be sent.
if(!empty($_SERVER['HTTP_USER_AGENT']) and preg_match('/Mb2345Browser|peer39_crawler|dataprovider|Dmbot|Grapeshot|IonCrawl|URLSuMaBot|Semrush|LieBaoFast|zh-CN|MicroMessenger|zh_CN|Kinza|MJ12bot|AhrefsBot|Bytespider/i',$_SERVER['HTTP_USER_AGENT'])) {
header('HTTP/1.0 403 Forbidden');
header('X-Robots-Tag: none'); // must come before die(), which ends the script
die('<h1>Error 403 Forbidden</h1><h2><a href="/evil.php">Spider Trap - Spinnenfalle</a></h2><p>[EN] Access not allowed</p><p>[DE] Zugriff nicht erlaubt</p><p>[DA] Adgang ikke tilladt</p>');
}else header('HTTP/1.1 200 OK');
?>
<!DOCTYPE HTML>
<html>
<head>
…
</head>
# ban code [USER AGENT]
<IfModule mod_rewrite.c>
# Needed once per .htaccess; harmless if it is already set elsewhere
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (base64_decode|eval) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler|paloaltonetworks|dataprovider|Dmbot|Grapeshot) [NC]
RewriteRule .* - [F]
</IfModule>
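Before deploying a rule like this, it can help to test the pattern against a few real user-agent strings. The Python sketch below mirrors the second RewriteCond: an unanchored, case-insensitive alternation, which is what the [NC] flag does. It only approximates the Apache behavior for testing purposes; `BLOCKED_AGENTS` and `is_blocked` are made-up names, and the MJ12bot string is a typical example rather than a logged value.

```python
import re

# Same alternatives as the RewriteCond above; IGNORECASE plays the role of [NC].
BLOCKED_AGENTS = re.compile(
    r"DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler"
    r"|paloaltonetworks|dataprovider|Dmbot|Grapeshot",
    re.IGNORECASE,
)

def is_blocked(user_agent):
    """Return True if the user-agent string contains one of the banned names."""
    return bool(user_agent) and BLOCKED_AGENTS.search(user_agent) is not None

print(is_blocked("Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"))  # True
print(is_blocked("Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)"))  # False
```

Because the match is a plain substring search, short names such as `wpbot` also match inside longer agent names, so new entries are worth checking against regular browser strings first.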
Some user agents [HTTP_USER_AGENT] are automatically blocked by the 8G firewall:
base64_decode || bin/bash || disconnect || eval || unserializ || ahrefs || archiver || curl || libwww-perl || pycurl || scan || wget || acapbot || acoonbot || alexibot || asterias || attackbot || backdorbot || becomebot || binlar || blackwidow || blekkobot || blexbot || blowfish || bullseye || bunnys || butterfly || careerbot || casper || checkpriv || cheesebot || cherrypick || chinaclaw || choppy || clshttp || cmsworld || copernic || copyrightcheck || cosmos || crescent || cy_cho || datacha || demon || diavol || discobot || dittospyder || dotbot || dotnetdotcom || dumbot || econtext || emailcollector || emailsiphon || emailwolf || eolasbot || eventures || extract || eyenetie || feedfinder || flaming || flashget || flicky || foobot || fuck || g00g1e || getright || gigabot || go-ahead-got || gozilla || grabnet || grafula || harvest || heritrix || httracks? || icarus6j || jetbot || jetcar || jikespider || kmccrew || leechftp || libweb || liebaofast || linkscan || linkwalker || loader || lwp-download || majestic || masscan || miner || mechanize || mj12bot || morfeus || moveoverbot || netmechanic || netspider || nicerspro || nikto || ninja || nominet || nutch || octopus || pagegrabber || petalbot || planetwork || postrank || proximic || purebot || queryn || queryseeker || radian6 || radiation || realdownload || remoteview || rogerbot || scan || scooter || seekerspid || semalt || siclab || sindice || sistrix || sitebot || siteexplorer || sitesnagger || skygrid || smartdownload || snoopy || sosospider || spankbot || spbot || sqlmap || stackrambler || stripper || sucker || surftbot || sux0r || suzukacz || suzuran || takeout || teleport || telesoft || true_robots || turingos || turnit || vampire || vikspider || voideye || webleacher || webreaper || webstripper || webvac || webviewer || webwhacker || winhttp || wwwoffle || woxbot || xaldon || xxxyy || yamanalab || yioopbot || youda || zeus || zmeu || zune || zyborg
Right click ➛ save image as…
Icons (images): No Bad-Bot | Spider | empty. Unicode® HTML-code symbols: 🕷 | 🕸
(●) All information is provided without guarantee.