2024-05-11 03:49:18+02:00
164.90.225.193
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
17-241-xxx-xxx.applebot.apple.com
Search Engine: desired, everything okay
2024-03-30 12:17:55
loquat.unboiled.info // vps-aefd39af.vps.ovh.net
Pages shared on social media must be scanned by the service.
2024-05-05 22:53:17+02:00
pot34.webmeup.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
AWS (Amazon Web Services) is used by many; see the published AWS IP address ranges.
2024-04-17 01:41:40+02:00
xxx.xx.xx.xx
Crawler for the Chinese search engine Baidu: desired, everything okay
msnbot-xx-xxx-xxx-xxx.search.msn.com
Search Engine: desired, everything okay
2024-07-24 13:56:41+02:00
185.170.167.18
UNKNOWN: There is no information about the crawler. (MNT By: Semrush_Net)
2024-03-27 23:21:41
154.54.249.162
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-03-24 21:29:01
ec2-xx-xxx-xx-xxx.ap-southeast-1.compute.amazonaws.com
ByteDance, products and services such as TikTok, CapCut, TikTok Shop, Lark, Pico. Could not verify why data is being collected.
2024-04-01 06:42:42+02:00
ninja-crawler96.webmeup.com
Free backlink checker / crawler, by SEO SpyGlass, Data for Free
2024-07-17 14:33:41+02:00
ec2-18-226-186-157.us-east-2.compute.amazonaws.com
Could not verify why data is being collected
2024-07-12 21:35:22+02:00
ec2-44-215-105-52.compute-1.amazonaws.com
Advertising: Tracking websites and customers. „Data intelligence company that works with leading marketers to anonymously identify and segment audiences based on their digital behavior in real time.“
2024-04-03 03:42:33+02:00
ec2-….compute-1.amazonaws.com
Common Crawl is a 501(c)(3) nonprofit organization whose mission is to provide Internet researchers, companies, and individuals with a copy of the Internet, free of charge, for research and analysis purposes. Does not follow its own robots.txt rules
2024-03-21 16:56:14
Self-started / queried, everything okay
2024-04-20 09:49:54+02:00
ec2-3-129-45-92.us-east-2.compute.amazonaws.com
AI agent from Anthropic, could not verify why data is being collected.
2024-04-23 13:22:49+02:00
ec2-52-90-218-67.compute-1.amazonaws.com
Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“
2024-03-11 23:11:04
crawl-149-56-160-221.dataproviderbot.com
Paid search engine. Speaks against a free internet.
2024-06-24 16:17:29+02:00
bot.domainstats.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-04-16 19:19:52+02:00
crawling-gateway-136-243-228-179.dataforseo.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-05-02 10:01:36+02:00
81.91.173.172
IP address: DENIC eG, Netherlands. Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.
2024-03-07 03:15:59
UNKNOWN: There is no information about the crawler.
2024-02-22 16:48:03
Test Bot / Crawler from me
2024-03-22 05:38:31
20.191.45.212 | 40.88.21.235
Search Engine: desired, everything okay
2024-04-04 17:27:09+02:00
193.222.96.142
The domain name EVERYFEED.COM is for sale.
2024-04-25 11:16:15+02:00
holmavik.core.headline.com
No information, could not verify why data is being collected.
filiBot is the generic name for Fili's SEO crawler - Desired, everything okay. Own website SEO.
2024-06-11 11:42:17+02:00
anonsys.net
Pages shared on social media must be scanned by the service.
2024-03-30 12:17:55
ec2.....compute-1.amazonaws.com
Is a content discovery app that indexes news web content.
facebookexternalhit, facebookcatalog
Pages shared on social media must be scanned by the service.
2024-05-10 18:23:37+02:00
Friendly_Crawler/Nutch-1.20-SNAPSHOT // FriendlyCrawler/1.0
...us-west-2.compute.amazonaws.com
No information, could not verify why data is being collected. (Two crawlers: Note underscore)
2024-06-24 07:48:38+02:00
164.92.220.26
Search for hashtags of trending topics and articles on Mastodon. It's a shame there's no information on the web crawler.
2024-03-20 13:17:32
product-search-83-99-151-68.geedo.com
Scans online stores to find products.
crawl-xx-xxx-xx-x.googlebot.com
Search Engine: desired, everything okay
crawl-xx-xxx-xx-x.googlebot.com
Search Engine: desired, everything okay
2024-03-25 14:32:59
Self-started / queried, everything okay. Own website SEO.
2024-06-17 14:50:25+02:00
rate-limited-proxy-66-249-92-23.google.com
Self-started / queried, everything okay. Link Check for YouTube.
2024-04-18 14:31:37+02:00
google-proxy-66-249-83-118.google.com
Self-started / queried, everything okay. Own website SEO.
2024-02-21 08:38:25
152.67.137.35
Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites
2024-05-11 03:31:49+02:00
….….….…
GPTBot is OpenAI's web crawler: desired, for me everything okay.
2024-07-26 00:23:25+02:00
congratulated.monitoring.internet-measurement.com
Security service; basic data is available free of charge.
2024-02-19 11:54:02
Collects data only for its own company.
2024-07-15 21:17:53+02:00
zl-ams-nl-gp1-wk117b.internet-census.org
No imprint, no address. Collects security-relevant data!
2024-04-08 16:13:52+02:00
108-174-5-113.fwd.linkedin.com
Pages shared on social media must be scanned by the service.
2024-03-30 12:17:55
Is a content discovery app that indexes news web content.
2024-06-11 12:29:07+02:00
54.39.177.173
Tracking news, social media, SEO projects
2024-06-17 14:53:06+02:00
ec2-…-…-…-….eu-west-1.compute.amazonaws.com
Pages shared on social media must be scanned by the service.
2024-04-08 12:40:47+02:00
ns3088854.ip-217-182-175.eu
Blocked due to too frequent visits. SEO Company that only collects data for its own customers. You can only use your own data for a fee. According to its own statement, it does not store any web content or personal data. Only link relationships between websites are shown.
2024-03-30 12:17:54
static.254.9.130.94.clients.your-server.de // ip250.ip-51-68-203.eu // mail.mls20.de // mx.zvcdn.de // neuland.social // gamma.ohai.is // ns31628207.ip-57-128-95.eu // vps-a39c1e80.vps.ovh.net
Pages shared on social media must be scanned by the service.
msnbot-xx-xxx-xxx-xxx.search.msn.com
Search Engine: desired, everything okay
2024-06-17 15:02:41+02:00
ec2-…-…-…-….compute-1.amazonaws.com
Pages shared on social media must be scanned by the service.
2024-06-08 00:06:04+02:00
81.209.177…
Website directory. Does not follow rules!
2024-03-30 12:17:55
static.100.2.21.65.clients.your-server.de
Pages shared on social media must be scanned by the service.
2024-07-20 11:24:04+02:00
198.235.24.122
Expanse a Palo Alto Networks company searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans please send IP addresses/domains to: scaninfo@paloaltonetworks.com
2024-02-15 17:04:12
ip85.215.186.233.pbiaas.com
No website, could not verify why data is being collected.
2024-02-21 08:38:25
204.15.208.26
Advertising platform supports advertisers in placing contextual advertising and affiliate programs on websites
2024-03-22 10:08:40
crawl-54-236-1-13.pinterest.com
Pages shared on social media must be scanned by the service.
2024-04-23 13:22:49+02:00
ec2-52-90-218-67.compute-1.amazonaws.com
Advertising: Tracking websites and customers. „Comscore's contextual content analysis enables advertising partners to determine the best matching campaign for a page's content.“
2024-06-17 09:01:45+02:00
unn-185-156-…-….datapacket.com
SEO Company that only collects data for its own customers. There is free access for hobby users and beginners. This way you can check your own data.
fulltextrobot-77-75-78-164.seznam.cz // 74.114.154.xxx
Search Engine: desired, everything okay
2024-04-18 14:31:33+02:00
google-proxy-66-249-83-116.google.com
Self-started / queried, everything okay. Own website SEO.
2024-04-24 04:03:05+02:00
static.25.67.76.144.clients.your-server.de
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
2024-02-16 16:48:35
xxx.bl.bot.semrush.com
SEO Company that only collects data for its own customers. You can only use your own data for a fee.
SEO Company that only collects data for its own customers. There is free access for hobby users and beginners. This way you can check your own data.
2024-06-11 11:53:32+02:00
23-239-8-56.ip.linodeusercontent.com
Pages shared on the news portal must be scanned by the service.
2024-05-18 07:19:41+02:00
li965-236.members.linode.com
UNKNOWN: There is no information about the crawler.
2024-03-30 12:18:20
p15n14
Tracking news, social media, SEO projects
2024-03-25 02:56:52
The t3versions bot will save the domain name of a website to a database, if the website has been identified to use TYPO3 CMS. So you don't need to scan our website.
2024-03-24 23:14:22
r-xxx-xx-xxx-xxx.twttr.com
Pages shared on social media must be scanned by the service.
2024-03-30 12:17:56
ec2-34-241-108-107.eu-west-1.compute.amazonaws.com
Pages shared on social media must be scanned by the service.
2024-02-15 17:04:12
ip85.215.186.233.pbiaas.com
No SSL website, no idea what data is collected and for what.
2024-06-17 22:38:36+02:00
de-bot.webwiki.com
Desired, everything okay
Private project: Information can be found here.
2024-04-09 21:43:41+02:00
ec2-xx-xx-xx-xxx.us-west-2.compute.amazonaws.com
Mozilla/5.0 (compatible wpbot/1.0 +https://forms.gle/ajBaxygz9jSR8p8G9)
No website, no idea what data is collected and for what.
2024-04-08 17:18:36+02:00
185.169.xxx.225
Pages shared on social media must be scanned by the service.
2024-04-30 12:54:23+02:00
new1.xml-sitemaps.com // pro….pro-sitemaps.com
Self-started / queried, everything okay. Creation of sitemap.xml / sitemap.html / sitemap.xml.gz.
Search Engine: desired, everything okay
Search Engine Software: Information can be found here.
x-xxx-xxx-xxx.spider.yandex.com
Search Engine: desired, everything okay
2024-05-02 10:01:36+02:00
81.91.173.172
IP address: DENIC eG, Netherlands (DENIC-Crawler). Operation and management of the top-level domain .de. It's a shame there's no information on the web crawler.
2024-07-14 11:44:23+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
The IP address does not match the Google user agent; see Google's IP ranges for special crawlers, user-triggered fetchers, and Googlebot.
IP-Lookup 45.129.35.105 Name: Packethub S.A.; Country: PANAMA
2024-07-14 11:44:02+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
The IP address does not match the Google user agent; see Google's IP ranges for special crawlers, user-triggered fetchers, and Googlebot.
IP-Lookup 151.248.4.7 Name: Internetbolaget; Country: SWEDEN
2024-07-14 11:44:02+02:00
Mozilla/5.0 (compatible Googlebot/2.1 +http://www.google.com/bot.html)
The IP address does not match the Google user agent; see Google's IP ranges for special crawlers, user-triggered fetchers, and Googlebot.
IP-Lookup 194.110.115.43 Name: M247 Brussels NOC; Country: BELGIUM
2024-02-19 11:54:02
Collects data only for its own company.
2024-04-07 13:26:38+02:00
vm1564919.stark-industries.solution
Tools for „Speed test“, „IP address“ and much more. Normally only the home page is requested.
Legend: Blocked | Everything okay | Under observation | Info
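Entries in which a Googlebot user agent arrives from an IP address that does not belong to Google can be checked with a reverse DNS lookup followed by a forward lookup. The following is a minimal sketch in Python, not the method used to build the list above. The function names are made up for illustration, and the accepted hostname suffixes are an assumption based on Google's published guidance for verifying its crawlers. `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Hostname suffixes Google documents for its crawlers (assumption, check current docs).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")

def is_google_hostname(hostname: str) -> bool:
    """True if the reverse-DNS name ends in one of the Google crawler domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    """Reverse lookup, suffix check, then the forward lookup must return the same IP."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
        if not is_google_hostname(hostname):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except OSError:  # lookup failed, treat as not verified
        return False
    return ip in forward_ips

# The suffix check alone needs no network access:
print(is_google_hostname("google-proxy-66-249-83-118.google.com"))
print(is_google_hostname("vm1564919.stark-industries.solution"))
```

`verify_googlebot("45.129.35.105")` would perform the two lookups live; the result depends on the DNS data at the time of the query.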
Example: +https://example.com/bot.html spider@example.com
robots.txt ⇒ Crawl-delay: 60
However, you should note that this also limits the number of pages that the search engines can index or update. A crawl delay of 60 seconds, for example, means that each bot, spider or crawler can fetch at most 1,440 pages per day (86,400 seconds per day divided by 60).
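The arithmetic behind that limit is the number of seconds in a day divided by the delay. A minimal sketch in Python; the function name `max_pages_per_day` is made up for illustration, and it assumes that the bot honours Crawl-delay and sends one request per interval:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def max_pages_per_day(crawl_delay_seconds: int) -> int:
    """Upper bound on the pages one bot can fetch per day at the given Crawl-delay."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be positive")
    return SECONDS_PER_DAY // crawl_delay_seconds

print(max_pages_per_day(60))  # 1440
print(max_pages_per_day(10))  # 8640
```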
In the links placed on the page, the HTML code contains rel="nofollow" and the link content contains style="display:none!important;". rel="nofollow" means that search engines should not follow this link. style="display:none!important;" means that the element is not displayed. However, some search engines do not comply with this!
User-agent: *
Allow: /
Disallow: /evil.php
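Whether a rule really covers the trap URL can be checked offline before it goes live. A minimal sketch using Python's standard `urllib.robotparser`, assuming the rule is written with a leading slash (`/evil.php`) and using `example.com` as a placeholder domain. The helper name `is_allowed` is made up for illustration. Python's parser applies the first matching rule, so it may disagree with crawlers that pick the most specific match.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether the agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

RULES = """\
User-agent: *
Disallow: /evil.php
"""

print(is_allowed(RULES, "https://example.com/evil.php"))    # False
print(is_allowed(RULES, "https://example.com/index.html"))  # True
```

A well-behaved crawler therefore never requests the trap, so every hit logged there comes from a client that ignored the rule.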
Invisible variant: this does not show on your page
<p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/spinne.svg" alt="spinne" style="display:none!important;" loading="lazy" width="60" height="16"></a></p>
OR the visible variant: this shows on your page
<p><a href="evil.php" rel="nofollow"><img src="//novis-itineribus.de/graphics/icon/awards/no-bad-bot.svg" alt="spinne" loading="lazy" width="16" height="16"></a></p>
<?php
// Headers can only be sent before any output, so this block comes before the doctype.
header('HTTP/1.1 200 OK');
header('Content-Type: text/html; charset=utf-8');
ini_set('display_errors', 1);
error_reporting(E_ALL & ~E_NOTICE);
?>
<!DOCTYPE HTML>
<html>
<head>
<meta name="robots" content="none">
<?php
// Log file location and site root (used for the HOME link).
$path=$_SERVER['DOCUMENT_ROOT'];
$root='https://'.$_SERVER['SERVER_NAME'];
$filename=$path.'/data/bot.csv';
// Referrer ("Herkunft"), if the client sent one.
$herkunft=$_SERVER['HTTP_REFERER'] ?? 'false';
// Reverse DNS lookup; gethostbyaddr() expects an IP address, not a host name.
$hostname = gethostbyaddr($_SERVER['REMOTE_ADDR']);
// Cookie names and query string, truncated; "~" is reserved as the field separator.
$cookiesSet = implode("~", array_keys($_COOKIE));
$queryString=substr($_SERVER['QUERY_STRING'] ?? '', 0, 20);
$cookiesSet=substr($cookiesSet, 0, 30);
$cookiesSet=str_replace("~", "<br>", $cookiesSet);
// Timestamp without the "T" separator and without the UTC offset.
$var=date(DATE_ATOM); $var=str_replace('T', ' ',$var); $var=str_replace('+01:00', '',$var);$var=str_replace('+02:00', '',$var);
$array = array (
$var ,
$_SERVER['REMOTE_ADDR'] ,
$hostname ,
$_SERVER['HTTP_USER_AGENT'] ?? '' ,
$queryString ,
$herkunft ,
$cookiesSet ,
'new_line~'
);
$evil=implode("~", $array);
// Log only requests that carry no cookies; $cookiesSet is always set, so test for an empty string.
if ($cookiesSet === '')file_put_contents($filename, $evil , FILE_APPEND );
?>
</head>
<body>
<?php
echo'<a href="'.$root.'/">HOME</a>';
if (file_exists($filename)){
echo '<table border="0" cellspacing="0" cellpadding="5" width="100%" class="csvTable">';
echo '<tr style="color:#f7f6f5; background-color:#2196F3;"><td>TIME</td><td>IP REMOTE</td><td>IP HOST</td><td>AGENT</td><td>QUERY STRING</td><td>REFERRER</td><td>COOKIES</td></tr>';
$handle = fopen($filename, 'r');
while (($data = fgetcsv($handle, 1000, "~")) !== FALSE)
{
echo '<tr>';
for ( $x = 0; $x < count($data); $x++)
{
// "new_line" marks the end of one logged request.
if ($data[$x]=='new_line')echo'</tr><tr>';
// Escape logged values: user agent, referrer and query string come from the client.
else echo '<td>'.htmlspecialchars($data[$x]).'</td>' . "\n";
}
echo '</tr>' . "\n";
}
fclose($handle);
echo '</table>';
}
else echo'<p>The spider hasn\'t caught anything yet!</p>';
?>
</body>
</html>
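The log written by evil.php can also be evaluated outside the browser. A minimal sketch in Python, assuming the record layout produced by the script above: seven fields separated by "~" and terminated by the marker "new_line~". The field names, the function name `parse_log` and the sample records are made up for illustration.

```python
from collections import Counter

# Field order as written by evil.php (names are illustrative).
FIELDS = ("time", "ip", "hostname", "agent", "query", "referrer", "cookies")

def parse_log(text: str) -> list:
    """Split the raw log into records; malformed records are skipped."""
    records = []
    for chunk in text.split("~new_line~"):
        if not chunk.strip():
            continue
        values = chunk.split("~")
        if len(values) != len(FIELDS):
            continue
        records.append(dict(zip(FIELDS, values)))
    return records

# Two hypothetical records using documentation IP addresses.
SAMPLE = (
    "2024-05-11 03:49:18~192.0.2.1~crawler.example.net~ExampleBot/1.0~~false~~new_line~"
    "2024-05-11 03:50:02~192.0.2.1~crawler.example.net~ExampleBot/1.0~page=2~false~~new_line~"
)

records = parse_log(SAMPLE)
hits_per_agent = Counter(record["agent"] for record in records)
print(hits_per_agent.most_common())  # [('ExampleBot/1.0', 2)]
```

A user agent that itself contains "~" would shift the fields, which is why such records are skipped here instead of guessed at.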
On all Pages
<?php
// Must run before any output (including the doctype), otherwise the headers cannot be sent.
if(!empty($_SERVER['HTTP_USER_AGENT']) and preg_match('/Mb2345Browser|peer39_crawler|dataprovider|Dmbot|Grapeshot|IonCrawl|URLSuMaBot|Semrush|LieBaoFast|zh-CN|MicroMessenger|zh_CN|Kinza|MJ12bot|AhrefsBot|Bytespider/i',$_SERVER['HTTP_USER_AGENT'])) {
header('HTTP/1.0 403 Forbidden');
// Send this header before die(); nothing after die() is executed.
header('X-Robots-Tag: none');
die('<h1>Error 403 Forbidden</h1><h2><a href="/evil.php">Spider Trap - Spinnenfalle</a></h2><p>[EN] Access not allowed</p><p>[DE] Zugriff nicht erlaubt</p><p>[DA] Adgang ikke tilladt</p>');
}else header('HTTP/1.1 200 OK');
?>
<!DOCTYPE HTML>
<html>
<head>
...
</head>
# ban code [USER AGENT]
<IfModule mod_rewrite.c>
RewriteCond %{HTTP_USER_AGENT} (base64_decode|eval) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler|paloaltonetworks|dataprovider|Dmbot|Grapeshot) [NC]
RewriteRule .* - [F]
</IfModule>
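Before a pattern like this goes live, it can be tried against known user agent strings to make sure it blocks what it should and nothing else. A minimal sketch in Python, assuming the same alternation as the second RewriteCond above, matched as a case-insensitive substring search like [NC]. The name `is_blocked` is made up, and the MJ12bot and Googlebot strings are illustrative samples.

```python
import re

# Same alternation as the RewriteCond; re.IGNORECASE mirrors the [NC] flag.
BLOCKED_AGENTS = re.compile(
    r"(DataForSeoBot|MJ12bot|wpbot|Friendly_Crawler|Bytespider|peer39_crawler"
    r"|paloaltonetworks|dataprovider|Dmbot|Grapeshot)",
    re.IGNORECASE,
)

def is_blocked(user_agent: str) -> bool:
    """True if the user agent contains one of the blocked names."""
    return BLOCKED_AGENTS.search(user_agent) is not None

for agent in (
    "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)",
    "Friendly_Crawler/Nutch-1.20-SNAPSHOT",
    "FriendlyCrawler/1.0",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
):
    print(is_blocked(agent), agent)
```

The third sample shows why the underscore matters: `FriendlyCrawler/1.0` is not caught by `Friendly_Crawler` and would need its own entry.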
Some [HTTP_USER_AGENT] are automatically blocked by the 8G firewall
base64_decode || bin/bash || disconnect || eval || unserializ || ahrefs || archiver || curl || libwww-perl || pycurl || scan || wget || acapbot || acoonbot || alexibot || asterias || attackbot || backdorbot || becomebot || binlar || blackwidow || blekkobot || blexbot || blowfish || bullseye || bunnys || butterfly || careerbot || casper || checkpriv || cheesebot || cherrypick || chinaclaw || choppy || clshttp || cmsworld || copernic || copyrightcheck || cosmos || crescent || cy_cho || datacha || demon || diavol || discobot || dittospyder || dotbot || dotnetdotcom || dumbot || econtext || emailcollector || emailsiphon || emailwolf || eolasbot || eventures || extract || eyenetie || feedfinder || flaming || flashget || flicky || foobot || fuck || g00g1e || getright || gigabot || go-ahead-got || gozilla || grabnet || grafula || harvest || heritrix || httracks? || icarus6j || jetbot || jetcar || jikespider || kmccrew || leechftp || libweb || liebaofast || linkscan || linkwalker || loader || lwp-download || majestic || masscan || miner || mechanize || mj12bot || morfeus || moveoverbot || netmechanic || netspider || nicerspro || nikto || ninja || nominet || nutch || octopus || pagegrabber || petalbot || planetwork || postrank || proximic || purebot || queryn || queryseeker || radian6 || radiation || realdownload || remoteview || rogerbot || scan || scooter || seekerspid || semalt || siclab || sindice || sistrix || sitebot || siteexplorer || sitesnagger || skygrid || smartdownload || snoopy || sosospider || spankbot || spbot || sqlmap || stackrambler || stripper || sucker || surftbot || sux0r || suzukacz || suzuran || takeout || teleport || telesoft || true_robots || turingos || turnit || vampire || vikspider || voideye || webleacher || webreaper || webstripper || webvac || webviewer || webwhacker || winhttp || wwwoffle || woxbot || xaldon || xxxyy || yamanalab || yioopbot || youda || zeus || zmeu || zune || zyborg
Badges for download (right click ➛ save image as...): No Bad-Bot | Spider | empty
Unicode® HTML-Code Symbol: 🕷 | 🕸