

Method 1: Host your domain's DNS on Cloudflare and block AI crawlers with one click

If you can't reach Cloudflare, you'll need to sort out a proxy yourself.
(For domains in China this barely affects access speed; some people assume a domestic DNS provider is faster, but in practice the difference is negligible.)

Method 2: Block AI crawlers in the BT Panel (宝塔) firewall by adding the User-Agents below to the blocklist (I'm running a cracked copy of BT Panel; I don't know whether the free edition supports this)

Amazonbot
ClaudeBot
PetalBot
gptbot
Ahrefs
Semrush
Imagesift
Teoma
ia_archiver
twiceler
MSNBot
Scrubby
Robozilla
Gigabot
yahoo-mmcrawler
yahoo-blogs/v3.9
psbot
Scrapy
SemrushBot
AhrefsBot
Applebot
AspiegelBot
DotBot
DataForSeoBot
java
MJ12bot
python
seo
Censys
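Conceptually, a UA firewall like this treats each entry above as a case-insensitive substring of the request's User-Agent header. A minimal sketch of that idea (illustrative only, not the BT panel's actual implementation):

```python
# Illustrative sketch only -- not the BT panel's real code.
# Each blocklist entry is matched as a case-insensitive substring
# of the incoming request's User-Agent header.
BLOCKED_UA_KEYWORDS = [
    "Amazonbot", "ClaudeBot", "PetalBot", "gptbot", "Ahrefs", "Semrush",
    "Scrapy", "MJ12bot", "DotBot", "python", "Censys",
]

def is_blocked(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(keyword.lower() in ua for keyword in BLOCKED_UA_KEYWORDS)

print(is_blocked("Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2)"))  # True
print(is_blocked("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"))   # False
```

Note that substring matching is deliberately loose: the entry "python" also catches any UA that merely mentions python, which is exactly why the good-spider list later in this post matters.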




Method 3: Copy the code below, save it as robots.txt, and upload it to your site's root directory. (Entries with "Disallow: /" block that bot entirely; entries with an empty "Disallow:" deliberately leave the major search-engine spiders free to crawl.)

User-agent: Ahrefs
Disallow: /
User-agent: Semrush
Disallow: /
User-agent: Imagesift
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: gptbot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PetalBot
Disallow: /
User-agent: Baiduspider
Disallow: 
User-agent: Sosospider
Disallow: 
User-agent: sogou spider
Disallow: 
User-agent: YodaoBot
Disallow: 
User-agent: Googlebot
Disallow: 
User-agent: Bingbot
Disallow: 
User-agent: Slurp
Disallow: 
User-agent: Teoma
Disallow: /
User-agent: ia_archiver
Disallow: /
User-agent: twiceler
Disallow: /
User-agent: MSNBot
Disallow: /
User-agent: Scrubby
Disallow: /
User-agent: Robozilla
Disallow: /
User-agent: Gigabot
Disallow: /
User-agent: googlebot-image
Disallow: 
User-agent: googlebot-mobile
Disallow: 
User-agent: yahoo-mmcrawler
Disallow: /
User-agent: yahoo-blogs/v3.9
Disallow: /
User-agent: psbot
Disallow: 
User-agent: dotbot
Disallow: /
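Before uploading, you can sanity-check the rules offline with Python's standard urllib.robotparser. A small sketch using an excerpt of the file above:

```python
from urllib import robotparser

# Excerpt of the robots.txt above: gptbot is fully blocked,
# Baiduspider has an empty Disallow (which means "allow everything").
robots_txt = """\
User-agent: gptbot
Disallow: /

User-agent: Baiduspider
Disallow:
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("gptbot", "/any/page.html"))       # False: disallowed
print(rp.can_fetch("Baiduspider", "/any/page.html"))  # True: empty Disallow allows all
```

Keep in mind robots.txt is only advisory: well-behaved crawlers honor it, but nothing forces them to, which is why Method 4 enforces blocking at the server level.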



Method 4: Prevent your site from being scraped (paste the code below into the site's nginx configuration file in the BT panel)

# Block scraping by Scrapy and similar tools
if ($http_user_agent ~* (Scrapy|Curl|HttpClient|crawl|curb|git|Wtrace)) {
    return 403;
}

# Block the listed UAs and requests with an empty UA
if ($http_user_agent ~* "CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|SMTBot|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|Jullo|Feedly|YandexBot|oBot|FlightDeckReports|Linguee Bot|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms|^$" ) {
 
     return 403;
 
}

# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}


After adding the rules, save and restart nginx. These spiders and scanning tools will then receive 403 Forbidden when they probe the site.
Note: if you publish to your site with the LocoySpider (火车头) collector, the code above will make publishing fail with a 403 error. If you want to keep publishing with LocoySpider, use the code below instead:

# Block scraping by Scrapy and similar tools
if ($http_user_agent ~* (Scrapy|Curl|HttpClient|crawl|curb|git|Wtrace)) {
    return 403;
}

# Block the listed UAs (unlike the rule above, the empty-UA match ^$ is omitted so LocoySpider can still publish)
if ($http_user_agent ~* "CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|SMTBot|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|Jullo|Feedly|YandexBot|oBot|FlightDeckReports|Linguee Bot|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms") {
 
     return 403;
 
}

# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}

Once everything is configured, simulate a few crawler requests to make sure no good spiders were caught by mistake. Note: the blocklist above does not include these six common spiders:

Baidu: Baiduspider
Google: Googlebot
Bing: bingbot
Sogou: Sogou web spider
360: 360Spider
Shenma: YisouSpider

Common crawler User-Agents and what they are used for:
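A simulated crawl doesn't have to touch the live server: nginx's ~* operator is just a case-insensitive regex match, so you can test a shortened excerpt of the UA pattern above offline (illustrative sketch, not the exact production rule):

```python
import re

# Shortened excerpt of the nginx UA blocklist above; ~* means case-insensitive
blocked = re.compile(
    r"Scrapy|Curl|HttpClient|CCBot|SemrushBot|MJ12bot|AhrefsBot|Python-urllib|ApacheBench",
    re.IGNORECASE,
)

def simulated_status(user_agent: str) -> int:
    """HTTP status the nginx rules above would return for this User-Agent."""
    return 403 if blocked.search(user_agent) else 200

# Bad bots are rejected...
print(simulated_status("Mozilla/5.0 (compatible; AhrefsBot/7.0)"))    # 403
# ...while the good spiders pass through.
print(simulated_status("Mozilla/5.0 (compatible; Baiduspider/2.0)"))  # 200
print(simulated_status("Mozilla/5.0 (compatible; Googlebot/2.1)"))    # 200
```

If you'd rather test the real server, a curl request with a spoofed User-Agent header against your own domain will show the same 403/200 split.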

FeedDemon             content scraping
BOT/0.1 (BOT for JCE) SQL injection
CrawlDaddy            SQL injection
Java                  content scraping
Jullo                 content scraping
Feedly                content scraping
UniversalFeedParser   content scraping
ApacheBench           CC attack tool
Swiftbot              useless crawler
YandexBot             useless crawler
AhrefsBot             useless crawler
jikeSpider            useless crawler
MJ12bot               useless crawler
ZmEu                  phpMyAdmin vulnerability scanning
WinHttp               scraping / CC attacks
EasouSpider           useless crawler
HttpClient            TCP attacks
Microsoft URL Control scanning
YYSpider              useless crawler
jaunty                WordPress brute-force scanner
oBot                  useless crawler
Python-urllib         content scraping
Indy Library          scanning
FlightDeckReports Bot useless crawler
Linguee Bot           useless crawler