Series collection: https://bbs.ichunqiu.com/forum.php?mod=collection&action=view&ctid=137
0x00 Preface
URL collection is also a very important part of batch vulnerability hunting.
0x01 Contents
0x00 Preface
0x02 Writing a ZoomEye API script
0x03 Writing a Shodan API script
0x04 Writing a simple Baidu URL collection script
0x05 [Bonus] A forum auto sign-in script
0x02 Writing a ZoomEye API Script
ZoomEye is a search engine for cyberspace: it indexes the devices and websites on the Internet along with the services and components they run.
ZoomEye has two probing engines, Xmap and Wmap, aimed at network devices and websites respectively. Through round-the-clock probing and fingerprinting it identifies the services and components in use across the Internet, which lets researchers gauge how widely a component is deployed and how far a vulnerability's impact reaches.
Although it is often called a "hacker-friendly" search engine, ZoomEye does not actively attack devices or websites, and the data it collects is used only for security research. Think of it as a nautical chart of cyberspace.
The ZoomEye API documentation lives here: ZoomEye API Reference Manual
First, log in and obtain an access_token:
#-*- coding: UTF-8 -*-
import requests
import json

user = raw_input('[-] PLEASE INPUT YOUR USERNAME:')
passwd = raw_input('[-] PLEASE INPUT YOUR PASSWORD:')

def Login():
    # POST the credentials as a JSON body; a successful login returns an access_token
    data_info = {'username': user, 'password': passwd}
    data_encoded = json.dumps(data_info)
    respond = requests.post(url='https://api.zoomeye.org/user/login', data=data_encoded)
    try:
        r_decoded = json.loads(respond.text)
        access_token = r_decoded['access_token']
    except KeyError:
        return '[-] INFO : USERNAME OR PASSWORD IS WRONG, PLEASE TRY AGAIN'
    return access_token

if __name__ == '__main__':
    print Login()
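On a successful login the script prints the raw access token, a long dot-separated JWT string that typically begins with something like eyJ...; that exact string is what the later scripts place in the Authorization header.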
Next, the API manual defines the search endpoint as GET https://api.zoomeye.org/host/search?query=<keyword>&page=<n>, authenticated with an Authorization: JWT <access_token> header. Based on that, let's first write a single-page host collector:
#-*- coding: UTF-8 -*-
import requests
import json

user = raw_input('[-] PLEASE INPUT YOUR USERNAME:')
passwd = raw_input('[-] PLEASE INPUT YOUR PASSWORD:')

def Login():
    data_info = {'username': user, 'password': passwd}
    data_encoded = json.dumps(data_info)
    respond = requests.post(url='https://api.zoomeye.org/user/login', data=data_encoded)
    try:
        r_decoded = json.loads(respond.text)
        access_token = r_decoded['access_token']
    except KeyError:
        return '[-] INFO : USERNAME OR PASSWORD IS WRONG, PLEASE TRY AGAIN'
    return access_token

def search():
    # the token goes into the Authorization header with the 'JWT ' prefix
    headers = {'Authorization': 'JWT ' + Login()}
    r = requests.get(url='https://api.zoomeye.org/host/search?query=tomcat&page=1',
                     headers=headers)
    response = json.loads(r.text)
    print response

if __name__ == '__main__':
    search()
The response is an enormous blob, but it is still just JSON, so we can pull out the IP field alone:
for x in response['matches']: print x['ip']
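For orientation, here is a trimmed, hypothetical sketch of what the /host/search response looks like (the field values are invented; only the structure matters for the loop above):

# Hypothetical, heavily trimmed illustration of the JSON returned by /host/search;
# real responses carry many more fields per match.
response = {
    "total": 1234,
    "matches": [
        {"ip": "1.2.3.4", "portinfo": {"port": 8080, "service": "http"}},
        {"ip": "5.6.7.8", "portinfo": {"port": 80, "service": "http"}},
    ],
}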
With that, single-page host collection is done. The web version is much the same, so I'll leave it for you to work through; the full code appears later anyway.
Next, a for loop to fetch IPs across multiple pages:
#-*- coding: UTF-8 -*-
import requests
import json

def Login():
    data_info = {'username': user, 'password': passwd}
    data_encoded = json.dumps(data_info)
    respond = requests.post(url='https://api.zoomeye.org/user/login', data=data_encoded)
    try:
        r_decoded = json.loads(respond.text)
        access_token = r_decoded['access_token']
    except KeyError:
        return '[-] INFO : USERNAME OR PASSWORD IS WRONG, PLEASE TRY AGAIN'
    return access_token

def search():
    headers = {'Authorization': 'JWT ' + Login()}
    # note the +1: range() excludes its upper bound, so this covers pages 1..PAGECOUNT
    for i in range(1, int(PAGECOUNT) + 1):
        r = requests.get(url='https://api.zoomeye.org/host/search?query=tomcat&page=' + str(i),
                         headers=headers)
        response = json.loads(r.text)
        for x in response['matches']:
            print x['ip']

if __name__ == '__main__':
    user = raw_input('[-] PLEASE INPUT YOUR USERNAME:')
    passwd = raw_input('[-] PLEASE INPUT YOUR PASSWORD:')
    PAGECOUNT = raw_input('[-] PLEASE INPUT YOUR SEARCH_PAGE_COUNT(eg:10):')
    search()
That pulls down exactly as many pages as you ask for; all that's left is polishing and prettifying the code:
#-*- coding: UTF-8 -*-
import requests
import json

def Login(user, passwd):
    data_info = {'username': user, 'password': passwd}
    data_encoded = json.dumps(data_info)
    respond = requests.post(url='https://api.zoomeye.org/user/login', data=data_encoded)
    try:
        r_decoded = json.loads(respond.text)
        access_token = r_decoded['access_token']
    except KeyError:
        return '[-] INFO : USERNAME OR PASSWORD IS WRONG, PLEASE TRY AGAIN'
    return access_token

def search(queryType, queryStr, PAGECOUNT, user, passwd):
    headers = {'Authorization': 'JWT ' + Login(user, passwd)}
    for i in range(1, int(PAGECOUNT) + 1):
        r = requests.get(url='https://api.zoomeye.org/' + queryType + '/search?query=' + queryStr + '&page=' + str(i),
                         headers=headers)
        response = json.loads(r.text)
        try:
            if queryType == "host":
                for x in response['matches']:
                    print x['ip']
            if queryType == "web":
                # web results carry a list of IPs per site; take the first
                for x in response['matches']:
                    print x['ip'][0]
        except KeyError:
            print "[ERROR] No hosts found"

def main():
    print " _____ _____ ____ "
    print "|__ /___ ___ _ __ ___ | ____| _ ___/ ___| ___ __ _ _ __"
    print " / // _ \ / _ \| '_ ` _ \| _|| | | |/ _ \___ \ / __/ _` | '_ \ "
    print " / /| (_) | (_) | | | | | | |__| |_| | __/___) | (_| (_| | | | |"
    print "/____\___/ \___/|_| |_| |_|_____\__, |\___|____/ \___\__,_|_| |_|"
    print " |___/ "
    user = raw_input('[-] PLEASE INPUT YOUR USERNAME:')
    passwd = raw_input('[-] PLEASE INPUT YOUR PASSWORD:')
    PAGECOUNT = raw_input('[-] PLEASE INPUT YOUR SEARCH_PAGE_COUNT(eg:10):')
    queryType = raw_input('[-] PLEASE INPUT YOUR SEARCH_TYPE(eg:web/host):')
    queryStr = raw_input('[-] PLEASE INPUT YOUR KEYWORD(eg:tomcat):')
    # search() logs in itself, so no separate Login() call is needed here
    search(queryType, queryStr, PAGECOUNT, user, passwd)

if __name__ == '__main__':
    main()
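If you would rather keep the results than watch them scroll by, a minimal tweak is to wrap the printing section of search() in a file write; a sketch (the output filename is my own choice, not part of the original script):

# Sketch: drop-in replacement for the printing section of search() above;
# appends every IP to zoomeye_results.txt (hypothetical filename)
with open('zoomeye_results.txt', 'a') as f:
    for x in response['matches']:
        ip = x['ip'] if queryType == 'host' else x['ip'][0]
        print ip
        f.write(ip + '\n')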
0x03 Writing a Shodan API Script
Shodan has been called the scariest search engine on the Internet.
As a CNNMoney article put it, while people regard Google as the most powerful search engine, Shodan is the most frightening one.
Unlike Google, Shodan does not crawl the web for URLs; it goes straight for the Internet's back channels. It is a kind of "dark" Google, ceaselessly hunting for every server, webcam, printer, and router connected to the net. Every month, Shodan gathers information from roughly 500 million devices, day and night.
What Shodan turns up is astonishing. Traffic lights, security cameras, home-automation gear, and heating systems connected to the Internet can all be found with ease. Shodan users have discovered the control system of a water park, a gas station, even a hotel's wine cooler; researchers have used it to locate the command-and-control systems of nuclear plants and a particle accelerator.
Shodan's truly remarkable ability is that it can find almost anything attached to the Internet, and its real menace is that most of those devices have no security defenses at all and can be walked into at will.
淺安 has already written a detailed introduction to this.
Portal: 基於ShodanApi接口的調用python版 (calling the Shodan API from Python)
First, querying via the raw API. Official docs: http://shodan.readthedocs.io/en/latest/tutorial.html
Each raw query deducts one credit, whereas the shodan library route used later (via count()) does not.
Here's a simple one; it's much the same as the ZoomEye script, so I won't go over it in detail:
#-*- coding: UTF-8 -*-
import requests
import json

def getip():
    API_KEY = '*************'  # fill in your own Shodan API key
    url = 'https://api.shodan.io/shodan/host/search?key=' + API_KEY + '&query=apache'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87'}
    req = requests.get(url=url, headers=headers)
    content = json.loads(req.text)
    for i in content['matches']:
        print i['ip_str']

if __name__ == '__main__':
    getip()
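The REST endpoint also accepts a page parameter, so the multi-page trick from the ZoomEye section carries over directly. A minimal sketch (the function name, page count, and key placeholder are mine; note that every page fetched through the raw API still costs a credit):

#-*- coding: UTF-8 -*-
# Sketch: paginated variant of the query above
import requests
import json

API_KEY = 'YOUR_API_KEY'  # placeholder: fill in your own key

def getip_pages(query, pagecount):
    # the raw API paginates with &page=1,2,3...
    for page in range(1, pagecount + 1):
        url = ('https://api.shodan.io/shodan/host/search?key=' + API_KEY +
               '&query=' + query + '&page=' + str(page))
        content = json.loads(requests.get(url).text)
        for i in content['matches']:
            print i['ip_str']

if __name__ == '__main__':
    getip_pages('apache', 5)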
Next, the version built on the shodan module. I'm borrowing 淺安's code directly... too lazy to rewrite it.
Install it first: pip install shodan
#-*- coding: UTF-8 -*-
import shodan
import sys

API_KEY = 'YOU_API_KEY'  # your Shodan API key goes here
FACETS = [
    ('country', 100),  # aggregate the top 100 countries; the 100 is adjustable
]
FACET_TITLES = {
    'country': 'Top 100 Countries',
}

# argument check
if len(sys.argv) == 1:
    print 'Search Method:Input the %s and then the keyword' % sys.argv[0]
    sys.exit()

try:
    api = shodan.Shodan(API_KEY)
    query = ' '.join(sys.argv[1:])
    print "You Search is:" + query
    # count() only returns totals and facets, so it is faster than search()
    result = api.count(query, facets=FACETS)
    for facet in result['facets']:
        print FACET_TITLES[facet]
        for key in result['facets'][facet]:
            countrie = '%s : %s' % (key['value'], key['count'])
            print countrie
            # append each line to "搜索 <query> 關鍵字.txt"
            with open(u"搜索" + " " + query + " " + u"關鍵字" + '.txt', 'a+') as f:
                f.write(countrie + "\n")
    print " "
    print "Results saved to the .txt file named after the query."
    print "Search is Complete."
except Exception, e:
    print 'Error: %s' % e
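Run it from the command line with the query as arguments, e.g. (assuming you saved the script as shodan_count.py; the filename is arbitrary): python shodan_count.py apache — the per-country counts are printed and appended to the .txt file.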
0x04 Writing a Simple Baidu URL Collection Script
First, crawl the URLs from a single results page; as an example, let's crawl results for the keyword 阿甫哥哥:
#-*- coding: UTF-8 -*-
import requests
from bs4 import BeautifulSoup as bs
import re

def getfromBaidu(word):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87'}
    url = 'https://www.baidu.com.cn/s?wd=' + word + '&pn=1'
    html = requests.get(url=url, headers=headers, timeout=5)
    soup = bs(html.content, 'lxml', from_encoding='utf-8')
    # organic result links carry a data-click attribute and no class
    bqs = soup.find_all(name='a', attrs={'data-click': re.compile(r'.'), 'class': None})
    for i in bqs:
        # Baidu wraps results in redirect links, so follow the redirect
        # and read r.url to recover the real destination
        r = requests.get(i['href'], headers=headers, timeout=5)
        print r.url

if __name__ == '__main__':
    getfromBaidu('阿甫哥哥')
Then multi-page crawling, for example the first 10 pages:
#-*- coding: UTF-8 -*-
import requests
from bs4 import BeautifulSoup as bs
import re

def getfromBaidu(word, pageout):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87'}
    # Baidu paginates with pn=0,10,20,..., so step by 10 to cover pageout pages
    for k in range(0, pageout * 10, 10):
        url = 'https://www.baidu.com.cn/s?wd=' + word + '&pn=' + str(k)
        html = requests.get(url=url, headers=headers, timeout=5)
        soup = bs(html.content, 'lxml', from_encoding='utf-8')
        bqs = soup.find_all(name='a', attrs={'data-click': re.compile(r'.'), 'class': None})
        for i in bqs:
            r = requests.get(i['href'], headers=headers, timeout=5)
            print r.url

if __name__ == '__main__':
    getfromBaidu('阿甫哥哥', 10)
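Baidu result pages overlap a lot, so it's worth deduplicating before saving; a sketch of that variation, reusing the imports above (the set, the output filename, and the error handling are my additions):

# Sketch: same crawl, but collect into a set to drop duplicates,
# skip dead links, and save everything once at the end
def getfromBaidu_dedup(word, pageout):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87'}
    urls = set()
    for k in range(0, pageout * 10, 10):
        html = requests.get(url='https://www.baidu.com.cn/s?wd=' + word + '&pn=' + str(k),
                            headers=headers, timeout=5)
        soup = bs(html.content, 'lxml', from_encoding='utf-8')
        for i in soup.find_all(name='a', attrs={'data-click': re.compile(r'.'), 'class': None}):
            try:
                urls.add(requests.get(i['href'], headers=headers, timeout=5).url)
            except requests.RequestException:
                continue  # skip links that time out or refuse connections
    with open('baidu_urls.txt', 'w') as f:  # hypothetical filename
        for u in sorted(urls):
            print u
            f.write(u.encode('utf-8') + '\n')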
0x05 [Bonus] A Forum Auto Sign-in Script
I actually posted this before, but in case some of you missed it, here it is again.
Signing in earns you a good pile of magic coins (魔法幣); for all the ways to collect them, see:
https://bbs.ichunqiu.com/thread-36007-1-1.html
To make it work, just replace the cookie with your own.
It signs in automatically at midnight every day; leave it running on a server and you're set.
#-*- coding: UTF-8 -*-
import requests
import datetime
import time
import re

def sign():
    url = 'https://bbs.ichunqiu.com/plugin.php?id=dsu_paulsign:sign'
    # paste your own cookies here; the values below are redacted/truncated
    cookie = {'__jsluid': '3e29e6c**********8966d9e0a481220',
              'UM_distinctid': '1605f635c78159************016-5d4e211f-1fa400-1605f635c7ac0',
              'pgv_pvi': '4680553472'}  # ...remaining cookie fields omitted...
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87'}
    r = requests.get(url=url, cookies=cookie, headers=headers)
    # the sign-in form carries a one-time formhash (Discuz's CSRF token)
    rows = re.findall(r'<input type=\"hidden\" name=\"formhash\" value=\"(.*?)\" />', r.content)
    if len(rows) != 0:
        formhash = rows[0]
        print '[-]Formhash is: ' + formhash
    else:
        print '[-]None formhash!'
        time.sleep(60)  # cannot sign without the token; wait out the minute and retry next day
        return
    if '您今天已經簽到過了或者簽到時間還未開始' in r.text:
        print '[-]Already signed!!'
    else:
        sign_url = 'https://bbs.ichunqiu.com/plugin.php?id=dsu_paulsign:sign&operation=qiandao&infloat=1&inajax=1'
        sign_payload = {
            'formhash': formhash,
            'qdxq': 'fd',  # the sign-in "mood" option
            'qdmode': '2',
            'todaysay': '',
            'fastreply': 0,
        }
        sign_req = requests.post(url=sign_url, data=sign_payload, headers=headers, cookies=cookie)
        if '簽到成功' in sign_req.text:
            print '[-]Sign success!!'
        else:
            print '[-]Something error...'
    time.sleep(60)  # wait out the rest of the minute so we don't trigger twice

def main(h=0, m=0):
    # poll the clock until hh:mm each day, then sign in
    while True:
        while True:
            now = datetime.datetime.now()
            if now.hour == h and now.minute == m:
                break
            time.sleep(20)
        sign()

if __name__ == '__main__':
    main()
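If the server already runs cron, the polling loop in main() can be replaced by a daily crontab entry such as 0 0 * * * python /path/to/sign.py (the script name is whatever you saved it as), with the script calling sign() directly.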