Case: a link on a page. Inspecting the element shows the address "http://iphone.myzaker.com/l.php?l=54472e161bc8e0fd4a8b4573"; after clicking it, the page automatically redirects to another address, "http://mp.weixin.qq.com/s?__biz=MjM5NjExNjI4MA==&mid=202695292&idx=1&sn=8638f15ba27381236641077a77d43e03&scene=4#wechat_redirect".
Analyzing the address with wget
apples-air:mzread apple$ wget http://iphone.myzaker.com/l.php?l=54472e161bc8e0fd4a8b4573
--2014-10-23 17:27:17-- http://iphone.myzaker.com/l.php?l=54472e161bc8e0fd4a8b4573
Resolving iphone.myzaker.com... 106.186.30.108
Connecting to iphone.myzaker.com|106.186.30.108|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://mp.weixin.qq.com/s?__biz=MjM5NjExNjI4MA==&mid=202695292&idx=8&sn=f39c6c5dc2329e41eb58c71b53ba8a50&scene=4#wechat_redirect [following]
--2014-10-23 17:27:19-- http://mp.weixin.qq.com/s?__biz=MjM5NjExNjI4MA==&mid=202695292&idx=8&sn=f39c6c5dc2329e41eb58c71b53ba8a50&scene=4
Resolving mp.weixin.qq.com... 203.205.143.142
Connecting to mp.weixin.qq.com|203.205.143.142|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42622 (42K) [text/html]
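As the output shows, requesting the original address returns a 302 redirect.
So the question is: how do we get the address of the page after the redirect?
Solution: use the Net::HTTP.get_response method. It does not follow redirects on its own, so it returns the 302 response itself, and the target address can be read from its Location header.
The code: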
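require 'net/http'
# get_response returns the 302 response without following the redirect
res=Net::HTTP.get_response(URI('http://iphone.myzaker.com/l.php?l=54472e161bc8e0fd4a8b4573'))
# Header lookup is case-insensitive; this reads the Location header
res['location']
=> "http://mp.weixin.qq.com/s?__biz=MjM5NjExNjI4MA==&mid=202695292&idx=1&sn=8638f15ba27381236641077a77d43e03&scene=4#wechat_redirect"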
This gives the URL of the page after the redirect.
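
The snippet above reads a single Location header. If the target may itself redirect again, one possible extension is to repeat the request until a non-redirect response comes back. The following is only a sketch under assumptions: the helper name final_url, the limit and fetch parameters, and the default of 5 hops are all made up for illustration and are not part of the original post or of Net::HTTP.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not from the original post): follow Location headers
# until a non-redirect response is reached, giving up after `limit` requests.
# `fetch` is injectable so the loop can be tried without network access;
# by default it performs a real GET with Net::HTTP.get_response.
def final_url(url, limit: 5, fetch: ->(uri) { Net::HTTP.get_response(uri) })
  uri = URI(url)
  limit.times do
    res = fetch.call(uri)
    # Net::HTTPRedirection is the parent class of the 3xx responses (301, 302, ...).
    # Stop when the response is not a redirect or carries no Location header.
    return uri.to_s unless res.is_a?(Net::HTTPRedirection) && res['location']
    # Location may be relative, so resolve it against the current address.
    uri = URI.join(uri.to_s, res['location'])
  end
  raise "too many redirects (gave up after #{limit} requests)"
end
```

With network access, final_url('http://iphone.myzaker.com/l.php?l=54472e161bc8e0fd4a8b4573') would be expected to return the last address in the chain, assuming the link still resolves. The request limit is there so that a redirect loop ends with an error instead of running forever.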
