Normally we use the WebRequest class to POST data to a server. In many cases the server runs some checks: are you logged in, did the request come from the same domain? Those are easy to deal with, since we can change the request's properties to fool the server. But what if the server implements CSRF protection?
If you are not familiar with CSRF, ask Google; here is a quick introduction. The usual CSRF defense is to place a hidden field in the form page; when the form is submitted, the server checks whether the POSTed name/value pairs contain that field and, if so, verifies its value.
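For illustration, the hidden field in a form page typically looks something like this (the field name `csrf_token` and its value are made up for this sketch; real sites use their own names, and the value is regenerated by the server):

```html
<form action="/login" method="post">
  <input type="text" name="username" />
  <input type="password" name="password" />
  <!-- hidden anti-CSRF field; its value changes on every page load -->
  <input type="hidden" name="csrf_token" value="random-per-request-value" />
</form>
```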
Now the problem: in this situation, what do we write into the data we POST to the server? We can inspect the HTML to find out what the NAME is and what its VALUE is, but that VALUE normally changes on every refresh. So how do we get hold of it at the moment we POST?
The usual WebRequest recipes found online will not work, because they all use that class to obtain a Stream and then write the data to be POSTed into it; at that point we do not yet know the CSRF value, so the POST is bound to fail. In theory we could GET the page first, find some way to parse the returned HTML ourselves, and extract the CSRF value. But when we then call WebRequest.Create intending to POST, that amounts to visiting the page all over again, and its CSRF value has already changed. So that road leads nowhere.
Fortunately we still have WebClient. WebClient lets us keep a single instance alive, whereas a WebRequest can only be produced by a static factory method, with no way to reuse the same instance while varying the URL. This is probably also part of why Microsoft later introduced the brand-new HttpClient (in .NET 4.5) to unify the landscape of HTTP client APIs.
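The same idea is even more direct with that HttpClient: a single CookieContainer is attached to the handler, and every request made through the one client instance shares it. A minimal sketch, assuming .NET 4.5 or later; the URL is a placeholder:

```csharp
using System;
using System.Net;
using System.Net.Http;

class HttpClientSketch
{
    static void Main()
    {
        var cookies = new CookieContainer();
        var handler = new HttpClientHandler
        {
            CookieContainer = cookies, // one cookie jar for the whole session
            UseCookies = true
        };
        using (var client = new HttpClient(handler))
        {
            // GET the login page first; any session cookies land in `cookies`.
            // Then POST the login form with the same client, so the session
            // (and therefore the CSRF token) stays valid.
            string html = client.GetStringAsync("url for get").Result;
            Console.WriteLine(html.Length);
        }
    }
}
```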
So what we need to do now is subclass WebClient and override the relevant methods. The code is as follows:
```csharp
using System;
using System.Net;
using System.Text;

public class CookieAwareWebClient : WebClient
{
    public string Method;
    public CookieContainer CookieContainer { get; set; }
    public Uri Uri { get; set; }

    public CookieAwareWebClient()
        : this(new CookieContainer())
    {
    }

    public CookieAwareWebClient(CookieContainer cookies)
    {
        this.CookieContainer = cookies;
        this.Encoding = Encoding.UTF8;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            // attach the shared cookie jar so the session survives across requests
            httpRequest.CookieContainer = this.CookieContainer;
            httpRequest.ServicePoint.Expect100Continue = false;
            httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.5 Safari/537.36";
            httpRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
            httpRequest.Headers.Add(HttpRequestHeader.AcceptLanguage, "zh-CN,zh;q=0.8,en;q=0.6,nl;q=0.4,zh-TW;q=0.2");
            httpRequest.Referer = "some url";
            httpRequest.KeepAlive = true;
            httpRequest.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
            if (Method == "POST")
            {
                httpRequest.ContentType = "application/x-www-form-urlencoded";
            }
        }
        return request;
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        string setCookieHeader = response.Headers[HttpResponseHeader.SetCookie];

        if (setCookieHeader != null)
        {
            // Cookies are already captured automatically via the CookieContainer
            // attached in GetWebRequest; parse the header here only if you need
            // extra handling.
            try
            {
                this.CookieContainer.SetCookies(request.RequestUri, setCookieHeader);
            }
            catch (CookieException)
            {
                // ignore malformed Set-Cookie headers
            }
        }
        return response;
    }
}
```
As you can see, the key is really making good use of the CookieContainer class. Next comes usage: we first request the login page once, pull the CSRF VALUE out of the HTML (with a regex, string replacement, or whatever you like), and then POST it back to the server.
```csharp
var cookieJar = new CookieContainer();
CookieAwareWebClient client = new CookieAwareWebClient(cookieJar);

// the website sets some cookie that is needed for login, and the 'lt' token is always different
string response = client.DownloadString("url for get");
string regx = "<input type=\"hidden\" id=\"lt\" name=\"lt\" value=\"(?<PID>\\S+?)\" />";
// parse out 'lt'; the cookie is handled automatically by the CookieContainer
string token = Regex.Match(response, regx).Groups["PID"].Value;
string urlforlogin = "url for login";
string postData =
    string.Format("username={0}&password={1}&lt={2}", "user", "pass", token);
client.Method = "POST";
response = client.UploadString(urlforlogin, postData);

client.Method = "GET";
```
And that is it. From here on it is just a matter of calling DownloadString with different URLs, a crawler in plain terms, and then doing whatever data analysis your business requires.
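As a rough sketch, that crawling stage could look like the following; the URL list and the parsing step are placeholders, and `client` is the logged-in CookieAwareWebClient from the snippet above:

```csharp
string[] pages = { "url for page 1", "url for page 2" };
client.Method = "GET";
foreach (string url in pages)
{
    // same client, same CookieContainer: every request stays logged in
    string html = client.DownloadString(url);
    // parse `html` here according to your own business logic
}
```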
