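This project is available on GitHub: https://github.com/wangqifan/WeChatAnalyse
This demo uses HttpWebRequest and HttpWebResponse to fetch my WeChat friend list, stores the information in a database, analyzes it, and plots the results as charts.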
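How to get the friend information
First, go to https://wx.qq.com/, log in to your WeChat account, and open the browser's developer console.
The WeChat web client sends dozens of requests to the backend. I went through all of them and finally found the URL I was looking for.
With the browser we can inspect the details of this request.
We can hand this information to a program, let it send the request for us, and save the returned data.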
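Preparation
Create a new project named WeChatAnalyse.
Use NuGet to install Entity Framework and Json.NET, since we will need both shortly, and update the configuration file while you are at it: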
<connectionStrings>
  <add name="WeChartContex" connectionString="server=.;database=WeChat;uid=sa;pwd=000000" providerName="System.Data.SqlClient" />
</connectionStrings>
Create the models
Back in the browser console, let's look at the data the server returns.
Based on the returned data, we can create the class Friend:
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// One entry of the MemberList returned by the server.
public class Friend
{
    [Key]
    public int Id { get; set; }   // database primary key, not part of the server data
    public int Uin { get; set; }
    public string UserName { get; set; }
    public string NickName { get; set; }
    public string HeadImgUrl { get; set; }
    public int ContactFlag { get; set; }
    public int MemberCount { get; set; }
    public List<Friend> MemberList { get; set; }
    public string RemarkName { get; set; }
    public int HideInputBarFlag { get; set; }
    public int Sex { get; set; }
    public string Signature { get; set; }
    public int VerifyFlag { get; set; }
    public int OwnerUin { get; set; }
    public string PYInitial { get; set; }
    public string PYQuanPin { get; set; }
    public int StarFriend { get; set; }
    public int AppAccountFlag { get; set; }
    public int Statues { get; set; }
    public int AttrStatus { get; set; }
    public string Province { get; set; }
    public string City { get; set; }
    public string Alias { get; set; }
    public int SnsFlag { get; set; }
    public int UniFriend { get; set; }
    public string DisplayName { get; set; }
    public int ChatRoomId { get; set; }
    public string KeyWord { get; set; }
    public string EncryChatRoomId { get; set; }
}
The class BaseResponse:
public class BaseResponse
{
    // "Ret": 0, "ErrMsg": ""
    public int Ret { get; set; }
    public string ErrMsg { get; set; }
}
The class Respone:
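using System.Collections.Generic;
using Newtonsoft.Json;

public class Respone
{
    // Assumption: the JSON key for this object is "BaseResponse". Json.NET matches
    // by property name, so without this attribute a property named respoen stays null.
    [JsonProperty("BaseResponse")]
    public BaseResponse respoen { get; set; }
    public int MemberCount { get; set; }
    public List<Friend> MemberList { get; set; }
}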
Create the database context
public class WeChartContex : DbContext
{
    public DbSet<Friend> Fridens { get; set; }
}
Create the controller Sprider
Add a database context object to it:
WeChartContex context = new WeChartContex();
Tightly coupled code like this is not recommended, but since our demo is so small, we will write it this way for now.
public ActionResult GetFridendInformation()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetcontact?r=1480564845349&seq=0&skey=@crypt_20089e09_d38ecc170f273d2db91833e793677276");
    request.Method = "GET";
    request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
    request.Referer = "https://wx.qq.com/";
    // CookieContainer contain = new CookieContainer();
    // The cookie below has been redacted; copy a fresh one from the browser.
    request.Headers.Add("Cookie", "pgv_pvi=1499432960; pt2gguin=o1694675518; RK=; /=s7972417536; wxsid=tjF6UrJ2RcvNH76H; wxloadtime=1480564533_expired");

    // Send the request and read the whole JSON body.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream dataStream = response.GetResponseStream())
    using (StreamReader reader = new StreamReader(dataStream))
    {
        // Read the content.
        string responseFromServer = reader.ReadToEnd();
        Respone responsefronserver = JsonHelper.DeserializeToObject<Respone>(responseFromServer);
        foreach (var item in responsefronserver.MemberList)
        {
            // Only keep personal accounts.
            if (item.VerifyFlag == 0)
                context.Fridens.Add(item);
        }
    }
    request.Abort();

    if (context.SaveChanges() > 0)
    {
        return Content("ok");
    }
    return Content("fail");
}
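The basic idea is to create an HttpWebRequest object from the URL, which sends the request over HTTP, and then set the UserAgent and the cookie. I have redacted the cookie here; it should be copied from the browser right before running the program so that it is still fresh.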
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetcontact?r=1480564845349&seq=0&skey=@crypt_20089e09_d38ecc170f273d2db91833e793677276");
request.Method = "GET";
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
request.Referer = "https://wx.qq.com/";
request.Headers.Add("Cookie", "pgv_pvi=149943");
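Since the URL and cookie above go stale quickly, it can help to build the request from parameters instead of hard-coding it. The following is only a rough sketch: the class ContactRequestFactory and its methods are names I made up for illustration, and it assumes that r is a Unix timestamp in milliseconds and that skey and the cookie are copied from a freshly logged-in browser session.

```csharp
using System;
using System.Net;

// Hypothetical helper (not part of the original project).
public static class ContactRequestFactory
{
    // Builds the contact-list URL from its query parameters.
    // Assumption: r is a Unix timestamp in milliseconds; skey comes from the browser session.
    public static string BuildContactUrl(long r, string skey)
    {
        return "https://wx.qq.com/cgi-bin/mmwebwx-bin/webwxgetcontact"
            + "?r=" + r
            + "&seq=0"
            + "&skey=" + Uri.EscapeDataString(skey); // percent-encodes reserved characters such as '@'
    }

    // Creates the request with the same headers as in the post, without sending it.
    public static HttpWebRequest Create(string skey, string cookie)
    {
        long now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        var request = (HttpWebRequest)WebRequest.Create(BuildContactUrl(now, skey));
        request.Method = "GET";
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
        request.Referer = "https://wx.qq.com/";
        request.Headers.Add("Cookie", cookie);
        return request;
    }
}
```

Calling ContactRequestFactory.Create(skey, cookie) would then replace the hard-coded lines, with both arguments pasted in from the developer console.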
Next, create an HttpWebResponse object to get the data, and deserialize it.
First, create a JsonHelper class:
using Newtonsoft.Json;

public class JsonHelper
{
    /// <summary>
    /// Serializes an object to a JSON string.
    /// </summary>
    /// <param name="value">The object to serialize.</param>
    /// <returns>The JSON text.</returns>
    public static string SerializeToString(object value)
    {
        return JsonConvert.SerializeObject(value);
    }

    /// <summary>
    /// Deserializes a JSON string into an object.
    /// </summary>
    /// <typeparam name="T">The target type.</typeparam>
    /// <param name="str">The JSON text.</param>
    /// <returns>The deserialized object.</returns>
    public static T DeserializeToObject<T>(string str)
    {
        return JsonConvert.DeserializeObject<T>(str);
    }
}
Next, deserialize the returned data:
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
// Read the content.
string responseFromServer = reader.ReadToEnd();
Respone responsefronserver = JsonHelper.DeserializeToObject<Respone>(responseFromServer);
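To see what the deserialization step does without hitting the server, here is a small sketch that parses a hand-written JSON string with Json.NET. The sample JSON, its values, and the Sample* class names are all made up for illustration; the real response carries many more fields, and I am assuming its top-level keys are BaseResponse, MemberCount and MemberList.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Trimmed-down, hypothetical stand-ins for the models in this post.
public class SampleBaseResponse
{
    public int Ret { get; set; }
    public string ErrMsg { get; set; }
}

public class SampleFriend
{
    public string NickName { get; set; }
    public int VerifyFlag { get; set; }
}

public class SampleResponse
{
    // Property names match the JSON keys, so Json.NET maps them without attributes.
    public SampleBaseResponse BaseResponse { get; set; }
    public int MemberCount { get; set; }
    public List<SampleFriend> MemberList { get; set; }
}

public static class DeserializeDemo
{
    // Made-up sample data shaped like the server response.
    public const string SampleJson = @"{
        ""BaseResponse"": { ""Ret"": 0, ""ErrMsg"": """" },
        ""MemberCount"": 2,
        ""MemberList"": [
            { ""NickName"": ""Alice"", ""VerifyFlag"": 0 },
            { ""NickName"": ""SomeOfficialAccount"", ""VerifyFlag"": 24 }
        ]
    }";

    // Same call that JsonHelper.DeserializeToObject<T> wraps.
    public static SampleResponse Parse(string json)
    {
        return JsonConvert.DeserializeObject<SampleResponse>(json);
    }
}
```

Keys in the JSON that have no matching property are simply ignored by default, which is why a trimmed-down class like this still parses.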
Next, save the data to the database. We only want the friends' information:
foreach (var item in responsefronserver.MemberList)
{
    if (item.VerifyFlag == 0)
        context.Fridens.Add(item);
}
if (context.SaveChanges() > 0)
{
    return Content("ok");
}
return Content("fail");
A VerifyFlag of 0 means a personal account; official accounts have a non-zero value.
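The same filter can also be written with LINQ, which makes it easy to try on in-memory data before touching the database. This is just a sketch: Contact is a hypothetical cut-down version of Friend, and ContactFilter is a name I invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical cut-down model with only the fields the filter needs.
public class Contact
{
    public string NickName { get; set; }
    public int VerifyFlag { get; set; }
}

public static class ContactFilter
{
    // Keeps only contacts whose VerifyFlag is 0, i.e. personal accounts.
    public static List<Contact> PersonalAccounts(IEnumerable<Contact> contacts)
    {
        return contacts.Where(c => c.VerifyFlag == 0).ToList();
    }
}
```

For example, passing in a list with two contacts that have VerifyFlag 0 and one with an arbitrary non-zero value gives back just the two personal accounts.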
Run the program and navigate to Sprider/GetFridendInformation. If it returns ok, the run succeeded.
I have 116 WeChat friends in total.
Recommended blog posts:
Anti-anti-crawler strategies: http://www.cnblogs.com/zuin/p/6323533.html
A million-scale Zhihu crawler: http://www.cnblogs.com/zuin/p/6227834.html
GitHub: https://github.com/wangqifan/ZhiHu