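Everyone is probably familiar with the Double Color Ball (雙色球) lottery: you buy ticket after ticket and never win, yet you keep buying. I had long thought about using the Double Color Ball trend chart to build a database of past draws. I never really considered what I would do with that database, but I have been curious whether any past winning numbers have ever repeated. I have been writing blog posts about Entity Framework recently, which is why Entity Framework appears in the title of this post; in fact it accounts for only a small part of the program below.
Here is how I obtained the data.
The Double Color Ball trend chart is at: http://zx.caipiao.163.com/trend/ssq_basic.html
Once opened, the page shows the most recent 30 draws by default, as in the screenshot below: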
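Querying by draw number produces a link of the form http://zx.caipiao.163.com/trend/ssq_basic.html?beginPeriod={0}&endPeriod={1}, where {0} and {1} stand for the two draw numbers entered.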
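It is easy to see that beginPeriod is the first draw number and endPeriod is the last. With these two parameters, the data for any range of draws can be retrieved. Querying this way shows that the earliest data NetEase Lottery (網易彩票) provides is draw 2004009.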
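As a small illustration of the two parameters, the sketch below builds the query address for a range of draws. The class and method names (TrendUrl, Build) are made up for this example, and the end draw number 2004038 in the usage note is only a sample value.

```csharp
using System;

static class TrendUrl
{
    // Address of the trend chart page, as given in this post.
    const string BaseUrl = "http://zx.caipiao.163.com/trend/ssq_basic.html";

    // Builds the query address for the draws from beginPeriod to endPeriod.
    // The parameter names beginPeriod and endPeriod are the ones seen in the link.
    public static string Build(string beginPeriod, string endPeriod)
    {
        return string.Format("{0}?beginPeriod={1}&endPeriod={2}",
            BaseUrl,
            Uri.EscapeDataString(beginPeriod),
            Uri.EscapeDataString(endPeriod));
    }
}
```

For example, TrendUrl.Build("2004009", "2004038") yields the address above followed by ?beginPeriod=2004009&endPeriod=2004038.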
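Next, let's look at the HTML structure of the trend chart.
In Google Chrome, press Ctrl+Shift+I, or use Firebug in Firefox, to inspect the HTML structure.
The screenshot below shows the HTML structure of the trend chart. The chart data sits in the table whose id is chartsTable. Looking closer, the data that actually matters is inside the <tbody></tbody> element.
The code below retrieves the content between <tbody> and </tbody>: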
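    // Requires: using System; using System.IO; using System.Net;

    /// <summary>
    /// Downloads the trend chart page and returns the Double Color Ball data rows.
    /// </summary>
    /// <param name="startQH">First draw number</param>
    /// <param name="endQH">Last draw number</param>
    /// <returns>The tbody content, with the tr tags normalized and line breaks removed</returns>
    private string GetOriginData(string startQH, string endQH)
    {
        string path = string.Format("http://zx.caipiao.163.com/trend/ssq_basic.html?beginPeriod={0}&endPeriod={1}", startQH, endQH);
        WebRequest wp = WebRequest.Create(path);
        string content;
        // Dispose the response, the stream and the reader even if reading fails.
        using (WebResponse response = wp.GetResponse())
        using (Stream s = response.GetResponseStream())
        using (StreamReader sr = new StreamReader(s))
        {
            content = sr.ReadToEnd();
        }
        int startIndex = content.IndexOf("<tbody id=\"cpdata\">");
        if (startIndex < 0)
        {
            throw new InvalidOperationException("The page does not contain <tbody id=\"cpdata\">.");
        }
        // Look for the closing tag only after the opening tag, so an earlier </tbody> is not picked up.
        int endIndex = content.IndexOf("</tbody>", startIndex);
        if (endIndex < 0)
        {
            throw new InvalidOperationException("The page does not contain a closing </tbody> tag.");
        }
        // Normalize every row tag to a plain <tr> and remove line breaks.
        content = content.Substring(startIndex, endIndex - startIndex).Replace("<tr class=\"bg_doe\" >", "<tr>").Replace("<tr >", "<tr>").Replace("\r\n", "");
        return content;
    }
The content inside <tbody></tbody> consists of <tr></tr> and <td></td> elements. The code below parses the <tr> and <td> elements. It is commented, so I won't explain it further.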
    /// <summary>
    /// Loops over the rows and parses each tr
    /// </summary>
    /// <param name="wnRepo">Repository that stores the parsed draws</param>
    /// <param name="content">Content between the tbody tags</param>
    private void ResolveTr(IRepository<WinNo> wnRepo, string content)
    {
        string trContent = string.Empty;
        WinNo wn = null;
        Regex regex = new Regex("<tr>");
        // Find every <tr> in the content between the tbody tags
        MatchCollection matches = regex.Matches(content);
        foreach (Match item in matches)
        {
            wn = new WinNo();
            Match next = item.NextMatch();
            // If another <tr> follows, the row runs up to the start of that next match
            if (next.Success)
            {
                trContent = content.Substring(item.Index, next.Index - item.Index);
            }
            // The last <tr>: the row runs to the end of the content
            else
            {
                trContent = content.Substring(item.Index, content.Length - item.Index);
            }
            ResolveTd(wn, trContent);
            wnRepo.Insert(wn);
        }
    }
    /// <summary>
    /// Parses the td cells of one tr to get the numbers of a single draw
    /// </summary>
    /// <param name="wn">Entity that receives the parsed values</param>
    /// <param name="trContent">Content of one row</param>
    private void ResolveTd(WinNo wn, string trContent)
    {
        // Pattern that locates the draw number
        string patternQiHao = "<td align=\"center\" title=\"開獎日期";
        Regex regex = new Regex(patternQiHao);
        Match qhMatch = regex.Match(trContent);
        // Skip the pattern plus 17 more characters, then read the 7-character draw number
        wn.QiHao = trContent.Substring(qhMatch.Index + 17 + patternQiHao.Length, 7);
        // Pattern that locates the blue ball
        string patternChartBall02 = "<td class=\"chartBall02\">";
        regex = new Regex(patternChartBall02);
        Match bMatch = regex.Match(trContent);
        wn.B = Convert.ToInt32(trContent.Substring(bMatch.Index + patternChartBall02.Length, 2));
        // Holds the red ball numbers that were matched.
        // redBoxList is a List<int> member of the containing class; its declaration is not shown in this post.
        redBoxList = new List<int>();
        // Pattern that locates red balls in cells with the chartBall01 class
        string patternChartBall01 = "<td class=\"chartBall01\">";
        regex = new Regex(patternChartBall01);
        MatchCollection rMatches = regex.Matches(trContent);
        foreach (Match r in rMatches)
        {
            redBoxList.Add(Convert.ToInt32(trContent.Substring(r.Index + patternChartBall01.Length, 2)));
        }
        // Pattern that locates red balls in cells with the chartBall07 class
        string patternChartBall07 = "<td class=\"chartBall07\">";
        regex = new Regex(patternChartBall07);
        rMatches = regex.Matches(trContent);
        foreach (Match r in rMatches)
        {
            redBoxList.Add(Convert.ToInt32(trContent.Substring(r.Index + patternChartBall07.Length, 2)));
        }
        // Sort the red ball numbers in ascending order
        redBoxList.Sort();
        // First to sixth red ball numbers
        wn.R1 = redBoxList[0];
        wn.R2 = redBoxList[1];
        wn.R3 = redBoxList[2];
        wn.R4 = redBoxList[3];
        wn.R5 = redBoxList[4];
        wn.R6 = redBoxList[5];
    }
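The parsing above locates each cell with a regex and then reads the number by offset with Substring. A possible variation, sketched here and not taken from the original program, lets the regex capture the two digits directly. The class name RowParser and its methods are made up for this example, and it assumes that each ball cell holds exactly two digits right after the opening td tag, as the Substring calls above also assume.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RowParser
{
    // Captures the two digits in a red-ball cell (class chartBall01 or chartBall07).
    static readonly Regex RedBall = new Regex("<td class=\"chartBall0[17]\">(\\d{2})");
    // Captures the two digits in the blue-ball cell (class chartBall02).
    static readonly Regex BlueBall = new Regex("<td class=\"chartBall02\">(\\d{2})");

    // Returns the red ball numbers found in one row, sorted in ascending order.
    public static List<int> ParseReds(string trContent)
    {
        var reds = new List<int>();
        foreach (Match m in RedBall.Matches(trContent))
        {
            reds.Add(int.Parse(m.Groups[1].Value));
        }
        reds.Sort();
        return reds;
    }

    // Returns the blue ball number, or -1 when the row has no blue-ball cell.
    public static int ParseBlue(string trContent)
    {
        Match m = BlueBall.Match(trContent);
        return m.Success ? int.Parse(m.Groups[1].Value) : -1;
    }
}
```

Reading the value from a capture group avoids the offset arithmetic, and checking Match.Success makes a missing cell visible instead of silently reading from index 0.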
Below is the part of the code that uses Entity Framework:
First, create a WinNo entity to represent one Double Color Ball draw:
    public class WinNo
    {
        /// <summary>
        /// Primary key
        /// </summary>
        public int ID { get; set; }
        /// <summary>
        /// Draw number
        /// </summary>
        public string QiHao { get; set; }

        /// <summary>
        /// First red ball number
        /// </summary>
        public int R1 { get; set; }
        /// <summary>
        /// Second red ball number
        /// </summary>
        public int R2 { get; set; }
        /// <summary>
        /// Third red ball number
        /// </summary>
        public int R3 { get; set; }
        /// <summary>
        /// Fourth red ball number
        /// </summary>
        public int R4 { get; set; }
        /// <summary>
        /// Fifth red ball number
        /// </summary>
        public int R5 { get; set; }
        /// <summary>
        /// Sixth red ball number
        /// </summary>
        public int R6 { get; set; }
        /// <summary>
        /// Blue ball number
        /// </summary>
        public int B { get; set; }
    }
Second, the default configuration is all that is needed.
Third, create a context, SSQContext, with the following code:
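    // Requires: using System.Data.Entity; (the namespace of PluralizingTableNameConvention depends on the Entity Framework version)
    public class SSQContext : DbContext
    {
        public SSQContext()
        {
            // Alternative initializer: drops and recreates the database on every run.
            //Database.SetInitializer(new DropCreateDatabaseAlways<SSQContext>());
            // Passing null disables database initialization for this context.
            Database.SetInitializer<SSQContext>(null);
        }

        public DbSet<WinNo> WinNos { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // Keep the table name singular (WinNo) instead of the pluralized default.
            modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
            base.OnModelCreating(modelBuilder);
        }
    }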
Fourth, run the program. The result is shown in the screenshot below:
The source code for this program can be downloaded from: http://www.ef-community.com/forum.php?mod=viewthread&tid=44&extra=page%3D1
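Once the draws are stored, the question from the introduction (whether any past numbers have repeated) can be checked with a grouping query. The sketch below is not part of the original program. DrawRecord is a hypothetical stand-in that mirrors the properties of WinNo so the example compiles on its own, and RepeatFinder is a made-up name. It works on an in-memory sequence, such as the rows loaded from the WinNos set into a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in mirroring the WinNo entity, so the sketch is self-contained.
class DrawRecord
{
    public string QiHao { get; set; }
    public int R1 { get; set; }
    public int R2 { get; set; }
    public int R3 { get; set; }
    public int R4 { get; set; }
    public int R5 { get; set; }
    public int R6 { get; set; }
    public int B { get; set; }
}

static class RepeatFinder
{
    // Groups draws by their six red numbers (already stored in sorted order)
    // and returns, for every combination seen more than once, the draw numbers involved.
    public static List<List<string>> FindRepeatedReds(IEnumerable<DrawRecord> draws)
    {
        return draws
            .GroupBy(d => string.Join(",", d.R1, d.R2, d.R3, d.R4, d.R5, d.R6))
            .Where(g => g.Count() > 1)
            .Select(g => g.Select(d => d.QiHao).ToList())
            .ToList();
    }
}
```

To require the blue ball to match as well, d.B can be appended to the grouping key.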



