excel批量提取網頁標題


最近時間比較忙,有時候很多網頁需要臨時保存,以便空閑的時候查看。單純的保存網頁鏈接會讓人很枯燥,所以需要自動批量提取標題。

為了這個小功能去寫個小程序有點不划算,所以就利用excel實現了這個功能。

 

先上圖:

 

 

代碼如下:

 1 Option Explicit
 2 
 3 
 4 Public Function GetTitle(url As String)
 5     Dim xmlHttp As Object
 6     Dim strHtml As String
 7     
 8     url = Trim(url)
 9     
10     If LCase(Left(url, 5)) = "https" Then
11     
12         GetTitle = "暫不支持https協議"
13         Exit Function
14     End If
15     
16     
17     '都不能構成完整的http協議,起碼也得 a.cc
18     If Len(url) < 5 Then
19         Exit Function
20     End If
21     
22     
23     url = "http://" & Replace(Trim(url), "http://", "")
24     
25     Set xmlHttp = CreateObject("Microsoft.XMLHTTP")
26     xmlHttp.Open "GET", url, True
27     xmlHttp.send (Null)
28     While xmlHttp.ReadyState <> 4
29         DoEvents
30     Wend
31     strHtml = LCase(BytesToBstr(xmlHttp.responseBody))
32     GetTitle = Split(Split(strHtml, "<title>")(1), "</title>")(0)
33     Set xmlHttp = Nothing
34 End Function
35 
36 Private Function BytesToBstr(Bytes)
37     Dim Unicode As String
38     If IsUTF8(Bytes) Then '如果不是UTF-8編碼則按照GB2312來處理
39         Unicode = "UTF-8"
40     Else
41         Unicode = "GB2312"
42     End If
43 
44     Dim objstream As Object
45     Set objstream = CreateObject("ADODB.Stream")
46     With objstream
47         .Type = 1
48         .Mode = 3
49         .Open
50         .Write Bytes
51         .Position = 0
52         .Type = 2
53         .Charset = Unicode
54         BytesToBstr = .ReadText
55        .Close
56     End With
57     Set objstream = Nothing
58 End Function
59 
60  '判斷網頁編碼函數
61 Private Function IsUTF8(Bytes) As Boolean
62         Dim i As Long, AscN As Long, Length As Long
63         Length = UBound(Bytes) + 1
64        
65         If Length < 3 Then
66             IsUTF8 = False
67             Exit Function
68         ElseIf Bytes(0) = &HEF And Bytes(1) = &HBB And Bytes(2) = &HBF Then
69             IsUTF8 = True
70             Exit Function
71         End If
72 
73         Do While i <= Length - 1
74             If Bytes(i) < 128 Then
75                 i = i + 1
76                 AscN = AscN + 1
77             ElseIf (Bytes(i) And &HE0) = &HC0 And (Bytes(i + 1) And &HC0) = &H80 Then
78                 i = i + 2
79 
80             ElseIf i + 2 < Length Then
81                 If (Bytes(i) And &HF0) = &HE0 And (Bytes(i + 1) And &HC0) = &H80 And (Bytes(i + 2) And &HC0) = &H80 Then
82                      i = i + 3
83                 Else
84                     IsUTF8 = False
85                     Exit Function
86                 End If
87             Else
88                 IsUTF8 = False
89                 Exit Function
90             End If
91         Loop
92                
93         If AscN = Length Then
94             IsUTF8 = False
95         Else
96             IsUTF8 = True
97         End If
98 
99 End Function

 

【說明】:因為目前保存的網頁都是文章類型,所以就直接避免處理https安全連接了。

 

相關知識點:excel批量提取網頁標題,excel自動提取網頁標題,vb自動識別網頁編碼,vb字符串utf8轉gbk

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM