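I've been listening to music on Xiami a lot lately, and some of the songs are quite good. Yesterday, heading back to Shanghai, I wanted to download a few tracks for the trip and found that they had to be bought with Xiami coins. My first thought was to press F12 in Chrome and look at the requests in the Network tab, where I easily found a link like http://f3.xiami.net/78926/417559/08%201769939716_1875663.mp3. That is the real address of the audio file, and it can be downloaded directly. A side note: many people ask how to save online video or music locally, and the answers online vary. Some use sniffing tools, some dig through the browser cache. In fact, the network inspector built into Chrome or any other browser finds the address easily.
That is the simplest method, but it takes a lot of manual work. Below I do the same thing in a program; more importantly, I want to show a way of approaching this kind of problem.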
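First, let's analyze this song. Its page is http://www.xiami.com/song/1769939716. From the page content, the title is Rainbow Trees, the artist is Robert de Boron, and the album is Diaspora. In the page source a few numbers stand out: 1769939716, 417559, and 78926. Looking back at the real mp3 address, http://f3.xiami.net/78926/417559/08%201769939716_1875663.mp3, 1769939716 is the song ID, 417559 is the album ID, and 78926 is the artist ID. So the URL has the form http://f3.xiami.net/<artist ID>/<album ID>/08%20<song ID>_1875663.mp3.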
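As a rough illustration of that URL layout, the sketch below splits the example address into its parts using java.net.URI. The class name XiamiUrlParts, the field names, and the assumed file name pattern are made up for illustration; the leading 08 and the trailing number are captured as-is without interpreting them.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of any Xiami API): splits an mp3 URL
// of the form observed above into its components.
class XiamiUrlParts {
    final String artistId, albumId, prefix, songId, suffix;

    // Assumed file name layout: "<prefix> <songId>_<suffix>.mp3"
    private static final Pattern FILE = Pattern.compile("(\\d+) (\\d+)_(\\d+)\\.mp3");

    XiamiUrlParts(String artistId, String albumId, String prefix, String songId, String suffix) {
        this.artistId = artistId;
        this.albumId = albumId;
        this.prefix = prefix;
        this.songId = songId;
        this.suffix = suffix;
    }

    static XiamiUrlParts parse(String url) {
        // getPath() returns the percent-decoded path, so %20 becomes a space.
        String[] seg = URI.create(url).getPath().split("/");
        // seg[0] is empty because the path starts with "/".
        if (seg.length != 4) {
            throw new IllegalArgumentException("unexpected path: " + url);
        }
        Matcher m = FILE.matcher(seg[3]);
        if (!m.matches()) {
            throw new IllegalArgumentException("unexpected file name: " + seg[3]);
        }
        return new XiamiUrlParts(seg[1], seg[2], m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        XiamiUrlParts p = parse("http://f3.xiami.net/78926/417559/08%201769939716_1875663.mp3");
        System.out.println("artist=" + p.artistId + " album=" + p.albumId
                + " prefix=" + p.prefix + " song=" + p.songId + " suffix=" + p.suffix);
    }
}
```

Running it on the example address separates 78926, 417559, 08, 1769939716 and 1875663, which makes it easy to see which pieces are already explained by the page source and which are not.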
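A few pieces are still missing. What is 08? What is 1875663? We know %20 is a URL-encoded space. On the album page, http://www.xiami.com/album/417559, Rainbow Trees turns out to be the eighth track. But what is 1875663? I went through every request Chrome sent and the source of every related page, and could not find what this number means. With no other option, I found an SWF decompiler online, decompiled the player, and found some relevant source code.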
The code below looks like the part that obtains the song location; following it leads to the getLocation method.
var dataStr:* = evt.target.data;
dataStr = dataStr.replace(" xmlns=\"http://xspf.org/ns/0/\"", "");
var xmlData:* = new XML(dataStr);
xmlData.ignoreWhitespace = true;
uid = xmlData.uid;
clearList = xmlData.clearlist;
var songArr:* = xmlData.trackList.track;
var tLoadArr:* = [];
var backgroundStr:* = "";
var firstSongId:* = 0;
var addSongTmpArr:* = [];
var oldDataArr:* = [];
if (songArr[0] != undefined){
    for (i in songArr) {
        tData = songArr[i];
        songLocation = "";
        thisLocation = tData.location;
        if (thisLocation.indexOf("http://") < 0){
            try {
                songLocation = locationDec.getLocation(tData.location);
            } catch(e) {
            };
        } else {
            songLocation = thisLocation;
        };
Here is the getLocation method:
public function getLocation(_arg1:String):String{
    var _local10:*;
    var _local2:* = Number(_arg1.charAt(0));
    var _local3:* = _arg1.substring(1);
    var _local4:* = Math.floor((_local3.length / _local2));
    var _local5:* = (_local3.length % _local2);
    var _local6:* = new Array();
    var _local7:* = 0;
    while (_local7 < _local5) {
        if (_local6[_local7] == undefined){
            _local6[_local7] = "";
        };
        _local6[_local7] = _local3.substr(((_local4 + 1) * _local7), (_local4 + 1));
        _local7++;
    };
    _local7 = _local5;
    while (_local7 < _local2) {
        _local6[_local7] = _local3.substr(((_local4 * (_local7 - _local5)) + ((_local4 + 1) * _local5)), _local4);
        _local7++;
    };
    var _local8:* = "";
    _local7 = 0;
    while (_local7 < _local6[0].length) {
        _local10 = 0;
        while (_local10 < _local6.length) {
            _local8 = (_local8 + _local6[_local10].charAt(_local7));
            _local10++;
        };
        _local7++;
    };
    _local8 = unescape(_local8);
    var _local9:* = "";
    _local7 = 0;
    while (_local7 < _local8.length) {
        if (_local8.charAt(_local7) == "^"){
            _local9 = (_local9 + "0");
        } else {
            _local9 = (_local9 + _local8.charAt(_local7));
        };
        _local7++;
    };
    _local9 = _local9.replace("+", " ");
    return (_local9);
}
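This looks very much like the key code for obtaining the address. Tracing back from the lines that read tData.location leads to an XML document, which should contain a location element. Finding that XML document was the crucial step. Back in the browser, I captured the traffic again and found this link: http://www.xiami.com/song/playlist/id/1769939716/object_name/default/object_id/0, where 1769939716 is the song ID. Its content is: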
<?xml version="1.0" encoding="utf-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track>
      <title><![CDATA[Rainbow Trees]]></title>
      <song_id>1769939716</song_id>
      <album_id>417559</album_id>
      <album_name><![CDATA[Diaspora]]></album_name>
      <object_id>1</object_id>
      <object_name>default</object_name>
      <insert_type>1</insert_type>
      <background>http://img.xiami.com/res/player/bimg/bg-5.bak.jpg</background>
      <grade>-1</grade>
      <artist><![CDATA[Robert de Boron]]></artist>
      <location>4h%2Fxit7645F8219186pt3Ffi.%8%19%%%736733tA%3an2927%52569_5.p%2.meF2F52E5E9716m</location>
      <ms></ms>
      <lyric>http://www.xiami.com/song/lyrictxt/id/1769939716</lyric>
      <pic>http://img.xiami.com/images/album/img26/78926/4175591312340942_1.jpg</pic>
    </track>
  </trackList>
  <uid>12390378</uid>
  <type>default</type>
  <type_id>1</type_id>
  <clearlist></clearlist>
</playlist>
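The lookup above can be sketched in Java as follows: build the playlist URL from the song ID, then pull the location text out of the returned XML. This is only a sketch under assumptions: the class and method names are hypothetical, the XML is passed in as a string rather than fetched over the network, and the sample in main is a trimmed copy of the response shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for the playlist XML; names are illustrative only.
class PlaylistLocation {
    // URL pattern copied from the captured request.
    static String playlistUrl(long songId) {
        return "http://www.xiami.com/song/playlist/id/" + songId
                + "/object_name/default/object_id/0";
    }

    // Returns the text of the first <location> element in the playlist XML.
    static String extractLocation(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The factory is not namespace-aware by default, so the plain
            // tag name matches despite the xmlns declaration on <playlist>.
            NodeList nodes = doc.getElementsByTagName("location");
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("no <location> element found");
            }
            return nodes.item(0).getTextContent().trim();
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse playlist XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed sample of the response shown above.
        String xml = "<?xml version='1.0' encoding='utf-8'?>"
                + "<playlist version='1' xmlns='http://xspf.org/ns/0/'><trackList><track>"
                + "<song_id>1769939716</song_id>"
                + "<location>4h%2Fxit7645F8219186pt3Ffi.%8%19%%%736733tA%3an2927%52569_5.p%2.meF2F52E5E9716m</location>"
                + "</track></trackList></playlist>";
        System.out.println(playlistUrl(1769939716L));
        System.out.println(extractLocation(xml));
    }
}
```

Keeping the fetch separate from the parsing means the extraction step can be tried on a saved copy of the response.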
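The location element holds what I was after. With the source code and the location value in hand, everything makes sense. Take the string 4h%2Fxit7645F8219186pt3Ffi.%8%19%%%736733tA%3an2927%52569_5.p%2.meF2F52E5E9716m: remove the first character, 4, and split the rest into four parts. If the length divides evenly, all parts have the same length; otherwise the first (length mod 4) parts are one character longer than the rest. Here the remainder of the string is 78 characters and 78 % 4 = 2, so the first two parts have 20 characters and the last 4 - 2 = 2 parts have 19, giving lengths [20, 20, 19, 19] and the parts [h%2Fxit7645F8219186p, t3Ffi.%8%19%%%736733, tA%3an2927%52569_5., p%2.meF2F52E5E9716m]. Then concatenate, starting from the first character of the first part. Treating the parts as a two-dimensional character array, the order is [0][0], [1][0], [2][0], [3][0], [0][1], [1][1], [2][1], [3][1], and so on, skipping positions that fall past the end of the shorter parts. The result is http%3A%2F%2Ff3.xiami.net%2F78926%2F417559%2F%5E8%252%5E1769939716_1875663.mp3, which URL-decodes to http://f3.xiami.net/78926/417559/^8%2^1769939716_1875663.mp3. Finally, replacing each ^ with the character 0 gives http://f3.xiami.net/78926/417559/08%201769939716_1875663.mp3.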
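To make the split and the column-by-column read easier to follow, here is a minimal Java sketch of just those two steps, stopping before URL decoding. The class and method names are hypothetical, and the loops are written from the description above rather than copied from the decompiled player.

```java
import java.util.Arrays;

// Illustrative sketch of the first two decoding steps only.
class LocationSplitDemo {
    // Splits the string after the leading digit into that many rows;
    // the first (length % rows) rows get one extra character.
    static String[] splitRows(String location) {
        int rows = Character.getNumericValue(location.charAt(0));
        String body = location.substring(1);
        int base = body.length() / rows;
        int extra = body.length() % rows;
        String[] parts = new String[rows];
        int pos = 0;
        for (int r = 0; r < rows; r++) {
            int len = base + (r < extra ? 1 : 0);
            parts[r] = body.substring(pos, pos + len);
            pos += len;
        }
        return parts;
    }

    // Reads the rows column by column, skipping rows that are too short.
    static String interleave(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int col = 0; col < parts[0].length(); col++) {
            for (String part : parts) {
                if (col < part.length()) {
                    sb.append(part.charAt(col));
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String location =
            "4h%2Fxit7645F8219186pt3Ffi.%8%19%%%736733tA%3an2927%52569_5.p%2.meF2F52E5E9716m";
        String[] rows = splitRows(location);
        System.out.println(Arrays.toString(rows)); // the four parts
        System.out.println(interleave(rows));      // still percent-encoded
    }
}
```

For the sample value this reproduces the four parts and the percent-encoded string quoted above; URL decoding and the ^ to 0 replacement are the remaining steps.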
I normally work in Java, so I translated this code into Java:
// Requires: import java.io.UnsupportedEncodingException; import java.net.URLDecoder;
public static String getLocation(String location) throws UnsupportedEncodingException {
    int _local10;
    // First character: number of rows. Remainder: the scrambled string.
    int _local2 = Integer.parseInt(location.substring(0, 1));
    String _local3 = location.substring(1);
    int _local4 = _local3.length() / _local2; // length of the shorter rows
    int _local5 = _local3.length() % _local2; // number of rows with one extra character
    String[] _local6 = new String[_local2];
    int _local7 = 0;
    // The first _local5 rows hold _local4 + 1 characters each.
    while (_local7 < _local5) {
        int start = (_local4 + 1) * _local7;
        _local6[_local7] = _local3.substring(start, start + _local4 + 1);
        _local7++;
    }
    // The remaining rows hold _local4 characters each.
    _local7 = _local5;
    while (_local7 < _local2) {
        int start = (_local4 * (_local7 - _local5)) + ((_local4 + 1) * _local5);
        _local6[_local7] = _local3.substring(start, start + _local4);
        _local7++;
    }
    // Read the rows column by column.
    String _local8 = "";
    _local7 = 0;
    while (_local7 < _local6[0].length()) {
        _local10 = 0;
        while (_local10 < _local6.length) {
            // Shorter rows have no character in the last column.
            if (_local7 >= _local6[_local10].length()) {
                break;
            }
            _local8 = _local8 + _local6[_local10].charAt(_local7);
            _local10++;
        }
        _local7++;
    }
    _local8 = URLDecoder.decode(_local8, "UTF-8");
    // Replace every '^' with '0'.
    String _local9 = "";
    _local7 = 0;
    while (_local7 < _local8.length()) {
        if (_local8.charAt(_local7) == '^') {
            _local9 = _local9 + "0";
        } else {
            _local9 = _local9 + _local8.charAt(_local7);
        }
        _local7++;
    }
    _local9 = _local9.replace("+", " ");
    return _local9;
}
Given the content of the location element as input, the output is the real mp3 address I was looking for.
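Here is my general approach to this kind of problem; it applies to finding the real address of both video and music. Start by capturing traffic in the browser, which usually gives the real address directly. For a program that fetches addresses automatically, that is not enough: you need to know how the address is generated. With Tudou video, for example, a single request returns an XML document that contains the video address, which is the simplest case. With Youku, packet capture alone does not reveal how the real address is computed; in that case, decompile the Flash player and translate its code into the language of your choice.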