Tushare's get_k_data


Recently, 挖地兔 released a new version of tushare. The main change is a new function, get_k_data. This post analyzes that function.

The function signature and docstring:

def get_k_data(code=None, start='', end='',
                  ktype='D', autype='qfq', 
                  index=False,
                  retry_count=3,
                  pause=0.001):
    """
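    Get k-line (candlestick) data
    ---------
    Parameters:
      code:string
                  stock code, e.g. 600848
      start:string
                  start date, format: YYYY-MM-DD; if empty, the current date is used
      end:string
                  end date, format: YYYY-MM-DD; if empty, the same day last year is used
      autype:string
                  price adjustment type: qfq = forward-adjusted, hfq = backward-adjusted, None = unadjusted; default qfq
      ktype:string
                  data type: D = daily k-line, W = weekly, M = monthly, 5 = 5-minute, 15 = 15-minute, 30 = 30-minute, 60 = 60-minute; default D
      retry_count : int, default 3
                 number of times to retry when network or similar problems occur
      pause : int, default 0
                seconds to pause between repeated requests, to avoid problems caused by requests spaced too closely
      drop_factor : bool, default True
                whether to drop the adjustment factor; it may matter little during analysis, but keeping it is more flexible if the data is stored in a database first and analyzed later
    """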

 

Next, a line-by-line analysis (code from get_k_data itself is marked in red):

symbol = ct.INDEX_SYMBOL[code] if index else _code_to_symbol(code)
url = ''
dataflag = ''

If index is True, the symbol is looked up directly in the predefined dictionary ct.INDEX_SYMBOL; if index is False, the function _code_to_symbol is called:

def _code_to_symbol(code):
    """
        Generate the symbol code identifier
    """
    if code in ct.INDEX_LABELS:
        return ct.INDEX_LIST[code]
    else:
        if len(code) != 6 :
            return ''
        else:
            return 'sh%s'%code if code[:1] in ['5', '6', '9'] else 'sz%s'%code

The definitions of INDEX_LABELS and INDEX_LIST:

INDEX_LABELS = ['sh', 'sz', 'hs300', 'sz50', 'cyb', 'zxb', 'zx300', 'zh500']
INDEX_LIST = {'sh': 'sh000001', 'sz': 'sz399001', 'hs300': 'sz399300',
              'sz50': 'sh000016', 'zxb': 'sz399005', 'cyb': 'sz399006', 'zx300': 'sz399008', 'zh500':'sh000905'}

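If code is one of INDEX_LABELS, the matching index symbol from INDEX_LIST is returned, and a code that is not exactly six characters long yields an empty string. Otherwise, a code starting with '5', '6', or '9' gets the prefix sh, and any other code gets the prefix sz.

So the main job of this symbol is to prepend sh or sz to code.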
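
To make the mapping concrete, here is a minimal standalone sketch of the same lookup that runs without tushare installed. The function name code_to_symbol and the module-level constants are copies made for this example; they are not tushare's own objects.

```python
# Copies of the constants quoted above, so the sketch runs without tushare.
INDEX_LABELS = ['sh', 'sz', 'hs300', 'sz50', 'cyb', 'zxb', 'zx300', 'zh500']
INDEX_LIST = {'sh': 'sh000001', 'sz': 'sz399001', 'hs300': 'sz399300',
              'sz50': 'sh000016', 'zxb': 'sz399005', 'cyb': 'sz399006',
              'zx300': 'sz399008', 'zh500': 'sh000905'}


def code_to_symbol(code):
    """Illustrative re-implementation of the prefixing rule described above."""
    if code in INDEX_LABELS:
        # Index shorthand such as 'hs300' maps to a full index symbol.
        return INDEX_LIST[code]
    if len(code) != 6:
        # Anything that is not a six-character code is rejected.
        return ''
    # Codes starting with 5, 6 or 9 get 'sh'; all others get 'sz'.
    prefix = 'sh' if code[:1] in ('5', '6', '9') else 'sz'
    return prefix + code


for sample in ('600848', '002792', 'hs300', '123'):
    print(repr(sample), '->', repr(code_to_symbol(sample)))
```

Note that the empty-string result for a malformed code is passed on silently, so a caller only notices the problem later, when the request for that symbol fails.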

    if ktype.upper() in ct.K_LABELS:                                       # K_LABELS = ['D', 'W', 'M']
        fq = autype if autype is not None else ''                          # whether to adjust prices, and which adjustment type
        if code[:1] in ('1', '5') or index:                                # code starts with '1' or '5', or index is True (it is an index)
            fq = ''
        kline = '' if autype is None else 'fq'                             # only autype=None means unadjusted
        url = ct.KLINE_TT_URL%(ct.P_TYPE['http'], ct.DOMAINS['tt'],        # P_TYPE = {'http': 'http://', 'ftp': 'ftp://'}; DOMAINS is defined below
                                kline, fq, symbol,                         # '' or 'fq'; the adjustment type or ''; code with the sh/sz prefix
                                ct.TT_K_TYPE[ktype.upper()], start, end,   # TT_K_TYPE = {'D': 'day', 'W': 'week', 'M': 'month'}
                                fq, _random(17))                           # the adjustment type or ''; a random number between 10**16 and 10**17-1
        dataflag = '%s%s'%(fq, ct.TT_K_TYPE[ktype.upper()])                # adjustment type (or '') followed by 'day', 'week', or 'month'
    elif ktype in ct.K_MIN_LABELS:                                         # K_MIN_LABELS = ['5', '15', '30', '60']
        url = ct.KLINE_TT_MIN_URL%(ct.P_TYPE['http'], ct.DOMAINS['tt'],    # essentially the same as above
                                    symbol, ktype, ktype,
                                    _random(16))
        dataflag = 'm%s'%ktype                                             # 'm5', 'm15', 'm30', or 'm60'
    else:
        raise TypeError('ktype input error.')

The definition of DOMAINS:
DOMAINS = {'sina': 'sina.com.cn', 'sinahq': 'sinajs.cn',
           'ifeng': 'ifeng.com', 'sf': 'finance.sina.com.cn',
           'vsf': 'vip.stock.finance.sina.com.cn', 
           'idx': 'www.csindex.com.cn', '163': 'money.163.com',
           'em': 'eastmoney.com', 'sseq': 'query.sse.com.cn',
           'sse': 'www.sse.com.cn', 'szse': 'www.szse.cn',
           'oss': '218.244.146.57', 'idxip':'115.29.204.48',
           'shibor': 'www.shibor.org', 'mbox':'www.cbooo.cn',
           'tt': 'gtimg.cn'}

  

The definitions of the two URL templates used above:

KLINE_TT_URL = '%sweb.ifzq.%s/appstock/app/%skline/get?_var=kline_day%s&param=%s,%s,%s,%s,320,%s&r=0.%s'
KLINE_TT_MIN_URL = '%sifzq.%s/appstock/app/kline/mkline?param=%s,m%s,,320&_var=m%s_today&r=0.%s'
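
As a rough sketch of how the daily/weekly/monthly branch fills in KLINE_TT_URL, the snippet below rebuilds the URL and dataflag using only the standard library. The helper names build_daily_url and random_digits are hypothetical and exist only for this example; random_digits stands in for tushare's _random, based on the range described in the comments above. No request is sent.

```python
import random

# Template and lookup values copied from the definitions quoted in this post.
KLINE_TT_URL = ('%sweb.ifzq.%s/appstock/app/%skline/get?_var=kline_day%s'
                '&param=%s,%s,%s,%s,320,%s&r=0.%s')
P_TYPE_HTTP = 'http://'
TT_DOMAIN = 'gtimg.cn'
TT_K_TYPE = {'D': 'day', 'W': 'week', 'M': 'month'}


def random_digits(n):
    """Hypothetical stand-in for _random: an n-digit random integer as a string."""
    return str(random.randint(10 ** (n - 1), 10 ** n - 1))


def build_daily_url(code, symbol, ktype='D', autype='qfq',
                    start='', end='', index=False):
    """Mirror the D/W/M branch: return (url, dataflag) without any network I/O."""
    fq = autype if autype is not None else ''
    if code[:1] in ('1', '5') or index:
        fq = ''  # no adjustment type for these codes or for indexes
    kline = '' if autype is None else 'fq'
    period = TT_K_TYPE[ktype.upper()]
    url = KLINE_TT_URL % (P_TYPE_HTTP, TT_DOMAIN, kline, fq, symbol,
                          period, start, end, fq, random_digits(17))
    dataflag = fq + period
    return url, dataflag


url, dataflag = build_daily_url('002792', 'sz002792', autype='hfq',
                                start='2016-10-26', end='2016-11-08')
print(url)
print(dataflag)
```

With autype='hfq' the _var part of the URL becomes kline_dayhfq and dataflag becomes hfqday, which are the two names that appear in the sample response later in this post.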

 

    for _ in range(retry_count):                                     # retry_count is the number of attempts; _ is just a loop variable, like i
        time.sleep(pause)                                            # pause between attempts
        try:
            request = Request(url)                                   # use the url built above
            lines = urlopen(request, timeout = 10).read()            # read the response
            if len(lines) < 100: #no data                            # a response this short means no data was read
                return None
        except Exception as e:
            print(e)
        else:
            lines = lines.decode('utf-8') if ct.PY3 else lines      # PY3 = (sys.version_info[0] >= 3); a sample of the decoded lines is shown below
            lines = lines.split('=')[1]                             # split on '=' and keep the piece at index 1, i.e. the part after the '='
            reg = re.compile(r',{"nd.*?}') 
            lines = re.subn(reg, '', lines)                         # regex substitution on lines
            js = json.loads(lines[0])                               # lines[0] because subn returns a tuple; lines[1] is the number of substitutions
            df = pd.DataFrame(js['data'][symbol][dataflag], columns=ct.KLINE_TT_COLS)   # KLINE_TT_COLS holds the six column headers: date, open, close, etc.
            df['code'] = symbol if index else code                                      # add a code column, set to the index symbol or the stock code
            if ktype in ct.K_MIN_LABELS:                                                # for minute k-line data
                df['date'] = df['date'].map(lambda x: '%s-%s-%s %s:%s'%(x[0:4], x[4:6], 
                                                                        x[6:8], x[8:10], 
                                                                        x[10:12]))      # reformat date as year-month-day hour:minute
            return df
    raise IOError(ct.NETWORK_URL_ERROR_MSG)
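
The date rewrite for minute data is easiest to see on a single value. The sketch below applies the same slicing to one string; the function name format_minute_stamp is made up for this example, and the 12-digit YYYYMMDDHHMM input format is an assumption inferred from the slice positions, not something confirmed by a minute-data sample in this post.

```python
def format_minute_stamp(x):
    """Apply the same slicing as the lambda above to one raw timestamp string."""
    return '%s-%s-%s %s:%s' % (x[0:4], x[4:6], x[6:8], x[8:10], x[10:12])


# Assumed raw format: year, month, day, hour, minute packed into 12 digits.
formatted = format_minute_stamp('201611091500')
print(formatted)
```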

A sample of lines after decoding:

kline_dayhfq={"code":0,"msg":"","data":{"sz002792":{"hfqday":[["2016-10-26","84.635","82.541","85.268","82.149","27380.000"],
["2016-10-27","82.707","82.556","83.038","80.748","22315.000"],["2016-10-28","82.903","82.571","83.731","78.428","22165.000"],
["2016-10-31","82.541","81.502","82.556","79.995","16437.000"],["2016-11-01","81.517","84.319","85.072","81.517","30741.000"],
["2016-11-02","84.349","82.873","85.268","82.707","30526.000"],["2016-11-03","81.200","81.984","83.611","81.200","24593.000"],
["2016-11-04","81.863","85.720","86.729","81.863","57996.000"],["2016-11-07","85.464","85.991","86.383","84.756","31572.000"],
["2016-11-08","86.292","84.801","86.322","79.845","29328.000"]],
"qt":{"sz002792":["51","\u901a\u5b87\u901a\u8baf","002792","55.91","56.29","56.25","36536","18510","18026","55.91","38","55.90","127",
"55.89","201","55.85","10","55.83","10","55.99","30","56.00","3","56.10","10","56.12","8","56.15","26",
"15:00:04\/55.91\/301\/S\/1682891\/15265|14:57:00\/55.89\/1\/B\/5589\/15163|14:56:52\/55.71\/90\/S\/503812\/15154|
14:56:45\/55.89\/18\/B\/100602\/15146|14:56:39\/55.82\/8\/S\/44544\/15140|14:56:36\/56.12\/12\/B\/67324\/15136","20161109150137",
"-0.38","-0.68","56.75","54.46","55.89\/36235\/201929177","36536","20361","8.12","56.40","","56.75","54.46","4.07","25.16",
"125.80","7.09","61.92","50.66","1.05"],"market":["2016-11-09 20:57:01|HK_close_\u5df2\u6536\u76d8|SH_close_\u5df2\u6536\u76d8|
SZ_close_\u5df2\u6536\u76d8|US_close_\u672a\u5f00\u76d8|SQ_close_\u5df2\u4f11\u5e02|DS_close_\u5df2\u4f11\u5e02|ZS_close_
\u5df2\u4f11\u5e02"],"zjlx":["sz002792","8206.89","10347.24","-2140.35","-10.51","12154.32","10013.97","2140.35","10.51",
"20361.21","41080.23","41732.96","\u901a\u5b87\u901a\u8baf","20161109","20161108^5889.20^7540.99","20161107^6888.64^7504.11",
"20161104^15471.59^10227.30","20161103^4623.91^6113.32"]},"mx_price":{"mx":{"data":[],"timeline":[]},"price":{"data":[]}},
"prec":"22.940","version":"5"}}}
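
The parsing steps can be tried offline on a trimmed copy of the sample above. This is only a sketch: the string raw is a hand-shortened version of that response (two rows, with the qt and mx_price blocks removed), the function name parse_kline is hypothetical, and the column order in COLS is an assumption about KLINE_TT_COLS. Plain dictionaries are used instead of a pandas DataFrame so that the example needs only the standard library.

```python
import json
import re

# Hand-trimmed version of the sample response (two rows only).
raw = ('kline_dayhfq={"code":0,"msg":"","data":{"sz002792":{"hfqday":['
       '["2016-10-26","84.635","82.541","85.268","82.149","27380.000"],'
       '["2016-10-27","82.707","82.556","83.038","80.748","22315.000"]],'
       '"prec":"22.940","version":"5"}}}')

# Assumed column order for KLINE_TT_COLS.
COLS = ['date', 'open', 'close', 'high', 'low', 'volume']


def parse_kline(text, symbol, dataflag):
    """Strip the JavaScript variable prefix, decode the JSON, return row dicts."""
    payload = text.split('=')[1]  # keep the part after 'kline_dayhfq='
    # Same pattern as in get_k_data; re.subn returns (new_string, count).
    payload, _count = re.subn(r',{"nd.*?}', '', payload)
    js = json.loads(payload)
    rows = js['data'][symbol][dataflag]
    return [dict(zip(COLS, row)) for row in rows]


rows = parse_kline(raw, 'sz002792', 'hfqday')
for row in rows:
    print(row)
```

All values come back as strings, exactly as they appear in the response, so any numeric work would need an explicit conversion first.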

  

I will only pick through code in this much detail this once; from now on I should work more efficiently and keep my notes briefer.

