Common Python Modules


I. The os module

The os module is an interface for interacting with the operating system. It must be imported before use.

import os

# Files and directories
os.getcwd()                 # get the current working directory of the script, e.g. 'C:\\Users\\Administrator\\代碼測試'
os.chdir('C:\\Users\\Administrator\\代碼測試') # change the current working directory
os.curdir                   # the current directory, '.'
os.pardir                   # the parent of the current directory, '..'
os.mkdir("test1")           # create a single directory
os.rmdir("test1")           # remove a single empty directory; raises an error if the directory is not empty
os.makedirs('tes2/test3')   # create nested directories recursively
os.removedirs('tes2/test3') # if the innermost directory is empty, remove it, then recurse upward removing each parent that is also empty
os.listdir('tes2')          # list all files and subdirectories (including hidden ones) as a list, e.g. ['test3']
os.remove('test/1.txt')     # delete a file
os.rename("test","newtest") # rename a file or directory
os.stat('tes2/test3')       # get file/directory metadata, e.g. os.stat_result(st_mode=16895, st_ino=4785074604532972,
                            # st_dev=3189517348, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1570541912,
                            # st_mtime=1570541912, st_ctime=1570541912)

# Path operations, via the os.path submodule
os.path.abspath('test3') # return the absolute path of the given file or directory, e.g. 'C:\\Users\\Administrator\\代碼測試\\test3'
os.path.split('tes2/text3/1.txt') # split the path into a (directory, filename) tuple, e.g. ('tes2/text3', '1.txt')
os.path.dirname('tes2/text3/1.txt') # return the directory part of the path, i.e. the first element of os.path.split(path), e.g. 'tes2/text3'
os.path.basename('tes2/test3/1.txt') # return the final component of the path; if the path ends with / or \, an empty string is returned. It is the second element of os.path.split(path), e.g. '1.txt'
os.path.join('tes2','test3','test.txt') # join multiple path components, e.g. 'tes2\\test3\\test.txt'

os.path.exists('test.txt') # return True if the path exists (works for both files and directories), otherwise False
os.path.isdir('tes2') # return True if the given path is an existing directory
os.path.isfile('test.txt') # return True if the given path is an existing file, e.g. True

os.path.getsize('tes2/test3/1.txt') # return the size of path, in bytes
os.path.getatime('tes2/test3/1.txt') # return the last access time of path, e.g. 1570588470.5294871
os.path.getmtime('tes2/test3/1.txt') # return the last modification time of path, e.g. 1570588470.5294871

# Attributes of the os module
os.name    # the name of the current platform: 'nt' on Windows, 'posix' on Linux
os.sep     # the OS-specific path separator: '\\' on Windows, '/' on Linux
os.linesep # the line terminator used by the current platform: '\r\n' on Windows, '\n' on Linux
os.pathsep # the separator used between search paths (e.g. in PATH): ';' on Windows, ':' on Linux
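
As a small exercise combining the calls above, the sketch below walks a directory tree using only os.listdir, os.path.join, os.path.isdir and os.path.getsize; the directory name 'tes2' is just the sample directory from the earlier examples.

import os

def list_tree(root, indent=0):
    """Recursively print every entry under root, with file sizes in bytes."""
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)          # join with the correct separator for the platform
        if os.path.isdir(path):
            print(' ' * indent + name + os.sep)  # mark directories with the platform separator
            list_tree(path, indent + 2)
        else:
            print(' ' * indent + '%s (%d bytes)' % (name, os.path.getsize(path)))

list_tree('tes2')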

II. The sys module

The sys module is an interface for interacting with the Python interpreter. It must be imported before use.

1. sys.argv: the list of command-line arguments passed to the currently executing program

# output (here, from inside a Jupyter kernel)
[
'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\ipykernel_launcher.py',
 '-f',
 'C:\\Users\\Administrator\\AppData\\Roaming\\jupyter\\runtime\\kernel-6c1095fc-19fa-4e5e-872b-192b7fcd1c55.json'

]

sys.argv is the bridge through which a program receives external arguments; sys.argv[0] is the file path of the script itself.
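
A minimal sketch of a script that reads its own arguments; the file name greet.py and the argument handling are purely illustrative.

# greet.py -- run as: python greet.py Alice Bob
import sys

def main():
    print('script path:', sys.argv[0])   # the script itself
    names = sys.argv[1:]                 # everything after the script name
    if not names:
        print('usage: python greet.py NAME [NAME ...]')
        sys.exit(1)                      # a non-zero exit code signals an error to the shell
    for name in names:
        print('hello,', name)

if __name__ == '__main__':
    main()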

2. sys.modules.keys(): returns the names of all modules that have already been imported

dict_keys([
'traitlets.config.application', 
'jupyter_client.blocking.channels',
'IPython.utils.terminal',
'errno',
'jupyter_client.localinterfaces',
'ipywidgets.widgets.valuewidget',
'math',
'datetime',
'IPython.core.magics.basic',
'asyncio.windows_utils',
'IPython.core.completer',
'jupyter_client.connect',
'IPython.core.logger',
'jupyter_client.jsonutil',
'_functools',
'tornado.gen',
'encodings.latin_1',
'uuid',
...
])

3. sys.path: a list of directories that Python searches when importing modules. It can be extended at runtime to import modules from custom locations; see the sketch after the listing below.

[
 'c:\\users\\administrator\\envs\\automatic\\scripts\\python35.zip',
 'c:\\users\\administrator\\envs\\automatic\\DLLs',
 'c:\\users\\administrator\\envs\\automatic\\lib',
 'c:\\users\\administrator\\envs\\automatic\\scripts',
 'e:\\python\\python3.5.2\\progress\\Lib',
 'e:\\python\\python3.5.2\\progress\\DLLs',
 'c:\\users\\administrator\\envs\\automatic',
 '',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\pip-19.0.3-py3.5.egg',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\win32',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\win32\\lib',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\Pythonwin',
 'c:\\users\\administrator\\envs\\automatic\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\Administrator\\.ipython'
]
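
A minimal sketch of extending the search path at runtime so that a module in a non-standard directory can be imported; the directory my_libs and the module mytool are hypothetical.

import os
import sys

extra_dir = os.path.abspath('my_libs')  # hypothetical directory containing mytool.py
if extra_dir not in sys.path:
    sys.path.append(extra_dir)          # 'import mytool' will now also search my_libs/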

4. sys.exit(n): calling sys.exit(n) exits the program partway through; sys.exit(0) indicates a normal exit. The call raises a SystemExit exception carrying n, so the exit can be intercepted by catching that exception in the calling code; by convention a non-zero n signals an abnormal exit.

import sys
try:
    sys.exit(1)
except SystemExit as e:
    print(e) #1

5. sys.version: the version information of the Python interpreter

>>> import sys
>>> sys.version
'3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)]'

6. sys.platform: returns the name of the operating system platform

>>> import sys
>>> sys.platform
'win32'
>>>

7. sys.stdin, sys.stdout, sys.stderr: standard input, standard output, and standard error

Standard input usually comes from the keyboard. The stdin object supplies the interpreter with a stream of input characters; it is normally read via input() (raw_input() in Python 2 only), or directly as shown below.

import sys

print("Please input message:")
name = sys.stdin.readline()
print(name)

# output
"""
Please input message:
hello
hello
"""

Standard output usually goes to the screen. The stdout object receives the output produced by print.

import sys

sys.stdout.write("hello")
sys.stdout.flush()
"""
output:
hello
"""

Calling Python's print function actually calls sys.stdout.write(): print('hello') is roughly equivalent to sys.stdout.write('hello\n'). print writes the content to the console and appends a newline.
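
Because print goes through sys.stdout, all print output can be redirected simply by rebinding sys.stdout, as in this small sketch (the file name print_log.txt is just an example):

import sys

original_stdout = sys.stdout
with open('print_log.txt', 'w') as f:
    sys.stdout = f                    # print now writes into the file
    print('this line goes to print_log.txt')
    sys.stdout = original_stdout      # restore the console
print('this line goes to the console again')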

III. The json & pickle modules

Both of these modules are serialization modules. What is serialization? Turning an object (variable) in memory into something that can be stored or transmitted is called serialization (pickling). Conversely, reading the contents of a variable from its serialized form back into memory is called deserialization (unpickling).

Once serialized, the data can be written to disk for persistent storage.

(1) The json module

For reference, the module docstring and top-level source of the standard library's json/__init__.py are reproduced below:

r"""JSON (JavaScript Object Notation) <http://json.org> is a subset of
JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
interchange format.

:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.  It is derived from a
version of the externally maintained simplejson library.

Encoding basic Python object hierarchies::

    >>> import json
    >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
    '["foo", {"bar": ["baz", null, 1.0, 2]}]'
    >>> print(json.dumps("\"foo\bar"))
    "\"foo\bar"
    >>> print(json.dumps('\u1234'))
    "\u1234"
    >>> print(json.dumps('\\'))
    "\\"
    >>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
    {"a": 0, "b": 0, "c": 0}
    >>> from io import StringIO
    >>> io = StringIO()
    >>> json.dump(['streaming API'], io)
    >>> io.getvalue()
    '["streaming API"]'

Compact encoding::

    >>> import json
    >>> from collections import OrderedDict
    >>> mydict = OrderedDict([('4', 5), ('6', 7)])
    >>> json.dumps([1,2,3,mydict], separators=(',', ':'))
    '[1,2,3,{"4":5,"6":7}]'

Pretty printing::

    >>> import json
    >>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
    {
        "4": 5,
        "6": 7
    }

Decoding JSON::

    >>> import json
    >>> obj = ['foo', {'bar': ['baz', None, 1.0, 2]}]
    >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj
    True
    >>> json.loads('"\\"foo\\bar"') == '"foo\x08ar'
    True
    >>> from io import StringIO
    >>> io = StringIO('["streaming API"]')
    >>> json.load(io)[0] == 'streaming API'
    True

Specializing JSON object decoding::

    >>> import json
    >>> def as_complex(dct):
    ...     if '__complex__' in dct:
    ...         return complex(dct['real'], dct['imag'])
    ...     return dct
    ...
    >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
    ...     object_hook=as_complex)
    (1+2j)
    >>> from decimal import Decimal
    >>> json.loads('1.1', parse_float=Decimal) == Decimal('1.1')
    True

Specializing JSON object encoding::

    >>> import json
    >>> def encode_complex(obj):
    ...     if isinstance(obj, complex):
    ...         return [obj.real, obj.imag]
    ...     raise TypeError(repr(o) + " is not JSON serializable")
    ...
    >>> json.dumps(2 + 1j, default=encode_complex)
    '[2.0, 1.0]'
    >>> json.JSONEncoder(default=encode_complex).encode(2 + 1j)
    '[2.0, 1.0]'
    >>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j))
    '[2.0, 1.0]'


Using json.tool from the shell to validate and pretty-print::

    $ echo '{"json":"obj"}' | python -m json.tool
    {
        "json": "obj"
    }
    $ echo '{ 1.2:3.4}' | python -m json.tool
    Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
"""
__version__ = '2.0.9'
__all__ = [
    'dump', 'dumps', 'load', 'loads',
    'JSONDecoder', 'JSONDecodeError', 'JSONEncoder',
]

__author__ = 'Bob Ippolito <bob@redivi.com>'

from .decoder import JSONDecoder, JSONDecodeError
from .encoder import JSONEncoder

_default_encoder = JSONEncoder(
    skipkeys=False,
    ensure_ascii=True,
    check_circular=True,
    allow_nan=True,
    indent=None,
    separators=None,
    default=None,
)

def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True,
        allow_nan=True, cls=None, indent=None, separators=None,
        default=None, sort_keys=False, **kw):
    """Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
    ``.write()``-supporting file-like object).

    If ``skipkeys`` is true then ``dict`` keys that are not basic types
    (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
    instead of raising a ``TypeError``.

    If ``ensure_ascii`` is false, then the strings written to ``fp`` can
    contain non-ASCII characters if they appear in strings contained in
    ``obj``. Otherwise, all such characters are escaped in JSON strings.

    If ``check_circular`` is false, then the circular reference check
    for container types will be skipped and a circular reference will
    result in an ``OverflowError`` (or worse).

    If ``allow_nan`` is false, then it will be a ``ValueError`` to
    serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``)
    in strict compliance of the JSON specification, instead of using the
    JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).

    If ``indent`` is a non-negative integer, then JSON array elements and
    object members will be pretty-printed with that indent level. An indent
    level of 0 will only insert newlines. ``None`` is the most compact
    representation.

    If specified, ``separators`` should be an ``(item_separator, key_separator)``
    tuple.  The default is ``(', ', ': ')`` if *indent* is ``None`` and
    ``(',', ': ')`` otherwise.  To get the most compact JSON representation,
    you should specify ``(',', ':')`` to eliminate whitespace.

    ``default(obj)`` is a function that should return a serializable version
    of obj or raise TypeError. The default simply raises TypeError.

    If *sort_keys* is ``True`` (default: ``False``), then the output of
    dictionaries will be sorted by key.

    To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
    ``.default()`` method to serialize additional types), specify it with
    the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.

    """
    # cached encoder
    if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        default is None and not sort_keys and not kw):
        iterable = _default_encoder.iterencode(obj)
    else:
        if cls is None:
            cls = JSONEncoder
        iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
            check_circular=check_circular, allow_nan=allow_nan, indent=indent,
            separators=separators,
            default=default, sort_keys=sort_keys, **kw).iterencode(obj)
    # could accelerate with writelines in some versions of Python, at
    # a debuggability cost
    for chunk in iterable:
        fp.write(chunk)


def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
        allow_nan=True, cls=None, indent=None, separators=None,
        default=None, sort_keys=False, **kw):
    """Serialize ``obj`` to a JSON formatted ``str``.

    If ``skipkeys`` is true then ``dict`` keys that are not basic types
    (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
    instead of raising a ``TypeError``.

    If ``ensure_ascii`` is false, then the return value can contain non-ASCII
    characters if they appear in strings contained in ``obj``. Otherwise, all
    such characters are escaped in JSON strings.

    If ``check_circular`` is false, then the circular reference check
    for container types will be skipped and a circular reference will
    result in an ``OverflowError`` (or worse).

    If ``allow_nan`` is false, then it will be a ``ValueError`` to
    serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
    strict compliance of the JSON specification, instead of using the
    JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).

    If ``indent`` is a non-negative integer, then JSON array elements and
    object members will be pretty-printed with that indent level. An indent
    level of 0 will only insert newlines. ``None`` is the most compact
    representation.

    If specified, ``separators`` should be an ``(item_separator, key_separator)``
    tuple.  The default is ``(', ', ': ')`` if *indent* is ``None`` and
    ``(',', ': ')`` otherwise.  To get the most compact JSON representation,
    you should specify ``(',', ':')`` to eliminate whitespace.

    ``default(obj)`` is a function that should return a serializable version
    of obj or raise TypeError. The default simply raises TypeError.

    If *sort_keys* is ``True`` (default: ``False``), then the output of
    dictionaries will be sorted by key.

    To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
    ``.default()`` method to serialize additional types), specify it with
    the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.

    """
    # cached encoder
    if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        default is None and not sort_keys and not kw):
        return _default_encoder.encode(obj)
    if cls is None:
        cls = JSONEncoder
    return cls(
        skipkeys=skipkeys, ensure_ascii=ensure_ascii,
        check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        separators=separators, default=default, sort_keys=sort_keys,
        **kw).encode(obj)


_default_decoder = JSONDecoder(object_hook=None, object_pairs_hook=None)


def load(fp, cls=None, object_hook=None, parse_float=None,
        parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
    """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
    a JSON document) to a Python object.

    ``object_hook`` is an optional function that will be called with the
    result of any object literal decode (a ``dict``). The return value of
    ``object_hook`` will be used instead of the ``dict``. This feature
    can be used to implement custom decoders (e.g. JSON-RPC class hinting).

    ``object_pairs_hook`` is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs.  The
    return value of ``object_pairs_hook`` will be used instead of the ``dict``.
    This feature can be used to implement custom decoders that rely on the
    order that the key and value pairs are decoded (for example,
    collections.OrderedDict will remember the order of insertion). If
    ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.

    To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
    kwarg; otherwise ``JSONDecoder`` is used.

    """
    return loads(fp.read(),
        cls=cls, object_hook=object_hook,
        parse_float=parse_float, parse_int=parse_int,
        parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)


def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
        parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
    """Deserialize ``s`` (a ``str`` instance containing a JSON
    document) to a Python object.

    ``object_hook`` is an optional function that will be called with the
    result of any object literal decode (a ``dict``). The return value of
    ``object_hook`` will be used instead of the ``dict``. This feature
    can be used to implement custom decoders (e.g. JSON-RPC class hinting).

    ``object_pairs_hook`` is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs.  The
    return value of ``object_pairs_hook`` will be used instead of the ``dict``.
    This feature can be used to implement custom decoders that rely on the
    order that the key and value pairs are decoded (for example,
    collections.OrderedDict will remember the order of insertion). If
    ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.

    ``parse_float``, if specified, will be called with the string
    of every JSON float to be decoded. By default this is equivalent to
    float(num_str). This can be used to use another datatype or parser
    for JSON floats (e.g. decimal.Decimal).

    ``parse_int``, if specified, will be called with the string
    of every JSON int to be decoded. By default this is equivalent to
    int(num_str). This can be used to use another datatype or parser
    for JSON integers (e.g. float).

    ``parse_constant``, if specified, will be called with one of the
    following strings: -Infinity, Infinity, NaN, null, true, false.
    This can be used to raise an exception if invalid JSON numbers
    are encountered.

    To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
    kwarg; otherwise ``JSONDecoder`` is used.

    The ``encoding`` argument is ignored and deprecated.

    """
    if not isinstance(s, str):
        raise TypeError('the JSON object must be str, not {!r}'.format(
                            s.__class__.__name__))
    if s.startswith(u'\ufeff'):
        raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
                              s, 0)
    if (cls is None and object_hook is None and
            parse_int is None and parse_float is None and
            parse_constant is None and object_pairs_hook is None and not kw):
        return _default_decoder.decode(s)
    if cls is None:
        cls = JSONDecoder
    if object_hook is not None:
        kw['object_hook'] = object_hook
    if object_pairs_hook is not None:
        kw['object_pairs_hook'] = object_pairs_hook
    if parse_float is not None:
        kw['parse_float'] = parse_float
    if parse_int is not None:
        kw['parse_int'] = parse_int
    if parse_constant is not None:
        kw['parse_constant'] = parse_constant
    return cls(**kw).decode(s)
(End of the json/__init__.py source.)

1. dumps & loads

dumps converts an object (dict, list, and so on) into a str; loads converts a str back into an object (dict, list, and so on).

# serialization
>>> import json
>>> json.dumps({'username':'root','password':'abc123'}) #dict-->str
'{"password": "abc123", "username": "root"}'
>>> json.dumps(['root','abc123']) #list-->str
'["root", "abc123"]'
>>> json.dumps([{'root':12,'flex':25}]) #list(dict)-->str
'[{"flex": 25, "root": 12}]'

# deserialization
>>> json.loads('[{"flex": 25, "root": 12}]')
[{'flex': 25, 'root': 12}]

2. dump & load

# dump stores a data object in a file
>>> import json
>>> f = open('json_test.txt','w') # dump takes a file handle and writes the dict into the file as a JSON string
>>> dict = {'username':'root'}
>>> json.dump(dict,f)
>>> f.close()
# check the file's path
>>> import os
>>> os.path.abspath('json_test.txt')
'C:\\Users\\Administrator\\json_test.txt'
>>>

# read the JSON stored in the file back into an object
>>> f = open('json_test.txt')
>>> json.load(f) # load takes a file handle and converts the JSON string in the file back into a data structure
{'username': 'root'}
>>>

3. Key parameters

  • ensure_ascii

If ensure_ascii is true (the default), all non-ASCII characters in the input are escaped in the output. If ensure_ascii is false, these characters are output as-is.

  • indent

If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. An indent level of 0, a negative value, or "" only inserts newlines. None (the default) selects the most compact representation. A positive integer indents each level by that many spaces; if indent is a string (such as "\t"), that string is used to indent each level.

# pretty printing
>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}
  • sort_keys

If sort_keys is true (default: False), dictionaries are output with their keys sorted.

4. dump vs dumps

dump takes an extra argument that behaves like a file (a file-like object, not literally a file pointer), so it integrates with file operations: it converts the dict to a str and writes it into the file. dumps is given just the dict and simply returns the converted str.

(2) The pickle module

pickle is a serialization module specific to Python, whereas json can be read by every language. If the serialized data will only ever be consumed by Python, pickle is fine; if other languages also need to read it, use json instead.

pickle likewise provides the same four functions: dumps, dump, loads, and load.

pickle can also serialize a much richer set of data types:

  • All native Python types: booleans, integers, floats, complex numbers, strings, bytes, None.
  • Lists, tuples, dictionaries and sets made up of any native types.
  • Functions, classes, and class instances (see the sketch below).
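
A minimal sketch of that difference: a class instance round-trips through pickle, while json.dumps raises TypeError for it. The User class here is purely illustrative.

import json
import pickle

class User:
    def __init__(self, name):
        self.name = name

u = User('root')

data = pickle.dumps(u)       # pickle happily serializes the instance
restored = pickle.loads(data)
print(restored.name)         # 'root'

try:
    json.dumps(u)            # json only understands basic types
except TypeError as e:
    print('json refused:', e)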

1. dumps & loads

>>> import pickle
>>> dict = {'username':'root','password':'abc123'}
>>> pickle.dumps(dict) # the serialized result is a binary bytes string
b'\x80\x03}q\x00(X\x08\x00\x00\x00passwordq\x01X\x06\x00\x00\x00abc12
\x00\x00\x00usernameq\x03X\x04\x00\x00\x00rootq\x04u.'
>>> ret = pickle.dumps(dict)
>>> pickle.loads(ret) # deserialize
{'password': 'abc123', 'username': 'root'}
>>>

2. dump & load

>>> import pickle
# serialize into a file; note that the data is binary
>>> dict = {'username':'root','password':'abc123'}
>>> f = open('pickle_test.txt','wb') # open the file for writing in binary mode
>>> pickle.dump(dict,f)
>>> f.close()
# deserialize: read the data object back from the file
>>> f = open('pickle_test.txt','rb')
>>> pickle.load(f)
{'password': 'abc123', 'username': 'root'}
>>>

IV. The random module

(1) Integer functions

1. random.randrange(stop), random.randrange(start, stop[, step])

Return a randomly selected element from range(start, stop, step), without actually building the range object.

>>> import random

>>> random.randrange(4) # an integer in [0, 4), i.e. >= 0 and < 4
3
>>> random.randrange(4,10) # with a start value: an integer in [4, 10)
9
>>> random.randrange(4,10,2) # with a step: an even integer >= 4 and < 10
6

2. random.randint(a, b)

Return a random integer N such that a <= N <= b; equivalent to randrange(a, b+1).

>>> import random
>>> random.randint(4,9)
8

(2) Sequence functions

1. random.choice(seq)

Return a randomly chosen element from the non-empty sequence seq; raises IndexError if seq is empty.

>>> import random

>>> l = [5,6,8,{'username':'root'}]
>>> random.choice(l)
5
>>> random.choice(l)
{'username': 'root'}
>>>

2. random.sample(population, k)

Return a k-length list of unique elements chosen from the population sequence or set. Commonly used for random sampling without replacement.

It returns a new list containing elements from the population, leaving the original population unchanged. The resulting list is in selection order, so every sub-slice is also a valid random sample.

>>> import random

>>> l = [5,6,8,{'username':'root'}]
# draw two elements at random from the sequence
>>> random.sample(l,2)
[5, 6]
>>> random.sample(l,2)
[6, {'username': 'root'}]

3. random.shuffle(x[, random])

Shuffle the sequence x in place. The optional argument random is a zero-argument function returning a random float in [0.0, 1.0); by default this is the function random().

>>> import random

>>> l = [5,6,8,{'username':'root'}]
# shuffle the original order of the list l
>>> random.shuffle(l)
>>> l
[8, 6, {'username': 'root'}, 5]
>>> random.shuffle(l)
>>> l
[6, 8, {'username': 'root'}, 5]
>>>

(3) Real-valued distributions

The following functions generate specific real-valued distributions. The function parameters are named after the corresponding variables in the distribution's equation, as used in common mathematical practice; most of these equations can be found in any statistics text.

1. random.random()

Return the next random floating point number in the range [0.0, 1.0).

>>> import random
# a number >= 0 and < 1
>>> random.random()
0.04794861013804974
>>> random.random()
0.8553054814287199
>>>

2. random.uniform(a, b)

Generate a random floating point number within a given range; one argument is the lower bound and the other the upper bound. If a < b, the result N satisfies a <= N <= b; if a > b, then b <= N <= a.

>>> import random
# a number in the given range
>>> random.uniform(1,2)
1.0994241967254763
>>> random.uniform(1,2)
1.9169119648591533
>>>

3. random.triangular(low, high, mode)

Return a random floating point number N such that low <= N <= high, with the specified mode between those bounds. The low and high bounds default to 0 and 1, and the mode argument defaults to the midpoint between the bounds, giving a symmetric distribution.

>>> import random

# symmetric distribution
>>> random.triangular() # using the default low and high
0.338741665952415
>>> random.triangular(1,3) # specifying low and high
2.2142236912771205
>>>

4. random.expovariate(lambd)

Exponential distribution. lambd is 1.0 divided by the desired mean; it should be non-zero. Returned values range from 0 to positive infinity if lambd is positive, and from negative infinity to 0 if lambd is negative.

>>> import random
# exponential distribution
>>> random.expovariate(0.3)
8.262242039304834
>>>
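
A quick numerical check of the relation mean = 1/lambd (a rough sketch; the sample size is arbitrary):

import random
from statistics import mean

lambd = 0.3
samples = [random.expovariate(lambd) for _ in range(100000)]
print(mean(samples))  # approximately 1 / 0.3 = 3.33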

5. random.normalvariate(mu, sigma)

Normal distribution. mu is the mean and sigma is the standard deviation.

>>> import random
# normal distribution
>>> random.normalvariate(4,0.5)
3.229766077847942
>>>

(4) Examples

1. Generating a random string

# using the string module
>>> import string
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.digits
'0123456789'
# generate a random string
>>> random.sample(string.ascii_lowercase+string.digits,8)
['k', 'c', 'j', '7', 'z', 'e', 'm', 'y']
>>> l = random.sample(string.ascii_lowercase+string.digits,8)
>>> ''.join(l)
'xitzm5we'
>>>

2. Generating a random verification code

import random

def v_code():
    code = ''
    # build a 5-character code from random digits and uppercase letters
    for i in range(5):
        num = random.randint(0, 9)         # a random digit in [0, 9]
        alf = chr(random.randint(65, 90))  # a random uppercase letter, 'A'-'Z'
        add = random.choice([num, alf])    # pick either the digit or the letter
        code = "".join([code, str(add)])
    return code

print(v_code())

V. The hashlib module

(1) Digest algorithms

Python's hashlib provides the common digest algorithms, such as MD5 and SHA1:

md5(), sha1(), sha224(), sha256(), sha384(), sha512(), blake2b(), blake2s(), sha3_224(), sha3_256(), sha3_384(), sha3_512(), shake_128(), shake_256().

What is a digest algorithm? Digest algorithms are also called hash algorithms. Through a function, they convert data of arbitrary length into a data string of fixed length (usually represented as a hexadecimal string).

In other words, a digest algorithm computes a fixed-length digest from arbitrary-length data via a digest function f(); its purpose is to detect whether the original data has been tampered with.
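
A minimal sketch of that idea: compute a file's SHA-256 digest once, then recompute it later to check that the content has not changed. The file name report.txt is purely illustrative.

import hashlib

def sha256_of_file(path, chunk_size=8192):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)   # repeated update() calls hash the concatenation of all chunks
    return h.hexdigest()

original = sha256_of_file('report.txt')   # store this value somewhere safe
# ... later ...
if sha256_of_file('report.txt') != original:
    print('report.txt has been modified')

For reference, the source of the standard library's hashlib.py is reproduced below: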

#.  Copyright (C) 2005-2010   Gregory P. Smith (greg@krypto.org)
#  Licensed to PSF under a Contributor Agreement.
#

__doc__ = """hashlib module - A common interface to many hash functions.

new(name, data=b'') - returns a new hash object implementing the
                      given hash function; initializing the hash
                      using the given binary data.

Named constructor functions are also available, these are faster
than using new(name):

md5(), sha1(), sha224(), sha256(), sha384(), and sha512()

More algorithms may be available on your platform but the above are guaranteed
to exist.  See the algorithms_guaranteed and algorithms_available attributes
to find out what algorithm names can be passed to new().

NOTE: If you want the adler32 or crc32 hash functions they are available in
the zlib module.

Choose your hash function wisely.  Some have known collision weaknesses.
sha384 and sha512 will be slow on 32 bit platforms.

Hash objects have these methods:
 - update(arg): Update the hash object with the bytes in arg. Repeated calls
                are equivalent to a single call with the concatenation of all
                the arguments.
 - digest():    Return the digest of the bytes passed to the update() method
                so far.
 - hexdigest(): Like digest() except the digest is returned as a unicode
                object of double length, containing only hexadecimal digits.
 - copy():      Return a copy (clone) of the hash object. This can be used to
                efficiently compute the digests of strings that share a common
                initial substring.

For example, to obtain the digest of the string 'Nobody inspects the
spammish repetition':

    >>> import hashlib
    >>> m = hashlib.md5()
    >>> m.update(b"Nobody inspects")
    >>> m.update(b" the spammish repetition")
    >>> m.digest()
    b'\\xbbd\\x9c\\x83\\xdd\\x1e\\xa5\\xc9\\xd9\\xde\\xc9\\xa1\\x8d\\xf0\\xff\\xe9'

More condensed:

    >>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
    'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

"""

# This tuple and __get_builtin_constructor() must be modified if a new
# always available algorithm is added.
__always_supported = ('md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512')

algorithms_guaranteed = set(__always_supported)
algorithms_available = set(__always_supported)

__all__ = __always_supported + ('new', 'algorithms_guaranteed',
                                'algorithms_available', 'pbkdf2_hmac')


__builtin_constructor_cache = {}

def __get_builtin_constructor(name):
    cache = __builtin_constructor_cache
    constructor = cache.get(name)
    if constructor is not None:
        return constructor
    try:
        if name in ('SHA1', 'sha1'):
            import _sha1
            cache['SHA1'] = cache['sha1'] = _sha1.sha1
        elif name in ('MD5', 'md5'):
            import _md5
            cache['MD5'] = cache['md5'] = _md5.md5
        elif name in ('SHA256', 'sha256', 'SHA224', 'sha224'):
            import _sha256
            cache['SHA224'] = cache['sha224'] = _sha256.sha224
            cache['SHA256'] = cache['sha256'] = _sha256.sha256
        elif name in ('SHA512', 'sha512', 'SHA384', 'sha384'):
            import _sha512
            cache['SHA384'] = cache['sha384'] = _sha512.sha384
            cache['SHA512'] = cache['sha512'] = _sha512.sha512
    except ImportError:
        pass  # no extension module, this hash is unsupported.

    constructor = cache.get(name)
    if constructor is not None:
        return constructor

    raise ValueError('unsupported hash type ' + name)


def __get_openssl_constructor(name):
    try:
        f = getattr(_hashlib, 'openssl_' + name)
        # Allow the C module to raise ValueError.  The function will be
        # defined but the hash not actually available thanks to OpenSSL.
        f()
        # Use the C function directly (very fast)
        return f
    except (AttributeError, ValueError):
        return __get_builtin_constructor(name)


def __py_new(name, data=b''):
    """new(name, data=b'') - Return a new hashing object using the named algorithm;
    optionally initialized with data (which must be bytes).
    """
    return __get_builtin_constructor(name)(data)


def __hash_new(name, data=b''):
    """new(name, data=b'') - Return a new hashing object using the named algorithm;
    optionally initialized with data (which must be bytes).
    """
    try:
        return _hashlib.new(name, data)
    except ValueError:
        # If the _hashlib module (OpenSSL) doesn't support the named
        # hash, try using our builtin implementations.
        # This allows for SHA224/256 and SHA384/512 support even though
        # the OpenSSL library prior to 0.9.8 doesn't provide them.
        return __get_builtin_constructor(name)(data)


try:
    import _hashlib
    new = __hash_new
    __get_hash = __get_openssl_constructor
    algorithms_available = algorithms_available.union(
            _hashlib.openssl_md_meth_names)
except ImportError:
    new = __py_new
    __get_hash = __get_builtin_constructor

try:
    # OpenSSL's PKCS5_PBKDF2_HMAC requires OpenSSL 1.0+ with HMAC and SHA
    from _hashlib import pbkdf2_hmac
except ImportError:
    _trans_5C = bytes((x ^ 0x5C) for x in range(256))
    _trans_36 = bytes((x ^ 0x36) for x in range(256))

    def pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None):
        """Password based key derivation function 2 (PKCS #5 v2.0)

        This Python implementations based on the hmac module about as fast
        as OpenSSL's PKCS5_PBKDF2_HMAC for short passwords and much faster
        for long passwords.
        """
        if not isinstance(hash_name, str):
            raise TypeError(hash_name)

        if not isinstance(password, (bytes, bytearray)):
            password = bytes(memoryview(password))
        if not isinstance(salt, (bytes, bytearray)):
            salt = bytes(memoryview(salt))

        # Fast inline HMAC implementation
        inner = new(hash_name)
        outer = new(hash_name)
        blocksize = getattr(inner, 'block_size', 64)
        if len(password) > blocksize:
            password = new(hash_name, password).digest()
        password = password + b'\x00' * (blocksize - len(password))
        inner.update(password.translate(_trans_36))
        outer.update(password.translate(_trans_5C))

        def prf(msg, inner=inner, outer=outer):
            # PBKDF2_HMAC uses the password as key. We can re-use the same
            # digest objects and just update copies to skip initialization.
            icpy = inner.copy()
            ocpy = outer.copy()
            icpy.update(msg)
            ocpy.update(icpy.digest())
            return ocpy.digest()

        if iterations < 1:
            raise ValueError(iterations)
        if dklen is None:
            dklen = outer.digest_size
        if dklen < 1:
            raise ValueError(dklen)

        dkey = b''
        loop = 1
        from_bytes = int.from_bytes
        while len(dkey) < dklen:
            prev = prf(salt + loop.to_bytes(4, 'big'))
            # endianess doesn't matter here as long to / from use the same
            rkey = int.from_bytes(prev, 'big')
            for i in range(iterations - 1):
                prev = prf(prev)
                # rkey = rkey ^ prev
                rkey ^= from_bytes(prev, 'big')
            loop += 1
            dkey += rkey.to_bytes(inner.digest_size, 'big')

        return dkey[:dklen]


for __func_name in __always_supported:
    # try them all, some may not work due to the OpenSSL
    # version not supporting that algorithm.
    try:
        globals()[__func_name] = __get_hash(__func_name)
    except ValueError:
        import logging
        logging.exception('code for hash %s was not found.', __func_name)

# Cleanup locals()
del __always_supported, __func_name, __get_hash
del __py_new, __hash_new, __get_openssl_constructor
(End of the hashlib.py source.)

Hash objects provide the following methods:

 - update(arg): update the hash object with the bytes in arg; repeated calls are equivalent to a single call with the concatenation of all the arguments.
 - digest(): return the digest of the bytes passed to the update() method so far.
 - hexdigest(): like digest(), except the digest is returned as a string of twice the length, containing only hexadecimal digits.
 - copy(): return a copy (clone) of the hash object; this can be used to efficiently compute the digests of strings that share a common initial substring.

Taking the MD5 digest algorithm as an example:

>>> import hashlib
>>> m = hashlib.md5()
>>> m.update(b'Nobody inspects the spammish repetition')
# digest() returns bytes
>>> m.digest()
b'\xae\xe9\xfc_.\xc6A\xb4\xd6%[\xf1\x1f5S\x05'
# hexdigest() returns a hex string
>>> m.hexdigest()
'aee9fc5f2ec641b4d6255bf11f355305'
>>>

It can also be written more concisely:

>>> import hashlib
>>> hashlib.sha224(b"Nobody inspects the spammish repetition").hexdigest()
'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'
>>>

(2) Applications

For example, hashing can be used to generate a token for each user:

import hashlib

def get_md5(username):
    """
    Generate the token for a given user.
    :param username: the user name
    :return: the MD5 hex digest
    """
    m = hashlib.md5()
    m.update(bytes(username, encoding="utf-8"))
    return m.hexdigest()

This does hash the user name with MD5, but if someone steals the database, the token can still be recomputed from the user name, so it is effectively leaked. To mitigate this, add a salt to the MD5:

import time
import hashlib

def get_md5(username):
    """
    Generate the token for a given user.
    :param username: the user name
    :return: the salted MD5 hex digest
    """
    # seed the hash with the user name as a salt
    m = hashlib.md5(bytes(username, encoding="utf-8"))
    ctime = str(time.time())
    m.update(bytes(ctime, encoding="utf-8"))  # mix in the current time so repeated calls differ
    return m.hexdigest()
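
For real password or token storage, MD5 (even salted) is considered weak; hashlib also provides pbkdf2_hmac (visible in the source above), which applies many iterations of an HMAC over an explicit salt. A minimal sketch, with an illustrative salt and iteration count:

import os
import hashlib
import binascii

password = b'abc123'
salt = os.urandom(16)                                       # a random per-user salt, stored alongside the hash
dk = hashlib.pbkdf2_hmac('sha256', password, salt, 100000)  # 100000 iterations of HMAC-SHA256
print(binascii.hexlify(dk).decode())                        # the derived key, stored as the hash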

 

