Python's itertools Module


I. Infinite Iterators

1. itertools.count(start=0, step=1)

Creates an iterator that returns evenly spaced values, beginning with start and advancing by step. It is roughly equivalent to:

def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step

An example:

from itertools import count
import time

for i in count(10):
    time.sleep(2)
    print(i)  # 10, 11, 12, ... (one value every two seconds, never terminates)

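Here count(10) is an itertools.count object. It is often used as an argument to map() or zip().

For example:

# With map(): the result is lazy, so nothing is computed until it is iterated
doubled = map(lambda x: x * 2, count(5))  # 10, 12, 14, ...

# With zip(): iteration stops when the shorter input ('xy') is exhausted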
a = zip(count(10),'xy')

print(list(a))
"""
[(10, 'x'), (11, 'y')]
"""

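Since count() never terminates on its own, whatever consumes it has to impose a bound. The following is a minimal sketch of one way to do that with islice(); the variable names are illustrative and not part of the original example.

```python
from itertools import count, islice

# map() over an infinite counter is lazy, so this line does not loop forever.
doubled = map(lambda x: x * 2, count(5))

# islice() stops after five items, which makes the result safe to materialize.
first_five = list(islice(doubled, 5))
print(first_five)  # [10, 12, 14, 16, 18]
```
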
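2. itertools.cycle(iterable)

Creates an iterator that returns elements from the iterable and saves a copy of each one. When the iterable is exhausted, it returns elements from the saved copy, repeating indefinitely. It is roughly equivalent to: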

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

An example:

from itertools import cycle

print(cycle('ABCDE')) #<itertools.cycle object at 0x0000000000649448>

for item in cycle('ABCDE'):
    print(item)  # A, B, C, D, E, A, B, C, D, E, ... (never terminates)
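
Because the loop above runs forever, practical uses of cycle() usually pair it with something finite. The sketch below shows two common patterns; the task and worker names are made up for illustration.

```python
from itertools import cycle, islice

# Bound the cycle explicitly: take only the first seven items.
first_seven = list(islice(cycle('ABC'), 7))
print(first_seven)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']

# Round-robin assignment: zip() stops when the finite list of tasks runs out.
tasks = ['t1', 't2', 't3', 't4', 't5']    # hypothetical task names
workers = ['a', 'b', 'c']                 # hypothetical worker names
assignment = list(zip(tasks, cycle(workers)))
print(assignment)
# [('t1', 'a'), ('t2', 'b'), ('t3', 'c'), ('t4', 'a'), ('t5', 'b')]
```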

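3. itertools.repeat(object[, times])

Creates an iterator that returns object over and over again. It runs indefinitely unless the times argument is specified. It is roughly equivalent to: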

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

It can be used with map() and zip(), where it supplies a constant stream of values:

In [1]: from itertools import repeat

In [2]: list(map(pow, range(10), repeat(2)))
Out[2]: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [3]: list(zip(range(5),repeat(10)))
Out[3]: [(0, 10), (1, 10), (2, 10), (3, 10), (4, 10)]


II. Iterators Terminating on the Shortest Input Sequence

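1. itertools.accumulate(iterable[, func])

Creates an iterator that returns accumulated sums, or the accumulated results of another binary function specified by func. If func is supplied, it should be a function of two arguments, and the elements of the input iterable must be of a type that func accepts. If the input iterable is empty, the output iterable is also empty. It is roughly equivalent to:

import operator

def accumulate(iterable, func=operator.add):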
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return
    yield total
    for element in it:
        total = func(total, element)
        yield total

An example:

from itertools import accumulate

print(accumulate([1,2,3])) #<itertools.accumulate object at 0x00000000006E9448>
print(list(accumulate([1,2,3]))) #[1, 3, 6]
print(list(accumulate([1,2,3],lambda x,y:x*y))) #[1, 2, 6]
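
Any two-argument function works as func, not only arithmetic operators. As a small illustrative sketch (the sample data is made up), passing the built-in max produces a running maximum:

```python
from itertools import accumulate

readings = [3, 1, 4, 1, 5, 9, 2]  # hypothetical sample data

# Each output item is max(previous output, current input).
running_max = list(accumulate(readings, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9]
```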

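2. itertools.chain(*iterables)

Creates an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. It is used for treating consecutive sequences as a single sequence. It is roughly equivalent to: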

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

An example:

from itertools import chain

print(list(chain([1,2,3],[5,6,7]))) #[1, 2, 3, 5, 6, 7]

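3. classmethod chain.from_iterable(iterable)

An alternate constructor for chain(). It gets the chained inputs from a single iterable argument, which is evaluated lazily. It is roughly equivalent to: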

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

An example:

from itertools import chain

print(list(chain.from_iterable([[1,2,3],[5,6,7]]))) #[1, 2, 3, 5, 6, 7]

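4. itertools.compress(data, selectors)

Creates an iterator that filters elements from data, returning only those whose corresponding element in selectors evaluates to True. It stops when either data or selectors is exhausted. It is roughly equivalent to: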

def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in zip(data, selectors) if s)

An example:

from itertools import compress

data = [1, 2, 3, 4]
selectors = [1, 0, 1, 0]
filter_data = compress(data, selectors)
print(filter_data)  # <itertools.compress object at 0x00000000009E5B00>
print(list(filter_data))  # [1, 3]

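5. itertools.dropwhile(predicate, iterable)

Creates an iterator that drops elements from the iterable as long as the predicate is true, and afterwards returns every remaining element. Note that the iterator produces no output until the predicate first becomes false, so it may have a lengthy start-up time. Once that happens, the predicate is no longer consulted. It is roughly equivalent to: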

def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x

An example:

from itertools import dropwhile

data = [1, 2, 3, 4, 5]
result = dropwhile(lambda x: x < 3, data)
print(result)  # <itertools.dropwhile object at 0x0000000000D5BD48>
print(list(result))  # [3, 4, 5]

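6. itertools.filterfalse(predicate, iterable)

Creates an iterator that filters elements from the iterable, returning only those for which the predicate is false. If predicate is None, it returns the elements that are themselves false. It is roughly equivalent to: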

def filterfalse(predicate, iterable):
    # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x

An example:

from itertools import filterfalse

data = [1, 2, 3, 4, 5]
result = filterfalse(lambda x: x % 2, data)
print(result)  # <itertools.filterfalse object at 0x0000000000675E10>
print(list(result))  # [2, 4]

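7. itertools.groupby(iterable, key=None)

Creates an iterator that returns consecutive keys and groups from the iterable. key is a function that computes a key value for each element; if it is omitted or None, it defaults to an identity function and returns the element unchanged. It is roughly equivalent to: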

class groupby:
    # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
    # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
    def __init__(self, iterable, key=None):
        if key is None:
            key = lambda x: x
        self.keyfunc = key
        self.it = iter(iterable)
        self.tgtkey = self.currkey = self.currvalue = object()
    def __iter__(self):
        return self
    def __next__(self):
        while self.currkey == self.tgtkey:
            self.currvalue = next(self.it)    # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
        self.tgtkey = self.currkey
        return (self.currkey, self._grouper(self.tgtkey))
    def _grouper(self, tgtkey):
        while self.currkey == tgtkey:
            yield self.currvalue
            try:
                self.currvalue = next(self.it)
            except StopIteration:
                return
            self.currkey = self.keyfunc(self.currvalue)
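
As the equivalent code shows, groupby() starts a new group every time the key value changes, so it only merges consecutive runs. The sketch below illustrates the usual consequence: the input generally needs to be sorted with the same key function first. The staff records and the by_dept helper are made up for illustration.

```python
from itertools import groupby

# Without sorting, the same key can appear in more than one group.
runs = [key for key, _ in groupby('AABA')]
print(runs)  # ['A', 'B', 'A']

# Hypothetical sample data: (name, department) pairs.
staff = [('Ann', 'ops'), ('Bob', 'dev'), ('Cid', 'ops'), ('Dee', 'dev')]

def by_dept(record):
    """Key function (illustrative): group records by department."""
    return record[1]

# Sort by the same key so that equal keys become adjacent.
grouped = {
    dept: [name for name, _ in members]
    for dept, members in groupby(sorted(staff, key=by_dept), key=by_dept)
}
print(grouped)  # {'dev': ['Bob', 'Dee'], 'ops': ['Ann', 'Cid']}
```

Each group is itself an iterator that shares the underlying iterable with groupby(), so it should be consumed (as the list comprehension does here) before advancing to the next group.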

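8. itertools.islice(iterable, start, stop[, step])

Creates an iterator that returns selected elements from the iterable. If start is non-zero, elements from the iterable are skipped until start is reached. If stop is None, iteration continues until the iterable is exhausted. Unlike regular slicing, islice() does not support negative values for start, stop, or step. It is roughly equivalent to: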

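import sys

def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    # Only a missing stop means "unbounded"; stop=0 must yield nothing.
    stop = sys.maxsize if s.stop is None else s.stop
    it = iter(range(s.start or 0, stop, s.step or 1))
    try:
        nexti = next(it)
    except StopIteration:
        return
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            try:
                nexti = next(it)
            except StopIteration:
                # Since PEP 479, a StopIteration escaping a generator
                # becomes RuntimeError, so finish explicitly instead.
                return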

In particular, if start is None, iteration starts at zero, and if step is None, the step defaults to one.

An example:

from itertools import islice

data = [1, 2, 3, 4, 5, 6]
result = islice(data, 1, 5)
print(result)  # <itertools.islice object at 0x000000000A426EF8>
print(list(result))  # [2, 3, 4, 5]

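9. itertools.starmap(function, iterable)

Creates an iterator that computes the function using arguments obtained from the iterable. It is used instead of map() when the arguments are already grouped into tuples in a single iterable. The difference between map() and starmap() parallels the difference between function(a,b) and function(*c). It is roughly equivalent to: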

def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
    for args in iterable:
        yield function(*args)
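
To make the contrast with map() concrete, here is a small sketch that computes the same powers both ways. The variable names are illustrative.

```python
from itertools import starmap

# starmap(): the arguments arrive pre-grouped as tuples in one iterable.
pairs = [(2, 5), (3, 2), (10, 3)]
powers = list(starmap(pow, pairs))
print(powers)  # [32, 9, 1000]

# map(): the same arguments supplied as two parallel iterables.
same_powers = list(map(pow, [2, 3, 10], [5, 2, 3]))
print(same_powers)  # [32, 9, 1000]
```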

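10. itertools.takewhile(predicate, iterable)

Creates an iterator that returns elements from the iterable as long as the predicate is true, and stops at the first element for which it is false. It is roughly equivalent to: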

def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
    for x in iterable:
        if predicate(x):
            yield x
        else:
            break
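
takewhile() is the counterpart of dropwhile(): applied to the same data with the same predicate, one returns the leading run and the other returns everything after it. A minimal sketch, using the data from the comment above:

```python
from itertools import dropwhile, takewhile

data = [1, 4, 6, 4, 1]

def is_small(x):
    """Predicate (illustrative): true for values below 5."""
    return x < 5

head = list(takewhile(is_small, data))  # stops at the first failure (6)
tail = list(dropwhile(is_small, data))  # starts at the first failure (6)
print(head)  # [1, 4]
print(tail)  # [6, 4, 1]
```

Note that the trailing 4 and 1 also satisfy the predicate but are not part of head, because takewhile() stops for good at the first element that fails.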

11. itertools.tee(iterable, n=2)

Returns n independent iterators from a single iterable. It is roughly equivalent to:

import collections

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                try:
                    newval = next(it)   # fetch a new value and
                except StopIteration:
                    return
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)

An example:

from itertools import tee

result = tee([1,2,3],2)
print(result) #(<itertools._tee object at 0x0000000000669448>, <itertools._tee object at 0x00000000006CBD08>)
for item in result:
    print(list(item)) #[1, 2, 3], [1, 2, 3]

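12. itertools.zip_longest(*iterables, fillvalue=None)

Creates an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted. It is roughly equivalent to:

from itertools import chain, repeat

class ZipExhausted(Exception):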
    pass

def zip_longest(*args, **kwds):
    # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    counter = len(args) - 1
    def sentinel():
        nonlocal counter
        if not counter:
            raise ZipExhausted
        counter -= 1
        yield fillvalue
    fillers = repeat(fillvalue)
    iterators = [chain(it, sentinel(), fillers) for it in args]
    try:
        while iterators:
            yield tuple(map(next, iterators))
    except ZipExhausted:
        pass
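
A short sketch contrasting zip_longest() with the built-in zip(), using the inputs from the comment above:

```python
from itertools import zip_longest

# zip_longest() pads the shorter input with fillvalue.
padded = list(zip_longest('ABCD', 'xy', fillvalue='-'))
print(padded)  # [('A', 'x'), ('B', 'y'), ('C', '-'), ('D', '-')]

# The built-in zip() instead stops at the shortest input.
truncated = list(zip('ABCD', 'xy'))
print(truncated)  # [('A', 'x'), ('B', 'y')]
```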

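III. Combinatoric Iterators

1. itertools.product(*iterables, repeat=1)

Returns the Cartesian product of the input iterables. It is roughly equivalent to nested for-loops in a generator expression; for example, product(A, B) returns the same as: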

((x,y) for x in A for y in B)

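It is roughly equivalent to the following, except that the actual implementation does not build up intermediate results in memory: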

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
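
A brief usage sketch (variable names are illustrative) showing both forms from the comments above: a product of two different iterables, and a product of one iterable with itself via repeat.

```python
from itertools import product

# Every pairing of an element from 'AB' with an element from 'xy'.
pairs = list(product('AB', 'xy'))
print(pairs)  # [('A', 'x'), ('A', 'y'), ('B', 'x'), ('B', 'y')]

# repeat=3 is shorthand for product(range(2), range(2), range(2)).
bits = [''.join(map(str, p)) for p in product(range(2), repeat=3)]
print(bits)  # ['000', '001', '010', '011', '100', '101', '110', '111']
```

The rightmost element advances fastest, like the innermost loop of nested for-loops.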

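2. itertools.permutations(iterable, r=None)

Returns successive r-length permutations of the elements in the iterable. If r is omitted or None, it defaults to the length of the iterable, so all full-length permutations are generated. It is roughly equivalent to: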

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = list(range(n))
    cycles = list(range(n, n-r, -1))
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
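
permutations() can also be expressed in terms of product(), filtered to exclude entries that contain repeated elements:

from itertools import product
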
def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)
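
A brief usage sketch (variable names are illustrative). Elements are treated as unique by position rather than by value, and the number of results for n elements taken r at a time is n! / (n - r)!.

```python
from itertools import permutations

# All ordered arrangements of two distinct letters out of three.
perms = [''.join(p) for p in permutations('ABC', 2)]
print(perms)  # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']

# With r omitted, every full-length ordering is produced: 4! = 24.
full_count = len(list(permutations(range(4))))
print(full_count)  # 24
```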

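3. itertools.combinations(iterable, r)

Returns r-length subsequences of elements from the input iterable. The combination tuples are emitted in lexicographic order according to the order of the input iterable, so if the input iterable is sorted, the output tuples are produced in sorted order. It is roughly equivalent to: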

def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
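
A brief usage sketch (variable names are illustrative). Unlike permutations(), order within a tuple does not matter, so 'AB' appears but 'BA' does not.

```python
from itertools import combinations

# All ways to pick two letters out of four, ignoring order.
combos = [''.join(c) for c in combinations('ABCD', 2)]
print(combos)  # ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']

# If r exceeds the number of elements, nothing is produced.
too_many = list(combinations('AB', 3))
print(too_many)  # []
```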

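4. itertools.combinations_with_replacement(iterable, r)

Returns r-length subsequences of elements from the input iterable, allowing individual elements to be repeated more than once. The combination tuples are emitted in lexicographic order according to the order of the input iterable, so if the input iterable is sorted, the output tuples are produced in sorted order. It is roughly equivalent to: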

def combinations_with_replacement(iterable, r):
    # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)
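
A brief usage sketch (variable names are illustrative), compared against plain combinations() on the same input:

```python
from itertools import combinations, combinations_with_replacement

# Repeats such as 'AA' are allowed here.
with_repeats = [''.join(c) for c in combinations_with_replacement('ABC', 2)]
print(with_repeats)  # ['AA', 'AB', 'AC', 'BB', 'BC', 'CC']

# Plain combinations() never reuses an element.
without_repeats = [''.join(c) for c in combinations('ABC', 2)]
print(without_repeats)  # ['AB', 'AC', 'BC']
```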

 

