Django Study Notes: Django ORM Aggregation in Detail

Reposted article. Posted 2018-07-06 22:57. Tags: Aggregation / Django / Django Study Notes / ORM

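Today's applications are continually reshaped by changing requirements. They usually need to sort items not only by conventional fields, such as alphabetical order or creation date, but also by some dynamically computed data. Django aggregation meets this need.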

The examples use the following models:

from django.db import models
 
class Author(models.Model):
    name = models.CharField(max_length=100)
    age = models.IntegerField()
 
class Publisher(models.Model):
    name = models.CharField(max_length=300)
    num_awards = models.IntegerField()
 
class Book(models.Model):
    name = models.CharField(max_length=300)
    pages = models.IntegerField()
    price = models.DecimalField(max_digits=10, decimal_places=2)
    rating = models.FloatField()
    authors = models.ManyToManyField(Author)
    # on_delete is a required argument since Django 2.0.
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)
    pubdate = models.DateField()
 
class Store(models.Model):
    name = models.CharField(max_length=300)
    books = models.ManyToManyField(Book)
    registered_users = models.PositiveIntegerField()

Quick overview

# Total number of books.
>>> Book.objects.count()
2452
 
# Total number of books with publisher=BaloneyPress
>>> Book.objects.filter(publisher__name='BaloneyPress').count()
73
 
# Average price across all books.
>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': 34.35}
 
# Maximum price across all books.
>>> from django.db.models import Max
>>> Book.objects.all().aggregate(Max('price'))
{'price__max': Decimal('81.20')}
 
# All the following queries involve traversing the Book<->Publisher
# foreign key relationship backwards.
 
# Annotate each publisher with num_books, the number of books it has published.
>>> from django.db.models import Count
>>> pubs = Publisher.objects.annotate(num_books=Count('book'))
>>> pubs
[<Publisher BaloneyPress>, <Publisher SalamiPress>, ...]
>>> pubs[0].num_books
73
 
# Order by the num_books annotation and keep the top five publishers.
>>> pubs = Publisher.objects.annotate(num_books=Count('book')).order_by('-num_books')[:5]
>>> pubs[0].num_books
1323

Generating aggregates over a QuerySet

Django offers two ways to generate aggregates. The first computes a summary value over an entire QuerySet, for example the average price of all books:

>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': 34.35}

Because all() is redundant here, this can be shortened to:

>>> Book.objects.aggregate(Avg('price'))
{'price__avg': 34.35}

The arguments to aggregate() are one or more aggregate functions:

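Avg

# Returns the mean value.

Count

# class Count(field, distinct=False)

# Returns the number of objects. With distinct=True, only unique instances are counted.

Max

# Returns the maximum value.

Min

# Returns the minimum value.

StdDev

# class StdDev(field, sample=False)
# Returns the standard deviation.

# Takes one optional argument, sample. By default (sample=False) the population standard deviation is returned; with sample=True, the sample standard deviation is returned.

Sum

# Returns the sum of all values.

Variance

# class Variance(field, sample=False)
# Returns the variance.

# Takes one optional argument, sample. By default the population variance is returned; with sample=True, the sample variance is returned.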
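
The difference between the population and sample variants of StdDev and Variance can be checked without a database. The following is a minimal sketch using Python's standard statistics module rather than Django itself; the page counts are made-up values, and the mapping of sample=False / sample=True onto these functions is stated here as the usual statistical meaning of the terms.

```python
import statistics

# Hypothetical page counts for four books (made-up data).
pages = [100.0, 200.0, 300.0, 400.0]

# Population statistics divide by N; this corresponds to sample=False.
population_variance = statistics.pvariance(pages)
population_stddev = statistics.pstdev(pages)

# Sample statistics divide by N - 1; this corresponds to sample=True.
sample_variance = statistics.variance(pages)
sample_stddev = statistics.stdev(pages)

print("population:", population_variance, population_stddev)
print("sample:    ", sample_variance, sample_stddev)
```

For the same data the sample figures are always slightly larger, because the divisor is smaller.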

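aggregate() returns a dictionary of name-value pairs. By default the key is built from the field name and the aggregate function; to choose the key yourself, pass the aggregate as a keyword argument: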

>>> Book.objects.aggregate(average_price=Avg('price'))
{'average_price': 34.35}

To generate more than one aggregate, add another argument to aggregate(). For example, to also get the highest and lowest price of all books:

>>> from django.db.models import Avg, Max, Min
>>> Book.objects.aggregate(Avg('price'), Max('price'), Min('price'))
{'price__avg': 34.35, 'price__max': Decimal('81.20'), 'price__min': Decimal('12.99')}
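
To see why aggregate() yields a single dictionary rather than a QuerySet, it can help to look at the kind of SQL involved. The sketch below uses the standard sqlite3 module with a hypothetical book table and made-up prices; the SQL that Django actually generates will differ in naming and detail.

```python
import sqlite3

# In-memory database with a hypothetical, simplified book table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?)",
    [("A", 10.0), ("B", 20.0), ("C", 30.0)],
)

# Three aggregate functions over the whole table collapse to one row.
row = conn.execute(
    "SELECT AVG(price), MAX(price), MIN(price) FROM book"
).fetchone()

# Name the values the way aggregate() names its default keys.
summary = dict(zip(["price__avg", "price__max", "price__min"], row))
print(summary)
conn.close()
```

However many books the table holds, the result is always exactly one row, which is why aggregate() ends the QuerySet chain.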

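Generating aggregates for each item in a QuerySet

The second way is to generate an independent summary for each object in a QuerySet. Say you are retrieving books and want to know how many authors each one has. Book and Author are linked by a many-to-many relationship, and that relationship can be summarized for each book.

Per-object summaries are generated with annotate():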

# Build an annotated QuerySet
>>> from django.db.models import Count
>>> q = Book.objects.annotate(Count('authors'))
# Inspect the first object
>>> q[0]
<Book: The Definitive Guide to Django>
>>> q[0].authors__count
2
# Inspect the second object
>>> q[1]
<Book: Practical Django Projects>
>>> q[1].authors__count
1

You can also choose the name of the generated attribute:

>>> q = Book.objects.annotate(num_authors=Count('authors'))
>>> q[0].num_authors
2
>>> q[1].num_authors
1

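Unlike aggregate(), annotate() is not a terminal clause. Its output is a QuerySet, which can be modified further with filter(), order_by(), or another annotate().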
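
A per-object count like num_authors corresponds roughly to a join plus GROUP BY in SQL. The following sketch uses the standard sqlite3 module; the table and column names are simplified assumptions, the third book is a made-up row added to show a zero count, and Django's real SQL will look different in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_authors (book_id INTEGER, author_id INTEGER);
    INSERT INTO book VALUES
        (1, 'The Definitive Guide to Django'),
        (2, 'Practical Django Projects'),
        (3, 'Untitled Draft');
    INSERT INTO book_authors VALUES (1, 10), (1, 11), (2, 12);
    """
)

# One output row per book, each carrying its own author count.
rows = conn.execute(
    """
    SELECT book.name, COUNT(book_authors.author_id) AS num_authors
    FROM book
    LEFT OUTER JOIN book_authors ON book_authors.book_id = book.id
    GROUP BY book.id
    ORDER BY book.id
    """
).fetchall()

for name, num_authors in rows:
    print(name, num_authors)
conn.close()
```

The outer join keeps books without any authors in the result, with a count of zero.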

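Joins and aggregates

So far the aggregated fields have belonged to the model being queried. You can also aggregate over a field of a related model, using the same double-underscore notation as in filters, for example: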

>>> from django.db.models import Max, Min
>>> Store.objects.annotate(min_price=Min('books__price'), max_price=Max('books__price'))

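This annotates each Store with the price range of the books it stocks.

Joins can be chained as deep as you need. For example, to find the age of the youngest author of any book available in any store: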

>>> Store.objects.aggregate(youngest_age=Min('books__authors__age'))
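
The two queries above differ in shape as well as in join depth: the annotate() call produces one price range per store, while the aggregate() call produces a single value over all stores. Below is a plain-Python imitation of those semantics, assuming a small made-up data set; it does not touch the ORM.

```python
from decimal import Decimal

# Hypothetical data: store name -> books, each with a price and author ages.
store_books = {
    "Downtown": [
        {"price": Decimal("12.99"), "author_ages": [34, 51]},
        {"price": Decimal("81.20"), "author_ages": [28]},
    ],
    "Airport": [
        {"price": Decimal("34.35"), "author_ages": [45]},
    ],
}

# Like annotate(min_price=..., max_price=...): one result per store.
price_range = {
    store: (min(b["price"] for b in books), max(b["price"] for b in books))
    for store, books in store_books.items()
}

# Like aggregate(youngest_age=Min('books__authors__age')): one value overall,
# reached by walking store -> books -> authors.
youngest_age = min(
    age
    for books in store_books.values()
    for book in books
    for age in book["author_ages"]
)

print(price_range)
print({"youngest_age": youngest_age})
```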

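Following relationships backwards

Aggregates can also follow reverse relationships. Here the lowercased model name book is used to go from Publisher back to its books: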

>>> from django.db.models import Count, Min, Sum, Avg
>>> Publisher.objects.annotate(Count('book'))

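Each publisher in the resulting QuerySet carries an extra attribute named book__count.

To find the publication date of the oldest book across all publishers: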

>>> Publisher.objects.aggregate(oldest_pubdate=Min('book__pubdate'))

To get the total number of pages each author has written:

>>> Author.objects.annotate(total_pages=Sum('book__pages'))

To get the average rating of the books written by all authors:

>>> Author.objects.aggregate(average_rating=Avg('book__rating'))

Aggregations and other QuerySet clauses

filter() and exclude()

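Aggregates can be combined with filter() and exclude(). The filter restricts the objects over which the aggregate is computed: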

>>> from django.db.models import Count, Avg
>>> Book.objects.filter(name__startswith="Django").annotate(num_authors=Count('authors'))
>>> Book.objects.filter(name__startswith="Django").aggregate(Avg('price'))

Annotated values can themselves be filtered on, for example to list books with more than one author:

>>> Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1)

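When writing a complex query that combines annotate() and filter() clauses, pay particular attention to the order in which the clauses are applied to the QuerySet. A different order gives a different meaning: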

>>> Publisher.objects.annotate(num_books=Count('book')).filter(book__rating__gt=3.0)
>>> Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))

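Both queries return a list of publishers that have published at least one good book (rating above 3.0). However, the annotation in the first query holds the total number of all books issued by each such publisher, while the annotation in the second query counts only the good books (rating above 3.0). In the first query the annotation comes before the filter, so the filter has no effect on the annotation. In the second query the filter comes before the annotation, so it restricts the objects that take part in computing the annotation.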
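
The effect of clause order can be traced by hand on a small data set. The sketch below is a plain-Python imitation of the semantics described above, not of the SQL Django generates; the publisher names and ratings are made up.

```python
# Hypothetical data: publisher name -> ratings of the books it published.
publisher_ratings = {
    "BaloneyPress": [4.5, 2.0, 3.5],
    "SalamiPress": [2.5, 1.0],
    "PastramiPress": [5.0],
}


def is_good(rating):
    """A good book has a rating above 3.0."""
    return rating > 3.0


# Both queries keep only publishers with at least one good book.
kept = {
    name: ratings
    for name, ratings in publisher_ratings.items()
    if any(is_good(r) for r in ratings)
}

# annotate() then filter(): num_books counts every book of a kept publisher.
annotate_then_filter = {name: len(ratings) for name, ratings in kept.items()}

# filter() then annotate(): num_books counts only the good books.
filter_then_annotate = {
    name: sum(1 for r in ratings if is_good(r)) for name, ratings in kept.items()
}

print(annotate_then_filter)
print(filter_then_annotate)
```

SalamiPress disappears from both results, while BaloneyPress gets a different num_books depending on the order.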

order_by()

Results can be ordered by an annotated value:

>>> Book.objects.annotate(num_authors=Count('authors')).order_by('num_authors')

values()

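Ordinarily, annotations are generated per object: an annotated QuerySet returns one result for each object in the original QuerySet, each with its annotation attached. When a values() clause restricts the columns that are returned, however, annotations are evaluated differently. Instead of annotating each object in the original QuerySet, the results are first grouped by the unique combinations of the fields named in values(), and the annotation is then computed for each group over all of its members: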

>>> Author.objects.values('name').annotate(average_rating=Avg('book__rating'))

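Written this way, the QuerySet is grouped by name, and you get one aggregate value per unique name. If two authors share a name, they are treated as one and their books are pooled together when the average is computed.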

>>> Author.objects.annotate(average_rating=Avg('book__rating')).values('name', 'average_rating')

With the clauses swapped, an average_rating is generated for every author, and the output contains only each author's name and average_rating.

Aggregation and default ordering:
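
The difference between the two clause orders shows up as soon as two authors share a name. Here is a plain-Python imitation of the two groupings, assuming made-up authors and ratings; it only mirrors the semantics and does not use the ORM.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical authors: (author id, name, ratings of that author's books).
authors = [
    (1, "John Smith", [4.0, 5.0]),
    (2, "John Smith", [3.0]),
    (3, "Jane Doe", [2.0, 4.0]),
]

# Like values('name').annotate(...): group by name first, then average.
ratings_by_name = defaultdict(list)
for _author_id, name, ratings in authors:
    ratings_by_name[name].extend(ratings)
grouped = {name: mean(ratings) for name, ratings in ratings_by_name.items()}

# Like annotate(...).values('name', 'average_rating'): one row per author.
per_author = [
    {"name": name, "average_rating": mean(ratings)}
    for _author_id, name, ratings in authors
]

print(grouped)
print(per_author)
```

The first form merges both John Smiths into one row; the second keeps them apart and merely trims the output columns.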

from django.db import models
 
class Item(models.Model):
    name = models.CharField(max_length=10)
    data = models.IntegerField()
 
    class Meta:
        ordering = ["name"]

If you want to count how many times each distinct data value occurs, you might try:

# Warning: not quite correct
Item.objects.values("data").annotate(Count("id"))

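This code intends to group Item objects by their common data value and then count the id values in each group. It does not quite work, though: the name field from the default ordering also becomes a grouping column, so the query groups by distinct (data, name) pairs, which is not what you wanted. (Django 3.1 and later no longer let Meta.ordering affect such grouped queries, though fields named in an explicit order_by() still do.) To remove the effect of the default ordering, clear it with an empty order_by():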

Item.objects.values("data").annotate(Count("id")).order_by()
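
What goes wrong is easiest to see by counting both ways on a few rows. The sketch below uses collections.Counter on made-up Item rows to imitate grouping by (data, name) versus grouping by data alone; it illustrates the grouping only and is not Django code.

```python
from collections import Counter

# Hypothetical Item rows as (name, data) pairs.
items = [("a", 1), ("b", 1), ("c", 2), ("a", 1)]

# Grouping by (data, name): what happens when the ordering field
# leaks into the grouping columns.
by_data_and_name = Counter((data, name) for name, data in items)

# Grouping by data alone: the intended result, which clearing the
# ordering with order_by() restores.
by_data = Counter(data for _name, data in items)

print(dict(by_data_and_name))
print(dict(by_data))
```

With the extra grouping column, the value 1 is split across two groups instead of being counted three times.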

Aggregating annotations

>>> from django.db.models import Count, Avg
>>> Book.objects.annotate(num_authors=Count('authors')).aggregate(Avg('num_authors'))
{'num_authors__avg': 1.66}
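
The query above works in two stages: annotate() first computes num_authors for each book, then aggregate() averages those per-book values. A plain-Python sketch of the same two stages, using made-up titles and author labels, might look like this:

```python
from statistics import mean

# Hypothetical data: book title -> list of its authors.
book_authors = {
    "Book One": ["Author A", "Author B"],
    "Book Two": ["Author C"],
    "Book Three": ["Author A", "Author C"],
}

# Stage 1, like annotate(num_authors=Count('authors')): one count per book.
num_authors = {title: len(names) for title, names in book_authors.items()}

# Stage 2, like aggregate(Avg('num_authors')): one value over all books.
result = {"num_authors__avg": mean(num_authors.values())}

print(num_authors)
print(result)
```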
