SQLAlchemy Study Notes (3): Building Relationships in the ORM

Reposted article. 2019-05-21 22:19. Tags: Web development / orm / database / sqlalchemy / Python / sql

Personal notes; correctness is not guaranteed.

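Building relationships: `ForeignKey` and `relationship`

The key to building relationships is understanding how these two functions are used. ForeignKey was already covered in "SQL Expression Language - Constraints in Table Definitions"; what matters most there is how its ondelete and onupdate parameters are used.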

`relationship`

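The relationship function is used in the ORM to build associations between tables. Unlike ForeignKey, the relationship it defines is not part of the table definition: it creates no column or constraint, and the related objects are computed dynamically by queries the ORM issues.
An attribute defined with it is loosely comparable to a view in SQL.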

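This function is a little hard to use, first because a few of its parameters are not easy to understand, and second because it has so many parameters that it can be intimidating. The sections below walk through relationship in one-to-many, many-to-one, and many-to-many scenarios to get familiar with it step by step.

First, the setup: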

from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

One-to-many

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)

    # Child already holds a ForeignKey to Parent, so nothing extra needs to be specified here.
    children = relationship("Child")  # the collection of children; loosely comparable to a view

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

A Parent can have many children. With relationship, we can get them directly through parent.children and skip writing the query by hand.
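To see parent.children in action, here is a minimal sketch. It assumes SQLAlchemy 1.4 or later (for `sqlalchemy.orm.declarative_base` and the `Session` context manager) and an in-memory SQLite database; both are choices made for this illustration, not part of the original notes.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")  # list-like collection of Child objects


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)  # the default save-update cascade adds the children too
    session.commit()

    loaded = session.query(Parent).one()
    child_count = len(loaded.children)  # loaded lazily on first access
    parent_ids = {c.parent_id for c in loaded.children}
    parent_id = loaded.id
```

The foreign key values on the children are filled in by the ORM at flush time; the code never sets parent_id by hand.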

Back references

1. `backref` and `back_populates`

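What if we need the parent object of a child? Can we simply access child.parent?

To support this, SQLAlchemy provides two parameters: backref and back_populates.

Both produce the same bidirectional relationship. The difference is that with backref you only declare children on the Parent class, and Child.parent is created dynamically.

With back_populates, both classes must declare their relationship explicitly, which is more verbose (but arguably also clearer).

First, the backref version: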

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                            backref="parent")  # backref dynamically creates a parent attribute on Child that points back to this class

# The Child class needs no changes

Now the back_populates version:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")  # names the matching attribute on Child

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # This side must be declared as well; it cannot be omitted!
    parent = relationship("Parent", back_populates="children")  # parent is a scalar attribute, not a collection

NOTE: the two relationship declarations need no further hints. SQLAlchemy works out from the foreign key that parent.children is a collection and child.parent is a scalar attribute.
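The practical effect of a bidirectional relationship is that the two attributes stay in sync in memory, even before anything is written to a database. The following sketch assumes SQLAlchemy 1.4 or later and uses the back_populates form; no engine or session is involved.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


first = Parent()
child = Child()

# Changing the collection side updates the scalar side.
first.children.append(child)
synced_from_collection = child.parent is first

# Changing the scalar side moves the child between collections.
second = Parent()
child.parent = second
moved = child in second.children and child not in first.children
```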

2. Arguments for the back reference: `sqlalchemy.orm.backref(name, **kwargs)`

With back_populates, it is easy to pass whatever arguments we need to each of the two relationship calls:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent",
                            lazy='dynamic')  # set lazy on the collection side

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children",
                          lazy='joined')  # set lazy on the scalar side; 'dynamic' only applies to collections and is rejected here

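But with backref there is only one relationship call, and Child.parent is created implicitly. How do we pass arguments for that attribute?

The answer is the backref() function, used in place of the plain string value of the backref parameter: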

from sqlalchemy.orm import backref

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                            backref=backref("parent", lazy='joined'))  # backref() sets arguments for the generated Child.parent ('dynamic' would be rejected: it only applies to collections)

# The Child class needs no changes

The arguments of backref() are passed on to relationship(), so the two accept the same arguments.
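As a rough illustration of per-side loader options, the sketch below puts lazy="dynamic" on the collection side and passes lazy="joined" to the generated Child.parent through backref(). It assumes SQLAlchemy 1.4 or later and in-memory SQLite; the name column is hypothetical and exists only so there is something to filter on. Note that recent SQLAlchemy versions document lazy="dynamic" as a legacy feature.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child",
        lazy="dynamic",  # parent.children is a query-like object, not a list
        backref=backref("parent", lazy="joined"),  # options for Child.parent
    )


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String(32))  # hypothetical column, only used for filtering
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent()
    session.add(parent)
    parent.children.append(Child(name="a"))
    parent.children.append(Child(name="b"))
    session.commit()

    total = parent.children.count()  # runs SELECT count(*) in the database
    named_a = parent.children.filter_by(name="a").count()
    child = session.query(Child).filter_by(name="a").one()
    points_back = child.parent is parent
```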

Many-to-one

A many-to-one is similar to a one-to-many relationship. The difference is that this relationship is looked at from the "many" side.
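A minimal many-to-one sketch, following the class naming used elsewhere in these notes: here the foreign key sits on Parent, so many parents can refer to one child. It assumes SQLAlchemy 1.4 or later and in-memory SQLite.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))  # FK on the "many" side
    child = relationship("Child")  # scalar attribute: one Child per Parent


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    shared = Child()
    session.add_all([Parent(child=shared), Parent(child=shared)])
    session.commit()

    parents = session.query(Parent).all()
    same_child = parents[0].child is parents[1].child
    fk_values = {p.child_id for p in parents}
    shared_id = shared.id
```

Because the class holding the foreign key is the one declaring the relationship, SQLAlchemy treats Parent.child as a scalar without any extra arguments.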

One-to-one

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child = relationship("Child",
                         uselist=False,  # do not use a collection: this is the key
                         back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # On the class that holds the ForeignKey, this is a scalar attribute by default, so uselist=False is not needed
    parent = relationship("Parent", back_populates="child")

Many-to-many

# Many-to-many requires an association table!
association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),  # by convention, the left side is the parent
    Column('right_id', Integer, ForeignKey('right.id'))  # and the right side is the child
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table)  # the dedicated secondary parameter names the association table to use

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

To add a back reference, backref or back_populates can be used here as well.
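For example, a bidirectional many-to-many might be sketched as follows, with both sides naming the same association table. It assumes SQLAlchemy 1.4 or later and in-memory SQLite; the composite primary key on the association table is an addition for this sketch, so that the same pair cannot be linked twice.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

association_table = Table(
    "association",
    Base.metadata,
    Column("left_id", Integer, ForeignKey("left.id"), primary_key=True),
    Column("right_id", Integer, ForeignKey("right.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child", secondary=association_table, back_populates="parents"
    )


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    parents = relationship(
        "Parent", secondary=association_table, back_populates="children"
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p1, p2 = Parent(), Parent()
    c1, c2 = Child(), Child()
    p1.children = [c1, c2]
    p2.children.append(c1)
    session.add_all([p1, p2])
    session.commit()

    c1_parent_count = len(c1.parents)
    c2_parent_count = len(c2.parents)
    link_rows = session.query(association_table).count()  # one row per link
```

The rows of the association table are written by the ORM; application code only touches the two collections.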

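user2user (self-referential many-to-many)

What if both sides of a many-to-many relationship are users, that is, the same table? How should it be declared?

Take "following" and "followers" as an example: you are a user, your followers are users, and the accounts you follow are users too.

Here both keys of the association table point at user, so SQLAlchemy cannot tell which foreign key belongs to which side. It has to be specified by hand, using the primaryjoin and secondaryjoin parameters.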

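# This snippet assumes Flask-SQLAlchemy (db is its SQLAlchemy() instance) and UserMixin from Flask-Login.
# Association table: the user on the left follows the user on the right
followers = db.Table('followers',
    db.Column('follower_id', db.Integer, db.ForeignKey('user.id')),  # left side: the user who follows
    db.Column('followed_id', db.Integer, db.ForeignKey('user.id'))  # right side: the user being followed
)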

class User(UserMixin, db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), index=True, unique=True, nullable=False)
    email = db.Column(db.String(120), index=True, unique=True, nullable=False)
    password_hash = db.Column(db.String(128), nullable=False)

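    # users that I follow
    followed = db.relationship(
        'User',
        secondary=followers,  # the many-to-many association table
        primaryjoin=(followers.c.follower_id == id),  # left side: links this user (as the follower) to the association table
        secondaryjoin=(followers.c.followed_id == id),  # right side: links the association table to the users being followed
        lazy='dynamic',  # return a query instead of a list, so filter_by and similar methods can be used
        backref=db.backref('followers', lazy='dynamic'))  # the reverse side (my followers) is dynamic too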

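The confusing part here is that primaryjoin and secondaryjoin are easy to mix up.

primaryjoin: (in many-to-many) the join condition between the parent table and the association table. In the example, the parent is the user whose followed attribute is being read, matched on follower_id. By default it is derived from the foreign keys alone.
secondaryjoin: (in many-to-many) the join condition between the association table and the child table. In the example, the children are the followed users, matched on followed_id. Likewise, by default it is derived from the foreign keys alone.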
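To check which join condition does what, here is a sketch of the same follower model in plain SQLAlchemy (no Flask), using ordinary list collections instead of lazy='dynamic'. It assumes SQLAlchemy 1.4 or later and in-memory SQLite; the name column and the user names are made up for the illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

followers = Table(
    "followers",
    Base.metadata,
    Column("follower_id", Integer, ForeignKey("user.id"), primary_key=True),
    Column("followed_id", Integer, ForeignKey("user.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))  # hypothetical column for readable output

    followed = relationship(
        "User",
        secondary=followers,
        primaryjoin=(followers.c.follower_id == id),  # this user -> association
        secondaryjoin=(followers.c.followed_id == id),  # association -> target
        backref="followers",  # reverse direction: users who follow this user
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice, bob, carol = User(name="alice"), User(name="bob"), User(name="carol")
    alice.followed.append(bob)
    carol.followed.append(bob)
    session.add_all([alice, bob, carol])
    session.commit()

    bob_followers = sorted(u.name for u in bob.followers)
    alice_followed = [u.name for u in alice.followed]
    bob_followed = [u.name for u in bob.followed]
```

Swapping the two conditions would swap the meaning of followed and followers, which is a quick way to convince yourself which side is which.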

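ORM-level “delete” cascade vs. FOREIGN KEY-level “ON DELETE” cascade

Cascading in Table definitions was covered earlier: ON DELETE and ON UPDATE can be set to CASCADE through the ForeignKey parameters.

But relationship also has a cascade parameter that controls what the ORM does with related objects, and a passive_deletes parameter (True or "all") that controls how much of the delete handling is left to the database.

With this many cascade-related settings, I was thoroughly confused. How do they actually differ?

The ON DELETE and ON UPDATE of a foreign key constraint do overlap a lot with ORM-level cascades in what they can do.
But there are also many differences: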

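A database-level ON DELETE cascade is configured on the many-to-one side: the foreign key is defined on the "many" side, and the ON DELETE clause goes there too. At the ORM level it is the other way round. SQLAlchemy handles deletion of the "many" side objects from the "one" side, which means ORM-level cascade is configured on the one-to-many side of the relationship.
At the database level, a foreign key with no ON DELETE setting is often used to prevent a parent row from being deleted, which would leave the child rows behind as unreachable garbage. To get this behavior on a one-to-many mapping, SQLAlchemy's default behavior of setting the foreign key to NULL can be caught in one of two ways:
1. The simplest and most common way is to define the foreign key column as NOT NULL. An attempt to set the column to NULL then raises a NOT NULL constraint exception.
2. The other, more special-case way is to set the passive_deletes flag to the string "all". This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, and the parent row is DELETEd without any effect on the child rows. This is what lets the database-level ON DELETE setting, or other triggers, fire.
Database-level ON DELETE cascade is generally more efficient than ORM-level cascade. The database can chain a series of cascade operations across many relationships at once.
SQLAlchemy does not need to be this sophisticated, because combining the passive_deletes option with properly configured foreign key constraints gives smooth integration with the database's own ON DELETE functionality.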

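Approach 1: cascade at the ORM level

The cascade parameter of relationship determines which operations on a parent object are propagated to its related child objects. The options are (a string, with options separated by commas):

save-update: one of the defaults. When an object is added to the session (which later becomes an SQL INSERT or UPDATE), all objects related to it are added too.
merge: one of the defaults. When an object is merged (similar to a dict update: existing state is replaced, missing state is added), all objects related to it are merged too.
expunge: when an object is expunged, its related objects are expunged as well. This only removes them from the session; nothing is actually deleted from the database.
delete: when a parent object is deleted, the objects related to it are deleted as well.
delete-orphan: when a child object is de-associated from its parent (for example, removed from parent.children), the child, now an orphan, is deleted.
refresh-expire: rarely used.
all: selects every option except delete-orphan. (This is why "all, delete-orphan" is so common: that is the real "all".)

The default is "save-update, merge".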

This is only a brief summary. For the detailed documentation of the options above, see SQLAlchemy - Cascades.
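The difference between delete and delete-orphan is easier to see in a small experiment. The sketch below assumes SQLAlchemy 1.4 or later and in-memory SQLite, and uses cascade="all, delete-orphan" on a one-to-many relationship.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all, delete-orphan")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)
    session.commit()

    # delete-orphan: a child removed from the collection is deleted,
    # instead of being kept with parent_id set to NULL.
    parent.children.pop()
    session.commit()
    after_orphan_removed = session.query(Child).count()

    # delete: deleting the parent also deletes its remaining children.
    session.delete(parent)
    session.commit()
    after_parent_deleted = session.query(Child).count()
```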

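Approach 2: cascade at the database level

Set the ondelete and onupdate parameters of ForeignKey to CASCADE to get cascading at the database level.
Add the keyword argument passive_deletes="all" to relationship. This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, so the ORM DELETEs the parent row and does not touch the child rows itself. (passive_deletes=True is the milder setting: the ORM only skips loading the children that are not already in the session and leaves those to the database.)

When the DELETE runs, the database's ON DELETE constraint fires and cascades the delete to the child rows.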
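A sketch of this setup, under some assumptions: SQLAlchemy 1.4 or later, and in-memory SQLite, which only enforces foreign keys when PRAGMA foreign_keys=ON is issued on each connection (hence the connect listener; other databases do not need it). The relationship carries no delete cascade, since passive_deletes="all" cannot be combined with one.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # "all": on delete, the ORM neither loads nor NULLs out the child rows
    children = relationship("Child", passive_deletes="all")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(
        Integer, ForeignKey("parent.id", ondelete="CASCADE"), nullable=False
    )


engine = create_engine("sqlite://")


@event.listens_for(engine, "connect")
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite-specific: foreign keys (and ON DELETE) are off by default
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Parent(children=[Child(), Child()]))
    session.commit()
    children_before = session.query(Child).count()

    parent = session.query(Parent).one()
    session.delete(parent)  # the ORM emits a DELETE for the parent row only
    session.commit()
    children_after = session.query(Child).count()  # removed by the database
```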
