6.5 Lumen
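6.5.1 Lumen Technical Features
Section 6.2.2.2, Lumen Global Dynamic Illumination, already introduced Lumen's features, including indirect lighting, sky light, emissive lighting, soft and hard shadows, and reflections. This section describes its technical characteristics in more detail.
First, it should be made clear that Lumen combines several techniques rather than applying a single one. For example, Lumen by default uses software ray tracing against signed distance fields (SDF), but it can achieve higher quality on supported graphics cards when hardware ray tracing is enabled.
The main techniques involved in Lumen are listed below.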
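6.5.1.1 Surface Cache
Lumen automatically generates a parameterization of nearby scene surfaces, called the Surface Cache, which is used to quickly look up lighting at ray hit points in the scene. Lumen captures the material properties of each mesh from multiple angles. These capture positions are called Cards and are generated offline, per mesh. The console variable r.Lumen.Visualize.CardPlacement 1 shows a visualization of the Lumen Cards: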
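Top: the normal rendered view; bottom: Lumen Card visualization.
Nanite accelerates the mesh captures that keep the Surface Cache in sync with the triangle scene. High-polygon meshes in particular need Nanite to be captured efficiently.
Once the Surface Cache has been populated with material properties, Lumen computes direct and indirect lighting for these surface positions. The updates are amortized over multiple frames, which provides efficient support for many dynamic lights and multi-bounce global illumination.
Only meshes with simple interiors are supported. Walls, floors, and ceilings should each be a separate mesh rather than being combined into one large mesh.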
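6.5.1.2 Screen Tracing
Lumen traces against the screen first (called screen tracing or screen-space tracing) and falls back to a more reliable method if no hit is found or the ray passes behind a surface.
The downside of screen tracing is that it greatly limits the art-direction controls that apply only to indirect lighting, such as the Indirect Lighting Scale and Emissive Boost lighting properties.
Ray tracing in Lumen uses screen traces first and only then the other, more expensive tracing options. If screen traces are disabled for GI and reflections, only the Lumen Scene is seen. Screen traces support any geometry type and help hide mismatches between the Lumen Scene and the triangle scene.
Use r.Lumen.ScreenProbeGather.ScreenTraces 0|1 to turn screen tracing off or on and compare the results in the scene:
Top: with Lumen screen tracing enabled; bottom: with Lumen screen tracing disabled. The difference is most visible in reflections, followed by some of the indirect lighting.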
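6.5.1.3 Lumen Ray Tracing
Lumen supports two ray tracing modes:
1. Software ray tracing. Runs on the widest range of hardware and platforms.
2. Hardware ray tracing. Requires graphics card and operating system support.
- Software ray tracing
By default, Lumen uses software ray tracing that relies on signed distance fields, which means it can run on hardware that supports SM5.
Generate Mesh Distance Fields must be enabled in the project settings; it is enabled by default in UE5.
The renderer merges the mesh distance fields into a Global Distance Field to accelerate tracing. By default, Lumen traces against each mesh's own distance field for the first two meters of a ray for accuracy, and uses the merged Global Distance Field for the rest of the ray. If a project needs precise control over Lumen's software ray tracing, the Software Ray Tracing Mode in the project settings selects the method:
Detail Tracing is the default method. It uses the individual mesh distance fields to achieve high-quality GI (only for the first two meters; the Global Distance Field is used beyond that). Global Tracing uses only the Global Distance Field for fast tracing, at some cost in image quality.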
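The hand-off between the detailed and the global field can be illustrated with a small CPU-side sphere-tracing sketch. This is not Lumen's implementation (which runs in GPU shaders); the types `Vec3` and `FDistanceFn`, the function `TraceRay`, and its parameters are hypothetical names chosen for illustration, and the two distance fields are supplied by the caller.

```cpp
#include <cmath>
#include <functional>
#include <optional>

struct Vec3 { float X, Y, Z; };

inline Vec3 operator+(Vec3 A, Vec3 B) { return { A.X + B.X, A.Y + B.Y, A.Z + B.Z }; }
inline Vec3 operator*(Vec3 A, float S) { return { A.X * S, A.Y * S, A.Z * S }; }

// Hypothetical signed distance function: distance from a point to the nearest surface.
using FDistanceFn = std::function<float(Vec3)>;

// Sphere-traces a ray with a normalized direction. The detailed field is sampled
// for the first DetailRange units of the ray, the coarse global field afterwards.
// Returns the hit distance, or nothing if MaxDistance is reached without a hit.
std::optional<float> TraceRay(Vec3 Origin, Vec3 Dir,
                              const FDistanceFn& DetailSDF, const FDistanceFn& GlobalSDF,
                              float DetailRange, float MaxDistance)
{
    const float HitEpsilon = 0.01f;
    float T = 0.0f;
    for (int Step = 0; Step < 256 && T < MaxDistance; ++Step)
    {
        const Vec3 P = Origin + Dir * T;
        const float Distance = (T < DetailRange) ? DetailSDF(P) : GlobalSDF(P);
        if (Distance < HitEpsilon)
        {
            return T;
        }
        // The distance to the nearest surface is a safe step length.
        T += Distance;
    }
    return std::nullopt;
}

// Example field: a sphere of radius Radius centered at Center.
inline float SphereSDF(Vec3 P, Vec3 Center, float Radius)
{
    const float DX = P.X - Center.X;
    const float DY = P.Y - Center.Y;
    const float DZ = P.Z - Center.Z;
    return std::sqrt(DX * DX + DY * DY + DZ * DZ) - Radius;
}
```

With a `DetailRange` of 200 (two meters in centimeter units), a ray that hits something close to its origin is resolved entirely against the detailed field, while longer rays finish against the coarser field, which is the trade-off the two modes expose.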
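Mesh distance fields are streamed in and out dynamically as the camera moves through the world. They are packed into an atlas, whose statistics can be printed with the console command r.DistanceFields.LogAtlasStats 1:
Because the quality of Lumen's software ray tracing depends heavily on the mesh distance fields, paying attention to their quality can improve Lumen's GI. The figure below shows the menu for displaying the mesh distance fields and the Global Distance Field:
The two figures below visualize the mesh distance fields and the Global Distance Field, respectively:
However, software ray tracing has a number of limitations, chiefly: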
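- Geometry limitations:
  - The Lumen Scene only supports Static Meshes, Instanced Static Meshes, and Hierarchical Instanced Static Meshes.
  - Landscape geometry is not supported, so it provides no indirect bounce lighting. Support is planned for the future.
- Material limitations:
  - World Position Offset (WPO) is not supported.
  - Translucent objects are not supported, and Masked objects are treated as opaque.
  - Distance field data is built from the material properties of the Static Mesh asset, not from component overrides. This means that changing materials at runtime does not affect Lumen's GI.
- Workflow limitations:
  - Software ray tracing requires levels to be built from modular pieces. Walls, floors, and ceilings should be separate meshes. Large meshes (such as mountains) are represented poorly and may cause self-occlusion artifacts.
  - Walls should be at least 10 cm thick to avoid light leaking.
  - Distance field resolution depends on the Static Mesh import settings. If the compression is too high, the distance field data will be of low quality.
  - Distance fields cannot represent very thin objects.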
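That covers Lumen's software ray tracing; hardware ray tracing is described next.
- Hardware ray tracing
Hardware ray tracing supports a wider range of geometry types than software ray tracing; in particular, it supports tracing against skinned meshes. It also scales better to higher image quality: it intersects against the actual triangles and can optionally evaluate lighting at the ray hit point instead of using the lower-quality Surface Cache.
However, hardware ray tracing has a high scene setup cost and currently does not scale to scenes with more than 100,000 instances. Dynamically deforming meshes (such as skinned meshes) also incur a large cost for updating the ray tracing acceleration structures every frame, proportional to the number of skinned triangles.
For Static Meshes that use Nanite, hardware ray tracing can, for rendering efficiency, only operate on the Proxy Mesh generated from Nanite's Proxy Triangle Percent setting in the Static Mesh Editor. These proxy meshes can be visualized by toggling the console command r.Nanite 0|1:
Top: the full-detail triangle mesh; bottom: the corresponding Nanite proxy mesh.
Screen traces are used to hide mismatches between the full-detail triangle mesh rendered by Nanite and the proxy mesh that Lumen traces rays against. In some cases, however, the mismatch is too large to hide. The self-shadowing artifacts in the two figures above are caused by a Proxy Triangle Percent value that is too small.
Lumen enables hardware ray tracing only when the following conditions are met: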
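- Use Hardware Ray Tracing when available and Support Hardware Ray Tracing are both enabled in the project settings.
- The project runs on a supported operating system, RHI, and graphics card. Currently only the following platforms support hardware ray tracing:
  - Windows 10 with DirectX 12.
  - PlayStation 5.
  - Xbox Series S / X.
- The graphics card must be an NVIDIA RTX 2000 series or newer, or an AMD RX 6000 series or newer.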
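6.5.1.4 Other Notes on Lumen
The Lumen Scene operates on the world around the camera rather than on the entire world, which enables large worlds and streaming. Lumen relies on Nanite's LOD and multi-view rasterization for fast scene captures to maintain the Surface Cache, with all operations throttled to prevent hitches. Lumen does not require Nanite to operate, but in scenes that have not enabled Nanite, Lumen's scene capture becomes very slow. This is especially severe if the assets do not have good LODs set up.
Lumen's Surface Cache covers positions up to 200 meters from the camera. Beyond that range, only screen traces are active for global illumination.
In addition, Lumen has other limitations:
- Lumen global illumination cannot be used together with lightmaps. In the future, Lumen reflections should be extended to work with global illumination from lightmaps, which will further improve rendering quality.
- Foliage is not yet well supported, because it relies heavily on downsampled rendering and temporal filtering.
- Lumen's Final Gather can add significant noise around moving objects; this is still under active development.
- Translucent materials do not yet support Lumen reflections.
- Translucent materials do not get high-quality dynamic GI.
The following are debug and visualization views related to Lumen:
Top: the normal view; middle: Lumen Scene visualization; bottom: Lumen GI visualization.
Beyond the visualization options shown above, Lumen has many other visualization console commands: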
r.Lumen.RadianceCache.Visualize
r.Lumen.RadianceCache.VisualizeClipmapIndex
r.Lumen.RadianceCache.VisualizeProbeRadius
r.Lumen.RadianceCache.VisualizeRadiusScale
r.Lumen.ScreenProbeGather.VisualizeTraces
r.Lumen.ScreenProbeGather.VisualizeTracesFreeze
r.Lumen.Visualize.CardInterpolateInfluenceRadius
r.Lumen.Visualize.CardPlacement
r.Lumen.Visualize.CardPlacementDistance
r.Lumen.Visualize.CardPlacementIndex
r.Lumen.Visualize.CardPlacementOrientation
r.Lumen.Visualize.ClipmapIndex
r.Lumen.Visualize.ConeAngle
r.Lumen.Visualize.ConeStepFactor
r.Lumen.Visualize.GridPixelSize
r.Lumen.Visualize.HardwareRayTracing
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial.TileDimension
r.Lumen.Visualize.HardwareRayTracing.LightingMode
r.Lumen.Visualize.HardwareRayTracing.MaxTranslucentSkipCount
r.Lumen.Visualize.MaxMeshSDFTraceDistance
r.Lumen.Visualize.MaxTraceDistance
r.Lumen.Visualize.MinTraceDistance
r.Lumen.Visualize.Stats
r.Lumen.Visualize.TraceMeshSDFs
r.Lumen.Visualize.TraceRadianceCache
r.Lumen.Visualize.VoxelFaceIndex
r.Lumen.Visualize.Voxels
r.Lumen.Visualize.VoxelStepFactor
ShowFlag.LumenGlobalIllumination
ShowFlag.LumenReflections
ShowFlag.VisualizeLumenIndirectDiffuse
ShowFlag.VisualizeLumenScene
There are also many control commands; some of them are listed below:
r.Lumen.DiffuseIndirect.Allow
r.Lumen.DiffuseIndirect.CardInterpolateInfluenceRadius
r.Lumen.DiffuseIndirect.CardTraceEndDistanceFromCamera
r.Lumen.DirectLighting
r.Lumen.DirectLighting.BatchSize
r.Lumen.DirectLighting.CardUpdateFrequencyScale
r.Lumen.HardwareRayTracing
r.Lumen.HardwareRayTracing.PullbackBias
r.Lumen.IrradianceFieldGather
r.Lumen.IrradianceFieldGather.ClipmapDistributionBase
r.Lumen.IrradianceFieldGather.ClipmapWorldExtent
r.Lumen.MaxConeSteps
r.Lumen.MaxTraceDistance
r.Lumen.ProbeHierarchy
r.Lumen.ProbeHierarchy.AdditionalSpecularRayThreshold
r.Lumen.ProbeHierarchy.AntiTileAliasing
r.Lumen.RadianceCache.DownsampleDistanceFromCamera
r.Lumen.RadianceCache.ForceFullUpdate
r.Lumen.RadianceCache.NumFramesToKeepCachedProbes
r.Lumen.Radiosity
r.Lumen.Radiosity.CardUpdateFrequencyScale
r.Lumen.Radiosity.ComputeScatter
r.Lumen.Radiosity.ConeAngleScale
r.Lumen.Reflections.Allow
r.Lumen.Reflections.DownsampleFactor
r.Lumen.Reflections.GGXSamplingBias
r.Lumen.Reflections.HardwareRayTracing
r.Lumen.Reflections.HardwareRayTracing.DeferredMaterial
r.Lumen.Reflections.HierarchicalScreenTraces.UncertainTraceRelativeDepthThreshold
r.Lumen.Reflections.MaxRayIntensity
r.Lumen.Reflections.MaxRoughnessToTrace
r.Lumen.Reflections.RoughnessFadeLength
r.Lumen.Reflections.ScreenSpaceReconstruction
r.Lumen.Reflections.ScreenTraces
r.Lumen.Reflections.Temporal
r.Lumen.Reflections.Temporal.DistanceThreshold
r.Lumen.Reflections.Temporal.HistoryWeight
r.Lumen.Reflections.TraceMeshSDFs
r.Lumen.ScreenProbeGather
r.Lumen.ScreenProbeGather.AdaptiveProbeAllocationFraction
r.Lumen.ScreenProbeGather.AdaptiveProbeMinDownsampleFactor
r.Lumen.ScreenProbeGather.DiffuseIntegralMethod
r.Lumen.ScreenProbeGather.DownsampleFactor
r.Lumen.ScreenProbeGather.FixedJitterIndex
r.Lumen.ScreenProbeGather.FullResolutionJitterWidth
r.Lumen.ScreenProbeGather.GatherNumMips
r.Lumen.ScreenProbeGather.GatherOctahedronResolutionScale
r.Lumen.ScreenProbeGather.HardwareRayTracing
r.Lumen.ScreenProbeGather.ImportanceSample.ProbeRadianceHistory
r.Lumen.ScreenProbeGather.MaxRayIntensity
r.Lumen.ScreenProbeGather.OctahedralSolidAngleTextureSize
r.Lumen.ScreenProbeGather.RadianceCache
r.Lumen.ScreenProbeGather.RadianceCache.ClipmapDistributionBase
r.Lumen.ScreenProbeGather.ReferenceMode
r.Lumen.ScreenProbeGather.ScreenSpaceBentNormal
r.Lumen.ScreenProbeGather.ScreenTraces
r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal
r.Lumen.ScreenProbeGather.SpatialFilterHalfKernelSize
r.Lumen.ScreenProbeGather.SpatialFilterMaxRadianceHitAngle
r.Lumen.ScreenProbeGather.Temporal
r.Lumen.ScreenProbeGather.Temporal.ClearHistoryEveryFrame
r.Lumen.ScreenProbeGather.TraceMeshSDFs
r.Lumen.ScreenProbeGather.TracingOctahedronResolution
r.Lumen.TraceMeshSDFs
r.Lumen.TraceMeshSDFs.Allow
r.Lumen.TranslucencyVolume.ConeAngleScale
r.Lumen.TranslucencyVolume.Enable
r.Lumen.TranslucencyVolume.EndDistanceFromCamera
r.LumenParallelBeginUpdate
r.LumenScene.CardAtlasAllocatorBinSize
r.LumenScene.CardAtlasSize
r.LumenScene.CardCameraDistanceTexelDensityScale
r.LumenScene.CardCaptureMargin
r.LumenScene.ClipmapResolution
r.LumenScene.ClipmapWorldExtent
r.LumenScene.ClipmapZResolutionDivisor
r.LumenScene.DiffuseReflectivityOverride
r.LumenScene.DistantScene
r.LumenScene.DistantScene.CardResolution
r.LumenScene.FastCameraMode
r.LumenScene.GlobalDFClipmapExtent
r.LumenScene.GlobalDFResolution
r.LumenScene.HeightfieldSlopeThreshold
r.LumenScene.MaxInstanceAddsPerFrame
r.LumenScene.MeshCardsCullFaces
r.LumenScene.MeshCardsMaxLOD
r.LumenScene.NaniteMultiViewCapture
r.LumenScene.NumClipmapLevels
r.LumenScene.PrimitivesPerPacket
r.LumenScene.RecaptureEveryFrame
r.LumenScene.Reset
r.LumenScene.UploadCardBufferEveryFrame
r.LumenScene.VoxelLightingAverageObjectsPerVisBufferTile
r.SSGI.AllowStandaloneLumenProbeHierarchy
r.Water.SingleLayer.LumenReflections
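Lumen has more than a hundred related console commands, which shows how complex Lumen rendering is.
6.5.2 Lumen Rendering Basics
This section describes the basic concepts and types related to Lumen.
6.5.2.1 FLumenCard
FLumenCard is the Card mentioned in the previous subsection and is the basic building block of FLumenMeshCards.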
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h
// Lumen card type.
class FLumenCard
{
public:
FLumenCard();
~FLumenCard();
// World-space bounding box.
FBox WorldBounds;
// Rotation (card axes in world space).
FVector LocalToWorldRotationX;
FVector LocalToWorldRotationY;
FVector LocalToWorldRotationZ;
// Position.
FVector Origin;
// Extent in local space.
FVector LocalExtent;
// Whether the card is visible.
bool bVisible = false;
// Whether the card belongs to the distant scene.
bool bDistantScene = false;
// Atlas allocation info.
bool bAllocated = false;
FIntPoint DesiredResolution;
FIntRect AtlasAllocation;
// Orientation.
int32 Orientation = -1;
// Index in the visible card index buffer.
int32 IndexInVisibleCardIndexBuffer = -1;
// Index in the card list of the owning FLumenMeshCards.
int32 IndexInMeshCards = -1;
// Index of the owning FLumenMeshCards.
int32 MeshCardsIndex = -1;
// Resolution scale.
float ResolutionScale = 1.0f;
// Initialization.
void Initialize(float InResolutionScale, const FMatrix& LocalToWorld, const FLumenCardBuildData& CardBuildData, int32 InIndexInMeshCards, int32 InMeshCardsIndex);
// Set the transform data.
void SetTransform(const FMatrix& LocalToWorld, FVector CardLocalCenter, FVector CardLocalExtent, int32 InOrientation);
void SetTransform(const FMatrix& LocalToWorld, const FVector& LocalOrigin, const FVector& CardToLocalRotationX, const FVector& CardToLocalRotationY, const FVector& CardToLocalRotationZ, const FVector& InLocalExtent);
// Remove from the atlas (scene).
void RemoveFromAtlas(FLumenSceneData& LumenSceneData);
int32 GetNumTexels() const
{
return AtlasAllocation.Area();
}
inline FVector TransformWorldPositionToCardLocal(FVector WorldPosition) const
{
FVector Offset = WorldPosition - Origin;
return FVector(Offset | LocalToWorldRotationX, Offset | LocalToWorldRotationY, Offset | LocalToWorldRotationZ);
}
inline FVector TransformCardLocalPositionToWorld(FVector CardPosition) const
{
return Origin + CardPosition.X * LocalToWorldRotationX + CardPosition.Y * LocalToWorldRotationY + CardPosition.Z * LocalToWorldRotationZ;
}
};
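The two inline transform helpers at the end of FLumenCard project a world-space offset onto the card's three axes and back (in Unreal, `operator|` on FVector is the dot product). The following standalone sketch reproduces that math with a minimal, hypothetical `Vec3` and `FCardFrame` in place of the engine types; it assumes the three axes are orthonormal, which is what makes the two functions inverses of each other.

```cpp
struct Vec3 { double X, Y, Z; };

inline Vec3 operator+(Vec3 A, Vec3 B) { return { A.X + B.X, A.Y + B.Y, A.Z + B.Z }; }
inline Vec3 operator-(Vec3 A, Vec3 B) { return { A.X - B.X, A.Y - B.Y, A.Z - B.Z }; }
inline Vec3 operator*(Vec3 A, double S) { return { A.X * S, A.Y * S, A.Z * S }; }
inline double Dot(Vec3 A, Vec3 B) { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }

// Simplified stand-in for the transform part of FLumenCard (hypothetical type).
struct FCardFrame
{
    Vec3 Origin;
    Vec3 AxisX, AxisY, AxisZ; // Orthonormal card axes expressed in world space.

    // World -> card local: project the offset from the origin onto each axis.
    Vec3 WorldToCardLocal(Vec3 WorldPosition) const
    {
        const Vec3 Offset = WorldPosition - Origin;
        return { Dot(Offset, AxisX), Dot(Offset, AxisY), Dot(Offset, AxisZ) };
    }

    // Card local -> world: recombine the axes weighted by the local coordinates.
    Vec3 CardLocalToWorld(Vec3 CardPosition) const
    {
        return Origin + AxisX * CardPosition.X + AxisY * CardPosition.Y + AxisZ * CardPosition.Z;
    }
};
```

For a frame with origin (10, 20, 30) and axes X = (0, 1, 0), Y = (0, 0, 1), Z = (1, 0, 0), the world point (11, 22, 33) maps to the card-local point (2, 3, 1), and transforming that back returns the original world point.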
6.5.2.2 FLumenMeshCards
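FLumenMeshCards is the basic element used to compute the Surface Cache and the basic unit that makes up the Lumen Scene. It stores FLumenCard information for up to 6 faces (orientations), and each orientation can hold 0 to N FLumenCards (given by NumCardsPerOrientation).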
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenMeshCards.h
class FLumenMeshCards
{
public:
// Initialization.
void Initialize(
const FMatrix& InLocalToWorld,
const FBox& InBounds,
uint32 InFirstCardIndex,
uint32 InNumCards,
uint32 InNumCardsPerOrientation[6],
uint32 InCardOffsetPerOrientation[6])
{
Bounds = InBounds;
SetTransform(InLocalToWorld);
FirstCardIndex = InFirstCardIndex;
NumCards = InNumCards;
for (uint32 OrientationIndex = 0; OrientationIndex < 6; ++OrientationIndex)
{
NumCardsPerOrientation[OrientationIndex] = InNumCardsPerOrientation[OrientationIndex];
CardOffsetPerOrientation[OrientationIndex] = InCardOffsetPerOrientation[OrientationIndex];
}
}
// Set the transform matrix.
void SetTransform(const FMatrix& InLocalToWorld)
{
LocalToWorld = InLocalToWorld;
}
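// Local-to-world matrix.
FMatrix LocalToWorld;
// Local-space bounding box.
FBox Bounds;
// Index of the first FLumenCard.
uint32 FirstCardIndex = 0;
// Number of FLumenCards.
uint32 NumCards = 0;
// Number of FLumenCards for each of the 6 orientations.
uint32 NumCardsPerOrientation[6];
// Offset of the FLumenCards for each of the 6 orientations.
uint32 CardOffsetPerOrientation[6];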
};
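One plausible reading of these fields is that the cards of a mesh occupy a contiguous run starting at FirstCardIndex, with CardOffsetPerOrientation giving each orientation's start within that run. The sketch below illustrates that layout under this assumption; `FMeshCardsRanges`, `BuildRanges`, and `CardRange` are hypothetical helpers, not engine code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical mirror of the index bookkeeping in FLumenMeshCards.
struct FMeshCardsRanges
{
    std::uint32_t FirstCardIndex = 0;
    std::uint32_t NumCards = 0;
    std::array<std::uint32_t, 6> NumCardsPerOrientation{};
    std::array<std::uint32_t, 6> CardOffsetPerOrientation{};
};

// Derives the per-orientation offsets as a running (prefix) sum of the counts.
inline FMeshCardsRanges BuildRanges(std::uint32_t FirstCardIndex, const std::array<std::uint32_t, 6>& Counts)
{
    FMeshCardsRanges Ranges;
    Ranges.FirstCardIndex = FirstCardIndex;
    Ranges.NumCardsPerOrientation = Counts;
    std::uint32_t Offset = 0;
    for (int Orientation = 0; Orientation < 6; ++Orientation)
    {
        Ranges.CardOffsetPerOrientation[Orientation] = Offset;
        Offset += Counts[Orientation];
    }
    Ranges.NumCards = Offset;
    return Ranges;
}

// Half-open range [Begin, End) of global card indices for one orientation.
inline std::pair<std::uint32_t, std::uint32_t> CardRange(const FMeshCardsRanges& Ranges, int Orientation)
{
    assert(Orientation >= 0 && Orientation < 6);
    const std::uint32_t Begin = Ranges.FirstCardIndex + Ranges.CardOffsetPerOrientation[Orientation];
    return { Begin, Begin + Ranges.NumCardsPerOrientation[Orientation] };
}
```

With per-orientation counts {1, 2, 0, 1, 3, 1} and a first card index of 100, orientation 4 covers card indices 104 to 106, and orientation 2 yields an empty range.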
6.5.2.3 FLumenSceneData
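FLumenSceneData is the scene representation that Lumen uses for global illumination. Rather than Nanite's high-detail meshes, it is a coarse scene built from FLumenCard and FLumenMeshCards as its basic elements. Its definition and related types are as follows: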
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h
// Lumen primitive instance.
class FLumenPrimitiveInstance
{
public:
FBox WorldSpaceBoundingBox;
// Index of the FLumenMeshCards.
int32 MeshCardsIndex;
bool bValidMeshCards;
};
// Lumen primitive.
class FLumenPrimitive
{
public:
// World-space bounding box.
FBox WorldSpaceBoundingBox;
// Max extent of the FLumenMeshCards belonging to this primitive, used for early culling.
float MaxCardExtent;
// List of primitive instances.
TArray<FLumenPrimitiveInstance, TInlineAllocator<1>> Instances;
// The corresponding primitive in the real scene.
FPrimitiveSceneInfo* Primitive = nullptr;
// Whether the instances are merged.
bool bMergedInstances = false;
// Card resolution scale.
float CardResolutionScale = 1.0f;
// Number of FLumenMeshCards.
int32 NumMeshCards = 0;
// Mapping into LumenDFInstanceToDFObjectIndex.
uint32 LumenDFInstanceOffset = UINT32_MAX;
int32 LumenNumDFInstances = 0;
// Get the FLumenMeshCards index.
int32 GetMeshCardsIndex(int32 InstanceIndex) const
{
if (bMergedInstances)
{
return Instances[0].MeshCardsIndex;
}
if (InstanceIndex < Instances.Num())
{
return Instances[InstanceIndex].MeshCardsIndex;
}
return -1;
}
};
// Lumen scene data.
class FLumenSceneData
{
public:
int32 Generation;
// Buffers for uploading to the GPU.
FScatterUploadBuffer CardUploadBuffer;
FScatterUploadBuffer UploadMeshCardsBuffer;
FScatterUploadBuffer ByteBufferUploadBuffer;
FScatterUploadBuffer UploadPrimitiveBuffer;
FUniqueIndexList CardIndicesToUpdateInBuffer;
FRWBufferStructured CardBuffer;
TArray<FBox> PrimitiveModifiedBounds;
// All Lumen primitives in the Lumen scene.
TArray<FLumenPrimitive> LumenPrimitives;
// FLumenMeshCards data.
FUniqueIndexList MeshCardsIndicesToUpdateInBuffer;
TSparseSpanArray<FLumenMeshCards> MeshCards;
TSparseSpanArray<FLumenCard> Cards;
TArray<int32, TInlineAllocator<8>> DistantCardIndices;
FRWBufferStructured MeshCardsBuffer;
FRWByteAddressBuffer DFObjectToMeshCardsIndexBuffer;
// Mapping from primitive to LumenDFInstance.
FUniqueIndexList PrimitivesToUpdate;
FRWByteAddressBuffer PrimitiveToDFLumenInstanceOffsetBuffer;
uint32 PrimitiveToLumenDFInstanceOffsetBufferSize = 0;
// Mapping from LumenDFInstance to DFObjectIndex.
FUniqueIndexList DFObjectIndicesToUpdateInBuffer;
FUniqueIndexList LumenDFInstancesToUpdate;
TSparseSpanArray<int32> LumenDFInstanceToDFObjectIndex;
FRWByteAddressBuffer LumenDFInstanceToDFObjectIndexBuffer;
uint32 LumenDFInstanceToDFObjectIndexBufferSize = 0;
// List of visible card indices.
TArray<int32> VisibleCardsIndices;
TRefCountPtr<FRDGPooledBuffer> VisibleCardsIndexBuffer;
// --- Data captured from the triangle scene ---
TRefCountPtr<IPooledRenderTarget> AlbedoAtlas;
TRefCountPtr<IPooledRenderTarget> NormalAtlas;
TRefCountPtr<IPooledRenderTarget> EmissiveAtlas;
// --- Generated data ---
TRefCountPtr<IPooledRenderTarget> DepthAtlas;
TRefCountPtr<IPooledRenderTarget> FinalLightingAtlas;
TRefCountPtr<IPooledRenderTarget> IrradianceAtlas;
TRefCountPtr<IPooledRenderTarget> IndirectIrradianceAtlas;
TRefCountPtr<IPooledRenderTarget> RadiosityAtlas;
TRefCountPtr<IPooledRenderTarget> OpacityAtlas;
// Other data.
bool bFinalLightingAtlasContentsValid;
FIntPoint MaxAtlasSize;
FBinnedTextureLayout AtlasAllocator;
int32 NumCardTexels = 0;
int32 NumMeshCardsToAddToSurfaceCache = 0;
// Pending primitive add, update, and remove data.
bool bTrackAllPrimitives;
TSet<FPrimitiveSceneInfo*> PendingAddOperations;
TSet<FPrimitiveSceneInfo*> PendingUpdateOperations;
TArray<FLumenPrimitiveRemoveInfo> PendingRemoveOperations;
FLumenSceneData(EShaderPlatform ShaderPlatform, EWorldType::Type WorldType);
~FLumenSceneData();
// Primitive add, update, and remove operations.
void AddPrimitiveToUpdate(int32 PrimitiveIndex);
void AddPrimitive(FPrimitiveSceneInfo* InPrimitive);
void UpdatePrimitive(FPrimitiveSceneInfo* InPrimitive);
void RemovePrimitive(FPrimitiveSceneInfo* InPrimitive, int32 PrimitiveIndex);
// Add, update, and remove visible cards and FLumenMeshCards.
void AddCardToVisibleCardList(int32 CardIndex);
void RemoveCardFromVisibleCardList(int32 CardIndex);
void AddMeshCards(int32 LumenPrimitiveIndex, int32 LumenInstanceIndex);
void UpdateMeshCards(const FMatrix& LocalToWorld, int32 MeshCardsIndex, const FMeshCardsBuildData& MeshCardsBuildData);
void RemoveMeshCards(FLumenPrimitive& LumenPrimitive, FLumenPrimitiveInstance& LumenPrimitiveInstance);
bool HasPendingOperations() const
{
return PendingAddOperations.Num() > 0 || PendingUpdateOperations.Num() > 0 || PendingRemoveOperations.Num() > 0;
}
void UpdatePrimitiveToDistanceFieldInstanceMapping(FScene& Scene, FRHICommandListImmediate& RHICmdList);
private:
// Add FLumenMeshCards from build data.
int32 AddMeshCardsFromBuildData(const FMatrix& LocalToWorld, const FMeshCardsBuildData& MeshCardsBuildData, float ResolutionScale);
};
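This shows that FLumenSceneData stores the FLumenMeshCards together with the primitives (FLumenPrimitive) and primitive instances (FLumenPrimitiveInstance) that are built on them. Each FLumenPrimitive references a number of FLumenMeshCards and also stores an FPrimitiveSceneInfo pointer, which identifies the FPrimitiveSceneInfo in the real scene that it coarsely represents.
6.5.3 Lumen Data Building
Before the actual rendering, Lumen performs a lot of data building, including generating Mesh Distance Fields, the Global Distance Field, and mesh cards.
The first time a Lumen project is launched, a lot of data is built, including the mesh distance fields.
6.5.3.1 CardRepresentation
To build the mesh card representation, UE5 split out a separate MeshCardRepresentation module. Its core concepts and types are as follows: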
// Engine\Source\Runtime\Engine\Public\MeshCardRepresentation.h
// FLumenCard build data.
class FLumenCardBuildData
{
public:
// Center and extent.
FVector Center;
FVector Extent;
// Orientation order: -X, +X, -Y, +Y, -Z, +Z
int32 Orientation;
int32 LODLevel;
// Reorder the components of Extent according to the orientation.
static FVector TransformFaceExtent(FVector Extent, int32 Orientation)
{
if (Orientation / 2 == 2) // Orientation: -Z, +Z
{
return FVector(Extent.Y, Extent.X, Extent.Z);
}
else if (Orientation / 2 == 1) // Orientation: -Y, +Y
{
return FVector(Extent.Z, Extent.X, Extent.Y);
}
else // (Orientation / 2 == 0), orientation: -X, +X
{
return FVector(Extent.Y, Extent.Z, Extent.X);
}
}
};
// FLumenMeshCards build data.
class FMeshCardsBuildData
{
public:
FBox Bounds;
int32 MaxLODLevel;
// List of FLumenCard build data.
TArray<FLumenCardBuildData> CardBuildData;
(......)
};
// Unique id for each card representation data instance.
class FCardRepresentationDataId
{
public:
uint32 Value = 0;
bool IsValid() const
{
return Value != 0;
}
bool operator==(FCardRepresentationDataId B) const
{
return Value == B.Value;
}
friend uint32 GetTypeHash(FCardRepresentationDataId DataId)
{
return GetTypeHash(DataId.Value);
}
};
// Payload and output data of the mesh card representation build process.
class FCardRepresentationData : public FDeferredCleanupInterface
{
public:
// Mesh cards build data and id.
FMeshCardsBuildData MeshCardsBuildData;
FCardRepresentationDataId CardRepresentationDataId;
(......)
#if WITH_EDITORONLY_DATA
// Cache the card representation data.
void CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
#endif
};
// Build task worker.
class FAsyncCardRepresentationTaskWorker : public FNonAbandonableTask
{
public:
(.....)
void DoWork();
private:
FAsyncCardRepresentationTask& Task;
};
// Data carrier for a build task.
class FAsyncCardRepresentationTask
{
public:
bool bSuccess = false;
#if WITH_EDITOR
TArray<FSignedDistanceFieldBuildMaterialData> MaterialBlendModes;
#endif
FSourceMeshDataForDerivedDataTask SourceMeshData;
bool bGenerateDistanceFieldAsIfTwoSided = false;
UStaticMesh* StaticMesh = nullptr;
UStaticMesh* GenerateSource = nullptr;
FString DDCKey;
FCardRepresentationData* GeneratedCardRepresentation;
TUniquePtr<FAsyncTask<FAsyncCardRepresentationTaskWorker>> AsyncTask = nullptr;
};
// Type that manages the asynchronous builds of mesh card representations.
class FCardRepresentationAsyncQueue : public FGCObject
{
public:
// Add a new build task.
ENGINE_API void AddTask(FAsyncCardRepresentationTask* Task);
// Process the asynchronous tasks.
ENGINE_API void ProcessAsyncTasks(bool bLimitExecutionTime = false);
// Cancel builds.
ENGINE_API void CancelBuild(UStaticMesh* StaticMesh);
ENGINE_API void CancelAllOutstandingBuilds();
// Block until builds complete.
ENGINE_API void BlockUntilBuildComplete(UStaticMesh* StaticMesh, bool bWarnIfBlocked);
ENGINE_API void BlockUntilAllBuildsComplete();
(......)
};
// Global build queue.
extern ENGINE_API FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue;
extern ENGINE_API FString BuildCardRepresentationDerivedDataKey(const FString& InMeshKey);
extern ENGINE_API void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, class FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
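The component shuffle in FLumenCardBuildData::TransformFaceExtent is easier to follow with concrete numbers. The standalone sketch below copies its logic onto a minimal, hypothetical `Vec3` so it can be run outside the engine; in every case the third component of the result is the extent along the axis the card faces.

```cpp
#include <cassert>

struct Vec3 { float X, Y, Z; };

// Orientation order follows the source comment: -X, +X, -Y, +Y, -Z, +Z.
inline Vec3 TransformFaceExtent(Vec3 Extent, int Orientation)
{
    assert(Orientation >= 0 && Orientation < 6);
    switch (Orientation / 2)
    {
    case 2:  return { Extent.Y, Extent.X, Extent.Z }; // Facing -Z / +Z.
    case 1:  return { Extent.Z, Extent.X, Extent.Y }; // Facing -Y / +Y.
    default: return { Extent.Y, Extent.Z, Extent.X }; // Facing -X / +X.
    }
}
```

For an extent of (1, 2, 3), a card facing +Z gets (2, 1, 3), a card facing -Y gets (3, 1, 2), and a card facing -X gets (2, 3, 1).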
6.5.3.2 GCardRepresentationAsyncQueue
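To build the data that Lumen needs, UE5 declares two global queue variables: GCardRepresentationAsyncQueue and GDistanceFieldAsyncQueue. The former builds the Lumen Card data and the latter builds the distance field data. They are created and updated as follows: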
// Engine\Source\Runtime\Launch\Private\LaunchEngineLoop.cpp
int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
(......)
if (!FPlatformProperties::RequiresCookedData())
{
(......)
// Create the global async queues.
GDistanceFieldAsyncQueue = new FDistanceFieldAsyncQueue();
GCardRepresentationAsyncQueue = new FCardRepresentationAsyncQueue();
(......)
}
(......)
}
void FEngineLoop::Tick()
{
(......)
// Update the global async queues every frame.
if (GDistanceFieldAsyncQueue)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GDistanceFieldAsyncQueue);
GDistanceFieldAsyncQueue->ProcessAsyncTasks();
}
if (GCardRepresentationAsyncQueue)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GCardRepresentationAsyncQueue);
GCardRepresentationAsyncQueue->ProcessAsyncTasks();
}
(......)
}
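Because GDistanceFieldAsyncQueue already existed in UE4, this section skips it and focuses on GCardRepresentationAsyncQueue.
As for when a CardRepresentation is added to the global build queue GCardRepresentationAsyncQueue, the answer can be found in MeshCardRepresentation.cpp: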
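The pattern behind these queues (work is handed to background threads, and the main loop polls for finished results once per frame without blocking) can be sketched with the C++ standard library. This is only an illustration of the pattern: `FBuildQueue` and its members are hypothetical, and the engine's queue is built on its own task and thread-pool types rather than on std::async.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical polling build queue, loosely modeled on the AddTask /
// ProcessAsyncTasks / BlockUntilAllBuildsComplete interface shown above.
class FBuildQueue
{
    struct FTask
    {
        std::string Name;
        std::future<int> Result;
    };

public:
    // Starts the work on a background thread and remembers the pending result.
    void AddTask(std::string Name, std::function<int()> Work)
    {
        Pending.push_back({ std::move(Name), std::async(std::launch::async, std::move(Work)) });
    }

    // Called once per frame: collects the tasks that have finished, without blocking.
    void ProcessAsyncTasks()
    {
        for (auto It = Pending.begin(); It != Pending.end();)
        {
            if (It->Result.wait_for(std::chrono::seconds(0)) == std::future_status::ready)
            {
                Completed.emplace_back(It->Name, It->Result.get());
                It = Pending.erase(It);
            }
            else
            {
                ++It;
            }
        }
    }

    // Waits for every outstanding task, e.g. before data is needed synchronously.
    void BlockUntilAllBuildsComplete()
    {
        for (FTask& Task : Pending)
        {
            Completed.emplace_back(Task.Name, Task.Result.get());
        }
        Pending.clear();
    }

    std::size_t NumPending() const { return Pending.size(); }
    const std::vector<std::pair<std::string, int>>& GetCompleted() const { return Completed; }

private:
    std::vector<FTask> Pending;
    std::vector<std::pair<std::string, int>> Completed;
};
```

Polling from the frame loop keeps the main thread responsive while builds are in flight; the blocking variant trades that responsiveness for a guarantee that the results are available on return.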
FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue = NULL;
// Begin caching the mesh card representation.
void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
static const auto CVarCards = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.MeshCardRepresentation"));
if (CVarCards->GetValueOnAnyThread() != 0)
{
FString Key = BuildCardRepresentationDerivedDataKey(DistanceFieldKey);
if (RenderData.LODResources.IsValidIndex(0))
{
// Create the FCardRepresentationData instance.
if (!RenderData.LODResources[0].CardRepresentationData)
{
RenderData.LODResources[0].CardRepresentationData = new FCardRepresentationData();
}
const FMeshBuildSettings& BuildSettings = StaticMeshAsset->GetSourceModel(0).BuildSettings;
UStaticMesh* MeshToGenerateFrom = StaticMeshAsset;
// Cache the FCardRepresentationData.
RenderData.LODResources[0].CardRepresentationData->CacheDerivedData(Key, TargetPlatform, StaticMeshAsset, MeshToGenerateFrom, BuildSettings.bGenerateDistanceFieldAsIfTwoSided, OptionalSourceMeshData);
}
}
}
// Cache the FCardRepresentationData.
void FCardRepresentationData::CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
TArray<uint8> DerivedData;
(......)
{
COOK_STAT(Timer.TrackCyclesOnly());
// Create a new build task, FAsyncCardRepresentationTask.
FAsyncCardRepresentationTask* NewTask = new FAsyncCardRepresentationTask;
NewTask->DDCKey = InDDCKey;
check(Mesh && GenerateSource);
NewTask->StaticMesh = Mesh;
NewTask->GenerateSource = GenerateSource;
NewTask->GeneratedCardRepresentation = new FCardRepresentationData();
NewTask->bGenerateDistanceFieldAsIfTwoSided = bGenerateDistanceFieldAsIfTwoSided;
// Collect the material blend modes.
for (int32 MaterialIndex = 0; MaterialIndex < Mesh->GetStaticMaterials().Num(); MaterialIndex++)
{
FSignedDistanceFieldBuildMaterialData MaterialData;
// Default material blend mode
MaterialData.BlendMode = BLEND_Opaque;
MaterialData.bTwoSided = false;
if (Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface)
{
MaterialData.BlendMode = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->GetBlendMode();
MaterialData.bTwoSided = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->IsTwoSided();
}
NewTask->MaterialBlendModes.Add(MaterialData);
}
// Nanite overrides the source static mesh with a coarse representation. The original data must be loaded before the mesh SDF is built.
if (OptionalSourceMeshData)
{
NewTask->SourceMeshData = *OptionalSourceMeshData;
}
// Otherwise, for Nanite-enabled meshes, build the vertex positions and triangle indices from the source mesh.
else if (Mesh->NaniteSettings.bEnabled)
{
IMeshBuilderModule& MeshBuilderModule = IMeshBuilderModule::GetForPlatform(TargetPlatform);
if (!MeshBuilderModule.BuildMeshVertexPositions(Mesh, NewTask->SourceMeshData.TriangleIndices, NewTask->SourceMeshData.VertexPositions))
{
UE_LOG(LogStaticMesh, Error, TEXT("Failed to build static mesh. See previous line(s) for details."));
}
}
// Add the task to the global queue GCardRepresentationAsyncQueue.
GCardRepresentationAsyncQueue->AddTask(NewTask);
}
}
6.5.3.3 GenerateCardRepresentationData
Following the call stack of FCardRepresentationAsyncQueue, it is easy to see that it eventually reaches FMeshUtilities::GenerateCardRepresentationData, which performs the actual mesh card build logic:
// Engine\Source\Developer\MeshUtilities\Private\MeshCardRepresentationUtilities.cpp
bool FMeshUtilities::GenerateCardRepresentationData(
FString MeshName,
const FSourceMeshDataForDerivedDataTask& SourceMeshData,
const FStaticMeshLODResources& LODModel,
class FQueuedThreadPool& ThreadPool,
const TArray<FSignedDistanceFieldBuildMaterialData>& MaterialBlendModes,
const FBoxSphereBounds& Bounds,
const FDistanceFieldVolumeData* DistanceFieldVolumeData,
bool bGenerateAsIfTwoSided,
FCardRepresentationData& OutData)
{
// Build the Embree scene.
FEmbreeScene EmbreeScene;
MeshRepresentation::SetupEmbreeScene(MeshName,
SourceMeshData,
LODModel,
MaterialBlendModes,
bGenerateAsIfTwoSided,
EmbreeScene);
if (!EmbreeScene.EmbreeScene)
{
return false;
}
// Set up the context.
FGenerateCardMeshContext Context(MeshName, EmbreeScene.EmbreeScene, EmbreeScene.EmbreeDevice, OutData);
// Build the mesh cards.
BuildMeshCards(DistanceFieldVolumeData ? DistanceFieldVolumeData->LocalSpaceMeshBounds : Bounds.GetBox(), Context, OutData);
MeshRepresentation::DeleteEmbreeScene(EmbreeScene);
(......)
return true;
}
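This shows that the mesh card build process uses the third-party Embree library.
About Embree
Embree is an open-source library developed and maintained by Intel. It is a collection of high-performance ray tracing kernels that helps developers improve the performance of photorealistic rendering applications. Its features include advanced hair geometry, motion blur, dynamic scenes, and multi-level instancing:
Embree's implementation has the following characteristics:
- The kernels are optimized for the latest Intel processors with support for the SSE, AVX, AVX2, and AVX-512 instructions.
- Runtime code selection chooses the traversal and build algorithms that best match the CPU's instruction set.
- Applications written with the Intel SPMD Program Compiler (ISPC) are supported, and an ISPC interface to the core ray tracing algorithms is provided.
- It contains algorithms optimized for incoherent workloads (such as Monte Carlo ray tracing) and for coherent workloads (such as primary visibility and hard shadow rays).
In short, Embree is a highly optimized CPU-based ray tracing accelerator, and it does not use GPU hardware acceleration here. Because of this, the build time of Lumen's mesh cards depends mainly on CPU performance.
The core build logic is in BuildMeshCards: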
void BuildMeshCards(const FBox& MeshBounds, const FGenerateCardMeshContext& Context, FCardRepresentationData& OutData)
{
static const auto CVarMeshCardRepresentationMinSurface = IConsoleManager::Get().FindTConsoleVariableDataFloat(TEXT("r.MeshCardRepresentation.MinSurface"));
const float MinSurfaceThreshold = CVarMeshCardRepresentationMinSurface->GetValueOnAnyThread();
// Make sure the generated card bounds are not empty.
const FVector MeshCardsBoundsCenter = MeshBounds.GetCenter();
const FVector MeshCardsBoundsExtent = FVector::Max(MeshBounds.GetExtent() + 1.0f, FVector(5.0f));
const FBox MeshCardsBounds(MeshCardsBoundsCenter - MeshCardsBoundsExtent, MeshCardsBoundsCenter + MeshCardsBoundsExtent);
// Initialize part of the output data.
OutData.MeshCardsBuildData.Bounds = MeshCardsBounds;
OutData.MeshCardsBuildData.MaxLODLevel = 1;
OutData.MeshCardsBuildData.CardBuildData.Reset();
// Set up the sampling and voxel data.
const float SamplesPerWorldUnit = 1.0f / 10.0f;
const int32 MinSamplesPerAxis = 4;
const int32 MaxSamplesPerAxis = 64;
FIntVector VolumeSizeInVoxels;
VolumeSizeInVoxels.X = FMath::Clamp<int32>(MeshCardsBounds.GetSize().X * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
VolumeSizeInVoxels.Y = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Y * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
VolumeSizeInVoxels.Z = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Z * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
// Size of a single voxel.
const FVector VoxelExtent = MeshCardsBounds.GetSize() / FVector(VolumeSizeInVoxels);
// Generate stratified random ray directions over the hemisphere.
TArray<FVector4> RayDirectionsOverHemisphere;
{
FRandomStream RandomStream(0);
MeshUtilities::GenerateStratifiedUniformHemisphereSamples(64, RandomStream, RayDirectionsOverHemisphere);
}
// Iterate over the 6 orientations and generate card data for each.
for (int32 Orientation = 0; Orientation < 6; ++Orientation)
{
// Initialize the heightfield and ray data.
FIntPoint HeighfieldSize(0, 0);
FVector RayDirection(0.0f, 0.0f, 0.0f);
FVector RayOriginFrame = MeshCardsBounds.Min;
FVector HeighfieldStepX(0.0f, 0.0f, 0.0f);
FVector HeighfieldStepY(0.0f, 0.0f, 0.0f);
float MaxRayT = 0.0f;
int32 MeshSliceNum = 0;
// Set up the heightfield and ray data according to the orientation.
switch (Orientation / 2)
{
case 0: // Orientation: -X, +X
MaxRayT = MeshCardsBounds.GetSize().X + 0.1f;
MeshSliceNum = VolumeSizeInVoxels.X;
HeighfieldSize.X = VolumeSizeInVoxels.Y;
HeighfieldSize.Y = VolumeSizeInVoxels.Z;
HeighfieldStepX = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.X, 0.0f);
HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
break;
case 1: // Orientation: -Y, +Y
MaxRayT = MeshCardsBounds.GetSize().Y + 0.1f;
MeshSliceNum = VolumeSizeInVoxels.Y;
HeighfieldSize.X = VolumeSizeInVoxels.X;
HeighfieldSize.Y = VolumeSizeInVoxels.Z;
HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
break;
case 2: // Orientation: -Z, +Z
MaxRayT = MeshCardsBounds.GetSize().Z + 0.1f;
MeshSliceNum = VolumeSizeInVoxels.Z;
HeighfieldSize.X = VolumeSizeInVoxels.X;
HeighfieldSize.Y = VolumeSizeInVoxels.Y;
HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
HeighfieldStepY = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.Y, 0.0f);
break;
}
// Set the ray direction according to the orientation.
switch (Orientation)
{
case 0:
RayDirection.X = +1.0f;
break;
case 1:
RayDirection.X = -1.0f;
RayOriginFrame.X = MeshCardsBounds.Max.X;
break;
case 2:
RayDirection.Y = +1.0f;
break;
case 3:
RayDirection.Y = -1.0f;
RayOriginFrame.Y = MeshCardsBounds.Max.Y;
break;
case 4:
RayDirection.Z = +1.0f;
break;
case 5:
RayDirection.Z = -1.0f;
RayOriginFrame.Z = MeshCardsBounds.Max.Z;
break;
default:
check(false);
};
TArray<TArray<FSurfacePoint, TInlineAllocator<16>>> HeightfieldLayers;
HeightfieldLayers.SetNum(HeighfieldSize.X * HeighfieldSize.Y);
// Fill in the surface point data.
{
TRACE_CPUPROFILER_EVENT_SCOPE(FillSurfacePoints);
TArray<float> Heightfield;
Heightfield.SetNum(HeighfieldSize.X * HeighfieldSize.Y);
for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
{
for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
{
Heightfield[HeighfieldX + HeighfieldY * HeighfieldSize.X] = -1.0f;
}
}
for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
{
for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
{
FVector RayOrigin = RayOriginFrame;
RayOrigin += (HeighfieldX + 0.5f) * HeighfieldStepX;
RayOrigin += (HeighfieldY + 0.5f) * HeighfieldStepY;
float StepTMin = 0.0f;
for (int32 StepIndex = 0; StepIndex < 64; ++StepIndex)
{
FEmbreeRay EmbreeRay;
EmbreeRay.ray.org_x = RayOrigin.X;
EmbreeRay.ray.org_y = RayOrigin.Y;
EmbreeRay.ray.org_z = RayOrigin.Z;
EmbreeRay.ray.dir_x = RayDirection.X;
EmbreeRay.ray.dir_y = RayDirection.Y;
EmbreeRay.ray.dir_z = RayDirection.Z;
EmbreeRay.ray.tnear = StepTMin;
EmbreeRay.ray.tfar = FLT_MAX;
FEmbreeIntersectionContext EmbreeContext;
rtcInitIntersectContext(&EmbreeContext);
rtcIntersect1(Context.FullMeshEmbreeScene, &EmbreeContext, &EmbreeRay);
if (EmbreeRay.hit.geomID != RTC_INVALID_GEOMETRY_ID && EmbreeRay.hit.primID != RTC_INVALID_GEOMETRY_ID)
{
const FVector SurfacePoint = RayOrigin + RayDirection * EmbreeRay.ray.tfar;
const FVector SurfaceNormal = EmbreeRay.GetHitNormal();
const float NdotD = FVector::DotProduct(RayDirection, SurfaceNormal);
const bool bPassCullTest = EmbreeContext.IsHitTwoSided() || NdotD <= 0.0f;
const bool bPassProjectionAngleTest = FMath::Abs(NdotD) >= FMath::Cos(75.0f * (PI / 180.0f));
const float MinDistanceBetweenPoints = (MaxRayT / MeshSliceNum);
const bool bPassDistanceToAnotherSurfaceTest = EmbreeRay.ray.tnear <= 0.0f || (EmbreeRay.ray.tfar - EmbreeRay.ray.tnear > MinDistanceBetweenPoints);
if (bPassCullTest && bPassProjectionAngleTest && bPassDistanceToAnotherSurfaceTest)
{
const bool bIsInsideMesh = IsSurfacePointInsideMesh(Context.FullMeshEmbreeScene, SurfacePoint, SurfaceNormal, RayDirectionsOverHemisphere);
if (!bIsInsideMesh)
{
HeightfieldLayers[HeighfieldX + HeighfieldY * HeighfieldSize.X].Add(
{ EmbreeRay.ray.tnear, EmbreeRay.ray.tfar }
);
}
}
StepTMin = EmbreeRay.ray.tfar + 0.01f;
}
else
{
break;
}
}
}
}
}
const int32 MinCardHits = FMath::Floor(HeighfieldSize.X * HeighfieldSize.Y * MinSurfaceThreshold);
TArray<FPlacedCard, TInlineAllocator<16>> PlacedCards;
int32 PlacedCardsHits = 0;
// Place a default card.
{
FPlacedCard PlacedCard;
PlacedCard.SliceMin = 0;
PlacedCard.SliceMax = MeshSliceNum;
PlacedCards.Add(PlacedCard);
PlacedCardsHits = UpdatePlacedCards(PlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);
if (PlacedCardsHits < MinCardHits)
{
PlacedCards.Reset();
}
}
SerializePlacedCards(PlacedCards, /*LOD level*/ 0, Orientation, MinCardHits, MeshCardsBounds, OutData);
// Try to place more cards by splitting existing ones.
for (uint32 CardPlacementIteration = 0; CardPlacementIteration < 4; ++CardPlacementIteration)
{
TArray<FPlacedCard, TInlineAllocator<16>> BestPlacedCards;
int32 BestPlacedCardHits = PlacedCardsHits;
for (int32 PlacedCardIndex = 0; PlacedCardIndex < PlacedCards.Num(); ++PlacedCardIndex)
{
const FPlacedCard& PlacedCard = PlacedCards[PlacedCardIndex];
for (int32 SliceIndex = PlacedCard.SliceMin + 2; SliceIndex < PlacedCard.SliceMax; ++SliceIndex)
{
TArray<FPlacedCard, TInlineAllocator<16>> TempPlacedCards(PlacedCards);
FPlacedCard NewPlacedCard;
NewPlacedCard.SliceMin = SliceIndex;
NewPlacedCard.SliceMax = PlacedCard.SliceMax;
TempPlacedCards[PlacedCardIndex].SliceMax = SliceIndex - 1;
TempPlacedCards.Insert(NewPlacedCard, PlacedCardIndex + 1);
const int32 NumHits = UpdatePlacedCards(TempPlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);
if (NumHits > BestPlacedCardHits)
{
BestPlacedCards = TempPlacedCards;
BestPlacedCardHits = NumHits;
}
}
}
if (BestPlacedCardHits >= PlacedCardsHits + MinCardHits)
{
PlacedCards = BestPlacedCards;
PlacedCardsHits = BestPlacedCardHits;
}
}
SerializePlacedCards(PlacedCards, /*LOD level*/ 1, Orientation, MinCardHits, MeshCardsBounds, OutData);
} // for (int32 Orientation = 0; Orientation < 6; ++Orientation)
}
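The code above shows that card data is built with the help of height field ray tracing, a technique that has existed for many years. Its core idea is to discretize the mesh into equally sized 3D voxels, then, at the chosen resolution, cast one ray from the camera position through each pixel and test it against the voxels, which renders the silhouette of the height field. That silhouette is the boundary that divides the screen into the region covered by the height field and the region above it:
The silhouette obtained this way is visibly aliased. The paper Ray Tracing Height Fields describes several ways to reconstruct the surface and reduce the aliasing: height field planes, linear approximation planes, triangles, and bilinear surfaces.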
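To make the idea concrete, here is a minimal sketch of tracing a ray against a height field. It assumes a single row of unit-width columns and a fixed-step march, which is far simpler than both the schemes in the paper and the Embree-based code above. The names `HeightFieldHit` and `TraceHeightField` are hypothetical and do not come from UE5.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration, not UE5 code: a ray marched over a 1D height field.
struct HeightFieldHit
{
    std::size_t Column; // index of the column that was hit
    float T;            // ray parameter at the hit sample
};

// Heights[i] is the terrain height over x in [i, i + 1).
// The ray is (OriginX + T * DirX, OriginZ + T * DirZ) for T in [0, MaxT].
// Returns true and fills OutHit at the first sample that lies at or below the terrain.
bool TraceHeightField(
    const std::vector<float>& Heights,
    float OriginX, float OriginZ,
    float DirX, float DirZ,
    float MaxT, float StepT,
    HeightFieldHit& OutHit)
{
    if (StepT <= 0.0f)
    {
        return false; // a non-positive step would never advance the ray
    }
    for (float T = 0.0f; T <= MaxT; T += StepT)
    {
        const float X = OriginX + T * DirX;
        const float Z = OriginZ + T * DirZ;
        if (X < 0.0f || X >= static_cast<float>(Heights.size()))
        {
            continue; // this sample is outside the field, nothing to hit
        }
        const std::size_t Column = static_cast<std::size_t>(X);
        if (Z <= Heights[Column])
        {
            OutHit = HeightFieldHit{Column, T};
            return true;
        }
    }
    return false;
}
```

A fixed step can skip thin features and reports the hit only to within one step, which is one source of the aliasing mentioned above; the reconstruction methods from the paper refine the surface inside each cell instead.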
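After the build described above, mesh card data like the following is produced:
Top: the mesh rendered normally; bottom: visualization of the mesh card data.
Mesh card data has LODs, and the LOD level is chosen according to the distance from the camera (see the linked video).
In addition, the mesh distance field data built by UE5 has been improved: sparse storage raises its precision (left in the figure below), which is clearly better than UE4 (right in the figure below).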
6.5.4 Lumen Rendering Flow
Lumen's main rendering flow still lives in FDeferredShadingSceneRenderer::Render:
void FDeferredShadingSceneRenderer::Render(FRDGBuilder& GraphBuilder)
{
(......)
bool bAnyLumenEnabled = false;
if (!IsSimpleForwardShadingEnabled(ShaderPlatform))
{
(......)
// Check whether any view has Lumen enabled.
for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
{
FViewInfo& View = Views[ViewIndex];
bAnyLumenEnabled = bAnyLumenEnabled
|| GetViewPipelineState(View).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen
|| GetViewPipelineState(View).ReflectionsMethod == EReflectionsMethod::Lumen;
}
(......)
}
(......)
// PrePass.
RenderPrePass(...);
(......)
// Update the Lumen scene.
UpdateLumenScene(GraphBuilder);
// If occlusion culling runs before the BasePass, render Lumen scene lighting before RenderBasePass.
// bOcclusionBeforeBasePass defaults to false.
if (bOcclusionBeforeBasePass)
{
{
LLM_SCOPE_BYTAG(Lumen);
RenderLumenSceneLighting(GraphBuilder, Views[0]);
}
ComputeVolumetricFog(GraphBuilder);
}
(......)
// BasePass.
RenderBasePass(...);
(......)
// Lumen lighting after the BasePass.
if (!bOcclusionBeforeBasePass)
{
const bool bAfterBasePass = true;
// Render shadows.
AllocateVirtualShadowMaps(bAfterBasePass);
RenderShadowDepthMaps(GraphBuilder, InstanceCullingManager);
{
LLM_SCOPE_BYTAG(Lumen);
// Render Lumen scene lighting.
RenderLumenSceneLighting(GraphBuilder, Views[0]);
}
AddServiceLocalQueuePass(GraphBuilder);
}
(......)
// Render the Lumen visualization.
RenderLumenSceneVisualization(GraphBuilder, SceneTextures);
// Render indirect diffuse lighting and AO.
RenderDiffuseIndirectAndAmbientOcclusion(GraphBuilder, SceneTextures, LightingChannelsTexture, true);
(......)
}
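The red boxes below mark Lumen's execution steps in a RenderDoc frame capture:
Lumen's lighting consists of two main stages: updating the scene (UpdateLumenScene) and computing the scene lighting (RenderLumenSceneLighting).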
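The ordering of these stages relative to the BasePass depends on bOcclusionBeforeBasePass. The sketch below only restates the control flow of the Render listing above as a list of step names; `LumenPassOrder` is a hypothetical helper and not an engine function, and the elided parts of Render are left out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the Lumen-related steps of Render() in execution order,
// following the control flow of the listing above.
std::vector<std::string> LumenPassOrder(bool bOcclusionBeforeBasePass)
{
    std::vector<std::string> Order;
    Order.push_back("RenderPrePass");
    Order.push_back("UpdateLumenScene");
    if (bOcclusionBeforeBasePass)
    {
        // Scene lighting is computed before the BasePass in this configuration.
        Order.push_back("RenderLumenSceneLighting");
    }
    Order.push_back("RenderBasePass");
    if (!bOcclusionBeforeBasePass)
    {
        // Default path: shadow depth maps first, then Lumen scene lighting.
        Order.push_back("RenderShadowDepthMaps");
        Order.push_back("RenderLumenSceneLighting");
    }
    Order.push_back("RenderLumenSceneVisualization");
    Order.push_back("RenderDiffuseIndirectAndAmbientOcclusion");
    return Order;
}
```

In both configurations UpdateLumenScene runs before RenderLumenSceneLighting, so the surface cache captures are always in place before lighting is evaluated on them.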
6.5.5 Lumen Scene Update
6.5.5.1 UpdateLumenScene
The Lumen scene update is handled mainly by UpdateLumenScene:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp
void FDeferredShadingSceneRenderer::UpdateLumenScene(FRDGBuilder& GraphBuilder)
{
LLM_SCOPE_BYTAG(Lumen);
FViewInfo& View = Views[0];
const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);
const bool bAnyLumenActive = ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen;
if (bAnyLumenActive
// Do not update the scene for secondary views (planar reflections, scene captures, reflection captures).
&& !View.bIsPlanarReflection
&& !View.bIsSceneCapture
&& !View.bIsReflectionCapture
&& View.ViewState)
{
const double StartTime = FPlatformTime::Seconds();
// Fetch the Lumen scene data and the cards to render.
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;
RDG_EVENT_SCOPE(GraphBuilder, "UpdateLumenScene: %u card captures %.3fM texels", CardsToRender.Num(), LumenCardRenderer.NumCardTexelsToCapture / 1e6f);
// Update the card scene buffer.
UpdateCardSceneBuffer(GraphBuilder.RHICmdList, ViewFamily, Scene);
// The Lumen primitive mapping buffer was updated, so the view uniform buffer must be recreated.
Lumen::SetupViewUniformBufferParameters(Scene, *View.CachedViewUniformShaderParameters);
View.ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*View.CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
LumenCardRenderer.CardIdsToRender.Empty(CardsToRender.Num());
// Temporary depth buffer used for card capture.
const FRDGTextureDesc DepthStencilAtlasDesc = FRDGTextureDesc::Create2D(LumenSceneData.MaxAtlasSize, PF_DepthStencil, FClearValueBinding::DepthZero, TexCreate_ShaderResource | TexCreate_DepthStencilTargetable | TexCreate_NoFastClear);
FRDGTextureRef DepthStencilAtlasTexture = GraphBuilder.CreateTexture(DepthStencilAtlasDesc, TEXT("Lumen.DepthStencilAtlas"));
if (CardsToRender.Num() > 0)
{
FRHIBuffer* PrimitiveIdVertexBuffer = nullptr;
FInstanceCullingResult InstanceCullingResult;
// Cull cards; both the GPU and the non-GPU culling paths are supported.
#if GPUCULL_TODO
if (Scene->GPUScene.IsEnabled())
{
int32 MaxInstances = 0;
int32 VisibleMeshDrawCommandsNum = 0;
int32 NewPassVisibleMeshDrawCommandsNum = 0;
FInstanceCullingContext InstanceCullingContext(nullptr, TArrayView<const int32>(&View.GPUSceneViewId, 1));
SetupGPUInstancedDraws(InstanceCullingContext, LumenCardRenderer.MeshDrawCommands, false, MaxInstances, VisibleMeshDrawCommandsNum, NewPassVisibleMeshDrawCommandsNum);
// Not supposed to do any compaction here.
ensure(VisibleMeshDrawCommandsNum == LumenCardRenderer.MeshDrawCommands.Num());
InstanceCullingContext.BuildRenderingCommands(GraphBuilder, Scene->GPUScene, View.DynamicPrimitiveCollector.GetPrimitiveIdRange(), InstanceCullingResult);
}
else
#endif // GPUCULL_TODO
{
// Prepare primitive Id VB for rendering mesh draw commands.
if (LumenCardRenderer.MeshDrawPrimitiveIds.Num() > 0)
{
const uint32 PrimitiveIdBufferDataSize = LumenCardRenderer.MeshDrawPrimitiveIds.Num() * sizeof(int32);
FPrimitiveIdVertexBufferPoolEntry Entry = GPrimitiveIdVertexBufferPool.Allocate(PrimitiveIdBufferDataSize);
PrimitiveIdVertexBuffer = Entry.BufferRHI;
void* RESTRICT Data = RHILockBuffer(PrimitiveIdVertexBuffer, 0, PrimitiveIdBufferDataSize, RLM_WriteOnly);
FMemory::Memcpy(Data, LumenCardRenderer.MeshDrawPrimitiveIds.GetData(), PrimitiveIdBufferDataSize);
RHIUnlockBuffer(PrimitiveIdVertexBuffer);
GPrimitiveIdVertexBufferPool.ReturnToFreeList(Entry);
}
}
FRDGTextureRef AlbedoAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas);
FRDGTextureRef NormalAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.NormalAtlas);
FRDGTextureRef EmissiveAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas);
uint32 NumRects = 0;
FRDGBufferRef RectMinMaxBuffer = nullptr;
{
// Upload card IDs for batched draws that operate on the cards to render.
TArray<FUintVector4, SceneRenderingAllocator> RectMinMaxToRender;
RectMinMaxToRender.Reserve(CardsToRender.Num());
for (const FCardRenderData& CardRenderData : CardsToRender)
{
FIntRect AtlasRect = CardRenderData.AtlasAllocation;
FUintVector4 Rect;
Rect.X = FMath::Max(AtlasRect.Min.X, 0);
Rect.Y = FMath::Max(AtlasRect.Min.Y, 0);
Rect.Z = FMath::Max(AtlasRect.Max.X, 0);
Rect.W = FMath::Max(AtlasRect.Max.Y, 0);
RectMinMaxToRender.Add(Rect);
}
NumRects = CardsToRender.Num();
RectMinMaxBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateUploadDesc(sizeof(FUintVector4), FMath::RoundUpToPowerOfTwo(NumRects)), TEXT("Lumen.RectMinMaxBuffer"));
FPixelShaderUtils::UploadRectMinMaxBuffer(GraphBuilder, RectMinMaxToRender, RectMinMaxBuffer);
FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));
ClearLumenCards(GraphBuilder, View, AlbedoAtlasTexture, NormalAtlasTexture, EmissiveAtlasTexture, DepthStencilAtlasTexture, LumenSceneData.MaxAtlasSize, RectMinMaxBufferSRV, NumRects);
}
// Cache the view info in a snapshot.
FViewInfo* SharedView = View.CreateSnapshot();
{
SharedView->DynamicPrimitiveCollector = FGPUScenePrimitiveCollector(&GetGPUSceneDynamicContext());
SharedView->StereoPass = eSSP_FULL;
SharedView->DrawDynamicFlags = EDrawDynamicFlags::ForceLowestLOD;
// Don't do material texture mip biasing in proxy card rendering
SharedView->MaterialTextureMipBias = 0;
TRefCountPtr<IPooledRenderTarget> NullRef;
FPlatformMemory::Memcpy(&SharedView->PrevViewInfo.HZB, &NullRef, sizeof(SharedView->PrevViewInfo.HZB));
SharedView->CachedViewUniformShaderParameters = MakeUnique<FViewUniformShaderParameters>();
SharedView->CachedViewUniformShaderParameters->PrimitiveSceneData = Scene->GPUScene.PrimitiveBuffer.SRV;
SharedView->CachedViewUniformShaderParameters->InstanceSceneData = Scene->GPUScene.InstanceDataBuffer.SRV;
SharedView->CachedViewUniformShaderParameters->LightmapSceneData = Scene->GPUScene.LightmapDataBuffer.SRV;
SharedView->ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
}
// Set up the scene texture uniform parameters.
FLumenCardPassUniformParameters* PassUniformParameters = GraphBuilder.AllocParameters<FLumenCardPassUniformParameters>();
SetupSceneTextureUniformParameters(GraphBuilder, Scene->GetFeatureLevel(), /*SceneTextureSetupMode*/ ESceneTextureSetupMode::None, PassUniformParameters->SceneTextures);
// Capture mesh cards.
{
FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);
InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);
// Mesh card capture pass.
GraphBuilder.AddPass(
RDG_EVENT_NAME("MeshCardCapture"),
PassParameters,
ERDGPassFlags::Raster,
[this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
{
QUICK_SCOPE_CYCLE_COUNTER(MeshPass);
// Prepare the data for every card to render and submit its draw commands.
for (FCardRenderData& CardRenderData : CardsToRender)
{
if (CardRenderData.NumMeshDrawCommands > 0)
{
FIntRect AtlasRect = CardRenderData.AtlasAllocation;
RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);
CardRenderData.PatchView(RHICmdList, Scene, SharedView);
Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);
FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
#if GPUCULL_TODO
if (Scene->GPUScene.IsEnabled())
{
FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
{
DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
}
// With GPU culling, use the GPU-instanced submission path.
SubmitGPUInstancedMeshDrawCommandsRange(
LumenCardRenderer.MeshDrawCommands,
GraphicsMinimalPipelineStateSet,
CardRenderData.StartMeshDrawCommandIndex,
CardRenderData.NumMeshDrawCommands,
1,
InstanceIdOffsetBuffer,
DrawIndirectArgsBuffer,
RHICmdList);
}
else
#endif // GPUCULL_TODO
{
// Without GPU culling, use the regular submission path.
SubmitMeshDrawCommandsRange(
LumenCardRenderer.MeshDrawCommands,
GraphicsMinimalPipelineStateSet,
PrimitiveIdVertexBuffer,
0,
false,
CardRenderData.StartMeshDrawCommandIndex,
CardRenderData.NumMeshDrawCommands,
1,
RHICmdList);
}
}
}
}
);
}
// Record the IDs of the cards to render and check whether any Nanite meshes need rendering.
bool bAnyNaniteMeshes = false;
for (FCardRenderData& CardRenderData : CardsToRender)
{
bAnyNaniteMeshes = bAnyNaniteMeshes || CardRenderData.NaniteInstanceIds.Num() > 0 || CardRenderData.bDistantScene;
LumenCardRenderer.CardIdsToRender.Add(CardRenderData.CardIndex);
}
// Render the Nanite meshes of the Lumen scene.
if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
{
TRACE_CPUPROFILER_EVENT_SCOPE(NaniteMeshPass);
QUICK_SCOPE_CYCLE_COUNTER(NaniteMeshPass);
const FIntPoint DepthStencilAtlasSize = DepthStencilAtlasDesc.Extent;
const FIntRect DepthAtlasRect = FIntRect(0, 0, DepthStencilAtlasSize.X, DepthStencilAtlasSize.Y);
FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));
// Raster context.
Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(
GraphBuilder,
FeatureLevel,
DepthStencilAtlasSize,
Nanite::EOutputBufferMode::VisBuffer,
true,
RectMinMaxBufferSRV,
NumRects);
const bool bUpdateStreaming = false;
const bool bSupportsMultiplePasses = true;
const bool bForceHWRaster = RasterContext.RasterScheduling == Nanite::ERasterScheduling::HardwareOnly;
// Not the primary context (this distinguishes it from Nanite's main pass).
const bool bPrimaryContext = false;
// Culling context.
Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(
GraphBuilder,
*Scene,
nullptr,
FIntRect(),
false,
bUpdateStreaming,
bSupportsMultiplePasses,
bForceHWRaster,
bPrimaryContext);
// Multi-view rendering.
if (GLumenSceneNaniteMultiViewCapture)
{
const uint32 NumCardsToRender = CardsToRender.Num();
// The outer while loop splits the cards into batches so that no batch exceeds MAX_VIEWS_PER_CULL_RASTERIZE_PASS.
uint32 NextCardIndex = 0;
while(NextCardIndex < NumCardsToRender)
{
TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;
// Build one FPackedViewParams per card to render and add the packed view to NaniteViews until the view limit is reached.
while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
{
const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];
if(CardRenderData.NaniteInstanceIds.Num() > 0)
{
for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
{
NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
}
Nanite::FPackedViewParams Params;
Params.ViewMatrices = CardRenderData.ViewMatrices;
Params.PrevViewMatrices = CardRenderData.ViewMatrices;
Params.ViewRect = CardRenderData.AtlasAllocation;
Params.RasterContextSize = DepthStencilAtlasSize;
Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
NaniteViews.Add(Nanite::CreatePackedView(Params));
}
NextCardIndex++;
}
// Rasterize the cards.
if (NaniteInstanceDraws.Num() > 0)
{
RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");
Nanite::FRasterState RasterState;
Nanite::CullRasterize(
GraphBuilder,
*Scene,
NaniteViews,
CullingContext,
RasterContext,
RasterState,
&NaniteInstanceDraws
);
}
}
}
else // Single-view rendering
{
RDG_EVENT_SCOPE(GraphBuilder, "RenderLumenCardsWithNanite");
// The single-view path is brute force: it walks the cards to render linearly, building one view and issuing one CullRasterize call per card.
for(FCardRenderData& CardRenderData : CardsToRender)
{
if(CardRenderData.NaniteInstanceIds.Num() > 0)
{
TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;
for( uint32 InstanceID : CardRenderData.NaniteInstanceIds )
{
NaniteInstanceDraws.Add( Nanite::FInstanceDraw { InstanceID, 0u } );
}
CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(*SharedView, DepthStencilAtlasSize, 0);
Nanite::CullRasterize(
GraphBuilder,
*Scene,
{ PackedView },
CullingContext,
RasterContext,
Nanite::FRasterState(),
&NaniteInstanceDraws
);
}
}
}
extern float GLumenDistantSceneMinInstanceBoundsRadius;
// Render the entire scene for distant cards.
for (FCardRenderData& CardRenderData : CardsToRender)
{
// bDistantScene marks whether this is a distant-scene card.
if (CardRenderData.bDistantScene)
{
Nanite::FRasterState RasterState;
RasterState.bNearClip = false;
CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(
*SharedView,
DepthStencilAtlasSize,
/*Flags*/ 0,
/*StreamingPriorityCategory*/ 0,
GLumenDistantSceneMinInstanceBoundsRadius,
Lumen::GetDistanceSceneNaniteLODScaleFactor());
Nanite::CullRasterize(
GraphBuilder,
*Scene,
{ PackedView },
CullingContext,
RasterContext,
RasterState);
}
}
// Lumen mesh capture pass.
Nanite::DrawLumenMeshCapturePass(
GraphBuilder,
*Scene,
SharedView,
CardsToRender,
CullingContext,
RasterContext,
PassUniformParameters,
RectMinMaxBufferSRV,
NumRects,
LumenSceneData.MaxAtlasSize,
AlbedoAtlasTexture,
NormalAtlasTexture,
EmissiveAtlasTexture,
DepthStencilAtlasTexture
);
}
ConvertToExternalTexture(GraphBuilder, AlbedoAtlasTexture, LumenSceneData.AlbedoAtlas);
ConvertToExternalTexture(GraphBuilder, NormalAtlasTexture, LumenSceneData.NormalAtlas);
ConvertToExternalTexture(GraphBuilder, EmissiveAtlasTexture, LumenSceneData.EmissiveAtlas);
}
// Upload card data.
{
QUICK_SCOPE_CYCLE_COUNTER(UploadCardIndexBuffers);
// Upload the index buffer.
{
FRDGBufferRef CardIndexBuffer = GraphBuilder.CreateBuffer(
FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenCardRenderer.CardIdsToRender.Num(), 1)),
TEXT("Lumen.CardsToRenderIndexBuffer"));
FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
PassParameters->CardIds = CardIndexBuffer;
const uint32 CardIdBytes = LumenCardRenderer.CardIdsToRender.GetTypeSize() * LumenCardRenderer.CardIdsToRender.Num();
const void* CardIdPtr = LumenCardRenderer.CardIdsToRender.GetData();
GraphBuilder.AddPass(
RDG_EVENT_NAME("Upload CardsToRenderIndexBuffer NumIndices=%d", LumenCardRenderer.CardIdsToRender.Num()),
PassParameters,
ERDGPassFlags::Copy,
[PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
{
if (CardIdBytes > 0)
{
void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
}
});
ConvertToExternalBuffer(GraphBuilder, CardIndexBuffer, LumenCardRenderer.CardsToRenderIndexBuffer);
}
// Upload the hash map buffer.
{
const uint32 NumHashMapUInt32 = FLumenCardRenderer::NumCardsToRenderHashMapBucketUInt32;
const uint32 NumHashMapBytes = 4 * NumHashMapUInt32;
const uint32 NumHashMapBuckets = 32 * NumHashMapUInt32;
FRDGBufferRef CardHashMapBuffer = GraphBuilder.CreateBuffer(
FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), NumHashMapUInt32),
TEXT("Lumen.CardsToRenderHashMapBuffer"));
LumenCardRenderer.CardsToRenderHashMap.Init(0, NumHashMapBuckets);
for (int32 CardIndex : LumenCardRenderer.CardIdsToRender)
{
LumenCardRenderer.CardsToRenderHashMap[CardIndex % NumHashMapBuckets] = 1;
}
FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
PassParameters->CardIds = CardHashMapBuffer;
const void* HashMapDataPtr = LumenCardRenderer.CardsToRenderHashMap.GetData();
GraphBuilder.AddPass(
RDG_EVENT_NAME("Upload CardsToRenderHashMapBuffer NumUInt32=%d", NumHashMapUInt32),
PassParameters,
ERDGPassFlags::Copy,
[PassParameters, NumHashMapBytes, HashMapDataPtr](FRHICommandListImmediate& RHICmdList)
{
if (NumHashMapBytes > 0)
{
void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, NumHashMapBytes, RLM_WriteOnly);
FPlatformMemory::Memcpy(DestCardIdPtr, HashMapDataPtr, NumHashMapBytes);
RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
}
});
ConvertToExternalBuffer(GraphBuilder, CardHashMapBuffer, LumenCardRenderer.CardsToRenderHashMapBuffer);
}
// Upload the visible card index buffer.
{
FRDGBufferRef VisibleCardsIndexBuffer = GraphBuilder.CreateBuffer(
FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenSceneData.VisibleCardsIndices.Num(), 1)),
TEXT("Lumen.VisibleCardsIndexBuffer"));
FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
PassParameters->CardIds = VisibleCardsIndexBuffer;
const uint32 CardIdBytes = sizeof(uint32) * LumenSceneData.VisibleCardsIndices.Num();
const void* CardIdPtr = LumenSceneData.VisibleCardsIndices.GetData();
GraphBuilder.AddPass(
RDG_EVENT_NAME("Upload VisibleCardIndices NumIndices=%d", LumenSceneData.VisibleCardsIndices.Num()),
PassParameters,
ERDGPassFlags::Copy,
[PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
{
if (CardIdBytes > 0)
{
void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
}
});
ConvertToExternalBuffer(GraphBuilder, VisibleCardsIndexBuffer, LumenSceneData.VisibleCardsIndexBuffer);
}
}
// Prefilter the Lumen scene depth.
if (LumenCardRenderer.CardIdsToRender.Num() > 0)
{
TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer;
{
FLumenCardScene* LumenCardSceneParameters = GraphBuilder.AllocParameters<FLumenCardScene>();
SetupLumenCardSceneParameters(GraphBuilder, Scene, *LumenCardSceneParameters);
LumenCardSceneUniformBuffer = GraphBuilder.CreateUniformBuffer(LumenCardSceneParameters);
}
PrefilterLumenSceneDepth(GraphBuilder, LumenCardSceneUniformBuffer, DepthStencilAtlasTexture, LumenCardRenderer.CardIdsToRender, View);
}
}
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
LumenSceneData.CardIndicesToUpdateInBuffer.Reset();
LumenSceneData.MeshCardsIndicesToUpdateInBuffer.Reset();
LumenSceneData.DFObjectIndicesToUpdateInBuffer.Reset();
}
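Updating the Lumen scene involves these main steps: culling cards, uploading card IDs, caching the view and scene textures, capturing mesh cards, rasterizing the Lumen scene with cards as views, rendering distant cards, drawing the mesh capture pass, and uploading card data and visibility data.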
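The multi-view Nanite path in the listing above packs cards into batches so that no single CullRasterize call receives more than MAX_VIEWS_PER_CULL_RASTERIZE_PASS views. Below is a minimal sketch of that nested-loop batching, with the engine types replaced by plain indices. `BatchCardViews` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not UE5 code.
// CardHasNaniteInstances[i] is true when card i has Nanite instances to draw.
// Returns, for each batch, the indices of the cards that became views in that batch.
std::vector<std::vector<std::size_t>> BatchCardViews(
    const std::vector<bool>& CardHasNaniteInstances,
    std::size_t MaxViewsPerPass)
{
    std::vector<std::vector<std::size_t>> Batches;
    if (MaxViewsPerPass == 0)
    {
        return Batches; // a zero limit would never make progress
    }
    const std::size_t NumCards = CardHasNaniteInstances.size();
    std::size_t NextCardIndex = 0;
    // Outer loop: one iteration per batch.
    while (NextCardIndex < NumCards)
    {
        std::vector<std::size_t> Views;
        // Inner loop: consume cards until the batch is full or the cards run out.
        while (NextCardIndex < NumCards && Views.size() < MaxViewsPerPass)
        {
            if (CardHasNaniteInstances[NextCardIndex])
            {
                Views.push_back(NextCardIndex);
            }
            ++NextCardIndex;
        }
        // Like the listing, skip the pass when the batch collected nothing.
        if (!Views.empty())
        {
            Batches.push_back(Views);
        }
    }
    return Batches;
}
```

Cards without Nanite instances still advance the index but do not consume a view slot, so the number of batches depends on how many cards actually contain Nanite geometry.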
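Because there are so many steps, not all of them can be covered in detail; this section focuses on the stages involved in capturing and rasterizing mesh cards.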
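One detail from the upload step is worth a quick illustration before moving on. The "hash map" uploaded as Lumen.CardsToRenderHashMapBuffer is filled by setting one bit per card at position CardIndex % NumHashMapBuckets, with 32 buckets per uint32. The sketch below shows that packing together with a matching lookup. The lookup side is an assumption, since the consuming shader is not part of this excerpt, and `FCardBitFilter` is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch, not UE5 code: one bit per bucket, packed into uint32 words.
class FCardBitFilter
{
public:
    // At least one word is always allocated so that NumBuckets is never zero.
    explicit FCardBitFilter(std::uint32_t NumWords)
        : Words(NumWords == 0u ? 1u : NumWords, 0u)
        , NumBuckets(32u * static_cast<std::uint32_t>(Words.size()))
    {
    }

    // Mirrors the listing: set the bit at CardIndex % NumBuckets.
    void Add(std::uint32_t CardIndex)
    {
        const std::uint32_t Bucket = CardIndex % NumBuckets;
        Words[Bucket / 32u] |= (1u << (Bucket % 32u));
    }

    // Assumed lookup: can report true for a card that was never added
    // (two indices can share a bucket), but never false for one that was.
    bool MayContain(std::uint32_t CardIndex) const
    {
        const std::uint32_t Bucket = CardIndex % NumBuckets;
        return ((Words[Bucket / 32u] >> (Bucket % 32u)) & 1u) != 0u;
    }

private:
    std::vector<std::uint32_t> Words;
    std::uint32_t NumBuckets;
};
```

Because different card indices can map to the same bucket, a structure like this works as a cheap conservative filter rather than as an exact set.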
6.5.5.2 CardsToRender
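To explain the mesh card capture and rasterization stages, we first need to understand how LumenCardRenderer.CardsToRender is populated. The listing below traces which cards in the Lumen scene need to be captured and rendered; this is handled by BeginUpdateLumenSceneTasks in the InitView stage: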
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp
void FDeferredShadingSceneRenderer::BeginUpdateLumenSceneTasks(FRDGBuilder& GraphBuilder)
{
LLM_SCOPE_BYTAG(Lumen);
const FViewInfo& MainView = Views[0];
const bool bAnyLumenActive = ShouldRenderLumenDiffuseGI(Scene, MainView, true)
|| ShouldRenderLumenReflections(MainView, true);
if (bAnyLumenActive
&& !ViewFamily.EngineShowFlags.HitProxies)
{
SCOPED_NAMED_EVENT(FDeferredShadingSceneRenderer_BeginUpdateLumenSceneTasks, FColor::Emerald);
QUICK_SCOPE_CYCLE_COUNTER(BeginUpdateLumenSceneTasks);
const double StartTime = FPlatformTime::Seconds();
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
// Fetch the list of cards to render and reset it.
TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;
LumenCardRenderer.Reset();
const int32 LocalLumenSceneGeneration = GLumenSceneGeneration;
const bool bRecaptureLumenSceneOnce = LumenSceneData.Generation != LocalLumenSceneGeneration;
LumenSceneData.Generation = LocalLumenSceneGeneration;
const bool bReallocateAtlas = LumenSceneData.MaxAtlasSize != GetDesiredAtlasSize()
|| (LumenSceneData.RadiosityAtlas && LumenSceneData.RadiosityAtlas->GetDesc().Extent != GetRadiosityAtlasSize(LumenSceneData.MaxAtlasSize))
|| GLumenSceneReset;
if (GLumenSceneReset != 2)
{
GLumenSceneReset = 0;
}
LumenSceneData.NumMeshCardsToAddToSurfaceCache = 0;
// Update dirty cards.
UpdateDirtyCards(Scene, bReallocateAtlas, bRecaptureLumenSceneOnce);
// Update the Lumen scene primitives.
UpdateLumenScenePrimitives(Scene);
// Update the distant scene.
UpdateDistantScene(Scene, Views[0]);
const FVector LumenSceneCameraOrigin = GetLumenSceneViewOrigin(MainView, GetNumLumenVoxelClipmaps() - 1);
const float MaxCardUpdateDistanceFromCamera = ComputeMaxCardUpdateDistanceFromCamera();
// Reallocate the card atlas.
if (bReallocateAtlas)
{
LumenSceneData.MaxAtlasSize = GetDesiredAtlasSize();
// Everything should have been freed before the atlas is recreated.
ensure(LumenSceneData.NumCardTexels == 0);
LumenSceneData.AtlasAllocator = FBinnedTextureLayout(LumenSceneData.MaxAtlasSize, GLumenSceneCardAtlasAllocatorBinSize);
}
// Per-frame budgets for card captures and captured texels. Both are uncapped (INT_MAX) when GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame) is non-zero.
const int32 CardCapturesPerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetMaxLumenSceneCardCapturesPerFrame();
const int32 CardTexelsToCapturePerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetLumenSceneCardResToCapturePerFrame() * GetLumenSceneCardResToCapturePerFrame();
if (CardCapturesPerFrame > 0 && CardTexelsToCapturePerFrame > 0)
{
QUICK_SCOPE_CYCLE_COUNTER(FillCardsToRender);
TArray<FLumenSurfaceCacheUpdatePacket, SceneRenderingAllocator> Packets;
TArray<FMeshCardsAdd, SceneRenderingAllocator> MeshCardsAddsSortedByPriority;
// Prepare the surface cache update.
{
TRACE_CPUPROFILER_EVENT_SCOPE(PrepareSurfaceCacheUpdate);
const int32 NumPrimitivesPerPacket = FMath::Max(GLumenScenePrimitivesPerPacket, 1);
const int32 NumPackets = FMath::DivideAndRoundUp(LumenSceneData.LumenPrimitives.Num(), NumPrimitivesPerPacket);
CardsToRender.Reset(GetMaxLumenSceneCardCapturesPerFrame());
Packets.Reserve(NumPackets);
for (int32 PacketIndex = 0; PacketIndex < NumPackets; ++PacketIndex)
{
Packets.Emplace(
LumenSceneData.LumenPrimitives,
LumenSceneData.MeshCards,
LumenSceneData.Cards,
LumenSceneCameraOrigin,
MaxCardUpdateDistanceFromCamera,
PacketIndex * NumPrimitivesPerPacket,
NumPrimitivesPerPacket);
}
}
// Run the surface cache update preparation tasks.
{
TRACE_CPUPROFILER_EVENT_SCOPE(RunPrepareSurfaceCacheUpdate);
const bool bExecuteInParallel = FApp::ShouldUseThreadingForPerformance();
ParallelFor(Packets.Num(),
[&Packets](int32 Index)
{
Packets[Index].AnyThreadTask();
},
!bExecuteInParallel
);
}
// Gather the results of the tasks above.
{
TRACE_CPUPROFILER_EVENT_SCOPE(PacketResults);
const float CARD_DISTANCE_BUCKET_SIZE = 100.0f;
uint32 NumMeshCardsAddsPerBucket[MAX_ADD_PRIMITIVE_PRIORITY + 1];
for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
{
NumMeshCardsAddsPerBucket[BucketIndex] = 0;
}
// Count how many cards fall into each bucket
for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
{
const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];
LumenSceneData.NumMeshCardsToAddToSurfaceCache += Packet.MeshCardsAdds.Num();
for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num(); ++CardIndex)
{
const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];
++NumMeshCardsAddsPerBucket[MeshCardsAdd.Priority];
}
}
int32 NumMeshCardsInBucketsUpToMaxBucket = 0;
int32 MaxBucketIndexToAdd = 0;
// Select the first N buckets for allocation.
for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
{
NumMeshCardsInBucketsUpToMaxBucket += NumMeshCardsAddsPerBucket[BucketIndex];
MaxBucketIndexToAdd = BucketIndex;
if (NumMeshCardsInBucketsUpToMaxBucket > CardCapturesPerFrame)
{
break;
}
}
MeshCardsAddsSortedByPriority.Reserve(GetMaxLumenSceneCardCapturesPerFrame());
// Copy the first N buckets into MeshCardsAddsSortedByPriority.
for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
{
const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];
for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num() && MeshCardsAddsSortedByPriority.Num() < CardCapturesPerFrame; ++CardIndex)
{
const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];
if (MeshCardsAdd.Priority <= MaxBucketIndexToAdd)
{
MeshCardsAddsSortedByPriority.Add(MeshCardsAdd);
}
}
}
// Remove all mesh cards that are no longer visible.
for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
{
const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];
for (int32 MeshCardsToRemoveIndex = 0; MeshCardsToRemoveIndex < Packet.MeshCardsRemoves.Num(); ++MeshCardsToRemoveIndex)
{
const FMeshCardsRemove& MeshCardsRemove = Packet.MeshCardsRemoves[MeshCardsToRemoveIndex];
FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsRemove.LumenPrimitiveIndex];
FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsRemove.LumenInstanceIndex];
LumenSceneData.RemoveMeshCards(LumenPrimitive, LumenPrimitiveInstance);
}
}
}
// Allocate the distant scene.
extern int32 GLumenUpdateDistantSceneCaptures;
if (GLumenUpdateDistantSceneCaptures)
{
for (int32 DistantCardIndex : LumenSceneData.DistantCardIndices)
{
FLumenCard& DistantCard = LumenSceneData.Cards[DistantCardIndex];
extern int32 GLumenDistantSceneCardResolution;
DistantCard.DesiredResolution = FIntPoint(GLumenDistantSceneCardResolution, GLumenDistantSceneCardResolution);
if (!DistantCard.bVisible)
{
LumenSceneData.AddCardToVisibleCardList(DistantCardIndex);
DistantCard.bVisible = true;
}
DistantCard.RemoveFromAtlas(LumenSceneData);
LumenSceneData.CardIndicesToUpdateInBuffer.Add(DistantCardIndex);
// Add to the CardsToRender list.
CardsToRender.Add(FCardRenderData(
DistantCard,
nullptr,
-1,
FeatureLevel,
DistantCardIndex));
}
}
// Allocate new cards.
for (int32 SortedCardIndex = 0; SortedCardIndex < MeshCardsAddsSortedByPriority.Num(); ++SortedCardIndex)
{
const FMeshCardsAdd& MeshCardsAdd = MeshCardsAddsSortedByPriority[SortedCardIndex];
FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsAdd.LumenPrimitiveIndex];
FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsAdd.LumenInstanceIndex];
LumenSceneData.AddMeshCards(MeshCardsAdd.LumenPrimitiveIndex, MeshCardsAdd.LumenInstanceIndex);
if (LumenPrimitiveInstance.MeshCardsIndex >= 0)
{
// Fetch the mesh cards of the primitive instance.
const FLumenMeshCards& MeshCards = LumenSceneData.MeshCards[LumenPrimitiveInstance.MeshCardsIndex];
// Walk every card of the mesh cards and add the valid ones to the CardsToRender list.
for (uint32 CardIndex = MeshCards.FirstCardIndex; CardIndex < MeshCards.FirstCardIndex + MeshCards.NumCards; ++CardIndex)
{
FLumenCard& LumenCard = LumenSceneData.Cards[CardIndex];
// Compute the card allocation.
FCardAllocationOutput CardAllocation;
ComputeCardAllocation(LumenCard, LumenSceneCameraOrigin, MaxCardUpdateDistanceFromCamera, CardAllocation);
LumenCard.DesiredResolution = CardAllocation.TextureAllocationSize;
if (LumenCard.bVisible != CardAllocation.bVisible)
{
LumenCard.bVisible = CardAllocation.bVisible;
if (LumenCard.bVisible)
{
LumenSceneData.AddCardToVisibleCardList(CardIndex);
}
else
{
LumenCard.RemoveFromAtlas(LumenSceneData);
LumenSceneData.RemoveCardFromVisibleCardList(CardIndex);
}
LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);
}
// Add the card to CardsToRender only if it is visible and its atlas allocation differs from the desired resolution.
if (LumenCard.bVisible && LumenCard.AtlasAllocation.Width() != LumenCard.DesiredResolution.X && LumenCard.AtlasAllocation.Height() != LumenCard.DesiredResolution.Y)
{
LumenCard.RemoveFromAtlas(LumenSceneData);
LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);
// Add to the CardsToRender list.
CardsToRender.Add(FCardRenderData(
LumenCard,
LumenPrimitive.Primitive,
LumenPrimitive.bMergedInstances ? -1 : MeshCardsAdd.LumenInstanceIndex,
FeatureLevel,
CardIndex));
LumenCardRenderer.NumCardTexelsToCapture += LumenCard.AtlasAllocation.Area();
}
} // for
// Stop once the card count or the texel count reaches the per-frame budget.
if (CardsToRender.Num() >= CardCapturesPerFrame
|| LumenCardRenderer.NumCardTexelsToCapture >= CardTexelsToCapturePerFrame)
{
break;
}
}
}
}
// Allocate and update the card atlases.
AllocateOptionalCardAtlases(GraphBuilder, LumenSceneData, MainView, bReallocateAtlas);
UpdateLumenCardAtlasAllocation(GraphBuilder, MainView, bReallocateAtlas, bRecaptureLumenSceneOnce);
// Process the cards to render.
if (CardsToRender.Num() > 0)
{
// Set up the mesh pass.
{
QUICK_SCOPE_CYCLE_COUNTER(MeshPassSetup);
// Before rendering, make sure all mesh rendering data is ready.
{
QUICK_SCOPE_CYCLE_COUNTER(PrepareStaticMeshData);
// Set of unique primitives requiring static mesh update
TSet<FPrimitiveSceneInfo*> PrimitivesToUpdateStaticMeshes;
for (FCardRenderData& CardRenderData : CardsToRender)
{
FPrimitiveSceneInfo* PrimitiveSceneInfo = CardRenderData.PrimitiveSceneInfo;
if (PrimitiveSceneInfo && PrimitiveSceneInfo->Proxy->AffectsDynamicIndirectLighting())
{
if (PrimitiveSceneInfo->NeedsUniformBufferUpdate())
{
PrimitiveSceneInfo->UpdateUniformBuffer(GraphBuilder.RHICmdList);
}
if (PrimitiveSceneInfo->NeedsUpdateStaticMeshes())
{
PrimitivesToUpdateStaticMeshes.Add(PrimitiveSceneInfo);
}
}
}
if (PrimitivesToUpdateStaticMeshes.Num() > 0)
{
TArray<FPrimitiveSceneInfo*> UpdatedSceneInfos;
UpdatedSceneInfos.Reserve(PrimitivesToUpdateStaticMeshes.Num());
for (FPrimitiveSceneInfo* PrimitiveSceneInfo : PrimitivesToUpdateStaticMeshes)
{
UpdatedSceneInfos.Add(PrimitiveSceneInfo);
}
FPrimitiveSceneInfo::UpdateStaticMeshes(GraphBuilder.RHICmdList, Scene, UpdatedSceneInfos, true);
}
}
// Add the card capture draws.
for (FCardRenderData& CardRenderData : CardsToRender)
{
CardRenderData.StartMeshDrawCommandIndex = LumenCardRenderer.MeshDrawCommands.Num();
CardRenderData.NumMeshDrawCommands = 0;
int32 NumNanitePrimitives = 0;
const FLumenCard& Card = LumenSceneData.Cards[CardRenderData.CardIndex];
checkSlow(Card.bVisible && Card.bAllocated);
// Create or process the FVisibleMeshDrawCommand entries for the card.
AddCardCaptureDraws(Scene,
GraphBuilder.RHICmdList,
CardRenderData,
LumenCardRenderer.MeshDrawCommands,
LumenCardRenderer.MeshDrawPrimitiveIds);
CardRenderData.NumMeshDrawCommands = LumenCardRenderer.MeshDrawCommands.Num() - CardRenderData.StartMeshDrawCommandIndex;
}
}
(.....)
}
}
}
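As the code above shows, mesh cards are not updated every frame. Apart from the forced path controlled by GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame), a card is added to the to-render list only when it is visible and its resolution has changed. There is also a per-frame cap, which prevents a single frame from updating and drawing so many cards that it becomes a performance bottleneck.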
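The per-frame cap described above can be sketched in isolation. This is a minimal illustration rather than engine code: PendingCard and SelectCardsForFrame are hypothetical names, and the budget is checked after every card instead of after every mesh-cards entry as the source loop does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a card that is waiting to be captured.
struct PendingCard {
    int index;   // card index
    int texels;  // atlas area the card would occupy
};

// Accept cards in order until either budget is reached.
// The check runs after a card is accepted, mirroring the source loop,
// so the texel total may end slightly above maxTexels.
std::vector<int> SelectCardsForFrame(const std::vector<PendingCard>& pending,
                                     std::size_t maxCards, long maxTexels) {
    std::vector<int> selected;
    long texels = 0;
    for (const PendingCard& card : pending) {
        selected.push_back(card.index);
        texels += card.texels;
        if (selected.size() >= maxCards || texels >= maxTexels) {
            break;  // remaining cards wait for a later frame
        }
    }
    return selected;
}
```

Cards that are not selected simply stay pending, which is how the cost of filling the surface cache is spread over several frames.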
6.5.5.3 MeshCardCapture
Having seen how mesh cards are added to the to-render list, we can now look at how the cards are actually captured:
// Capture the mesh cards.
{
FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
// Card view information.
PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
// There are three atlas render targets: base color (albedo), normal, emissive.
PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
// Depth-stencil target.
PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);
InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);
// Mesh card capture pass.
GraphBuilder.AddPass(
RDG_EVENT_NAME("MeshCardCapture"),
PassParameters,
ERDGPassFlags::Raster,
[this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
{
QUICK_SCOPE_CYCLE_COUNTER(MeshPass);
// Prepare the data of every card to render and submit its draw commands.
for (FCardRenderData& CardRenderData : CardsToRender)
{
if (CardRenderData.NumMeshDrawCommands > 0)
{
FIntRect AtlasRect = CardRenderData.AtlasAllocation;
// Set the viewport.
RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);
// Patch the view data.
CardRenderData.PatchView(RHICmdList, Scene, SharedView);
Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);
FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
#if GPUCULL_TODO
if (Scene->GPUScene.IsEnabled())
{
FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
{
DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
}
// With GPU culling, submit through the GPU-instanced interface.
SubmitGPUInstancedMeshDrawCommandsRange(
LumenCardRenderer.MeshDrawCommands,
GraphicsMinimalPipelineStateSet,
CardRenderData.StartMeshDrawCommandIndex,
CardRenderData.NumMeshDrawCommands,
1,
InstanceIdOffsetBuffer,
DrawIndirectArgsBuffer,
RHICmdList);
}
#endif // GPUCULL_TODO
(......)
}
}
}
);
}
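In the card drawing stage, each mesh card captures low-resolution projections of the mesh surface attributes from different directions, and the projected attributes are stored in texture atlases. Unlike the conventional rendering pipeline, only three attributes of the Nanite meshes inside the card view are rasterized here: base color, normal and emissive (see the figure below).
Mesh attribute atlases projected onto the mesh cards during the card capture stage. Top: base color atlas; bottom: normal atlas.
Below are the VS and PS used to capture mesh cards: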
// Engine\Shaders\Private\Lumen\LumenCardVertexShader.usf
struct FLumenCardInterpolantsVSToPS
{
};
struct FLumenCardVSToPS
{
FVertexFactoryInterpolantsVSToPS FactoryInterpolants;
FLumenCardInterpolantsVSToPS PassInterpolants;
float4 Position : SV_POSITION;
};
// Main entry point of the mesh card VS.
void Main(
FVertexFactoryInput Input,
OPTIONAL_VertexID
out FLumenCardVSToPS Output
)
{
uint EyeIndex = 0;
ResolvedView = ResolveView();
FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);
float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);
float4 WorldPosition = WorldPositionExcludingWPO;
float4 ClipSpacePosition;
float3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);
FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);
ISOLATE
{
// Material world position offset.
WorldPosition.xyz += GetMaterialWorldPositionOffset(VertexParameters);
// World position used for rasterization.
float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
// Transform the position to clip space.
ClipSpacePosition = INVARIANT(mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip));
Output.Position = INVARIANT(ClipSpacePosition);
}
bool bClampToNearPlane = false;// GetPrimitiveData(Input.PrimitiveId).ObjectWorldPositionAndRadius.w < .5f * max();
if (bClampToNearPlane && Output.Position.z < 0)
{
Output.Position.z = 0.01f;
Output.Position.w = 1.0f;
}
Output.FactoryInterpolants = VertexFactoryGetInterpolantsVSToPS(Input, VFIntermediates, VertexParameters);
}
// Engine\Shaders\Private\Lumen\LumenCardPixelShader.usf
struct FLumenCardInterpolantsVSToPS
{
};
// Main entry point of the mesh card PS.
void Main(
FVertexFactoryInterpolantsVSToPS Interpolants,
FLumenCardInterpolantsVSToPS PassInterpolants,
in INPUT_POSITION_QUALIFIERS float4 SvPosition : SV_Position // after all interpolators
OPTIONAL_IsFrontFace,
out float4 OutTarget0 : SV_Target0,
out float4 OutTarget1 : SV_Target1,
out float4 OutTarget2 : SV_Target2)
{
ResolvedView = ResolveView();
// Get the basic material parameters.
FMaterialPixelParameters MaterialParameters = GetMaterialPixelParameters(Interpolants, SvPosition);
FPixelMaterialInputs PixelMaterialInputs;
// Compute the extra material parameters.
{
float4 ScreenPosition = SvPositionToResolvedScreenPosition(SvPosition);
float3 TranslatedWorldPosition = SvPositionToResolvedTranslatedWorld(SvPosition);
CalcMaterialParametersEx(MaterialParameters, PixelMaterialInputs, SvPosition, ScreenPosition, bIsFrontFace, TranslatedWorldPosition, TranslatedWorldPosition);
}
// Get the material coverage and clipping data.
GetMaterialCoverageAndClipping(MaterialParameters, PixelMaterialInputs);
float3 BaseColor = GetMaterialBaseColor(PixelMaterialInputs);
float Metallic = GetMaterialMetallic(PixelMaterialInputs);
float Specular = GetMaterialSpecular(PixelMaterialInputs);
float Roughness = GetMaterialRoughness(PixelMaterialInputs);
float Opacity = GetMaterialOpacity(PixelMaterialInputs);
float3 DiffuseColor = BaseColor - BaseColor * Metallic;
float3 SpecularColor = lerp(0.08 * Specular.xxx, BaseColor, Metallic.xxx);
// Apply the fully rough environment BRDF approximation (folds the specular contribution into diffuse).
EnvBRDFApproxFullyRough(DiffuseColor, SpecularColor);
// Store base color, normal and emissive.
//@todo DynamicGI better encoding for low precision, hemispherical normal encoding
OutTarget0 = float4(sqrt(DiffuseColor), Opacity);
OutTarget1 = float4(MaterialParameters.WorldNormal * .5f + .5f, 0);
OutTarget2 = float4(GetMaterialEmissive(PixelMaterialInputs), 0);
}
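Here the VS input is a box in local space and the VS output is a box in clip space:
After the PS has run, the data is stored at the corresponding positions in the three render target atlases (base color, normal, emissive). It is worth pointing out that the VS and PS logic here is far simpler than that of the conventional BasePass, which is one of the important optimizations that lets Lumen render in real time.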
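The two encodings the PS applies before writing to the atlases (square root for the diffuse color, scale and bias for the normal) can be illustrated on the CPU. This is only a sketch: Float3 and the four helper names are hypothetical, and squaring as the albedo decode is an assumption (it is simply the inverse of the sqrt in the shader). The normal decode mirrors the `* 2 - 1` that the radiosity shader later applies to NormalAtlas.

```cpp
#include <cmath>

struct Float3 { float x, y, z; };

// sqrt spends more of a low-precision target's range on dark colors.
Float3 EncodeAlbedo(Float3 c) {
    return { std::sqrt(c.x), std::sqrt(c.y), std::sqrt(c.z) };
}

// Assumed inverse of EncodeAlbedo: square each channel.
Float3 DecodeAlbedo(Float3 e) {
    return { e.x * e.x, e.y * e.y, e.z * e.z };
}

// Map a unit normal from [-1, 1] to [0, 1] so it fits a UNORM target.
Float3 EncodeNormal(Float3 n) {
    return { n.x * 0.5f + 0.5f, n.y * 0.5f + 0.5f, n.z * 0.5f + 0.5f };
}

// Inverse mapping; a real reader would normalize the result afterwards.
Float3 DecodeNormal(Float3 e) {
    return { e.x * 2.0f - 1.0f, e.y * 2.0f - 1.0f, e.z * 2.0f - 1.0f };
}
```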
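In addition, choosing where a new card is rendered in the atlas is a bin packing problem; at render time it is enough to set the origin and the size on the viewport. The corresponding type is FBinnedTextureLayout, and related types include FTextureLayout and FTextureLayout3d. For example, in one captured frame the card viewport is at (0, 0) with a size of (64, 64), which means the card is rendered into the first 64x64 region of the atlas.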
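To make the bin packing idea concrete, here is a minimal shelf packer. It is not how FBinnedTextureLayout works internally; a shelf heuristic is swapped in only because it is one of the shortest correct packing schemes. Rect and ShelfPacker are hypothetical names.

```cpp
#include <algorithm>

struct Rect { int x, y, w, h; };

// Places rectangles left to right on horizontal shelves,
// opening a new shelf when the current one is full.
class ShelfPacker {
public:
    ShelfPacker(int atlasW, int atlasH) : atlasW_(atlasW), atlasH_(atlasH) {}

    // Returns true and fills `out` on success; leaves the state untouched on failure.
    bool Allocate(int w, int h, Rect& out) {
        int x = cursorX_, y = cursorY_, shelfH = shelfH_;
        if (x + w > atlasW_) {  // does not fit on this shelf: open a new one
            y += shelfH;
            x = 0;
            shelfH = 0;
        }
        if (w > atlasW_ || y + h > atlasH_) {
            return false;       // atlas is full for this size
        }
        cursorX_ = x + w;
        cursorY_ = y;
        shelfH_ = std::max(shelfH, h);
        out = Rect{x, y, w, h};
        return true;
    }

private:
    int atlasW_, atlasH_;
    int cursorX_ = 0, cursorY_ = 0, shelfH_ = 0;
};
```

With an empty atlas the first 64x64 request lands at (0, 0), matching the viewport in the example above; the returned rectangle is what would be passed to the viewport call.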
Incidentally, the draw commands of mesh cards are built in FLumenCardMeshProcessor:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp
void FLumenCardMeshProcessor::AddMeshBatch(const FMeshBatch& RESTRICT MeshBatch, uint64 BatchElementMask, const FPrimitiveSceneProxy* RESTRICT PrimitiveSceneProxy, int32 StaticMeshId)
{
LLM_SCOPE_BYTAG(Lumen);
if (MeshBatch.bUseForMaterial && DoesPlatformSupportLumenGI(GetFeatureLevelShaderPlatform(FeatureLevel)))
{
// Resolve the material.
const FMaterialRenderProxy* FallbackMaterialRenderProxyPtr = nullptr;
const FMaterial& Material = MeshBatch.MaterialRenderProxy->GetMaterialWithFallback(FeatureLevel, FallbackMaterialRenderProxyPtr);
const FMaterialRenderProxy& MaterialRenderProxy = FallbackMaterialRenderProxyPtr ? *FallbackMaterialRenderProxyPtr : *MeshBatch.MaterialRenderProxy;
// Resolve the render state.
const EBlendMode BlendMode = Material.GetBlendMode();
const FMaterialShadingModelField ShadingModels = Material.GetShadingModels();
const bool bIsTranslucent = IsTranslucentBlendMode(BlendMode);
const FMeshDrawingPolicyOverrideSettings OverrideSettings = ComputeMeshOverrideSettings(MeshBatch);
const ERasterizerFillMode MeshFillMode = ComputeMeshFillMode(MeshBatch, Material, OverrideSettings);
const ERasterizerCullMode MeshCullMode = ComputeMeshCullMode(MeshBatch, Material, OverrideSettings);
if (!bIsTranslucent
&& (PrimitiveSceneProxy && PrimitiveSceneProxy->ShouldRenderInMainPass() && PrimitiveSceneProxy->AffectsDynamicIndirectLighting())
&& ShouldIncludeDomainInMeshPass(Material.GetMaterialDomain()))
{
// Select the shaders (VS and PS).
const FVertexFactory* VertexFactory = MeshBatch.VertexFactory;
FVertexFactoryType* VertexFactoryType = VertexFactory->GetType();
TMeshProcessorShaders<FLumenCardVS, FLumenCardPS> PassShaders;
PassShaders.VertexShader = Material.GetShader<FLumenCardVS>(VertexFactoryType);
PassShaders.PixelShader = Material.GetShader<FLumenCardPS>(VertexFactoryType);
FMeshMaterialShaderElementData ShaderElementData;
ShaderElementData.InitializeMeshMaterialData(ViewIfDynamicMeshCommand, PrimitiveSceneProxy, MeshBatch, StaticMeshId, false);
const FMeshDrawCommandSortKey SortKey = CalculateMeshStaticSortKey(PassShaders.VertexShader, PassShaders.PixelShader);
// Build the draw commands.
BuildMeshDrawCommands(
MeshBatch,
BatchElementMask,
PrimitiveSceneProxy,
MaterialRenderProxy,
Material,
PassDrawRenderState,
PassShaders,
MeshFillMode,
MeshCullMode,
SortKey,
EMeshPassFeatures::Default,
ShaderElementData);
}
}
}
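The filtering that AddMeshBatch performs before building draw commands boils down to a single predicate. The sketch below restates it with plain booleans; CardCaptureCandidate and ShouldCaptureIntoCards are hypothetical names, and each field stands in for the engine query of the same meaning in the listing above.

```cpp
// Hypothetical flattened view of the properties the processor inspects.
struct CardCaptureCandidate {
    bool useForMaterial;                  // MeshBatch.bUseForMaterial
    bool platformSupportsLumenGI;         // DoesPlatformSupportLumenGI(...)
    bool translucent;                     // IsTranslucentBlendMode(...)
    bool rendersInMainPass;               // ShouldRenderInMainPass()
    bool affectsDynamicIndirectLighting;  // AffectsDynamicIndirectLighting()
    bool domainIncludedInMeshPass;        // ShouldIncludeDomainInMeshPass(...)
};

// True when a mesh batch would get a card capture draw command.
bool ShouldCaptureIntoCards(const CardCaptureCandidate& c) {
    if (!c.useForMaterial || !c.platformSupportsLumenGI) {
        return false;
    }
    return !c.translucent
        && c.rendersInMainPass
        && c.affectsDynamicIndirectLighting
        && c.domainIncludedInMeshPass;
}
```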
6.5.5.4 RasterizeLumenCards
The logic for rasterizing Lumen cards is as follows:
if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
{
(......)
Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(...);
(......)
Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(...);
if (GLumenSceneNaniteMultiViewCapture) // Multi-view capture mode.
{
const uint32 NumCardsToRender = CardsToRender.Num();
// Split the views into batches so that one batch never exceeds the maximum view count.
uint32 NextCardIndex = 0;
while(NextCardIndex < NumCardsToRender)
{
TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;
while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
{
const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];
if(CardRenderData.NaniteInstanceIds.Num() > 0)
{
for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
{
NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
}
Nanite::FPackedViewParams Params;
Params.ViewMatrices = CardRenderData.ViewMatrices;
Params.PrevViewMatrices = CardRenderData.ViewMatrices;
Params.ViewRect = CardRenderData.AtlasAllocation;
Params.RasterContextSize = DepthStencilAtlasSize;
Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
NaniteViews.Add(Nanite::CreatePackedView(Params));
}
NextCardIndex++;
}
// Instanced draw.
if (NaniteInstanceDraws.Num() > 0)
{
RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");
Nanite::FRasterState RasterState;
Nanite::CullRasterize(
GraphBuilder,
*Scene,
NaniteViews,
CullingContext,
RasterContext,
RasterState,
&NaniteInstanceDraws
);
}
}
}
else // Single-view mode.
{
(......)
}
extern float GLumenDistantSceneMinInstanceBoundsRadius;
// Render the distant scene cards.
for (FCardRenderData& CardRenderData : CardsToRender)
{
if (CardRenderData.bDistantScene)
{
(......)
}
}
// Draw the Lumen meshes.
Nanite::DrawLumenMeshCapturePass(
GraphBuilder,
*Scene,
SharedView,
CardsToRender,
CullingContext,
RasterContext,
PassUniformParameters,
RectMinMaxBufferSRV,
NumRects,
LumenSceneData.MaxAtlasSize,
AlbedoAtlasTexture,
NormalAtlasTexture,
EmissiveAtlasTexture,
DepthStencilAtlasTexture
);
}
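The card rasterization stage is essentially the same as the Nanite flow:
The rasterized output is also the same, containing visibility, the depth-stencil buffer, triangle IDs and similar information:
The next step draws the mesh cards, and this stage is also essentially the same as in Nanite:
The output GBuffer still consists of the three atlases mentioned above (base color, normal, emissive), with the new data written into their free regions.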
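The view-splitting loop in the multi-view branch above is easy to get wrong, so here it is reduced to its control flow. This is a sketch under simplifying assumptions: a card is represented only by its Nanite instance count, SplitIntoViewBatches is a hypothetical name, and maxViewsPerPass plays the role of MAX_VIEWS_PER_CULL_RASTERIZE_PASS.

```cpp
#include <cstddef>
#include <vector>

// Groups card indices into batches of at most maxViewsPerPass views.
// Cards without Nanite instances are skipped and do not consume a view slot.
std::vector<std::vector<int>> SplitIntoViewBatches(
        const std::vector<int>& naniteInstanceCounts, std::size_t maxViewsPerPass) {
    std::vector<std::vector<int>> batches;
    if (maxViewsPerPass == 0) {
        return batches;  // guard: the inner loop could never make progress
    }
    std::size_t next = 0;
    while (next < naniteInstanceCounts.size()) {
        std::vector<int> views;
        while (next < naniteInstanceCounts.size() && views.size() < maxViewsPerPass) {
            if (naniteInstanceCounts[next] > 0) {
                views.push_back(static_cast<int>(next));
            }
            ++next;
        }
        if (!views.empty()) {
            batches.push_back(views);  // one cull-and-rasterize call per batch
        }
    }
    return batches;
}
```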
6.5.6 Lumen Scene Lighting
6.5.6.1 Voxel Cone Tracing
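Later subsections rely heavily on voxel cone tracing, so this subsection first covers the background. It is based on two sources: Interactive Indirect Illumination Using Voxel Cone Tracing, and Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination.
The first step of voxel cone tracing is to build a sparse voxel octree of the scene objects; UE5 uses mesh distance fields with a sparse HLOD.
The figure below shows the Sponza scene after voxelization:
Rendering engines (such as UE) generally use a hybrid pipeline: direct light (primary rays) is obtained with conventional rasterization, while secondary light uses cone tracing:
Before voxel cone tracing, the geometry is pre-filtered and then traced like a participating medium (volume ray casting can be used). The voxels represent the scene objects as an opacity field plus incoming radiance, so quadrilinearly interpolated samples can approximate the footprint covered by a cone:
Tracing a single cone in the steps above requires MIP maps. The MIP maps are generated with Gaussian weights: the voxel center has the largest weight, and points farther from the center have smaller weights.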
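As a one-dimensional illustration of Gaussian-weighted MIP generation, the sketch below halves a row of voxel values with a (1/4, 1/2, 1/4) kernel. The kernel size, the clamped borders and the function name DownsampleGaussian are assumptions chosen for brevity; a real implementation filters a 3D neighborhood.

```cpp
#include <algorithm>
#include <vector>

// Produces the next coarser level of a 1D voxel row.
// Each coarse value is centered on an even fine index and weights the
// center more than its two neighbors (indices are clamped at the borders).
std::vector<float> DownsampleGaussian(const std::vector<float>& fine) {
    const int n = static_cast<int>(fine.size());
    std::vector<float> coarse;
    for (int center = 0; center < n; center += 2) {
        const int left = std::max(center - 1, 0);
        const int right = std::min(center + 1, n - 1);
        coarse.push_back(0.25f * fine[left] + 0.5f * fine[center] + 0.25f * fine[right]);
    }
    return coarse;
}
```

Applying the function repeatedly yields the whole pyramid, with every level blurrier than the one below it.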
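MIP maps generated with Gaussian weights get blurrier at higher levels, which matches the shape of a cone: the farther a cone travels from its origin, the larger the area it covers and the blurrier the lighting it receives. Given this, the MIP level can be chosen from the distance between the sample point on the cone and the cone origin, and a quadrilinear sample at that level quickly yields the radiance at that point.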
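The relationship between distance along the cone and MIP level, plus the accumulation of the samples taken along the way, can be sketched as follows. Both functions are illustrative only: the names are hypothetical, the level formula assumes level 0 voxels of size baseVoxelSize that double in size per level, and the accumulation is the usual front-to-back compositing with non-premultiplied sample radiance.

```cpp
#include <cmath>
#include <vector>

// MIP level whose voxel size matches the cone diameter at `distance`.
float ConeMipLevel(float distance, float halfAngleRadians, float baseVoxelSize) {
    const float diameter = 2.0f * distance * std::tan(halfAngleRadians);
    if (diameter <= baseVoxelSize) {
        return 0.0f;  // the cone is still narrower than one finest voxel
    }
    return std::log2(diameter / baseVoxelSize);
}

struct ConeSample {
    float radiance;  // pre-filtered radiance fetched at the chosen MIP level
    float opacity;   // pre-filtered opacity in [0, 1]
};

// Composite samples ordered from the cone origin outwards.
ConeSample AccumulateFrontToBack(const std::vector<ConeSample>& samples) {
    ConeSample result = {0.0f, 0.0f};
    for (const ConeSample& s : samples) {
        const float weight = (1.0f - result.opacity) * s.opacity;
        result.radiance += weight * s.radiance;
        result.opacity += weight;
        if (result.opacity >= 1.0f) {
            break;  // fully occluded: farther samples cannot contribute
        }
    }
    return result;
}
```

A wide cone reaches high MIP levels after a few steps, which is why a diffuse cone needs far fewer samples than a ray march at full resolution would.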
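The voxel rendering process can be split into three passes. The first is the light pass, which bakes irradiance (reflective shadow map, RSM). The second is the filtering pass, which downsamples radiance through the sparse octree. The third is the camera pass, which gathers irradiance for every visible fragment (pixel). (See the figure below.)
Likewise, voxel tracing can also be used for specular reflections, AO and soft shadows. Specular reflections can be traced in a similar way, only with fewer and narrower cones:
In fact, in cone tracing, surfaces of different roughness can be traced with different numbers and sizes of cones:
Left: a high-roughness surface, i.e. diffuse reflection, needs several cones; middle: a fairly rough specular reflection needs only one wide cone; right: a low-roughness specular reflection needs only one narrow cone.
For AO, a combination is used: multi-sample cone tracing nearby, plus far-field AO, plus offline occlusion:
For soft shadows, one cone per pixel can be sampled, and the softer the shadow, the cheaper it is to compute:
The paper also describes a technique for voxelizing in a single pass, as well as the technique and process for building the sparse octree with compute shaders: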
6.5.6.2 RenderLumenSceneLighting
Lumen scene lighting is handled by RenderLumenSceneLighting. Its code is as follows:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneLighting.cpp
void FDeferredShadingSceneRenderer::RenderLumenSceneLighting(
FRDGBuilder& GraphBuilder,
FViewInfo& View)
{
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
// Check whether Lumen is enabled: it is enough for either the diffuse indirect method or the reflection method to be Lumen.
const bool bAnyLumenEnabled = GetViewPipelineState(Views[0]).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen
|| GetViewPipelineState(Views[0]).ReflectionsMethod == EReflectionsMethod::Lumen;
if (bAnyLumenEnabled)
{
RDG_EVENT_SCOPE(GraphBuilder, "LumenSceneLighting");
FGlobalShaderMap* GlobalShaderMap = View.ShaderMap;
FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, Views[0]);
if (LumenSceneData.VisibleCardsIndices.Num() > 0)
{
FRDGTextureRef RadiosityAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.RadiosityAtlas, TEXT("Lumen.RadiosityAtlas"));
// Render radiosity.
RenderRadiosityForLumenScene(GraphBuilder, TracingInputs, GlobalShaderMap, RadiosityAtlas);
ConvertToExternalTexture(GraphBuilder, RadiosityAtlas, LumenSceneData.RadiosityAtlas);
FLumenCardScatterContext DirectLightingCardScatterContext;
extern float GLumenSceneCardDirectLightingUpdateFrequencyScale;
// Build the indirect args for writing to the card faces whose direct lighting is updated this frame.
DirectLightingCardScatterContext.Init(
GraphBuilder,
View,
LumenSceneData,
LumenCardRenderer,
ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
1);
// Cull the cards to the given shape.
DirectLightingCardScatterContext.CullCardsToShape(
GraphBuilder,
View,
LumenSceneData,
LumenCardRenderer,
TracingInputs.LumenCardSceneUniformBuffer,
ECullCardsShapeType::None,
FCullCardsShapeParameters(),
GLumenSceneCardDirectLightingUpdateFrequencyScale,
0);
// Build the scatter indirect args.
DirectLightingCardScatterContext.BuildScatterIndirectArgs(
GraphBuilder,
View);
extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;
// Clear the lighting atlases: final lighting atlas, irradiance atlas, indirect irradiance atlas.
if (GLumenSceneRecaptureLumenSceneEveryFrame)
{
ClearAtlasRDG(GraphBuilder, TracingInputs.FinalLightingAtlas);
if (Lumen::UseIrradianceAtlas(View))
{
ClearAtlasRDG(GraphBuilder, TracingInputs.IrradianceAtlas);
}
if (Lumen::UseIndirectIrradianceAtlas(View))
{
ClearAtlasRDG(GraphBuilder, TracingInputs.IndirectIrradianceAtlas);
}
}
// Combine the scene lighting.
CombineLumenSceneLighting(
Scene,
View,
GraphBuilder,
TracingInputs.LumenCardSceneUniformBuffer,
TracingInputs.FinalLightingAtlas,
TracingInputs.OpacityAtlas,
RadiosityAtlas,
GlobalShaderMap,
DirectLightingCardScatterContext);
// Copy TracingInputs.FinalLightingAtlas into TracingInputs.IndirectIrradianceAtlas.
if (Lumen::UseIndirectIrradianceAtlas(View))
{
CopyLumenCardAtlas(
Scene,
View,
GraphBuilder,
TracingInputs.LumenCardSceneUniformBuffer,
TracingInputs.FinalLightingAtlas,
TracingInputs.IndirectIrradianceAtlas,
GlobalShaderMap,
DirectLightingCardScatterContext);
}
// Render the direct lighting of the Lumen scene.
RenderDirectLightingForLumenScene(
GraphBuilder,
TracingInputs.LumenCardSceneUniformBuffer,
TracingInputs.FinalLightingAtlas,
TracingInputs.OpacityAtlas,
GlobalShaderMap,
DirectLightingCardScatterContext);
if (Lumen::UseIrradianceAtlas(View))
{
CopyLumenCardAtlas(
Scene,
View,
GraphBuilder,
TracingInputs.LumenCardSceneUniformBuffer,
TracingInputs.FinalLightingAtlas,
TracingInputs.IrradianceAtlas,
GlobalShaderMap,
DirectLightingCardScatterContext);
}
FRDGTextureRef AlbedoAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas, TEXT("Lumen.AlbedoAtlas"));
FRDGTextureRef EmissiveAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas, TEXT("Lumen.EmissiveAtlas"));
// Apply the base color (albedo) of the Lumen cards.
ApplyLumenCardAlbedo(
Scene,
View,
GraphBuilder,
TracingInputs.LumenCardSceneUniformBuffer,
TracingInputs.FinalLightingAtlas,
AlbedoAtlas,
EmissiveAtlas,
GlobalShaderMap,
DirectLightingCardScatterContext);
LumenSceneData.bFinalLightingAtlasContentsValid = true;
// Prefilter the lighting.
PrefilterLumenSceneLighting(GraphBuilder, View, TracingInputs, GlobalShaderMap, DirectLightingCardScatterContext);
ConvertToExternalTexture(GraphBuilder, TracingInputs.FinalLightingAtlas, LumenSceneData.FinalLightingAtlas);
if (Lumen::UseIrradianceAtlas(View))
{
ConvertToExternalTexture(GraphBuilder, TracingInputs.IrradianceAtlas, LumenSceneData.IrradianceAtlas);
}
if (Lumen::UseIndirectIrradianceAtlas(View))
{
ConvertToExternalTexture(GraphBuilder, TracingInputs.IndirectIrradianceAtlas, LumenSceneData.IndirectIrradianceAtlas);
}
}
// Compute the voxel lighting.
ComputeLumenSceneVoxelLighting(GraphBuilder, TracingInputs, GlobalShaderMap);
// GI for translucent objects.
ComputeLumenTranslucencyGIVolume(GraphBuilder, TracingInputs, GlobalShaderMap);
}
}
A RenderDoc frame capture shows the flow above at a glance:
The following subsections analyze some of the main steps.
6.5.6.3 RenderRadiosityForLumenScene
RenderRadiosityForLumenScene renders the radiosity of the Lumen scene. Its code is as follows:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenRadiosity.cpp
void FDeferredShadingSceneRenderer::RenderRadiosityForLumenScene(
FRDGBuilder& GraphBuilder,
const FLumenCardTracingInputs& TracingInputs,
FGlobalShaderMap* GlobalShaderMap,
FRDGTextureRef RadiosityAtlas)
{
LLM_SCOPE_BYTAG(Lumen);
const FViewInfo& MainView = Views[0];
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;
if (IsRadiosityEnabled()
&& !GLumenSceneRecaptureLumenSceneEveryFrame
&& LumenSceneData.bFinalLightingAtlasContentsValid
&& TracingInputs.NumClipmapLevels > 0)
{
RDG_EVENT_SCOPE(GraphBuilder, "Radiosity");
FLumenCardScatterContext VisibleCardScatterContext;
// Build the indirect args for writing to the card faces whose radiosity is updated this frame.
VisibleCardScatterContext.Init(
GraphBuilder,
MainView,
LumenSceneData,
LumenCardRenderer,
ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender);
VisibleCardScatterContext.CullCardsToShape(
GraphBuilder,
MainView,
LumenSceneData,
LumenCardRenderer,
TracingInputs.LumenCardSceneUniformBuffer,
ECullCardsShapeType::None,
FCullCardsShapeParameters(),
GLumenSceneCardRadiosityUpdateFrequencyScale,
0);
// Build the scatter indirect args.
VisibleCardScatterContext.BuildScatterIndirectArgs(
GraphBuilder,
MainView);
// Generate the sample directions.
RadiosityDirections.GenerateSamples(
FMath::Clamp(GLumenRadiosityNumTargetCones, 1, (int32)MaxRadiosityConeDirections),
1,
GLumenRadiosityNumTargetCones,
false,
true /* Cosine distribution */);
const bool bRenderSkylight = Lumen::ShouldHandleSkyLight(Scene, ViewFamily);
// Render the radiosity scatter.
if (GLumenRadiosityComputeTraceBlocksScatter) // Compute shader (CS) mode.
{
RenderRadiosityComputeScatter(
GraphBuilder,
Scene,
Views[0],
bRenderSkylight,
LumenSceneData,
RadiosityAtlas,
TracingInputs,
VisibleCardScatterContext.Parameters,
GlobalShaderMap);
}
else // Pixel shader (PS) mode.
{
FLumenCardRadiosity* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosity>();
PassParameters->RenderTargets[0] = FRenderTargetBinding(RadiosityAtlas, ERenderTargetLoadAction::ENoAction);
PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
PassParameters->VS.ScatterInstanceIndex = 0;
PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;
SetupTraceFromTexelParameters(Views[0], TracingInputs, LumenSceneData, PassParameters->PS.TraceFromTexelParameters);
FLumenCardRadiosityPS::FPermutationDomain PermutationVector;
PermutationVector.Set<FLumenCardRadiosityPS::FDynamicSkyLight>(bRenderSkylight);
auto PixelShader = GlobalShaderMap->GetShader<FLumenCardRadiosityPS>(PermutationVector);
FScene* LocalScene = Scene;
const int32 RadiosityDownsampleArea = GLumenRadiosityDownsampleFactor * GLumenRadiosityDownsampleFactor;
// Trace radiosity from the atlas texels.
GraphBuilder.AddPass(
RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
PassParameters,
ERDGPassFlags::Raster,
[LocalScene, PixelShader, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
{
FIntPoint ViewRect = FIntPoint::DivideAndRoundDown(LocalScene->LumenSceneData->MaxAtlasSize, GLumenRadiosityDownsampleFactor);
DrawQuadsToAtlas(ViewRect, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
});
}
}
else
{
ClearAtlasRDG(GraphBuilder, RadiosityAtlas);
}
}
The final stage of the code above computes the radiosity. It normally takes the CS path, RenderRadiosityComputeScatter, whose code is analyzed next:
void RenderRadiosityComputeScatter(
FRDGBuilder& GraphBuilder,
const FScene* Scene,
const FViewInfo& View,
bool bRenderSkylight,
const FLumenSceneData& LumenSceneData,
FRDGTextureRef RadiosityAtlas,
const FLumenCardTracingInputs& TracingInputs,
const FLumenCardScatterParameters& CardScatterParameters,
FGlobalShaderMap* GlobalShaderMap)
{
const bool bUseIrradianceCache = GLumenRadiosityUseIrradianceCache != 0;
// Build the indirect args for setting up the trace blocks.
FRDGBufferRef SetupCardTraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("SetupCardTraceBlocksIndirectArgsBuffer"));
{
FRDGBufferUAVRef SetupCardTraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(SetupCardTraceBlocksIndirectArgsBuffer));
FPlaceProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPlaceProbeIndirectArgsCS::FParameters>();
PassParameters->RWIndirectArgs = SetupCardTraceBlocksIndirectArgsBufferUAV;
PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;
auto ComputeShader = GlobalShaderMap->GetShader< FPlaceProbeIndirectArgsCS >(0);
ensure(GSetupCardTraceBlocksGroupSize == GPlaceRadiosityProbeGroupSize);
const FIntVector GroupSize(1, 1, 1);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("SetupCardTraceBlocksIndirectArgsCS"),
ComputeShader,
PassParameters,
GroupSize);
}
const int32 TraceBlockMaxSize = 2;
extern int32 GLumenSceneCardLightingForceFullUpdate;
const int32 Divisor = TraceBlockMaxSize * GLumenRadiosityDownsampleFactor * (GLumenSceneCardLightingForceFullUpdate ? 1 : GLumenRadiosityTraceBlocksAllocationDivisor);
const int32 NumTraceBlocksToAllocate = (LumenSceneData.MaxAtlasSize.X / Divisor)
* (LumenSceneData.MaxAtlasSize.Y / Divisor);
FRDGBufferRef CardTraceBlockAllocator = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("CardTraceBlockAllocator"));
FRDGBufferRef CardTraceBlockData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(FIntVector4), NumTraceBlocksToAllocate), TEXT("CardTraceBlockData"));
FRDGBufferUAVRef CardTraceBlockAllocatorUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockAllocator, PF_R32_UINT));
FRDGBufferUAVRef CardTraceBlockDataUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, CardTraceBlockAllocatorUAV, 0);
// Set up the card trace blocks.
{
FSetupCardTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupCardTraceBlocksCS::FParameters>();
PassParameters->RWCardTraceBlockAllocator = CardTraceBlockAllocatorUAV;
PassParameters->RWCardTraceBlockData = CardTraceBlockDataUAV;
PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;
PassParameters->QuadData = CardScatterParameters.QuadData;
PassParameters->CardBuffer = LumenSceneData.CardBuffer.SRV;
PassParameters->RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
PassParameters->IndirectArgs = SetupCardTraceBlocksIndirectArgsBuffer;
auto ComputeShader = GlobalShaderMap->GetShader<FSetupCardTraceBlocksCS>();
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("SetupCardTraceBlocksCS"),
ComputeShader,
PassParameters,
SetupCardTraceBlocksIndirectArgsBuffer,
0);
}
// Build the indirect args for tracing the blocks.
FRDGBufferRef TraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("TraceBlocksIndirectArgsBuffer"));
{
FRDGBufferUAVRef TraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(TraceBlocksIndirectArgsBuffer));
FTraceBlocksIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FTraceBlocksIndirectArgsCS::FParameters>();
PassParameters->RWIndirectArgs = TraceBlocksIndirectArgsBufferUAV;
PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
FTraceBlocksIndirectArgsCS::FPermutationDomain PermutationVector;
PermutationVector.Set<FTraceBlocksIndirectArgsCS::FIrradianceCache>(bUseIrradianceCache);
auto ComputeShader = GlobalShaderMap->GetShader< FTraceBlocksIndirectArgsCS >(PermutationVector);
const FIntVector GroupSize(1, 1, 1);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceBlocksIndirectArgsCS"),
ComputeShader,
PassParameters,
GroupSize);
}
LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;
// Render the irradiance cache (through RenderRadianceCache).
if (bUseIrradianceCache)
{
const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenRadiosity::SetupRadianceCacheInputs();
FRadiosityMarkUsedProbesData MarkUsedProbesData;
MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
MarkUsedProbesData.Parameters.DepthAtlas = LumenSceneData.DepthAtlas->GetRenderTargetItem().ShaderResourceTexture;
MarkUsedProbesData.Parameters.CurrentOpacityAtlas = LumenSceneData.OpacityAtlas->GetRenderTargetItem().ShaderResourceTexture;
MarkUsedProbesData.Parameters.CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
MarkUsedProbesData.Parameters.CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
MarkUsedProbesData.Parameters.CardBuffer = LumenSceneData.CardBuffer.SRV;
MarkUsedProbesData.Parameters.RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
MarkUsedProbesData.Parameters.IndirectArgs = TraceBlocksIndirectArgsBuffer;
RenderRadianceCache(
GraphBuilder,
TracingInputs,
RadianceCacheInputs,
Scene,
View,
nullptr,
nullptr,
FMarkUsedRadianceCacheProbes::CreateStatic(&RadianceCacheMarkUsedProbes),
&MarkUsedProbesData,
View.ViewState->RadiosityRadianceCacheState,
RadianceCacheParameters);
}
// Trace the radiosity of the card texels from the atlas.
{
FLumenCardRadiosityTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosityTraceBlocksCS::FParameters>();
PassParameters->RWRadiosityAtlas = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RadiosityAtlas));
PassParameters->RadianceCacheParameters = RadianceCacheParameters;
PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
PassParameters->CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
PassParameters->ProbeOcclusionNormalBias = GLumenRadiosityIrradianceCacheProbeOcclusionNormalBias;
PassParameters->IndirectArgs = TraceBlocksIndirectArgsBuffer;
SetupTraceFromTexelParameters(View, TracingInputs, LumenSceneData, PassParameters->TraceFromTexelParameters);
FLumenCardRadiosityTraceBlocksCS::FPermutationDomain PermutationVector;
PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FDynamicSkyLight>(bRenderSkylight);
PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FIrradianceCache>(bUseIrradianceCache);
auto ComputeShader = GlobalShaderMap->GetShader< FLumenCardRadiosityTraceBlocksCS >(PermutationVector);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
ComputeShader,
PassParameters,
TraceBlocksIndirectArgsBuffer,
0);
}
}
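As this shows, computing the radiosity takes quite a few steps, including culling, building the trace arguments and tracing the atlas texels.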
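The sizing of the CardTraceBlockData buffer in the code above is plain integer arithmetic and can be checked with a small helper. The function name and the sample values used in the usage note are assumptions for illustration, not engine defaults; only the formula is taken from the listing.

```cpp
// Number of trace blocks to allocate, following the listing:
//   Divisor = TraceBlockMaxSize * DownsampleFactor * (ForceFullUpdate ? 1 : AllocationDivisor)
//   Count   = (AtlasX / Divisor) * (AtlasY / Divisor)
int NumTraceBlocksToAllocate(int atlasSizeX, int atlasSizeY,
                             int downsampleFactor, int allocationDivisor,
                             bool forceFullUpdate) {
    const int traceBlockMaxSize = 2;
    const int divisor = traceBlockMaxSize * downsampleFactor
                      * (forceFullUpdate ? 1 : allocationDivisor);
    return (atlasSizeX / divisor) * (atlasSizeY / divisor);
}
```

For a hypothetical 4096x4096 atlas with a downsample factor of 2 and an allocation divisor of 2, the divisor is 8 and 512 * 512 = 262144 blocks are reserved; forcing a full update drops the divisor to 4 and quadruples the allocation.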
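The final texel-tracing stage mainly builds the sample directions; for each direction it builds a cone and traces the nearby radiosity. Its main inputs are the global distance field atlas, scene depth, scene opacity, scene normals, VoxelLighting and similar data:
Data needed to trace card texels: top left is the global distance field atlas, top right the scene depth atlas, bottom left the scene opacity, bottom right the scene normals.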
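The host code requests the cone directions with a cosine distribution. One common way to produce such directions is to sample a unit disk and project the points up onto the hemisphere; the sketch below does that with a simple deterministic sequence. It is not the engine's GenerateSamples: the function name, the Dir struct and the golden-ratio sequence for the azimuth are all assumptions.

```cpp
#include <cmath>
#include <vector>

struct Dir { float x, y, z; };  // z is the surface normal axis

// Returns `count` unit directions on the +z hemisphere whose density
// is proportional to cos(theta), i.e. to z.
std::vector<Dir> CosineWeightedDirections(int count) {
    const float kTwoPi = 6.28318530718f;
    const float kGoldenRatioFrac = 0.61803398875f;
    std::vector<Dir> dirs;
    for (int i = 0; i < count; ++i) {
        const float u1 = (i + 0.5f) / count;                    // stratified in [0, 1)
        const float u2 = std::fmod(i * kGoldenRatioFrac, 1.0f); // spread azimuths
        const float radius = std::sqrt(u1);                     // uniform over the disk
        const float phi = kTwoPi * u2;
        dirs.push_back({ radius * std::cos(phi),
                         radius * std::sin(phi),
                         std::sqrt(1.0f - u1) });               // lift onto the hemisphere
    }
    return dirs;
}
```

Because the density already follows the cosine term, each cone can be weighted equally when the traced radiance is averaged.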
The output is the scene radiosity atlas:
The corresponding CS shader code is as follows:
// Engine\Shaders\Private\Lumen\LumenRadiosity.usf
float ProbeOcclusionNormalBias;
// Holds the lighting results of the thread group; note that it is groupshared.
groupshared float3 ThreadLighting[THREADGROUP_SIZE];
[numthreads(THREADGROUP_SIZE, 1, 1)]
void LumenCardRadiosityTraceBlocksCS(
uint3 DispatchThreadId : SV_DispatchThreadID,
uint3 GroupThreadId : SV_GroupThreadID)
{
#if IRRADIANCE_CACHE // Irradiance cache mode.
uint ThreadIndex = DispatchThreadId.x;
uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
if (GlobalBlockIndex < CardTraceBlockAllocator[0])
{
// Compute the texel index.
uint TexelIndexInBlock = ThreadIndex % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);
// Fetch the trace block data.
uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
uint CardId = TraceBlockData.x;
uint ProbeIndex = TraceBlockData.y;
uint BlockIndex = TraceBlockData.z;
// Fetch the card data.
FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);
float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;
if (all(TexelCoord < CardSizeTexels))
{
// Compute the card UV.
float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;
float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;
float3 DiffuseLighting = 0;
// Radiosity is only meaningful where the opacity is greater than 0.
if (Opacity > 0)
{
float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;
float3 LocalPosition;
LocalPosition.xy = (AtlasUV - CardData.LocalPositionToAtlasUVBias) / CardData.LocalPositionToAtlasUVScale;
LocalPosition.z = -CardData.LocalExtent.z + Depth * 2 * CardData.LocalExtent.z;
// Compute the world-space position and normal.
float3 WorldPosition = mul(CardData.WorldToLocalRotation, LocalPosition) + CardData.Origin;
float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);
uint ClipmapIndex = GetRadianceProbeClipmap(WorldPosition);
// Compute the diffuse lighting. If the clipmap is valid, interpolate from it.
if (ClipmapIndex < NumRadianceProbeClipmaps)
{
float3 BiasOffset = WorldNormal * ProbeOcclusionNormalBias;
// Sample RadianceProbeIndirectionTexture to compute the diffuse lighting.
DiffuseLighting = SampleIrradianceCacheInterpolated(WorldPosition, WorldNormal, BiasOffset, ClipmapIndex);
}
else // No valid clipmap: compute the diffuse lighting from the sky light's spherical harmonics.
{
DiffuseLighting = GetSkySHDiffuse(WorldNormal) * View.SkyLightColor.rgb;
}
}
// Store the radiosity.
uint2 AtlasCoord = uint2(AtlasUV * RadiosityAtlasSize);
RWRadiosityAtlas[AtlasCoord] = float4(DiffuseLighting * PI, 0);
}
}
#else // Non-irradiance-cache mode
ThreadLighting[GroupThreadId.x] = 0;
uint ThreadIndex = DispatchThreadId.x;
uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE * THREADS_PER_RADIOSITY_TEXEL);
int2 AtlasCoord = -1;
if (GlobalBlockIndex < CardTraceBlockAllocator[0])
{
uint TexelIndexInBlock = (ThreadIndex / THREADS_PER_RADIOSITY_TEXEL) % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);
uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
uint CardId = TraceBlockData.x;
uint ProbeIndex = TraceBlockData.y;
uint BlockIndex = TraceBlockData.z;
FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);
float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;
if (all(TexelCoord < CardSizeTexels))
{
uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;
float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;
uint NumTracesPerThread = NumCones / THREADS_PER_RADIOSITY_TEXEL;
uint ConeStartIndex = TraceThreadIndex * NumTracesPerThread;
AtlasCoord = int2(AtlasUV * RadiosityAtlasSize);
// Trace radiosity from the card texel.
float3 Lighting = RadiosityTraceFromTexel(AtlasUV, AtlasCoord, ProbeIndex, CardData, ConeStartIndex, ConeStartIndex + NumTracesPerThread);
ThreadLighting[GroupThreadId.x] = Lighting;
}
}
// Wait for the other threads in the group to finish.
GroupMemoryBarrierWithGroupSync();
uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;
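// Sum the results of the THREADS_PER_RADIOSITY_TEXEL threads that share this texel and store them. TraceThreadIndex == 0 means only the first thread of each texel does this.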
if (TraceThreadIndex == 0 && all(AtlasCoord >= 0))
{
float3 Lighting = 0;
for (uint OtherThreadIndex = GroupThreadId.x; OtherThreadIndex < GroupThreadId.x + THREADS_PER_RADIOSITY_TEXEL; OtherThreadIndex += 1)
{
Lighting += ThreadLighting[OtherThreadIndex];
}
RWRadiosityAtlas[AtlasCoord] = float4(Lighting, 0);
}
#endif
}
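The index arithmetic of the non-irradiance-cache path above can be mirrored on the CPU to see how one flat thread index maps to a trace block, a texel inside that block, and a share of the cones. This is only a sketch: `kBlockSize` and `kThreadsPerTexel` are hypothetical stand-ins for the shader defines CARD_TRACE_BLOCK_SIZE and THREADS_PER_RADIOSITY_TEXEL, and the values chosen here are assumptions, not the engine's.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values for illustration; the real ones are shader defines.
constexpr uint32_t kBlockSize = 8;        // stands in for CARD_TRACE_BLOCK_SIZE
constexpr uint32_t kThreadsPerTexel = 4;  // stands in for THREADS_PER_RADIOSITY_TEXEL

struct TexelThread {
    uint32_t globalBlockIndex;  // trace block the thread belongs to
    uint32_t texelX, texelY;    // texel offset inside the block
    uint32_t traceThreadIndex;  // which share of the cones this thread traces
};

// Same divisions and modulos as the #else branch of the compute shader.
inline TexelThread DecodeThread(uint32_t threadIndex) {
    const uint32_t texelsPerBlock = kBlockSize * kBlockSize;
    TexelThread t;
    t.globalBlockIndex = threadIndex / (texelsPerBlock * kThreadsPerTexel);
    const uint32_t texelIndexInBlock = (threadIndex / kThreadsPerTexel) % texelsPerBlock;
    t.texelX = texelIndexInBlock % kBlockSize;
    t.texelY = texelIndexInBlock / kBlockSize;
    t.traceThreadIndex = threadIndex % kThreadsPerTexel;
    return t;
}

// Sums the partial results of the threads sharing one texel, which is what
// the thread with traceThreadIndex == 0 does after the group barrier.
inline float SumTexelLighting(const float* threadLighting, uint32_t firstThread) {
    assert(threadLighting != nullptr);
    float sum = 0.0f;
    for (uint32_t i = firstThread; i < firstThread + kThreadsPerTexel; ++i) {
        sum += threadLighting[i];
    }
    return sum;
}
```

With these assumed constants, thread 37 lands in block 0 at texel (1, 1) and traces the second quarter of the cones.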
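As LumenCardRadiosityTraceBlocksCS shows, radiosity tracing supports two modes: irradiance cache mode and non-irradiance-cache mode. Irradiance cache mode obtains the radiosity by sampling and interpolating the 3D RadianceProbeIndirectionTexture, whereas non-irradiance-cache mode traces cones from each card texel in real time and sums their results. The latter relies on RadiosityTraceFromTexel, whose logic is as follows: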
float3 RadiosityTraceFromTexel(float2 AtlasUV, int2 AtlasCoord, uint ProbeIndex, FLumenCardData LumenCardData, uint ConeStartIndex, uint ConeEndIndex)
{
float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;
float3 Lighting = 0;
if (Opacity > 0)
{
float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;
// Reconstruct the local position.
float3 LocalPosition;
LocalPosition.xy = (AtlasUV - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;
// World-space position and normal.
float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;
float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);
//@todo - derive bias from texel world size
WorldPosition += WorldNormal * SurfaceBias;
// Start distance of the voxel trace.
float VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(MinTraceDistance, MaxTraceDistance, MaxMeshSDFTraceDistance, false);
// Iterate over the cones assigned to this thread and accumulate their results.
for (uint ConeIndex = ConeStartIndex; ConeIndex < ConeEndIndex; ConeIndex++)
{
//uint ConeIndex = ConeStartIndex;
float3x3 TangentBasis = GetTangentBasisFrisvad(WorldNormal);
// Compute the cone direction.
#define PRECOMPUTED_SAMPLE_DIRECTIONS 1
#if PRECOMPUTED_SAMPLE_DIRECTIONS // Precomputed directions.
float3 LocalConeDirection = RadiosityConeDirections[ConeIndex].xyz;
float3 WorldConeDirection = mul(LocalConeDirection, TangentBasis);
#else // Not precomputed: generate the direction from a low-discrepancy sequence.
uint2 Seed0 = Rand3DPCG16(int3(AtlasCoord + 17, 0)).xy;
float2 E = Hammersley16(ConeIndex, NumCones, Seed0);
float2 DiskE = UniformSampleDiskConcentric(E.xy);
float TangentZ = sqrt(1 - length2(DiskE));
float3 WorldConeDirection = mul(float3(DiskE, TangentZ), TangentBasis);
#endif
//@todo - derive bias from texel world size
// Sample position.
float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;
// Build the cone trace input.
FConeTraceInput TraceInput;
TraceInput.Setup(SamplePosition, WorldConeDirection, DiffuseConeHalfAngle, MinSampleRadius, MinTraceDistance, MaxTraceDistance, StepFactor);
TraceInput.VoxelStepFactor = VoxelStepFactor;
TraceInput.VoxelTraceStartDistance = VoxelTraceStartDistance;
TraceInput.SDFStepFactor = 1;
// Run the cone trace and store the result.
FConeTraceResult TraceResult;
ConeTraceVoxels(TraceInput, TraceResult);
// Evaluate the sky light radiance for the cone.
EvaluateSkyRadianceForCone(WorldConeDirection, TraceInput.TanConeAngle, TraceResult);
// Accumulate the sampled lighting.
Lighting += TraceResult.Lighting;
}
}
// Normalize by the total cone count and scale by PI so that energy is conserved.
Lighting *= PI / (float)NumCones;
return Lighting;
}
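The final `Lighting *= PI / NumCones` step is easier to follow in isolation. The sketch below assumes the cone directions are cosine-distributed over the hemisphere, which is what the non-precomputed branch produces by lifting a uniform disk sample onto the hemisphere; whether the precomputed RadiosityConeDirections table follows the same distribution is an assumption here. `DiskToHemisphere` and `IntegrateCones` are hypothetical helper names, not engine functions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

constexpr float kPi = 3.14159265358979f;

// Lifts a point on the unit disk onto the hemisphere in tangent space
// (+Z is the surface normal). Uniform disk points give cosine-distributed
// directions, matching TangentZ = sqrt(1 - length2(DiskE)) in the shader.
inline Vec3 DiskToHemisphere(float diskX, float diskY) {
    const float r2 = diskX * diskX + diskY * diskY;
    assert(r2 <= 1.0f);
    return {diskX, diskY, std::sqrt(1.0f - r2)};
}

// Averages the per-cone radiance and scales by PI, mirroring
// `Lighting *= PI / (float)NumCones` for a single color channel.
inline float IntegrateCones(const std::vector<float>& coneRadiance) {
    if (coneRadiance.empty()) return 0.0f;
    float sum = 0.0f;
    for (float r : coneRadiance) sum += r;
    return sum * kPi / static_cast<float>(coneRadiance.size());
}
```

Under that assumption, a texel that sees a constant radiance of 1 in every cone ends up with a value of PI, regardless of how many cones are traced. Note that in the shader each thread accumulates only its own share of the cones but still divides by the full NumCones, so the per-thread partial sums add up to the same total.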
The code above calls ConeTraceVoxels, the interface for cone tracing the scene. It is the approach described in 6.5.6.1 Voxel Cone Tracing, and its code is as follows:
// Engine\Shaders\Private\Lumen\LumenTracingCommon.ush
void ConeTraceVoxels(
FConeTraceInput TraceInput,
inout FConeTraceResult OutResult)
{
FGlobalSDFTraceResult SDFTraceResult;
// Trace the SDF ray.
{
FGlobalSDFTraceInput SDFTraceInput = SetupGlobalSDFTraceInput(TraceInput.ConeOrigin, TraceInput.ConeDirection, TraceInput.MinTraceDistance, TraceInput.MaxTraceDistance, TraceInput.SDFStepFactor, TraceInput.VoxelStepFactor);
SDFTraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance = TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance;
SDFTraceInput.InitialMaxDistance = TraceInput.InitialMaxDistance;
// Trace the global distance field.
SDFTraceResult = RayTraceGlobalDistanceField(SDFTraceInput);
}
float4 LightingAndAlpha = float4(0, 0, 0, 1);
// The logic below only runs when the global distance field trace hits.
if (GlobalSDFTraceResultIsHit(SDFTraceResult))
{
float3 SampleWorldPosition = TraceInput.ConeOrigin + TraceInput.ConeDirection * SDFTraceResult.HitTime;
uint VoxelClipmapIndex = 0;
float3 VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
float3 VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;
bool bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);
// Find the first (finest) voxel clipmap whose valid region contains the hit position.
while (bOutsideValidRegion && VoxelClipmapIndex + 1 < NumClipmapLevels)
{
VoxelClipmapIndex++;
VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;
bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);
}
LightingAndAlpha.xyzw = 0.0f;
// If the sample is inside the valid region, compute the voxel lighting.
if (!bOutsideValidRegion)
{
float3 DistanceFieldGradient = -TraceInput.ConeDirection;
float3 ClipmapVolumeUV = ComputeGlobalUV(SampleWorldPosition, SDFTraceResult.HitClipmapIndex);
uint PageIndex = GetGlobalDistanceFieldPage(ClipmapVolumeUV, SDFTraceResult.HitClipmapIndex);
if (PageIndex < GLOBAL_DISTANCE_FIELD_INVALID_PAGE_ID)
{
float3 PageUV = ComputeGlobalDistanceFieldPageUV(ClipmapVolumeUV, PageIndex);
DistanceFieldGradient = GlobalDistanceFieldPageCentralDiff(PageUV);
}
float DistanceFieldGradientLength = length(DistanceFieldGradient);
float3 SampleNormal = DistanceFieldGradientLength > 0.001 ? DistanceFieldGradient / DistanceFieldGradientLength : -TraceInput.ConeDirection;
// Sample the VoxelLighting 3D texture to obtain the lighting.
float4 StepLighting = SampleVoxelLighting(SampleWorldPosition, -SampleNormal, VoxelClipmapIndex);
StepLighting.xyz = StepLighting.xyz * (1.0f / max(StepLighting.w, 0.1));
// Compute the self-occlusion factor.
float VoxelSelfLightingBias = 1.0f;
if (TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance)
{
// For diffuse rays, prefer over-occlusion to light leaking.
VoxelSelfLightingBias = smoothstep(1.5 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, 2.0 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, SDFTraceResult.HitTime);
}
// Lighting after applying self-occlusion.
LightingAndAlpha.xyz = StepLighting.xyz * VoxelSelfLightingBias;
}
}
// Fade out the lighting result based on transparency.
LightingAndAlpha = FadeOutVoxelConeTraceMinTransparency(LightingAndAlpha);
// Store the result.
OutResult = (FConeTraceResult)0;
#if !VISIBILITY_ONLY_TRACE
OutResult.Lighting = LightingAndAlpha.rgb;
#endif
OutResult.Transparency = LightingAndAlpha.a;
OutResult.NumSteps = SDFTraceResult.TotalStepsTaken;
OutResult.OpaqueHitDistance = GlobalSDFTraceResultIsHit(SDFTraceResult) ? SDFTraceResult.HitTime : TraceInput.MaxTraceDistance;
}
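Two pieces of ConeTraceVoxels lend themselves to a small CPU sketch: the loop that walks the clipmap levels until one contains the hit position, and the smoothstep that darkens hits very close to the ray origin. The types and function names below are hypothetical, and the sketch assumes that ClipmapVoxelSizeAndRadius.w holds a voxel radius, as its name suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Clipmap { Vec3 center; Vec3 extent; };  // axis-aligned sampling region

// True if p lies outside the clipmap's box on any axis.
inline bool IsOutside(const Vec3& p, const Clipmap& c) {
    return std::fabs(p.x - c.center.x) > c.extent.x ||
           std::fabs(p.y - c.center.y) > c.extent.y ||
           std::fabs(p.z - c.center.z) > c.extent.z;
}

// Returns the index of the first (finest) clipmap containing p, or -1 if
// none does. The shader expresses the same search as a while loop and
// skips the lighting lookup when the last level still does not contain p.
inline int SelectClipmap(const Vec3& p, const std::vector<Clipmap>& clipmaps) {
    for (std::size_t i = 0; i < clipmaps.size(); ++i) {
        if (!IsOutside(p, clipmaps[i])) return static_cast<int>(i);
    }
    return -1;
}

// HLSL-style smoothstep.
inline float Smoothstep(float edge0, float edge1, float x) {
    assert(edge0 < edge1);
    const float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// Self-lighting bias: 0 for hits closer than 1.5 voxel radii, 1 beyond 2.0.
inline float SelfLightingBias(float voxelRadius, float hitTime) {
    return Smoothstep(1.5f * voxelRadius, 2.0f * voxelRadius, hitTime);
}
```

The bias multiplies the sampled lighting, so a hit nearer than 1.5 radii contributes nothing. This is the "over-occlude rather than leak" trade-off noted in the shader comment.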
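The cone trace above samples the VoxelLighting 3D texture, which is also a clipmap. In the data captured by the author its dimensions are 64x256x384; many slices are black, and only a few contain lit pixels, confined to small regions: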
6.5.6.4 CombineLumenSceneLighting
CombineLumenSceneLighting combines the lighting. Its logic is as follows:
void CombineLumenSceneLighting(
FScene* Scene,
FViewInfo& View,
FRDGBuilder& GraphBuilder,
TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
FRDGTextureRef FinalLightingAtlas,
FRDGTextureRef OpacityAtlas,
FRDGTextureRef RadiosityAtlas,
FGlobalShaderMap* GlobalShaderMap,
const FLumenCardScatterContext& VisibleCardScatterContext)
{
LLM_SCOPE_BYTAG(Lumen);
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
{
FLumenCardLightingEmissive* PassParameters = GraphBuilder.AllocParameters<FLumenCardLightingEmissive>();
extern int32 GLumenRadiosityDownsampleFactor;
FVector2D CardUVSamplingOffset = FVector2D::ZeroVector;
if (GLumenRadiosityDownsampleFactor > 1)
{
// Offset bilinear samples in order to not sample outside of the lower res radiosity card bounds
CardUVSamplingOffset.X = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.X;
CardUVSamplingOffset.Y = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.Y;
}
PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ENoAction);
PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
PassParameters->VS.ScatterInstanceIndex = 0;
PassParameters->VS.CardUVSamplingOffset = CardUVSamplingOffset;
PassParameters->PS.View = View.ViewUniformBuffer;
PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
PassParameters->PS.RadiosityAtlas = RadiosityAtlas;
PassParameters->PS.OpacityAtlas = OpacityAtlas;
// Add the lighting combine pass, which uses the conventional raster pipeline.
GraphBuilder.AddPass(
RDG_EVENT_NAME("LightingCombine"),
PassParameters,
ERDGPassFlags::Raster,
[MaxAtlasSize = Scene->LumenSceneData->MaxAtlasSize, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
{
FLumenCardLightingInitializePS::FPermutationDomain PermutationVector;
auto PixelShader = GlobalShaderMap->GetShader< FLumenCardLightingInitializePS >(PermutationVector);
DrawQuadsToAtlas(MaxAtlasSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
});
}
}
This stage takes the scene radiosity atlas from the previous section as input and writes the radiosity color to SceneFinalLighting.
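The only arithmetic in CombineLumenSceneLighting is the UV offset applied when the radiosity atlas is downsampled. A minimal sketch, with the hypothetical function name `CardUVSamplingOffset` and plain integers standing in for GLumenRadiosityDownsampleFactor and MaxAtlasSize:

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Returns the UV offset used when sampling the lower-resolution radiosity
// atlas. In full-resolution atlas UV units, one low-resolution texel spans
// downsampleFactor / atlasSize, so the offset is a quarter of that texel.
inline Vec2 CardUVSamplingOffset(int downsampleFactor, int atlasWidth, int atlasHeight) {
    assert(atlasWidth > 0 && atlasHeight > 0);
    if (downsampleFactor <= 1) return {0.0f, 0.0f};  // no downsampling, no offset
    return {downsampleFactor * 0.25f / atlasWidth,
            downsampleFactor * 0.25f / atlasHeight};
}
```

For a downsample factor of 2 and a 4096x2048 atlas, the offset is 0.5 / 4096 horizontally and 0.5 / 2048 vertically.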
6.5.6.5 RenderDirectLightingForLumenScene
RenderDirectLightingForLumenScene computes the direct lighting of the Lumen scene. Its flow is somewhat similar to conventional lighting:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneDirectLighting.cpp
void FDeferredShadingSceneRenderer::RenderDirectLightingForLumenScene(
FRDGBuilder& GraphBuilder,
TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
FRDGTextureRef FinalLightingAtlas,
FRDGTextureRef OpacityAtlas,
FGlobalShaderMap* GlobalShaderMap,
const FLumenCardScatterContext& VisibleCardScatterContext)
{
LLM_SCOPE_BYTAG(Lumen);
if (GLumenDirectLighting)
{
RDG_EVENT_SCOPE(GraphBuilder, "DirectLighting");
QUICK_SCOPE_CYCLE_COUNTER(RenderDirectLightingForLumenScene);
const FViewInfo& MainView = Views[0];
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(MainView);
FLumenDirectLightingHardwareRayTracingData LumenDirectLightingHardwareRayTracingData;
if(bLumenUseHardwareRayTracedShadow)
{
LumenDirectLightingHardwareRayTracingData.Initialize(GraphBuilder, Scene);
}
TArray<const FLightSceneInfo*, TInlineAllocator<64>> GatheredLocalLights;
// Iterate over all lights in the scene.
for (TSparseArray<FLightSceneInfoCompact>::TConstIterator LightIt(Scene->Lights); LightIt; ++LightIt)
{
const FLightSceneInfoCompact& LightSceneInfoCompact = *LightIt;
const FLightSceneInfo* LightSceneInfo = LightSceneInfoCompact.LightSceneInfo;
if (LightSceneInfo->ShouldRenderLightViewIndependent()
&& LightSceneInfo->ShouldRenderLight(MainView, true)
&& LightSceneInfo->Proxy->GetIndirectLightingScale() > 0.0f)
{
const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
// Directional light
if (LightType == LightType_Directional)
{
// No culling needed; draw directly.
FString LightNameWithLevel;
FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);
// Render the direct light into the Lumen cards.
RenderDirectLightIntoLumenCards(
GraphBuilder,
Scene,
MainView,
ViewFamily.EngineShowFlags,
VisibleLightInfos,
LumenCardSceneUniformBuffer,
FinalLightingAtlas,
OpacityAtlas,
LightSceneInfo,
LightNameWithLevel,
VisibleCardScatterContext,
0,
LumenDirectLightingHardwareRayTracingData,
VirtualShadowMapArray);
}
else // Local (non-directional) light: collect it into GatheredLocalLights.
{
GatheredLocalLights.Add(LightSceneInfo);
}
}
}
const int32 LightBatchSize = FMath::Clamp(GLumenDirectLightingBatchSize, 1, 256);
// Cull and draw the local lights in batches.
for (int32 LightBatchIndex = 0; LightBatchIndex * LightBatchSize < GatheredLocalLights.Num(); ++LightBatchIndex)
{
const int32 FirstLightIndex = LightBatchIndex * LightBatchSize;
const int32 LastLightIndex = FMath::Min((LightBatchIndex + 1) * LightBatchSize, GatheredLocalLights.Num());
FLumenCardScatterContext CardScatterContext;
{
RDG_EVENT_SCOPE(GraphBuilder, "Cull Cards %d Lights", LastLightIndex - FirstLightIndex);
// Initialize the context.
CardScatterContext.Init(
GraphBuilder,
MainView,
LumenSceneData,
LumenCardRenderer,
ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
LightBatchSize);
// Cull the cards against each light's shape.
for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
{
const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];
const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();
ECullCardsShapeType ShapeType = ECullCardsShapeType::None;
if (LightType == LightType_Point)
{
ShapeType = ECullCardsShapeType::PointLight;
}
else if (LightType == LightType_Spot)
{
ShapeType = ECullCardsShapeType::SpotLight;
}
else if (LightType == LightType_Rect)
{
ShapeType = ECullCardsShapeType::RectLight;
}
else
{
ensureMsgf(false, TEXT("Need Lumen card culling for new light type"));
}
FCullCardsShapeParameters ShapeParameters;
ShapeParameters.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
ShapeParameters.LightPosition = LightSceneInfo->Proxy->GetPosition();
ShapeParameters.LightDirection = LightSceneInfo->Proxy->GetDirection();
ShapeParameters.LightRadius = LightSceneInfo->Proxy->GetRadius();
ShapeParameters.CosConeAngle = FMath::Cos(LightSceneInfo->Proxy->GetOuterConeAngle());
ShapeParameters.SinConeAngle = FMath::Sin(LightSceneInfo->Proxy->GetOuterConeAngle());
// Cull the cards against the light shape.
CardScatterContext.CullCardsToShape(
GraphBuilder,
MainView,
LumenSceneData,
LumenCardRenderer,
LumenCardSceneUniformBuffer,
ShapeType,
ShapeParameters,
GLumenSceneCardDirectLightingUpdateFrequencyScale,
ScatterInstanceIndex);
}
// Build the indirect draw arguments for the scatter.
CardScatterContext.BuildScatterIndirectArgs(
GraphBuilder,
MainView);
}
// Draw the local (non-directional) lights.
{
RDG_EVENT_SCOPE(GraphBuilder, "Draw %d Lights", LastLightIndex - FirstLightIndex);
for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
{
const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];
FString LightNameWithLevel;
FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);
// Draw the local light into the Lumen cards.
RenderDirectLightIntoLumenCards(
GraphBuilder,
Scene,
MainView,
ViewFamily.EngineShowFlags,
VisibleLightInfos,
LumenCardSceneUniformBuffer,
FinalLightingAtlas,
OpacityAtlas,
LightSceneInfo,
LightNameWithLevel,
CardScatterContext,
ScatterInstanceIndex,
LumenDirectLightingHardwareRayTracingData,
VirtualShadowMapArray);
}
}
}
}
}
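The batching of local lights in the function above reduces to a simple partition of an index range. The sketch below reproduces just that partition. `LightBatch` and `MakeLightBatches` are hypothetical names, and the 1..256 clamp is taken from the FMath::Clamp call on GLumenDirectLightingBatchSize.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct LightBatch { int first; int last; };  // half-open range [first, last)

// Splits numLocalLights into consecutive batches, clamping the requested
// batch size to [1, 256] as the engine code does.
inline std::vector<LightBatch> MakeLightBatches(int numLocalLights, int requestedBatchSize) {
    assert(numLocalLights >= 0);
    const int batchSize = std::min(std::max(requestedBatchSize, 1), 256);
    std::vector<LightBatch> batches;
    for (int i = 0; i * batchSize < numLocalLights; ++i) {
        batches.push_back({i * batchSize,
                           std::min((i + 1) * batchSize, numLocalLights)});
    }
    return batches;
}
```

Within a batch, each light's ScatterInstanceIndex is its index minus the batch's `first`, so one card scatter context can serve every light in the batch: all lights are culled first, then all are drawn.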
Below is the code of RenderDirectLightIntoLumenCards, which draws a single light:
void RenderDirectLightIntoLumenCards(
FRDGBuilder& GraphBuilder,
const FScene* Scene,
const FViewInfo& View,
const FEngineShowFlags& EngineShowFlags,
TArray<FVisibleLightInfo, SceneRenderingAllocator>& VisibleLightInfos,
TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
FRDGTextureRef FinalLightingAtlas,
FRDGTextureRef OpacityAtlas,
const FLightSceneInfo* LightSceneInfo,
const FString& LightName,
const FLumenCardScatterContext& CardScatterContext,
int32 ScatterInstanceIndex,
FLumenDirectLightingHardwareRayTracingData& LumenDirectLightingHardwareRayTracingData,
const FVirtualShadowMapArray& VirtualShadowMapArray)
{
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();
const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
bool bShadowed = LightSceneInfo->Proxy->CastsDynamicShadow();
// Convert the light type.
ELumenLightType LumenLightType = ELumenLightType::MAX;
{
switch (LightType)
{
case LightType_Directional: LumenLightType = ELumenLightType::Directional; break;
case LightType_Point: LumenLightType = ELumenLightType::Point; break;
case LightType_Spot: LumenLightType = ELumenLightType::Spot; break;
case LightType_Rect: LumenLightType = ELumenLightType::Rect; break;
}
check(LumenLightType != ELumenLightType::MAX);
}
// Set up the shadow information.
FVisibleLightInfo& VisibleLightInfo = VisibleLightInfos[LightSceneInfo->Id];
FLumenShadowSetup ShadowSetup = GetShadowForLumenDirectLighting(VisibleLightInfo);
const bool bDynamicallyShadowed = ShadowSetup.DenseShadowMap != nullptr;
FDistanceFieldObjectBufferParameters ObjectBufferParameters = DistanceField::SetupObjectBufferParameters(Scene->DistanceFieldSceneData);
FLightTileIntersectionParameters LightTileIntersectionParameters;
FDistanceFieldCulledObjectBufferParameters CulledObjectBufferParameters;
FMatrix WorldToMeshSDFShadowValue = FMatrix::Identity;
const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(View) && bShadowed;
const bool bTraceMeshSDFs = bShadowed
&& LumenLightType == ELumenLightType::Directional
&& DoesPlatformSupportDistanceFieldShadowing(View.GetShaderPlatform())
&& GLumenDirectLightingOffscreenShadowingTraceMeshSDFs != 0
&& Lumen::UseMeshSDFTracing()
&& ObjectBufferParameters.NumSceneObjects > 0;
// Resolve the virtual shadow map ID.
int32 VirtualShadowMapId = -1;
if (bDynamicallyShadowed
&& !bLumenUseHardwareRayTracedShadow
&& GLumenDirectLightingVirtualShadowMap != 0
&& VirtualShadowMapArray.IsAllocated())
{
if (LightType == LightType_Directional)
{
VirtualShadowMapId = VisibleLightInfo.VirtualShadowMapClipmaps[0]->GetVirtualShadowMap()->ID;
}
else if (ShadowSetup.VirtualShadowMap)
{
VirtualShadowMapId = ShadowSetup.VirtualShadowMap->VirtualShadowMaps[0]->ID;
}
}
const bool bUseVirtualShadowMap = VirtualShadowMapId >= 0;
if (!bUseVirtualShadowMap)
{
// Fallback to a complete shadow map
ShadowSetup.VirtualShadowMap = nullptr;
ShadowSetup.DenseShadowMap = GetShadowForInjectionIntoVolumetricFog(VisibleLightInfo);
}
if (bLumenUseHardwareRayTracedShadow)
{
RenderHardwareRayTracedShadowIntoLumenCards(
GraphBuilder, Scene, View, LumenCardSceneUniformBuffer, OpacityAtlas,
LightSceneInfo, LightName, CardScatterContext, ScatterInstanceIndex,
LumenDirectLightingHardwareRayTracingData, bDynamicallyShadowed, LumenLightType);
}
else if (bTraceMeshSDFs)
{
CullMeshSDFsForLightCards(GraphBuilder, Scene, View, LightSceneInfo, ObjectBufferParameters, WorldToMeshSDFShadowValue, CulledObjectBufferParameters, LightTileIntersectionParameters);
}
FLumenCardDirectLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardDirectLighting>();
{
PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ELoad);
PassParameters->VS.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
PassParameters->VS.CardScatterParameters = CardScatterContext.Parameters;
PassParameters->VS.ScatterInstanceIndex = ScatterInstanceIndex;
PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;
// Get the volume shadowing shader parameters.
GetVolumeShadowingShaderParameters(
GraphBuilder,
View,
LightSceneInfo,
ShadowSetup.DenseShadowMap,
0,
bDynamicallyShadowed,
PassParameters->PS.VolumeShadowingShaderParameters);
// Light uniform buffer.
FDeferredLightUniformStruct DeferredLightUniforms = GetDeferredLightParameters(View, *LightSceneInfo);
if (LightSceneInfo->Proxy->IsInverseSquared())
{
DeferredLightUniforms.LightParameters.FalloffExponent = 0;
}
PassParameters->PS.View = View.ViewUniformBuffer;
PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
PassParameters->PS.OpacityAtlas = OpacityAtlas;
DeferredLightUniforms.LightParameters.Color *= LightSceneInfo->Proxy->GetIndirectLightingScale();
PassParameters->PS.DeferredLightUniforms = CreateUniformBufferImmediate(DeferredLightUniforms, UniformBuffer_SingleDraw);
PassParameters->PS.ForwardLightData = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
SetupLightFunctionParameters(LightSceneInfo, 1.0f, PassParameters->PS.LightFunctionParameters);
PassParameters->PS.VirtualShadowMapId = VirtualShadowMapId;
if (bUseVirtualShadowMap)
{
PassParameters->PS.VirtualShadowMapSamplingParameters = VirtualShadowMapArray.GetSamplingParameters(GraphBuilder);
}
PassParameters->PS.ObjectBufferParameters = ObjectBufferParameters;
PassParameters->PS.CulledObjectBufferParameters = CulledObjectBufferParameters;
PassParameters->PS.LightTileIntersectionParameters = LightTileIntersectionParameters;
FDistanceFieldAtlasParameters DistanceFieldAtlasParameters = DistanceField::SetupAtlasParameters(Scene->DistanceFieldSceneData);
// Distance field atlas.
PassParameters->PS.DistanceFieldAtlasParameters = DistanceFieldAtlasParameters;
PassParameters->PS.WorldToShadow = WorldToMeshSDFShadowValue;
extern float GTwoSidedMeshDistanceBias;
PassParameters->PS.TwoSidedMeshDistanceBias = GTwoSidedMeshDistanceBias;
PassParameters->PS.TanLightSourceAngle = FMath::Tan(LightSceneInfo->Proxy->GetLightSourceAngle());
PassParameters->PS.MaxTraceDistance = GOffscreenShadowingMaxTraceDistance;
PassParameters->PS.StepFactor = FMath::Clamp(GOffscreenShadowingTraceStepFactor, .1f, 10.0f);
PassParameters->PS.SurfaceBias = FMath::Clamp(GShadowingSurfaceBias, .01f, 100.0f);
PassParameters->PS.SlopeScaledSurfaceBias = FMath::Clamp(GShadowingSlopeScaledSurfaceBias, .01f, 100.0f);
PassParameters->PS.SDFSurfaceBiasScale = FMath::Clamp(GOffscreenShadowingSDFSurfaceBiasScale, .01f, 100.0f);
PassParameters->PS.VirtualShadowMapSurfaceBias = FMath::Clamp(GLumenDirectLightingVirtualShadowMapBias, .01f, 100.0f);
PassParameters->PS.ForceOffscreenShadowing = GLumenDirectLightingForceOffscreenShadowing;
if (bLumenUseHardwareRayTracedShadow)
{
PassParameters->PS.ShadowMaskAtlas = LumenDirectLightingHardwareRayTracingData.ShadowMaskAtlas;
}
// IES
{
FTexture* IESTextureResource = LightSceneInfo->Proxy->GetIESTextureResource();
if (View.Family->EngineShowFlags.TexturedLightProfiles && IESTextureResource)
{
PassParameters->PS.UseIESProfile = 1;
PassParameters->PS.IESTexture = IESTextureResource->TextureRHI;
}
else
{
PassParameters->PS.UseIESProfile = 0;
PassParameters->PS.IESTexture = GWhiteTexture->TextureRHI;
}
PassParameters->PS.IESTextureSampler = TStaticSamplerState<SF_Bilinear,AM_Clamp,AM_Clamp,AM_Clamp>::GetRHI();
}
}
FRasterizeToCardsVS::FPermutationDomain VSPermutationVector;
VSPermutationVector.Set< FRasterizeToCardsVS::FClampToInfluenceSphere >(LightType != LightType_Directional);
auto VertexShader = View.ShaderMap->GetShader<FRasterizeToCardsVS>(VSPermutationVector);
const FMaterialRenderProxy* LightFunctionMaterialProxy = LightSceneInfo->Proxy->GetLightFunctionMaterial();
bool bUseLightFunction = true;
if (!LightFunctionMaterialProxy
|| !LightFunctionMaterialProxy->GetIncompleteMaterialWithFallback(Scene->GetFeatureLevel()).IsLightFunction()
|| !EngineShowFlags.LightFunctions)
{
bUseLightFunction = false;
LightFunctionMaterialProxy = UMaterial::GetDefaultMaterial(MD_LightFunction)->GetRenderProxy();
}
const bool bUseCloudTransmittance = SetupLightCloudTransmittanceParameters(Scene, View, GLumenDirectLightingCloudTransmittance != 0 ? LightSceneInfo : nullptr, PassParameters->PS.LightCloudTransmittanceParameters);
// Set up the shader permutation.
FLumenCardDirectLightingPS::FPermutationDomain PermutationVector;
PermutationVector.Set< FLumenCardDirectLightingPS::FLightType >(LumenLightType);
PermutationVector.Set< FLumenCardDirectLightingPS::FDynamicallyShadowed >(bDynamicallyShadowed);
PermutationVector.Set< FLumenCardDirectLightingPS::FShadowed >(bShadowed);
PermutationVector.Set< FLumenCardDirectLightingPS::FTraceMeshSDFs >(bTraceMeshSDFs);
PermutationVector.Set< FLumenCardDirectLightingPS::FVirtualShadowMap >(bUseVirtualShadowMap);
PermutationVector.Set< FLumenCardDirectLightingPS::FLightFunction >(bUseLightFunction);
PermutationVector.Set< FLumenCardDirectLightingPS::FRayTracingShadowPassCombine>(bLumenUseHardwareRayTracedShadow);
PermutationVector.Set< FLumenCardDirectLightingPS::FCloudTransmittance >(bUseCloudTransmittance);
PermutationVector = FLumenCardDirectLightingPS::RemapPermutation(PermutationVector);
const FMaterial& Material = LightFunctionMaterialProxy->GetMaterialWithFallback(Scene->GetFeatureLevel(), LightFunctionMaterialProxy);
const FMaterialShaderMap* MaterialShaderMap = Material.GetRenderingThreadShaderMap();
auto PixelShader = MaterialShaderMap->GetShader<FLumenCardDirectLightingPS>(PermutationVector);
ClearUnusedGraphResources(PixelShader, &PassParameters->PS);
const uint32 CardIndirectArgOffset = CardScatterContext.GetIndirectArgOffset(ScatterInstanceIndex);
// Lighting draw pass.
GraphBuilder.AddPass(
RDG_EVENT_NAME("%s %s", *LightName, bDynamicallyShadowed ? TEXT("Shadowmap") : TEXT("")),
PassParameters,
ERDGPassFlags::Raster,
[MaxAtlasSize = LumenSceneData.MaxAtlasSize, PassParameters, LightSceneInfo, VertexShader, PixelShader, GlobalShaderMap = View.ShaderMap, LightFunctionMaterialProxy, &Material, &View, CardIndirectArgOffset](FRHICommandListImmediate& RHICmdList)
{
DrawQuadsToAtlas(
MaxAtlasSize,
VertexShader,
PixelShader,
PassParameters,
GlobalShaderMap,
TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One>::GetRHI(),
RHICmdList,
[LightFunctionMaterialProxy, &Material, &View](FRHICommandListImmediate& RHICmdList, TShaderRefBase<FLumenCardDirectLightingPS, FShaderMapPointerTable> Shader, FRHIPixelShader* ShaderRHI, const FLumenCardDirectLightingPS::FParameters& Parameters)
{
Shader->SetParameters(RHICmdList, ShaderRHI, LightFunctionMaterialProxy, Material, View);
},
CardIndirectArgOffset);
});
}
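The shadowing decisions in RenderDirectLightIntoLumenCards are spread over several booleans, so it can help to see them condensed. The sketch below is a simplified reading of that function, not engine code: `ShadowInputs`, `ShadowPlan` and `PlanShadows` are hypothetical, and several engine conditions (console variables, platform checks, allocation checks) are folded into single flags.

```cpp
#include <cassert>

struct ShadowInputs {
    bool castsDynamicShadow;       // bShadowed
    bool hardwareRayTracing;       // Lumen::UseHardwareRayTracedShadows(View)
    bool dynamicallyShadowed;      // ShadowSetup.DenseShadowMap != nullptr
    bool virtualShadowMapReady;    // cvar on, array allocated, map available for this light
    bool isDirectional;            // LightType_Directional
    bool meshSdfTracingAvailable;  // platform support, cvar, UseMeshSDFTracing, objects > 0
};

struct ShadowPlan {
    bool useHardwareRayTracing;      // shadow mask comes from the ray traced pass
    bool useVirtualShadowMap;        // VirtualShadowMapId >= 0
    bool useDenseShadowMapFallback;  // fall back to a complete shadow map
    bool cullMeshSdfs;               // CullMeshSDFsForLightCards is called
};

inline ShadowPlan PlanShadows(const ShadowInputs& in) {
    ShadowPlan plan{};
    plan.useHardwareRayTracing = in.hardwareRayTracing && in.castsDynamicShadow;
    // A virtual shadow map is only used without hardware ray traced shadows.
    plan.useVirtualShadowMap = in.dynamicallyShadowed &&
                               !plan.useHardwareRayTracing &&
                               in.virtualShadowMapReady;
    plan.useDenseShadowMapFallback = !plan.useVirtualShadowMap;
    // Mesh SDF tracing applies to directional lights only, and the culling
    // call sits in the `else if` branch after the hardware ray tracing path.
    const bool traceMeshSdfs = in.castsDynamicShadow && in.isDirectional &&
                               in.meshSdfTracingAvailable;
    plan.cullMeshSdfs = traceMeshSdfs && !plan.useHardwareRayTracing;
    return plan;
}
```

Read this way, hardware ray traced shadows take priority, virtual shadow maps come next, and the dense shadow map is the fallback whenever no virtual shadow map ID is resolved.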
A frame capture of the direct lighting stage shows the following flow:
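The textures fed into the lighting computation vary with the light type, but every light type receives depth, normal, opacity and similar data. The difference is that local (non-directional) lights also receive distance field textures and a 16x16x16 Perlin noise 3D texture, whereas directional lights receive a 128x128x128 3D texture named VolumeTexture (the figure below shows slice 0 magnified 4x):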
After the lighting computation, the output looks like this:
The pixel shader used for the direct lighting computation is as follows:
// Engine\Shaders\Private\Lumen\LumenSceneDirectLighting.usf
void LumenCardDirectLightingPS(
FCardVSToPS CardInterpolants,
out float4 OutColor : SV_Target0)
{
float Opacity = Texture2DSampleLevel(OpacityAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;
float3 Irradiance = 0;
if (Opacity > 0)
{
// Build the light data.
FDeferredLightData LightData;
{
LightData.Position = DeferredLightUniforms.Position;
LightData.InvRadius = DeferredLightUniforms.InvRadius;
LightData.Color = DeferredLightUniforms.Color;
LightData.FalloffExponent = DeferredLightUniforms.FalloffExponent;
LightData.Direction = DeferredLightUniforms.Direction;
LightData.Tangent = DeferredLightUniforms.Tangent;
LightData.SpotAngles = DeferredLightUniforms.SpotAngles;
LightData.SourceRadius = DeferredLightUniforms.SourceRadius;
LightData.SourceLength = DeferredLightUniforms.SourceLength;
LightData.SoftSourceRadius = DeferredLightUniforms.SoftSourceRadius;
LightData.SpecularScale = DeferredLightUniforms.SpecularScale;
LightData.ContactShadowLength = abs(DeferredLightUniforms.ContactShadowLength);
LightData.ContactShadowLengthInWS = DeferredLightUniforms.ContactShadowLength < 0.0f;
LightData.DistanceFadeMAD = DeferredLightUniforms.DistanceFadeMAD;
LightData.ShadowMapChannelMask = DeferredLightUniforms.ShadowMapChannelMask;
LightData.ShadowedBits = DeferredLightUniforms.ShadowedBits;
LightData.RectLightBarnCosAngle = DeferredLightUniforms.RectLightBarnCosAngle;
LightData.RectLightBarnLength = DeferredLightUniforms.RectLightBarnLength;
LightData.bInverseSquared = LightData.FalloffExponent == 0.0f;
LightData.bRadialLight = LIGHT_TYPE != LIGHT_TYPE_DIRECTIONAL;
LightData.bSpotLight = LIGHT_TYPE == LIGHT_TYPE_SPOT;
LightData.bRectLight = LIGHT_TYPE == LIGHT_TYPE_RECT;
}
// Fetch the Lumen card data.
FLumenCardData LumenCardData = GetLumenCardData(CardInterpolants.CardId);
float Depth = 1.0f - Texture2DSampleLevel(LumenCardScene.DepthAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;
// Compute the position.
float3 LocalPosition;
LocalPosition.xy = (CardInterpolants.AtlasCoord - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;
float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;
float3 LightColor = DeferredLightUniforms.Color;
float3 L = LightData.Direction;
float3 ToLight = L;
// Compute the light attenuation.
#if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
float CombinedAttenuation = 1;
#else
float LightMask = 1;
if (LightData.bRadialLight)
{
LightMask = GetLocalLightAttenuation(WorldPosition, LightData, ToLight, L);
}
float Attenuation;
if (LightData.bRectLight)
{
FRect Rect = GetRect(ToLight, LightData);
FRectTexture RectTexture = InitRectTexture(DeferredLightUniforms.SourceTexture);
Attenuation = IntegrateLight(Rect, RectTexture);
}
else
{
FCapsuleLight Capsule = GetCapsule(ToLight, LightData);
Capsule.DistBiasSqr = 0;
Attenuation = IntegrateLight(Capsule, LightData.bInverseSquared);
}
float CombinedAttenuation = Attenuation * LightMask;
#endif
if (CombinedAttenuation > 0)
{
float3 WorldNormal = Texture2DSampleLevel(LumenCardScene.NormalAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).xyz * 2 - 1;
// Only surfaces facing the light are lit.
if (dot(WorldNormal, L) > 0)
{
float ShadowFactor = 1.0f;
#if SHADOWED_LIGHT // Shadowed light
{
// Hardware ray traced shadows
#if HARDWARE_RAYTRACING_SHADOW_PASS_COMBINE
{
float2 AtlasTextureSize = LumenCardScene.AtlasSize;
uint2 Pos2D = CardInterpolants.AtlasCoord * AtlasTextureSize.xy - float2(0.5, 0.5) / AtlasTextureSize.xy;
ShadowFactor = ShadowMaskAtlas.Load(uint3(Pos2D, 0));
}
#else // Shadows without hardware ray tracing
{
bool bShadowFactorComplete = false;
bool bVSMValid = false;
// Use virtual shadow maps
#if VIRTUAL_SHADOW_MAP
{
// Bias only ray start to maximize chances of hitting an allocated page
FVirtualShadowMapSampleResult VirtualShadowMapSample = SampleVirtualShadowMap(VirtualShadowMapId, WorldPosition, VirtualShadowMapSurfaceBias, WorldNormal);
bVSMValid = VirtualShadowMapSample.bValid;
bShadowFactorComplete = VirtualShadowMapSample.bValid && VirtualShadowMapSample.bOccluded;
ShadowFactor = VirtualShadowMapSample.ShadowFactor;
}
#endif
// Compute the shadow factor (ShadowFactor).
if (!bShadowFactorComplete)
{
float3 WorldPositionForShadowing = GetWorldPositionForShadowing(WorldPosition, L, WorldNormal, 1.0f);
#if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
{
#if DYNAMICALLY_SHADOWED
float SceneDepth = dot(WorldPositionForShadowing - View.WorldCameraOrigin, View.ViewForward);
bool bShadowingFromValidUVArea = false;
float NewShadowFactor = ComputeDirectionalLightDynamicShadowing(WorldPositionForShadowing, SceneDepth, bShadowingFromValidUVArea);
float4 PostProjectionPosition = mul(float4(WorldPosition, 1.0), View.WorldToClip);
// CSM's are culled so only query points inside the view are valid
float2 ValidTexelSize = float2(length(ddx(WorldPosition)), length(ddy(WorldPosition))) * 2;
if (bShadowingFromValidUVArea && all(PostProjectionPosition.xy - ValidTexelSize < PostProjectionPosition.w && PostProjectionPosition.xy + ValidTexelSize > -PostProjectionPosition.w))
{
ShadowFactor *= NewShadowFactor;
bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
}
#endif
}
#else
{
bool bShadowingFromValidUVArea = false;
float NewShadowFactor = ComputeVolumeShadowing(WorldPositionForShadowing, LightData.bRadialLight && !LightData.bSpotLight, LightData.bSpotLight, bShadowingFromValidUVArea);
if (bShadowingFromValidUVArea)
{
ShadowFactor *= NewShadowFactor;
bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
}
}
#endif
}
// Handle offscreen shadowing.
bool bOffscreenShadowing = !bShadowFactorComplete;
if (ForceOffscreenShadowing != 0)
{
ShadowFactor = 1.0;
bOffscreenShadowing = true;
}
if (bOffscreenShadowing)
{
ShadowFactor *= TraceOffscreenShadows(WorldPosition, L, ToLight, WorldNormal);
}
}
#endif // End hardware/software shadow selection
}
#endif // End ShadowLight
// Light function
#if LIGHT_FUNCTION
ShadowFactor *= GetLightFunction(WorldPosition);
#endif
// Cloud transmittance
#if USE_CLOUD_TRANSMITTANCE
{
float OutOpticalDepth = 0.0f;
ShadowFactor *= lerp(1.0f, GetCloudVolumetricShadow(WorldPosition, CloudShadowmapWorldToLightClipMatrix, CloudShadowmapFarDepthKm, CloudShadowmapTexture, CloudShadowmapSampler, OutOpticalDepth), CloudShadowmapStrength);
}
#endif
// IES
if (UseIESProfile > 0)
{
ShadowFactor *= ComputeLightProfileMultiplier(WorldPosition, DeferredLightUniforms.Position, -DeferredLightUniforms.Direction, DeferredLightUniforms.Tangent);
}
// Final irradiance
float NoL = saturate(dot(WorldNormal, L));
Irradiance = LightColor * (CombinedAttenuation * NoL * ShadowFactor);
//Irradiance = bShadowFactorValid ? float3(0, 1, 0) : float3(0.2f, 0.0f, 0.0f);
}
}
}
OutColor = float4(Irradiance, 0);
}
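To make the arithmetic of the shader above easier to follow, the sketch below redoes two of its steps on the CPU in plain C++: reconstructing the card-local position from an atlas coordinate and a sampled depth, and combining attenuation, N·L and the shadow factor into the final irradiance. The types Float3 and CardSample and both function names are hypothetical stand-ins rather than engine API; only the formulas are taken from the listing.

```cpp
#include <algorithm>
#include <cassert>

struct Float3 { float x, y, z; };

// Hypothetical subset of the card data that the shader reads.
struct CardSample {
    float uvScaleX, uvScaleY; // stands in for LocalPositionToAtlasUVScale
    float uvBiasX, uvBiasY;   // stands in for LocalPositionToAtlasUVBias
    float localExtentZ;       // stands in for LocalExtent.z
};

inline float Dot(const Float3& a, const Float3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Mirrors the LocalPosition computation. As in the listing, the sampled
// atlas value is inverted (1 - sample) before it is used as depth.
Float3 ReconstructLocalPosition(const CardSample& card, float atlasU, float atlasV,
                                float sampledDepth) {
    assert(card.uvScaleX != 0.0f && card.uvScaleY != 0.0f);
    const float depth = 1.0f - sampledDepth;
    Float3 p;
    p.x = (atlasU - card.uvBiasX) / card.uvScaleX;
    p.y = (atlasV - card.uvBiasY) / card.uvScaleY;
    // Depth in [0, 1] maps to [-extent, +extent] along the card's local Z axis.
    p.z = -card.localExtentZ + depth * 2.0f * card.localExtentZ;
    return p;
}

// Mirrors the last step: Irradiance = LightColor * (Attenuation * NoL * ShadowFactor).
// Texels that receive no light or face away from the light stay black.
Float3 ComputeIrradiance(const Float3& lightColor, const Float3& normal,
                         const Float3& toLight, float attenuation, float shadowFactor) {
    const float rawNoL = Dot(normal, toLight);
    if (attenuation <= 0.0f || rawNoL <= 0.0f) {
        return Float3{0.0f, 0.0f, 0.0f};
    }
    const float noL = std::min(rawNoL, 1.0f); // saturate(), the lower bound is handled above
    const float scale = attenuation * noL * shadowFactor;
    return Float3{lightColor.x * scale, lightColor.y * scale, lightColor.z * scale};
}
```

For instance, with a UV scale of 0.5, a bias of 0.25 and a Z extent of 100, the atlas coordinate (0.75, 0.25) with a sampled depth of 0.25 maps to the local position (1, 0, 50).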
6.5.6.6 PrefilterLumenSceneLighting
This process is similar to the Geometry Prefiltering mentioned in 6.5.6.1 Voxel Cone Tracing:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScenePrefilter.cpp
void FDeferredShadingSceneRenderer::PrefilterLumenSceneLighting(
FRDGBuilder& GraphBuilder,
const FViewInfo& View,
FLumenCardTracingInputs& TracingInputs,
FGlobalShaderMap* GlobalShaderMap,
const FLumenCardScatterContext& VisibleCardScatterContext)
{
LLM_SCOPE_BYTAG(Lumen);
RDG_EVENT_SCOPE(GraphBuilder, "Prefilter");
FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
// Compute the number of mips from the atlas resolution.
const int32 NumMips = FMath::CeilLogTwo(FMath::Max(LumenSceneData.MaxAtlasSize.X, LumenSceneData.MaxAtlasSize.Y)) + 1;
{
FIntPoint SrcSize = LumenSceneData.MaxAtlasSize;
FIntPoint DestSize = SrcSize / 2;
// Loop NumMips - 1 times (mip 0 is the source texture itself), generating one mip per iteration.
for (int32 MipIndex = 1; MipIndex < NumMips; MipIndex++)
{
SrcSize.X = FMath::Max(SrcSize.X, 1);
SrcSize.Y = FMath::Max(SrcSize.Y, 1);
DestSize.X = FMath::Max(DestSize.X, 1);
DestSize.Y = FMath::Max(DestSize.Y, 1);
FLumenCardPrefilterLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardPrefilterLighting>();
// Bind up to three render targets: the final lighting atlas, the irradiance atlas and the indirect irradiance atlas.
PassParameters->RenderTargets[0] = FRenderTargetBinding(TracingInputs.FinalLightingAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
bool bUseIrradianceAtlas = Lumen::UseIrradianceAtlas(View);
bool bUseIndirectIrradianceAtlas = Lumen::UseIndirectIrradianceAtlas(View);
if (bUseIrradianceAtlas)
{
PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
if (bUseIndirectIrradianceAtlas)
{
PassParameters->RenderTargets[2] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
}
}
else if (bUseIndirectIrradianceAtlas)
{
PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
}
PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
PassParameters->VS.ScatterInstanceIndex = 0;
PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;
PassParameters->PS.View = View.ViewUniformBuffer;
PassParameters->PS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
PassParameters->PS.ParentFinalLightingAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.FinalLightingAtlas, MipIndex - 1));
// Note that the SRVs are created with CreateForMipLevel, so each one exposes only the parent mip (MipIndex - 1).
if (bUseIrradianceAtlas)
{
PassParameters->PS.ParentIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IrradianceAtlas, MipIndex - 1));
}
if (bUseIndirectIrradianceAtlas)
{
PassParameters->PS.ParentIndirectIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IndirectIrradianceAtlas, MipIndex - 1));
}
PassParameters->PS.InvSize = FVector2D(1.0f / SrcSize.X, 1.0f / SrcSize.Y);
FScene* LocalScene = Scene;
// Add the prefilter pass.
GraphBuilder.AddPass(
RDG_EVENT_NAME("PrefilterMip"),
PassParameters,
ERDGPassFlags::Raster,
[LocalScene, PassParameters, DestSize, GlobalShaderMap, bUseIrradianceAtlas, bUseIndirectIrradianceAtlas](FRHICommandListImmediate& RHICmdList)
{
FLumenCardPrefilterLightingPS::FPermutationDomain PermutationVector;
PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIrradianceAtlas>(bUseIrradianceAtlas != 0);
PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIndirectIrradianceAtlas>(bUseIndirectIrradianceAtlas != 0);
auto PixelShader = GlobalShaderMap->GetShader< FLumenCardPrefilterLightingPS >(PermutationVector);
DrawQuadsToAtlas(DestSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
});
SrcSize /= 2;
DestSize /= 2;
}
}
}
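The mip bookkeeping in PrefilterLumenSceneLighting can be replayed in isolation. The following is a minimal sketch, assuming nothing beyond the loop shown above: CeilLogTwo here is a local reimplementation (not the engine's FMath::CeilLogTwo), and PrefilterPass and BuildPrefilterPasses are hypothetical names used only to record which source and destination sizes each pass would see.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest n such that 2^n >= value, for value >= 1.
std::uint32_t CeilLogTwo(std::uint32_t value) {
    assert(value >= 1 && value <= (1u << 31));
    std::uint32_t n = 0;
    while ((1u << n) < value) {
        ++n;
    }
    return n;
}

// Hypothetical record of one PrefilterMip pass.
struct PrefilterPass {
    int mipIndex;
    int srcWidth, srcHeight; // size of the parent mip that is sampled
    int dstWidth, dstHeight; // size of the mip that is rendered
};

// Replays the size bookkeeping of the loop above without touching the GPU.
std::vector<PrefilterPass> BuildPrefilterPasses(int atlasWidth, int atlasHeight) {
    assert(atlasWidth >= 1 && atlasHeight >= 1);
    const int largest = std::max(atlasWidth, atlasHeight);
    const int numMips = static_cast<int>(CeilLogTwo(static_cast<std::uint32_t>(largest))) + 1;

    std::vector<PrefilterPass> passes;
    int srcW = atlasWidth, srcH = atlasHeight;
    int dstW = srcW / 2, dstH = srcH / 2;
    for (int mip = 1; mip < numMips; ++mip) {
        // Clamp to one texel so that non-square atlases keep a valid size.
        srcW = std::max(srcW, 1);
        srcH = std::max(srcH, 1);
        dstW = std::max(dstW, 1);
        dstH = std::max(dstH, 1);
        passes.push_back({mip, srcW, srcH, dstW, dstH});
        srcW /= 2;
        srcH /= 2;
        dstW /= 2;
        dstH /= 2;
    }
    return passes;
}
```

For an assumed 4096x4096 atlas this gives 13 mip levels and therefore 12 passes, the first rendering 2048x2048 from 4096x4096 and the last rendering 1x1 from 2x2.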
The shader used by this pass is as follows:
// Engine\Shaders\Private\Lumen\LumenSceneLighting.usf
Texture2D ParentFinalLightingAtlas;
Texture2D ParentIrradianceAtlas;
Texture2D ParentIndirectIrradianceAtlas;
void LumenCardPrefilterLightingPS(
FCardVSToPS CardInterpolants,
out float4 OutLighting : SV_Target0,
out float4 OutColor1 : SV_Target1,
out float4 OutColor2 : SV_Target2)
{
// Bilinear filtering of the parent mip directly yields this mip level's color; no Gaussian weights are used, unlike in section 6.5.6.1.
OutLighting = Texture2DSampleLevel(ParentFinalLightingAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#if USE_IRRADIANCE_ATLAS
OutColor1 = Texture2DSampleLevel(ParentIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#if USE_INDIRECTIRRADIANCE_ATLAS
OutColor2 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#endif
#elif USE_INDIRECTIRRADIANCE_ATLAS
OutColor1 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#endif
}
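Why a single bilinear fetch is enough for this prefilter can be checked on the CPU. When the destination mip is half the size of its parent, the center of each destination texel lies exactly on the corner shared by four parent texels, so a bilinear sample returns their average, which is a 2x2 box filter. The sketch below is a simplified single-channel model with hypothetical names (Image, SampleBilinearClamped, DownsampleHalf); it ignores the per-card quads that DrawQuadsToAtlas actually renders.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Single-channel image, row-major. A stand-in for one mip of an atlas.
struct Image {
    int width;
    int height;
    std::vector<float> texels;

    // Clamped texel fetch, similar in spirit to a clamp address mode.
    float At(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Bilinear sample at normalized coordinates (u, v), texel centers at +0.5.
float SampleBilinearClamped(const Image& image, float u, float v) {
    const float x = u * image.width - 0.5f;
    const float y = v * image.height - 0.5f;
    const float fx0 = std::floor(x);
    const float fy0 = std::floor(y);
    const int x0 = static_cast<int>(fx0);
    const int y0 = static_cast<int>(fy0);
    const float tx = x - fx0;
    const float ty = y - fy0;
    const float top = image.At(x0, y0) * (1.0f - tx) + image.At(x0 + 1, y0) * tx;
    const float bottom = image.At(x0, y0 + 1) * (1.0f - tx) + image.At(x0 + 1, y0 + 1) * tx;
    return top * (1.0f - ty) + bottom * ty;
}

// Builds the next mip by sampling the parent once per destination texel center.
Image DownsampleHalf(const Image& parent) {
    assert(parent.width >= 1 && parent.height >= 1);
    assert(parent.texels.size() == static_cast<std::size_t>(parent.width) * parent.height);
    Image child;
    child.width = std::max(parent.width / 2, 1);
    child.height = std::max(parent.height / 2, 1);
    child.texels.resize(static_cast<std::size_t>(child.width) * child.height);
    for (int y = 0; y < child.height; ++y) {
        for (int x = 0; x < child.width; ++x) {
            const float u = (x + 0.5f) / child.width;
            const float v = (y + 0.5f) / child.height;
            child.texels[static_cast<std::size_t>(y) * child.width + x] =
                SampleBilinearClamped(parent, u, v);
        }
    }
    return child;
}
```

Downsampling a 4x4 image holding the values 0 to 15 in row-major order produces a 2x2 image whose first texel is the average of 0, 1, 4 and 5.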
A frame capture shows that the number of PrefilterMip passes matches the texture's mip levels, one pass for each mip below level 0:
6.5.6.7 ComputeLumenSceneVoxelLighting
The main job of ComputeLumenSceneVoxelLighting is to compute the voxel lighting of the Lumen scene. The code is as follows:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenVoxelLighting.cpp
void FDeferredShadingSceneRenderer::ComputeLumenSceneVoxelLighting(
FRDGBuilder& GraphBuilder,
FLumenCardTracingInputs& TracingInputs,
FGlobalShaderMap* GlobalShaderMap)
{
LLM_SCOPE_BYTAG(Lumen);
const FViewInfo& View = Views[0];
const int32 ClampedNumClipmapLevels = GetNumLumenVoxelClipmaps();
const FIntVector ClipmapResolution = GetClipmapResolution();
bool bForceFullUpdate = GLumenSceneVoxelLightingForceFullUpdate != 0;
// Set up the voxel lighting 3D texture.
FRDGTextureRef VoxelLighting = TracingInputs.VoxelLighting;
{
FRDGTextureDesc LightingDesc(FRDGTextureDesc::Create3D(
FIntVector(
ClipmapResolution.X,
ClipmapResolution.Y * ClampedNumClipmapLevels,
ClipmapResolution.Z * GNumVoxelDirections),
PF_FloatRGBA,
FClearValueBinding::Black,
TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));
if (!VoxelLighting || VoxelLighting->Desc != LightingDesc)
{
bForceFullUpdate = true;
VoxelLighting = GraphBuilder.CreateTexture(LightingDesc, TEXT("Lumen.VoxelLighting"));
}
}
// Set up the visibility buffer texture.
FRDGTextureRef VoxelVisBuffer = View.ViewState->Lumen.VoxelVisBuffer ? GraphBuilder.RegisterExternalTexture(View.ViewState->Lumen.VoxelVisBuffer) : nullptr;
{
FRDGTextureDesc VoxelVisBufferDesc(FRDGTextureDesc::Create3D(
FIntVector(
ClipmapResolution.X,
ClipmapResolution.Y * ClampedNumClipmapLevels,
ClipmapResolution.Z * GNumVoxelDirections),
PF_R32_UINT,
FClearValueBinding::Black,
TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));
if (!VoxelVisBuffer
|| VoxelVisBuffer->Desc.Extent != VoxelVisBufferDesc.Extent
|| VoxelVisBuffer->Desc.Depth != VoxelVisBufferDesc.Depth)
{
bForceFullUpdate = true;
VoxelVisBuffer = GraphBuilder.CreateTexture(VoxelVisBufferDesc, TEXT("Lumen.VoxelVisBuffer"));
uint32 VisBufferClearValue[4] = { 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF };
AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(VoxelVisBuffer), VisBufferClearValue);
}
}
// The visibility buffer data is only valid for a specific scene; if the scene changes, a full update is forced.
if (View.ViewState->Lumen.VoxelVisBufferCachedScene != Scene)
{
bForceFullUpdate = true;
View.ViewState->Lumen.VoxelVisBufferCachedScene = Scene;
}
// Collect the clipmaps that need updating.
TArray<int32, SceneRenderingAllocator> ClipmapsToUpdate;
ClipmapsToUpdate.Empty(ClampedNumClipmapLevels);
for (int32 ClipmapIndex = 0; ClipmapIndex < ClampedNumClipmapLevels; ClipmapIndex++)
{
if (bForceFullUpdate || ShouldUpdateVoxelClipmap(ClipmapIndex, ClampedNumClipmapLevels, View.ViewState->GetFrameIndex()))
{
ClipmapsToUpdate.Add(ClipmapIndex);
}
}
ensureMsgf(bForceFullUpdate || ClipmapsToUpdate.Num() <= 1, TEXT("Tweak ShouldUpdateVoxelClipmap for better clipmap update distribution"));
FString ClipmapsToUpdateString;
for (int32 ToUpdateIndex = 0; ToUpdateIndex < ClipmapsToUpdate.Num(); ++ToUpdateIndex)
{
ClipmapsToUpdateString += FString::FromInt(ClipmapsToUpdate[ToUpdateIndex]);
if (ToUpdateIndex + 1 < ClipmapsToUpdate.Num())
{
ClipmapsToUpdateString += TEXT(",");
}
}
RDG_EVENT_SCOPE(GraphBuilder, "VoxelizeCards Clipmaps=[%s]", *ClipmapsToUpdateString);
// Update and voxelize the visibility buffer.
if (ClipmapsToUpdate.Num() > 0)
{
TracingInputs.VoxelLighting = VoxelLighting;
TracingInputs.VoxelGridResolution = GetClipmapResolution();
TracingInputs.NumClipmapLevels = ClampedNumClipmapLevels;
// Update the visibility buffer
UpdateVoxelVisBuffer(GraphBuilder, Scene, View, TracingInputs, VoxelVisBuffer, ClipmapsToUpdate, bForceFullUpdate);
// Voxelize the visibility buffer
VoxelizeVisBuffer(View, Scene, TracingInputs, VoxelLighting, VoxelVisBuffer, ClipmapsToUpdate, GraphBuilder);
ConvertToExternalTexture(GraphBuilder, VoxelLighting, View.ViewState->Lumen.VoxelLighting);
View.ViewState->Lumen.VoxelGridResolution = TracingInputs.VoxelGridResolution;
View.ViewState->Lumen.NumClipmapLevels = TracingInputs.NumClipmapLevels;
}
ConvertToExternalTexture(GraphBuilder, VoxelVisBuffer, View.ViewState->Lumen.VoxelVisBuffer);
}
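The ensureMsgf in the listing requires that, unless a full update is forced, at most one clipmap is updated per frame. ShouldUpdateVoxelClipmap itself is not shown in this section, so the sketch below is only one possible schedule with that property and should not be read as the engine's implementation: clipmap i is updated when the frame index has exactly i trailing one bits, and the last clipmap takes every remaining frame. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical staggered schedule, NOT the engine's ShouldUpdateVoxelClipmap.
// Finer clipmaps (low index) are refreshed more often than coarse ones.
bool ShouldUpdateClipmapSketch(int clipmapIndex, int numClipmaps, std::uint32_t frameIndex) {
    assert(numClipmaps >= 1 && numClipmaps <= 16);
    assert(clipmapIndex >= 0 && clipmapIndex < numClipmaps);
    const bool isLast = (clipmapIndex == numClipmaps - 1);
    // Clipmap i owns the frames whose low i bits are all one and whose bit i is zero.
    // The last clipmap owns every frame whose low (numClipmaps - 1) bits are all one.
    const std::uint32_t period = isLast ? (1u << clipmapIndex) : (1u << (clipmapIndex + 1));
    const std::uint32_t slot = (1u << clipmapIndex) - 1u;
    return (frameIndex % period) == slot;
}

// Mirrors the ClipmapsToUpdate loop of the listing on top of the sketch above.
std::vector<int> CollectClipmapsToUpdate(int numClipmaps, std::uint32_t frameIndex,
                                         bool forceFullUpdate) {
    std::vector<int> clipmapsToUpdate;
    for (int clipmapIndex = 0; clipmapIndex < numClipmaps; ++clipmapIndex) {
        if (forceFullUpdate || ShouldUpdateClipmapSketch(clipmapIndex, numClipmaps, frameIndex)) {
            clipmapsToUpdate.push_back(clipmapIndex);
        }
    }
    return clipmapsToUpdate;
}
```

With four clipmaps this schedule updates clipmap 0 on every even frame, clipmap 1 on frames 1, 5, 9 and so on, clipmap 2 on frames 3, 11, 19 and so on, and clipmap 3 on frames 7, 15, 23 and so on, so every frame updates exactly one clipmap and the cost of voxelization is spread across frames.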
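The update and voxelization of the visibility buffer called by the code above are not analyzed in detail here, but a frame capture of the process looks like this:
The final stage of UpdateVoxelVisBuffer, VoxelTraceCS, takes the distance field brick 3D texture as input and writes the VoxelVisBuffer 3D texture:
The final stage of VoxelizeVisBuffer, VisBufferShading, takes SceneFinalLighting, SceneOpacity, SceneDepth, the distance field brick 3D texture and VoxelVisBuffer as inputs and writes the VoxelLighting 3D texture. After this stage, the lighting of the Lumen scene is stored in the voxelized 3D texture: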
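The size of that 3D texture follows directly from the Create3D call in ComputeLumenSceneVoxelLighting: the clipmap resolution in X, the clipmap resolution times the number of clipmaps in Y, and the clipmap resolution times the number of voxel directions in Z. The sketch below works this out for example values. The type and function names are hypothetical, the addressing in TexelCoord (clipmaps stacked along Y, directions stacked along Z) is an assumption inferred from the extent rather than read from the shaders, and the example numbers (64^3 voxels, 4 clipmaps, 6 directions, 8 bytes per PF_FloatRGBA texel) are assumptions as well.

```cpp
#include <cassert>
#include <cstdint>

struct Int3 { int x, y, z; };

// Hypothetical description of the VoxelLighting / VoxelVisBuffer layout.
struct VoxelAtlasLayout {
    Int3 clipmapResolution;
    int numClipmaps;
    int numDirections;

    // Same extent as the FRDGTextureDesc::Create3D call in the listing.
    Int3 Extent() const {
        return Int3{clipmapResolution.x,
                    clipmapResolution.y * numClipmaps,
                    clipmapResolution.z * numDirections};
    }

    // Assumed addressing: clipmap slices along Y, direction slices along Z.
    Int3 TexelCoord(Int3 voxel, int clipmapIndex, int directionIndex) const {
        assert(clipmapIndex >= 0 && clipmapIndex < numClipmaps);
        assert(directionIndex >= 0 && directionIndex < numDirections);
        assert(voxel.x >= 0 && voxel.x < clipmapResolution.x);
        assert(voxel.y >= 0 && voxel.y < clipmapResolution.y);
        assert(voxel.z >= 0 && voxel.z < clipmapResolution.z);
        return Int3{voxel.x,
                    voxel.y + clipmapIndex * clipmapResolution.y,
                    voxel.z + directionIndex * clipmapResolution.z};
    }

    // Raw texel storage, ignoring any padding or tiling overhead.
    std::uint64_t SizeInBytes(int bytesPerTexel) const {
        const Int3 extent = Extent();
        return static_cast<std::uint64_t>(extent.x) * extent.y * extent.z * bytesPerTexel;
    }
};
```

Under these assumptions the lighting texture measures 64x256x384 texels and occupies 48 MiB, while the 4-byte-per-texel visibility buffer of the same extent takes half of that.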
6.5.7 Lumen Indirect Lighting
6.5.7.1 RenderDiffuseIndirectAndAmbientOcclusion
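This stage uses the data Lumen has generated so far to compute the final indirect lighting, which approximates global illumination. A frame capture of the process is shown below:
As the capture shows, it consists of several stages: SSGI denoising, screen probe gathering, reflections and indirect lighting composition. The corresponding source code of RenderDiffuseIndirectAndAmbientOcclusion is as follows: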
// Engine\Source\Runtime\Renderer\Private\IndirectLightRendering.cpp
void FDeferredShadingSceneRenderer::RenderDiffuseIndirectAndAmbientOcclusion(
FRDGBuilder& GraphBuilder,
FSceneTextures& SceneTextures,
FRDGTextureRef LightingChannelsTexture,
bool bIsVisualizePass)
{
using namespace HybridIndirectLighting;
if (ViewFamily.EngineShowFlags.VisualizeLumenIndirectDiffuse != bIsVisualizePass)
{
return;
}
RDG_EVENT_SCOPE(GraphBuilder, "DiffuseIndirectAndAO");
FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);
FRDGTextureRef SceneColorTexture = SceneTextures.Color.Target;
const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);
// Each view is processed separately.
for (FViewInfo& View : Views)
{
RDG_GPU_MASK_SCOPE(GraphBuilder, View.GPUMask);
const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);
int32 DenoiseMode = CVarDiffuseIndirectDenoiser.GetValueOnRenderThread();
// Set up the common diffuse indirect parameters.
FCommonParameters CommonDiffuseParameters;
SetupCommonDiffuseIndirectParameters(GraphBuilder, SceneTextureParameters, View, /* out */ CommonDiffuseParameters);
// Update the old ray tracing config for the denoiser.
IScreenSpaceDenoiser::FAmbientOcclusionRayTracingConfig RayTracingConfig;
{
RayTracingConfig.RayCountPerPixel = CommonDiffuseParameters.RayCountPerPixel;
RayTracingConfig.ResolutionFraction = 1.0f / float(CommonDiffuseParameters.DownscaleFactor);
}
// Previous frame's scene color
ScreenSpaceRayTracing::FPrevSceneColorMip PrevSceneColorMip;
if ((ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI) && View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid())
{
PrevSceneColorMip = ScreenSpaceRayTracing::ReducePrevSceneColorMip(GraphBuilder, SceneTextureParameters, View);
}
// Denoiser input and output parameters
FSSDSignalTextures DenoiserOutputs;
IScreenSpaceDenoiser::FDiffuseIndirectInputs DenoiserInputs;
IScreenSpaceDenoiser::FDiffuseIndirectHarmonic DenoiserSphericalHarmonicInputs;
FLumenReflectionCompositeParameters LumenReflectionCompositeParameters;
bool bLumenUseDenoiserComposite = ViewPipelineState.bUseLumenProbeHierarchy;
// Obtain the denoiser inputs or outputs according to the indirect lighting method.
// Lumen probe hierarchy
if (ViewPipelineState.bUseLumenProbeHierarchy)
{
check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);
DenoiserOutputs = RenderLumenProbeHierarchy(
GraphBuilder,
SceneTextures,
CommonDiffuseParameters, PrevSceneColorMip,
View, &View.PrevViewInfo);
}
// Screen space global illumination (SSGI)
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
{
RDG_EVENT_SCOPE(GraphBuilder, "SSGI %dx%d", CommonDiffuseParameters.TracingViewportSize.X, CommonDiffuseParameters.TracingViewportSize.Y);
DenoiserInputs = ScreenSpaceRayTracing::CastStandaloneDiffuseIndirectRays(
GraphBuilder, CommonDiffuseParameters, PrevSceneColorMip, View);
}
// Ray-traced global illumination (RTGI)
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
{
// TODO: Refactor under the HybridIndirectLighting standard API.
// TODO: hybrid SSGI / RTGI
RenderRayTracingGlobalIllumination(GraphBuilder, SceneTextureParameters, View, /* out */ &RayTracingConfig, /* out */ &DenoiserInputs);
}
// Lumen global illumination
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
{
check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);
FLumenMeshSDFGridParameters MeshSDFGridParameters;
DenoiserOutputs = RenderLumenScreenProbeGather(
GraphBuilder,
SceneTextures,
PrevSceneColorMip,
LightingChannelsTexture,
View,
&View.PrevViewInfo,
bLumenUseDenoiserComposite,
MeshSDFGridParameters);
if (ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen)
{
DenoiserOutputs.Textures[2] = RenderLumenReflections(
GraphBuilder,
View,
SceneTextures,
MeshSDFGridParameters,
LumenReflectionCompositeParameters);
}
if (!DenoiserOutputs.Textures[2])
{
DenoiserOutputs.Textures[2] = DenoiserOutputs.Textures[1];
}
}
FRDGTextureRef AmbientOcclusionMask = DenoiserInputs.AmbientOcclusionMask;
// Handle denoising.
if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
{
// Lumen's output is already denoised, so nothing needs to be done here.
}
else if (ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled)
{
DenoiserOutputs.Textures[0] = DenoiserInputs.Color;
DenoiserOutputs.Textures[1] = SystemTextures.White;
}
else
{
const IScreenSpaceDenoiser* DefaultDenoiser = IScreenSpaceDenoiser::GetDefaultDenoiser();
const IScreenSpaceDenoiser* DenoiserToUse =
ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::DefaultDenoiser
? DefaultDenoiser : GScreenSpaceDenoiser;
RDG_EVENT_SCOPE(GraphBuilder, "%s%s(DiffuseIndirect) %dx%d",
DenoiserToUse != DefaultDenoiser ? TEXT("ThirdParty ") : TEXT(""),
DenoiserToUse->GetDebugName(),
View.ViewRect.Width(), View.ViewRect.Height());
if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
{
// Denoise the RTGI result.
DenoiserOutputs = DenoiserToUse->DenoiseDiffuseIndirect(
GraphBuilder,
View,
&View.PrevViewInfo,
SceneTextureParameters,
DenoiserInputs,
RayTracingConfig);
AmbientOcclusionMask = DenoiserOutputs.Textures[1];
}
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
{
// Denoise the SSGI result.
DenoiserOutputs = DenoiserToUse->DenoiseScreenSpaceDiffuseIndirect(
GraphBuilder,
View,
&View.PrevViewInfo,
SceneTextureParameters,
DenoiserInputs,
RayTracingConfig);
AmbientOcclusionMask = DenoiserOutputs.Textures[1];
}
}
// Render ambient occlusion (AO)
bool bWritableAmbientOcclusionMask = true;
if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::Disabled)
{
ensure(!HasBeenProduced(SceneTextures.ScreenSpaceAO));
AmbientOcclusionMask = nullptr;
bWritableAmbientOcclusionMask = false;
}
else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::RTAO)
{
RenderRayTracingAmbientOcclusion(
GraphBuilder,
View,
SceneTextureParameters,
&AmbientOcclusionMask);
}
else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI)
{
check(AmbientOcclusionMask);
}
else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO)
{
// Fetch result of SSAO that was done earlier.
if (HasBeenProduced(SceneTextures.ScreenSpaceAO))
{
AmbientOcclusionMask = SceneTextures.ScreenSpaceAO;
}
else
{
AmbientOcclusionMask = GetScreenSpaceAOFallback(SystemTextures);
bWritableAmbientOcclusionMask = false;
}
}
else
{
unimplemented();
bWritableAmbientOcclusionMask = false;
}
// Extract the dynamic AO for application of AO beyond RenderDiffuseIndirectAndAmbientOcclusion()
if (AmbientOcclusionMask && ViewPipelineState.AmbientOcclusionMethod != EAmbientOcclusionMethod::SSAO)
{
ensureMsgf(Views.Num() == 1, TEXT("Need to add support for one AO texture per view in FSceneTextures"));
SceneTextures.ScreenSpaceAO = AmbientOcclusionMask;
}
if (HairStrands::HasViewHairStrandsData(View) && (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI || ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO) && bWritableAmbientOcclusionMask)
{
RenderHairStrandsAmbientOcclusion(
GraphBuilder,
View,
AmbientOcclusionMask);
}
// Apply diffuse indirect lighting and ambient occlusion to the scene color.
if ((DenoiserOutputs.Textures[0] || AmbientOcclusionMask) && (!bIsVisualizePass || ViewPipelineState.DiffuseIndirectDenoiser != IScreenSpaceDenoiser::EMode::Disabled || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
&& !IsMetalPlatform(ShaderPlatform))
{
// The pixel shader used is FDiffuseIndirectCompositePS.
FDiffuseIndirectCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FDiffuseIndirectCompositePS::FParameters>();
PassParameters->AmbientOcclusionStaticFraction = FMath::Clamp(View.FinalPostProcessSettings.AmbientOcclusionStaticFraction, 0.0f, 1.0f);
PassParameters->ApplyAOToDynamicDiffuseIndirect = 0.0f;
if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
{
PassParameters->ApplyAOToDynamicDiffuseIndirect = 1.0f;
}
const FIntPoint BufferExtent = SceneTextureParameters.SceneDepthTexture->Desc.Extent;
{
// Placeholder texture for textures pulled in from SSDCommon.ush
FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
FIntPoint(1),
PF_R32_UINT,
FClearValueBinding::Black,
TexCreate_ShaderResource);
FRDGTextureRef CompressedMetadataPlaceholder = GraphBuilder.CreateTexture(Desc, TEXT("CompressedMetadataPlaceholder"));
PassParameters->CompressedMetadata[0] = CompressedMetadataPlaceholder;
PassParameters->CompressedMetadata[1] = CompressedMetadataPlaceholder;
}
PassParameters->BufferUVToOutputPixelPosition = BufferExtent;
PassParameters->EyeAdaptation = GetEyeAdaptationTexture(GraphBuilder, View);
PassParameters->LumenReflectionCompositeParameters = LumenReflectionCompositeParameters;
PassParameters->bVisualizeDiffuseIndirect = bIsVisualizePass;
PassParameters->DiffuseIndirect = DenoiserOutputs;
PassParameters->DiffuseIndirectSampler = TStaticSamplerState<SF_Point>::GetRHI();
PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();
PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
if (!PassParameters->AmbientOcclusionTexture || bIsVisualizePass)
{
PassParameters->AmbientOcclusionTexture = SystemTextures.White;
}
// Set up the denoiser's common shader parameters.
Denoiser::SetupCommonShaderParameters(
View, SceneTextureParameters,
View.ViewRect,
1.0f / CommonDiffuseParameters.DownscaleFactor,
/* out */ &PassParameters->DenoiserCommonParameters);
PassParameters->SceneTextures = SceneTextureParameters;
PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;
PassParameters->RenderTargets[0] = FRenderTargetBinding(
SceneColorTexture, ERenderTargetLoadAction::ELoad);
{
FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
SceneColorTexture->Desc.Extent,
PF_FloatRGBA,
FClearValueBinding::None,
TexCreate_ShaderResource | TexCreate_UAV);
PassParameters->PassDebugOutput = GraphBuilder.CreateUAV(
GraphBuilder.CreateTexture(Desc, TEXT("DebugDiffuseIndirectComposite")));
}
const TCHAR* DiffuseIndirectSampling = TEXT("Disabled");
FDiffuseIndirectCompositePS::FPermutationDomain PermutationVector;
bool bUpscale = false;
if (DenoiserOutputs.Textures[0])
{
if (bLumenUseDenoiserComposite)
{
PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(2);
DiffuseIndirectSampling = TEXT("ProbeHierarchy");
}
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
{
PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(3);
DiffuseIndirectSampling = TEXT("RTGI");
}
else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
{
PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(4);
DiffuseIndirectSampling = TEXT("ScreenProbeGather");
}
else
{
PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(1);
DiffuseIndirectSampling = TEXT("SSGI");
bUpscale = DenoiserOutputs.Textures[0]->Desc.Extent != SceneColorTexture->Desc.Extent;
}
PermutationVector.Set<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>(bUpscale);
}
TShaderMapRef<FDiffuseIndirectCompositePS> PixelShader(View.ShaderMap, PermutationVector);
// Clear shader resource bindings that this permutation does not use.
ClearUnusedGraphResources(PixelShader, PassParameters);
FRHIBlendState* BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>::GetRHI();
if (bIsVisualizePass)
{
BlendState = TStaticBlendState<>::GetRHI();
}
// Indirect lighting composite pass.
FPixelShaderUtils::AddFullscreenPass(
GraphBuilder,
View.ShaderMap,
RDG_EVENT_NAME(
"DiffuseIndirectComposite(DiffuseIndirect=%s%s%s%s) %dx%d",
DiffuseIndirectSampling,
PermutationVector.Get<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>() ? TEXT(" UpscaleDiffuseIndirect") : TEXT(""),
AmbientOcclusionMask ? TEXT(" ApplyAOToSceneColor") : TEXT(""),
PassParameters->ApplyAOToDynamicDiffuseIndirect > 0.0f ? TEXT(" ApplyAOToDynamicDiffuseIndirect") : TEXT(""),
View.ViewRect.Width(), View.ViewRect.Height()),
PixelShader,
PassParameters,
View.ViewRect,
BlendState);
} // if (DenoiserOutputs.Color || bApplySSAO)
// Apply ambient cubemaps.
if (IsAmbientCubemapPassRequired(View) && !bIsVisualizePass && !ViewPipelineState.bUseLumenProbeHierarchy)
{
FAmbientCubemapCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FAmbientCubemapCompositePS::FParameters>();
PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();
PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
if (!PassParameters->AmbientOcclusionTexture)
{
PassParameters->AmbientOcclusionTexture = SystemTextures.White;
}
PassParameters->SceneTextures = SceneTextureParameters;
PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;
PassParameters->RenderTargets[0] = FRenderTargetBinding(
SceneColorTexture, ERenderTargetLoadAction::ELoad);
TShaderMapRef<FAmbientCubemapCompositePS> PixelShader(View.ShaderMap);
GraphBuilder.AddPass(
RDG_EVENT_NAME("AmbientCubemapComposite %dx%d", View.ViewRect.Width(), View.ViewRect.Height()),
PassParameters,
ERDGPassFlags::Raster,
[PassParameters, &View, PixelShader](FRHICommandList& RHICmdList)
{
TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);
RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 0.0);
FGraphicsPipelineStateInitializer GraphicsPSOInit;
RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);
// set the state
GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<false, CF_Always>::GetRHI();
GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
GraphicsPSOInit.PrimitiveType = PT_TriangleList;
SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
uint32 Count = View.FinalPostProcessSettings.ContributingCubemaps.Num();
for (const FFinalPostProcessSettings::FCubemapEntry& CubemapEntry : View.FinalPostProcessSettings.ContributingCubemaps)
{
FAmbientCubemapCompositePS::FParameters ShaderParameters = *PassParameters;
SetupAmbientCubemapParameters(CubemapEntry, &ShaderParameters.AmbientCubemap);
SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), ShaderParameters);
DrawPostProcessPass(
RHICmdList,
0, 0,
View.ViewRect.Width(), View.ViewRect.Height(),
View.ViewRect.Min.X, View.ViewRect.Min.Y,
View.ViewRect.Width(), View.ViewRect.Height(),
View.ViewRect.Size(),
GetSceneTextureExtent(),
VertexShader,
View.StereoPass,
false, // TODO.
EDRF_UseTriangleOptimization);
}
});
} // if (IsAmbientCubemapPassRequired(View))
} // for (FViewInfo& View : Views)
}
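The three blend states used in this function decide how the composite results reach the scene color, and they are easy to misread in template form. The sketch below evaluates each of them on the CPU for a single pixel. Color4 and the function names are hypothetical. The factor order follows TStaticBlendState (color write mask, color op, color source factor, color destination factor, then the same three for alpha). Reading the second shader output as a multiplier on the existing scene color, for example an AO term, is an interpretation of the ApplyAOToSceneColor pass name rather than something this listing states.

```cpp
#include <cassert>

struct Color4 { float r, g, b, a; };

// TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>
// Dual-source blending: result = src0 * 1 + dest * src1, per channel.
Color4 BlendDualSource(const Color4& dest, const Color4& src0, const Color4& src1) {
    return Color4{src0.r + dest.r * src1.r,
                  src0.g + dest.g * src1.g,
                  src0.b + dest.b * src1.b,
                  src0.a + dest.a * src1.a};
}

// TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>
// Purely additive; alpha is masked out by CW_RGB and keeps the destination value.
Color4 BlendAdditiveRGB(const Color4& dest, const Color4& src) {
    return Color4{dest.r + src.r, dest.g + src.g, dest.b + src.b, dest.a};
}

// TStaticBlendState<> (the defaults): the source overwrites the destination.
// This is the state selected for the visualize pass.
Color4 BlendOpaque(const Color4& /*dest*/, const Color4& src) {
    return src;
}

// Small worked example with values that are exact in binary floating point.
void RunBlendExample() {
    const Color4 sceneColor{1.0f, 0.5f, 0.25f, 1.0f};
    const Color4 indirect{0.25f, 0.25f, 0.25f, 0.0f};  // added lighting (src0)
    const Color4 multiplier{0.5f, 0.5f, 0.5f, 1.0f};   // scales the existing color (src1)
    const Color4 out = BlendDualSource(sceneColor, indirect, multiplier);
    assert(out.r == 0.75f && out.g == 0.5f && out.b == 0.375f && out.a == 1.0f);
}
```

The dual-source state lets a single fullscreen pass both darken the lighting already in the scene color and add the new indirect lighting, whereas the ambient cubemap pass only ever adds.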
6.5.7.2 RenderLumenScreenProbeGather
RenderLumenScreenProbeGather renders the Lumen screen space probe gather. Its code is as follows:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeGather.cpp
FSSDSignalTextures FDeferredShadingSceneRenderer::RenderLumenScreenProbeGather(
FRDGBuilder& GraphBuilder,
const FSceneTextures& SceneTextures,
const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColorMip,
FRDGTextureRef LightingChannelsTexture,
const FViewInfo& View,
FPreviousViewInfo* PreviousViewInfos,
bool& bLumenUseDenoiserComposite,
FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
LLM_SCOPE_BYTAG(Lumen);
// Render the Lumen irradiance field gather instead, if it is enabled.
if (GLumenIrradianceFieldGather != 0)
{
bLumenUseDenoiserComposite = false;
return RenderLumenIrradianceFieldGather(GraphBuilder, SceneTextures, View);
}
RDG_EVENT_SCOPE(GraphBuilder, "LumenScreenProbeGather");
RDG_GPU_STAT_SCOPE(GraphBuilder, LumenScreenProbeGather);
check(ShouldRenderLumenDiffuseGI(Scene, View, true));
const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);
if (!LightingChannelsTexture)
{
LightingChannelsTexture = SystemTextures.Black;
}
// If the screen probe gather is disabled, return cleared (black) denoiser inputs.
if (!GLumenScreenProbeGather)
{
FSSDSignalTextures ScreenSpaceDenoiserInputs;
ScreenSpaceDenoiserInputs.Textures[0] = SystemTextures.Black;
FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
ScreenSpaceDenoiserInputs.Textures[1] = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));
AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenSpaceDenoiserInputs.Textures[1])), FLinearColor::Black);
bLumenUseDenoiserComposite = false;
return ScreenSpaceDenoiserInputs;
}
// Pull from the uniform buffer to get fallback textures.
const FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);
// Set up the screen probe parameters.
FScreenProbeParameters ScreenProbeParameters;
ScreenProbeParameters.ScreenProbeTracingOctahedronResolution = LumenScreenProbeGather::GetTracingOctahedronResolution(View);
ensureMsgf(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution < (1 << 6) - 1, TEXT("Tracing resolution %u was larger than supported by PackRayInfo()"), ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
ScreenProbeParameters.ScreenProbeGatherOctahedronResolution = LumenScreenProbeGather::GetGatherOctahedronResolution(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
ScreenProbeParameters.ScreenProbeGatherOctahedronResolutionWithBorder = ScreenProbeParameters.ScreenProbeGatherOctahedronResolution + 2 * (1 << (GLumenScreenProbeGatherNumMips - 1));
ScreenProbeParameters.ScreenProbeDownsampleFactor = LumenScreenProbeGather::GetScreenDownsampleFactor(View);
ScreenProbeParameters.ScreenProbeViewSize = FIntPoint::DivideAndRoundUp(View.ViewRect.Size(), (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
ScreenProbeParameters.ScreenProbeAtlasViewSize = ScreenProbeParameters.ScreenProbeViewSize;
ScreenProbeParameters.ScreenProbeAtlasViewSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeViewSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
ScreenProbeParameters.ScreenProbeAtlasBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
ScreenProbeParameters.ScreenProbeGatherMaxMip = GLumenScreenProbeGatherNumMips - 1;
ScreenProbeParameters.RelativeSpeedDifferenceToConsiderLightingMoving = GLumenScreenProbeRelativeSpeedDifferenceToConsiderLightingMoving;
ScreenProbeParameters.ScreenTraceNoFallbackThicknessScale = Lumen::UseHardwareRayTracedScreenProbeGather() ? 1.0f : GLumenScreenProbeScreenTracesThicknessScaleWhenNoFallback;
ScreenProbeParameters.NumUniformScreenProbes = ScreenProbeParameters.ScreenProbeViewSize.X * ScreenProbeParameters.ScreenProbeViewSize.Y;
ScreenProbeParameters.MaxNumAdaptiveProbes = FMath::TruncToInt(ScreenProbeParameters.NumUniformScreenProbes * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
extern int32 GLumenScreenProbeGatherVisualizeTraces;
ScreenProbeParameters.FixedJitterIndex = GLumenScreenProbeGatherVisualizeTraces == 0 ? GLumenScreenProbeFixedJitterIndex : 6;
FRDGTextureDesc DownsampledDepthDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
ScreenProbeParameters.ScreenProbeSceneDepth = GraphBuilder.CreateTexture(DownsampledDepthDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeSceneDepth"));
FRDGTextureDesc DownsampledSpeedDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R16F, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
ScreenProbeParameters.ScreenProbeWorldSpeed = GraphBuilder.CreateTexture(DownsampledSpeedDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeWorldSpeed"));
FBlueNoise BlueNoise;
InitializeBlueNoise(BlueNoise);
ScreenProbeParameters.BlueNoise = CreateUniformBufferImmediate(BlueNoise, EUniformBufferUsage::UniformBuffer_SingleDraw);
ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTextureResolutionSq = GLumenOctahedralSolidAngleTextureSize * GLumenOctahedralSolidAngleTextureSize;
ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTexture = InitializeOctahedralSolidAngleTexture(GraphBuilder, View.ShaderMap, GLumenOctahedralSolidAngleTextureSize, View.ViewState->Lumen.ScreenProbeGatherState.OctahedralSolidAngleTextureRT);
// Downsample scene depth for the uniformly placed probes.
{
FScreenProbeDownsampleDepthUniformCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeDownsampleDepthUniformCS::FParameters>();
PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
PassParameters->View = View.ViewUniformBuffer;
PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
PassParameters->SceneTextures = SceneTextureParameters;
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeDownsampleDepthUniformCS>(0);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("UniformPlacement DownsampleFactor=%u", ScreenProbeParameters.ScreenProbeDownsampleFactor),
ComputeShader,
PassParameters,
FComputeShaderUtils::GetGroupCount(ScreenProbeParameters.ScreenProbeViewSize, FScreenProbeDownsampleDepthUniformCS::GetGroupSize()));
}
FRDGBufferRef NumAdaptiveScreenProbes = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("Lumen.ScreenProbeGather.NumAdaptiveScreenProbes"));
FRDGBufferRef AdaptiveScreenProbeData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), FMath::Max<uint32>(ScreenProbeParameters.MaxNumAdaptiveProbes, 1)), TEXT("Lumen.ScreenProbeGather.daptiveScreenProbeData"));
ScreenProbeParameters.NumAdaptiveScreenProbes = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
ScreenProbeParameters.AdaptiveScreenProbeData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(AdaptiveScreenProbeData, PF_R32_UINT));
const FIntPoint ScreenProbeViewportBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
FRDGTextureDesc ScreenTileAdaptiveProbeHeaderDesc(FRDGTextureDesc::Create2D(ScreenProbeViewportBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
FIntPoint ScreenTileAdaptiveProbeIndicesBufferSize = FIntPoint(ScreenProbeViewportBufferSize.X * ScreenProbeParameters.ScreenProbeDownsampleFactor, ScreenProbeViewportBufferSize.Y * ScreenProbeParameters.ScreenProbeDownsampleFactor);
FRDGTextureDesc ScreenTileAdaptiveProbeIndicesDesc(FRDGTextureDesc::Create2D(ScreenTileAdaptiveProbeIndicesBufferSize, PF_R16_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
ScreenProbeParameters.ScreenTileAdaptiveProbeHeader = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeHeaderDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeHeader"));
ScreenProbeParameters.ScreenTileAdaptiveProbeIndices = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeIndicesDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeIndices"));
FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT)), 0);
uint32 ClearValues[4] = {0, 0, 0, 0};
AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader)), ClearValues);
const uint32 AdaptiveProbeMinDownsampleFactor = FMath::Clamp(GLumenScreenProbeGatherAdaptiveProbeMinDownsampleFactor, 1, 64);
if (ScreenProbeParameters.MaxNumAdaptiveProbes > 0 && AdaptiveProbeMinDownsampleFactor < ScreenProbeParameters.ScreenProbeDownsampleFactor)
{
// Adaptively place additional probes.
uint32 PlacementDownsampleFactor = ScreenProbeParameters.ScreenProbeDownsampleFactor;
do
{
PlacementDownsampleFactor /= 2;
FScreenProbeAdaptivePlacementCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeAdaptivePlacementCS::FParameters>();
PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
PassParameters->RWNumAdaptiveScreenProbes = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
PassParameters->RWAdaptiveScreenProbeData = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT));
PassParameters->RWScreenTileAdaptiveProbeHeader = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader));
PassParameters->RWScreenTileAdaptiveProbeIndices = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices));
PassParameters->View = View.ViewUniformBuffer;
PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
PassParameters->SceneTextures = SceneTextureParameters;
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
PassParameters->PlacementDownsampleFactor = PlacementDownsampleFactor;
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeAdaptivePlacementCS>(0);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("AdaptivePlacement DownsampleFactor=%u", PlacementDownsampleFactor),
ComputeShader,
PassParameters,
FComputeShaderUtils::GetGroupCount(FIntPoint::DivideAndRoundDown(View.ViewRect.Size(), (int32)PlacementDownsampleFactor), FScreenProbeAdaptivePlacementCS::GetGroupSize()));
}
while (PlacementDownsampleFactor > AdaptiveProbeMinDownsampleFactor);
}
else
{
FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT)), 0);
AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices)), ClearValues);
}
FRDGBufferRef ScreenProbeIndirectArgs = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>((uint32)EScreenProbeIndirectArgs::Max), TEXT("Lumen.ScreenProbeGather.ScreenProbeIndirectArgs"));
// Set up the indirect dispatch arguments for the adaptive probes.
{
FSetupAdaptiveProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupAdaptiveProbeIndirectArgsCS::FParameters>();
PassParameters->RWScreenProbeIndirectArgs = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(ScreenProbeIndirectArgs, PF_R32_UINT));
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
auto ComputeShader = View.ShaderMap->GetShader<FSetupAdaptiveProbeIndirectArgsCS>(0);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("SetupAdaptiveProbeIndirectArgs"),
ComputeShader,
PassParameters,
FIntVector(1, 1, 1));
}
ScreenProbeParameters.ProbeIndirectArgs = ScreenProbeIndirectArgs;
FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);
FRDGTextureRef BRDFProbabilityDensityFunction = nullptr;
FRDGBufferSRVRef BRDFProbabilityDensityFunctionSH = nullptr;
GenerateBRDF_PDF(GraphBuilder, View, SceneTextures, BRDFProbabilityDensityFunction, BRDFProbabilityDensityFunctionSH, ScreenProbeParameters);
const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenScreenProbeGatherRadianceCache::SetupRadianceCacheInputs();
LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;
// Radiance cache.
if (LumenScreenProbeGather::UseRadianceCache(View))
{
FScreenGatherMarkUsedProbesData MarkUsedProbesData;
MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
MarkUsedProbesData.Parameters.SceneTexturesStruct = SceneTextures.UniformBuffer;
MarkUsedProbesData.Parameters.ScreenProbeParameters = ScreenProbeParameters;
MarkUsedProbesData.Parameters.VisualizeLumenScene = View.Family->EngineShowFlags.VisualizeLumenScene != 0 ? 1 : 0;
MarkUsedProbesData.Parameters.RadianceCacheParameters = RadianceCacheParameters;
// Render the radiance cache.
RenderRadianceCache(
GraphBuilder,
TracingInputs,
RadianceCacheInputs,
Scene,
View,
&ScreenProbeParameters,
BRDFProbabilityDensityFunctionSH,
FMarkUsedRadianceCacheProbes::CreateStatic(&ScreenGatherMarkUsedProbes),
&MarkUsedProbesData,
View.ViewState->RadianceCacheState,
RadianceCacheParameters);
}
if (LumenScreenProbeGather::UseImportanceSampling(View))
{
// Generate the importance-sampled rays.
GenerateImportanceSamplingRays(
GraphBuilder,
View,
SceneTextures,
RadianceCacheParameters,
BRDFProbabilityDensityFunction,
BRDFProbabilityDensityFunctionSH,
ScreenProbeParameters);
}
const FIntPoint ScreenProbeTraceBufferSize = ScreenProbeParameters.ScreenProbeAtlasBufferSize * ScreenProbeParameters.ScreenProbeTracingOctahedronResolution;
FRDGTextureDesc TraceRadianceDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
ScreenProbeParameters.TraceRadiance = GraphBuilder.CreateTexture(TraceRadianceDesc, TEXT("Lumen.ScreenProbeGather.TraceRadiance"));
ScreenProbeParameters.RWTraceRadiance = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceRadiance));
FRDGTextureDesc TraceHitDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
ScreenProbeParameters.TraceHit = GraphBuilder.CreateTexture(TraceHitDesc, TEXT("Lumen.ScreenProbeGather.TraceHit"));
ScreenProbeParameters.RWTraceHit = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceHit));
// Trace the screen-space probes.
TraceScreenProbes(
GraphBuilder,
Scene,
View,
GLumenGatherCvars.TraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
SceneTextures.UniformBuffer,
PrevSceneColorMip,
LightingChannelsTexture,
TracingInputs,
RadianceCacheParameters,
ScreenProbeParameters,
MeshSDFGridParameters);
FScreenProbeGatherParameters GatherParameters;
// Filter the screen-space probes.
FilterScreenProbes(GraphBuilder, View, ScreenProbeParameters, GatherParameters);
FScreenSpaceBentNormalParameters ScreenSpaceBentNormalParameters;
ScreenSpaceBentNormalParameters.UseScreenBentNormal = 0;
ScreenSpaceBentNormalParameters.ScreenBentNormal = SystemTextures.Black;
ScreenSpaceBentNormalParameters.ScreenDiffuseLighting = SystemTextures.Black;
// Compute the screen-space bent normal.
if (LumenScreenProbeGather::UseScreenSpaceBentNormal())
{
ScreenSpaceBentNormalParameters = ComputeScreenSpaceBentNormal(GraphBuilder, Scene, View, SceneTextures, LightingChannelsTexture, ScreenProbeParameters);
}
FRDGTextureDesc DiffuseIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGBA, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
FRDGTextureRef DiffuseIndirect = GraphBuilder.CreateTexture(DiffuseIndirectDesc, TEXT("Lumen.ScreenProbeGather.DiffuseIndirect"));
FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
FRDGTextureRef RoughSpecularIndirect = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));
{
FScreenProbeIndirectCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeIndirectCS::FParameters>();
PassParameters->RWDiffuseIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(DiffuseIndirect));
PassParameters->RWRoughSpecularIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RoughSpecularIndirect));
PassParameters->GatherParameters = GatherParameters;
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
PassParameters->View = View.ViewUniformBuffer;
PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
PassParameters->FullResolutionJitterWidth = GLumenScreenProbeFullResolutionJitterWidth;
extern float GLumenReflectionMaxRoughnessToTrace;
extern float GLumenReflectionRoughnessFadeLength;
PassParameters->MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
PassParameters->RoughnessFadeLength = GLumenReflectionRoughnessFadeLength;
PassParameters->ScreenSpaceBentNormalParameters = ScreenSpaceBentNormalParameters;
FScreenProbeIndirectCS::FPermutationDomain PermutationVector;
PermutationVector.Set< FScreenProbeIndirectCS::FDiffuseIntegralMethod >(LumenScreenProbeGather::GetDiffuseIntegralMethod());
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeIndirectCS>(PermutationVector);
// Compute the indirect lighting from the screen-space probes.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ComputeIndirect %ux%u", View.ViewRect.Width(), View.ViewRect.Height()),
ComputeShader,
PassParameters,
FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FScreenProbeIndirectCS::GetGroupSize()));
}
FSSDSignalTextures DenoiserOutputs;
DenoiserOutputs.Textures[0] = DiffuseIndirect;
DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
bLumenUseDenoiserComposite = false;
// Temporal filtering of the screen probe gather result.
if (GLumenScreenProbeTemporalFilter)
{
if (GLumenScreenProbeUseHistoryNeighborhoodClamp)
{
FRDGTextureRef CompressedDepthTexture;
FRDGTextureRef CompressedShadingModelTexture;
{
FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
SceneTextures.Depth.Resolve->Desc.Extent,
PF_R16F,
FClearValueBinding::None,
/* InTargetableFlags = */ TexCreate_ShaderResource | TexCreate_UAV);
CompressedDepthTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedDepth"));
Desc.Format = PF_R8_UINT;
CompressedShadingModelTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedShadingModelID"));
}
{
FGenerateCompressedGBuffer::FParameters* PassParameters = GraphBuilder.AllocParameters<FGenerateCompressedGBuffer::FParameters>();
PassParameters->RWCompressedDepthBufferOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedDepthTexture));
PassParameters->RWCompressedShadingModelOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedShadingModelTexture));
PassParameters->View = View.ViewUniformBuffer;
PassParameters->SceneTextures = SceneTextureParameters;
auto ComputeShader = View.ShaderMap->GetShader<FGenerateCompressedGBuffer>(0);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("GenerateCompressedGBuffer"),
ComputeShader,
PassParameters,
FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FGenerateCompressedGBuffer::GetGroupSize()));
}
FSSDSignalTextures ScreenSpaceDenoiserInputs;
ScreenSpaceDenoiserInputs.Textures[0] = DiffuseIndirect;
ScreenSpaceDenoiserInputs.Textures[1] = RoughSpecularIndirect;
DenoiserOutputs = IScreenSpaceDenoiser::DenoiseIndirectProbeHierarchy(
GraphBuilder,
View,
PreviousViewInfos,
SceneTextureParameters,
ScreenSpaceDenoiserInputs,
CompressedDepthTexture,
CompressedShadingModelTexture);
bLumenUseDenoiserComposite = true;
}
else
{
UpdateHistoryScreenProbeGather(
GraphBuilder,
View,
SceneTextures,
DiffuseIndirect,
RoughSpecularIndirect);
DenoiserOutputs.Textures[0] = DiffuseIndirect;
DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
}
}
return DenoiserOutputs;
}
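The probe sizing arithmetic near the top of the function above can be restated as a small standalone sketch. `IntPoint`, `ProbeLayout`, and `ComputeProbeLayout` are hypothetical names introduced here for illustration, not engine types, and the sample values in the note below are assumptions rather than confirmed engine defaults.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the engine's FIntPoint.
struct IntPoint { int32_t X; int32_t Y; };

// Integer division rounding up, as FIntPoint::DivideAndRoundUp does per component
// (valid for positive values).
inline int32_t DivideAndRoundUp(int32_t Value, int32_t Divisor)
{
    return (Value + Divisor - 1) / Divisor;
}

// Hypothetical summary of the sizes derived in the listing.
struct ProbeLayout
{
    IntPoint ViewSize;            // one uniform probe per DownsampleFactor x DownsampleFactor tile
    IntPoint AtlasViewSize;       // uniform rows plus extra rows reserved for adaptive probes
    int32_t NumUniformProbes;
    int32_t MaxNumAdaptiveProbes;
};

inline ProbeLayout ComputeProbeLayout(IntPoint ViewRectSize, int32_t DownsampleFactor, float AdaptiveFraction)
{
    ProbeLayout Layout{};
    Layout.ViewSize = { DivideAndRoundUp(ViewRectSize.X, DownsampleFactor),
                        DivideAndRoundUp(ViewRectSize.Y, DownsampleFactor) };
    Layout.AtlasViewSize = Layout.ViewSize;
    // The float-to-int cast truncates, like FMath::TruncToInt in the listing.
    Layout.AtlasViewSize.Y += static_cast<int32_t>(Layout.ViewSize.Y * AdaptiveFraction);
    Layout.NumUniformProbes = Layout.ViewSize.X * Layout.ViewSize.Y;
    Layout.MaxNumAdaptiveProbes = static_cast<int32_t>(Layout.NumUniformProbes * AdaptiveFraction);
    return Layout;
}
```

Assuming a 1920x1080 view, a downsample factor of 16, and an adaptive fraction of 0.5, this gives 120x68 uniform probes, a 120x102 atlas view, and room for up to 4080 adaptive probes.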
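Combining the source code with a RenderDoc frame capture shows that the screen probe gather stage is exceptionally complex. The main steps of the normal flow are: uniform and adaptive probe placement, computing the BRDF PDF, rendering the radiance cache, computing the lighting PDF, generating the sample rays, tracing the screen-space probes, compacting the trace results, tracing voxels, compositing the trace results, filtering the gathered radiance, processing the bent normal, computing the indirect lighting, and updating the history data:
Because there are too many steps to cover in full, only a selection of the important ones is analyzed below, with reference to the frame capture.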
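Before moving on, the adaptive placement loop in the listing is worth restating: it halves the placement downsample factor on every iteration and dispatches one pass per level until the minimum factor is reached. The sketch below only reproduces that control flow on the CPU. `AdaptivePlacementFactors` is a hypothetical helper, not an engine function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the downsample factor of each adaptive placement pass, in dispatch order.
// Hypothetical helper that mirrors the do/while loop in the listing.
inline std::vector<uint32_t> AdaptivePlacementFactors(
    uint32_t ScreenProbeDownsampleFactor,
    int32_t RequestedMinFactor,
    uint32_t MaxNumAdaptiveProbes)
{
    std::vector<uint32_t> Factors;
    // Same clamp range as the listing: [1, 64].
    const uint32_t MinFactor = static_cast<uint32_t>(std::max(1, std::min(RequestedMinFactor, 64)));

    if (MaxNumAdaptiveProbes > 0 && MinFactor < ScreenProbeDownsampleFactor)
    {
        uint32_t Factor = ScreenProbeDownsampleFactor;
        do
        {
            Factor /= 2;               // each pass places probes on a grid twice as fine
            Factors.push_back(Factor); // one compute dispatch per level
        }
        while (Factor > MinFactor);
    }
    return Factors;
}
```

With an assumed uniform factor of 16 and a minimum factor of 4, two passes run, at factors 8 and 4. If the minimum factor is not smaller than the uniform factor, or no adaptive probes are allowed, no pass runs and the listing clears the adaptive buffers instead.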
- RadianceCache
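The radiance cache (RadianceCache) is likewise a series of very complex passes. It goes through clearing, marking, updating, and allocating probes, setting up the dispatch parameters, tracing the probes, and filtering the probe radiance:
The most important pass in RadianceCache is tracing from the probes (TraceFromProbes). Its inputs include the global distance field, VoxelLighting, and other textures.
Its output is a 4096x4096 radiance probe atlas plus depth:
The probe atlas output by TraceFromProbes (cropped and enlarged).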
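Each probe owns a square tile in this atlas, and TraceFromProbesCS finds the tile origin with a modulo and a divide on the probe index. A minimal sketch of that addressing follows. `UInt2` and `ProbeAtlasBaseCoord` are hypothetical names, and the probe resolution and probes-per-row values used in the note below are assumptions, not values read from the capture.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the shader's uint2.
struct UInt2 { uint32_t X; uint32_t Y; };

// Top-left texel of a probe's tile in the atlas.
// Probes are laid out row by row, AtlasResolutionInProbesX probes per row.
inline UInt2 ProbeAtlasBaseCoord(
    uint32_t ProbeIndex,
    uint32_t RadianceProbeResolution,
    uint32_t AtlasResolutionInProbesX)
{
    return { RadianceProbeResolution * (ProbeIndex % AtlasResolutionInProbesX),
             RadianceProbeResolution * (ProbeIndex / AtlasResolutionInProbesX) };
}
```

Assuming 32x32 texels per probe and 128 probes per row (which would fill a 4096-texel-wide atlas), probe 130 starts at texel (64, 32).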
The compute shader used by TraceFromProbes is as follows:
// Engine\Shaders\Private\Lumen\LumenRadianceCache.usf
groupshared float3 SharedTraceRadiance[THREADGROUP_SIZE][THREADGROUP_SIZE];
groupshared float SharedTraceHitDistance[THREADGROUP_SIZE][THREADGROUP_SIZE];
[numthreads(THREADGROUP_SIZE, THREADGROUP_SIZE, 1)]
void TraceFromProbesCS(
uint3 GroupId : SV_GroupID,
uint2 GroupThreadId : SV_GroupThreadID)
{
uint TraceTileIndex = GroupId.y * TRACE_TILE_GROUP_STRIDE + GroupId.x;
if (TraceTileIndex < ProbeTraceTileAllocator[0])
{
uint2 TraceTileCoord;
uint TraceTileLevel;
uint ProbeTraceIndex;
// Unpack the trace tile info.
UnpackTraceTileInfo(ProbeTraceTileData[TraceTileIndex], TraceTileCoord, TraceTileLevel, ProbeTraceIndex);
uint TraceResolution = (RadianceProbeResolution / 2) << TraceTileLevel;
// Probe texel coordinate.
uint2 ProbeTexelCoord = TraceTileCoord * THREADGROUP_SIZE + GroupThreadId.xy;
float3 ProbeWorldCenter;
uint ClipmapIndex;
uint ProbeIndex;
// Fetch the probe's trace data.
GetProbeTraceData(ProbeTraceIndex, ProbeWorldCenter, ClipmapIndex, ProbeIndex);
if (all(ProbeTexelCoord < TraceResolution))
{
float2 ProbeTexelCenter = float2(0.5, 0.5);
float2 ProbeUV = (ProbeTexelCoord + ProbeTexelCenter) / float(TraceResolution);
float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);
float FinalMinTraceDistance = max(MinTraceDistance, GetRadianceProbeTMin(ClipmapIndex));
float FinalMaxTraceDistance = MaxTraceDistance;
float EffectiveStepFactor = StepFactor;
// Distribute the sphere's solid angle evenly among all cones instead of following the octahedral distortion.
float ConeHalfAngle = acosFast(1.0f - 1.0f / (float)(TraceResolution * TraceResolution));
// Set up the cone trace input.
FConeTraceInput TraceInput;
TraceInput.Setup(
ProbeWorldCenter, WorldConeDirection,
ConeHalfAngle, MinSampleRadius,
FinalMinTraceDistance, FinalMaxTraceDistance,
EffectiveStepFactor);
TraceInput.VoxelStepFactor = VoxelStepFactor;
bool bContinueCardTracing = false;
TraceInput.VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(FinalMinTraceDistance, FinalMaxTraceDistance, MaxMeshSDFTraceDistance, bContinueCardTracing);
// Run the cone trace for this probe texel.
FConeTraceResult TraceResult = TraceForProbeTexel(TraceInput);
// Store the traced lighting.
SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x] = TraceResult.Lighting;
// Store the traced hit distance.
#if RADIANCE_CACHE_STORE_DEPTHS
SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x] = TraceResult.OpaqueHitDistance;
#endif
}
GroupMemoryBarrierWithGroupSync();
uint2 ProbeAtlasBaseCoord = RadianceProbeResolution * uint2(ProbeIndex % ProbeAtlasResolutionInProbes.x, ProbeIndex / ProbeAtlasResolutionInProbes.x);
// Write the lighting result and the hit distance to the probe atlas.
if (TraceResolution < RadianceProbeResolution)
{
uint UpsampleFactor = RadianceProbeResolution / TraceResolution;
ProbeAtlasBaseCoord += (THREADGROUP_SIZE * TraceTileCoord + GroupThreadId.xy) * UpsampleFactor;
float3 Lighting = SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x];
for (uint Y = 0; Y < UpsampleFactor; Y++)
{
for (uint X = 0; X < UpsampleFactor; X++)
{
RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = Lighting;
}
}
#if RADIANCE_CACHE_STORE_DEPTHS
float HitDistance = min(SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x], MaxHalfFloat);
for (uint Y = 0; Y < UpsampleFactor; Y++)
{
for (uint X = 0; X < UpsampleFactor; X++)
{
RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = HitDistance;
}
}
#endif
}
else
{
uint DownsampleFactor = TraceResolution / RadianceProbeResolution;
uint WriteTileSize = THREADGROUP_SIZE / DownsampleFactor;
if (all(GroupThreadId.xy < WriteTileSize))
{
float3 Lighting = 0;
for (uint Y = 0; Y < DownsampleFactor; Y++)
{
for (uint X = 0; X < DownsampleFactor; X++)
{
Lighting += SharedTraceRadiance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X];
}
}
ProbeAtlasBaseCoord += WriteTileSize * TraceTileCoord + GroupThreadId.xy;
RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord] = Lighting / (float)(DownsampleFactor * DownsampleFactor);
#if RADIANCE_CACHE_STORE_DEPTHS
float HitDistance = MaxHalfFloat;
for (uint Y = 0; Y < DownsampleFactor; Y++)
{
for (uint X = 0; X < DownsampleFactor; X++)
{
HitDistance = min(HitDistance, SharedTraceHitDistance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X]);
}
}
RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord] = HitDistance;
#endif
}
}
}
}
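The final branch of the shader above resolves a tile traced at a higher resolution than the probe stores: radiance is box-filtered (averaged), while the hit distance keeps the minimum of the covered texels. A CPU sketch of that resolve is given below, using a scalar radiance for brevity. `TraceTexel` and `DownsampleTile` are hypothetical names, and `MaxHitDistance` stands in for the shader's `MaxHalfFloat`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-texel trace result; the shader stores float3 radiance.
struct TraceTexel { float Radiance; float HitDistance; };

// Tile is TileSize x TileSize in row-major order; Factor must divide TileSize.
inline std::vector<TraceTexel> DownsampleTile(
    const std::vector<TraceTexel>& Tile,
    uint32_t TileSize,
    uint32_t Factor,
    float MaxHitDistance)
{
    const uint32_t OutSize = TileSize / Factor;
    std::vector<TraceTexel> Out(OutSize * OutSize);

    for (uint32_t OutY = 0; OutY < OutSize; ++OutY)
    {
        for (uint32_t OutX = 0; OutX < OutSize; ++OutX)
        {
            float RadianceSum = 0.0f;
            float MinHit = MaxHitDistance;
            for (uint32_t Y = 0; Y < Factor; ++Y)
            {
                for (uint32_t X = 0; X < Factor; ++X)
                {
                    const TraceTexel& T = Tile[(OutY * Factor + Y) * TileSize + (OutX * Factor + X)];
                    RadianceSum += T.Radiance;                // box filter for lighting
                    MinHit = std::min(MinHit, T.HitDistance); // conservative (nearest) depth
                }
            }
            Out[OutY * OutSize + OutX] = { RadianceSum / static_cast<float>(Factor * Factor), MinHit };
        }
    }
    return Out;
}
```

Averaging preserves the energy of the traced radiance, while taking the minimum keeps the nearest occluder. The opposite branch of the shader (trace resolution lower than probe resolution) simply replicates each traced texel into an UpsampleFactor x UpsampleFactor block.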
Next, step into TraceForProbeTexel to analyze the call stack for tracing a probe texel:
FConeTraceResult TraceForProbeTexel(FConeTraceInput TraceInput)
{
// Initialize the trace result struct.
FConeTraceResult TraceResult;
TraceResult = (FConeTraceResult)0;
TraceResult.Lighting = 0.0;
TraceResult.Transparency = 1.0;
TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;
// Cone trace the Lumen scene voxels; analyzed below.
ConeTraceLumenSceneVoxels(TraceInput, TraceResult);
// Trace the distant scene.
#if TRACE_DISTANT_SCENE
if (TraceResult.Transparency > .01f)
{
FConeTraceResult DistantTraceResult;
// Cone trace the Lumen distant scene; analyzed below.
ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
TraceResult.Transparency *= DistantTraceResult.Transparency;
}
#endif
// Sky light contribution.
#if ENABLE_DYNAMIC_SKY_LIGHT
if (ReflectionStruct.SkyLightParameters.y > 0)
{
float SkyAverageBrightness = 1.0f;
float Roughness = TanConeAngleToRoughness(tan(TraceInput.ConeAngle));
TraceResult.Lighting = TraceResult.Lighting + GetSkyLightReflection(TraceInput.ConeDirection, Roughness, SkyAverageBrightness) * TraceResult.Transparency;
}
#endif
return TraceResult;
}
// Cone trace the Lumen scene voxels.
void ConeTraceLumenSceneVoxels(
FConeTraceInput TraceInput,
inout FConeTraceResult OutResult)
{
#if SCENE_TRACE_VOXELS
if (TraceInput.VoxelTraceStartDistance < TraceInput.MaxTraceDistance)
{
FConeTraceInput VoxelTraceInput = TraceInput;
VoxelTraceInput.MinTraceDistance = TraceInput.VoxelTraceStartDistance;
FConeTraceResult VoxelTraceResult;
// Cone trace the voxels; already analyzed earlier.
ConeTraceVoxels(VoxelTraceInput, VoxelTraceResult);
// Apply the transparency.
#if !VISIBILITY_ONLY_TRACE
OutResult.Lighting += VoxelTraceResult.Lighting * OutResult.Transparency;
#endif
OutResult.Transparency *= VoxelTraceResult.Transparency;
OutResult.NumSteps += VoxelTraceResult.NumSteps;
OutResult.OpaqueHitDistance = min(OutResult.OpaqueHitDistance, VoxelTraceResult.OpaqueHitDistance);
}
#endif
}
// Cone trace the Lumen distant scene.
void ConeTraceLumenDistantScene(
FConeTraceInput TraceInput,
inout FConeTraceResult OutResult)
{
float3 debug = 0;
TraceInput.MaxTraceDistance = LumenCardScene.DistantSceneMaxTraceDistance;
TraceInput.bBlackOutSteepIntersections = true;
FCardTraceBlendState CardTraceBlendState;
CardTraceBlendState.Initialize(TraceInput.MaxTraceDistance);
if (LumenCardScene.NumDistantCards > 0)
{
// Derive the minimum trace distance from the voxel clipmap.
if (NumClipmapLevels > 0)
{
float3 VoxelLightingCenter = ClipmapWorldCenter[NumClipmapLevels - 1].xyz;
float3 VoxelLightingExtent = ClipmapWorldSamplingExtent[NumClipmapLevels - 1].xyz;
float3 RayEnd = TraceInput.ConeOrigin + TraceInput.ConeDirection * TraceInput.MaxTraceDistance;
float2 IntersectionTimes = LineBoxIntersect(TraceInput.ConeOrigin, RayEnd, VoxelLightingCenter - VoxelLightingExtent, VoxelLightingCenter + VoxelLightingExtent);
// If we are starting inside the voxel clipmaps, move the start of the trace past the voxel clipmaps
if (IntersectionTimes.x < IntersectionTimes.y && IntersectionTimes.x < .001f)
{
TraceInput.MinTraceDistance = IntersectionTimes.y * TraceInput.MaxTraceDistance;
}
}
float TraceEndDistance = TraceInput.MinTraceDistance;
{
uint ListIndex = 0;
uint CardIndex = LumenCardScene.DistantCardIndices[ListIndex];
// Cone trace a single Lumen card; analyzed below.
ConeTraceSingleLumenCard(
TraceInput,
CardIndex,
debug,
TraceEndDistance,
CardTraceBlendState);
}
}
OutResult = (FConeTraceResult)0;
// Store the result.
#if !VISIBILITY_ONLY_TRACE
OutResult.Lighting = CardTraceBlendState.GetFinalLighting();
#endif
OutResult.Transparency = CardTraceBlendState.GetTransparency();
OutResult.NumSteps = CardTraceBlendState.NumSteps;
OutResult.NumOverlaps = CardTraceBlendState.NumOverlaps;
OutResult.OpaqueHitDistance = CardTraceBlendState.OpaqueHitDistance;
OutResult.Debug = debug;
}
// Cone trace a single Lumen card.
void ConeTraceSingleLumenCard(
FConeTraceInput TraceInput,
uint CardIndex,
inout float3 Debug,
inout float OutTraceEndDistance,
inout FCardTraceBlendState CardTraceBlendState)
{
// Fetch the card data.
FLumenCardData LumenCardData = GetLumenCardData(CardIndex);
// Transform the cone into the card's local space.
float3 LocalConeOrigin = mul(TraceInput.ConeOrigin - LumenCardData.Origin, LumenCardData.WorldToLocalRotation);
float3 LocalConeDirection = mul(TraceInput.ConeDirection, LumenCardData.WorldToLocalRotation);
float3 LocalTraceEnd = LocalConeOrigin + LocalConeDirection * TraceInput.MaxTraceDistance;
// Intersection range.
float2 IntersectionRange = LineBoxIntersect(LocalConeOrigin, LocalTraceEnd, -LumenCardData.LocalExtent, LumenCardData.LocalExtent);
IntersectionRange.x = max(IntersectionRange.x, TraceInput.MinTraceDistance / TraceInput.MaxTraceDistance);
OutTraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;
if (IntersectionRange.y > IntersectionRange.x
&& LumenCardData.bVisible)
{
{
// Blend state for this card trace.
FCardTraceBlendState ConeStepBlendState;
ConeStepBlendState.Initialize(TraceInput.MaxTraceDistance);
float StepTime = IntersectionRange.x * TraceInput.MaxTraceDistance;
float3 SamplePosition = LocalConeOrigin + StepTime * LocalConeDirection;
float TraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;
float IntersectionLength = (IntersectionRange.y - IntersectionRange.x) * TraceInput.MaxTraceDistance;
float MinStepSize = IntersectionLength / (float)LumenCardScene.MaxConeSteps;
float PreviousStepTime = StepTime;
float3 PreviousSamplePosition = SamplePosition;
// Magic value to prevent linear intersection approximation on first step
float PreviousHeightfieldZ = -2;
bool bClampedToEnd = false;
bool bFoundSurface = false;
bool bRayAboveSurface = false;
float IntersectionStepTime = 0;
float2 IntersectionSamplePositionXY = SamplePosition.xy;
float IntersectionSlope = 0;
uint NumStepsPerLoop = 4; // Take 4 samples per loop iteration.
for (uint StepIndex = 0; StepIndex < LumenCardScene.MaxConeSteps && StepTime < TraceEndDistance; StepIndex += NumStepsPerLoop)
{
float SampleRadius = max(TraceInput.ConeStartRadius + TraceInput.TanConeAngle * StepTime, TraceInput.MinSampleRadius);
float StepSize = max(SampleRadius * TraceInput.StepFactor, MinStepSize);
float TraceClampDistance = TraceEndDistance - StepSize * .0001f;
float DepthMip;
float2 DepthValidRegionScale;
CalculateMip(SampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, DepthMip, DepthValidRegionScale);
// The 4 sample positions.
float3 SamplePosition1 = LocalConeOrigin + min(StepTime + 0 * StepSize, TraceClampDistance) * LocalConeDirection;
float3 SamplePosition2 = LocalConeOrigin + min(StepTime + 1 * StepSize, TraceClampDistance) * LocalConeDirection;
float3 SamplePosition3 = LocalConeOrigin + min(StepTime + 2 * StepSize, TraceClampDistance) * LocalConeDirection;
float3 SamplePosition4 = LocalConeOrigin + min(StepTime + 3 * StepSize, TraceClampDistance) * LocalConeDirection;
// The 4 depth atlas UVs.
float2 DepthAtlasUV1 = CalculateAtlasUV(SamplePosition1.xy, DepthValidRegionScale, LumenCardData);
float2 DepthAtlasUV2 = CalculateAtlasUV(SamplePosition2.xy, DepthValidRegionScale, LumenCardData);
float2 DepthAtlasUV3 = CalculateAtlasUV(SamplePosition3.xy, DepthValidRegionScale, LumenCardData);
float2 DepthAtlasUV4 = CalculateAtlasUV(SamplePosition4.xy, DepthValidRegionScale, LumenCardData);
// The 4 depth samples.
float Depth1 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV1, DepthMip).x;
float Depth2 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV2, DepthMip).x;
float Depth3 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV3, DepthMip).x;
float Depth4 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV4, DepthMip).x;
// The 4 heightfield Z values.
float HeightfieldZ1 = LumenCardData.LocalExtent.z - Depth1 * 2 * LumenCardData.LocalExtent.z;
float HeightfieldZ2 = LumenCardData.LocalExtent.z - Depth2 * 2 * LumenCardData.LocalExtent.z;
float HeightfieldZ3 = LumenCardData.LocalExtent.z - Depth3 * 2 * LumenCardData.LocalExtent.z;
float HeightfieldZ4 = LumenCardData.LocalExtent.z - Depth4 * 2 * LumenCardData.LocalExtent.z;
ConeStepBlendState.RegisterStep(NumStepsPerLoop);
// Whether each sample lies below the heightfield.
bool4 HeightfieldHit = bool4(
SamplePosition1.z < HeightfieldZ1,
SamplePosition2.z < HeightfieldZ2,
SamplePosition3.z < HeightfieldZ3,
SamplePosition4.z < HeightfieldZ4);
bool bRayBelowHeightfield = any(HeightfieldHit);
bool bRayWasAboveSurface = bRayAboveSurface;
if (!bRayBelowHeightfield)
{
bRayAboveSurface = true;
}
// A trace that starts below the heightfield must first rise above it before it can register a hit.
if (bRayBelowHeightfield && bRayWasAboveSurface)
{
float HeightfieldZ;
if (HeightfieldHit.x)
{
SamplePosition = SamplePosition1;
HeightfieldZ = HeightfieldZ1;
StepTime = StepTime + 0 * StepSize;
}
else if (HeightfieldHit.y)
{
PreviousSamplePosition = SamplePosition1;
PreviousHeightfieldZ = HeightfieldZ1;
PreviousStepTime = StepTime + 0 * StepSize;
SamplePosition = SamplePosition2;
HeightfieldZ = HeightfieldZ2;
StepTime = StepTime + 1 * StepSize;
}
else if (HeightfieldHit.z)
{
PreviousSamplePosition = SamplePosition2;
PreviousHeightfieldZ = HeightfieldZ2;
PreviousStepTime = StepTime + 1 * StepSize;
SamplePosition = SamplePosition3;
HeightfieldZ = HeightfieldZ3;
StepTime = StepTime + 2 * StepSize;
}
else
{
PreviousSamplePosition = SamplePosition3;
PreviousHeightfieldZ = HeightfieldZ3;
PreviousStepTime = StepTime + 2 * StepSize;
SamplePosition = SamplePosition4;
HeightfieldZ = HeightfieldZ4;
StepTime = StepTime + 3 * StepSize;
}
StepTime = min(StepTime, TraceClampDistance);
if (PreviousHeightfieldZ != -2)
{
// Solve for the intersection time by linear interpolation between the previous and current samples.
IntersectionStepTime = PreviousStepTime + ((PreviousSamplePosition.z - PreviousHeightfieldZ) * (StepTime - PreviousStepTime)) / (HeightfieldZ - PreviousHeightfieldZ + PreviousSamplePosition.z - SamplePosition.z);
float2 LocalPositionSlopeXY = (SamplePosition.xy - PreviousSamplePosition.xy) / (StepTime - PreviousStepTime);
IntersectionSamplePositionXY = LocalPositionSlopeXY * (IntersectionStepTime - PreviousStepTime) + PreviousSamplePosition.xy;
IntersectionSlope = abs(PreviousHeightfieldZ - HeightfieldZ) / max(length(PreviousSamplePosition.xy - SamplePosition.xy), .0001f);
PreviousHeightfieldZ = -2;
// Found the surface.
bFoundSurface = true;
}
break;
}
PreviousStepTime = StepTime + 3 * StepSize;
PreviousSamplePosition = SamplePosition4;
PreviousHeightfieldZ = HeightfieldZ4;
StepTime += 4 * StepSize;
if (StepTime >= TraceEndDistance && !bClampedToEnd)
{
bClampedToEnd = true;
// Stop the last step just before the intersection end, since the linear approximation needs to step past the surface to detect a hit, without terminating the loop
StepTime = TraceClampDistance;
}
}
// If a surface point was found.
if (bFoundSurface)
{
float IntersectionSampleRadius = TraceInput.ConeStartRadius + TraceInput.TanConeAngle * IntersectionStepTime;
float MaxMip;
float2 ValidRegionScale;
CalculateMip(IntersectionSampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, MaxMip, ValidRegionScale);
float2 IntersectionAtlasUV = CalculateAtlasUV(IntersectionSamplePositionXY, ValidRegionScale, LumenCardData);
float DistanceToSurface = 0;
float ConeIntersectSurface = saturate(DistanceToSurface / IntersectionSampleRadius);
float ConeVisibility = ConeIntersectSurface;
float MaxDistanceFade = 1;
ConeStepBlendState.RegisterOpaqueHit(IntersectionStepTime);
OutTraceEndDistance = IntersectionStepTime;
float Opacity = Texture2DSampleLevel(OpacityAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).x;
float ConeOcclusion = (1.0f - ConeVisibility) * Opacity * MaxDistanceFade;
#if VISIBILITY_ONLY_TRACE
float3 StepLighting = 0;
#else
float3 StepLighting = Texture2DSampleLevel(FinalLightingAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).rgb;
#endif
if (TraceInput.bBlackOutSteepIntersections)
{
// Assume steep parts are covered by other faces, and fade them out.
float SlopeFade = 1 - saturate((IntersectionSlope - 5) / 1.0f);
StepLighting = lerp(0, StepLighting, SlopeFade);
ConeOcclusion = lerp(0, ConeOcclusion, SlopeFade);
}
ConeStepBlendState.AddLighting(StepLighting, ConeOcclusion, IntersectionStepTime);
}
CardTraceBlendState.AddCardTrace(ConeStepBlendState);
}
}
}
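As shown above, the RadianceCache stage goes through an intricate rendering process. TraceFromProbes alone cone-traces first the voxel lighting and then the distant scene cards, and finally has to account for the sky light as well.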
- TraceScreenProbes
TraceScreenProbes covers screen-space probe tracing, mesh distance field tracing, voxel lighting tracing, and so on. The code is as follows:
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeTracing.cpp
void TraceScreenProbes(
FRDGBuilder& GraphBuilder,
const FScene* Scene,
const FViewInfo& View,
bool bTraceMeshSDFs,
TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTexturesUniformBuffer,
const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColor,
FRDGTextureRef LightingChannelsTexture,
const FLumenCardTracingInputs& TracingInputs,
const LumenRadianceCache::FRadianceCacheInterpolationParameters& RadianceCacheParameters,
FScreenProbeParameters& ScreenProbeParameters,
FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
const FSceneTextureParameters SceneTextures = GetSceneTextureParameters(GraphBuilder, SceneTexturesUniformBuffer);
// Clear the probe traces.
{
FClearTracesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FClearTracesCS::FParameters>();
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
auto ComputeShader = View.ShaderMap->GetShader<FClearTracesCS>(0);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ClearTraces %ux%u", ScreenProbeParameters.ScreenProbeTracingOctahedronResolution, ScreenProbeParameters.ScreenProbeTracingOctahedronResolution),
ComputeShader,
PassParameters,
ScreenProbeParameters.ProbeIndirectArgs,
(uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
}
FLumenIndirectTracingParameters IndirectTracingParameters;
SetupLumenDiffuseTracingParameters(IndirectTracingParameters);
const bool bTraceScreen = View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid()
&& GLumenScreenProbeGatherScreenTraces != 0
&& !View.Family->EngineShowFlags.VisualizeLumenIndirectDiffuse;
// Trace the probes in screen space.
if (bTraceScreen)
{
FScreenProbeTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceScreenTexturesCS::FParameters>();
ScreenSpaceRayTracing::SetupCommonScreenSpaceRayParameters(GraphBuilder, SceneTextures, PrevSceneColor, View, /* out */ &PassParameters->ScreenSpaceRayParameters);
PassParameters->ScreenSpaceRayParameters.CommonDiffuseParameters.SceneTextures = SceneTextures;
{
const FVector2D HZBUvFactor(
float(View.ViewRect.Width()) / float(2 * View.HZBMipmap0Size.X),
float(View.ViewRect.Height()) / float(2 * View.HZBMipmap0Size.Y));
const FVector4 ScreenPositionScaleBias = View.GetScreenPositionScaleBias(SceneTextures.SceneDepthTexture->Desc.Extent, View.ViewRect);
const FVector2D HZBUVToScreenUVScale = FVector2D(1.0f / HZBUvFactor.X, 1.0f / HZBUvFactor.Y) * FVector2D(2.0f, -2.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y);
const FVector2D HZBUVToScreenUVBias = FVector2D(-1.0f, 1.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y) + FVector2D(ScreenPositionScaleBias.W, ScreenPositionScaleBias.Z);
PassParameters->HZBUVToScreenUVScaleBias = FVector4(HZBUVToScreenUVScale, HZBUVToScreenUVBias);
}
checkf(View.ClosestHZB, TEXT("Lumen screen tracing: ClosestHZB was not setup, should have been setup by FDeferredShadingSceneRenderer::RenderHzb"));
PassParameters->ClosestHZBTexture = View.ClosestHZB;
PassParameters->SceneDepthTexture = SceneTextures.SceneDepthTexture;
PassParameters->LightingChannelsTexture = LightingChannelsTexture;
PassParameters->HZBBaseTexelSize = FVector2D(1.0f / View.ClosestHZB->Desc.Extent.X, 1.0f / View.ClosestHZB->Desc.Extent.Y);
PassParameters->MaxHierarchicalScreenTraceIterations = GLumenScreenProbeGatherHierarchicalScreenTracesMaxIterations;
PassParameters->UncertainTraceRelativeDepthThreshold = GLumenScreenProbeGatherUncertainTraceRelativeDepthThreshold;
PassParameters->NumThicknessStepsToDetermineCertainty = GLumenScreenProbeGatherNumThicknessStepsToDetermineCertainty;
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
PassParameters->IndirectTracingParameters = IndirectTracingParameters;
PassParameters->RadianceCacheParameters = RadianceCacheParameters;
FScreenProbeTraceScreenTexturesCS::FPermutationDomain PermutationVector;
PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FRadianceCache >(LumenScreenProbeGather::UseRadianceCache(View));
PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FHierarchicalScreenTracing >(GLumenScreenProbeGatherHierarchicalScreenTraces != 0);
PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceScreenTexturesCS>(PermutationVector);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceScreen"),
ComputeShader,
PassParameters,
ScreenProbeParameters.ProbeIndirectArgs,
(uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
}
// Trace mesh distance fields.
if (bTraceMeshSDFs)
{
// Hardware mode
if (Lumen::UseHardwareRayTracedScreenProbeGather())
{
FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
GraphBuilder,
View,
ScreenProbeParameters,
WORLD_MAX,
IndirectTracingParameters.MaxTraceDistance);
RenderHardwareRayTracingScreenProbe(GraphBuilder,
Scene,
SceneTextures,
ScreenProbeParameters,
View,
TracingInputs,
IndirectTracingParameters,
RadianceCacheParameters,
CompactedTraceParameters);
}
// Software mode
else
{
CullForCardTracing(
GraphBuilder,
Scene, View,
TracingInputs,
IndirectTracingParameters,
/* out */ MeshSDFGridParameters);
if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
{
FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
GraphBuilder,
View,
ScreenProbeParameters,
IndirectTracingParameters.CardTraceEndDistanceFromCamera,
IndirectTracingParameters.MaxMeshSDFTraceDistance);
{
FScreenProbeTraceMeshSDFsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceMeshSDFsCS::FParameters>();
GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
PassParameters->MeshSDFGridParameters = MeshSDFGridParameters;
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
PassParameters->IndirectTracingParameters = IndirectTracingParameters;
PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
PassParameters->CompactedTraceParameters = CompactedTraceParameters;
FScreenProbeTraceMeshSDFsCS::FPermutationDomain PermutationVector;
PermutationVector.Set< FScreenProbeTraceMeshSDFsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceMeshSDFsCS>(PermutationVector);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceMeshSDFs"),
ComputeShader,
PassParameters,
CompactedTraceParameters.IndirectArgs,
0);
}
}
}
}
// Compact the trace parameters.
FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
GraphBuilder,
View,
ScreenProbeParameters,
WORLD_MAX,
// Make sure the shader runs on all misses to apply radiance cache + skylight
IndirectTracingParameters.MaxTraceDistance + 1);
// Trace voxel lighting.
{
FScreenProbeTraceVoxelsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceVoxelsCS::FParameters>();
PassParameters->RadianceCacheParameters = RadianceCacheParameters;
GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
PassParameters->ScreenProbeParameters = ScreenProbeParameters;
PassParameters->IndirectTracingParameters = IndirectTracingParameters;
PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
PassParameters->CompactedTraceParameters = CompactedTraceParameters;
const bool bRadianceCache = LumenScreenProbeGather::UseRadianceCache(View);
FScreenProbeTraceVoxelsCS::FPermutationDomain PermutationVector;
PermutationVector.Set< FScreenProbeTraceVoxelsCS::FDynamicSkyLight >(Lumen::ShouldHandleSkyLight(Scene, *View.Family));
PermutationVector.Set< FScreenProbeTraceVoxelsCS::FTraceDistantScene >(Scene->LumenSceneData->DistantCardIndices.Num() > 0);
PermutationVector.Set< FScreenProbeTraceVoxelsCS::FRadianceCache >(bRadianceCache);
PermutationVector.Set< FScreenProbeTraceVoxelsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceVoxelsCS>(PermutationVector);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceVoxels"),
ComputeShader,
PassParameters,
CompactedTraceParameters.IndirectArgs,
0);
}
if (GLumenScreenProbeGatherVisualizeTraces)
{
SetupVisualizeTraces(GraphBuilder, Scene, View, ScreenProbeParameters);
}
}
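TraceScreenProbes calls CompactTraces before each of the later tracing stages, and the comment on the last call notes that the voxel shader must still run on all misses. One way to picture this is as stream compaction: only the texels that earlier stages left unresolved are gathered into a dense list, so the next pass can launch one thread per remaining texel. The C++ sketch below is a CPU-side illustration under that assumption; `TraceTexel`, `CompactUnresolvedTraces`, and the exact predicate are hypothetical and are not taken from the engine.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CPU-side record of one trace texel; the engine keeps this data
// in GPU textures and buffers.
struct TraceTexel {
    std::uint32_t ProbeIndex;
    std::uint32_t TexelIndex;
    float HitDistance;
    bool bHit;
};

// Keeps only the texels that have no hit yet and whose traced distance so far
// is below MaxDistance. The result is dense, so a later pass can process
// element i with thread i.
std::vector<TraceTexel> CompactUnresolvedTraces(const std::vector<TraceTexel>& Traces, float MaxDistance)
{
    std::vector<TraceTexel> Compacted;
    Compacted.reserve(Traces.size());
    for (const TraceTexel& Trace : Traces) {
        if (!Trace.bHit && Trace.HitDistance < MaxDistance) {
            Compacted.push_back(Trace);
        }
    }
    return Compacted;
}
```

Under this reading, passing a limit slightly above the maximum trace distance (as the `MaxTraceDistance + 1` argument in the listing does) would let every miss through, including those whose distance equals the maximum.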
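First, let's analyze TraceScreen using frame capture data. Its inputs are textures such as BlueNoise, Velocity, depth, probe speed, ray info, HZB, and SSRReducedSceneColor; its outputs are the TraceRadiance texture in R11G11B10 format and the TraceHit texture in R32 format:
Left: TraceRadiance; right: TraceHit.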
The compute shader used by TraceScreen is as follows:
// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf
[numthreads(PROBE_THREADGROUP_SIZE_2D, PROBE_THREADGROUP_SIZE_2D, 1)]
void ScreenProbeTraceScreenTexturesCS(
uint3 GroupId : SV_GroupID,
uint3 DispatchThreadId : SV_DispatchThreadID,
uint3 GroupThreadId : SV_GroupThreadID)
{
#define DEINTERLEAVED_SCREEN_TRACING 1
// Compute the texture coordinates.
#if DEINTERLEAVED_SCREEN_TRACING
uint2 AtlasSizeInProbes = uint2(ScreenProbeAtlasViewSize.x, (GetNumScreenProbes() + ScreenProbeAtlasViewSize.x - 1) / ScreenProbeAtlasViewSize.x);
uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy % AtlasSizeInProbes;
uint2 TraceTexelCoord = DispatchThreadId.xy / AtlasSizeInProbes;
#else
uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy / ScreenProbeTracingOctahedronResolution;
uint2 TraceTexelCoord = DispatchThreadId.xy - ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution;
#endif
uint ScreenProbeIndex = ScreenProbeAtlasCoord.y * ScreenProbeAtlasViewSize.x + ScreenProbeAtlasCoord.x;
uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);
if (ScreenProbeIndex < GetNumScreenProbes() && all(TraceTexelCoord < ScreenProbeTracingOctahedronResolution))
{
float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);
if (SceneDepth > 0.0f)
{
float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);
float2 ProbeUV;
float ConeHalfAngle;
// Get the probe tracing UV.
GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);
float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);
float DepthThresholdScale = HasDistanceFieldRepresentation(ScreenUV) ? 1.0f : ScreenTraceNoFallbackThicknessScale;
{
float TraceDistance = MaxTraceDistance;
bool bCoveredByRadianceCache = false;
#if RADIANCE_CACHE
float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
#endif
#if HIERARCHICAL_SCREEN_TRACING // Hierarchical screen tracing
bool bHit;
bool bUncertain;
float3 HitUVz;
// Screen trace.
TraceScreen(
WorldPosition + View.PreViewTranslation,
WorldConeDirection,
TraceDistance,
HZBUvFactorAndInvFactor,
MaxHierarchicalScreenTraceIterations,
UncertainTraceRelativeDepthThreshold * DepthThresholdScale,
NumThicknessStepsToDetermineCertainty,
bHit,
bUncertain,
HitUVz);
float Level = 1;
bool bWriteDepthOnMiss = true;
#else // Non-hierarchical screen tracing
uint NumSteps = 16;
float StartMipLevel = 1.0f;
float MaxScreenTraceFraction = .2f;
// Fixed step count screen traces only give decent quality when the trace distance is limited.
float MaxWorldTraceDistance = SceneDepth * MaxScreenTraceFraction * 2.0 * GetTanHalfFieldOfView().x;
TraceDistance = min(TraceDistance, MaxWorldTraceDistance);
uint2 NoiseCoord = ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution + TraceTexelCoord;
float StepOffset = InterleavedGradientNoise(NoiseCoord + 0.5f, 0);
float RayRoughness = .2f;
StepOffset = StepOffset - .9f;
FSSRTCastingSettings CastSettings = CreateDefaultCastSettings();
CastSettings.bStopWhenUncertain = true;
bool bHit = false;
float Level;
float3 HitUVz;
bool bRayWasClipped;
// Initialize the screen-space ray from the world-space ray.
FSSRTRay Ray = InitScreenSpaceRayFromWorldSpace(
WorldPosition + View.PreViewTranslation, WorldConeDirection,
/* WorldTMax = */ TraceDistance,
/* SceneDepth = */ SceneDepth,
/* SlopeCompareToleranceScale */ 2.0f * DepthThresholdScale,
/* bExtendRayToScreenBorder = */ false,
/* out */ bRayWasClipped);
bool bUncertain;
float3 DebugOutput;
// Cast the screen-space ray.
CastScreenSpaceRay(
FurthestHZBTexture, FurthestHZBTextureSampler,
StartMipLevel,
CastSettings,
Ray, RayRoughness, NumSteps, StepOffset,
HZBUvFactorAndInvFactor, false,
/* out */ DebugOutput,
/* out */ HitUVz,
/* out */ Level,
/* out */ bHit,
/* out */ bUncertain);
// CastScreenSpaceRay skips Mesh SDF tracing in a lot of places where it shouldn't, in particular missing thin occluders due to low NumSteps.
bool bWriteDepthOnMiss = !bUncertain;
#endif
bHit = bHit && !bUncertain;
uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
bool bFastMoving = false;
// Handle the hit.
if (bHit)
{
float2 ReducedColorUV = HitUVz.xy * ColorBufferScaleBias.xy + ColorBufferScaleBias.zw;
ReducedColorUV = min(ReducedColorUV, ReducedColorUVMax);
float3 Lighting = ColorTexture.SampleLevel(ColorTextureSampler, ReducedColorUV, Level).rgb;
#if DEBUG_VISUALIZE_TRACE_TYPES
RWTraceRadiance[TraceCoord] = float3(.5f, 0, 0) * View.PreExposure;
#else
RWTraceRadiance[TraceCoord] = Lighting;
#endif
float3 HitWorldVelocity;
{
float2 HitScreenUV = HitUVz.xy;
float2 HitScreenPosition = (HitScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;
float HitDeviceZ = HitUVz.z;
float HitSceneDepth = ConvertFromDeviceZ(HitUVz.z);
float3 HitHistoryScreenPosition = GetHistoryScreenPosition(HitScreenPosition, HitScreenUV, HitDeviceZ);
float3 HitTranslatedWorldPosition = mul(float4(HitScreenPosition * HitSceneDepth, HitSceneDepth, 1), View.ScreenToTranslatedWorld).xyz;
HitWorldVelocity = HitTranslatedWorldPosition - GetPrevTranslatedWorldPosition(HitHistoryScreenPosition);
}
float ProbeWorldSpeed = ScreenProbeWorldSpeed.Load(int3(ScreenProbeAtlasCoord, 0)).x;
float HitWorldSpeed = length(HitWorldVelocity);
bFastMoving = abs(ProbeWorldSpeed - HitWorldSpeed) / max(SceneDepth, 100.0f) > RelativeSpeedDifferenceToConsiderLightingMoving;
}
// Store the depth on a hit, or when writing depth on a miss is requested.
if (bHit || bWriteDepthOnMiss)
{
float HitDistance = min(sqrt(ComputeRayHitSqrDistance(WorldPosition + View.PreViewTranslation, HitUVz)), MaxTraceDistance);
RWTraceHit[TraceCoord] = EncodeProbeRayDistance(HitDistance, bHit, bFastMoving);
}
}
}
}
}
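The `DEINTERLEAVED_SCREEN_TRACING` branch at the top of the shader changes how a dispatch thread ID is split into a probe coordinate and a trace texel coordinate. The C++ sketch below mirrors the two index computations from the listing so they can be compared side by side; `UInt2`, `ThreadMapping`, and the two function names are hypothetical helpers introduced for illustration.

```cpp
#include <cstdint>

struct UInt2 { std::uint32_t X, Y; };

struct ThreadMapping {
    UInt2 ProbeAtlasCoord;
    UInt2 TraceTexelCoord;
};

// Deinterleaved layout: neighbouring threads handle the same trace texel of
// neighbouring probes (modulo picks the probe, division picks the texel).
ThreadMapping MapDeinterleaved(UInt2 ThreadId, UInt2 AtlasSizeInProbes)
{
    return { { ThreadId.X % AtlasSizeInProbes.X, ThreadId.Y % AtlasSizeInProbes.Y },
             { ThreadId.X / AtlasSizeInProbes.X, ThreadId.Y / AtlasSizeInProbes.Y } };
}

// Interleaved layout: neighbouring threads handle neighbouring trace texels of
// the same probe (division picks the probe, the remainder picks the texel).
ThreadMapping MapInterleaved(UInt2 ThreadId, std::uint32_t OctahedronResolution)
{
    const UInt2 Probe = { ThreadId.X / OctahedronResolution, ThreadId.Y / OctahedronResolution };
    return { Probe,
             { ThreadId.X - Probe.X * OctahedronResolution, ThreadId.Y - Probe.Y * OctahedronResolution } };
}
```

In the deinterleaved layout adjacent threads trace roughly the same direction from adjacent probes, which plausibly keeps their screen-space rays more coherent; that motivation is an inference, not something stated in the source.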
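The code above takes one of two screen tracing paths depending on HIERARCHICAL_SCREEN_TRACING. The frame capture shows that HIERARCHICAL_SCREEN_TRACING is 1, so TraceScreen is used rather than CastScreenSpaceRay. TraceScreen is analyzed below: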
// Engine\Shaders\Private\Lumen\LumenScreenTracing.ush
// Trace in screen space by traversing the HZB. Accurate but relatively slow.
void TraceScreen(
float3 RayTranslatedWorldOrigin,
float3 RayWorldDirection,
float MaxWorldTraceDistance,
float4 HZBUvFactorAndInvFactor,
float MaxIterations,
float UncertainTraceRelativeDepthThreshold,
float NumThicknessStepsToDetermineCertainty,
inout bool bHit,
inout bool bUncertain,
inout float3 OutScreenUV)
{
// Compute the screen UV of the ray start.
float3 RayStartScreenUV;
{
float4 RayStartClip = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToClip);
float3 RayStartScreenPosition = RayStartClip.xyz / max(RayStartClip.w, 1.0f);
RayStartScreenUV = float3((RayStartScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayStartScreenPosition.z);
}
// Compute the screen UV of the ray end.
float3 RayEndScreenUV;
{
float3 ViewRayDirection = mul(float4(RayWorldDirection, 0.0), View.TranslatedWorldToView).xyz;
float SceneDepth = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToView).z;
// Clamp the ray to end at the Z == 0 plane so the end point is valid in NDC space.
float RayEndWorldDistance = ViewRayDirection.z < 0.0 ? min(-0.99f * SceneDepth / ViewRayDirection.z, MaxWorldTraceDistance) : MaxWorldTraceDistance;
float3 RayWorldEnd = RayTranslatedWorldOrigin + RayWorldDirection * RayEndWorldDistance;
float4 RayEndClip = mul(float4(RayWorldEnd, 1.0f), View.TranslatedWorldToClip);
float3 RayEndScreenPosition = RayEndClip.xyz / RayEndClip.w;
RayEndScreenUV = float3((RayEndScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayEndScreenPosition.z);
float2 ScreenEdgeIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, float3(0, 0, 0), float3(HZBUvFactorAndInvFactor.xy, 1));
// Recompute the end point where the ray leaves the screen.
RayEndScreenUV = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * ScreenEdgeIntersections.y;
}
float BaseMipLevel = HZB_TRACE_INCLUDE_FULL_RES_DEPTH ? -1 : 0;
float MipLevel = BaseMipLevel;
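// Step out of the current tile without hit testing to avoid self-intersection. This is necessary because HZB mip 0 holds the closest of 2x2 depths and the HZB is stored as 16-bit floats.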
bool bStepOutOfCurrentTile = true;
if (bStepOutOfCurrentTile)
{
float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
float2 BiasedUV = RayStartScreenUV.xy;
float3 HZBTileMin = float3(floor(BiasedUV.xy / HZBTileSize) * HZBTileSize, 0.0f);
float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);
{
float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;
RayStartScreenUV = RayTileHit;
}
}
bHit = false;
bUncertain = false;
float RayLength2D = length(RayEndScreenUV.xy - RayStartScreenUV.xy);
float2 RayDirectionScreenUV = (RayEndScreenUV.xy - RayStartScreenUV.xy) / max(RayLength2D, .0001f);
float3 RayScreenUV = RayStartScreenUV;
float NumIterations = 0;
// Stackless HZB traversal.
while (MipLevel >= BaseMipLevel && NumIterations < MaxIterations)
{
float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
// RayScreenUV is on a tile boundary due to bStepOutOfCurrentTile
// Offset the UV along the ray direction so it always quantizes to the next tile
float2 BiasedUV = RayScreenUV.xy + .01f * RayDirectionScreenUV.xy * HZBTileSize;
float3 HZBTileMin = float3(floor(BiasedUV / HZBTileSize) * HZBTileSize, 0.0f);
float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);
float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;
float TileZ;
float AvoidSelfIntersectionZScale = 1.0f;
#if HZB_TRACE_INCLUDE_FULL_RES_DEPTH
if (MipLevel < 0)
{
TileZ = SceneDepthTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw, 0).x;
}
else
#endif
{
TileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV, MipLevel).x;
// Heuristic to avoid false self-intersection, because HZB mip 0 holds the closest of 2x2 depths and the HZB is stored as 16-bit floats.
AvoidSelfIntersectionZScale = lerp(.99f, 1.0f, saturate(TileIntersections.y * 10.0f));
}
if (RayTileHit.z > TileZ * AvoidSelfIntersectionZScale)
{
RayScreenUV = RayTileHit;
MipLevel++;
if (TileIntersections.y == 1.0f)
{
// The ray reached its end without hitting anything; drop below BaseMipLevel to leave the loop.
MipLevel = BaseMipLevel - 1;
}
}
else
{
if (abs(MipLevel - BaseMipLevel) < .1f)
{
// Snap the hit UV to the texel center for the SceneColor lookup.
RayScreenUV = float3(.5f * (HZBTileMin.xy + HZBTileMax.xy), RayTileHit.z);
bHit = true;
float IntersectionDepth = ConvertFromDeviceZ(TileZ);
float RayTileEnterZ = RayStartScreenUV.z + (RayEndScreenUV.z - RayStartScreenUV.z) * TileIntersections.x;
bUncertain = (ConvertFromDeviceZ(RayTileEnterZ) - IntersectionDepth) / max(IntersectionDepth, .00001f) > UncertainTraceRelativeDepthThreshold;
}
MipLevel--;
}
NumIterations++;
}
// Take linear steps along the ray to determine feature thickness, to reject hits behind very thin surfaces (grass, hair, foliage).
if (bHit && !bUncertain && NumThicknessStepsToDetermineCertainty > 0)
{
float ThicknessSearchMipLevel = 0.0f;
float MipNumTexels = exp2(ThicknessSearchMipLevel);
float2 HZBTileSize = MipNumTexels * HZBBaseTexelSize;
float NumSteps = NumThicknessStepsToDetermineCertainty / MipNumTexels;
float ThicknessSearchEndTime = min(length(RayDirectionScreenUV * HZBTileSize * NumSteps) / length(RayEndScreenUV.xy - RayScreenUV.xy), 1.0f);
for (float I = 0; I < NumSteps; I++)
{
float3 SampleUV = RayScreenUV + (I / NumSteps) * ThicknessSearchEndTime * (RayEndScreenUV - RayScreenUV);
if (all(SampleUV.xy > 0 && SampleUV.xy < HZBUvFactorAndInvFactor.xy))
{
float SampleTileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, SampleUV.xy, ThicknessSearchMipLevel).x;
if (SampleUV.z > SampleTileZ)
{
bUncertain = true;
}
}
}
}
OutScreenUV.xy = RayScreenUV.xy * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw;
OutScreenUV.z = RayScreenUV.z;
}
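The stackless traversal in TraceScreen follows a simple rule: if the ray leaves the current tile in front of the closest depth stored for that tile, skip the tile and move to a coarser mip; otherwise move to a finer mip, and report a hit once the finest mip is reached. The following C++ sketch reproduces this rule on a 1D depth buffer so it can be stepped through on the CPU. It is a simplified illustration under several assumptions: depth is linear (smaller means closer) instead of device Z, the ray only moves in +x, the texel count is a power of two, and the self-intersection and thickness heuristics are left out. `BuildClosestDepthMips` and `TraceHierarchical` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Builds a 1D "closest depth" pyramid: mip 0 is the input and every higher
// mip keeps the smaller (closer) of two neighbouring texels.
std::vector<std::vector<float>> BuildClosestDepthMips(const std::vector<float>& Depth)
{
    std::vector<std::vector<float>> Mips{Depth};
    while (Mips.back().size() > 1) {
        const std::vector<float>& Prev = Mips.back();
        std::vector<float> Next(Prev.size() / 2);
        for (std::size_t I = 0; I < Next.size(); ++I) {
            Next[I] = std::min(Prev[2 * I], Prev[2 * I + 1]);
        }
        Mips.push_back(Next);
    }
    return Mips;
}

// The ray starts at texel coordinate StartX with depth StartDepth and gains
// DepthPerTexel of depth per texel travelled in +x. Returns the index of the
// first mip 0 texel the ray may hit, or -1 on a miss or when the iteration
// budget runs out.
int TraceHierarchical(const std::vector<std::vector<float>>& Mips,
                      float StartX, float StartDepth, float DepthPerTexel, int MaxIterations)
{
    const float Width = static_cast<float>(Mips[0].size());
    const int TopMip = static_cast<int>(Mips.size()) - 1;
    float X = StartX;
    int Mip = 0;
    for (int Iteration = 0; Iteration < MaxIterations && X < Width; ++Iteration) {
        const float TileSize = static_cast<float>(1 << Mip);
        const std::size_t Tile = static_cast<std::size_t>(X / TileSize);
        const float TileEnd = (static_cast<float>(Tile) + 1.0f) * TileSize;
        const float DepthAtEntry = StartDepth + (X - StartX) * DepthPerTexel;
        const float DepthAtExit = StartDepth + (TileEnd - StartX) * DepthPerTexel;
        if (std::max(DepthAtEntry, DepthAtExit) < Mips[Mip][Tile]) {
            X = TileEnd;                     // ray stays in front of everything in the tile: skip it
            Mip = std::min(Mip + 1, TopMip); // and try a coarser tile next
        } else if (Mip == 0) {
            return static_cast<int>(Tile);   // finest level reached: report the hit
        } else {
            --Mip;                           // possible hit: refine
        }
    }
    return -1;
}
```

On an 8-texel buffer with a single near texel, the ray skips over empty space in growing tiles, then descends back to mip 0 around the occluder, which is the same climb-and-descend pattern the `MipLevel++` and `MipLevel--` branches produce in the shader.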
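For HZB screen-space ray tracing, Lingqi Yan's graphics course GAMES202: High-Quality Real-Time Rendering, Lecture 9 Real-Time Global Illumination (Screen Space), is recommended; the video walks through HZB traversal and tracing in detail with animations. The figure below is just one illustration captured from the video: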
- TraceVoxels
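TraceVoxels takes as inputs the global distance field, normals, depth, sky light, blue noise, VoxelLighting, RadianceProbeIndirectTexture, FinalRadianceAtlas, ray info, and so on, and outputs the R32 TraceHit and the R11G11B10 TraceRadiance:
The TraceHit output texture of TraceVoxels stores the depth of the hit points. Note that the range in the top-right corner has been adjusted.
The TraceRadiance output texture of TraceVoxels stores the radiance at the hit points.
Next, the compute shader it uses: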
// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf
[numthreads(PROBE_THREADGROUP_SIZE_1D, 1, 1)]
void ScreenProbeTraceVoxelsCS(
uint3 GroupId : SV_GroupID,
uint3 DispatchThreadId : SV_DispatchThreadID,
uint3 GroupThreadId : SV_GroupThreadID)
{
if (DispatchThreadId.x < CompactedTraceTexelAllocator[0])
{
uint ScreenProbeIndex;
uint2 TraceTexelCoord;
float TraceHitDistance;
// Decode the trace texel to be traced.
DecodeTraceTexel(CompactedTraceTexelData[DispatchThreadId.x], ScreenProbeIndex, TraceTexelCoord, TraceHitDistance);
// Compute the probe's coordinate in the atlas.
uint2 ScreenProbeAtlasCoord = uint2(ScreenProbeIndex % ScreenProbeAtlasViewSize.x, ScreenProbeIndex / ScreenProbeAtlasViewSize.x);
// Trace voxel lighting for the probe texel.
TraceVoxels(ScreenProbeAtlasCoord, TraceTexelCoord, ScreenProbeIndex, TraceHitDistance);
}
}
void TraceVoxels(
uint2 ScreenProbeAtlasCoord,
uint2 TraceTexelCoord,
uint ScreenProbeIndex,
float TraceHitDistance)
{
// Compute the trace coordinates.
uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);
uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
{
// Fetch the screen-space data.
float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);
float3 SceneNormal = DecodeNormal(SceneTexturesStruct.GBufferATexture.Load(int3(ScreenUV * View.BufferSizeAndInvSize.xy, 0)).xyz);
bool bHit = false;
{
// Compute the world position.
float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);
float2 ProbeUV;
float ConeHalfAngle;
// Get the probe tracing UV.
GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);
// Convert the octahedral map UV back to a direction.
float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);
// Sample position.
float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;
SamplePosition += SurfaceBias * SceneNormal;
float TraceDistance = MaxTraceDistance;
bool bCoveredByRadianceCache = false;
#if RADIANCE_CACHE
float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
#endif
// Build the cone trace input.
FConeTraceInput TraceInput;
TraceInput.Setup(SamplePosition, WorldConeDirection, ConeHalfAngle, MinSampleRadius, MinTraceDistance, TraceDistance, StepFactor);
TraceInput.VoxelStepFactor = VoxelStepFactor;
TraceInput.VoxelTraceStartDistance = max(MinTraceDistance, TraceHitDistance);
// Build the cone trace result.
FConeTraceResult TraceResult = (FConeTraceResult)0;
TraceResult.Lighting = 0;
TraceResult.Transparency = 1;
TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;
// Cone trace the lighting voxels of the Lumen scene.
ConeTraceLumenSceneVoxels(TraceInput, TraceResult);
if (TraceResult.Transparency <= .5f)
{
// Self-intersection from grazing-angle traces causes noise that the spatial filter cannot remove.
#define USE_VOXEL_TRACE_HIT_DISTANCE 0
#if USE_VOXEL_TRACE_HIT_DISTANCE
TraceHitDistance = TraceResult.OpaqueHitDistance;
#else
TraceHitDistance = TraceDistance;
#endif
bHit = true;
}
#if RADIANCE_CACHE
if (bCoveredByRadianceCache)
{
if (TraceResult.Transparency > .5f)
{
// The depth of radiance cache hits is not stored.
TraceHitDistance = MaxTraceDistance;
}
SampleRadianceCacheAndApply(WorldPosition, WorldConeDirection, ConeHalfAngle, float3(0, 0, 0), TraceResult.Lighting, TraceResult.Transparency);
}
else
#endif
{
#if TRACE_DISTANT_SCENE
// Trace the distant scene.
if (TraceResult.Transparency > .01f)
{
FConeTraceResult DistantTraceResult;
ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
TraceResult.Transparency *= DistantTraceResult.Transparency;
}
#endif
// Evaluate the sky light.
EvaluateSkyRadianceForCone(WorldConeDirection, tan(ConeHalfAngle), TraceResult);
if (TraceHitDistance >= GetProbeMaxHitDistance())
{
TraceHitDistance = MaxTraceDistance;
}
}
#if USE_PREEXPOSURE
TraceResult.Lighting *= View.PreExposure;
#endif
#if DEBUG_VISUALIZE_TRACE_TYPES
RWTraceRadiance[TraceCoord] = float3(0, 0, .5f) * View.PreExposure;
#else
RWTraceRadiance[TraceCoord] = TraceResult.Lighting;
#endif
}
// Store the trace result: hit distance, hit flag, and moving flag are encoded into a 32-bit unsigned integer.
RWTraceHit[TraceCoord] = EncodeProbeRayDistance(TraceHitDistance, bHit, false);
}
}
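Both tracing shaders turn a probe texel's UV into a world-space cone direction through `OctahedralMapToDirection`. The sketch below shows the commonly used octahedral parameterization of the sphere in C++, to make that step concrete. It is the textbook mapping and may differ from the engine's exact convention (axis order, UV range, texel centering); `Float3`, `SignNotZero`, and `OctahedralUVToDirection` are hypothetical names.

```cpp
#include <cmath>

struct Float3 { float X, Y, Z; };

static float SignNotZero(float V) { return V >= 0.0f ? 1.0f : -1.0f; }

// Maps a UV in [0,1]^2 to a unit direction using the common octahedral
// parameterization: the square's inner diamond covers the +Z hemisphere and
// the four corner triangles fold over to cover the -Z hemisphere.
Float3 OctahedralUVToDirection(float U, float V)
{
    // Remap to [-1,1]^2.
    float X = U * 2.0f - 1.0f;
    float Y = V * 2.0f - 1.0f;
    const float Z = 1.0f - std::fabs(X) - std::fabs(Y);
    if (Z < 0.0f) {
        // Lower hemisphere: fold the corner triangles back over the diagonals.
        const float FoldedX = (1.0f - std::fabs(Y)) * SignNotZero(X);
        const float FoldedY = (1.0f - std::fabs(X)) * SignNotZero(Y);
        X = FoldedX;
        Y = FoldedY;
    }
    // After folding, |X| + |Y| + |Z| == 1, so the length is never zero.
    const float Length = std::sqrt(X * X + Y * Y + Z * Z);
    return { X / Length, Y / Length, Z / Length };
}
```

The center of the square maps to +Z, the edge midpoints map to the horizon, and the corners map to -Z, so a small square texture can store one value per direction over the full sphere.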
- CompositeTraces
CompositeTraces uses the TraceHit, RayInfo, and TraceRadiance generated in the previous steps to produce the ScreenProbeRadiance, ScreenProbeHitDistance, and ScreenProbeTraceMoving textures. Its compute shader lives in LumenScreenProbeFiltering.usf, with the entry point ScreenProbeCompositeTracesWithScatterCS; the code is omitted here.
- FilterRadianceWithGather
After CompositeTraces come several FilterRadianceWithGather passes, which filter the probe radiance:
Left: ScreenProbeRadiance before filtering; right: ScreenProbeRadiance after several filtering passes.
- ComputeIndirect
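This stage uses the screen-space probe data generated earlier (depth, normal, base color, FilteredScreenProbeRadiance, BentNormal) to compute the final indirect lighting color of the scene (figure below):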
6.5.7.3 RenderLumenReflections
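RenderLumenReflections renders reflections for the smoother, low-roughness surfaces in the Lumen scene. Its flow is similar to RenderLumenScreenProbeGather but simpler, with fewer steps:
The C++ rendering code involved is as follows: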
// Engine\Source\Runtime\Renderer\Private\Lumen\LumenReflections.cpp
FRDGTextureRef FDeferredShadingSceneRenderer::RenderLumenReflections(
FRDGBuilder& GraphBuilder,
const FViewInfo& View,
const FSceneTextures& SceneTextures,
const FLumenMeshSDFGridParameters& MeshSDFGridParameters,
FLumenReflectionCompositeParameters& OutCompositeParameters)
{
// Maximum roughness for reflection tracing; rougher surfaces are skipped.
OutCompositeParameters.MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
OutCompositeParameters.InvRoughnessFadeLength = 1.0f / GLumenReflectionRoughnessFadeLength;
(......)
{
(......)
auto ComputeShader = View.ShaderMap->GetShader<FReflectionGenerateRaysCS>(0);
// Ray generation pass.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("GenerateRaysCS"),
ComputeShader,
PassParameters,
ReflectionTileParameters.TracingIndirectArgs,
0);
}
FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);
(......)
// Trace reflections.
TraceReflections(
GraphBuilder,
Scene,
View,
GLumenReflectionTraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
SceneTextures,
TracingInputs,
ReflectionTracingParameters,
ReflectionTileParameters,
MeshSDFGridParameters);
(......)
{
FReflectionResolveCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionResolveCS::FParameters>();
(......)
auto ComputeShader = View.ShaderMap->GetShader<FReflectionResolveCS>(PermutationVector);
// Resolve reflections.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ReflectionResolve"),
ComputeShader,
PassParameters,
ReflectionTileParameters.ResolveIndirectArgs,
0);
}
(......)
// Update the history data.
UpdateHistoryReflections(
GraphBuilder,
View,
SceneTextures,
ReflectionTileParameters,
ResolvedSpecularIndirect,
SpecularIndirect);
return SpecularIndirect;
}
void TraceReflections(
FRDGBuilder& GraphBuilder,
const FScene* Scene,
const FViewInfo& View,
bool bTraceMeshSDFs,
const FSceneTextures& SceneTextures,
const FLumenCardTracingInputs& TracingInputs,
const FLumenReflectionTracingParameters& ReflectionTracingParameters,
const FLumenReflectionTileParameters& ReflectionTileParameters,
const FLumenMeshSDFGridParameters& InMeshSDFGridParameters)
{
{
(......)
auto ComputeShader = View.ShaderMap->GetShader<FReflectionClearTracesCS>(0);
// Clear the trace output textures.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("ClearTraces"),
ComputeShader,
PassParameters,
ReflectionTileParameters.TracingIndirectArgs,
0);
}
FLumenIndirectTracingParameters IndirectTracingParameters;
SetupIndirectTracingParametersForReflections(IndirectTracingParameters);
const FSceneTextureParameters& SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures);
const bool bScreenTraces = GLumenReflectionScreenTraces != 0;
if (bScreenTraces)
{
FReflectionTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionTraceScreenTexturesCS::FParameters>();
(......)
FReflectionTraceScreenTexturesCS::FPermutationDomain PermutationVector;
auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceScreenTexturesCS>(PermutationVector);
// Screen trace.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceScreen"),
ComputeShader,
PassParameters,
ReflectionTileParameters.TracingIndirectArgs,
0);
}
// Mesh distance field tracing.
if (bTraceMeshSDFs)
{
if (Lumen::UseHardwareRayTracedReflections()) // Hardware ray traced reflections.
{
FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
GraphBuilder,
View,
ReflectionTracingParameters,
ReflectionTileParameters,
WORLD_MAX,
IndirectTracingParameters.MaxTraceDistance);
RenderLumenHardwareRayTracingReflections(
GraphBuilder,
SceneTextureParameters,
View,
ReflectionTracingParameters,
ReflectionTileParameters,
TracingInputs,
CompactedTraceParameters,
IndirectTracingParameters.MaxTraceDistance);
}
else
{
FLumenMeshSDFGridParameters MeshSDFGridParameters = InMeshSDFGridParameters;
if (!MeshSDFGridParameters.NumGridCulledMeshSDFObjects)
{
CullForCardTracing(
GraphBuilder,
Scene, View,
TracingInputs,
IndirectTracingParameters,
/* out */ MeshSDFGridParameters);
}
if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
{
// Compact the traces.
FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
GraphBuilder,
View,
ReflectionTracingParameters,
ReflectionTileParameters,
IndirectTracingParameters.CardTraceEndDistanceFromCamera,
IndirectTracingParameters.MaxMeshSDFTraceDistance);
{
(......)
auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceMeshSDFsCS>(PermutationVector);
// Trace mesh distance fields.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceMeshSDFs"),
ComputeShader,
PassParameters,
CompactedTraceParameters.IndirectArgs,
0);
}
}
}
}
FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(...);
{
(......)
auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceVoxelsCS>(PermutationVector);
// Trace voxel lighting.
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("TraceVoxels"),
ComputeShader,
PassParameters,
CompactedTraceParameters.IndirectArgs,
0);
}
}
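The most important difference between Lumen's indirect reflections and Lumen's indirect diffuse lighting is the number of rays traced and the way they are traced. Lumen reflections require a maximum roughness to trace, GLumenReflectionMaxRoughnessToTrace (default 0.4, adjustable through the console variable r.Lumen.Reflections.MaxRoughnessToTrace), and the resulting TraceHit and TraceRadiance also differ.
Because reflections and diffuse lighting rely on highly similar techniques, this article does not dig further into the technical details.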
6.5.7.4 DiffuseIndirectComposite
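This stage combines the probe results generated earlier by RenderLumenScreenProbeGather (DiffuseIndirect, RoughSpecularIndirect) and the reflection results generated by RenderLumenReflections (SpecularIndirect) with the scene's GBuffer and related data to produce the final scene color:
Scene color after combining the GI diffuse and specular terms. (Magnified 1.5x; the color range has been adjusted.)
The compositing process can be found in the pixel shader it uses: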
// Engine\Shaders\Private\DiffuseIndirectComposite.usf
void MainPS(
float4 SvPosition : SV_POSITION
, out float4 OutAddColor : SV_Target0
, out float4 OutMultiplyColor : SV_Target1
)
{
float2 SceneBufferUV = SvPositionToBufferUV(SvPosition);
float2 ScreenPosition = SvPositionToScreenPosition(SvPosition).xy;
// Sample the scene GBuffer.
FGBufferData GBuffer = GetGBufferDataFromSceneTextures(SceneBufferUV);
// Sample the AO generated dynamically each frame.
float DynamicAmbientOcclusion = AmbientOcclusionTexture.SampleLevel(AmbientOcclusionSampler, SceneBufferUV, 0).r;
// Compute the final AO to apply.
float AOMask = (GBuffer.ShadingModelID != SHADINGMODELID_UNLIT);
float FinalAmbientOcclusion = lerp(1.0f, GBuffer.GBufferAO * DynamicAmbientOcclusion, AOMask * AmbientOcclusionStaticFraction);
float3 TranslatedWorldPosition = mul(float4(ScreenPosition * GBuffer.Depth, GBuffer.Depth, 1), View.ScreenToTranslatedWorld).xyz;
float3 N = GBuffer.WorldNormal;
float3 V = normalize(View.TranslatedWorldCameraOrigin - TranslatedWorldPosition);
float NoV = saturate(dot(N, V));
// Apply indirect diffuse lighting.
#if DIM_APPLY_DIFFUSE_INDIRECT
{
float3 DiffuseIndirectLighting = 0;
float3 RoughSpecularIndirectLighting = 0;
float3 SpecularIndirectLighting = 0;
#if DIM_APPLY_DIFFUSE_INDIRECT == 4
DiffuseIndirectLighting = DiffuseIndirect_Textures_0.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
RoughSpecularIndirectLighting = DiffuseIndirect_Textures_1.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
SpecularIndirectLighting = DiffuseIndirect_Textures_2.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
#else
{
// Sample the denoiser output.
FSSDKernelConfig KernelConfig = CreateKernelConfig();
#if DEBUG_OUTPUT
{
KernelConfig.DebugPixelPosition = uint2(SvPosition.xy);
KernelConfig.DebugEventCounter = 0;
}
#endif
// Compile time.
KernelConfig.bSampleKernelCenter = true;
KernelConfig.BufferLayout = CONFIG_SIGNAL_INPUT_LAYOUT;
KernelConfig.bUnroll = true;
#if DIM_UPSCALE_DIFFUSE_INDIRECT
{
KernelConfig.SampleSet = SAMPLE_SET_2X2_BILINEAR;
KernelConfig.BilateralDistanceComputation = SIGNAL_WORLD_FREQUENCY_REF_METADATA_ONLY;
KernelConfig.WorldBluringDistanceMultiplier = 16.0;
KernelConfig.BilateralSettings[0] = BILATERAL_POSITION_BASED(3);
// SGPRs (Scalar General Purpose Registers)
KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize * float4(0.5, 0.5, 2.0, 2.0);
KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
}
#else
{
KernelConfig.SampleSet = SAMPLE_SET_1X1;
KernelConfig.bNormalizeSample = true;
// SGPRs
KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize;
KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
}
#endif
// VGPRs (Vector General Purpose Registers)
KernelConfig.BufferUV = SceneBufferUV;
{
KernelConfig.CompressedRefSceneMetadata = GBufferDataToCompressedSceneMetadata(GBuffer);
KernelConfig.RefBufferUV = SceneBufferUV;
KernelConfig.RefSceneMetadataLayout = METADATA_BUFFER_LAYOUT_DISABLED;
}
KernelConfig.HammersleySeed = Rand3DPCG16(int3(SvPosition.xy, View.StateFrameIndexMod8)).xy;
FSSDSignalAccumulatorArray UncompressedAccumulators = CreateSignalAccumulatorArray();
FSSDCompressedSignalAccumulatorArray CompressedAccumulators = CompressAccumulatorArray(
UncompressedAccumulators, CONFIG_ACCUMULATOR_VGPR_COMPRESSION);
// Accumulate the convolution kernel.
AccumulateKernel(
KernelConfig,
DiffuseIndirect_Textures_0,
DiffuseIndirect_Textures_1,
DiffuseIndirect_Textures_2,
DiffuseIndirect_Textures_3,
/* inout */ UncompressedAccumulators,
/* inout */ CompressedAccumulators);
// Take the sample.
FSSDSignalSample Sample;
#if DIM_UPSCALE_DIFFUSE_INDIRECT
Sample = NormalizeToOneSample(UncompressedAccumulators.Array[0].Moment1);
#else
Sample = UncompressedAccumulators.Array[0].Moment1;
#endif
// When DIM_APPLY_DIFFUSE_INDIRECT is 1 or 3, only indirect diffuse lighting is present.
#if DIM_APPLY_DIFFUSE_INDIRECT == 1 || DIM_APPLY_DIFFUSE_INDIRECT == 3
{
DiffuseIndirectLighting = Sample.SceneColor.rgb;
}
// When DIM_APPLY_DIFFUSE_INDIRECT is 2, both indirect diffuse and specular lighting are present.
#elif DIM_APPLY_DIFFUSE_INDIRECT == 2
{
DiffuseIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[0];
SpecularIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[1];
}
#else
#error Unimplemented
#endif
}
#endif
float3 DiffuseColor = bVisualizeDiffuseIndirect ? float3(.18f, .18f, .18f) : GBuffer.DiffuseColor;
float3 SpecularColor = GBuffer.SpecularColor;
#if DIM_APPLY_DIFFUSE_INDIRECT == 4
RemapClearCoatDiffuseAndSpecularColor(GBuffer, NoV, DiffuseColor, SpecularColor);
#endif
#if DIM_APPLY_DIFFUSE_INDIRECT == 2 || DIM_APPLY_DIFFUSE_INDIRECT == 4
float DiffuseIndirectAO = 1;
#else
float DiffuseIndirectAO = lerp(1, FinalAmbientOcclusion, ApplyAOToDynamicDiffuseIndirect);
#endif
FDirectLighting IndirectLighting;
if (GBuffer.ShadingModelID == SHADINGMODELID_HAIR)
{
IndirectLighting.Diffuse = DiffuseIndirectLighting * GBuffer.BaseColor;
IndirectLighting.Specular = 0;
}
else
{
IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO;
IndirectLighting.Transmission = 0;
#if DIM_APPLY_DIFFUSE_INDIRECT == 4
IndirectLighting.Specular = CombineRoughSpecular(GBuffer, NoV, SpecularIndirectLighting, RoughSpecularIndirectLighting, SpecularColor);
#else
IndirectLighting.Specular = SpecularIndirectLighting * EnvBRDF(SpecularColor, GBuffer.Roughness, NoV);
#endif
}
const bool bNeedsSeparateSubsurfaceLightAccumulation = UseSubsurfaceProfile(GBuffer.ShadingModelID);
if (bNeedsSeparateSubsurfaceLightAccumulation &&
View.bSubsurfacePostprocessEnabled > 0 && View.bCheckerboardSubsurfaceProfileRendering > 0)
{
bool bChecker = CheckerFromSceneColorUV(SceneBufferUV);
// Adjust for checkerboard. only apply non-diffuse lighting (including emissive)
// to the specular component, otherwise lighting is applied twice
IndirectLighting.Specular *= !bChecker;
}
// Accumulate the lighting result.
FLightAccumulator LightAccumulator = (FLightAccumulator)0;
LightAccumulator_Add(
LightAccumulator,
IndirectLighting.Diffuse + IndirectLighting.Specular,
IndirectLighting.Diffuse,
1.0f,
bNeedsSeparateSubsurfaceLightAccumulation);
// Fetch the lighting result.
OutAddColor = LightAccumulator_GetResult(LightAccumulator);
}
#else
{
OutAddColor = 0;
}
#endif
OutMultiplyColor = FinalAmbientOcclusion;
}
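The core arithmetic of the shader above can be restated on the CPU to make the data flow easier to follow. This is a simplified sketch under stated assumptions: Float3, Lerp, ComputeFinalAO, and ComposeIndirectDiffuse are hypothetical names, and the sketch covers only the ambient occlusion blend and the non-hair diffuse term, leaving out the denoiser, specular, and subsurface handling.

```cpp
#include <cassert>

struct Float3 { float R, G, B; };

inline float Lerp(float A, float B, float T) { return A + (B - A) * T; }

// Mirrors: lerp(1.0f, GBuffer.GBufferAO * DynamicAmbientOcclusion,
//               AOMask * AmbientOcclusionStaticFraction)
// Unlit pixels (AOMask == 0) always end up with an AO of 1.
inline float ComputeFinalAO(float GBufferAO, float DynamicAO, bool bIsLit, float StaticFraction)
{
    const float AOMask = bIsLit ? 1.0f : 0.0f;
    return Lerp(1.0f, GBufferAO * DynamicAO, AOMask * StaticFraction);
}

// Mirrors the non-hair branch:
// IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO
inline Float3 ComposeIndirectDiffuse(Float3 Lighting, Float3 DiffuseColor, float AO)
{
    assert(AO >= 0.0f && AO <= 1.0f);
    return { Lighting.R * DiffuseColor.R * AO,
             Lighting.G * DiffuseColor.G * AO,
             Lighting.B * DiffuseColor.B * AO };
}
```

For example, a lit pixel with material AO 0.5, dynamic AO 0.5, and a static fraction of 1 yields a final AO of 0.25, whereas the same inputs on an unlit pixel yield 1.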
6.5.8 Lumen Summary
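Lumen involves many intricate steps, but they can be summarized as follows:
1. Build the MeshCards and LumenCards and keep them updated.
2. Using the Card information of the Lumen scene, trace and update the corresponding texels.
3. In the diffuse and specular stages, trace and compute the lighting of screen-space surfaces using several tracing methods.
4. Combine the indirect diffuse and specular results from the previous steps to obtain the final scene color with indirect lighting added.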
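In addition, tracing uses several methods that are blended by weight (see the figure below).
Mixed tracing diagram. Red is screen tracing, green is mesh distance field tracing, and blue is Voxel Lighting tracing. The color gradients show the transitions between tracing types.
Setting DEBUG_VISUALIZE_TRACE_TYPES to 1 and turning off ShowFlag.DirectLighting from the console enables the trace-weight visualization mode: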
// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf
#define DEBUG_VISUALIZE_TRACE_TYPES 1 // Enable trace-weight visualization (default is 0).
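Overall, Lumen combines tracing techniques such as SSGI, SDF (Mesh SDF and Global SDF), Lumen Cards, and voxel cone tracing. Along the way it generates many kinds of data (adaptive Screen Space Probes, Irradiance Probes, Surface Cache, Prefilter Radiance, Voxel Lighting, RSM, Virtual Texture, Clipmap), computes the indirect diffuse and specular lighting, and finally blends the results by weight into the scene color.
Lumen diffuse GI supports both software and hardware tracing. With default parameters, the software path uses the following trace types: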
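Trace type | Chinese name | Range | Description |
---|---|---|---|
Screen Trace | 屏幕追蹤 | Whole scene | That is, SSGI; whenever an intersection is found, its bounce information is used first. |
Voxel Lighting Trace | 體素光照追蹤 | Within 200 m of the camera | Cone-based ray tracing that samples MIP levels to quickly obtain information at different hit distances. |
Detail MeshCard Trace | 細節網格卡片追蹤 | 2 to 40 m | When sampling MeshCard lighting, occlusion is estimated probabilistically in a way similar to VSM. |
Distant MeshCard Trace | 遠距網格卡片追蹤 | 200 to 1000 m | Traces the pre-generated global distance field; occlusion estimation is no longer used. |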
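The distance ranges in the table above can be turned into a small lookup to see which trace types overlap at a given distance. This is only an illustrative sketch: FTraceRange, GetSoftwareTraceRanges, and CandidateTraceTypes are hypothetical names, treating both ends of each range as inclusive is an assumption, and the engine blends the methods by weight rather than selecting them with hard cutoffs.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

struct FTraceRange
{
    const char* Name;
    float MinMeters;
    float MaxMeters;
};

// Ranges copied from the table. "Whole scene" is modeled as [0, +infinity).
inline const std::vector<FTraceRange>& GetSoftwareTraceRanges()
{
    static const std::vector<FTraceRange> Ranges = {
        { "Screen Trace",           0.0f,   std::numeric_limits<float>::infinity() },
        { "Voxel Lighting Trace",   0.0f,   200.0f },
        { "Detail MeshCard Trace",  2.0f,   40.0f },
        { "Distant MeshCard Trace", 200.0f, 1000.0f },
    };
    return Ranges;
}

// Returns the names of all trace types whose range contains the distance.
inline std::vector<std::string> CandidateTraceTypes(float DistanceMeters)
{
    assert(DistanceMeters >= 0.0f);
    std::vector<std::string> Result;
    for (const FTraceRange& Range : GetSoftwareTraceRanges())
    {
        if (DistanceMeters >= Range.MinMeters && DistanceMeters <= Range.MaxMeters)
        {
            Result.push_back(Range.Name);
        }
    }
    return Result;
}
```

At 10 m the lookup returns Screen Trace, Voxel Lighting Trace, and Detail MeshCard Trace; at 500 m only Screen Trace and Distant MeshCard Trace remain.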
Lumen specular GI also supports both software and hardware paths; the software path combines SSR with SDF tracing (Mesh SDF and Global SDF).
6.6 Other Rendering Techniques
6.6.1 Temporal Super Resolution
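Temporal Super Resolution (TSR) is a new-generation temporal anti-aliasing algorithm that replaces the traditional (UE4) TAA. It produces high-resolution output from low-resolution input with quality approaching native resolution, shows less ghosting and less flickering on high-frequency detail, and is optimized for platforms such as PS5. It does, however, require a graphics platform that supports SM5.0 or above.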
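TSR is an upscaling technique comparable to NVIDIA's DLSS and AMD's FidelityFX Super Resolution (FSR). DLSS is accelerated by deep learning on Tensor Cores, whereas TSR does not depend on Tensor Cores. In other words, TSR does not need an RTX card and can run on GPUs from other vendors. Because TSR renders at a lower resolution and outputs a higher-resolution image, it not only improves anti-aliasing but can also improve rendering performance and reduce power consumption.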
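The performance benefit comes from shading fewer pixels before upscaling. The arithmetic can be sketched as follows; FResolution, ScaleResolution, and PixelCount are hypothetical helpers, and the 50% per-axis scale used in the example is an illustrative value, not a TSR default taken from the source.

```cpp
#include <cassert>
#include <cstdint>

struct FResolution
{
    std::int64_t Width;
    std::int64_t Height;
};

// Hypothetical helper: scales each axis by a percentage
// (50 means half the width and half the height).
inline FResolution ScaleResolution(FResolution Output, int ScreenPercentage)
{
    assert(ScreenPercentage > 0 && ScreenPercentage <= 100);
    return { Output.Width * ScreenPercentage / 100,
             Output.Height * ScreenPercentage / 100 };
}

inline std::int64_t PixelCount(FResolution Resolution)
{
    return Resolution.Width * Resolution.Height;
}
```

Rendering a 3840x2160 output from a 50% input means shading 1920x1080 pixels, a quarter of the output pixel count, before the temporal upscaler reconstructs the remaining detail from history.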
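Unlike UE4, as long as the configuration does not explicitly disable TemporalAA, UE5 routes post-processing through the temporal upscaler regardless of which anti-aliasing method is selected; the upscaler then takes the TSR path when the Gen5 TAA algorithm is enabled and the platform supports it. The call stack is shown below: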
// Engine\Source\Runtime\Renderer\Private\PostProcess\PostProcessing.cpp
void AddPostProcessingPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View, ...)
{
(......)
// TAA anti-aliasing.
EMainTAAPassConfig TAAConfig = ITemporalUpscaler::GetMainTAAPassConfig(View);
// The TAA configuration is not disabled.
if (TAAConfig != EMainTAAPassConfig::Disabled)
{
(......)
// Calls FDefaultTemporalUpscaler::AddPasses; see the analysis below.
UpscalerToUse->AddPasses(
GraphBuilder,
View,
UpscalerPassInputs,
&SceneColor.Texture,
&SecondaryViewRect,
&DownsampledSceneColor.Texture,
&DownsampledSceneColor.ViewRect);
}
(......)
}
// Engine\Source\Runtime\Renderer\Private\PostProcess\TemporalAA.cpp
void FDefaultTemporalUpscaler::AddPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View,...) const final
{
// If Gen5 TAA is enabled and supported, take the TSR path.
if (CVarTAAAlgorithm.GetValueOnRenderThread() && DoesPlatformSupportGen5TAA(View.GetShaderPlatform()))
{
*OutSceneColorHalfResTexture = nullptr;
return AddTemporalSuperResolutionPasses(
GraphBuilder,
View,
PassInputs,
OutSceneColorTexture,
OutSceneColorViewRect);
}
(......)
}
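This leads into AddTemporalSuperResolutionPasses. Below is the TSR rendering process captured in RenderDoc:
As the capture shows, TSR has many more passes than UE4's TAA. The main stages are: clearing the previous frame's textures, dilating the velocity buffer, rejecting invalid velocity data, filtering frequencies, comparing against history, post-filtering the reprojection, dilating the reprojection, and updating the history.
The most important of these stages is the history update. It takes the input scene color, depth, dilated velocity, parallax factor, and history data (dilated reprojection, reprojection, high frequency, low frequency, metadata, subpixel information) and produces the final anti-aliased scene color together with the history data for the current frame.
Left: input scene color. Right: scene color output after TSR.
History data output by TSR: low frequency, high frequency, metadata, and subpixel information.
The compute shader used by the history update stage is analyzed directly below: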
// /Engine/Private/TemporalAA/TAAUpdateHistory.usf
[numthreads(TILE_SIZE, TILE_SIZE, 1)]
void MainCS(
uint2 GroupId : SV_GroupID,
uint GroupThreadIndex : SV_GroupIndex)
{
uint GroupWaveIndex = GetGroupWaveIndex(GroupThreadIndex, /* GroupSize = */ TILE_SIZE * TILE_SIZE);
float4 Debug = 0.0;
// History pixel position.
taa_short2 HistoryPixelPos = (
taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
Map8x8Tile2x2Lane(GroupThreadIndex));
float2 ViewportUV = (float2(HistoryPixelPos) + 0.5f) * HistoryInfo_ViewportSizeInverse;
float2 ScreenPos = ViewportUVToScreenPos(ViewportUV);
// Pixel coordinate, in the input viewport, of the center of output pixel O.
float2 PPCo = ViewportUV * InputInfo_ViewportSize + InputJitter;
// Pixel coordinate of the center of the nearest input pixel K.
float2 PPCk = floor(PPCo) + 0.5;
taa_short2 InputPixelPos = ClampPixelOffset(
taa_short2(InputPixelPosMin) + taa_short2(PPCo),
InputPixelPosMin, InputPixelPosMax);
// Fetch reprojection-related information.
float2 PrevScreenPos = ScreenPos;
taa_half ParallaxRejectionMask = taa_half(1.0);
taa_half LowFrequencyRejection = taa_half(1.0);
taa_half OutputPixelVelocity = taa_half(0.0);
#if 1
{
float2 EncodedVelocity = DilatedVelocityTexture[InputPixelPos];
ParallaxRejectionMask = ParallaxRejectionMaskTexture[InputPixelPos];
float2 ScreenVelocity = DecodeVelocityFromTexture(float4(EncodedVelocity, 0.0, 0.0)).xy;
PrevScreenPos = ScreenPos - ScreenVelocity;
OutputPixelVelocity = taa_half(length(ScreenVelocity * HistoryInfo_ViewportSize));
taa_ushort2 RejectionPixelPos = (taa_ushort2(InputPixelPos) - taa_short2(InputPixelPosMin)) / 2;
LowFrequencyRejection = HistoryRejectionTexture[RejectionPixelPos];
#if !CONFIG_CLAMP
{
ParallaxRejectionMask = taa_half(1.0);
LowFrequencyRejection = taa_half(1.0);
}
#endif
}
#endif
// Determine whether the pixel is a responsive-AA pixel.
bool bIsResponsiveAAPixel = false;
#if CONFIG_RESPONSIVE_STENCIL
{
const uint kResponsiveStencilMask = 1 << 3;
uint SceneStencilRef = InputSceneStencilTexture.Load(int3(InputPixelPos, 0)) STENCIL_COMPONENT_SWIZZLE;
bIsResponsiveAAPixel = (SceneStencilRef & kResponsiveStencilMask) != 0;
}
#endif
// Check whether HistoryBufferUV falls outside the viewport.
bool bOffScreen = IsOffScreen(bCameraCut, PrevScreenPos, ParallaxRejectionMask);
taa_half TotalRejection = bOffScreen ? 0.0 : saturate(LowFrequencyRejection * 4.0);
// Filter the input scene color at the predicted frequency.
taa_half3 FilteredInputColor;
taa_half3 InputMinColor;
taa_half3 InputMaxColor;
taa_half InputPixelAlignement;
taa_half ClosestInputLuma4;
ISOLATE
{
// Vector from pixel K to O.
taa_half2 dKO = taa_half2(PPCo - PPCk);
FilteredInputColor = taa_half(0.0);
taa_half FilteredInputColorWeight = taa_half(0.0);
#if 0 // shader compiler bug :'(
taa_half InputToHistoryFactor = taa_half(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
taa_half FinalInputToHistoryFactor = bOffScreen ? taa_half(1.0) : InputToHistoryFactor;
#else
float InputToHistoryFactor = float(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
float FinalInputToHistoryFactor = lerp(1.0, InputToHistoryFactor, TotalRejection);
#endif
InputMinColor = taa_half(INFINITE_FLOAT);
InputMaxColor = taa_half(-INFINITE_FLOAT);
// Generate sample coordinates according to CONFIG_SAMPLES and sample the input scene color.
UNROLL_N(CONFIG_SAMPLES)
for (uint SampleId = 0; SampleId < CONFIG_SAMPLES; SampleId++)
{
taa_short2 SampleInputPixelPos;
taa_half2 PixelOffset;
#if CONFIG_SAMPLES == 9
{
taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kSquareIndexes3x3[SampleId]]);
PixelOffset = taa_half2(iPixelOffset);
SampleInputPixelPos = AddAndClampPixelOffset(
InputPixelPos,
iPixelOffset, iPixelOffset,
InputPixelPosMin, InputPixelPosMax);
}
#elif CONFIG_SAMPLES == 5 || CONFIG_SAMPLES == 6
{
if (SampleId == 5)
{
taa_short2 iPixelOffset;
#if CONFIG_COMPILE_FP16
iPixelOffset = int16_t2(1, 1) - int16_t2((asuint16(dKO) & uint16_t(0x8000)) >> uint16_t(14));
PixelOffset = asfloat16(asuint16(half(1.0)).xx | (asuint16(dKO) & uint16_t(0x8000)));
#else
iPixelOffset = SignFastInt(dKO);
PixelOffset = asfloat(asuint(1.0).xx | (asuint(dKO) & uint(0x80000000)));
#endif
SampleInputPixelPos = ClampPixelOffset(InputPixelPos, InputPixelPosMin, InputPixelPosMax);
}
else
{
taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kPlusIndexes3x3[SampleId]]);
PixelOffset = taa_half2(iPixelOffset);
SampleInputPixelPos = AddAndClampPixelOffset(
InputPixelPos,
iPixelOffset, iPixelOffset,
InputPixelPosMin, InputPixelPosMax);
}
}
#else
#error Unknown sample count
#endif
taa_half3 InputColor = InputSceneColorTexture[SampleInputPixelPos];
taa_half2 dPP = PixelOffset - dKO;
taa_half SampleSpatialWeight = ComputeSampleWeigth(FinalInputToHistoryFactor, dPP, /* MinimalContribution = */ float(0.005));
taa_half ToneWeight = HdrWeight4(InputColor);
FilteredInputColor += (SampleSpatialWeight * ToneWeight) * InputColor;
FilteredInputColorWeight += (SampleSpatialWeight * ToneWeight);
if (SampleId == 0)
{
ClosestInputLuma4 = Luma4(InputColor);
InputMinColor = TransformColorForClampingBox(InputColor);
InputMaxColor = TransformColorForClampingBox(InputColor);
}
else
{
InputMinColor = min(InputMinColor, TransformColorForClampingBox(InputColor));
InputMaxColor = max(InputMaxColor, TransformColorForClampingBox(InputColor));
}
}
FilteredInputColor *= rcp(FilteredInputColorWeight);
InputPixelAlignement = ComputeSampleWeigth(InputToHistoryFactor, dKO, /* MinimalContribution = */ float(0.0));
}
// Spill to LDS to free VGPRs for sampling the history.
#if CONFIG_MANUAL_LDS_SPILL
ISOLATE
{
uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);
SharedArray0[LocalGroupThreadIndex] = taa_half4(FilteredInputColor, LowFrequencyRejection);
SharedArray1[LocalGroupThreadIndex] = taa_half4(InputMinColor, InputPixelAlignement);
SharedArray2[LocalGroupThreadIndex] = taa_half4(InputMaxColor, OutputPixelVelocity);
}
#endif
// Reproject the history.
taa_half3 PrevHistoryMoment1;
taa_half PrevHistoryValidity;
taa_half3 PrevHistoryMommentMin;
taa_half3 PrevHistoryMommentMax;
taa_half3 PrevFallbackColor;
taa_half PrevFallbackWeight;
taa_subpixel_details PrevSubpixelDetails;
ISOLATE
{
// Reproject the history.
taa_half3 RawHistory0 = taa_half(0);
taa_half3 RawHistory1 = taa_half(0);
taa_half2 RawHistory2 = taa_half(0);
taa_half3 RawHistory1Min = INFINITE_FLOAT;
taa_half3 RawHistory1Max = -INFINITE_FLOAT;
// Sample the raw history.
{
float2 PrevHistoryBufferUV = (PrevHistoryInfo_ScreenPosToViewportScale * PrevScreenPos + PrevHistoryInfo_ScreenPosToViewportBias) * PrevHistoryInfo_ExtentInverse;
PrevHistoryBufferUV = clamp(PrevHistoryBufferUV, PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);
#if 1
{
FCatmullRomSamples Samples = GetBicubic2DCatmullRomSamples(PrevHistoryBufferUV, PrevHistoryInfo_Extent, PrevHistoryInfo_ExtentInverse);
UNROLL
for (uint i = 0; i < Samples.Count; i++)
{
float2 SampleUV = clamp(Samples.UV[i], PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);
taa_half3 Sample0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
taa_half3 Sample1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
taa_half2 Sample2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
RawHistory1Min = min(RawHistory1Min, Sample1 * SafeRcp(Sample2.g));
RawHistory1Max = max(RawHistory1Max, Sample1 * SafeRcp(Sample2.g));
RawHistory0 += Sample0 * taa_half(Samples.Weight[i]);
RawHistory1 += Sample1 * taa_half(Samples.Weight[i]);
RawHistory2 += Sample2 * taa_half(Samples.Weight[i]);
}
RawHistory0 *= taa_half(Samples.FinalMultiplier);
RawHistory1 *= taa_half(Samples.FinalMultiplier);
RawHistory2 *= taa_half(Samples.FinalMultiplier);
}
#else
{
RawHistory0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
RawHistory1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
RawHistory2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
}
#endif
FSubpixelNeighborhood SubpixelNeighborhood = GatherPrevSubpixelNeighborhood(PrevHistory_Textures_3, PrevHistoryBufferUV);
{
PrevSubpixelDetails = 0;
UNROLL_N(SUB_PIXEL_COUNT)
for (uint SubpixelId = 0; SubpixelId < SUB_PIXEL_COUNT; SubpixelId++)
{
taa_subpixel_payload SubpixelPayload = GetSubpixelPayload(SubpixelNeighborhood, SubpixelId);
PrevSubpixelDetails |= SubpixelPayload << (SUB_PIXEL_BIT_COUNT * SubpixelId);
}
}
RawHistory0 = -min(-RawHistory0, taa_half(0.0));
RawHistory1 = -min(-RawHistory1, taa_half(0.0));
RawHistory2 = -min(-RawHistory2, taa_half(0.0));
}
// Unpack the history.
{
PrevFallbackColor = RawHistory0;
PrevFallbackWeight = RawHistory2.r;
PrevHistoryMommentMin = RawHistory1Min;
PrevHistoryMommentMax = RawHistory1Max;
PrevHistoryMoment1 = RawHistory1;
PrevHistoryValidity = RawHistory2.g;
}
// Correct the history for pre-exposure.
{
PrevHistoryMommentMin *= taa_half(HistoryPreExposureCorrection);
PrevHistoryMommentMax *= taa_half(HistoryPreExposureCorrection);
PrevHistoryMoment1 *= taa_half(HistoryPreExposureCorrection);
PrevFallbackColor *= taa_half(HistoryPreExposureCorrection);
}
}
// Read the data back from LDS.
#if CONFIG_MANUAL_LDS_SPILL
ISOLATE
{
uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);
taa_half4 RawLDS0 = SharedArray0[LocalGroupThreadIndex];
taa_half4 RawLDS1 = SharedArray1[LocalGroupThreadIndex];
taa_half4 RawLDS2 = SharedArray2[LocalGroupThreadIndex];
FilteredInputColor = RawLDS0.rgb;
InputMinColor = RawLDS1.rgb;
InputMaxColor = RawLDS2.rgb;
LowFrequencyRejection = RawLDS0.a;
InputPixelAlignement = RawLDS1.a;
OutputPixelVelocity = RawLDS2.a;
}
#endif
// If the current low frequency drifts from the history low frequency, reject the high-frequency detail.
#if CONFIG_LOW_FREQUENCY_DRIFT_REJECTION
{
taa_half3 PrevHighFrequencyYCoCg = TransformColorForClampingBox(PrevHistoryMoment1 * SafeRcp(PrevHistoryValidity));
taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
taa_half3 ClampedPrevYCoCg = TransformColorForClampingBox(clamp(PrevFallbackColor, PrevHistoryMommentMin, PrevHistoryMommentMax));
taa_half HighFrequencyRejection = MeasureRejectionFactor(
PrevYCoCg, ClampedPrevYCoCg,
PrevHighFrequencyYCoCg, InputMinColor, InputMaxColor);
PrevHistoryMoment1 *= HighFrequencyRejection;
PrevHistoryValidity *= HighFrequencyRejection;
}
#endif
// Feed the current frame's input into the next frame's predictor.
const taa_half Histeresis = rcp(taa_half(MAX_SAMPLE_COUNT));
const taa_half PredictionOnlyValidity = Histeresis * taa_half(2.0);
// Clamp the fallback data.
taa_half LumaMin;
taa_half LumaMax;
taa_half3 ClampedFallbackColor;
taa_half FallbackRejection;
{
LumaMin = InputMinColor.x;
LumaMax = InputMaxColor.x;
taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
taa_half3 ClampedPrevYCoCg = clamp(PrevYCoCg, InputMinColor, InputMaxColor);
taa_half3 InputCenterYCoCg = TransformColorForClampingBox(FilteredInputColor);
ClampedFallbackColor = YCoCgToRGB(ClampedPrevYCoCg);
FallbackRejection = MeasureRejectionFactor(
PrevYCoCg, ClampedPrevYCoCg,
InputCenterYCoCg, InputMinColor, InputMaxColor);
#if !CONFIG_CLAMP
{
ClampedFallbackColor = PrevFallbackColor;
FallbackRejection = taa_half(1.0);
}
#endif
}
taa_half3 FinalHistoryMoment1;
taa_half FinalHistoryValidity;
{
// Based on completeness, compute how much history to reject.
taa_half PrevHistoryRejectionWeight = LowFrequencyRejection;
FLATTEN
if (bOffScreen)
{
PrevHistoryRejectionWeight = taa_half(0.0);
}
taa_half DesiredCurrentContribution = max(Histeresis * InputPixelAlignement, taa_half(0.0));
// Determine whether the prediction-based rejection is confident enough.
taa_half RejectionConfidentEnough = taa_half(1); // saturate(RejectionValidity * MAX_SAMPLE_COUNT - 3.0);
// Compute the validity after rejection.
taa_half RejectedValidity = (
min(PrevHistoryValidity, PredictionOnlyValidity - DesiredCurrentContribution) +
max(PrevHistoryValidity - PredictionOnlyValidity + DesiredCurrentContribution, taa_half(0.0)) * PrevHistoryRejectionWeight);
RejectedValidity = PrevHistoryValidity * PrevHistoryRejectionWeight;
// Compute the maximum output validity.
taa_half OutputValidity = (
clamp(RejectedValidity + DesiredCurrentContribution, taa_half(0.0), PredictionOnlyValidity) +
clamp(RejectedValidity + DesiredCurrentContribution * PrevHistoryRejectionWeight * RejectionConfidentEnough - PredictionOnlyValidity, 0.0, 1.0 - PredictionOnlyValidity));
FLATTEN
if (bIsResponsiveAAPixel)
{
OutputValidity = taa_half(0.0);
}
taa_half InvPrevHistoryValidity = SafeRcp(PrevHistoryValidity);
taa_half PrevMomentWeight = max(OutputValidity - DesiredCurrentContribution, taa_half(0.0));
taa_half CurrentMomentWeight = min(DesiredCurrentContribution, OutputValidity);
{
taa_half PrevHistoryToneWeight = HdrWeightY(Luma4(PrevHistoryMoment1) * InvPrevHistoryValidity);
taa_half FilteredInputToneWeight = HdrWeight4(FilteredInputColor);
taa_half BlendPrevHistory = PrevMomentWeight * PrevHistoryToneWeight;
taa_half BlendFilteredInput = CurrentMomentWeight * FilteredInputToneWeight;
taa_half CommonWeight = OutputValidity * SafeRcp(BlendPrevHistory + BlendFilteredInput);
FinalHistoryMoment1 = (
PrevHistoryMoment1 * (CommonWeight * BlendPrevHistory * InvPrevHistoryValidity) +
FilteredInputColor * (CommonWeight * BlendFilteredInput));
}
// Adjust for the 8-bit quantization of validity to avoid numerical drift.
taa_half OutputInvValidity = SafeRcp(OutputValidity);
FinalHistoryValidity = ceil(taa_half(255.0) * OutputValidity) * rcp(taa_half(255.0));
FinalHistoryMoment1 *= FinalHistoryValidity * OutputInvValidity;
}
// Compute the fallback history.
taa_half3 FinalFallbackColor;
taa_half FinalFallbackWeight;
{
const taa_half TargetHesteresisCurrentFrameWeight = rcp(taa_half(MAX_FALLBACK_SAMPLE_COUNT));
taa_half LumaHistory = Luma4(PrevFallbackColor);
taa_half LumaFiltered = Luma4(FilteredInputColor);
{
taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);
}
taa_half BlendFinal;
#if 1
{
taa_half CurrentFrameSampleCount = max(InputPixelAlignement, taa_half(0.005));
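// Using a single sample count recovers very quickly from history rejection, then stabilizes immediately so that subpixel frequencies become usable as soon as possible.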
taa_half PrevFallbackSampleCount;
FLATTEN
if (PrevFallbackWeight < taa_half(1.0))
{
PrevFallbackSampleCount = PrevFallbackWeight;
}
else
{
PrevFallbackSampleCount = taa_half(MAX_FALLBACK_SAMPLE_COUNT);
}
// Reject the history based on the low frequency.
#if 1
{
taa_half PrevFallbackRejectionFactor = saturate(LowFrequencyRejection * (CurrentFrameSampleCount + PrevFallbackSampleCount) / PrevFallbackSampleCount);
PrevFallbackSampleCount *= PrevFallbackRejectionFactor;
}
#endif
BlendFinal = CurrentFrameSampleCount / (CurrentFrameSampleCount + PrevFallbackSampleCount);
// Increase the blend weight when there is motion.
#if 1
{
BlendFinal = lerp(BlendFinal, max(taa_half(0.2), BlendFinal), saturate(OutputPixelVelocity * rcp(taa_half(40.0))));
}
#endif
// Anti-flicker.
#if 1
{
taa_half DistToClamp = min( abs(LumaHistory - LumaMin), abs(LumaHistory - LumaMax) ) / max3( LumaHistory, LumaFiltered, taa_half(1e-4) );
BlendFinal *= taa_half(0.2) + taa_half(0.8) * saturate(taa_half(0.5) * DistToClamp);
}
#endif
// Make sure there is at least a small contribution.
#if 1
{
BlendFinal = max( BlendFinal, saturate( taa_half(0.01) * LumaHistory / abs( LumaFiltered - LumaHistory ) ) );
}
#endif
// Responsive pixels blend in 1/4 of the new frame.
BlendFinal = bIsResponsiveAAPixel ? taa_half(1.0/4.0) : BlendFinal;
// Fully reject the history.
{
PrevFallbackSampleCount *= TotalRejection;
BlendFinal = lerp(1.0, BlendFinal, TotalRejection);
}
FinalFallbackWeight = saturate(CurrentFrameSampleCount + PrevFallbackSampleCount);
#if 1
FinalFallbackWeight = saturate(floor(255.0 * (CurrentFrameSampleCount + PrevFallbackSampleCount)) * rcp(255.0));
#endif
}
#endif
{
taa_half FilterWeight = HdrWeight4(FilteredInputColor);
taa_half ClampedHistoryWeight = HdrWeight4(ClampedFallbackColor);
taa_half2 Weights = WeightedLerpFactors(ClampedHistoryWeight, FilterWeight, BlendFinal);
FinalFallbackColor = ClampedFallbackColor * Weights.x + FilteredInputColor * Weights.y;
}
}
// Update the subpixel details.
taa_subpixel_details FinalSubpixelDetails;
{
taa_half2 dKO = taa_half2(PPCo - PPCk);
bool bUpdate = all(abs(dKO) < 0.5 * (InputInfo_ViewportSize.x * HistoryInfo_ViewportSizeInverse.x));
FinalSubpixelDetails = PrevSubpixelDetails;
taa_subpixel_payload ParallaxFactorBits = ParallaxFactorTexture[InputPixelPos] & SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK;
{
const uint ParallaxFactorMask = (
(SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 0 * SUB_PIXEL_BIT_COUNT)) |
(SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 1 * SUB_PIXEL_BIT_COUNT)) |
(SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 2 * SUB_PIXEL_BIT_COUNT)) |
(SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 3 * SUB_PIXEL_BIT_COUNT)) |
0x0);
// Reset the parallax factor.
FLATTEN
if (bOffScreen)
{
FinalSubpixelDetails = FinalSubpixelDetails & ~ParallaxFactorMask;
}
}
FLATTEN
if (bUpdate)
{
bool2 bBool = dKO < 0.0;
uint SubpixelId = dot(uint2(bBool), uint2(1, SUB_PIXEL_GRID_SIZE));
uint SubpixelShift = SubpixelId * SUB_PIXEL_BIT_COUNT;
taa_subpixel_payload SubpixelPayload = (ParallaxFactorBits << SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET);
FinalSubpixelDetails = (FinalSubpixelDetails & (~(SUB_PIXEL_BIT_MASK << SubpixelShift))) | (SubpixelPayload << SubpixelShift);
}
}
// Compute the final output.
taa_half3 FinalOutputColor;
taa_half FinalOutputValidity;
{
taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);
FinalOutputValidity = lerp(taa_half(1.0), saturate(FinalHistoryValidity), OutputBlend);
taa_half3 NormalizedFinalHistoryMoment1 = taa_half3(FinalHistoryMoment1 * float(SafeRcp(FinalHistoryValidity)));
taa_half FallbackWeight = HdrWeight4(FinalFallbackColor);
taa_half Moment1Weight = HdrWeight4(NormalizedFinalHistoryMoment1);
taa_half2 Weights = WeightedLerpFactors(FallbackWeight, Moment1Weight, OutputBlend);
#if DEBUG_FALLBACK_BLENDING
taa_half3 FallbackColor = taa_half3(1, 0.25, 0.25);
taa_half3 HighFrequencyColor = taa_half3(0.25, 1, 0.25);
FinalOutputColor = FinalFallbackColor * Weights.x * FallbackColor + NormalizedFinalHistoryMoment1 * Weights.y * HighFrequencyColor;
#elif DEBUG_LOW_FREQUENCY_REJECTION
taa_half3 DebugColor = lerp(taa_half3(1, 0.5, 0.5), taa_half3(0.5, 1, 0.5), LowFrequencyRejection);
FinalOutputColor = FinalFallbackColor * Weights.x * DebugColor + NormalizedFinalHistoryMoment1 * Weights.y * DebugColor;
#else
FinalOutputColor = FinalFallbackColor * Weights.x + NormalizedFinalHistoryMoment1 * Weights.y;
#endif
}
ISOLATE
{
uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);
taa_short2 LocalHistoryPixelPos = (
taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
Map8x8Tile2x2Lane(LocalGroupThreadIndex));
LocalHistoryPixelPos = InvalidateOutputPixelPos(LocalHistoryPixelPos, HistoryInfo_ViewportMax);
// Write the final history.
{
#if CONFIG_ENABLE_STOCASTIC_QUANTIZATION
{
uint2 Random = Rand3DPCG16(int3(LocalHistoryPixelPos, View.StateFrameIndexMod8)).xy;
float2 E = Hammersley16(0, 1, Random);
FinalHistoryMoment1 = QuantizeForFloatRenderTarget(FinalHistoryMoment1, E.x, HistoryQuantizationError);
FinalFallbackColor = QuantizeForFloatRenderTarget(FinalFallbackColor, E.x, HistoryQuantizationError);
}
#endif
FinalFallbackColor = -min(-FinalFallbackColor, taa_half(0.0));
FinalHistoryMoment1 = -min(-FinalHistoryMoment1, taa_half(0.0));
FinalFallbackColor = min(FinalFallbackColor, taa_half(Max10BitsFloat));
FinalHistoryMoment1 = min(FinalHistoryMoment1, taa_half(Max10BitsFloat));
HistoryOutput_Textures_0[LocalHistoryPixelPos] = FinalFallbackColor;
HistoryOutput_Textures_1[LocalHistoryPixelPos] = FinalHistoryMoment1;
HistoryOutput_Textures_2[LocalHistoryPixelPos] = taa_half2(FinalFallbackWeight, FinalHistoryValidity);
HistoryOutput_Textures_3[LocalHistoryPixelPos] = FinalSubpixelDetails;
#if DEBUG_OUTPUT
{
DebugOutput[LocalHistoryPixelPos] = Debug;
}
#endif
}
// Write the final scene color.
{
taa_half3 OutputColor = FinalOutputColor;
OutputColor = -min(-OutputColor, taa_half(0.0));
OutputColor = min(OutputColor, taa_half(Max10BitsFloat));
SceneColorOutput[LocalHistoryPixelPos] = OutputColor;
}
}
}
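As the code shows, compared with traditional TAA, TSR tracks much more data, including current and historical high-frequency and low-frequency signals, parallax factors, and reprojection data. It uses this information in turn to reject or restore history, derives the blend weights for the current frame, and finally computes the anti-aliased scene color and the history data.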
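Two small pieces of the shader above are easy to isolate: the sample-count-based blend factor of the fallback history and the 8-bit quantization of the validity value. The sketch below restates them in C++. The function names are hypothetical, and the tone weighting (HdrWeight4, WeightedLerpFactors), the motion and anti-flicker adjustments, and the rejection terms are deliberately left out.

```cpp
#include <cassert>
#include <cmath>

// Mirrors: BlendFinal = CurrentFrameSampleCount /
//                       (CurrentFrameSampleCount + PrevFallbackSampleCount)
// The more samples the history has accumulated, the less the new frame contributes.
inline float ComputeBlendFinal(float CurrentFrameSampleCount, float PrevFallbackSampleCount)
{
    assert(CurrentFrameSampleCount > 0.0f && PrevFallbackSampleCount >= 0.0f);
    return CurrentFrameSampleCount / (CurrentFrameSampleCount + PrevFallbackSampleCount);
}

// Mirrors: ceil(255.0 * OutputValidity) * rcp(255.0)
// Rounds the validity up to the next value representable in 8 bits, so the
// stored value and the value used for normalization stay consistent.
inline float QuantizeValidity8Bit(float Validity)
{
    assert(Validity >= 0.0f && Validity <= 1.0f);
    return std::ceil(255.0f * Validity) / 255.0f;
}
```

With one current sample and three history samples, the new frame contributes a quarter of the result; with no usable history (for instance after a full rejection), it contributes everything.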
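The code above covers only the last TSR stage, the history update. Many earlier steps generate the data this stage needs; they are not analyzed here and are left for readers to explore.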
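One more self-contained idea from the update-history shader is the neighborhood clamp: the minimum and maximum of the current frame's input samples form a box, and the reprojected history color is clamped into it before blending. The sketch below shows only that idea. FColor3, FColorBox, BuildClampingBox, and ClampToBox are hypothetical names, and the color-space conversion that the shader applies through TransformColorForClampingBox is omitted, so the channels here are generic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct FColor3 { float X, Y, Z; };

struct FColorBox
{
    FColor3 Min;
    FColor3 Max;
};

// Builds the per-channel min/max box over the input neighborhood, following
// the shader's pattern: initialize from sample 0, then extend with the rest.
inline FColorBox BuildClampingBox(const std::vector<FColor3>& Neighborhood)
{
    assert(!Neighborhood.empty());
    FColorBox Box{ Neighborhood[0], Neighborhood[0] };
    for (std::size_t i = 1; i < Neighborhood.size(); ++i)
    {
        const FColor3& C = Neighborhood[i];
        Box.Min = { std::min(Box.Min.X, C.X), std::min(Box.Min.Y, C.Y), std::min(Box.Min.Z, C.Z) };
        Box.Max = { std::max(Box.Max.X, C.X), std::max(Box.Max.Y, C.Y), std::max(Box.Max.Z, C.Z) };
    }
    return Box;
}

// Clamps a history color into the box, channel by channel.
inline FColor3 ClampToBox(FColor3 History, const FColorBox& Box)
{
    return { std::max(Box.Min.X, std::min(History.X, Box.Max.X)),
             std::max(Box.Min.Y, std::min(History.Y, Box.Max.Y)),
             std::max(Box.Min.Z, std::min(History.Z, Box.Max.Z)) };
}
```

A history color that already lies inside the box passes through unchanged, while an outlier (typically stale history after disocclusion) is pulled back to the nearest value the current neighborhood supports, which is what limits ghosting.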
6.6.2 Strata
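From a quick look at the Strata code, Strata appears similar to UE4's Material Layers, but it seems to be mainly applied to material projection, blending, and lighting for Nanite geometry. Strata has its own materials, material nodes, shading models, visualization modes, and shader modules. In the current EA version, however, it is still experimental and has many limitations. The main files involving Strata are: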
- Strata.h/cpp
- StrataMaterial.h/cpp
- StrataDefinitions.h
- MaterialExpressionStrata.h
- Strata.ush
- BasePassPixelShader.usf
- DeferredLightPixelShaders.usf
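- Scene rendering pipeline and lighting-related code.
Interested readers can study the relevant source code on their own.
6.7 Summary
This article covered UE5's editor features, Nanite, Lumen, and related rendering techniques. Because UE5 changes so much, it cannot cover every technical point; many techniques beyond those discussed here remain, and interested readers will need to explore the UE source code themselves.
In the UE5 EA stage, both Nanite and Lumen still have many flaws: Nanite supports only static objects, Lumen suffers from noise and light leaking, TSR flickers and blurs, shadow precision is insufficient (see the figure below), and a large number of traditional features are unsupported.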
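Blurry objects and shadow artifacts that appear when the camera is close enough to an object.
Although UE5 currently has many flaws, it is a sapling bathed in sunshine and rain. Carefully nurtured by Epic Games, it will in time grow into a towering, leafy tree that shelters every industry connected to the UE engine. UE5 really No.1!!!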
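Special Notes
- Thanks to the authors of all the references. Some images come from the references and the Internet and will be removed upon request if they infringe.
- This series is the author's original work and is published only on 博客園 (cnblogs). Sharing the link to this article is welcome, but reproduction without permission is not allowed!
- This series is still in progress; for the full table of contents, see the content outline.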
References
- Unreal Engine Source
- Rendering and Graphics
- Materials
- Graphics Programming
- New Rendering Features
- Lumen Global Illumination and Reflections
- Lumen Technical Details
- Behind the scenes of “Lumen in the Land of Nanite” | Unreal Engine 5
- Unreal Engine 5 Early Access Release Notes
- 初探虛幻引擎5
- 如何評價 Unreal Engine 5 Early-Access?
- UE5 Nanite和Lumen背后的優化技術
- Clipmap 在開放世界中的實戰應用
- Family of Graph and Hypergraph Partitioning Software
- UE5 Lumen實現分析
- GPU-Driven Rendering Pipeline
- Optimizing the Graphics Pipeline with Compute
- DynamicOcclusionWithSignedDistanceFields
- UE4硬件光追對比UE5 Lumen
- UE5 Lumen原理介紹
- Lumen | Inside Unreal
- Intel Embree
- Embree Overview
- Silhouette Partitioning for Height Field Ray Tracing (Tomas Sakalauskas, Vilnius University)
- Ray Tracing Height Fields
- Accelerating the ray tracing of height fields
- Brief Analysis of UE5 Rendering Pipeline
- Bin packing problem
- Interactive Indirect Illumination Using Voxel Cone Tracing
- Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination
- Comparing 3D-Clipmaps and Sparse Voxel Octrees for voxel based conetracing
- PRACTICAL REAL-TIME VOXEL-BASED GLOBAL ILLUMINATION FOR CURRENT GPUS
- Lecture9 Real-Time Global Illumination(Screen Space)