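Now that we understand how UE4's multithreading works, the next question is how the game thread (the main thread) and the render thread are actually synchronized.
UE4 Multithreading Architecture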
![20210625120443](https://papalqiblog.oss-cn-beijing.aliyuncs.com/blog/picture20210625120443.png)
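Renderer synchronization between the GameThread, RenderThread, RHI Thread, and GPU is a very complex topic. In short, Unreal Engine 4 is normally configured as a "single frame behind" renderer. This means that the GameThread processes frame N+1 while the RenderThread processes frame N, unless the RenderThread runs faster than the GameThread.
Adding the RHI thread complicates synchronization further, because the RenderThread can move ahead of the RHI thread by completing the visibility calculations for frame N+1 while the RHI thread is still processing frame N. The end result is that while the GameThread processes frame N+1, the RenderThread can be processing commands for frame N or N+1, and the RHI thread can likewise be translating commands for frame N or N+1, depending on execution times.
At the end of each frame, the game thread and the render thread are synchronized. This is done through a static FFrameEndSync instance.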
    // Synchronize the game thread and the render thread
    {
        SCOPE_CYCLE_COUNTER(STAT_FrameSyncTime);
        static FFrameEndSync FrameEndSync;
        static auto CVarAllowOneFrameThreadLag = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.OneFrameThreadLag"));
        FrameEndSync.Sync(CVarAllowOneFrameThreadLag->GetValueOnGameThread() != 0);
    }
Let's look at the FFrameEndSync data structure.
    class FFrameEndSync
    {
        /** Pair of fences. */
        FRenderCommandFence Fence[2];
        /** Current index into events array. */
        int32 EventIndex;
    public:
        /**
         * Syncs the game thread with the render thread. Depending on passed in bool this will be a total
         * sync or a one frame lag.
         */
        ENGINE_API void Sync( bool bAllowOneFrameThreadLag );
    };
Synchronization between the game thread and the render thread starts with the call Fence[EventIndex].BeginFence(true); inside FFrameEndSync::Sync.
    void FFrameEndSync::Sync( bool bAllowOneFrameThreadLag )
    {
        check(IsInGameThread());
        // Since this is the frame end sync, allow sync with the RHI and GPU (true).
        Fence[EventIndex].BeginFence(true);
        bool bEmptyGameThreadTasks = !FTaskGraphInterface::Get().IsThreadProcessingTasks(ENamedThreads::GameThread);
        if (bEmptyGameThreadTasks)
        {
            // need to process gamethread tasks at least once a frame no matter what
            FTaskGraphInterface::Get().ProcessThreadUntilIdle(ENamedThreads::GameThread);
        }
        // Use two events if we allow a one frame lag.
        if( bAllowOneFrameThreadLag )
        {
            EventIndex = (EventIndex + 1) % 2;
        }
        // if we only have two cores, it is important to leave them for the RT to get its work done.
        static bool bEnoughCoresToDoAsyncLoadingWhileWaitingForVSync = FPlatformMisc::NumberOfCoresIncludingHyperthreads() > 2;
        if (bEnoughCoresToDoAsyncLoadingWhileWaitingForVSync && GDoAsyncLoadingWhileWaitingForVSync)
        {
            const int32 MaxTicks = 5;
            int32 NumTicks = 0;
            float TimeLimit = GAsyncLoadingTimeLimit / 1000.f / float(MaxTicks);
            while (NumTicks < MaxTicks && !Fence[EventIndex].IsFenceComplete() && IsAsyncLoading())
            {
                NumTicks++;
                ProcessAsyncLoading(true, false, TimeLimit);
                if (bEmptyGameThreadTasks)
                {
                    FTaskGraphInterface::Get().ProcessThreadUntilIdle(ENamedThreads::GameThread);
                }
            }
        }
        Fence[EventIndex].Wait(bEmptyGameThreadTasks); // here we also opportunistically execute game thread tasks while we wait
    }
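The fence is used in two steps here: first BeginFence, then Wait. Note that when a one-frame lag is allowed, EventIndex is switched between the two calls, so Wait blocks on the fence that was begun at the end of the previous frame rather than on the one just begun.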
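To make the effect of EventIndex concrete, here is a minimal sketch in standard C++. It is not engine code: FrameEndSyncModel and WaitedFrames are hypothetical names, and each "fence" is reduced to the number of the frame on which it was begun, so the only thing the model shows is which frame's fence the game thread would wait on.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for FFrameEndSync. A fence here only remembers the
// frame on which it was begun; -1 means "never begun" (treated as complete).
struct FrameEndSyncModel {
    int FenceFrame[2] = {-1, -1};
    int EventIndex = 0;

    // Returns the frame whose fence the game thread would wait on.
    int Sync(int CurrentFrame, bool bAllowOneFrameThreadLag) {
        assert(CurrentFrame >= 0);
        FenceFrame[EventIndex] = CurrentFrame;  // step 1: BeginFence for this frame
        if (bAllowOneFrameThreadLag) {
            EventIndex = (EventIndex + 1) % 2;  // switch to the other fence
        }
        return FenceFrame[EventIndex];          // step 2: Wait on the selected fence
    }
};

// Runs NumFrames frames and records which fence is waited on in each frame.
std::vector<int> WaitedFrames(int NumFrames, bool bAllowOneFrameThreadLag) {
    FrameEndSyncModel Model;
    std::vector<int> Result;
    for (int Frame = 0; Frame < NumFrames; ++Frame) {
        Result.push_back(Model.Sync(Frame, bAllowOneFrameThreadLag));
    }
    return Result;
}
```

With the lag enabled, WaitedFrames(4, true) yields -1, 0, 1, 2: the first frame has nothing to wait for, and every later frame waits on the previous frame's fence. With the lag disabled, the result is 0, 1, 2, 3, a full sync every frame.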
BeginFence
Let's look at the contents of FRenderCommandFence::BeginFence().
    void FRenderCommandFence::BeginFence(bool bSyncToRHIAndGPU)
    {
        //...
        CompletionEvent = TGraphTask<FNullGraphTask>::CreateTask(NULL, ENamedThreads::GameThread).ConstructAndDispatchWhenReady(
            GET_STATID(STAT_FNullGraphTask_FenceRenderCommand), ENamedThreads::GetRenderThread());
    }
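The first argument to TGraphTask's CreateTask is the list of prerequisites, and the second is the current thread. The thread the task actually runs on is decided afterwards, by the arguments forwarded to the task's constructor (here ENamedThreads::GetRenderThread()). So this call appends an empty task to the end of the render thread's queue.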
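The idea of a fence as an empty marker at the tail of a FIFO queue can be sketched without any engine types. The following is a single-threaded model under stated assumptions: CommandQueueModel, Pump and DrawsSeenWhenFenceCompletes are hypothetical names, and the "render thread" is simulated by draining the queue one command at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical, single-threaded model of a render command queue.
class CommandQueueModel {
public:
    void Enqueue(std::function<void()> Command) {
        Commands.push_back(std::move(Command));
    }

    // Appends an empty marker command and returns the flag it sets when run.
    std::shared_ptr<bool> BeginFence() {
        auto Completed = std::make_shared<bool>(false);
        Commands.push_back([Completed] { *Completed = true; });
        return Completed;
    }

    // Executes up to MaxCommands commands in FIFO order; returns how many ran.
    std::size_t Pump(std::size_t MaxCommands) {
        std::size_t Executed = 0;
        while (Executed < MaxCommands && !Commands.empty()) {
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop_front();
            Command();
            ++Executed;
        }
        return Executed;
    }

private:
    std::deque<std::function<void()>> Commands;
};

// Returns how many draw commands had run at the moment the fence completed.
int DrawsSeenWhenFenceCompletes(int NumDraws) {
    assert(NumDraws >= 0);
    CommandQueueModel Queue;
    int DrawsExecuted = 0;
    for (int i = 0; i < NumDraws; ++i) {
        Queue.Enqueue([&DrawsExecuted] { ++DrawsExecuted; });
    }
    std::shared_ptr<bool> Fence = Queue.BeginFence();
    // A command enqueued after the fence; the fence does not depend on it.
    Queue.Enqueue([&DrawsExecuted] { ++DrawsExecuted; });
    while (!*Fence) {
        Queue.Pump(1);
    }
    return DrawsExecuted;
}
```

Because the queue is processed in order, the marker can only run after everything enqueued before it, which is why observing the marker is enough to know the earlier commands are done. Commands enqueued after the marker play no part in its completion.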
Wait
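Once that task has run, every render command enqueued before it has also been processed, so the fence counts as complete. If it has not run yet, the game thread waits.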
    bool FRenderCommandFence::IsFenceComplete() const
    {
        if (!GIsThreadedRendering)
        {
            return true;
        }
        check(IsInGameThread() || IsInAsyncLoadingThread());
        CheckRenderingThreadHealth();
        if (!CompletionEvent.GetReference() || CompletionEvent->IsComplete())
        {
            CompletionEvent = NULL; // this frees the handle for other uses, the NULL state is considered completed
            return true;
        }
        return false;
    }
    void FRenderCommandFence::Wait(bool bProcessGameThreadTasks) const
    {
        if (!IsFenceComplete())
        {
            StopRenderCommandFenceBundler();
            GameThreadWaitForTask(CompletionEvent, TriggerThreadIndex, bProcessGameThreadTasks);
        }
    }
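The poll-then-block shape of IsFenceComplete and Wait can be approximated with the standard library. This is only a sketch under assumptions, not how the engine implements it: FenceModel and CommandsDoneAfterWait are hypothetical names, std::promise and std::future stand in for the graph event, a plain std::thread stands in for the render thread, and the processing of game thread tasks during the wait is left out.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for FRenderCommandFence, built on std::future.
class FenceModel {
public:
    explicit FenceModel(std::future<void> InCompletion)
        : Completion(std::move(InCompletion)) {}

    // Non-blocking poll: has the marker task run yet?
    bool IsComplete() const {
        return Completion.wait_for(std::chrono::seconds(0)) ==
               std::future_status::ready;
    }

    // Blocks the calling thread only if the fence is not complete yet.
    void Wait() const {
        if (!IsComplete()) {
            Completion.wait();
        }
    }

private:
    std::future<void> Completion;
};

// Runs NumCommands simulated render commands on a worker thread, followed by
// the fence marker. Returns how many commands had finished when Wait returned.
int CommandsDoneAfterWait(int NumCommands) {
    assert(NumCommands >= 0);
    std::atomic<int> Done{0};
    std::promise<void> Marker;
    FenceModel Fence(Marker.get_future());

    std::thread RenderThread([&] {
        for (int i = 0; i < NumCommands; ++i) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            ++Done;  // one simulated render command finished
        }
        Marker.set_value();  // the marker runs last and completes the fence
    });

    Fence.Wait();  // the "game thread" blocks here until the marker has run
    const int Observed = Done.load();
    RenderThread.join();
    return Observed;
}
```

Since the marker is signaled only after the last command, the value observed after Wait always equals NumCommands, no matter how the two threads are scheduled.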
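Analysis
- The game thread stalls. Because the render thread is task-driven, it simply has no tasks to run and sits idle.
- The render thread stalls. The game thread waits at the frame-end sync until the render thread has finished.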
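The two cases above can be illustrated with a deliberately simplified timing model. Everything in it is an assumption made for illustration: GameThreadTimeAfterFrames is a hypothetical function, each thread is given a fixed cost per frame, the render thread is assumed to start frame N only after the game thread has finished submitting it, and the RHI thread and GPU are ignored.

```cpp
#include <algorithm>
#include <cassert>

// Simplified, hypothetical timing model (not engine code).
//  - The game thread needs GameMs per frame, the render thread RenderMs.
//  - The render thread starts frame N once the game thread has finished
//    frame N and the render thread has finished frame N-1.
//  - At the end of frame N the game thread waits until the render thread has
//    finished frame N (no lag) or frame N-1 (one frame of lag).
// Returns the time at which the game thread may start frame NumFrames.
int GameThreadTimeAfterFrames(int NumFrames, int GameMs, int RenderMs,
                              bool bAllowOneFrameThreadLag) {
    assert(NumFrames >= 0 && GameMs >= 0 && RenderMs >= 0);
    int GameTime = 0;       // when the game thread starts the next frame
    int PrevRenderEnd = 0;  // when the render thread finished frame N-1
    for (int Frame = 0; Frame < NumFrames; ++Frame) {
        const int GameEnd = GameTime + GameMs;
        const int RenderEnd = std::max(GameEnd, PrevRenderEnd) + RenderMs;
        const int FenceTime = bAllowOneFrameThreadLag ? PrevRenderEnd : RenderEnd;
        GameTime = std::max(GameEnd, FenceTime);  // the frame-end sync
        PrevRenderEnd = RenderEnd;
    }
    return GameTime;
}
```

In this model, four frames with a 10 ms game thread and a 5 ms render thread end at 40 ms with the lag enabled: the game thread never waits and the render thread idles. With a 5 ms game thread and a 10 ms render thread the same four frames end at 35 ms, because the game thread is held back by the fence. Without the lag both configurations end at 60 ms, since the two threads then effectively run one after the other.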