UE4: Synchronizing the Game Thread and the Render Thread

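Now that we understand how UE4's multithreading works, we need to look at how the game thread (main thread) and the render thread are actually synchronized.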

UE4 multithreading architecture

https://papalqiblog.oss-cn-beijing.aliyuncs.com/blog/picture20210625120443.png

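Renderer synchronization between the GameThread, RenderThread, RHI Thread and GPU is a very complex topic. In short, Unreal Engine 4 is usually configured as a "single frame behind" renderer. This means that while the RenderThread is processing frame N, the GameThread is processing frame N+1, unless the RenderThread is running faster than the GameThread.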

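Adding the RHI thread complicates synchronization further, because while the RHI thread is processing frame N, the RenderThread can move ahead of it by completing the visibility calculations for frame N+1. The net result is that while the GameThread is processing frame N+1, the RenderThread may be processing commands for frame N or N+1, and the RHI thread may likewise be translating commands for frame N or N+1, depending on execution times.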

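At the end of each frame, the game thread synchronizes with the render thread. This is done through a static FFrameEndSync; the console variable r.OneFrameThreadLag decides whether a one-frame lag is allowed.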

	// Sync the game thread with the render thread
	{
		SCOPE_CYCLE_COUNTER(STAT_FrameSyncTime);
		static FFrameEndSync FrameEndSync;

		static auto CVarAllowOneFrameThreadLag = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.OneFrameThreadLag"));
		FrameEndSync.Sync(CVarAllowOneFrameThreadLag->GetValueOnGameThread() != 0);
	}

Let's look at the FFrameEndSync class:

class FFrameEndSync
{
	/** Pair of fences. */
	FRenderCommandFence Fence[2];
	/** Current index into events array. */
	int32 EventIndex;
public:
	/**
	 * Syncs the game thread with the render thread. Depending on passed in bool this will be a total
	 * sync or a one frame lag.
	 */
	ENGINE_API void Sync( bool bAllowOneFrameThreadLag );
};

Inside FFrameEndSync::Sync, the synchronization between the game thread and the render thread starts with Fence[EventIndex].BeginFence(true);

void FFrameEndSync::Sync( bool bAllowOneFrameThreadLag )
{
	check(IsInGameThread());			

	// Since this is the frame end sync, allow sync with the RHI and GPU (true).
	Fence[EventIndex].BeginFence(true);

	bool bEmptyGameThreadTasks = !FTaskGraphInterface::Get().IsThreadProcessingTasks(ENamedThreads::GameThread);

	if (bEmptyGameThreadTasks)
	{
		// need to process gamethread tasks at least once a frame no matter what
		FTaskGraphInterface::Get().ProcessThreadUntilIdle(ENamedThreads::GameThread);
	}

	// Use two events if we allow a one frame lag.
	if( bAllowOneFrameThreadLag )
	{
		EventIndex = (EventIndex + 1) % 2;
	}

	// if we only have two cores, it is important to leave them for the RT to get its work done.
	static bool bEnoughCoresToDoAsyncLoadingWhileWaitingForVSync = FPlatformMisc::NumberOfCoresIncludingHyperthreads() > 2;

	if (bEnoughCoresToDoAsyncLoadingWhileWaitingForVSync && GDoAsyncLoadingWhileWaitingForVSync)
	{
		const int32 MaxTicks = 5;
		int32 NumTicks = 0;
		float TimeLimit = GAsyncLoadingTimeLimit / 1000.f / float(MaxTicks);
		while (NumTicks < MaxTicks && !Fence[EventIndex].IsFenceComplete() && IsAsyncLoading())
		{
			NumTicks++;
			ProcessAsyncLoading(true, false, TimeLimit);
			if (bEmptyGameThreadTasks)
			{
				FTaskGraphInterface::Get().ProcessThreadUntilIdle(ENamedThreads::GameThread);
			}
		}
	}
	Fence[EventIndex].Wait(bEmptyGameThreadTasks);  // here we also opportunistically execute game thread tasks while we wait
}

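The fence is used in two steps here: first BeginFence, then Wait. When a one-frame lag is allowed, EventIndex flips between the two fences after BeginFence, so Wait targets the fence that was begun at the end of the previous frame rather than the one just inserted.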
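The ping-pong between the two fences can be sketched outside the engine. The following is a minimal model, not engine code: RenderQueueSketch, FenceSketch, FrameEndSyncSketch and MaxGameThreadLead are hypothetical names, the "render thread" is assumed to be a plain std::thread draining a FIFO queue, and each frame's render work is assumed to be a single command that sleeps for 2 ms.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the render thread: one worker draining a FIFO queue.
class RenderQueueSketch {
public:
    RenderQueueSketch() : Worker([this] { Run(); }) {}
    ~RenderQueueSketch() {
        { std::lock_guard<std::mutex> Lock(Mutex); bQuit = true; }
        Wake.notify_one();
        Worker.join(); // remaining commands are drained before the worker exits
    }
    void Enqueue(std::function<void()> Command) {
        { std::lock_guard<std::mutex> Lock(Mutex); Commands.push_back(std::move(Command)); }
        Wake.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::function<void()> Command;
            {
                std::unique_lock<std::mutex> Lock(Mutex);
                Wake.wait(Lock, [this] { return bQuit || !Commands.empty(); });
                if (Commands.empty()) return; // quit requested and nothing left to run
                Command = std::move(Commands.front());
                Commands.pop_front();
            }
            Command();
        }
    }
    std::mutex Mutex;
    std::condition_variable Wake;
    std::deque<std::function<void()>> Commands;
    bool bQuit = false;
    std::thread Worker; // declared last so the members above exist before it starts
};

// Hypothetical fence: a no-op command at the queue tail that signals completion.
class FenceSketch {
public:
    void BeginFence(RenderQueueSketch& Queue) {
        auto Done = std::make_shared<std::promise<void>>();
        Completion = Done->get_future().share();
        Queue.Enqueue([Done] { Done->set_value(); });
    }
    void Wait() const {
        if (Completion.valid()) Completion.wait(); // a fence never begun counts as complete
    }
private:
    std::shared_future<void> Completion;
};

// Mirrors the two-fence logic of FFrameEndSync::Sync, minus async loading and task pumping.
class FrameEndSyncSketch {
public:
    void Sync(RenderQueueSketch& Queue, bool bAllowOneFrameThreadLag) {
        Fence[EventIndex].BeginFence(Queue);
        if (bAllowOneFrameThreadLag) EventIndex = (EventIndex + 1) % 2;
        Fence[EventIndex].Wait();
    }
private:
    FenceSketch Fence[2];
    int EventIndex = 0;
};

// Returns the largest number of submitted-but-unrendered frames seen at the start of a game frame.
int MaxGameThreadLead(int NumFrames, bool bAllowOneFrameThreadLag) {
    std::atomic<int> RenderedFrames{0};
    int MaxLead = 0;
    RenderQueueSketch Queue;
    FrameEndSyncSketch FrameEndSync;
    for (int Frame = 1; Frame <= NumFrames; ++Frame) {
        MaxLead = std::max(MaxLead, (Frame - 1) - RenderedFrames.load());
        Queue.Enqueue([&RenderedFrames] {
            std::this_thread::sleep_for(std::chrono::milliseconds(2)); // pretend to render
            ++RenderedFrames;
        });
        FrameEndSync.Sync(Queue, bAllowOneFrameThreadLag);
    }
    return MaxLead;
}
```

In this model, MaxGameThreadLead(20, false) is always 0, because every Sync waits on the fence it has just inserted. MaxGameThreadLead(20, true) never exceeds 1, because every Sync waits on the fence inserted one frame earlier.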

BeginFence

Let's look at what FRenderCommandFence::BeginFence() does:

void FRenderCommandFence::BeginFence(bool bSyncToRHIAndGPU)
{
	//...
    CompletionEvent = TGraphTask<FNullGraphTask>::CreateTask(NULL, ENamedThreads::GameThread).ConstructAndDispatchWhenReady(
        GET_STATID(STAT_FNullGraphTask_FenceRenderCommand), ENamedThreads::GetRenderThread());
}

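The first argument to TGraphTask's CreateTask is the list of prerequisites, and the second is the current thread. The thread the task actually runs on is determined by the task constructor arguments passed to ConstructAndDispatchWhenReady, here ENamedThreads::GetRenderThread(). So this call appends an empty task to the tail of the render thread's queue.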
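Why an empty task at the tail of the queue works as a fence can be shown with a single-threaded sketch. This is a simplified model under stated assumptions, not the TaskGraph API: NamedThreadSketch, TaskSketch, TaskGraphSketch, BeginFenceSketch and FenceOrderingDemo are hypothetical names, and each named thread is assumed to own one strictly FIFO queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <vector>

enum class NamedThreadSketch { GameThread = 0, RenderThread = 1 };

// The target thread is part of the task itself, fixed when the task is constructed.
struct TaskSketch {
    NamedThreadSketch DesiredThread;
    std::function<void()> Body;
};

class TaskGraphSketch {
public:
    // Routes the task to the queue of the thread named at construction time.
    void Dispatch(TaskSketch Task) {
        Queues[static_cast<std::size_t>(Task.DesiredThread)].push_back(std::move(Task.Body));
    }
    // Runs the oldest queued task of the given thread; returns false if its queue is empty.
    bool RunOne(NamedThreadSketch Thread) {
        auto& Queue = Queues[static_cast<std::size_t>(Thread)];
        if (Queue.empty()) return false;
        std::function<void()> Body = std::move(Queue.front());
        Queue.pop_front();
        Body();
        return true;
    }
private:
    std::deque<std::function<void()>> Queues[2];
};

// The fence is a task with no real work: running it only marks the fence as complete.
std::shared_ptr<bool> BeginFenceSketch(TaskGraphSketch& Graph) {
    auto bComplete = std::make_shared<bool>(false);
    Graph.Dispatch({NamedThreadSketch::RenderThread, [bComplete] { *bComplete = true; }});
    return bComplete;
}

// Queues two render commands, then a fence, and drains the render queue until the fence fires.
std::vector<std::string> FenceOrderingDemo() {
    TaskGraphSketch Graph;
    std::vector<std::string> Log;
    Graph.Dispatch({NamedThreadSketch::RenderThread, [&Log] { Log.push_back("draw A"); }});
    Graph.Dispatch({NamedThreadSketch::RenderThread, [&Log] { Log.push_back("draw B"); }});
    std::shared_ptr<bool> bFenceComplete = BeginFenceSketch(Graph);
    while (!*bFenceComplete && Graph.RunOne(NamedThreadSketch::RenderThread)) {
        if (*bFenceComplete) Log.push_back("fence complete");
    }
    return Log;
}
```

FenceOrderingDemo() returns "draw A", "draw B", "fence complete" in that order. Because the queue is FIFO, the fence task can only run after every command that was queued before it.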

Wait

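If the fence task has already executed, the render thread has processed everything that was queued before the fence, and Wait returns immediately. Otherwise the game thread waits for it.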

bool FRenderCommandFence::IsFenceComplete() const
{
	if (!GIsThreadedRendering)
	{
		return true;
	}
	check(IsInGameThread() || IsInAsyncLoadingThread());
	CheckRenderingThreadHealth();
	if (!CompletionEvent.GetReference() || CompletionEvent->IsComplete())
	{
		CompletionEvent = NULL; // this frees the handle for other uses, the NULL state is considered completed
		return true;
	}
	return false;
}

void FRenderCommandFence::Wait(bool bProcessGameThreadTasks) const
{
	if (!IsFenceComplete())
	{
		StopRenderCommandFenceBundler();
		GameThreadWaitForTask(CompletionEvent, TriggerThreadIndex, bProcessGameThreadTasks);
	}
}

Analysis

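  1. The game thread stalls. Because the render thread is task-driven, it simply has no tasks to run and sits idle.
  2. The render thread stalls. The game thread waits at the fence until the render thread finishes.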
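The two cases above can be checked with a small timing model. This is a sketch under simplifying assumptions, not a measurement of the engine: every frame costs a fixed GameMs on the game thread and RenderMs on the render thread, a frame's render work is assumed to be submitted when its game frame ends, and FrameSyncStatsSketch and SimulateFrameSync are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

struct FrameSyncStatsSketch {
    long GameWaitMs = 0;   // total time the game thread spent blocked in the frame-end sync
    long RenderIdleMs = 0; // total time the render thread had nothing to run between frames
};

FrameSyncStatsSketch SimulateFrameSync(long GameMs, long RenderMs, int NumFrames,
                                       bool bAllowOneFrameThreadLag) {
    FrameSyncStatsSketch Stats;
    long GameStart = 0;
    long PrevRenderEnd = 0; // time at which the previous frame finished rendering
    for (int Frame = 1; Frame <= NumFrames; ++Frame) {
        const long GameEnd = GameStart + GameMs;
        // Rendering starts once the work is submitted and the previous frame is done.
        const long RenderStart = std::max(GameEnd, PrevRenderEnd);
        if (Frame > 1) Stats.RenderIdleMs += RenderStart - PrevRenderEnd;
        const long RenderEnd = RenderStart + RenderMs;
        // With a one-frame lag the game thread waits on the previous frame's fence,
        // otherwise on the fence of the frame it has just submitted.
        const long FenceTime = bAllowOneFrameThreadLag ? PrevRenderEnd : RenderEnd;
        const long NextGameStart = std::max(GameEnd, FenceTime);
        Stats.GameWaitMs += NextGameStart - GameEnd;
        GameStart = NextGameStart;
        PrevRenderEnd = RenderEnd;
    }
    return Stats;
}
```

With a one-frame lag over three frames, a slow game thread (GameMs = 10, RenderMs = 4) gives a game wait of 0 ms and a render idle time of 12 ms, which is case 1. A slow render thread (GameMs = 4, RenderMs = 10) gives a game wait of 12 ms and no render idle time, which is case 2.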