bricklore bricklore - 1 year ago 106
C++ Question

Why is a VkFence necessary for every swapchain command buffer when already using semaphores?

I am using exactly 3 images for the swapchain and one VkCommandBuffer (CB) per swapchain image. GPU synchronization is done with two semaphores, one for presentation_finished and one for rendering_finished. The present mode is (quick overview).

Now when I am running my example without waiting for any CB fences, the validation layers report this error as soon as any swapchain image is used for the second time:

on active CB before it has completed. You must check CB fence before this call.

At first sight this seems reasonable to me, as processing the commands from the CB might just not be finished yet. But the more I think about it, the more I come to the conclusion that this should not happen at all.

My current understanding is that when vkAcquireNextImageKHR returns a specific image index, it implies that the image returned has to be finished with rendering.

That is because I'm passing the rendering_finished semaphore to vkQueueSubmit to be signaled when rendering finishes and to vkQueuePresentKHR to wait until it becomes signaled before presenting the image.

The specification of VkPresentInfoKHR says:

pWaitSemaphores, if not VK_NULL_HANDLE, is an array of VkSemaphore objects with waitSemaphoreCount entries, and specifies the semaphores to wait for before issuing the present request

Meaning: I will never present any image which hasn't finished rendering, and thus the associated CB cannot be in use any more as soon as the image is presented.

The second semaphore, presentation_finished, is signalled by vkAcquireNextImageKHR and passed to the same vkQueueSubmit (to start the rendering). This means the rendering of any image will start no earlier than allowed by the presentation engine.

To conclude: A present request from vkQueuePresentKHR is not issued before the rendering of the image is finished, and vkAcquireNextImageKHR blocks until an image is available and also will never return an image that is currently acquired.

What am I missing that makes a fence necessary?

I have included a minimal code example containing only the conceptually important parts to illustrate the problem.

std::vector<VkImage> swapchain_images;
std::vector<VkCommandBuffer> command_buffers;

VkSemaphore rendering_finished;
VkSemaphore presentation_finished;

void RenderLoop()
{
    /* Acquire an image from the swapchain. Block until one is available.
       Signal presentation_finished when we are allowed to render into the image */
    uint32_t index;
    vkAcquireNextImageKHR(device, swapchain, UINT64_MAX, presentation_finished, VK_NULL_HANDLE, &index);

    /* (...) Framebuffer creation, etc. */

    /* Begin CB: The command pool is flagged to reset the command buffer on reuse */
    VkCommandBuffer cb = command_buffers[index];
    vkBeginCommandBuffer(cb, ...);

    /* (...) Trivial rendering of a single color image */

    /* End CB */

    /* Queue the rendering and wait for presentation_finished.
       When rendering is finished, signal rendering_finished.

       The VkSubmitInfo will have these important members set among others:
       .pWaitSemaphores = &presentation_finished;
       .pSignalSemaphores = &rendering_finished; */
    /* No fence is passed here (VK_NULL_HANDLE), which is the setup in question */
    vkQueueSubmit(render_queue, 1, &submit_info, VK_NULL_HANDLE);

    /* Submit the presentation request as soon as the rendering_finished
       semaphore gets signalled

       The VkPresentInfoKHR will have these important members set among others:
       .pWaitSemaphores = &rendering_finished; */
    vkQueuePresentKHR(present_queue, &present_info);
}

Inserting a fence when submitting the CB to the rendering queue and waiting on it before using that CB again obviously fixes the issue, but - as explained - seems redundant.

Answer Source

vkAcquireNextImageKHR is allowed to return an image that is still the destination and/or source of ongoing asynchronous operations. This means you have no guarantee the command buffer is available at time of reuse. It would be correct to enqueue additional, distinct command buffers to write to the acquired image, as long as those commands are configured to wait on the presentation_finished semaphore; but to safely reuse that command buffer you must wait on the fence passed to vkQueueSubmit.

See section 29.6. WSI Swapchain in the Vulkan spec with KHR extensions:

An application can acquire use of a presentable image with vkAcquireNextImageKHR. After acquiring a presentable image and before modifying it, the application must use a synchronization primitive to ensure that the presentation engine has finished reading from the image. The application can then transition the image’s layout, queue rendering commands to it, etc. Finally, the application presents the image with vkQueuePresentKHR, which releases the acquisition of the image.

See also these notes for vkAcquireNextImageKHR

When successful, vkAcquireNextImageKHR acquires a presentable image that the application can use, and sets pImageIndex to the index of that image. The presentation engine may not have finished reading from the image at the time it is acquired, so the application must use semaphore and/or fence to ensure that the image layout and contents are not modified until the presentation engine reads have completed.


As mentioned above, the presentation engine may be asynchronous with respect to the application and/or logical device. vkAcquireNextImageKHR may return as soon as it can identify which image will be acquired, and can guarantee that semaphore and fence will be signaled by the presentation engine; and may not successfully return sooner. The application uses timeout to specify how long vkAcquireNextImageKHR waits for an image to become acquired.

This shows that vkAcquireNextImageKHR is not required to block on the presentation operation, and transitively is not required to block on the graphics command that the presentation operation is itself waiting on.
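
The ordering argument above can be illustrated with a small toy model. This is only a sketch and makes no Vulkan calls: ToyDevice, CountUnsafeReuses, the tick counts, and the round-robin image order are all assumptions invented for illustration. It assumes the worst case the spec permits, namely that the acquire call hands back an index immediately while earlier GPU work is still queued. The semaphores order the work on the GPU, but nothing in that setup stalls the host, so the host can reach the command buffer again before its previous submission has completed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: no Vulkan calls are made and every name here is hypothetical.
struct ToyDevice {
    std::uint64_t host_time = 0;         // ticks elapsed on the CPU side
    std::uint64_t queue_free_at = 0;     // tick at which the GPU queue becomes idle
    std::vector<std::uint64_t> done_at;  // per-CB completion tick of its last submission

    explicit ToyDevice(std::size_t cb_count) : done_at(cb_count, 0) {}

    // Stands in for vkGetFenceStatus: has the CB's last submission completed?
    bool FenceSignaled(std::size_t cb) const { return host_time >= done_at[cb]; }

    // Stands in for vkWaitForFences: the host stalls until the submission completes.
    void WaitFence(std::size_t cb) { host_time = std::max(host_time, done_at[cb]); }

    // Stands in for vkQueueSubmit: returns immediately, the work runs later on the GPU.
    void Submit(std::size_t cb, std::uint64_t gpu_cost) {
        assert(cb < done_at.size());
        const std::uint64_t start = std::max(host_time, queue_free_at);
        queue_free_at = start + gpu_cost;
        done_at[cb] = queue_free_at;
    }
};

// Counts how often a command buffer is re-recorded while its previous
// submission is still pending on the (modeled) GPU.
int CountUnsafeReuses(bool wait_on_fence) {
    constexpr std::size_t kImages = 3;
    constexpr int kFrames = 6;
    ToyDevice dev(kImages);
    int unsafe = 0;
    for (int frame = 0; frame < kFrames; ++frame) {
        // Assumed worst case: acquire hands back the index right away.
        const std::size_t index = static_cast<std::size_t>(frame) % kImages;
        if (wait_on_fence) dev.WaitFence(index);
        if (!dev.FenceSignaled(index)) ++unsafe;  // re-recording a pending CB
        dev.Submit(index, /*gpu_cost=*/10);
        dev.host_time += 1;  // recording and submitting is cheap on the host
    }
    return unsafe;
}
```

In this model, CountUnsafeReuses(false) counts a reuse of a pending command buffer each time an image index comes around again, while CountUnsafeReuses(true) never does, because the host stalls on the fence first. A real driver may well block longer in vkAcquireNextImageKHR, but the spec text quoted above does not require it to.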
