Texture Streaming by bjornbytes · Pull Request #949 · bjornbytes/lovr

bjornbytes · 2026-05-28T00:24:49Z

Add stream flag to lovr.graphics.newTexture/newModel. This will load the texture(s) asynchronously on a separate GPU queue when possible, avoiding interrupting the rendering work happening on the main queue.
Add Texture/Model:isReady to check if the asynchronous transfer is complete and the texture is ready to use. It is an error to use a Texture before it's ready.
- You can use Model:isReady to see if all of its textures are ready, or you can check individual textures and draw the ready ones with Pass:drawPart to do a progressive per-mesh load.
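A minimal sketch of how the readiness check might look in a draw loop, assuming the stream option and Texture:isReady behave as described in this PR (neither is in a released LÖVR version). The helper name drawStreamed and the file name 'image.png' are hypothetical. The helper only binds the texture once isReady returns true, and draws an untextured gray plane until then, so the texture is never used before it's ready.

```lua
-- Draws a plane with the streamed texture once its upload has finished,
-- or an untextured gray plane while the transfer is still in flight.
-- Returns true when the texture was actually used.
local function drawStreamed(pass, texture)
  local ready = texture:isReady() -- proposed in this PR
  if ready then
    pass:setColor(1, 1, 1)
    pass:setMaterial(texture)
  else
    pass:setColor(.5, .5, .5)
  end
  pass:plane(0, 1.7, -2, 1, 1)
  return ready
end

-- Wiring inside a LÖVR project (skipped when this file runs outside LÖVR).
if lovr then
  local texture

  function lovr.load()
    -- 'image.png' is a placeholder file name
    texture = lovr.graphics.newTexture('image.png', { stream = true })
  end

  function lovr.draw(pass)
    drawStreamed(pass, texture)
  end
end
```

The same pattern should extend to Model:isReady for a whole model; the per-mesh variant with Pass:drawPart is left out here because its exact signature isn't given in this thread.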
Vulkan details:
- core/gpu texture creation takes a CPU pointer instead of a GPU buffer
- Texture upload uses VK_EXT_host_image_copy when available/optimal (for all textures).
- Otherwise, falls back to separate transfer queue when available (and the texture is streaming).
- Otherwise, falls back to doing the transfer on the graphics queue (original behavior).
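The fallback order above can be sketched as a small decision function. This is only an illustration in plain Lua, not the actual core/gpu code (which is C): chooseUploadPath and the fields hostImageCopy, hostImageCopyOptimal, and transferQueue are hypothetical names standing in for whatever the Vulkan backend really queries.

```lua
-- Picks an upload path following the fallback order in the list above.
-- `device` is a hypothetical table of capability flags; `stream` is the
-- value of the stream flag passed to newTexture/newModel.
local function chooseUploadPath(device, stream)
  if device.hostImageCopy and device.hostImageCopyOptimal then
    -- Used for all textures, streaming or not
    return 'host-image-copy'
  elseif device.transferQueue and stream then
    -- Only streaming textures go through the separate transfer queue
    return 'transfer-queue'
  else
    -- Original behavior
    return 'graphics-queue'
  end
end
```

Note that a non-streaming texture on a device without usable host image copy always lands on the graphics queue, even when a transfer queue exists.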

Main "reason" to do it is that WebGPU really wants to take a CPU pointer for texture data.

Still unclear if this is worth it or a good idea, but it sure is cool!


DonaldHays · 2026-05-28T03:13:34Z

Does this pull the textures off disk asynchronously, too, or is it just the GPU upload that's asynchronous?

bjornbytes · 2026-05-28T04:17:32Z

This change just affects the GPU upload. You can wrap texture creation in a task too:

lovr.task.start(function()
  texture = lovr.graphics.newTexture('file', { stream = true })
end)

This gets pretty much all the overhead off of the main CPU thread / GPU queue, including the file read, image decode, GPU memory allocation, and GPU transfer.

bjornbytes · 2026-06-01T23:44:40Z

The performance story here is a little more muddled than I hoped.

VK_EXT_host_image_copy
- Can be slower on the CPU, because it does the texture swizzling on the CPU instead of the GPU.
- When a host image copy texture is created on a background thread, this is a really good way to upload textures.
  - There's a question of whether LÖVR should encourage you to just put your newTexture call in a task, or whether the graphics module should engage in internal heroics to offload texture creation onto worker threads.
- One of the main wins is that it doesn't require a staging buffer, which reduces peak memory usage and can avoid some costly overhead of vkAllocateMemory/vkFreeMemory (which appears to be very real on NVIDIA, see below).
Using a separate transfer queue
- Pretty clear win on AMD iGPU. Texture uploads on graphics queue cause stutters, transfer queue is stutter free. Still profiling to quantify/understand the benefits better.
- On NVIDIA, not good. Submission to the transfer queue can be insanely slow (~50ms), even when only 1 thread is doing submits.
  - In the current architecture, this causes hitches because the transfers are submitted alongside the main graphics queue submit.
  - I think this might actually be caused by GPU memory allocation, but not entirely sure.
  - LÖVR might need to make its staging buffer allocation more sophisticated to get better performance on NV.
  - Or the transfer submit could be moved off the main thread. However, a previous version of this branch was doing that and I was still seeing tens-of-ms stalls on the graphics vkQueueSubmit. I originally thought it was due to transfer submit contending with graphics submit, but maybe it has something to do with memory management in the driver.

If I was thinking about whether this can/should be merged, the story might be like:

Host image copy is good. It can make lovr.graphics.newTexture take a little longer on the CPU, but that's okay because the call is already slow. lovr.task.start is a good tool for solving this, and it's fine to add a little extra pressure to wrap texture creation in tasks.
Transfer queue is good on some drivers, bad/neutral on other drivers, and the code is kinda complicated. The bad driver thing can probably be fixed in the future by improving LÖVR's memory allocation or threading. So it's probably okay to merge, tentatively/reluctantly.
There's a question of whether { stream = true } makes sense to expose.
- On host image copy, it's a noop, unless LÖVR does the internal threading heroics and marks the texture as ready once a background thread finishes the copy.
- It's still a useful signal for people to provide to LÖVR:
  - false is "I literally need this texture this frame, don't use a transfer queue and/or please wait for the host copy to finish before rendering"
  - true is "this can wait, I don't want frame stutters, transfer queue is good here and/or don't block on the host image copy"
- On the other hand, you could actually use the task system for it:
  - newTexture is async.
  - If you call it outside of a task, you get a synchronous texture.
  - Inside a task, you'll get a streaming texture, but you can .wait if you ever want to block on it.
- If { stream = true } is not exposed, all textures either need to be synchronous (transfer queue code makes no sense) or async-with-implicit-stall (too much unnecessary magical machinery, not worth implementing).
- Given this is a pretty niche feature, it's probably fine to defer/change the exact API design later and just leave { stream = true } as an experimental flag to play with???
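To make the task-based alternative above a bit more concrete, here is a rough sketch of "the calling context picks the mode instead of a flag". It uses a plain Lua coroutine as a stand-in for a task; whether LÖVR tasks can be detected this way is an assumption, and inTask and textureModeForContext are hypothetical names, not LÖVR API.

```lua
-- Stand-in for "am I inside a task?": true when running inside a coroutine.
-- Handles both Lua 5.1/LuaJIT (running() returns nil on the main thread)
-- and Lua 5.2+ (the second result is true on the main thread).
local function inTask()
  local co, isMain = coroutine.running()
  return co ~= nil and not isMain
end

-- Hypothetical: what mode a flag-less newTexture would pick for the caller.
local function textureModeForContext()
  if inTask() then
    return 'streaming'   -- caller can keep going, or wait on it explicitly
  else
    return 'synchronous' -- caller needs the texture this frame
  end
end

-- Outside any coroutine, the synchronous mode is chosen
local outside = textureModeForContext()

-- Inside a coroutine (the task stand-in), the streaming mode is chosen
local inside = coroutine.wrap(textureModeForContext)()
```

The appeal of this design is that there is no extra option to document; the cost is that the behavior of newTexture silently depends on where it's called from.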

bjornbytes added 4 commits May 24, 2026 11:03

vk: create transfer queue;

4a81f0c

vk: enable VK_EXT_host_image_copy;

3e0ec08

Some host image copy fixes;

bd17c47

Clean up texture streaming; Reduce contention;

c103545

bjornbytes wants to merge 5 commits into dev from texture-streaming
