StrafesNET/strafe-project

Graphics & Input: Minimizing Latency #39

Open

opened 2026-02-03 17:12:25 +00:00 by Quaternions · 0 comments

Quaternions (Owner) commented:

The pipeline is a baton pass. Each stage has to wait for the previous stage to finish and hand over its output (pass the baton) before it can start its own work. Care has to be taken when scheduling work into this pipeline.

The pipeline should be kicked off at the latest possible moment for which it is predicted to complete just before the frame is displayed. When the fps cannot keep up with the monitor refresh rate, the scheduler may also need to identify the stage that bottlenecks throughput, because eagerly scheduling work at the beginning of the pipeline causes it to bunch up at the bottleneck, producing maximum latency. The optimal strategy is to use the display time minus the end-to-end latency as the kick-off point, but to rate-limit scheduling according to the pipeline stage with the lowest throughput.
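The kick-off rule above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: `Stage`, `min_interval`, and `next_kickoff` are hypothetical names, time points are expressed as `Duration` offsets from an arbitrary epoch, and the per-stage latency and concurrency are assumed to come from some measurement or prediction that is not shown here.

```rust
use std::time::Duration;

/// Hypothetical description of one pipeline stage (input, CPU, GPU, ...).
#[derive(Clone, Copy, Debug)]
pub struct Stage {
    /// Predicted time for one frame to pass through this stage.
    pub latency: Duration,
    /// How many frames the stage can process at once without a latency penalty.
    pub concurrency: u32,
}

impl Stage {
    /// Shortest sustainable spacing between frame starts: latency / concurrency.
    pub fn min_interval(&self) -> Duration {
        self.latency / self.concurrency.max(1)
    }
}

/// Picks the next kick-off time.
/// Ideal point: display time minus end-to-end latency.
/// Rate limit: never start sooner than one bottleneck interval after the last kick-off.
pub fn next_kickoff(
    stages: &[Stage],
    display_time: Duration,
    last_kickoff: Option<Duration>,
) -> Duration {
    // End-to-end latency is the sum of the stage latencies (the baton pass).
    let end_to_end: Duration = stages.iter().map(|s| s.latency).sum();
    let ideal = display_time.saturating_sub(end_to_end);
    // The stage with the largest minimum interval has the lowest throughput.
    let bottleneck_interval = stages
        .iter()
        .map(Stage::min_interval)
        .max()
        .unwrap_or(Duration::ZERO);
    match last_kickoff {
        Some(last) => ideal.max(last + bottleneck_interval),
        None => ideal,
    }
}
```

With a 2 ms CPU stage and a 12 ms GPU stage that holds 3 frames in flight, a frame due at 100 ms would ideally start at 86 ms, but would be pushed back to 88 ms if the previous kick-off happened at 84 ms.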

A stage's latency is not necessarily the inverse of its throughput. For example, the GPU can have multiple frames in flight, so it may have higher throughput than its latency alone would suggest. Throughput can be calculated as concurrency/latency. If the GPU can draw 3 frames simultaneously without a latency penalty, then its concurrency is 3.
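As a small worked example of the concurrency/latency relation, the helper below (a hypothetical name, with made-up numbers) computes throughput in frames per second.

```rust
use std::time::Duration;

/// Throughput in frames per second, computed as concurrency / latency.
/// `concurrency` is the number of frames the stage can work on at once
/// without a latency penalty; `latency` is the time one frame takes.
pub fn throughput_fps(concurrency: u32, latency: Duration) -> f64 {
    f64::from(concurrency) / latency.as_secs_f64()
}
```

Under these assumptions, a GPU with 12 ms latency and 3 frames in flight sustains about 250 fps, while the inverse of its latency alone would suggest only about 83 fps.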

The optimal latency strategy is to identify each component in the pipeline (each link in the chain) individually, recognize when the pipeline is bottlenecked, and throttle so that the weakest link stays fully loaded without inducing a buffering snowball.
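The buffering snowball can be illustrated with a toy model. The sketch below is hypothetical and deliberately simplified: it models a single bottleneck stage with concurrency 1 and an unbounded queue, with times in integer microseconds. A real pipeline would cap the queue depth, so latency would saturate at a maximum rather than grow without bound.

```rust
/// Simulates frames submitted at a fixed interval into one bottleneck stage
/// that processes a single frame at a time, and returns each frame's latency
/// (completion time minus submission time) in microseconds.
pub fn simulate_latencies(service_us: u64, submit_interval_us: u64, frames: usize) -> Vec<u64> {
    let mut free_at = 0u64; // time at which the stage next becomes idle
    let mut latencies = Vec::with_capacity(frames);
    for i in 0..frames as u64 {
        let submitted = i * submit_interval_us;
        // A frame starts when it has been submitted and the stage is free.
        let start = submitted.max(free_at);
        free_at = start + service_us;
        latencies.push(free_at - submitted);
    }
    latencies
}
```

Submitting every 5 ms into a 10 ms stage (`simulate_latencies(10_000, 5_000, 4)`) gives latencies of 10, 15, 20, and 25 ms, growing with every frame. Throttling submission to the stage's own 10 ms interval keeps the stage fully loaded while every frame's latency stays at 10 ms.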

Reference: StrafesNET/strafe-project#39