Building a Game Loop: Architecture, Internals, and Best Practices
A complete game engine fundamentals masterclass — covering the naive busy loop, fixed timestep, delta-time variable timestep, the industry-standard fixed-update/variable-render hybrid, input handling, VSync, physics integration (Euler vs semi-implicit vs Verlet), and complete implementations in Python and C++ with an interactive frame-timing visualizer.
Every game, from the simplest Pong clone to a AAA open-world RPG, runs on the same fundamental structure: an infinite loop that repeats thousands of times per second — reading input, updating game state, and drawing to the screen. This structure is the game loop, and understanding it is the foundation of all game programming. Get it wrong and your game runs at different speeds on different machines, physics objects tunnel through walls, and fast-moving enemies warp across the screen. Get it right and your game is smooth, deterministic, and portable.
The game loop seems trivially simple: a while loop with a few function calls inside. In practice, it encodes dozens of hard-won engineering decisions about time measurement, rendering synchronization, physics simulation, and concurrency. Every major game engine (Unity, Unreal, Godot) has a carefully engineered game loop at its core, and lower-level libraries such as SDL leave you to write that loop yourself. In this masterclass, Professor Pixel will build your understanding from the first principles of why a loop is needed at all, through each evolutionary stage of game loop design, to the hybrid pattern used by professional engines today.
1. Why Every Game Needs a Loop: The Simulation Model
1.1 Games as Discrete Time Simulations
A game is not a static program that runs once and terminates. It is a real-time simulation of a world — a world that must continuously respond to player input, apply physical rules, animate characters, play audio, and render a new visual frame dozens of times per second. This requires an infinite loop: as long as the game is running, it must keep updating and drawing. The game loop is the engine of this simulation, the heartbeat that drives everything forward.
The three fundamental responsibilities of every game loop, in order, are: (1) Process Input — read the current state of keyboard, mouse, gamepad, or touch; (2) Update — advance the game simulation by one logical step: move characters, apply gravity, check collisions, play audio, run AI; (3) Render — draw the current game state to the screen. These three phases happen repeatedly, as fast as the hardware allows (or as fast as the game targets), forming the classic "input → update → render" cycle.
1.2 Frame Rate and the Player's Experience
Each complete iteration of the game loop produces one frame: one still image displayed to the player. The number of frames per second (FPS) is the frame rate. Motion starts to look continuous at roughly 24 frames per second (the cinema standard). Video games target 30, 60, 120, or even 144/240 FPS for increasingly smooth visual feedback. At 60 FPS, the game loop must complete all three phases (input, update, render) in under $\frac{1}{60}\,\text{s} \approx 16.67$ milliseconds. That is a very tight budget for everything a modern game must do.
Common Misconception — Higher FPS Is Always Better: While higher frame rates improve responsiveness and perceived smoothness, the relationship is not linear above a threshold. The difference between 30 and 60 FPS is immediately perceptible and significant. The difference between 120 and 240 FPS is subtle and most players cannot distinguish it. The real cost of targeting very high frame rates is reduced time budget per frame — an engine targeting 240 FPS has only 4.2 ms per frame for all game logic, physics, and rendering, which severely limits scene complexity.
2. The Naive Busy Loop: The Wrong Way and Why
2.1 The Simplest Possible Loop
A beginner's first game loop runs as fast as the hardware allows, with no time management whatsoever:
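A minimal sketch of such a loop in Python. The callback names (`process_input`, `update`, `render`) and the `max_frames` parameter are assumptions made for illustration; `max_frames` exists only so the example terminates.

```python
def run_naive_loop(state, process_input, update, render, max_frames=None):
    """Run input -> update -> render as fast as possible, with no timing."""
    frames = 0
    while state["running"]:
        process_input(state)
        update(state)      # advances the game by a fixed amount per iteration
        render(state)
        frames += 1
        if max_frames is not None and frames >= max_frames:
            break
    return frames


def demo_update(state):
    # 5 pixels per loop iteration: a machine that runs more iterations
    # per second moves the player faster in real time.
    state["x"] += 5
```

Nothing in the loop measures time, so the distance covered per second depends entirely on how many iterations the hardware manages.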
This loop has a critical flaw: game speed is coupled to hardware speed. On a faster machine the game runs faster, a problem memorably illustrated by early DOS games written for the 4.77 MHz 8088 CPU of the original IBM PC, which ran at unplayable speeds on later, faster hardware. Player movement, physics, and animations all run proportionally to CPU speed. This is the root cause of the "runs differently on different machines" bug that plagued early PC games.
2.2 CPU Monopolization
The naive loop also monopolizes the CPU — it never sleeps, never yields, running at 100% CPU utilization forever. On modern multitasking operating systems this causes other processes to starve for CPU time, the machine runs hot, battery drains rapidly on laptops, and frame rates often exceed the display's refresh rate — wasting computation rendering frames the monitor will never show. Professional game loops always include mechanisms to sleep or yield when ahead of schedule, and to synchronize rendering with the display's vertical sync (VSync) signal to eliminate tearing.
Pitfall — Spin-waiting vs Sleeping: Some games "fix" the naive loop by spin-waiting: checking the clock in a tight loop until the target frame time is elapsed. This still monopolizes the CPU — it just wastes cycles checking the time instead of doing work. The correct approach is to sleep for the remaining frame budget (time.sleep(remaining_ms / 1000) in Python, SDL_Delay(remaining_ms) in SDL), yielding the CPU to the OS scheduler. Be aware that sleep() on most OSes is imprecise — you may sleep slightly longer than requested. Account for this with a hybrid: sleep for most of the remaining time, then spin-wait for the final few milliseconds for accuracy.
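The hybrid wait described above could be sketched as follows. The function name `wait_until` and the 2 ms `spin_margin` default are assumptions, not values from any particular engine; the right margin depends on how imprecise `sleep()` is on the target OS.

```python
import time


def wait_until(deadline, spin_margin=0.002,
               clock=time.perf_counter, sleep=time.sleep):
    """Block until clock() >= deadline: coarse sleep, then a short spin."""
    remaining = deadline - clock()
    if remaining > spin_margin:
        sleep(remaining - spin_margin)   # yields the CPU; may overshoot slightly
    while clock() < deadline:            # spin only for the last stretch
        pass
    return clock()
```

A frame limiter would call `wait_until(frame_start + target_frame_time)` at the end of each iteration.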
3. Fixed Timestep: Determinism and Simplicity
3.1 The Fixed Update Concept
The fixed timestep loop decouples game logic speed from hardware speed by running game updates at a predetermined fixed rate (for example, exactly 60 times per second) regardless of how fast the hardware renders. The game logic (physics, AI, movement) advances by a fixed amount of simulated time each update: if the fixed rate is 60 Hz, each update advances the simulation by exactly $\Delta t = \frac{1}{60}\,\text{s} \approx 16.67$ milliseconds of game time.
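A minimal sketch of a fixed timestep loop, assuming one update and one render per tick. The `clock` and `sleep` parameters are injected only so the loop can be exercised without real waiting; `num_ticks` is an illustrative stand-in for "until the game quits".

```python
import time

FIXED_DT = 1.0 / 60.0  # seconds of game time per update


def run_fixed_loop(update, render, num_ticks,
                   clock=time.perf_counter, sleep=time.sleep):
    """Run num_ticks updates, one every FIXED_DT seconds of wall time."""
    next_tick = clock()
    game_time = 0.0
    for _ in range(num_ticks):
        update(FIXED_DT)          # always the same simulated step
        game_time += FIXED_DT
        render()
        next_tick += FIXED_DT
        delay = next_tick - clock()
        if delay > 0:
            sleep(delay)          # ahead of schedule: wait for the next tick
    return game_time
```

If an iteration overruns its budget, `delay` goes negative and the loop simply continues without sleeping, so game time briefly lags wall time.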
3.2 Determinism and Replay Systems
Fixed timestep loops are deterministic: given the same sequence of inputs, the simulation produces exactly the same outputs on any machine. This property is invaluable for several game systems: replay recording (store only the input sequence and replay it on the fixed simulation to reproduce any game state exactly), lockstep multiplayer (all clients run the same deterministic simulation with synchronized inputs — used by StarCraft, Age of Empires), and save states (emulators save the complete simulation state and can restore it perfectly). Floating-point determinism requires additional care — use the same FPU rounding mode on all platforms, and prefer integer arithmetic for critical game state when perfect reproducibility is required.
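The replay idea can be illustrated with a toy simulation. This is a sketch under simplifying assumptions: the state is a single integer position (avoiding floating-point concerns entirely), and each recorded input is -1, 0, or +1 for one fixed tick.

```python
def simulate(inputs, start_x=0, speed=3):
    """Advance an integer position one fixed tick per recorded input."""
    x = start_x
    history = []
    for move in inputs:          # move is -1, 0, or +1 for this tick
        x += move * speed
        history.append(x)
    return history


recorded_inputs = [1, 1, 0, -1, 1]           # the entire "replay file"
original_run = simulate(recorded_inputs)
replayed_run = simulate(recorded_inputs)     # same inputs, same states
```

Because every tick advances by the same step, storing the input list is enough to reproduce every intermediate state.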
4. Delta Time: Time-Based Movement
4.1 What Delta Time Is
Delta time ($\Delta t$) is the elapsed real-world time between the start of the previous frame and the start of the current frame — the duration of one game loop iteration. By scaling all movement and physics by delta time, you make the simulation frame-rate-independent: an object moving at speed pixels per second travels exactly speed * dt pixels this frame, regardless of whether the frame took 8 ms or 32 ms.
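A minimal sketch of delta-time movement. The function names and the injected `clock` parameter are assumptions for illustration; a real loop would also process input and render each frame.

```python
import time


def move(x, speed, dt):
    """Advance position by speed (pixels/second) over dt seconds."""
    return x + speed * dt


def run_variable_loop(num_frames, speed=120.0, clock=time.perf_counter):
    x = 0.0
    last = clock()
    for _ in range(num_frames):
        now = clock()
        dt = now - last           # real seconds since the previous frame began
        last = now
        x = move(x, speed, dt)
    return x
```

Whether the same 50 ms of real time is split into two frames or ten, the object ends up in the same place, up to floating-point rounding.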
4.2 The Spiral of Death
The variable timestep loop has a dangerous failure mode known as the spiral of death. If one frame takes too long (e.g., 100 ms due to garbage collection, disk I/O, or a CPU spike), the next frame gets a large delta time. If the cost of an update grows with delta time (for example, because the physics sub-steps or must check more collisions over the longer interval), that update takes longer to compute, which produces an even larger delta time for the frame after it. The simulation falls further and further behind real time, each update getting slower in a feedback loop. The fix is clamping delta time to a maximum value: if a frame takes longer than, say, 50 ms, treat it as 50 ms for simulation purposes. The game may appear to momentarily slow down, but it will not spiral into an unusable state.
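The clamp itself is a one-liner. In this sketch the name `clamp_dt` is hypothetical and the 50 ms ceiling is the example value from the paragraph above; negative values are also guarded against, in case a clock ever reports one.

```python
MAX_DT = 0.05  # seconds; the 50 ms ceiling used as an example above


def clamp_dt(raw_dt, max_dt=MAX_DT):
    """Limit the simulated step so one slow frame cannot snowball."""
    return min(max(raw_dt, 0.0), max_dt)
```

A loop would call `dt = clamp_dt(now - last)` before passing `dt` to the update.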
5. The Industry Standard: Fixed Update + Variable Render
5.1 The Hybrid Pattern
The professional game loop pattern, used by Unity, Unreal, Godot, and most AAA engines, combines the benefits of both approaches: physics and game logic run at a fixed timestep (for determinism and stability), while rendering runs as fast as possible at the actual hardware rate (for maximum visual smoothness). The loop accumulates elapsed time in a bucket, then drains it in fixed-size chunks for game updates, and renders once after each drain with whatever fraction of time remains:
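A minimal sketch of the accumulator pattern, assuming callbacks `update(dt)` and `render(alpha)`. The names `FIXED_DT`, `accumulator`, and `alpha` follow the terms used in this article; `num_frames` and the injected `clock` are illustrative additions so the loop terminates and can be driven by a fake clock.

```python
import time

FIXED_DT = 1.0 / 60.0   # fixed simulation step, in seconds
MAX_FRAME_TIME = 0.05   # cap on elapsed time fed into the accumulator


def run_hybrid_loop(update, render, num_frames, clock=time.perf_counter):
    """Fixed-step updates with one render per frame; returns update count."""
    accumulator = 0.0
    last_time = clock()
    updates = 0
    for _ in range(num_frames):
        now = clock()
        frame_time = min(now - last_time, MAX_FRAME_TIME)  # min(frame_time, 0.05)
        last_time = now
        accumulator += frame_time

        while accumulator >= FIXED_DT:      # drain in fixed-size chunks
            update(FIXED_DT)
            accumulator -= FIXED_DT
            updates += 1

        alpha = accumulator / FIXED_DT      # leftover fraction, in [0, 1)
        render(alpha)
    return updates
```

A slow frame produces several updates before the next render, a fast frame may produce none, and the simulation itself only ever sees `FIXED_DT`.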
5.2 State Interpolation for Smooth Rendering
The alpha value in the hybrid pattern is the fraction of a fixed timestep that has been accumulated but not yet simulated. At render time, the game is somewhere between two fixed update states — the previous state and the current state. By interpolating rendered positions between these two states using alpha, the visual output is perfectly smooth even when the physics runs at a lower fixed rate than the display's refresh rate:
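A sketch of that interpolation for a 2D position. It assumes the game keeps both the previous and the current fixed-update position for each object; the function names are illustrative.

```python
def lerp(previous, current, alpha):
    """Blend between two fixed-update states; alpha is in [0, 1)."""
    return previous + (current - previous) * alpha


def interpolated_position(prev_pos, curr_pos, alpha):
    # Positions are (x, y) tuples captured before and after the latest update.
    return (lerp(prev_pos[0], curr_pos[0], alpha),
            lerp(prev_pos[1], curr_pos[1], alpha))
```

For example, with `alpha = 0.25` an object that moved from `(0, 0)` to `(10, 20)` in the last update is drawn a quarter of the way along that segment.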
This interpolation is what lets a physics simulation running at 60 Hz look smooth on a 144 Hz display; Unity's Rigidbody interpolation setting applies the same idea. Without interpolation, objects rendered between fixed updates stutter: they "jump" from one physics position to the next rather than gliding smoothly. The cost is a small visual lag of up to one fixed timestep (you always render slightly behind the true current physics state), which is imperceptible at typical update rates and an acceptable trade-off for smooth motion.
Pitfall — Running Too Many Fixed Updates Per Frame: If a frame takes 200 ms (e.g., due to a GPU stall) and FIXED_DT is 16.67 ms, the inner update loop tries to run 12 consecutive fixed updates to catch up. Each update takes some CPU time — potentially causing the next frame to also take too long, triggering even more catch-up updates. The accumulator cap (min(frame_time, 0.05)) prevents this: even if a frame took 200 ms, we pretend it took 50 ms, limiting catch-up to 3 fixed updates. The game time falls behind wall time gracefully — the alternative is an irrecoverable spiral of death.
6. Input Handling in the Game Loop
6.1 Event-Driven vs Polling
Input can be processed in two ways. Polling reads the current state of input devices every frame: "is the spacebar down right now?" This is simple and works well for continuous actions like movement, but it can miss brief inputs — if a key is pressed and released within a single frame (16 ms), polling may not detect it. Event queuing collects all input events (key press, key release, mouse click, gamepad button) in a queue between frames, then processes them all at the start of each loop iteration. This never misses inputs regardless of how brief they are, and is the standard approach in SDL, SFML, and most game frameworks.
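The difference can be seen with a toy input device. This is a pure-Python model, not a real input API: the `InputDevice` class and its methods are hypothetical stand-ins for what a framework such as SDL or Pygame provides.

```python
from collections import deque


class InputDevice:
    """Toy keyboard: tracks current key state and logs every transition."""

    def __init__(self):
        self.down = set()
        self.events = deque()

    def press(self, key):
        self.down.add(key)
        self.events.append(("down", key))

    def release(self, key):
        self.down.discard(key)
        self.events.append(("up", key))

    def poll(self, key):
        return key in self.down          # state at this instant only

    def drain_events(self):
        drained = list(self.events)      # everything since the last frame
        self.events.clear()
        return drained
```

If a key is pressed and released between two frames, `poll` reports it as up, while `drain_events` still returns both transitions.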
In the hybrid fixed-update/variable-render loop, input is typically processed once per rendered frame (at the top of the outer loop), not once per fixed update. The input state is then available to all fixed updates that run during that frame. This means multiple physics updates in one frame see the same input snapshot. That is acceptable because a snapshot covers so little time (one display frame ≈ 16 ms at 60 Hz) that any discrepancy is imperceptible.
6.2 Input Latency and Responsiveness
The pipeline from physical input to screen update includes many latency sources: the physical key press is scanned by the keyboard controller every 1–8 ms; it is reported to the OS; the OS queues it; the game reads it at the next frame start; the game updates and renders; the rendered frame sits in the GPU buffer; the GPU scans out to the monitor; the monitor displays it. End-to-end latency is typically 50–150 ms. Reducing game loop latency (by running at higher frame rates, processing input as late as possible in the frame, and using low-latency rendering modes) is a significant competitive advantage in fast-paced games where input responsiveness is critical to player feel.
7. The Rendering Pipeline and VSync
7.1 Screen Tearing and VSync
A display updates its pixels by scanning from top to bottom — the vertical scan — at a fixed refresh rate (60 Hz, 144 Hz, etc.). If the game sends a new frame to the display buffer in the middle of a scan, the top of the screen shows the old frame and the bottom shows the new frame, creating a visible horizontal split — screen tearing. VSync (Vertical Synchronization) prevents this by forcing the game to only swap the display buffer during the vertical blanking interval (VBI) — the brief period between scan completions when the display is not scanning. This eliminates tearing but caps frame rate to the display refresh rate and introduces up to one frame of additional latency.
Adaptive sync technologies (G-Sync, FreeSync, HDMI 2.1 VRR) solve VSync's limitations: the display adjusts its refresh rate to match the game's actual frame rate, preventing tearing without the fixed-rate cap or latency penalty. For game loop design, adaptive sync means you can target the highest frame rate your hardware can sustain and let the display sync dynamically — the ideal combination of visual smoothness and low latency.
Mermaid Diagram: Game loop with VSync — how the vertical blanking interval governs buffer swap timing.
8. Physics Integration: Euler vs Semi-Implicit vs Verlet (Advanced)
8.1 Why Integration Method Matters
In the fixed update, physics simulation advances object positions and velocities by integrating Newton's equations of motion. The simplest method — explicit (forward) Euler — updates position using the current velocity, then updates velocity using the current acceleration:
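A sketch of one explicit Euler step for a 1D particle. The unit-mass, unit-stiffness spring (`spring_accel`) and the `spring_energy` helper are illustrative assumptions, included only so the total energy can be inspected over time.

```python
def explicit_euler_step(x, v, accel, dt):
    """Forward Euler: position uses the OLD velocity."""
    a = accel(x)
    new_x = x + v * dt
    new_v = v + a * dt
    return new_x, new_v


def spring_accel(x, k=1.0, m=1.0):
    return -k * x / m            # Hooke's law for a spring anchored at 0


def spring_energy(x, v, k=1.0, m=1.0):
    return 0.5 * m * v * v + 0.5 * k * x * x
```

Stepping this spring repeatedly at `dt = 1/60` and printing `spring_energy` shows the total rising step after step.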
Explicit Euler is simple but numerically unstable for oscillatory systems (like springs). It systematically adds energy to the simulation each step — a spring simulated with Euler integration will oscillate with ever-increasing amplitude and eventually explode. This is catastrophically wrong for game physics.
8.2 Semi-Implicit Euler and Verlet
Semi-implicit (symplectic) Euler fixes the stability issue with a trivial swap: update velocity first using the current acceleration, then update position using the new velocity:
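A sketch of the swapped update order for a 1D particle, where `accel` is any function returning acceleration from position (a hypothetical callback, as in a simple spring model).

```python
def semi_implicit_euler_step(x, v, accel, dt):
    """Symplectic Euler: velocity first, then position with the NEW velocity."""
    new_v = v + accel(x) * dt
    new_x = x + new_v * dt
    return new_x, new_v
```

The only difference from explicit Euler is that the position update uses `new_v` rather than `v`.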
This tiny change makes the integrator symplectic: instead of drifting upward, the energy error stays bounded and oscillates around the true value, so a spring keeps oscillating at a stable amplitude (provided the timestep is small relative to the oscillation period). Semi-implicit Euler is the standard in most game physics engines (Unity's physics, Bullet Physics, Box2D) because it is nearly as simple as explicit Euler but dramatically more stable. Verlet integration goes further, computing the next position from the last two positions rather than from an explicitly stored velocity. It conserves energy well and behaves well under constraint resolution, which makes it a common choice for cloth, soft bodies, and ragdoll physics:
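A sketch of basic position Verlet for a 1D particle. The function name and the tuple it returns are illustrative choices; velocity is implicit in the difference between the two stored positions.

```python
def verlet_step(x, prev_x, accel, dt):
    """Position Verlet: next position from the last two positions."""
    next_x = 2.0 * x - prev_x + accel(x) * dt * dt
    return next_x, x   # (new current position, new previous position)
```

To start from a known initial velocity `v0`, one common choice is to seed `prev_x` as `x0 - v0 * dt + 0.5 * a0 * dt * dt`. Constraints such as a fixed rod length can then be enforced by moving positions directly, without having to patch up a separate velocity value.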
Advanced Pitfall — Timestep-Dependent Physics: Even with semi-implicit Euler, behavior subtly varies with FIXED_DT. A spring simulated at 30 Hz vs 120 Hz will have slightly different stiffness behavior. For gameplay-critical physics (character controllers, projectile trajectories), this means tuning spring constants and drag values at your target fixed update rate and not changing it. Changing FIXED_DT after tuning requires re-tuning all physics parameters. This is why most shipped games lock their physics update rate (Unity defaults to 50 Hz; changing it post-launch is effectively a physics re-calibration).
9. Complete Implementation Reference
9.1 Full Python Game Loop (Pygame)
Here is a Python game loop using Pygame with the hybrid fixed-update/variable-render pattern:
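The sketch below is a reduced version under stated assumptions: Pygame 2 is installed, the "game" is a single ball moved with the left and right arrow keys, and the `Ball` class and constants are hypothetical. It is meant to show where each phase sits, not to serve as a finished engine.

```python
import time

FIXED_DT = 1.0 / 60.0      # fixed simulation step (seconds)
MAX_FRAME_TIME = 0.05      # clamp to avoid the spiral of death
WIDTH, HEIGHT = 640, 360
SPEED = 200.0              # pixels per second


class Ball:
    def __init__(self):
        self.x = self.prev_x = WIDTH / 2
        self.y = HEIGHT / 2

    def update(self, direction, dt):
        """One fixed step; direction is -1, 0, or +1."""
        self.prev_x = self.x
        self.x = min(max(self.x + direction * SPEED * dt, 0.0), float(WIDTH))

    def render_x(self, alpha):
        """Interpolate between the last two simulated positions."""
        return self.prev_x + (self.x - self.prev_x) * alpha


def run():
    import pygame  # imported here so the module loads without Pygame installed

    pygame.init()
    screen = pygame.display.set_mode((WIDTH, HEIGHT))
    ball = Ball()
    accumulator = 0.0
    last_time = time.perf_counter()
    running = True

    while running:
        # 1. Input: drain the event queue, then sample held keys.
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
            elif event.type == pygame.KEYDOWN and event.key == pygame.K_ESCAPE:
                running = False
        keys = pygame.key.get_pressed()
        direction = int(keys[pygame.K_RIGHT]) - int(keys[pygame.K_LEFT])

        # 2. Update: feed clamped elapsed time into the accumulator.
        now = time.perf_counter()
        accumulator += min(now - last_time, MAX_FRAME_TIME)
        last_time = now
        while accumulator >= FIXED_DT:
            ball.update(direction, FIXED_DT)
            accumulator -= FIXED_DT

        # 3. Render with the leftover fraction of a step.
        alpha = accumulator / FIXED_DT
        screen.fill((20, 20, 30))
        pygame.draw.circle(screen, (240, 200, 80),
                           (int(ball.render_x(alpha)), int(ball.y)), 12)
        pygame.display.flip()

    pygame.quit()
```

Calling `run()` opens the window; closing it or pressing Escape ends the loop. The simulation logic lives in `Ball`, which has no Pygame dependency and can be tested on its own.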
9.2 C++ SDL2 Equivalent
The same pattern in C++ with SDL2, which is the foundation for many indie and professional games:
10. Interactive: Game Loop Frame Timing Visualizer
Watch the hybrid game loop in action — each bar represents one rendered frame. Blue bars show fixed physics update ticks; the green bar shows the render call. Adjust frame rate to see how the accumulator manages the mismatch between physics rate (60 Hz) and render rate:
11. Frame Timing Profiles: Loop Variants Compared
The chart compares frame time stability across the three major game loop patterns on the same hardware simulating the same scene. The hybrid pattern delivers the lowest variance in rendered frame times — critical for smooth player experience:
12. Frequently Asked Questions
Q1: What fixed update rate should I use for my game?
For most games: 60 Hz (FIXED_DT = 1/60). This matches the most common display refresh rate and provides a good balance between simulation accuracy and CPU cost. Physics engines like Box2D are tuned for 60 Hz and explicitly recommend it. For action games needing precise collision detection (bullets, fighting game hitboxes), consider 120 Hz or higher. For slower-paced strategy games where perfect physics accuracy is less critical, 30 Hz saves CPU. Unity defaults to 50 Hz (20 ms); changing it requires re-tuning all physics parameters.
Q2: Should I use time.sleep() in my game loop?
In the hybrid fixed-update/variable-render pattern with VSync enabled, you typically do not sleep: you render as fast as possible and let VSync throttle the loop. If VSync is disabled and you want to cap frame rate (to save power or reduce GPU load), use a hybrid sleep: calculate the remaining frame budget, sleep for 90% of it, then spin-wait the final 10% for precision. Pure spin-waiting wastes a full CPU core; pure sleeping is imprecise and may cause stutters.
Q3: How does Unity's game loop work?
Unity uses the hybrid pattern with FixedUpdate (fixed timestep, default 50 Hz) for physics and Update (variable timestep, once per rendered frame) for game logic. MonoBehaviour initialization callbacks (Awake, OnEnable, Start) run before an object's first update. Unity's internal loop also includes LateUpdate (post-Update, for camera follow), OnPreRender, OnPostRender, OnGUI, and coroutine yield points. FixedUpdate callbacks themselves run on the main thread; depending on version and settings, the physics engine may distribute its internal work across worker threads. Time.deltaTime is computed by Unity each frame and is available to Update code without any manual timing.
Q4: What causes frame rate spikes and how do I debug them?
Frame spikes (sudden single-frame jumps to much higher frame time) are caused by: garbage collection pauses (in managed runtimes such as Java, C#, and Python), disk I/O on the main thread (loading assets during gameplay), shader compilation stalls (especially on first draw), synchronous network calls, or CPU/GPU synchronization bubbles. Debug with a frame profiler (Unity Profiler, Unreal Insights, RenderDoc, NVIDIA Nsight). The key metric is not average frame time but worst-case frame time: a game averaging 2 ms per frame with occasional 50 ms spikes feels less smooth than a game with consistent 8 ms frames.
Q5: What is "decoupled physics" and when do I need it?
Decoupled physics runs physics simulation on a separate thread from rendering, allowing both to proceed in parallel. Rendering can read the latest committed physics state (double-buffered) while the physics thread advances the next fixed step. This eliminates the CPU cost of physics from the rendering thread's budget, enabling higher render frame rates on multi-core hardware. Unreal Engine's physics runs on worker threads. The complexity cost is significant: you need thread-safe state access, double buffering of simulation states, and careful synchronization. For most indie games, single-threaded physics with the hybrid loop is sufficient.
Q6: How do I implement a pause feature in the game loop?
The simplest approach: set a paused flag. When paused, skip the fixed update inner loop and stop adding elapsed time to the accumulator, but continue processing input and rendering so the UI remains responsive and the paused frame is shown. One subtle issue: when unpausing, reset last_time = time.perf_counter() so the pause duration does not become a massive delta time that triggers a burst of catch-up physics updates the moment the game resumes.
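One way to package this is a small timing helper. The `LoopClock` class below is a hypothetical sketch, not an API from any engine; it assumes the caller runs the returned number of fixed updates each frame and always processes input and renders regardless of the pause state.

```python
import time


class LoopClock:
    """Tracks the accumulator and supports pausing without a catch-up burst."""

    def __init__(self, fixed_dt=1.0 / 60.0, clock=time.perf_counter):
        self.fixed_dt = fixed_dt
        self.clock = clock
        self.accumulator = 0.0
        self.paused = False
        self.last_time = clock()

    def set_paused(self, paused):
        if self.paused and not paused:
            self.last_time = self.clock()   # discard time spent paused
        self.paused = paused

    def pending_updates(self):
        """Call once per frame; returns how many fixed updates to run."""
        if self.paused:
            return 0                         # input and render still happen
        now = self.clock()
        self.accumulator += min(now - self.last_time, 0.05)
        self.last_time = now
        steps = int(self.accumulator // self.fixed_dt)
        self.accumulator -= steps * self.fixed_dt
        return steps
```

Leftover time in the accumulator is kept across the pause, so the simulation resumes exactly where it stopped.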
Q7: What is the "bullet problem" in game physics?
The bullet problem (also called tunneling) occurs when a fast-moving object travels farther in one fixed timestep than the width of the objects it might collide with: the bullet "teleports" through a wall without detection. Solutions: (1) Continuous Collision Detection (CCD), which casts a swept shape along the trajectory and tests for intersections; (2) reduce FIXED_DT (use 120 Hz or higher physics); (3) limit maximum velocity so that no object moves more than its own width per fixed step. Most physics engines (Box2D, Bullet) have built-in CCD options that can be enabled per-body for fast-moving objects at the cost of higher CPU usage.
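The difference between a discrete check and a swept check can be shown in one dimension. This is a deliberately simplified sketch: the bullet is a point, the wall is an interval on the x axis, and both function names are hypothetical. Real CCD in engines such as Box2D works on full shapes.

```python
def discrete_hit(x0, velocity, dt, wall_min, wall_max):
    """Naive check: only tests the end-of-step position."""
    x1 = x0 + velocity * dt
    return wall_min <= x1 <= wall_max


def swept_hit(x0, velocity, dt, wall_min, wall_max):
    """Swept check: tests the whole segment travelled during the step."""
    x1 = x0 + velocity * dt
    lo, hi = min(x0, x1), max(x0, x1)
    return lo <= wall_max and hi >= wall_min
```

A bullet at `x = 0` moving at 6000 pixels per second covers 100 pixels in one 60 Hz step, so a 5-pixel wall at `[40, 45]` is missed by `discrete_hit` and caught by `swept_hit`.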
Q8: How does multiplayer affect the game loop?
Multiplayer fundamentally challenges game loop design. Lockstep networking requires all clients to run the same deterministic fixed-timestep simulation with synchronized inputs: every client processes the same input set at the same tick number, producing identical game states. Client-side prediction runs the local player's inputs immediately on the local fixed update, then corrects against the server's authoritative state when network packets arrive (used by most FPS games). Rollback networking (popularized for fighting games by the GGPO library) allows the simulation to run speculatively, then roll back and replay when input predictions turn out to be wrong. All of these approaches depend on a well-designed fixed-timestep game loop as their foundation.