DOEON KWON
Building AI-native 3D worlds where agents and players share the same space, digging into spatial intelligence
Selected work
An external LLM can join a multiplayer 3D world as a persistent player: it can perceive nearby context, remember places, chat, move, and build through permissioned world verbs.
I built the embodiment layer that lets an external language model live inside a SPACE0 world: a 68-tool MCP server, renderer-less symbolic perception, claim-gated player verbs, prompt-injection fencing, and spatial memory.
- 68 MCP tools across 8 modules (presence, build, memory, identity, commitment, skill, brain_state, media) expose perception, movement, building, chat, posts, and agent state
- Spatial memory and agent state persist in Supabase across sessions
- No god-mode: world actions go through server-side claims, permissions, and rate limits
- Renderer-less symbolic perception over MCP: position, nearby people, posts, chat, and voxel cells resolved through a fixed grid-to-world transform, so an external model perceives the world with no GPU and no pixels.
- Permissioned action dispatch: every world mutation goes through server-side claims, rate limits, and persistent identity instead of privileged model-only APIs.
- Prompt-injection fencing treats perceived player text as world content, not authority: it can be a request the agent reasons about, but it cannot rewrite identity, rules, or owner instructions.
Shows: agent APIs · symbolic perception · persistent memory · permissioned actions · prompt-injection boundaries
Sensorium › Working memory › Perception fusion › Retrieval › Inference › Intent dispatch › Result ingestion
Contribution: Built and own the embodiment stack: symbolic perception over MCP, claim-gated verbs, prompt-injection boundaries, and spatial memory.

The memory design doc, posted inside the world it describes.

The same doc opened in the reading overlay: index, placement, organization.

Memory regions over the live world: a spatial index, not a flat list.
An evaluation loop for spatial memory inside a live 3D world: it tests occlusion, line-of-sight, and geometry-led recall, down to the production bugs those tests surface.
I built the SPACE0 spatial-memory eval loop: live-world tests for occlusion, line-of-sight, and geometry-led recall that turned a fuzzy agent-memory claim into measurable production behavior; I fed the fixes back into the running relay. This is an internal research-engineering artifact, not published work.
- Live-world evals tested occlusion, line-of-sight, and geometry-led recall against production behavior
- The pre-registered result split the claim: spatial proximity diluted into a linear blend fails its own frozen test, while geometry-led recall plus a separate visibility predicate wins
- The live run surfaced a relay line-of-sight anchor bug that had silently no-opped the visibility query, fixed before the result was recorded
- Harness scale: 849 behind-wall targets in controlled worlds, a 2,800-trial query battery, a recall corpus of 120 memories captured from the relay, then 8 scripted live worlds with 96 hidden targets, with hypothesis and falsification frozen in git before any arm was scored
- A pre-registered falsification did its job: the shipped blend that folds spatial proximity in beside recency and importance failed its own frozen test (mean ΔHit@5 -0.0375, p = 0.306, at or below a position-blind vector baseline), while geometry-led weighting on the identical corpus flipped the null to a decisive win (+0.32, p < 10^-15).
- The live system test mattered: promoting occlusion from simulation to the running relay surfaced a production bug in the line-of-sight anchor path, which had pointed the visibility query at the observer anchor instead of the subject, a silent no-op for exactly the memories it should evaluate.
- The repaired live confirmation measured occlusion against real built geometry, eight scripted worlds and 96 behind-wall targets under a git-committed freeze (false-visible 1.000 to 0.000, pooled exact McNemar p = 2.5x10^-29), and the robustness checks narrowed the claim instead of inflating it: the gain from acting in place is object binding, not geometry.
Shows: eval design · pre-registration · live-system instrumentation · occlusion tests · research-to-product debugging
Embodied capture › Pre-register › Falsify shipped blend › Retune geometry-led › Fix relay bug › Live-confirm
Contribution: Designed and ran the eval end to end: pilots, implementation, writeup, and caveat documentation.
One Manifold Dual Contouring voxel engine in Rust, extracted behind a typed boundary and shipped across four runtimes: a web app, an App Store iOS client, native macOS/Windows desktop, and an agent relay, kept in lockstep by an FFI, TOML codegen, and a parity gate.
I extracted an 18.9K-LOC engine kernel out of the Next.js monolith into versioned packages, then carried it to native through a ~5,100-LOC extern-C FFI xcframework (the iOS C++ to Rust swap) and a wgpu/Slint desktop client, and built the TOML codegen that single-sources physics constants, copy, design tokens, and analytics events into TypeScript, Swift, Rust, and Slint, with drift caught at a pre-push parity gate.
- An 18.9K-LOC engine kernel extracted from the Next.js monolith: 1,189+ import sites codemodded in one compile-green commit, the dependency graph machine-enforced by a custom ESLint rule
- An atomic C++ to Rust engine swap behind a 3-slice C-FFI xcframework: 120/120 XCTests, blake3 byte-parity, legacy engine deleted the same day
- iOS: a Swift 6 / TCA / Metal App Store client; desktop: a native Rust client (wgpu, Slint) with PKCE auth and signed Sparkle/WinSparkle auto-updates
- TOML codegen single-sources physics constants, copy, design tokens, and analytics events into web, Swift, Rust, and Slint; drift fails the pre-push gate
- QA across two languages: Rust proptest, Kani formal-verification, cargo-mutants, TypeScript fast-check, and a record/replay divergence harness
- The engine kernel was lifted out of the app monolith into versioned workspace packages, proven app-independent by a leak audit, then a custom ESLint rule machine-enforces the dependency graph.
- iOS runs the engine on-device through a hand-written C FFI packaged as a three-slice xcframework (Swift 6 strict concurrency, TCA, Metal); desktop is a fully native Rust client (wgpu, Slint), no embedded browser, with real PKCE OAuth and signed auto-updates.
- One typed TOML source fans physics constants, copy, design tokens, and analytics events into four platforms or the build fails; QA spans Rust proptest/Kani/cargo-mutants and TypeScript fast-check, with a record/replay divergence harness on the push gate.
Shows: engine-boundary ownership · C-FFI / xcframework · cross-platform codegen · native productization · formal verification
Engine kernel (extracted) › C-FFI / xcframework › web / iOS / desktop / relay › TOML codegen › pre-push parity gate
Contribution: Own the engine boundary end to end: the extraction, the FFI and xcframework, the wgpu/Slint desktop and Swift/Metal iOS clients, the TOML codegen, and the parity enforcement.
A browser editor turns captured Gaussian splats into carveable, voxel-aligned world content, so scans become editable, not just viewable.
I built a WebGL/Three.js Gaussian-splat editor that connects captured splats to voxel-resolution carving, so 3D scans can become editable world material rather than static scene captures.
- Open-source browser editor for Gaussian splat carving
- Voxel-resolution carving links captured splats to editable world content
- WebGL and Three.js prototype for scan-to-world tooling
- A browser-based editor that treats captured Gaussian splats as raw material rather than finished scenes: carve, align, and export to voxel resolution.
- Carving at voxel resolution snaps a free-form splat onto the world grid, so a scan drops into a voxel scene already aligned, not as a foreign mesh.
- Built as an open-source prototype to explore the scan-to-world workflow and inform the SPACE0 asset pipeline direction.
Shows: Gaussian splatting · voxel editing · browser 3D tooling · scan-to-world workflow · WebGL prototyping
Captured splat › Browser viewer › Voxel-resolution carve › Editable world content
Contribution: Personal open-source prototype connecting Gaussian splats with voxel-world editing.

Snap a photo, and the pipeline job kicks off.

The finished model, ready to drop into any world.

A museum table, captured into a placeable asset.
Text, a photo, or a tweet becomes a rigged, animated, compressed, game-ready 3D asset, with GPU workers, validation, rig merge, material maps, sound, and web/iOS/desktop export handled behind the scenes.
I built the pipeline that turns a single phone photo into a rigged, compressed, game-ready 3D asset: self-hosted Hunyuan3D 2.1 on GPU workers, UniRig auto-rigging, generated material maps, and meshopt/KTX2/USDZ output behind a Supabase RPC job queue with atomic claims.
- Self-hosted Hunyuan3D 2.1 on SaladCloud RTX 4090 workers, with weights baked into the image so cold workers start without downloading checkpoints
- Material maps (normal, height, roughness, AO) generated from a single albedo; output as meshopt/KTX2/USDZ with skeletal animation and sound
- A Redis-free GPU job queue on Supabase, claimed atomically by RPC, scaling independently of the app
- An autonomous X agent turns a mention into a crafted 3D item with a turntable video; I cut its end-to-end latency from ~34 minutes to under 10
- Self-hosted Hunyuan3D 2.1 as Python GPU workers on SaladCloud RTX 4090s, with model weights baked into the container image for faster cold starts.
- Auto-rigging with UniRig: predict the skeleton and skin on a simplified proxy, score multiple seeds, then merge the rig back onto the full-resolution mesh so topology, UVs, and textures all survive.
- A production asset path with validation and failure handling around mesh generation, rig merge, material-map generation, compression, and export, so bad outputs fail as jobs rather than leaking into the world.
Shows: GPU worker orchestration · 3D asset validation · rig merge · material-map generation · cross-platform export
Contribution: Built end to end on open models: GPU workers, UniRig merge, material-map generation, meshopt, KTX2, USDZ export, the Supabase RPC job queue, and preview rendering.
One Cloudflare Durable Object per world carries presence, edits, chat, and agent actions, so humans and agents share the same live space.
I built the shared live substrate: one Cloudflare Durable Object per SPACE0 world carries presence, edits, chat, and agent actions. Short-lived signed tokens authorize every connection, and a denylist blocks a misbehaving agent the next time it joins.
- One Cloudflare Durable Object per world, holding presence and session state in memory
- Short-lived signed tokens minted by the web app authorize every connection
- Each Durable Object reports its own latency and message volume
- Ray Vibe Awards 2025, Best Social Vibe winner: judges cited 'a kind of casual presence that most browser games don't even attempt'
- One Durable Object per world keeps presence and session state in memory, so every read is local to the DO and writes propagate instantly across all connected clients.
- A misbehaving agent is one that violates rate limits, sends malformed action payloads, or triggers server-side claim rejections repeatedly. The denylist blocks it at the next join, not mid-session.
- Short-lived signed tokens minted by the web app gate every WebSocket connection, so the relay never trusts claimed identity.
Shows: live session model · agent and player presence · signed realtime auth · Durable Objects · WebSocket infra
Web / iOS / desktop › signed token › Durable Object per world › backend (voxel store)
Contribution: Built and own the relay architecture: one live session model shared across web, iOS, desktop, and MCP-driven agents.

Framed posts sitting on a real surface, on web.

A media card with real depth and clipping, on web.
Posts live inside the world rather than in a panel, turning the world itself into the social and UGC surface, with text, image, and video cards rendering across web, desktop, and iOS from one backend.
I shipped in-world media posts on three platforms: a card of text, image, or video placed on a surface renders as a depth-correct decal on web (Three.js), desktop (a GPU-accelerated Slint canvas over wgpu), and iOS (SwiftUI), all backed by the same backend.
- Three renderers (Three.js, wgpu, SwiftUI) reading one shared backend
- Desktop: drag, resize, and full undo/redo over a GPU-accelerated Slint canvas as a design-surface editing tool
- Decals projected with correct depth, occlusion, and surface alignment on every client (web/desktop/iOS)
- Posts are placed in world-space, not a sidebar panel: they sit on surfaces as depth-correct decals with occlusion, so the world itself becomes the UGC surface.
- On desktop the canvas doubles as the authoring tool: place and edit a post in the same world-space view where it will live, no separate editor.
- One backend serves three distinct renderers (Three.js web, wgpu desktop, SwiftUI iOS) without platform-specific divergence in the post schema.
Shows: in-world UGC surface · cross-platform decal rendering · design-surface editing · undo/redo · shared backend
Contribution: Built and shipped across web, desktop, and iOS.

Every component is browseable and live right in the page.

Each page: a live preview plus copy-paste cargo and npm install.
slintcn is an open-source component registry for Slint native apps, shipping 56 components with npm and crates.io installers, an MCP server, and live Slint-WASM docs.
I built slintcn: a shadcn-style component registry for Slint native apps with 56 components plus 8 blocks, npm and crates.io installers sharing one registry, an MCP server for AI agents, and live Slint-WASM docs. Built from the SPACE0 desktop work and dogfooded back into the native client.
- 56 components and 8 blocks, installable via npm or crates.io from the same registry
- 60 GitHub stars and a 62K-view r/rust launch post
- MCP server lets AI agents browse, install, and compose components
- A single registry serves both npm and crates.io installers, so web tooling and native Rust clients share the same component source without duplication.
- The MCP server exposes the registry to AI agents, letting them browse available components, read docs, and install by name into a project.
- Built from real production need: the component system grew out of the SPACE0 desktop client and was dogfooded back into it, so every component shipped in a live product before it reached the registry.
Shows: open-source registry design · Slint component system · npm and crates.io publishing · MCP tooling · WASM live docs
Contribution: Built and maintain it as an independent open-source project.
A whole natural world simulated live in a browser tab: real atmospheric circulation, a shader ocean, volumetric weather, and a metabolism that drives the ecosystem, on a kilometer-scale voxel planet.
I authored the systems that make the planet feel alive: a Rust-WASM climate model (three-cell global circulation, Coriolis, an ITCZ, and a -6.5C/1000m lapse rate), a GPU Gerstner-wave ocean in TSL, an 8,418-line volumetric cloud and weather system, a five-phase metabolism engine, and the LLM NPC brain that became the origin of the embodiment program.
- Rust-WASM climate: three-cell global circulation (Hadley/Ferrel/Polar), Coriolis, ITCZ, and lapse-rate temperature, compiled to WASM and run in the browser
- Volumetric cloud and weather: SDF raymarching, an offline imposter baker, and chunked streaming
- An LLM-driven NPC brain (FOV stealth and combat) that is the direct lineage of the SPACE0 embodiment program
- First public launch on Hacker News: ~3,000 visitors and ~500 signups
- The climate is real physics, not a texture: a wind module implements three-cell global circulation with Coriolis deflection, and a spherical-climate module models the ITCZ, storm tracks, and a -6.5C/1000m lapse rate, all running in WASM.
- A five-phase metabolism engine drives the living world as an immutable per-phase pipeline, with the simulation systems (farming, GPU-instanced vegetation) built on top.
- The LLM NPC brain (FOV-based stealth and combat over a WebSocket cognition loop) was the first field test of the autonomy and memory patterns that later became the SPACE0 embodiment layer.
Shows: climate simulation · WASM physics · shader ocean · volumetric weather · LLM NPCs
Climate physics (Rust-WASM) › Volumetric weather › Gerstner ocean › Metabolism engine › LLM NPC brain
Contribution: Authored the Rust-WASM climate physics, the volumetric weather, the Gerstner ocean, the metabolism engine, and the LLM NPC brain.
Background
Recognition
Writing
Contact