Projective AR Workstation

A camera-projector AR system for assistive manufacturing, using ChArUco calibration and real-time dynamic projection to augment factory workstations.

C++ · OpenCV · OpenGL · ChArUco · Multithreading

What if the screen came to the worker instead of the other way around? Factory line work demands sustained precision under repetitive conditions — a challenge compounded for workers with disabilities. Rather than routing attention away from the task to a monitor, a projection-based AR workstation can overlay guidance directly onto the physical work surface. The result is a system where the digital interface disappears into the environment itself.

This project built exactly that: a camera-projector pair mounted above a garment assembly workstation, capable of projecting real-time visual annotations — seam paths, attachment points, step cues — precisely onto the work surface in front of the worker.

Architecture

The pipeline runs in C++ with a multithreaded dataflow design. Each processing stage — capture, calibration, tracking, rendering — operates as an independent thread communicating over lock-free queues. This keeps frame latency low and makes the system composable: stages can be swapped or extended without coupling concerns rippling across the pipeline.

ChArUco Calibration

Getting projection to land where you intend it requires an accurate geometric model of the camera-projector relationship. This is a two-part problem: intrinsic calibration (modeling each device’s own optics) and extrinsic calibration (solving the spatial relationship between them).

For intrinsics, we used ChArUco boards — a hybrid pattern combining a checkerboard grid with embedded ArUco markers. The checkerboard gives sub-pixel corner accuracy; the ArUco markers give each corner a unique identity, so the board can be partially occluded and still yield valid correspondences. At the time, this was state-of-the-art for camera calibration, and it showed: the intrinsic models were stable and the reprojection error tight.
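The calibration loop itself can be sketched with OpenCV's aruco contrib module (the pre-4.7 API); the board geometry, dictionary choice, and error threshold below are illustrative assumptions, not the project's actual configuration:

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/aruco/charuco.hpp>
#include <vector>

// Illustrative ChArUco intrinsic calibration; board dimensions and the
// marker dictionary are assumptions for the sketch.
cv::Mat calibrateFromFrames(const std::vector<cv::Mat>& frames,
                            cv::Size imageSize, cv::Mat& distCoeffs) {
    auto dict = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_5X5_100);
    auto board = cv::aruco::CharucoBoard::create(
        /*squaresX=*/7, /*squaresY=*/5,
        /*squareLength=*/0.04f, /*markerLength=*/0.03f, dict);

    std::vector<std::vector<cv::Point2f>> allCorners;
    std::vector<std::vector<int>> allIds;
    for (const cv::Mat& img : frames) {
        std::vector<int> markerIds;
        std::vector<std::vector<cv::Point2f>> markerCorners;
        cv::aruco::detectMarkers(img, dict, markerCorners, markerIds);
        if (markerIds.empty()) continue;  // board not visible in this frame
        // The markers identify which checkerboard corners are in view, so a
        // partially occluded board still yields valid correspondences.
        std::vector<cv::Point2f> charucoCorners;
        std::vector<int> charucoIds;
        cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, img,
                                             board, charucoCorners, charucoIds);
        if (charucoIds.size() >= 4) {
            allCorners.push_back(charucoCorners);
            allIds.push_back(charucoIds);
        }
    }

    cv::Mat cameraMatrix;
    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::aruco::calibrateCameraCharuco(
        allCorners, allIds, board, imageSize, cameraMatrix, distCoeffs,
        rvecs, tvecs);
    CV_Assert(rms < 1.0);  // expect sub-pixel mean reprojection error
    return cameraMatrix;   // 3x3 intrinsic matrix K
}
```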

With intrinsics in hand, we solved the extrinsic transform by observing the ChArUco board from the camera while projecting a known pattern onto it. Matching correspondences across both views yields the rotation and translation that maps world points into projector image coordinates — meaning we can now compute exactly which projector pixel corresponds to any point the camera can see.

Projection onto the Scene

With calibration locked in, the system can register visual content onto physical surfaces with spatial accuracy. A known 3D model of the task fixture is detected in the scene, giving the system a stable coordinate frame to attach overlays to. As the worker interacts with the fixture, content stays registered to the object rather than floating in screen space.

The visual output is rendered with OpenGL and sent to the projector, which the system treats as an ordinary second display. From the worker's perspective, it simply looks like markings on the table.

Task State and Passive Tracking

Assembly tasks were modeled as finite state machines. The camera passively watches for task jigs — the physical fixtures used at each step — and advances state as they appear, move, or are set aside. No buttons, no touchscreen. The interface recedes; the task remains front and center.

The system also accumulates performance signals in the background: step durations, the consistency of timing across repetitions, completion rates. These are the kind of signals that can surface fatigue patterns or flag a step where workers consistently slow down — useful data for both training design and accessibility tuning.

Why This Is Hard

Projective AR sounds straightforward until you’re on the factory floor. Reflective surfaces (zippers, pins, foils) scatter projected light unpredictably. Moving hands occlude the surface mid-task. A single projector casts shadows that a multi-projector rig could fill in — but coordinating multiple projectors against a live BRDF model is a research problem in its own right.

The solutions to these problems are what make projective AR genuinely interesting. Dynamic projection mapping that updates the display in response to surface normals in real time; gaze-aware rendering that adjusts brightness based on viewing angle; multi-projector coordination that treats illumination as a controllable variable rather than a fixed condition. There’s a lot of runway left in this space.