OpenArcade — Hacker News Launch Draft

Title Options

HN titles are capped at 80 characters. "Show HN: " takes 9, leaving 71 for the pitch.

| # | Title | Angle |
|---|-------|-------|
| 1 | Show HN: Play arcade games in your browser, generate training data for a vision AI | Core value prop — play + AI training |
| 2 | Show HN: OpenArcade – Browser games that train a neural net to play from raw pixels | Technical/specific — "raw pixels" is the hook |
| 3 | Show HN: We built arcade games whose landing page optimizes itself with bandits | Meta/self-referential — the page IS the project |
| 4 | Show HN: OpenArcade – Play games, watch the AI learn in real time | Curiosity gap — "watch it learn" |

Recommended: #2. It's specific ("raw pixels"), technical enough for HN, and raises immediate questions (how? what architecture? does it work?). #1 is the safe pick. #3 is the boldest — HN loves meta/self-referential projects, but it buries the games. Caveat: as written, #1 (~82 chars) and #2 (~83 chars) both exceed the 80-character cap and need trimming (e.g. dropping "OpenArcade – " from #2); #3 and #4 fit.


First Comment (the "founder comment")

This goes up immediately after submission. It should be honest, technical, and concise. HN penalizes marketing-speak and rewards transparency about what works and what doesn't.


Hey HN — I built this as a side project. The idea is simple: classic arcade games in the browser, but every session silently captures screen frames + keyboard inputs and feeds them into a pipeline that trains a vision model to play the games from raw pixels.

How it works:

  • 7 games (Tetris, Space Invaders, Flappy Bird, Snake, Breakout, Pong, Asteroids), each a single self-contained HTML file
  • A lightweight recorder (recorder.js) captures canvas frames at ~4 fps + all keyboard events, uploads to an ingest server
  • The data feeds into a screen-to-action model — the AI sees exactly what you see (pixel grid) and learns to map frames to keypresses
  • Everything runs on a Jetson Orin Nano at my house
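The capture-and-batch flow above can be sketched roughly like this. This is illustrative, not the real recorder.js: the names (createRecorder, BATCH_SIZE, the /ingest path) and the batching behavior are assumptions. The idea is to buffer frames and key events and flush them in batches rather than firing a network request per keypress:

```javascript
// Illustrative sketch of a batching recorder (NOT the actual recorder.js API).
const BATCH_SIZE = 16; // assumed batch size

function createRecorder(sendBatch) {
  const buffer = [];
  return {
    push(event) {
      // Timestamp every frame/key event as it arrives.
      buffer.push({ t: Date.now(), ...event });
      if (buffer.length >= BATCH_SIZE) this.flush();
    },
    flush() {
      if (buffer.length === 0) return;
      // splice(0) empties the buffer and hands the batch to the uploader.
      sendBatch(buffer.splice(0));
    },
  };
}

// Browser wiring (sketch only): ~4 fps frames + keyboard events.
// const rec = createRecorder(batch =>
//   fetch('/ingest', { method: 'POST', body: JSON.stringify(batch) }));
// setInterval(() =>
//   rec.push({ frame: canvas.toDataURL('image/jpeg', 0.5) }), 250);
// document.addEventListener('keydown', e => rec.push({ key: e.code }));
```

A flush-on-unload (e.g. via `visibilitychange`) would also be needed so the tail of a session isn't lost.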

The landing page optimizes itself:

The page uses Thompson Sampling (a multi-armed bandit algorithm) to figure out which game ordering gets the most people to actually play. It tracks which layout you saw, whether you clicked through, and whether you actually played for 30+ seconds. The bandit shifts traffic toward orderings that produce real engagement, not just clicks.
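For readers curious what that looks like, here is a minimal sketch of Beta-Bernoulli Thompson Sampling over layout arms. The arm structure and field names are assumptions, not the real client code. It relies on the fact that a Beta(a, b) draw can be built from two Gamma draws, and that Gamma(k, 1) for integer k is minus the sum of k log-uniforms, which keeps it dependency-free:

```javascript
// Gamma(k, 1) for positive integer k, via -sum of k log-uniforms.
// Uses 1 - Math.random() so the argument to log is in (0, 1], never 0.
function sampleGammaInt(k) {
  let s = 0;
  for (let i = 0; i < k; i++) s -= Math.log(1 - Math.random());
  return s;
}

// Beta(a, b) = X / (X + Y) with X ~ Gamma(a), Y ~ Gamma(b).
function sampleBeta(a, b) {
  const x = sampleGammaInt(a);
  const y = sampleGammaInt(b);
  return x / (x + y);
}

// Each arm tracks shows (impressions) and plays (30s+ engagements).
// Draw from each arm's posterior Beta(plays+1, shows-plays+1); show the
// ordering whose draw is highest.
function pickArm(arms) {
  let best = 0;
  let bestDraw = -1;
  arms.forEach((arm, i) => {
    const draw = sampleBeta(arm.plays + 1, arm.shows - arm.plays + 1);
    if (draw > bestDraw) { bestDraw = draw; best = i; }
  });
  return best;
}
```

Uncertain arms still get exploratory traffic because their wide posteriors occasionally produce the highest draw; proven winners dominate as their posteriors tighten.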

You can watch the AI training dashboard live at ssd.digitalsurface.com — it shows frames being captured, training progress, and the model's attempts to play.

What I'd love feedback on:

  • Is the "your gameplay trains AI" pitch compelling or off-putting? I want people to play because the games are fun, with the AI angle as a bonus
  • The bandit currently tests 5 card orderings. Should it also test different descriptions, CTAs, or visual treatments?
  • Anyone have experience with behavioral signals (mouse trajectory, scroll velocity) as features for conversion prediction? I'm considering a small in-browser MLP for this but unsure if the signal-to-noise ratio justifies it at low traffic
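On that last question, the model being floated could be as small as this. The feature names, layer sizes, and weights below are purely hypothetical placeholders; a real model would be trained offline and shipped as static JSON, and whether the signal justifies even this much is exactly the open question:

```javascript
// Hypothetical tiny MLP: 3 behavioral features -> hidden layer -> P(play).
const relu = x => Math.max(0, x);
const sigmoid = x => 1 / (1 + Math.exp(-x));

// features: e.g. [meanMouseSpeed, scrollVelocity, secondsOnPage], normalized.
// params: { W1: [hidden][features], b1: [hidden], W2: [hidden], b2: scalar }.
function mlpPredict(features, { W1, b1, W2, b2 }) {
  const hidden = W1.map((row, i) =>
    relu(row.reduce((s, w, j) => s + w * features[j], b1[i])));
  return sigmoid(hidden.reduce((s, h, i) => s + W2[i] * h, b2));
}
```

A forward pass this size is microseconds in the browser; the real cost is collecting enough labeled sessions to fit it.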

Stack: Vanilla HTML/CSS/JS, no build system. Nginx + Python ingest hub on a Jetson Orin Nano. Thompson Sampling implemented in ~50 lines of client-side JS.


Timing Notes

  • Best submission times for HN: Tuesday–Thursday, 8-10am ET (when US + Europe overlap). Avoid weekends and Monday mornings.
  • Upvote strategy: Share the direct HN link (not the project URL) with anyone who might upvote. The first 30 minutes are critical — HN's ranking algorithm heavily weights early velocity.
  • Be present: Reply to every comment in the first 2 hours. HN rewards engaged founders. Comments boost the post's ranking.

Anticipated HN Questions & Prepared Responses

"Why not just use OpenAI Gym / Atari environments?"

Good question — those environments give the model direct access to game state (RAM values, reward signals). We're deliberately training from raw pixels only, the same visual input a human gets. The goal is a general-purpose screen-reading model, not a game-specific agent. Think of it as training the "eyes" rather than the "brain."

"How much data do you actually need?"

Honestly, still figuring this out. We have [X] hours of gameplay so far across [Y] sessions. Early results show the model can distinguish game states (menu vs playing vs game-over) after ~2 hours of data per game. Meaningful play behavior requires significantly more. The dashboard at ssd.digitalsurface.com shows current progress.

"Isn't this just behavioral cloning? That doesn't generalize."

You're right that pure behavioral cloning has a compounding error problem. We're using it as a starting point — the human demonstrations bootstrap the model, and we plan to layer on self-play / RL fine-tuning once the model has a reasonable policy to start from. DAgger-style approaches are on the roadmap.

"Privacy concerns with recording screen + inputs?"

The recorder only captures the game canvas (not the full screen), and only keyboard events while the game tab is focused. No personal data, no cookies beyond a random session ID. All processing happens on a Jetson in my house, not in the cloud. The code is open — you can read recorder.js; it's ~100 lines.

"The 'online users' count looks fake."

Fair catch — the online count is simulated (random 8-26, fluctuates). I should either make it real (WebSocket presence) or remove it. Appreciate the honesty check. [Note to self: consider removing or replacing before launch if this feels dishonest.]

"Why a Jetson and not a cloud GPU?"

Cost. A Jetson Orin Nano is a one-time $250 purchase. It runs inference, serves the site, and handles the training pipeline 24/7 on ~15 W of power. Even a modest cloud GPU at roughly $0.35/hour comes to about $250/month running around the clock, more every month than the Jetson costs outright. For a side project this is the right tradeoff.