
Digital Surface Labs


Personal AI Computer: Hardware, Compute, and Economics

The concept is a ~$400 device that sits in someone's home and runs AI tasks that recover money, save time, and handle administrative work. This document covers the technical feasibility, hardware trade-offs, compute budgets, and economic model for the product.


1. The Market Gap

There is a category of financial tasks worth $0-$500 each that currently go undone. They go undone because accessing the value requires professional help — lawyers at $200-500/hr, accountants at $150-400/hr, medical billing advocates at $100-200/hr — and the friction of engaging a professional exceeds the perceived reward. Nobody hires a lawyer to recover a $180 utility deposit. Nobody pays an accountant $300 to find a $500 escrow overpayment. The tasks rot.

The catalog from Dollar Hound's MONEY-SOURCES.md documents 15 distinct source categories. The highest-automation ones:

| Source | Typical Amount | Coverage | Automation Potential |
|---|---|---|---|
| State Unclaimed Property | $20-$5,000 | National (50 states) | HIGH |
| Class Action Settlements | $5-$500 | National | HIGH |
| IRS Unclaimed Refunds | ~$893 median | National | MEDIUM |
| Medical Billing Errors | $200-$500 per bill | National | MEDIUM |
| CFPB Settlement Distributions | $50-$2,000 | National | MEDIUM |
| Utility Deposits | $50-$500 | National | HIGH |
| Mortgage Escrow Overpayments | $200-$2,000 | National | MEDIUM |
| State Tax Refunds | $50-$2,000 | State | HIGH |
| Securities/Dividend Escheatment | Varies | National | HIGH |
| DOL Wage & Hour Back Pay | $100-$5,000 | National | MEDIUM |

The total addressable value per household is significant — conservative estimates across all categories put it at $500-$3,000 for a typical family of four. But individually, each task is too small and too tedious to justify paying a professional. A $400 device that continuously scans for and acts on these opportunities changes the economics entirely. It converts professional-cost tasks into appliance-cost tasks. The device pays for itself, then keeps paying.

This is not a new market — it is an unserved one. The money exists. The databases are public. The claims processes are documented. What is missing is a persistent, automated agent with the patience to scan 50 state databases weekly, cross-reference names and addresses, and prepare claim forms. That is exactly what a local AI computer does well.


2. Hardware Platform Analysis

The Jetson Orin Nano Super is the leading candidate. Joe already operates three of them (Prometheus, Epimetheus, Atlas) running Ollama, CUDA, JetPack 6, and various services. The stack is proven.

Specifications:

  • 67 TOPS INT8 inference performance
  • 8GB LPDDR5 (shared CPU/GPU memory)
  • Ampere GPU: 1024 CUDA cores
  • 6-core Arm Cortex-A78AE CPU
  • Power: 7-25W configurable (15W typical for sustained inference)
  • JetPack 6 (Ubuntu 22.04 base, CUDA 12.x, cuDNN, TensorRT)

Bill of Materials at $400 target:

| Component | Cost | Notes |
|---|---|---|
| Jetson Orin Nano Super Dev Kit | $249 | Includes carrier board, heatsink, power supply |
| 512GB NVMe SSD (WD SN740 M.2 2230) | $40 | Database, models, OS. 2230 form factor fits the dev kit |
| WiFi 6 M.2 Module (Intel AX200) | $15 | 2.4/5GHz, Bluetooth 5.0 |
| USB SIM Adapter (Quectel EC25 or SIMCom 7600) | $25 | For phone call capability via cellular |
| Enclosure (injection-molded ABS) | $15 | At 500+ unit volumes |
| microSD for recovery | $6 | Backup boot/recovery image |
| Total | $350 | $50 margin to $400 retail |

Critical constraint — the 8GB RAM wall: The Jetson shares its 8GB between CPU, GPU, and OS. A 7B Q4_K_M model consumes ~4.5GB of VRAM, leaving 3.5GB for Ubuntu, the application stack (Node.js, SQLite, Ollama), and working memory. This is tight but workable. A 14B model requires ~8.5GB and does not fit comfortably — it forces aggressive memory management, swapping, or model offloading to CPU, which tanks throughput.

Inference throughput: ~15-25 tokens/second for 7B Q4_K_M on the Orin Nano GPU. At an average of 150 tokens per inference (prompt + completion), and allowing for prompt processing and scheduling overhead, that yields ~100-170 inferences per hour, or 2,400-4,000 inferences per day. This is more than sufficient for the workload (Section 4 shows ~2 GPU-hours/day baseline).

AMD Ryzen AI Mini PCs

The AMD Ryzen AI platform is the most interesting near-term alternative, primarily because it breaks through the 8GB wall.

Example unit: Beelink SER8 (Ryzen 8845HS)

  • 8-core Zen 4 CPU, Radeon 780M iGPU, XDNA NPU (16 TOPS)
  • 16GB DDR5 (upgradeable to 32GB), 500GB NVMe
  • WiFi 6E, Bluetooth 5.3, USB-C, HDMI
  • Price: ~$450 (16GB/500GB configuration)
  • Power: 45-65W TDP (3-4x the Jetson)

Why this matters: 16GB RAM fits a 14B Q4 model (~8.5GB) with 7.5GB left for OS and applications. The 14B-class models (Qwen2.5:14b and its peers) are meaningfully better than 7B at structured extraction, reasoning, and JSON generation — exactly the tasks Dollar Hound needs.

CPU inference performance: 10-20 tok/s for 7B on Zen 4 (llama.cpp AVX-512). Competitive with the Jetson GPU path. The XDNA NPU is immature for LLM inference as of early 2026 — AMD's ROCm support on integrated GPUs is still catching up.

Recommendation: Watch this market through mid-2026. When 16GB AMD mini PCs hit $400, they become the default recommendation. The power draw (45-65W vs 15W) increases annual electricity from $20 to $60-85, but the 14B model capability may justify it.

Rockchip RK3588 (Orange Pi 5 Plus)

  • 6 TOPS NPU, 16-32GB RAM options
  • Board: ~$110-150, full BOM: $200-350
  • NPU support for LLMs is immature. CPU inference: 3-8 tok/s for 7B
  • Too slow for practical use. The RK3588 NPU excels at vision models (YOLO, etc.) but lacks the software ecosystem for LLM inference

Verdict: Not viable for this product. Revisit if RKNN-LLM matures.

Apple Mac Mini M2 (Refurbished)

  • M2 chip: 15.8 TOPS Neural Engine, 8-core GPU
  • MLX framework for optimized inference
  • 8GB/256GB refurbished: $380-420
  • Inference: 25-40 tok/s for 7B via MLX (fastest option at this price)

Weaknesses:

  • 8GB unified memory = same RAM wall as Jetson
  • Closed ecosystem — no GPIO, no SIM modem integration, no custom carrier board
  • Sleep management is problematic for an always-on appliance (macOS aggressively sleeps)
  • Software distribution via App Store would be required at scale
  • No cellular modem path without USB adapter + driver complexity

Verdict: Best raw inference performance at the price point, but the closed ecosystem and sleep management make it a poor appliance platform.

Custom PCB

  • NRE (non-recurring engineering): $50,000-$100,000 for board design, validation, and certification
  • Per-unit BOM at scale: $150-250 (Jetson module + custom carrier)
  • Only viable at 5,000-10,000+ units to amortize NRE
  • Timeline: 6-12 months from design to production

Verdict: Phase 3+ consideration. Use dev kits for MVP and pilot.

Platform Recommendation

Now: Jetson Orin Nano Super ($350 BOM). Proven stack, Joe's existing fleet, adequate for 7B workloads, 15W power budget, cellular modem integration via sip-audio already built.

6-12 months: AMD Ryzen AI mini PC when 16GB configurations hit $400. The 14B model unlock is worth the power trade-off.

At scale (5,000+ units): Custom carrier board with Jetson Orin Nano module ($199 module only) to reduce BOM and form factor.


3. Compute Budget Per Task Category

All estimates assume a Jetson Orin Nano Super running a 7B-class Q4_K_M model (e.g., Qwen2.5:7b, Llama 3.1:8b) via Ollama at ~20 tok/s. GPU time is measured as time the model is actively loaded and inferring. CPU-bound tasks (fetching, parsing, database ops) run concurrently and do not consume GPU time.

Claims Monitoring

What it does: Fetch and parse state unclaimed property databases (CPU-bound scraping), match names against household members (LLM), verify addresses against known addresses (LLM), assess claim validity and estimate amounts (LLM), pre-fill claim forms (LLM).

| Metric | Value |
|---|---|
| GPU time | 0.7 hrs/day average |
| Model size needed | 7B sufficient for name matching; 14B better for form understanding |
| Expected value | $200-$1,500/year (1-3 claims found for household of 4) |
| Frequency | Weekly scan of 50 states, daily scan of high-priority states |
| Internet required | Yes (database access, form submission) |
| Human interaction | Low — device identifies claims; human files manually or reviews auto-filled forms |

Breakdown: 50 states x ~200 name variations (household members, maiden names, former addresses) = 10,000 comparisons per weekly sweep. Most are eliminated by exact-match filtering (CPU). ~500 require LLM disambiguation ("Is JOSEPH NEWBERRY the same as JOSEPH NEWBRY?"). At ~3 seconds per LLM call, that is 25 minutes of GPU time per weekly sweep, or ~3.6 minutes/day amortized. Deeper research per potential match (address verification, amount estimation, form analysis) adds ~30 minutes/day when active claims exist.
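The sweep arithmetic above can be sketched directly; `weekly_sweep_budget` and the 5% LLM-disambiguation fraction are illustrative assumptions rather than measured values:

```python
# Back-of-envelope GPU budget for the weekly unclaimed-property sweep.
# The llm_fraction (share of comparisons surviving exact-match filtering)
# is an assumed figure matching the text, not a measurement.

def weekly_sweep_budget(states=50, variants=200, llm_fraction=0.05,
                        seconds_per_call=3):
    comparisons = states * variants              # 10,000 candidate comparisons
    llm_calls = int(comparisons * llm_fraction)  # ~500 need LLM disambiguation
    gpu_minutes = llm_calls * seconds_per_call / 60
    return comparisons, llm_calls, gpu_minutes

comparisons, llm_calls, gpu_minutes = weekly_sweep_budget()
daily_minutes = gpu_minutes / 7   # amortized: ~3.6 minutes/day
```

At the defaults this reproduces the figures in the text: 10,000 comparisons, ~500 LLM calls, 25 minutes of GPU time per sweep.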

Tax Preparation

What it does: OCR W-2s, 1099s, and other tax documents. Categorize deductions across Schedule A/C/D/E. Optimize credit selection (child tax credit, EITC, education credits, energy credits). Fill Form 1040 and schedules. Compare against prior-year returns for anomaly detection. Model different filing strategies (standard vs itemized, MFJ vs MFS).

| Metric | Value |
|---|---|
| GPU time | 7 hrs/day during tax season (14 days), amortized to 0.27 hrs/day |
| Model size needed | 14B strongly preferred — tax logic requires multi-step reasoning |
| Expected value | $200-$800/year (replaces TurboTax $90-170 + catches $100-600 in missed deductions) |
| Frequency | Annual burst (January-April), with quarterly estimated tax check-ins |
| Internet required | Yes (e-file submission, IRS API for refund status) |
| Human interaction | High during preparation — user reviews every line item before filing |

Note on 7B limitation: Tax preparation is the strongest argument for 14B models. Optimizing between standard and itemized deductions, evaluating education credit eligibility (American Opportunity vs Lifetime Learning), and correctly handling capital gains with wash sale rules all require multi-step reasoning that 7B models handle unreliably. The MVP may need to flag complex situations for manual review rather than attempting autonomous optimization.

Email Management

What it does: Triage and classify incoming emails (newsletter, actionable, settlement notice, bill, subscription confirmation). Draft responses for routine emails. Detect class action settlement notices and extract filing deadlines. Audit recurring subscriptions by scanning purchase confirmation emails. Flag time-sensitive items (deadlines, appointments, renewals).

| Metric | Value |
|---|---|
| GPU time | 0.6 hrs/day |
| Model size needed | 7B sufficient for classification; 14B better for drafting |
| Expected value | $200-$600/year (settlements found + subscription cancellation savings) |
| Frequency | Daily processing, real-time for high-priority items |
| Internet required | Yes (IMAP/Gmail API access) |
| Human interaction | Low — classifications and drafts are presented; user approves sends |

Breakdown: Average household receives 50-150 emails/day. ~80% eliminated by rule-based filtering (known newsletters, marketing). ~20-30 emails/day require LLM classification. At ~2 seconds per classification, that is 1 minute/day. Settlement notice extraction and subscription auditing add ~30 minutes/day of deeper analysis.
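A minimal sketch of the rule-based pre-filter that eliminates the ~80% of mail before any LLM call; the patterns here are illustrative, not the product's actual rule set:

```python
import re

# Cheap regex rules run first; only messages no rule handles are sent to
# the 7B model for classification. Patterns are illustrative examples.
BULK_PATTERNS = [
    re.compile(r"unsubscribe", re.I),   # newsletter/marketing footer language
    re.compile(r"no-?reply@", re.I),    # automated senders
]

def needs_llm(sender: str, body: str) -> bool:
    """True only when no cheap rule classifies the message."""
    text = f"{sender}\n{body}"
    return not any(p.search(text) for p in BULK_PATTERNS)

assert needs_llm("billing@hospital.org", "Your statement is ready")
assert not needs_llm("no-reply@deals.example.com", "Sale! unsubscribe here")
```

In practice the rule set would be seeded from known senders and grown as the device observes the household's mail.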

Healthcare Records and Billing

What it does: OCR medical bills and Explanation of Benefits (EOB) documents. Compare billed amounts against insurance-approved amounts. Detect common billing errors (duplicate charges, unbundling, upcoding, balance billing for in-network providers). Track deductible and out-of-pocket maximum progress. Flag bills that exceed expected costs for the procedure (using publicly available price transparency data).

| Metric | Value |
|---|---|
| GPU time | 0.1 hrs/day average (highly bursty — spikes during medical events) |
| Model size needed | 7B for basic comparison; 14B for understanding EOB format variations |
| Expected value | $200-$1,000/year (studies by NerdWallet and JAMA indicate 49-80% of hospital bills contain errors) |
| Frequency | Per medical event — may be dormant for weeks, then process 10 documents in a day |
| Internet required | No — all processing is local (this is a privacy-critical feature) |
| Human interaction | Medium — device flags potential errors; user contacts provider or insurer |

Privacy advantage: Medical billing is where local processing is not just a feature — it is a requirement. HIPAA concerns make cloud processing of medical bills a non-starter for privacy-conscious users. Every EOB contains diagnosis codes, provider names, and personal health information. Processing this data locally, with zero cloud transmission, is a genuine differentiator.

Form Filling and Benefits Access

What it does: Maintain a structured personal data store (names, addresses, SSNs, employment history, income). Pre-fill government forms (passport renewal, voter registration, FAFSA, ACA marketplace applications). Track deadlines for benefits enrollment (ACA open enrollment, Medicare, FSA/HSA elections). Identify unclaimed benefits eligibility (SNAP, WIC, LIHEAP, ACA subsidies, property tax exemptions).

| Metric | Value |
|---|---|
| GPU time | 0.03 hrs/day average |
| Model size needed | 7B sufficient for form filling; 14B for eligibility analysis |
| Expected value | $125-$2,500/year (time savings + benefits many households don't realize they qualify for) |
| Frequency | Event-driven — triggered by life changes (move, job change, new child, income change) |
| Internet required | Yes (form submission, eligibility checkers) |
| Human interaction | Medium — device pre-fills and flags eligibility; user reviews and submits |

Note on benefits access: The high end of the value range ($2,500/year) reflects households that qualify for but do not claim ACA premium subsidies, SNAP benefits, or property tax exemptions for seniors/veterans/disabled. A device that monitors income changes and proactively checks eligibility could surface thousands of dollars in unclaimed benefits. The average value is much lower for middle-income households.

Phone Calls (via Cellular SIM)

What it does: Place outbound calls using text-to-speech and speech-to-text. Navigate IVR phone menus (press 1 for billing, etc.). Execute dispute scripts for billing errors, subscription cancellations, and fee reversals. Handle hold time autonomously (the device waits; the human doesn't). Report outcomes via transcript.

| Metric | Value |
|---|---|
| GPU time | 0.3 hrs/day average |
| Model size needed | 7B for IVR navigation; 14B for complex negotiation |
| Expected value | $600-$1,500/year (hold-time savings valued at user's hourly rate + successful dispute/negotiation outcomes) |
| Frequency | As needed — estimated 2-5 calls/week for active household |
| Internet required | Yes (cellular via SIM modem) |
| Human interaction | Medium initially — device handles routine calls; user monitors complex ones. Decreasing over time as call patterns are learned |

Critical constraint — memory pressure: Real-time phone calls require simultaneous operation of STT (faster-whisper, ~500MB), TTS (Piper, ~200MB), and LLM (7B Q4, ~4.5GB). Total: ~5.2GB. On an 8GB device with OS overhead, this is feasible but leaves no room for anything else. Phone calls are a dedicated mode — the device suspends other tasks during a call. This is acceptable since calls are intermittent and short (average 5-15 minutes).
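The memory budget for call mode can be checked mechanically. The OS-and-stack footprint below is an assumed figure, not a measurement:

```python
# Memory budget guard for "phone call mode", using the component sizes
# from the text. OS_AND_STACK_MB is an assumed Ubuntu + services footprint.

COMPONENTS_MB = {"llm_7b_q4": 4500, "faster_whisper": 500, "piper_tts": 200}
TOTAL_RAM_MB = 8192
OS_AND_STACK_MB = 2200

def call_mode_fits(margin_mb=256):
    """True if STT + TTS + LLM fit alongside the OS with a safety margin."""
    used = sum(COMPONENTS_MB.values()) + OS_AND_STACK_MB
    return TOTAL_RAM_MB - used >= margin_mb

# ~5.2GB of models plus ~2.2GB of OS leaves only a few hundred MB free:
# feasible, but only with every other task suspended.
```

A guard like this would run before dialing, with the orchestrator unloading non-essential processes until the check passes.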

sip-audio integration: Joe's existing sip-audio project on the Jetson fleet already handles the SIM7600G-H modem driver, PCM audio I/O, VAD, barge-in detection, and latency tracing. This is proven code — 341 lines of AT command handling, real-time 8kHz PCM audio processing, sub-second latency tracing. The phone call capability is not theoretical; it is built.


4. The "Go Deeper" Utilization Model

Baseline Utilization

If the device runs each task category at minimum viable depth — just enough to get the job done — the GPU utilization is surprisingly low:

| Task | GPU hrs/day | % of 24hr |
|---|---|---|
| Claims monitoring | 0.70 | 2.9% |
| Tax prep (amortized) | 0.27 | 1.1% |
| Email management | 0.60 | 2.5% |
| Healthcare records | 0.10 | 0.4% |
| Form filling | 0.03 | 0.1% |
| Phone calls | 0.30 | 1.3% |
| Total | 2.00 | 8.3% |

Two hours of GPU time per day out of 24 available. The device is idle 91.7% of the time. At 15W, it costs $20/year in electricity to run — this is not a problem. But it raises the question: what does the device do with the other 22 hours?

Depth Scaling — Using Idle Time Productively

The answer is to go deeper on every task during idle time. Instead of a quick 3-query scan of a state database, run 30+ queries with variant name spellings, maiden names, middle initials, and known former addresses. Instead of a basic email classification, build a dossier of every company the household has done business with and cross-reference against CFPB enforcement actions. Instead of checking one state's unclaimed property database, speculatively scan states the household has never lived in but where employers or financial institutions may hold assets.

The strategy is: spend more inference to find more money.

| Mode | What it does | GPU hrs/day | Utilization |
|---|---|---|---|
| Quick passes only | Minimum viable scans, basic classification | 2 | 8% |
| Standard depth | Full name variant matching, multi-source cross-reference | 5 | 21% |
| Deep + speculative | Adjacent-state scanning, employer history research, pre-filled forms for all scenarios | 10 | 42% |
| Deep + speculative + knowledge building | Build household financial knowledge graph, track regulatory changes, pre-compute dispute templates, simulate tax strategies | 14 | 58% |

Realistic Ceiling

An honest assessment: 80-90% GPU utilization for a single household requires either serving multiple households (multi-tenant) or participating in federated compute (selling idle cycles). For a single household, 40-60% utilization with aggressive depth scaling is the realistic ceiling.

But this is fine. The device costs $20/year in electricity at 15W. The economic question is not "is the GPU busy enough?" but "does the device find enough money?" A device running at 8% utilization that recovers $730/year is a far better product than one running at 90% utilization that recovers the same amount. Utilization is an engineering metric, not a product metric. Value recovered is the product metric.

The depth scaling model does improve value recovery — more inference per task means fewer missed claims, better form filling, more thorough billing error detection. But the relationship is logarithmic, not linear. Going from 3 queries to 30 per state database might increase claim detection by 40%. Going from 30 to 300 might increase it by another 10%. The first few hours of depth scaling have the highest marginal return.
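One simple concave model makes the diminishing-returns point concrete. The saturation constant below is a made-up placeholder, not a measured value:

```python
import math

# Illustrative concave returns curve for depth scaling: each extra 10x in
# query volume buys a smaller absolute gain in claim detection.
# The saturation constant is an assumed placeholder.

def detection_rate(queries, saturation=30.0):
    """Fraction of findable claims detected after `queries` lookups."""
    return 1 - math.exp(-queries / saturation)

gain_first = detection_rate(30) - detection_rate(3)     # going 3 -> 30 queries
gain_second = detection_rate(300) - detection_rate(30)  # going 30 -> 300 queries
# gain_first exceeds gain_second: depth scaling front-loads its value.
```

Any curve of this shape supports the same scheduling conclusion: allocate idle GPU hours to the first few levels of depth across all tasks before pushing any single task toward exhaustive search.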


5. Economic Model

Annual Value Generation (Probability-Weighted)

Each category is estimated with a low and high bound, weighted by the probability of a typical household seeing any value in that category during a given year.

| Category | Low | High | Probability | Expected Value |
|---|---|---|---|---|
| Unclaimed property claims | $200 | $1,500 | 60% | $510 |
| Tax prep savings | $200 | $800 | 90% | $450 |
| Email/settlement claims | $100 | $600 | 50% | $175 |
| Healthcare billing corrections | $100 | $500 | 40% | $120 |
| Form filling/benefits access | $100 | $2,000 | 30% | $315 |
| Phone call automation | $200 | $800 | 50% | $250 |
| Subscription audit savings | $50 | $300 | 70% | $123 |
| Total expected | | | | $1,943 |
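Each expected value is the probability times the midpoint of the low-high range, which can be reproduced directly:

```python
# Reproducing the probability-weighted expected values:
# EV = probability x midpoint(low, high).

categories = {
    "unclaimed_property": (200, 1500, 0.60),
    "tax_prep":           (200,  800, 0.90),
    "email_settlements":  (100,  600, 0.50),
    "healthcare_billing": (100,  500, 0.40),
    "forms_benefits":     (100, 2000, 0.30),
    "phone_calls":        (200,  800, 0.50),
    "subscription_audit": ( 50,  300, 0.70),
}

evs = {k: p * (lo + hi) / 2 for k, (lo, hi, p) in categories.items()}
total = sum(evs.values())   # 1942.5, which the table rounds up to $1,943
```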

Adjusted expected value: The categories are not independent — a household that has unclaimed property is more likely to also have billing errors and missed benefits. But the probabilities above already account for the "average household" base rate. A more conservative approach applies a roughly 60% portfolio discount to account for correlation and optimism bias:

Conservative portfolio-adjusted expected value: ~$730/year

This is the number used in ROI calculations below.

Cost Structure

| Cost | Year 1 | Year 2+ | Notes |
|---|---|---|---|
| Device hardware | $400 | $0 | One-time purchase |
| Electricity (15W x 24/7 x $0.15/kWh) | $20 | $20 | National average residential rate |
| Internet (existing home connection) | $0 | $0 | No incremental cost |
| SIM card (data-only plan) | $120 | $120 | T-Mobile/Mint Mobile data-only, ~$10/mo |
| Cloud API fallback (optional) | $0-$50 | $0-$50 | For tasks exceeding 7B capability |
| Total | $540 | $140 | |

Note on SIM cost: The $120/year SIM card is only needed if the device makes phone calls. Without phone capability, annual operating cost drops to $20/year. Phone calls are the highest-value task category ($250/year expected) but also the highest-cost, so the net contribution is $130/year — still positive but the thinnest margin.

Return on Investment

| Scenario | Annual Value | Year 1 Net | Payback Period | 5-Year Cumulative ROI |
|---|---|---|---|---|
| Conservative | $390 | -$150 | 18 months | $1,410 |
| Expected | $730 | +$190 | 8 months | $3,110 |
| Optimistic | $1,420 | +$880 | 4 months | $6,560 |
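The table's arithmetic follows a simple convention worth making explicit: the 5-year cumulative figure nets out only the $540 Year-1 cost, excluding years-2+ operating costs. A sketch under that assumption:

```python
# ROI table arithmetic. The 5-year column appears to subtract only Year-1
# cost ($540); ongoing $140/yr costs are excluded, per the table's figures.

YEAR1_COST = 540  # $400 device + $20 electricity + $120 SIM

def roi_row(annual_value):
    year1_net = annual_value - YEAR1_COST
    payback_months = YEAR1_COST / annual_value * 12   # ignores ongoing costs
    five_year_net = annual_value * 5 - YEAR1_COST
    return year1_net, payback_months, five_year_net

# Expected scenario: Year 1 net +$190, five-year net $3,110.
```

Including the $140/yr operating cost would trim each 5-year figure by $560, which does not change the qualitative conclusion in any scenario.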

Year 1 breakeven requires $540 in recovered value. At the expected $730/year, the device pays for itself in the first year and generates ~$590/year net in subsequent years.

Comparison to alternatives:

  • TurboTax Deluxe: $90-170/year (just tax prep, no other features)
  • Identity theft monitoring (LifeLock): $120-350/year (monitoring only, no recovery)
  • Unclaimed property finder services: 10-35% of recovered amount (one-time, per-claim fee)

The device replaces or subsumes all three of these ongoing costs while adding capabilities none of them offer.


6. On-Device vs Cloud Inference Cost

Per-Inference Cost Comparison

Assuming an average inference of 150 tokens (50 input + 100 output):

| Platform | Cost/Inference | 2,000 inf/day | Annual Cost |
|---|---|---|---|
| On-device Jetson (amortized 3yr) | $0.0001 | $0.20/day | $73 |
| GPT-4o-mini (API) | $0.0002 | $0.40/day | $146 |
| Claude 3.5 Haiku (API) | $0.0004 | $0.80/day | $292 |
| Claude 3.5 Sonnet (API) | $0.003 | $6.00/day | $2,190 |
| GPT-4o (API) | $0.004 | $8.00/day | $2,920 |

At household-scale volumes (1,000-4,000 inferences/day), on-device inference is 2x cheaper than the cheapest cloud option and 30-40x cheaper than frontier models. But the absolute numbers are small — the difference between $73/year and $146/year is not the reason to build a local device.

The Real Differentiator: Privacy

The cost advantage is real but modest. The privacy advantage is decisive. The device processes:

  • Social Security Numbers — for tax preparation, benefits applications, unclaimed property claims
  • Bank account and routing numbers — for direct deposit setup on claim forms
  • Medical records — diagnosis codes, provider names, treatment history, billing details
  • Tax returns — complete income, deduction, and credit information
  • Employment history — for retirement account searches, wage claim checks
  • Email contents — for settlement notice detection, subscription auditing

None of this data leaves the device. There is no cloud storage to breach, no terms of service granting the provider rights to training data, no API deprecation risk, no sudden pricing changes, no compliance questions about HIPAA or financial data handling. The data stays on an NVMe SSD in the user's home, encrypted at rest, accessible only on the local network.

This is not a marketing claim — it is an architectural guarantee. The models run locally via Ollama. The database is local SQLite. The only outbound network traffic is HTTP requests to public databases (state unclaimed property sites, IRS, etc.) and cellular calls via the SIM modem. No personal data is transmitted to any AI provider.

For the subset of users who care about data privacy — and that subset is growing — this is the entire pitch. "Your financial data never leaves your house."

Hybrid Strategy

The practical recommendation is a hybrid approach:

  • On-device (7B): Classification, name matching, form filling, email triage, phone call conversation — high-volume, privacy-sensitive tasks
  • Cloud API (Sonnet/Opus) as optional fallback: Tax optimization, complex dispute strategy, document understanding for unusual formats — low-volume, high-complexity tasks where 7B accuracy is insufficient

At 50-100 cloud API calls per month (the complex tail), annual cloud cost is $20-50. Combined with on-device inference, total compute cost is $93-123/year — well within the value generation envelope.
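The routing policy can be sketched as a small pure function. The task names, threshold sets, and the assumption that PII is stripped before any cloud call are all illustrative, not the product's actual implementation:

```python
# Sketch of the hybrid routing policy: high-volume, privacy-sensitive work
# stays on the local 7B model; only the complex, low-volume tail may go to
# a cloud API. Task names and sets are illustrative assumptions.

LOCAL_TASKS = {"classification", "name_matching", "form_filling",
               "email_triage", "phone_conversation"}
CLOUD_ELIGIBLE = {"tax_optimization", "dispute_strategy", "unusual_document"}

def route(task: str, contains_pii: bool) -> str:
    if task in LOCAL_TASKS or contains_pii:
        return "local_7b"    # routine or PII-bearing: never leaves the device
    if task in CLOUD_ELIGIBLE:
        return "cloud_api"   # complex tail, with PII stripped upstream
    return "local_7b"        # default safe: stay local

assert route("email_triage", contains_pii=True) == "local_7b"
assert route("tax_optimization", contains_pii=False) == "cloud_api"
```

Defaulting unknown tasks to local keeps the privacy guarantee intact even when the task taxonomy grows.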


7. Competitive Landscape

Consumer AI Hardware (2024-2026)

| Product | Price | Local Compute | Financial Features | Status |
|---|---|---|---|---|
| Rabbit R1 | $199 | None (cloud) | None | Commercial failure — glorified app launcher |
| Humane AI Pin | $699 + $24/mo | Minimal | None | Struggled commercially, pivoted |
| Limitless Pendant | $99 | None (cloud) | None | Meeting transcription only |
| Rewind/Limitless | $300-500 + $20-30/mo | macOS screen recording | None | Subscription model, cloud analysis |
| Personal AI Computer | $400 | 67 TOPS, 7B model | Core purpose | Proposed |

The Closest Analog: Home Assistant

Home Assistant is the best comparable product — not in function, but in structure. Home Assistant is:

  • Open-source software running on a $35-100 local device (Raspberry Pi, Home Assistant Yellow, Home Assistant Green)
  • Always on, always connected to the home network
  • Automates repetitive tasks (lights, thermostats, locks, sensors)
  • Justifies its hardware cost through utility savings ($50-200/year in energy optimization)
  • Has a passionate community of contributors and users
  • Sells hardware (Home Assistant Yellow: $125, Home Assistant Green: $99) that bundles the software

The personal AI computer is the financial equivalent of Home Assistant. Home Assistant saves money on electricity. The personal AI computer finds money in databases, corrects billing errors, and files claims. Both are local-first, always-on, privacy-preserving appliances that justify their cost through measurable savings.

Key Insight

Nothing in the current market combines on-device AI inference with financial recovery automation. The consumer AI hardware products (Rabbit, Humane, Limitless) are cloud-dependent general assistants with no financial features. The financial tools (TurboTax, LifeLock, unclaimed property finders) are cloud services with no local compute. The space between — a local AI appliance purpose-built for financial tasks — is empty.


8. The Three-Layer Architecture

The personal AI computer maps to a three-layer architecture that aligns with Joe's existing project portfolio:

┌─────────────────────────────────────────────────────────────┐
│  LAYER 1: ORCHESTRATION (OpenClaw)                          │
│                                                             │
│  Decides WHAT to do and WHEN:                               │
│  - Task scheduling via WSJF priority and cron               │
│  - Approval workflows (autonomous vs ask-first)             │
│  - Multi-channel messaging (WhatsApp, iMessage, SMS)        │
│  - Persistent memory — learns household preferences         │
│  - Skills: markdown playbooks for each task category        │
│  - Dashboard: Canvas for user visibility into device state  │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│  LAYER 2: EXECUTION (Domain-Specific Engines)               │
│                                                             │
│  Does THE WORK:                                             │
│  - Dollar Hound: claims pipeline, state database scanning,  │
│    name resolution, scoring, outreach                       │
│  - sip-audio: phone call appliance (SIM7600 modem, TTS,    │
│    STT, real-time conversation)                             │
│  - Tax engine: OCR, deduction categorization, form filling  │
│  - Email engine: IMAP classification, settlement detection  │
│  - Healthcare engine: bill OCR, EOB comparison, error flags │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│  LAYER 3: HARDWARE + INFERENCE (Jetson + Ollama)            │
│                                                             │
│  Provides THE COMPUTE:                                      │
│  - 7B Q4_K_M model via Ollama (CUDA-accelerated)           │
│  - 512GB NVMe: model storage, SQLite databases, documents  │
│  - WiFi 6: home network, internet access                   │
│  - SIM7600 cellular modem: phone calls, SMS, fallback data │
│  - 15W power envelope: always-on appliance                  │
└─────────────────────────────────────────────────────────────┘

What Ships on the Device

| Component | Role | Size |
|---|---|---|
| OpenClaw (Node.js daemon) | Orchestration, scheduling, memory, messaging | ~50MB |
| Dollar Hound (Node.js) | Claims pipeline, database, resolver, enrichment | ~30MB |
| sip-audio (Python) | Phone appliance: modem driver, TTS, STT, conversation agent | ~20MB |
| Ollama runtime | LLM inference server | ~100MB |
| Qwen2.5:7b-instruct-q4_K_M | Primary inference model | ~4.5GB |
| faster-whisper (small.en) | Speech-to-text for phone calls | ~500MB |
| Piper TTS (en_US medium) | Text-to-speech for phone calls | ~100MB |
| SQLite databases | Claims data, household data, memory, call logs | Variable |
| Total | | ~5.5GB initial |

The 512GB NVMe provides ample room for the software stack, multiple model versions, document storage (scanned bills, tax forms), and database growth over years of operation.


9. Minimum Viable Product

To justify $400, the device needs to generate $400+ in value during Year 1. The MVP should focus on the highest-probability, highest-value tasks — the ones that work for the broadest set of households with the least onboarding friction.

MVP Task Set

1. Unclaimed Property Monitoring (Already Built)

Dollar Hound's claims pipeline already scans state databases, resolves names, scores matches, and generates outreach. This is the most mature component. For the consumer product, it needs:

  • Simplified onboarding: enter household names and addresses (current + former)
  • Weekly automated scans of all 50 states
  • Push notifications when potential claims are found
  • Pre-filled claim forms where possible

Expected value: $200-$1,500/year at 60% probability = $510 expected.

2. Tax Preparation (Universal Need)

Every household that files taxes pays for preparation — either in software ($90-170 for TurboTax/H&R Block) or in professional fees ($200-500 for a CPA). A device that handles standard returns (W-2 income, standard deduction, child tax credit) replaces this cost entirely. For complex returns, it does the initial categorization and flags items for review.

Expected value: $200-$800/year at 90% probability = $450 expected.

3. Subscription Audit (Near-Universal Savings)

Scan email for recurring charge confirmations. Build a list of all active subscriptions. Flag subscriptions the user may have forgotten about. Calculate total monthly subscription spend. A Chase/Mint study found the average American underestimates their subscription spending by 2.5x. The average household has 12 paid subscriptions totaling $219/month — and has forgotten about 2-3 of them.

Expected value: $50-$300/year at 70% probability = $123 expected.
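The subscription audit reduces to grouping charges and testing for a monthly cadence. A minimal sketch, with a hypothetical input format (merchant, amount, day number):

```python
from collections import defaultdict

# Detect recurring subscriptions from purchase-confirmation emails: group
# charges by (merchant, amount) and flag groups recurring at roughly
# monthly intervals. The input format and thresholds are illustrative.

def find_subscriptions(charges, min_occurrences=3, tolerance_days=5):
    """charges: list of (merchant, amount, day_number) tuples."""
    groups = defaultdict(list)
    for merchant, amount, day in charges:
        groups[(merchant, amount)].append(day)
    subs = []
    for key, days in groups.items():
        days.sort()
        gaps = [b - a for a, b in zip(days, days[1:])]
        if len(days) >= min_occurrences and all(
                abs(g - 30) <= tolerance_days for g in gaps):
            subs.append(key)
    return subs

charges = [("StreamCo", 15.99, 1), ("StreamCo", 15.99, 31),
           ("StreamCo", 15.99, 62),          # monthly cadence -> subscription
           ("Cafe", 4.50, 10), ("Cafe", 4.50, 12)]  # repeat purchase, not a sub
assert find_subscriptions(charges) == [("StreamCo", 15.99)]
```

The hard part in practice is extracting (merchant, amount, date) from varied email formats, which is where the LLM earns its keep; the grouping step itself is cheap CPU work.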

MVP Value Calculation

These three tasks yield $400-$900/year at 70%+ probability for the average household. This is sufficient to justify the $400 device cost in Year 1, even at the conservative end of estimates. Every additional task category (healthcare billing, phone calls, benefits access) is pure upside.

MVP Technical Requirements

  • Jetson Orin Nano Super with NVMe and WiFi (no SIM needed for MVP)
  • Ollama + 7B model
  • Dollar Hound pipeline (Node.js)
  • Simple web dashboard (accessible on home network)
  • Email access (IMAP credentials) for subscription audit
  • Document upload (photos of W-2s, 1099s) for tax prep
  • Push notifications via email or existing messaging platform

Phone call capability (SIM modem, sip-audio) is deferred to v2. It adds value but also adds hardware cost ($25 modem + $120/year SIM), onboarding complexity, and technical risk.


10. Key Risks

1. The 8GB RAM Wall

Risk: 7B models handle simple tasks (name matching, email classification, form filling) but struggle with complex reasoning (tax optimization, multi-step dispute strategies, nuanced billing error detection). 14B models need ~8.5GB of VRAM and do not fit on the 8GB Jetson alongside the OS and application stack.

Mitigation: Hybrid approach — use 7B for high-volume simple tasks, cloud API (Sonnet/Haiku) for the complex tail. Monitor AMD 16GB mini PC pricing for a hardware upgrade path. Design the software architecture to be hardware-agnostic (Ollama abstraction layer).

Severity: Medium. The MVP tasks (claims monitoring, basic tax prep, subscription audit) are within 7B capability. Complex tax optimization and dispute negotiation are v2 features.
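The hybrid routing in the mitigation above can be sketched as a simple dispatch table. The task names and model identifiers here are illustrative placeholders, not a fixed API; the point is that routing is a one-function seam, which keeps the architecture hardware-agnostic.

```python
# Sketch of hybrid model routing: cheap on-device 7B for high-volume simple
# tasks, a metered cloud model for the complex tail. Task and model names
# are assumptions for illustration.
LOCAL_TASKS = {"name_match", "email_classify", "form_fill", "address_parse"}
CLOUD_TASKS = {"tax_optimization", "dispute_strategy", "billing_error_analysis"}

def route(task: str) -> str:
    if task in LOCAL_TASKS:
        return "ollama/qwen2.5:7b"   # on-device, no per-call cost
    if task in CLOUD_TASKS:
        return "cloud/claude-sonnet" # complex reasoning, pay per call
    return "ollama/qwen2.5:7b"       # default local; escalate on low confidence

print(route("name_match"), route("tax_optimization"))
```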

2. Model Quality at 7B

Risk: Joe's current Dollar Hound system uses Qwen2.5:14b for JSON extraction and Qwen3-Coder (80B, on DGX Spark) for complex reasoning. Downgrading to 7B-only reduces accuracy on structured extraction tasks. Name matching false positive rates may increase. Form filling may produce more errors requiring human correction.

Mitigation: Fine-tune a 7B model on Dollar Hound's specific tasks (name matching, address parsing, form field extraction). Task-specific fine-tuning can recover much of the accuracy gap between 7B and 14B for narrow domains. Joe has access to the DGX Spark for training.

Severity: Medium-High. This is the core quality risk. User trust depends on accuracy — a device that produces too many false positives (claiming matches that aren't real) or false negatives (missing real claims) will be perceived as unreliable.
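One way to cap the false-positive rate regardless of model size is a deterministic pre-filter that resolves the easy cases and only escalates borderline pairs to the LLM. The thresholds below are illustrative assumptions; a production filter would also need nickname and maiden-name tables.

```python
import re
from difflib import SequenceMatcher

# Sketch of a deterministic name-match pre-filter: normalize to sorted
# lowercase tokens, score with a string-similarity ratio, and send only
# the ambiguous middle band to the model. Thresholds are assumptions.
def normalize(name: str) -> str:
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(tokens))  # order-insensitive: "SMITH, JOHN" == "John Smith"

def classify(a: str, b: str, lo: float = 0.6, hi: float = 0.92) -> str:
    score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    if score >= hi:
        return "match"
    if score <= lo:
        return "no_match"
    return "escalate_to_llm"  # ambiguous: let the model decide

print(classify("John A. Smith", "SMITH, JOHN A"))
print(classify("John Smith", "Jane Doe"))
```

Confident matches and confident rejections never touch the model, so the 7B's error rate only applies to the narrow escalation band.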

3. Real-Time Phone Call Memory Pressure

Risk: Simultaneous TTS + STT + LLM inference requires ~5.2GB on an 8GB device. This leaves minimal headroom for the OS and risks OOM kills or swapping-induced latency spikes during calls.

Mitigation: Phone calls are a dedicated mode — suspend all other tasks, unload non-essential processes, and reserve memory. The sip-audio architecture already handles this. Alternatively, defer phone calls to v2 when 16GB hardware is available.

Severity: Low (manageable with dedicated mode) to Medium (if users expect concurrent operation).

4. First-Run Onboarding Friction

Risk: The device needs household data to function: names (including maiden names, former names), current and former addresses, SSNs (for tax prep), email credentials (for subscription audit). This is the highest-friction moment in the user experience. Users who give up during onboarding never see value.

Mitigation: Progressive onboarding — start with just names and addresses (enough for claims monitoring), then request additional data as the user sees value. A companion mobile app with camera-based document scanning (photograph your W-2 instead of typing numbers) reduces friction significantly. Never require all data upfront.

Severity: High. Onboarding is the single biggest risk to consumer adoption. The device is useless without household data, but asking for SSNs and bank accounts on day one is a trust barrier.

5. Legal and Liability Exposure

Risk: Tax advice, claims filing, phone calls conducted on behalf of the user, and medical billing disputes all have legal implications. Unauthorized practice of law (filing legal claims), unauthorized tax advice (optimizing deductions), and impersonation (phone calls using TTS) are all potential liability vectors.

Mitigation: Position the device as a "tool that assists" rather than an "agent that acts." All actions require user approval before execution. Tax preparation outputs carry disclaimers. Phone calls disclose AI involvement where legally required. Terms of service explicitly state the device provides information, not professional advice. Consult with a consumer product attorney before launch.

Severity: Medium. The legal landscape for AI-assisted financial tools is evolving rapidly. Early movers face regulatory uncertainty but also have the opportunity to shape norms.

6. Software Update Mechanism

Risk: Models are 4-8GB each. Software updates require downloading new model weights, application code, and potentially OS patches. Over-the-air (OTA) updates over home WiFi are feasible but slow for large model updates. Failed updates can brick the device.

Mitigation: A/B partition scheme (like Android) — download the update to partition B while partition A continues running, then swap on next boot. If B fails, revert to A automatically. Model updates are background downloads with integrity verification (SHA-256 checksums) before swapping. microSD card serves as emergency recovery.
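The integrity-verification step can be sketched in a few lines: stream the downloaded artifact, compare its digest against the published manifest, and only then mark partition B bootable. The function names here are hypothetical; the throwaway file stands in for a multi-gigabyte model blob.

```python
import hashlib
import os
import tempfile

# Sketch of the pre-swap integrity check for OTA updates. Only the digest
# comparison matters; partition swapping itself is platform-specific.
def sha256_file(path: str, chunk: int = 1 << 20) -> str:
    """Stream the file so multi-GB model blobs never load into RAM at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verify_update(path: str, expected_sha256: str) -> bool:
    """True only if the download matches the manifest digest."""
    return sha256_file(path) == expected_sha256.lower()

# Demo with a small stand-in file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"model-weights")
    path = f.name
ok = verify_update(path, hashlib.sha256(b"model-weights").hexdigest())
bad = verify_update(path, "0" * 64)  # corrupted/mismatched manifest
os.remove(path)
print(ok, bad)
```

A failed check means the update is discarded and partition A keeps running, which is what makes the scheme brick-resistant.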

Severity: Low-Medium. Solved problem in embedded systems, but requires careful engineering.


11. What the Product Is

A $400 box that plugs into your home WiFi and starts finding money you did not know you had. It scans all 50 state unclaimed property databases for your family, monitors for class action settlements you are eligible for, catches medical billing errors, does your taxes, and handles phone calls to dispute charges — all without sharing your personal data with anyone. Think of it as a Home Assistant for your financial life: always on, always working, always private. It costs less than $2/month in electricity, requires no subscription, and pays for itself within the first year through the money it recovers and the costs it eliminates.


12. Next Steps: Research to Product

Phase 1: Validate on Existing Hardware (Weeks 1-2)

  1. Run Dollar Hound at depth=10 on Prometheus. Measure actual GPU time for a weekly scan of all 50 states with a four-name household. Compare claim detection accuracy at 7B (Qwen2.5:7b) versus 14B (Qwen2.5:14b) on the same dataset. Quantify the accuracy gap.

  2. Benchmark 7B task accuracy. Run 100 known-positive and 100 known-negative name matches through the 7B model. Measure false positive rate and false negative rate. Determine the acceptable threshold for consumer use.

  3. Measure real power consumption. Plug Prometheus into a Kill-A-Watt meter. Measure actual wattage during idle, during inference, and during phone calls. Validate the $20/year electricity estimate.
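The benchmark in step 2 reduces to a small calculation once the labeled runs exist. The error counts below are illustrative placeholders, not measurements.

```python
# Sketch of the step-2 accuracy benchmark: false positive rate over known
# negatives, false negative rate over known positives.
def error_rates(labels, preds):
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    return fp / (labels.count(False) or 1), fn / (labels.count(True) or 1)

# 100 known positives, 100 known negatives; suppose the 7B model misses 4
# real matches and invents 3 (illustrative numbers only).
labels = [True] * 100 + [False] * 100
preds  = [True] * 96 + [False] * 4 + [False] * 97 + [True] * 3
fpr, fnr = error_rates(labels, preds)
print(f"FPR={fpr:.2%} FNR={fnr:.2%}")
```

For consumer use the two rates are not symmetric: false positives burn trust immediately, while false negatives are invisible until a competitor finds the claim.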

Phase 2: Build Missing Components (Weeks 3-6)

  1. Build the tax prep module. Start with standard returns: W-2 income, standard deduction, child tax credit. Use the tax prep prototype Joe already started. Target: correctly prepare a 1040 for a household with W-2 income and 2 dependents, matching TurboTax output within $50.

  2. Build the subscription audit module. IMAP email scanning for recurring charge confirmations. Pattern matching on subject lines and sender addresses. Output: list of active subscriptions with monthly costs. Target: correctly identify 90%+ of paid subscriptions in a test mailbox.

  3. Build the onboarding flow. Web-based setup wizard served on the local network. Progressive data collection: names and addresses first, email credentials second, tax documents third. Secure local storage with at-rest encryption.

Phase 3: Package and Test (Weeks 7-10)

  1. Create the install image. Single NVMe image with Ubuntu 22.04 + JetPack 6 + Ollama + Dollar Hound + OpenClaw + models + setup wizard. Flash to SSD, insert into Jetson, power on, access the setup wizard at http://device-ip:8080. Target: under 10 minutes from unboxing to first scan.

  2. User test with 5 households. Give 5 people (family, friends) a pre-built Jetson and track value recovered over 90 days. Success metric: 4 out of 5 recover $100+ within 90 days. Collect qualitative feedback on onboarding friction, notification quality, and trust.

Phase 4: Hardware and Business (Weeks 11-16)

  1. Lock in BOM for 100-unit pilot run. Negotiate volume pricing on Jetson dev kits, NVMe SSDs, WiFi modules, enclosures. Target BOM: $350 or less at 100-unit volumes.

  2. Define go-to-market. Direct-to-consumer via website. Pre-order model with refundable deposit. Target: 100 units in first batch. Price: $399 or $449 with SIM modem.

  3. Monitor AMD 16GB pricing. If 16GB Ryzen AI mini PCs hit $400 by Q3 2026, evaluate switching the hardware platform for batch 2. The 14B model unlock would significantly improve tax prep and dispute capabilities.


Appendix A: Power and Electricity Cost Sensitivity

| Device Power | Annual kWh | Cost at $0.10/kWh | Cost at $0.15/kWh | Cost at $0.20/kWh |
|---|---|---|---|---|
| 7W (Jetson idle) | 61 | $6 | $9 | $12 |
| 15W (Jetson active) | 131 | $13 | $20 | $26 |
| 25W (Jetson max) | 219 | $22 | $33 | $44 |
| 45W (AMD mini PC) | 394 | $39 | $59 | $79 |
| 65W (AMD mini PC max) | 569 | $57 | $85 | $114 |

The Jetson's 15W power budget is a genuine advantage for an always-on appliance. Even at the highest residential electricity rate in the US (~$0.30/kWh in Hawaii), the 15W Jetson costs about $39/year. The AMD alternative at 45W would cost $118/year in Hawaii: still modest, but 3x higher.
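The table follows directly from one formula, so any new hardware candidate can be priced in a line. The helper name is illustrative.

```python
# Annual electricity cost: watts * 8760 hours/year / 1000 * rate per kWh.
def annual_cost(watts: float, rate_per_kwh: float) -> float:
    return watts * 8760 / 1000 * rate_per_kwh

print(round(annual_cost(15, 0.10)))  # Jetson active, US-average rate
print(round(annual_cost(45, 0.30)))  # AMD mini PC, Hawaii rate
```

This also shows why duty cycle matters: a device that idles at 7W and only bursts to 25W during scans lands much closer to the idle row than the max row.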

Appendix B: Model Size vs RAM Requirements

| Model | Parameters | Quantization | VRAM Required | Fits 8GB? | Fits 16GB? |
|---|---|---|---|---|---|
| Qwen2.5:3b | 3B | Q4_K_M | ~2.0GB | Yes | Yes |
| Llama 3.2:3b | 3B | Q4_K_M | ~2.0GB | Yes | Yes |
| Qwen2.5:7b | 7B | Q4_K_M | ~4.5GB | Yes (tight) | Yes |
| Llama 3.1:8b | 8B | Q4_K_M | ~5.0GB | Yes (tight) | Yes |
| Qwen2.5:14b | 14B | Q4_K_M | ~8.5GB | No | Yes |
| Phi-4 | 14B | Q4_K_M | ~9.0GB | No | Yes |
| Qwen2.5:32b | 32B | Q4_K_M | ~19GB | No | No |

The 8GB boundary is the key constraint. Everything above 7B-8B parameters requires 16GB+ RAM. The jump from 7B to 14B is where model quality improves most noticeably for structured reasoning tasks — and it is exactly the jump that the Jetson cannot make.
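The VRAM column follows a rough rule of thumb: Q4_K_M averages about 4.85 bits per parameter, plus runtime overhead and KV cache. The overhead constant below is an assumption for illustration, not a measured value, so expect the estimates to drift a few hundred MB from real loads.

```python
# Rough VRAM estimate for Q4_K_M-quantized models. The 0.5GB overhead term
# (runtime + small KV cache) is an assumption; real usage varies with
# context length and runtime.
def q4_vram_gb(params_billion: float, overhead_gb: float = 0.5) -> float:
    return params_billion * 4.85 / 8 + overhead_gb

for p in (3, 7, 14, 32):
    print(f"{p}B -> ~{q4_vram_gb(p):.1f}GB")
```

The estimate makes the 8GB wall visible at a glance: 14B lands near 9GB before any application stack, so no quantization tweak at Q4 brings it under 8GB with room to spare.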

Appendix C: Comparable "Find Money" Services Pricing

| Service | Cost | What It Does | Limitation |
|---|---|---|---|
| TurboTax Deluxe | $90-170/year | Tax preparation | Tax only, subscription model |
| H&R Block Premium | $85-150/year | Tax preparation | Tax only, subscription model |
| LifeLock Standard | $120/year | Identity monitoring, alerts | Monitoring only, no recovery |
| LifeLock Ultimate Plus | $350/year | Monitoring + stolen funds reimbursement | Insurance, not prevention/recovery |
| Unclaimed.org finder services | 10-35% of claim | Find and file unclaimed property claims | Per-claim fee, no ongoing monitoring |
| Medical billing advocate | $100-200/hr | Review and dispute medical bills | Per-engagement, expensive |
| CPA (basic return) | $200-500 | Tax preparation and optimization | Per-engagement, annual cost |

The personal AI computer replaces or subsumes the functionality of multiple paid services while adding capabilities (phone calls, form filling, email monitoring) that none of them offer. At $400 one-time plus $20-140/year in operating cost, it costs less over three years than LifeLock Ultimate Plus or annual CPA engagements alone, and far less than the combined stack of services it replaces.