HobbyBoard

Introduction

Hobbyboard turns folders of photos, sketches, and screenshots into a searchable archive with semantic search, enriched tags, and beautiful boards — all on your own machine.

Philosophy: Free as in beer and speech. Private by default. The AI/Vision model is used only once at image processing time to extract meaning; your data stays yours.

Quick Start

Get up and running in seconds. Select your platform below:

Option 1: Homebrew (Recommended)

Terminal
brew tap aravindhsampath/tap
brew install hobbyboard
hobbyboard
# Open http://localhost:9625

Option 2: Docker Compose

Terminal
curl -O https://raw.githubusercontent.com/aravindhsampath/hobbyboard/main/docker-compose.yml
docker compose up -d
# Open http://localhost:9625

Installation Options

Hobbyboard offers flexibility for every type of user. Other installation options are listed below.

Recommendation: Use your OS package manager when available. It is the easiest way to receive updates automatically.

1. Package Managers (Recommended)

Use your OS package manager (brew, yum, apt, pacman). Example with Homebrew:

Terminal
brew tap aravindhsampath/tap
brew install hobbyboard

2. Docker Compose

Easiest container setup via compose.

Terminal
curl -O https://raw.githubusercontent.com/aravindhsampath/hobbyboard/main/docker-compose.yml
docker compose up -d
# Open http://localhost:9625

3. Docker Image Direct

Run the official image without compose:

Terminal
docker run -d \
    --name hobbyboard \
    -p 9625:9625 \
    -v hb-config:/app/config \
    -v hb-data:/app/dist \
    -v ./raw_images:/app/raw_images \
    --add-host=host.docker.internal:host-gateway \
    ghcr.io/aravindhsampath/hobbyboard:latest
# Open http://localhost:9625

4. Pre-built Binaries

Download the latest release for your OS. Note: Requires ffmpeg and libheif to be installed on your system.

Platform | Dependencies Command
macOS | brew install ffmpeg libheif
Ubuntu/Debian | sudo apt install ffmpeg libheif-dev

5. Build from Source (Cargo)

For developers who want the latest changes. Requires the Rust toolchain.

Terminal
git clone https://github.com/aravindhsampath/hobbyboard.git
cd hobbyboard
cargo build --release
./target/release/hobbyboard serve

Configuration

Hobbyboard is configured via a hobbyboard.toml file. On first launch, a default file is created in your working directory.

Most knobs you want to turn live here. Customize away.

Configuration Reference

Section | Setting | What it does | Default
[paths] | raw_media_dir | Your image hoard location. Read-only (we promise). | "raw_images"
[paths] | dist_dir | Where we dump thumbnails, DBs, and metadata. | "dist"
[ai] | provider | ollama (free/slow), openai, or gemini. | "ollama"
[ai] | vision_model | The eyes. e.g., qwen3-vl:8b, gemini-3-flash-preview, gpt-5-mini. | "qwen3-vl:8b"
[ai] | embedding_model | Text-to-vector model loaded in-memory by Hobbyboard. | "mxbai-embed-large-v1"
[server] | host | 0.0.0.0 to expose to your LAN. | "0.0.0.0"
[server] | port | Why 9625? I mashed the numpad. | 9625
[logging] | level | Minimum log level: trace, debug, info, warn, error. | "info"
[logging] | file_max_size_mb | Max log file size in MB before rotation. | 50
[search] | vector_weight | 0.0–1.0. Higher = more concept/vibe matching. | 0.3
[search] | fts_weight | 0.0–1.0. Higher = exact keyword matching. | 0.7
[search.fts_weights] | manual_tags | Multiplier. If you typed it, it matters more. | 2.0
[search.fts_weights] | user_notes | Multiplier. Your notes matter too. | 2.0
[search.fts_weights] | caption | Multiplier. AI description weight. | 1.0
[search.fts_weights] | auto_tags | Multiplier. AI tags weight. | 1.0
[search.fts_weights] | ocr | Multiplier. Text found inside images. | 0.8
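
One way to picture the [search.fts_weights] multipliers is as per-field scaling applied before field scores are summed. The Python sketch below is illustrative only: the function name `weighted_text_score` and the idea of a raw per-field score are assumptions, and Hobbyboard may instead pass these values to SQLite's ranking function directly.

```python
# Default multipliers from [search.fts_weights] in the reference table.
FTS_WEIGHTS = {
    "manual_tags": 2.0,
    "user_notes": 2.0,
    "caption": 1.0,
    "auto_tags": 1.0,
    "ocr": 0.8,
}


def weighted_text_score(field_scores, weights=FTS_WEIGHTS):
    """Sum hypothetical per-field relevance scores, scaled by each field's multiplier.

    field_scores maps a field name to a raw score where higher means a better match.
    Fields without a configured multiplier count with weight 1.0 (an assumption).
    """
    return sum(score * weights.get(field, 1.0) for field, score in field_scores.items())


# The same raw match counts for more in a manual tag than in OCR text.
tag_hit = weighted_text_score({"manual_tags": 1.0})
ocr_hit = weighted_text_score({"ocr": 1.0})
print(tag_hit, ocr_hit)  # 2.0 0.8
```

With the defaults, a keyword you typed as a tag outranks the same keyword found only in OCR text.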

Notes on AI Configuration

Vision models are configured via [ai].vision_model. You can customize [ai].prompt_analysis for domain-specific indexing.

Provider | Recommended Model | Use Case
Ollama | qwen3-vl:8b | Slower, but fully local and private.
OpenAI | gpt-5-mini | High accuracy, low cost, requires API key.
Gemini | gemini-3-flash-preview | Excellent OCR and fast processing.

Embedding models run locally via fastembed-rs and are set via [ai].embedding_model.

Model | Dimensions | Size | Latency (Apple Silicon M2)
nomic-embed-text-v1.5 | 768 | ~275 MB | 14 ms
mxbai-embed-large-v1 | 1024 | ~670 MB | 34 ms
Alibaba-NLP/gte-large-en-v1.5 | 1024 | ~1.6 GB | 42 ms

Technical Design

This section explains the end-to-end architecture for engineers who want to understand, self-host, or adapt Hobbyboard. Everything runs in a single binary — no external services required.

System Architecture

[Diagram: System Architecture]
  • Single binary: Axum web server + SQLite + USearch vector index + fastembed-rs embeddings. No external services required.
  • Frontend: Vanilla JS/CSS/HTML embedded in the binary via rust-embed. Zero build tools needed.
  • AI: Vision analysis via Ollama (local), OpenAI, or Gemini. Text embeddings are always generated locally by fastembed-rs — no API calls for search.

Control Flow & Auto-Dispatch

Running hobbyboard with no subcommand triggers auto-dispatch based on system state:

[Diagram: Control Flow]

Explicit subcommands (setup, init, build, serve, export, auth reset) bypass auto-dispatch and run directly.
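
The dispatch decision can be pictured as a small function over two facts: whether a config file exists and whether the database exists. The sketch below is illustrative Python, not Hobbyboard's Rust code. The function name `auto_dispatch` and the returned labels are hypothetical, and the middle branch (config present, database missing) is an assumption, since the text only describes the no-config and config-plus-database cases.

```python
import tempfile
from pathlib import Path


def auto_dispatch(config_dir: Path, dist_dir: Path) -> str:
    """Pick an action from on-disk state (hypothetical helper)."""
    has_config = (config_dir / "hobbyboard.toml").is_file()
    has_db = (dist_dir / "data" / "hobbyboard.db").is_file()
    if not has_config:
        return "onboarding"   # first run: start the setup wizard
    if not has_db:
        return "init+build"   # assumption: config exists, library not built yet
    return "serve"            # config + DB found: normal serve mode


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    print(auto_dispatch(root, root / "dist"))   # onboarding
    (root / "hobbyboard.toml").write_text("")
    print(auto_dispatch(root, root / "dist"))   # init+build
    (root / "dist" / "data").mkdir(parents=True)
    (root / "dist" / "data" / "hobbyboard.db").write_bytes(b"")
    print(auto_dispatch(root, root / "dist"))   # serve
```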

First Run & Onboarding

  1. Run hobbyboard → no config found → onboarding web UI starts on port 9625.
  2. Wizard checks dependencies (ffmpeg, libheif codecs).
  3. User configures: AI provider, media source path, password, embedding model.
  4. Wizard writes hobbyboard.toml + .env (API keys, password hash, signing key).
  5. Triggers init (creates SQLite DB + USearch index) then build (processes media).
  6. Process restarts via exec() → auto-dispatch detects config + DB → normal serve mode.
Artifacts created: hobbyboard.toml, .env, dist/data/hobbyboard.db, dist/data/vectors.usearch, dist/images/ (thumbnails).

Build Pipeline (Image Lifecycle)

The build command runs a 3-phase pipeline orchestrated by src/ops/library.rs:

[Diagram: Build Pipeline]
  • Item ID: First 16 chars of the SHA-256 hash of the file content. Same content = same ID (deduplication).
  • Supported formats: jpg, jpeg, png, webp, heic, avif (images); mp4, mov, webm, m4v (videos).
  • Thumbnails: 4 WebP size variants stored under dist/images/{thumb,sm,md,lg}/.
  • Video: Transcoded to MP4 via ffmpeg; frame thumbnails extracted for the grid view.
  • AI output: Structured JSON — title, caption (dense description), tags[] (10–12 keywords), ocr (visible text or null).
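
The content-addressed Item ID described above can be sketched in a few lines. This is illustrative Python rather than the project's Rust code; it assumes the "first 16 chars" are taken from the hex encoding of the digest, and the helper name `item_id` is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def item_id(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the first 16 hex characters of the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Stream in chunks so large videos are never loaded fully into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:16]


with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "cat.jpg"
    b = Path(tmp) / "copy-of-cat.jpg"
    a.write_bytes(b"same pixels")
    b.write_bytes(b"same pixels")
    same_id = item_id(a) == item_id(b)
    print(same_id)  # True
```

Because the ID depends only on the bytes, renaming or moving a file does not create a new item, and two copies of the same file collapse into one.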

Search & Retrieval

Search is hybrid: semantic vectors + full-text, fused with configurable weights.

[Diagram: Search and Retrieval]
  • Vector search: USearch HNSW index with cosine similarity. Embeddings are generated locally by fastembed-rs — no API calls at query time.
  • Text search: SQLite FTS5 with BM25 ranking across title, caption, auto_tags, ocr_text, manual_tags, and user_notes.
  • Score fusion: combined = (vec_score × vector_weight) + (fts_score × fts_weight). Default: 30% vector, 70% FTS. Tune via [search] in the config.
  • "More like this": The /api/similar/{id} endpoint returns cosine-nearest neighbors from the USearch index.
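
To see how the fusion weights change ranking, here is a minimal Python sketch of the formula above. It assumes both scores are already normalized to the 0.0–1.0 range with higher meaning better (raw SQLite BM25 values are not in that form, so some normalization step is implied); the candidate names and scores are made up for illustration.

```python
def fuse(vec_score: float, fts_score: float,
         vector_weight: float = 0.3, fts_weight: float = 0.7) -> float:
    """combined = (vec_score x vector_weight) + (fts_score x fts_weight)."""
    return vec_score * vector_weight + fts_score * fts_weight


# Hypothetical candidates: (vector score, full-text score).
candidates = {
    "sunset-sketch": (0.9, 0.1),      # strong concept match, weak keyword match
    "sunset-screenshot": (0.2, 0.8),  # weak concept match, strong keyword match
}


def rank(vector_weight: float, fts_weight: float) -> list:
    """Order candidate names by fused score, best first."""
    return sorted(
        candidates,
        key=lambda name: fuse(*candidates[name], vector_weight, fts_weight),
        reverse=True,
    )


print(rank(0.3, 0.7))  # ['sunset-screenshot', 'sunset-sketch']
print(rank(0.7, 0.3))  # ['sunset-sketch', 'sunset-screenshot']
```

With the default 30/70 split the keyword match wins; swapping the weights lets the concept match rise to the top.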

Data Model

All structured data lives in a single SQLite database (dist/data/hobbyboard.db, WAL mode). The USearch vector index (dist/data/vectors.usearch) is an on-disk cache — SQLite is the source of truth for embeddings.

[Diagram: Data Model]
Refresh safety: --refresh wipes AI-generated data (items, generated_metadata, vector_keys, FTS) but always preserves user_tags, user_notes, and boards.
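
The refresh guarantee can be illustrated with an in-memory SQLite database. The table names come from the text above, but the columns are placeholders and the `refresh` helper is hypothetical; the FTS index is left out so the sketch does not depend on FTS5 being compiled into your SQLite build.

```python
import sqlite3

AI_TABLES = ("items", "generated_metadata", "vector_keys")   # wiped by --refresh
USER_TABLES = ("user_tags", "user_notes", "boards")          # always preserved


def refresh(conn: sqlite3.Connection) -> None:
    """Delete AI-generated rows in one transaction; never touch user tables."""
    with conn:  # commits on success, rolls back if any DELETE fails
        for table in AI_TABLES:
            conn.execute(f"DELETE FROM {table}")


conn = sqlite3.connect(":memory:")
for table in AI_TABLES + USER_TABLES:
    # Placeholder schema: the real columns are not documented here.
    conn.execute(f"CREATE TABLE {table} (item_id TEXT, payload TEXT)")
    conn.execute(f"INSERT INTO {table} VALUES ('abc123', 'example')")
conn.commit()

refresh(conn)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in AI_TABLES + USER_TABLES
}
print(counts)
```

After the refresh, the three AI tables are empty while each user table still holds its row.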

Directory Layout

./
├── hobbyboard.toml          # Main config
├── .env                     # Secrets (API keys, password hash, signing key)
├── raw_media/               # Your original photos & videos (read-only)
└── dist/
    ├── data/
    │   ├── hobbyboard.db    # SQLite (WAL mode), all metadata
    │   └── vectors.usearch  # HNSW vector index (cache, rebuilt on build)
    ├── images/
    │   ├── thumb/           # 480px WebP thumbnails
    │   ├── sm/              # 960px WebP
    │   ├── md/              # 1920px WebP
    │   └── lg/              # 3840px WebP
    ├── originals/           # Converted originals (HEIC→JPEG, etc.)
    ├── video/               # Transcoded MP4 videos
    └── logs/
        └── hobbyboard.log   # Rotating log (max 50 MB)

In Docker, these map to named volumes: hb-config (hobbyboard.toml + .env) and hb-data (dist/). The HB_CONFIG_DIR, HB_DIST_DIR, and HB_RAW_MEDIA_DIR env vars override default paths.
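
A rough model of how the three override variables could resolve to paths is shown below. It is a Python sketch under assumptions: the helper name `resolve_paths` is hypothetical, the fallbacks are the defaults listed in this document, and the sketch ignores any interaction with values set in hobbyboard.toml. The example values are the container paths from the docker run command.

```python
import os
from pathlib import Path


def resolve_paths(env=None) -> dict:
    """Resolve config, dist, and raw media paths, letting env vars win over defaults."""
    env = os.environ if env is None else env
    config_dir = Path(env.get("HB_CONFIG_DIR", "."))  # default: current directory
    return {
        "config": config_dir / "hobbyboard.toml",
        "dist": Path(env.get("HB_DIST_DIR", "dist")),
        "raw_media": Path(env.get("HB_RAW_MEDIA_DIR", "raw_images")),
    }


# No overrides: everything is relative to the working directory.
print(resolve_paths({}))

# Docker-style overrides pointing at the mounted volumes.
docker_env = {
    "HB_CONFIG_DIR": "/app/config",
    "HB_DIST_DIR": "/app/dist",
    "HB_RAW_MEDIA_DIR": "/app/raw_images",
}
print(resolve_paths(docker_env))
```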

Operational Invariants

  • Embedding model lock: On serve startup, the app checks that system_info.embedding_model matches the config. Mismatches are fatal: you must run build --refresh to re-embed with a new model.
  • Content-addressed identity: Item IDs are derived from file content (SHA-256), so the same file always gets the same ID regardless of filename or location.
  • SQLite is source of truth: The USearch .usearch file is a cache rebuilt from the vector_keys table on each build. Delete it and the next build recreates it.
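
The embedding model lock amounts to a startup guard. The Python sketch below shows the idea; the exception class, the function name, and the treatment of a fresh library (no stored model yet) are assumptions, not the project's actual code.

```python
class EmbeddingModelMismatch(RuntimeError):
    """Raised when the stored and configured embedding models differ."""


def check_embedding_model(stored, configured: str) -> None:
    """Refuse to continue if the library was embedded with a different model."""
    # Assumption: a fresh library has no stored model yet, so there is nothing to compare.
    if stored is not None and stored != configured:
        raise EmbeddingModelMismatch(
            f"library was embedded with {stored!r} but config says {configured!r}; "
            "run `hobbyboard build --refresh` to re-embed"
        )


check_embedding_model("mxbai-embed-large-v1", "mxbai-embed-large-v1")  # passes silently

try:
    check_embedding_model("mxbai-embed-large-v1", "nomic-embed-text-v1.5")
except EmbeddingModelMismatch as err:
    print(err)
```

The check has to be fatal because vectors from different models are not comparable: the two models in this example do not even share a dimension count (1024 versus 768), so a query embedded with one cannot be searched against an index built with the other.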

Failure Modes

  • Missing ffmpeg: Video transcoding is skipped; build continues with image-only processing.
  • Missing libheif: HEIC/AVIF decoding fails for those formats; other image formats still process.
  • AI provider unreachable: Vision analysis fails for that item; item metadata is not created. Retryable on next build.
  • Embedding model mismatch: Server refuses to start. Fix by running build --refresh with the correct model configured.
  • Refresh build: AI metadata is reset; manual tags and notes are preserved but may reference items not yet re-processed.

Key Dependencies

Category | Crate / Tool | Role
Web | axum + tokio + tower | Async web server, middleware, routing
Database | rusqlite | SQLite bindings (WAL mode, FTS5)
Vectors | usearch | HNSW approximate nearest neighbor index
Embeddings | fastembed | Local text embedding inference (ONNX)
Images | image + libheif-rs | Image decoding, resizing, HEIC/AVIF support
Video | ffmpeg (sidecar) | Transcode to MP4, extract frame thumbnails
AI | reqwest | HTTP client for Ollama, OpenAI, Gemini APIs
Auth | argon2 + jsonwebtoken | Password hashing, JWT session tokens
CLI | clap | Command-line argument parsing
Frontend | rust-embed | Embeds HTML/JS/CSS into the binary
Serialization | serde + serde_json | Config parsing, API request/response
Logging | tracing | Structured logging with daily rotation
Hashing | sha2 | Content-addressed file identity

Usage & Workflow

Hobbyboard operates in two modes: the CLI (for management and builds) and the Web UI (for browsing, searching, and organizing).

CLI Reference

The binary is a single-command orchestrator. All subcommands share the same config and storage.

Core Workflow

setup (config wizard) → init (create DB + index) → build (ingest media) → serve (run server)

Or just run hobbyboard with no arguments — auto-dispatch handles the right action based on system state.


1. Setup Wizard

Terminal
hobbyboard setup
  • What it does: Launches an interactive web UI (port 9625) that walks you through AI provider selection, media path, password, and embedding model.
  • Dependency check: Verifies ffmpeg and libheif (heif-info/heif-enc) are available.
  • Output: Writes hobbyboard.toml and .env with your settings.

2. Library Initialization

Terminal
hobbyboard init
  • SQLite: Creates dist/data/hobbyboard.db with all tables and the FTS5 index.
  • USearch: Initializes the HNSW vector index file at dist/data/vectors.usearch with the correct dimensions for your embedding model.
  • Idempotent: Safe to run multiple times; won't overwrite existing data.

3. Build (Ingestion & Indexing)

Terminal
hobbyboard build [--refresh] [--deep]
  • --refresh: Wipes AI-generated metadata (descriptions, tags, embeddings) but preserves user data (manual tags, boards, notes). Use when changing AI provider or prompt.
  • --deep: Full SHA-256 re-hash of every file to detect content changes that didn't update filesystem timestamps.

Pipeline stages:

  1. Scan: Walk directory, hash files (SHA-256), diff against DB, skip unchanged.
  2. Process (parallel): Generate 4 WebP thumbnail variants (480/960/1920/3840px), transcode video to MP4, apply EXIF orientation.
  3. AI Analysis (serial): Send media to vision LLM, embed caption via fastembed-rs, store metadata + vector.
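
The "diff against DB, skip unchanged" step of the scan stage can be sketched as a set comparison over item IDs. This is illustrative Python; the function name `diff_scan`, its inputs, and the example IDs are hypothetical, and how Hobbyboard handles files that have disappeared from disk is not stated here.

```python
def diff_scan(scanned: dict, known_ids: set):
    """Split a scan into new and unchanged files, plus IDs no longer on disk.

    scanned maps a file path to its content-derived item ID;
    known_ids is the set of item IDs already recorded in the database.
    """
    new = {path: iid for path, iid in scanned.items() if iid not in known_ids}
    unchanged = {path: iid for path, iid in scanned.items() if iid in known_ids}
    missing = known_ids - set(scanned.values())
    return new, unchanged, missing


scanned = {
    "raw_images/cat.jpg": "aaaa000000000001",
    "raw_images/dog.png": "bbbb000000000002",
}
known_ids = {"aaaa000000000001", "cccc000000000003"}

new, unchanged, missing = diff_scan(scanned, known_ids)
print(sorted(new))        # ['raw_images/dog.png']
print(sorted(unchanged))  # ['raw_images/cat.jpg']
print(sorted(missing))    # ['cccc000000000003']
```

Only the entries in `new` would need thumbnails and AI analysis, which is what keeps repeat builds cheap.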

4. Serve

Terminal
hobbyboard serve [--port 9625] [--ui-path ./frontend]
  • Default port: 9625
  • Frontend: Served from the embedded binary (via rust-embed), or from a local directory if --ui-path is set (useful for UI development).
  • Auth: Password-based login with Argon2 hashing. Sessions use JWT cookies (hb_session) or Bearer tokens for API access.
  • Search: Hybrid vector + FTS5 queries with configurable weights.
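
For scripted API access, a Bearer token goes in the Authorization header. The Python sketch below only builds the request for the /api/similar/{id} endpoint mentioned earlier and does not send it. How a token is obtained is not covered here, so `example-token` is a placeholder; the helper name and the use of a plain GET are assumptions.

```python
import urllib.request


def similar_request(base_url: str, item_id: str, token: str) -> urllib.request.Request:
    """Build an authenticated request for the "more like this" endpoint."""
    url = f"{base_url.rstrip('/')}/api/similar/{item_id}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


req = similar_request("http://localhost:9625", "e3b0c44298fc1c14", "example-token")
print(req.full_url)                     # http://localhost:9625/api/similar/e3b0c44298fc1c14
print(req.get_header("Authorization"))  # Bearer example-token

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```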

5. Other Commands

Command | Description
hobbyboard export | Export all data (Google Takeout style archive).
hobbyboard auth reset | Reset the admin password interactively.

Environment Variables (.env)

Secrets and overrides are stored in .env (loaded automatically from HB_CONFIG_DIR or CWD):

Variable | Purpose
OPENAI_API_KEY | Required when [ai].provider = "openai"
GEMINI_API_KEY | Required when [ai].provider = "gemini"
HB_PASSWORD_HASH | Argon2 password hash (set by setup wizard)
HB_SIGNING_KEY | JWT session signing secret (set by setup wizard)
HB_CONFIG_DIR | Override config directory (default: CWD). Used in Docker.
HB_DIST_DIR | Override dist directory (default: dist/).
HB_RAW_MEDIA_DIR | Override raw media directory.

Logs

Logs are written to both stdout and dist/logs/hobbyboard.log (rotating, max 50 MB). In Docker:

Terminal
docker compose logs -f hobbyboard

Troubleshooting

Symptom | Cause | Fix
Server refuses to start with model mismatch error | Embedding model in config differs from what's stored in DB | Run hobbyboard build --refresh
HEIC/AVIF images show as broken | libheif not installed or missing codecs | Install libheif + libaom + libdav1d
Videos not processed | ffmpeg not in PATH | Install ffmpeg
Search returns no results | Build hasn't run or completed | Run hobbyboard build and check logs
Docker: loops back to onboarding after restart | Config check wasn't using HB_CONFIG_DIR | Update to latest image

Code Map

For engineers adapting Hobbyboard or contributing. The codebase is ~12,700 lines of Rust + ~1,400 lines of frontend JavaScript.

Path | What it does
src/main.rs | CLI entry point, clap dispatch, auto-start logic
src/config.rs | Config loading, defaults, TOML serialization, env var handling
src/logging.rs | Tracing setup with daily file rotation
src/ops/library.rs | Build pipeline orchestrator (scan → process → AI)
src/media/scanner.rs | Directory walker, file hashing, change detection
src/media/processor.rs | Image resizing (WebP variants), video transcoding (ffmpeg)
src/ai/client.rs | Multi-provider AI client (Ollama, OpenAI, Gemini)
src/ai/embedding.rs | fastembed-rs wrapper for local text embeddings
src/ai/schema.rs | Vision analysis JSON schema (title, caption, tags, OCR)
src/db/mod.rs | Schema creation, migrations, connection setup
src/db/items.rs | Item CRUD, catalog queries, FTS operations
src/db/vectors.rs | USearch index management, vector upsert/search
src/server/mod.rs | Axum router, JWT auth middleware, CORS
src/server/handlers.rs | All REST API endpoints (~1,200 lines)
frontend/app.js | Single-file SPA: grid, search, lightbox, boards, tags