HobbyBoard

Introduction

Hobbyboard turns folders of photos, sketches, and screenshots into a searchable archive with semantic search, enriched tags, and beautiful boards — all on your own machine.

Philosophy: Free as in beer and speech. Private by default. The AI/Vision model is used only once at image processing time to extract meaning; your data stays yours.

Quick Start

Get up and running quickly. Pick one of the options below:

Option 1: Homebrew (Recommended)

Terminal
brew tap aravindhsampath/tap
brew install hobbyboard
hobbyboard
# Open http://localhost:9625

Option 2: Docker Compose

Terminal
curl -O https://raw.githubusercontent.com/aravindhsampath/hobbyboard/main/docker-compose.yml
docker compose up -d
# Open http://localhost:9625

Installation Options

Hobbyboard can be installed in several ways. Pick the option below that fits your setup.

Recommendation: Use your OS package manager when available. It is the easiest way to keep Hobbyboard up to date.

1. Package Managers (Recommended)

Use your OS package manager (brew, yum, apt, pacman). Example with Homebrew:

Terminal
brew tap aravindhsampath/tap
brew install hobbyboard

2. Docker Compose (Dependencies Pre-installed)

Easiest container setup with Qdrant bundled via compose.

Terminal
curl -O https://raw.githubusercontent.com/aravindhsampath/hobbyboard/main/docker-compose.yml
# Start Qdrant first, then the rest of the stack.
docker compose up -d qdrant
docker compose up -d
# Open http://localhost:9625

3. Docker Image Direct

Run the official image without compose. Start Qdrant first, then Hobbyboard:

Terminal
# Start Qdrant first. Paths are quoted so directories with spaces work.
docker run -d --name hobbyboard-qdrant \
  --restart unless-stopped \
  -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
  qdrant/qdrant:latest

# Then start Hobbyboard.
docker run -d --name hobbyboard-app \
  --restart unless-stopped \
  --add-host=host.docker.internal:host-gateway \
  -p 9625:9625 \
  -v "$(pwd):/app" \
  -v /app/frontend \
  -v "$(pwd)/raw_images:/app/raw_images" \
  -v "$(pwd)/dist:/app/dist" \
  ghcr.io/aravindhsampath/hobbyboard:latest
# Open http://localhost:9625

4. Pre-built Binaries

Download the latest release for your OS. Note: Requires ffmpeg and libheif to be installed on your system.

| Platform | Dependencies Command |
|---|---|
| macOS | brew install ffmpeg libheif |
| Ubuntu/Debian | sudo apt install ffmpeg libheif-dev |
| Windows | Install via Winget or Chocolatey. |

5. Build from Source (Cargo)

For developers who want the latest changes. Requires the Rust toolchain.

Terminal
git clone https://github.com/aravindhsampath/hobbyboard.git
cd hobbyboard
cargo build --release
./target/release/hobbyboard serve

Configuration

Hobbyboard is configured via a hobbyboard.toml file. On first launch, a default file is created in your working directory.

Most knobs you want to turn live here. Customize away.

Configuration Reference

| Section | Setting | What it does | Default |
|---|---|---|---|
| [paths] | raw_media_dir | Your image hoard location. Read-only (we promise). | "raw_images" |
| [paths] | dist_dir | Where we dump thumbnails, DBs, and metadata. | "dist" |
| [ai] | provider | ollama (free/slow), openai, or gemini. | "ollama" |
| [ai] | vision_model | The eyes. e.g., qwen3-vl:8b, gemini-3-flash-preview, gpt-5-mini. | "qwen3-vl:8b" |
| [ai] | embedding_model | Text-to-vector model loaded in-memory by Hobbyboard. | "mxbai-embed-large-v1" |
| [server] | host | 0.0.0.0 to expose to your LAN. | "0.0.0.0" |
| [server] | port | Why 9625? I mashed the numpad. | 9625 |
| [search] | vector_weight | 0.0–1.0. Higher = more concept/vibe matching. | 0.7 |
| [search] | fts_weight | 0.0–1.0. Higher = exact keyword matching. | 0.3 |
| [search.fts_weights] | manual_tags | Multiplier. If you typed it, it matters more. | 2.0 |
| [search.fts_weights] | user_notes | Multiplier. Your notes matter too. | 2.0 |
| [search.fts_weights] | caption | Multiplier. AI description weight. | 1.0 |
| [search.fts_weights] | auto_tags | Multiplier. AI tags weight. | 1.0 |
| [search.fts_weights] | ocr | Multiplier. Text found inside images. | 0.8 |
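
To make the [search.fts_weights] multipliers concrete, here is a minimal Rust sketch of how per-field match scores might be combined. The default values come from the table above; the function names and the "multiply then sum" rule are assumptions for illustration, not Hobbyboard's actual code. (SQLite FTS5's bm25() also accepts per-column weights directly; the docs do not say which approach Hobbyboard takes.)

```rust
/// Default multipliers, copied from the [search.fts_weights] rows above.
/// Returns None for a field that the table does not list.
pub fn default_fts_weight(field: &str) -> Option<f32> {
    match field {
        "manual_tags" | "user_notes" => Some(2.0),
        "caption" | "auto_tags" => Some(1.0),
        "ocr" => Some(0.8),
        _ => None,
    }
}

/// Hypothetical combination rule: multiply each field's match score by its
/// multiplier and sum the results. Unknown fields contribute nothing.
/// Scores are assumed to be "higher is better".
pub fn weighted_fts_score(field_scores: &[(&str, f32)]) -> f32 {
    field_scores
        .iter()
        .map(|&(field, score)| default_fts_weight(field).unwrap_or(0.0) * score)
        .sum()
}
```

With these defaults, a hit in manual_tags counts 2.5 times as much as an equally strong hit in ocr (2.0 versus 0.8).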

Notes on AI Configuration

Vision models are configured via [ai].vision_model. You can customize [ai].prompt_analysis for domain-specific indexing.

| Provider | Recommended Model | Use Case |
|---|---|---|
| Ollama | qwen3-vl:8b | Slower, but runs locally and keeps your data private. |
| OpenAI | gpt-5-mini | High accuracy, low cost, requires API key. |
| Gemini | gemini-3-flash-preview | Excellent OCR and fast processing. |

Embedding models run locally via fastembed-rs and are set via [ai].embedding_model.

| Model | Dimensions | Size | Latency (Apple Silicon M2) |
|---|---|---|---|
| nomic-embed-text-v1.5 | 768 | ~275 MB | 14 ms |
| mxbai-embed-large-v1 | 1024 | ~670 MB | 34 ms |
| Alibaba-NLP/gte-large-en-v1.5 | 1024 | ~1.6 GB | 42 ms |

Technical Design

This section explains the end-to-end architecture and the image lifecycle. It is written for engineers who care about self-hosting, privacy, and how the pipeline is actually wired.

System Orientation

  • Binary entrypoint: src/main.rs (CLI commands: setup, init, build, serve, export).
  • Auto-dispatch: On start, HobbyBoard decides whether to serve, run onboarding, or initialize based on config and DB presence.
  • Core modules: config/logging, media ingestion, AI, DB, ops pipeline, server/API, and the embedded UI.
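
The auto-dispatch step can be pictured as a small decision table. The sketch below is an assumption based on the description above (no config means onboarding, config without a DB means initialize, both present means serve); the enum and function names are hypothetical, and the real logic lives in src/main.rs.

```rust
/// Startup modes named in the auto-dispatch bullet above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum StartMode {
    Onboarding,
    Initialize,
    Serve,
}

/// Hypothetical helper: pick a mode from two presence checks.
/// The exact rule is an assumption, not taken from the source code.
pub fn decide_start_mode(config_exists: bool, db_exists: bool) -> StartMode {
    match (config_exists, db_exists) {
        // Without a config file there is nothing to serve yet.
        (false, _) => StartMode::Onboarding,
        // Config is present but storage has not been created.
        (true, false) => StartMode::Initialize,
        // Everything is in place.
        (true, true) => StartMode::Serve,
    }
}
```

A caller could feed it something like `Path::new("hobbyboard.toml").exists()` for the first argument and a similar check on the SQLite file for the second.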

User Action Flow (first run)

  1. Run hobbyboard → onboarding server starts.
  2. Diagnostics check dependencies (ffmpeg, libheif, Qdrant).
  3. Configure AI provider, media path, and embedding model.
  4. Write hobbyboard.toml + .env, then init + build.
  5. Server restarts into normal serve mode.
Artifacts created: hobbyboard.toml, .env, dist/, SQLite DB, Qdrant collection.

Image Lifecycle

  • Scan: Walk the media directory, hash files, and upsert into items.
  • Process: Create variants, transcode video, generate thumbnails, and write to dist/.
  • AI + Embedding: Run vision analysis, extract tags/OCR, embed text, and upsert vectors in Qdrant.
  • Catalog: Regenerate dist/data/catalog.json used by the UI for fast listing.

Search & Retrieval

  • Semantic search: Query text is embedded locally and searched in Qdrant.
  • Text search: SQLite FTS5 (bm25) searches OCR, captions, and notes.
  • Fusion: Vector and FTS scores are normalized and merged using config weights.
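
As a rough illustration of the fusion step, here is a minimal Rust sketch that min-max normalizes two score lists and blends them with vector_weight and fts_weight. The normalization method and the function names are assumptions; the docs only say that scores are "normalized and merged". Both inputs are assumed to be "higher is better" (raw SQLite bm25() values are lower-is-better, so they would need flipping first).

```rust
/// Min-max normalize scores into [0, 1]. If all scores are equal,
/// every score maps to 1.0 so a lone result is not zeroed out.
pub fn min_max_normalize(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().copied().fold(f32::INFINITY, f32::min);
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|&s| if range > 0.0 { (s - min) / range } else { 1.0 })
        .collect()
}

/// Blend per-item vector and FTS scores. Index i in both slices must refer
/// to the same item. Panics if the slices differ in length.
pub fn fuse_scores(
    vector_scores: &[f32],
    fts_scores: &[f32],
    vector_weight: f32,
    fts_weight: f32,
) -> Vec<f32> {
    assert_eq!(vector_scores.len(), fts_scores.len(), "score lists must align");
    let v = min_max_normalize(vector_scores);
    let f = min_max_normalize(fts_scores);
    v.iter()
        .zip(f.iter())
        .map(|(a, b)| vector_weight * a + fts_weight * b)
        .collect()
}
```

With the default weights (0.7 and 0.3), an item that tops the semantic ranking but is last on keywords still outranks one with the opposite profile.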

Data Model (SQLite)

  • items: file registry, paths, variants, dimensions.
  • generated_metadata: AI output (title, caption, tags, OCR).
  • user_tags, user_notes: manual inputs (preserved on refresh).
  • boards, board_items: collections.
  • media_search_fts: FTS index.
  • system_info: embedding model and dimension integrity.

Operational Invariants

  • Embedding model integrity: serve refuses to start if the configured embedding model differs from the one recorded in the DB.
  • Content hash identity: same file content = same ID (dedupe by hash).
  • Catalog is authoritative: UI list is driven by catalog.json.
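
The embedding model invariant boils down to comparing what the DB recorded against what the config asks for. A minimal Rust sketch, assuming a hypothetical EmbeddingInfo record (the actual system_info schema is not shown in these docs):

```rust
/// Hypothetical record of the embedding setup, as it might be stored in
/// system_info or derived from hobbyboard.toml.
#[derive(Debug, PartialEq, Eq)]
pub struct EmbeddingInfo {
    pub model: String,
    pub dimension: usize,
}

/// Refuse to continue when the configured model or dimension differs from
/// what the library was built with, since existing vectors would not be
/// comparable with new query embeddings.
pub fn check_embedding_integrity(
    stored: &EmbeddingInfo,
    configured: &EmbeddingInfo,
) -> Result<(), String> {
    if stored == configured {
        Ok(())
    } else {
        Err(format!(
            "embedding mismatch: DB has {} ({}D), config has {} ({}D)",
            stored.model, stored.dimension, configured.model, configured.dimension
        ))
    }
}
```

Switching from mxbai-embed-large-v1 (1024 dimensions) to nomic-embed-text-v1.5 (768 dimensions) would trip this check.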

Failure Modes (and what happens)

  • Missing ffmpeg/libheif: video transcode or HEIC/AVIF decode is skipped; build continues with warnings.
  • AI provider errors: metadata extraction fails and item is rolled back.
  • Qdrant down: semantic search degrades or fails; SQLite search still works.
  • Refresh build: generated metadata is reset; manual tags remain (may be orphaned without FK).

Code Entry Points

  • src/main.rs — CLI dispatch + auto-start.
  • src/ops/library.rs — scan/process/AI pipeline.
  • src/media/* — image/video processing.
  • src/ai/client.rs — AI API + schema handling.
  • src/server/handlers.rs — HTTP endpoints.
  • frontend/app.js — UI interactions.

Usage & Workflow

Hobbyboard operates in two main modes: the CLI (for management) and the Web UI (for browsing).

The Workflow

1. Scan → 2. Process → 3. Serve

Behind the Curtains

  • Processing: When you start Hobbyboard, it scans raw_media_dir. New files are hashed to detect duplicates.
  • OCR & Vision: Images are passed through OCR (to read text) and a Vision LLM (to describe the scene). This metadata is stored in SQLite.
  • Vector Search: The descriptions and OCR text are converted into vectors using a local embedding model and stored in Qdrant (embedded). This allows for "vibe-based" semantic search.

HobbyBoard CLI Reference

HobbyBoard is designed as a single-binary orchestrator. The CLI handles everything from system diagnostics to AI-heavy ingestion and high-performance serving.

Core Workflow

The standard lifecycle for a HobbyBoard library follows this sequence:

setup (Diagnostics) → init (Infrastructure) → build (Ingestion) → serve (Operation)

1. System Diagnostics (setup)

Before initializing a library, use the setup command to verify native dependencies. HobbyBoard relies on ffmpeg for video processing and libheif for HEIC (Apple) and AVIF image support.

Terminal
hobbyboard setup

  • What it checks: Binary presence of ffmpeg, heif-info/heif-enc, and connectivity to the Qdrant gRPC/HTTP endpoints.
  • Setup Wizard: If run without a configuration file, HobbyBoard defaults to an interactive Web UI on port 9625 to help generate the initial hobbyboard.toml.

2. Library Initialization (init)

Prepares the storage layer for a new library.

Terminal
hobbyboard init

  • SQLite: Creates hobbyboard_data.db and applies migrations (FTS5 tables, metadata schema).
  • Qdrant: Creates the collection and defines the vector distance metric (Cosine) and dimension size based on the selected embedding model.
  • Idempotency: Safe to run multiple times; it will not overwrite existing data unless the database file is manually removed.
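
For readers unfamiliar with the Cosine metric mentioned above, this is what it computes. The sketch is plain Rust for illustration only; Qdrant performs this internally, and the function name is an assumption.

```rust
/// Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).
/// Returns None when the lengths differ, the input is empty, or either
/// vector has zero magnitude (the value is undefined in those cases).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

The length check mirrors why the collection's dimension must match the embedding model: a 768-dimensional query cannot be compared with 1024-dimensional stored vectors.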

3. Ingestion & Indexing (build)

The build phase is a multi-stage pipeline that is both CPU-bound and network-bound.

Terminal
hobbyboard build [FLAGS]

Key Flags

  • --refresh: Wipes all AI-generated metadata (descriptions, tags, embeddings) but preserves user data (manual tags, boards, notes). Useful when changing the AI system prompt or provider.
  • --deep: Forces a full SHA-256 re-hash of every file in raw_media_dir to detect bit-rot or content changes that haven't triggered a filesystem timestamp update.
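
The --deep description implies that a normal build relies on filesystem timestamps to decide what to re-hash. A minimal Rust sketch of that decision, under that assumption; the function name and the idea of a recorded modification time are hypothetical:

```rust
use std::time::SystemTime;

/// Decide whether a file should be re-hashed.
/// - With `deep` set, always re-hash.
/// - Otherwise re-hash only if the file is new (no recorded mtime) or its
///   modification time differs from the recorded one.
pub fn needs_rehash(
    deep: bool,
    recorded_mtime: Option<SystemTime>,
    current_mtime: SystemTime,
) -> bool {
    deep || recorded_mtime != Some(current_mtime)
}
```

This also shows what --deep buys you: a file whose bytes changed while its timestamp stayed the same slips past the fast path and is only caught by a full re-hash.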

Pipeline Stages

  1. Scanner: Walks the directory, identifies supported MIME types, and computes 64-bit unique IDs.
  2. Processor: Generates optimized AVIF/JPEG thumbnails and 4-second video previews. Applies EXIF orientation.
  3. Vision AI: Sends media to the configured provider (Ollama, OpenAI, Gemini) for structural analysis.
  4. Embedder: Generates embedding vectors (1024 dimensions with the default model) from descriptions using fastembed-rs.

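
The scanner's "same content, same ID" behavior can be sketched in a few lines of Rust. This is an illustration only: it uses the standard library's DefaultHasher as a stand-in to get a 64-bit value, whereas the docs mention SHA-256 for hashing. DefaultHasher's output is not guaranteed to stay stable across Rust releases, so it would not be suitable for IDs that are persisted. Function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Derive a 64-bit ID from file content (stand-in hash, see note above).
pub fn content_id(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Keep the first path seen for each distinct content ID, so byte-identical
/// copies collapse into a single entry.
pub fn dedupe_by_content<'a>(files: &[(&'a str, &[u8])]) -> HashMap<u64, &'a str> {
    let mut seen = HashMap::new();
    for &(path, bytes) in files {
        seen.entry(content_id(bytes)).or_insert(path);
    }
    seen
}
```

Because the ID depends only on the bytes, renaming or moving a file does not create a new item, while editing it does.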
4. Operational Mode (serve)

Starts the Axum-based web server and REST API.

Terminal
hobbyboard serve

  • Default Port: 9625
  • Routes: Serves the Vanilla JS frontend from the binary (via rust-embed) or a local directory if ui_path is set.
  • Search: Handles hybrid queries (Semantic + FTS5) with configurable weights.
  • Auth: In the current version, access control is handled at the network/proxy layer.

5. Maintenance & Reset (cleanup.sh)

For developers or users needing a hard reset, a cleanup.sh utility is provided in the root of the repository.

Terminal
./cleanup.sh

  • Destructive: Removes the SQLite database, wipes the local Qdrant storage directory, and clears all generated thumbnails in dist/.
  • Usage: Best used when switching between local development and Docker environments to ensure no stale volume data persists.

Configuration (hobbyboard.toml)

Most settings live in hobbyboard.toml. In addition, the CLI reads the following environment variables (usually stored in .env):

  • OPENAI_API_KEY: Required for OpenAI Vision.
  • GEMINI_API_KEY: Required for Google Gemini.
  • RUST_LOG: Controls verbosity (error, warn, info, debug, trace).
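
A minimal Rust sketch of how a provider name could map to its API key variable. The variable names come from the list above; the function names, the error text, and the treatment of providers without a listed variable (such as ollama) as "no key needed" are assumptions.

```rust
/// Environment variable required by a provider, if any.
/// Providers without a listed variable (for example "ollama") map to None.
pub fn required_key_var(provider: &str) -> Option<&'static str> {
    match provider {
        "openai" => Some("OPENAI_API_KEY"),
        "gemini" => Some("GEMINI_API_KEY"),
        _ => None,
    }
}

/// Read the key from the environment.
/// Ok(None) means the provider needs no key; Err means the key is missing.
pub fn resolve_api_key(provider: &str) -> Result<Option<String>, String> {
    match required_key_var(provider) {
        None => Ok(None),
        Some(var) => std::env::var(var)
            .map(Some)
            .map_err(|_| format!("{var} is not set (required for provider {provider})")),
    }
}
```

Failing early with a message that names the missing variable is friendlier than letting the first vision request fail mid-build.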

Logs

Logs are printed to stdout by default. In Docker, view them with:

Terminal
docker compose logs -f hobbyboard

Architecture

Hobbyboard is built with Rust for performance and safety. It uses a clean architecture separating the scanner, the AI pipeline, and the HTTP server.

Data persistence relies on:

  • SQLite: Relational data (file paths, timestamps, tags).
  • Qdrant: High-performance vector similarity search.
  • Filesystem: Thumbnail cache in dist/images.
