diff --git a/README.md b/README.md
new file mode 100644
index 0000000..0f97d84
--- /dev/null
+++ b/README.md
@@ -0,0 +1,245 @@
+# publish
+
+Turn a recorded church service into show-notes and a social hook clip in one
+pass. Local transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp),
+LLM summary via Claude, ffmpeg-cut portrait clip — wired together in a single Go
+CLI.
+
+```
+publish [--summerize] [--clip] [--post] [flags]
+```
+
+## What it does
+
+Given an audio or video recording (mp4, m4a, mp3, wav, ...), `publish` will:
+
+1. **Transcribe** the audio locally with whisper.cpp (CUDA / ROCm / Vulkan /
+   Metal / CPU — picked automatically per machine).
+2. **Summarize** (`--summerize`) the sermon into a Markdown document with
+   speaker, scripture references (KJV), key points, and a memorable quote.
+   Optionally also emit Spotify-for-Podcasters-friendly HTML.
+3. **Clip** (`--clip`) a 60–90 second hook from the preaching, re-encoded to
+   1080×1920 portrait (9:16) with a center-crop, capped at 1 GiB — ready to
+   upload to Reels / Shorts / TikTok / X.
+4. **Post** (`--post`) — Spotify upload integration, not implemented yet.
+
+The transcript is cached at `.segments.json`, so running multiple modes
+or re-tuning prompt parameters costs one whisper run.
+
+## Quick start
+
+```bash
+git clone ~/Git\ Repos/summerize
+cd ~/Git\ Repos/summerize
+make install
+```
+
+`make install` is interactive: it detects your OS and GPU, walks you through
+installing system dependencies, builds whisper.cpp with the right backend,
+downloads a ggml model, and links `publish` + `whisper-cli-` into
+`~/.local/bin`. Re-runnable; each step is idempotent.
+
+Then:
+
+```bash
+publish --summerize sermon.mp4
+publish --clip sermon.mp4
+publish --summerize --clip sermon.mp4 # both, one transcribe pass
+```
+
+Make sure `~/.local/bin` is on your `PATH`.
+
+### Other Make targets
+
+| target | what it does |
+|---|---|
+| `make` / `make build` | build `./publish` in the repo |
+| `make link` | rebuild + link `./publish` into `~/.local/bin` |
+| `make install` | interactive end-to-end setup |
+| `make doctor` | print detected OS / GPU / dependencies and exit |
+| `make uninstall` | remove the `publish` symlink |
+| `make clean` | remove the local `publish` binary |
+| `make test` | `go test ./...` |
+
+## Modes
+
+Modes are boolean flags; combine freely. Defaults to `--summerize` if none set.
+
+### `--summerize`
+
+Markdown summary of the message.
+
+```bash
+publish --summerize sermon.mp4
+publish --summerize --spotify sermon.html sermon.mp4
+publish --summerize --copy sermon.mp4 # Spotify HTML -> clipboard
+publish --summerize --prompt "$(cat notes.md)" sermon.mp4
+```
+
+Key flags:
+
+| flag | purpose |
+|---|---|
+| `--md PATH` | Markdown output path; `-` = stdout, `""` = disable. Default `.summary.md` |
+| `--spotify PATH` | Also write Spotify-show-notes HTML (subset of HTML their editor accepts) |
+| `--copy` | Copy the Spotify HTML to the clipboard (`wl-copy` / `xclip` / `pbcopy`) |
+| `--prompt TEXT` | Producer's notes — pre-written framing the LLM treats as authoritative for title, speaker name, key points. The transcript expands and enriches it |
+
+### `--clip`
+
+Pick the best 60–90 second sermon clip and cut it to a portrait social video.
+
+```bash
+publish --clip sermon.mp4
+publish --clip --min 75 --max 90 sermon.mp4
+publish --clip --dry-run sermon.mp4 # show the picked window only
+publish --clip --copy-codec sermon.mp4 # fast stream copy (skips 9:16 crop)
+```
+
+Key flags:
+
+| flag | purpose |
+|---|---|
+| `--min` / `--max` | clip length bounds in seconds (default 60 / 90) |
+| `--out PATH` | clip output path (default `.clip`) |
+| `--copy-codec` | use `ffmpeg -c copy` — fast, but **skips the 9:16 portrait crop** (stream copy can't apply video filters) |
+| `--dry-run` | print the picked window but don't run ffmpeg |
+
+Video clips are re-encoded to **1080×1920 portrait** with a safe center-crop
+(`crop=min(iw,ih*9/16):min(ih,iw*16/9)`) that handles any source aspect ratio
+without distortion, and capped at 1 GiB via ffmpeg's `-fs`.
+
+### `--post`
+
+Stub. Will eventually push the markdown summary to a Spotify-for-Podcasters
+episode description.
+
+## Shared flags
+
+| flag | purpose | default |
+|---|---|---|
+| `--summarizer` | `claude-cli` (shells out to `claude -p`) or `claude-api` (direct Messages API) | `claude-cli` |
+| `--model` | model name (Anthropic API path defaults to `claude-sonnet-4-6`) | empty |
+| `--prompt-summary` | override the bundled summary system prompt | bundled |
+| `--prompt-clip` | override the bundled clip-selector system prompt | bundled |
+| `--whisper-bin` | whisper.cpp binary; auto-detects best backend if empty | auto |
+| `--whisper-model` | path to a ggml whisper model (.bin) | `~/.cache/whisper.cpp/ggml-base.en.bin` |
+| `--whisper-lang` | force whisper language code | auto-detect |
+| `--whisper-threads` | thread count | library default |
+| `--segments` | segments JSON cache path | `.segments.json` |
+| `--keep-transcript` | also write `.transcript.txt` | off |
+| `--keep-wav` | keep the normalized 16kHz WAV instead of using a tempdir | off |
+| `-v` | verbose progress to stderr | off |
+
+> **Note on `--prompt` vs `--prompt-summary`:**
+> `--prompt` is **content** (producer's notes that anchor the summary).
+> `--prompt-summary` is a **path** to override the system prompt template.
+> Different things; both are intentional.
+
+## Backends
+
+When `--whisper-bin` is not set, `publish` picks a whisper.cpp backend at
+runtime by walking this order:
+
+1. **Metal** (macOS) — uses `~/.local/bin/whisper-cli-metal` if present
+2. **CUDA** — `whisper-cli-cuda` if `nvidia-smi -L` succeeds
+3. **ROCm** — `whisper-cli-rocm` if `rocminfo` succeeds
+4. **Vulkan** — `whisper-cli-vulkan` if `vulkaninfo --summary` succeeds
+5. **CPU** — `whisper-cli` / `whisper-cpp` / `main` on PATH
+
+Each step requires both the binary and a working runtime probe; failures fall
+through to the next backend. The chosen backend is logged on stderr; `-v`
+adds diagnostics about which probes were skipped or failed.
+
+`make install` builds the right backend for your machine. Per-platform build
+recipes (CUDA, ROCm, Vulkan, Metal) live in [CLAUDE.md](./CLAUDE.md#backend-auto-detect).
+
+## Platform support
+
+| OS | tested | notes |
+|---|---|---|
+| Arch Linux | yes | `pacman` for system deps; CUDA / ROCm / Vulkan all supported |
+| macOS (Apple Silicon) | install path | `brew` + Xcode CLT; Metal acceleration |
+| Debian / Ubuntu | install path | `apt`; GPU runtime install is distro-specific, prints the package list |
+| Fedora | install path | `dnf`; same caveat as Debian |
+| Windows | no | not supported |
+
+## Output
+
+A typical `publish --summerize sermon.mp4` produces `sermon.summary.md` that
+looks roughly like:
+
+```markdown
+# Fatal Sleep
+
+**Speaker:** Rev. Hayford
+**Scripture:** Romans 13:11–14, 1 Thessalonians 5:6–8, Ephesians 5:14
+
+## Overview
+A 2-4 sentence plain-English summary of the central message.
+
+## Key Points
+- ...
+
+## Scripture & References
+- Romans 13:11 — "And that, knowing the time, that now it is high time to awake out of sleep..."
+- ...
+
+## Application / Call to Action
+- ...
+
+## Memorable Quote
+> "..."
+```
+
+`--spotify path.html` writes the same content as the small HTML subset that
+Spotify-for-Podcasters' show-notes editor accepts (a handful of basic tags,
+including list markup).
+
+`--clip` writes `.clip.mp4` (or `.m4a` for audio inputs). Video clips
+are 1080×1920 portrait, ≤1 GiB.
+
+## Requirements
+
+| tool | required for | install on Arch | install on macOS |
+|---|---|---|---|
+| `ffmpeg` | always | `pacman -S ffmpeg` | `brew install ffmpeg` |
+| `whisper-cli` | transcription | `pacman -S whisper.cpp` (CPU) or build from source for GPU | `brew install whisper-cpp` (Metal) or build |
+| ggml model | transcription | downloaded by `make install` | downloaded by `make install` |
+| `claude` CLI | `--summarizer claude-cli` (default) | comes with [Claude Code](https://claude.com/claude-code) | same |
+| `ANTHROPIC_API_KEY` | `--summarizer claude-api` | env var | env var |
+| `wl-copy` / `xclip` / `pbcopy` | `--copy` flag | wayland default; `pacman -S xclip` for X11 | `pbcopy` ships with macOS |
+
+`make install` walks you through these.
+
+## Building from source
+
+```bash
+go build -o publish .
+```
+
+Zero external Go dependencies — stdlib only. `go.sum` is empty.
+
+```
+.
+├── main.go                  flat flagset, mode dispatch, orchestration
+├── prompts/
+│   ├── church-service.md    summary system prompt
+│   └── clip-selector.md     clip-selector system prompt
+├── internal/
+│   ├── audio/               ffmpeg → 16kHz mono WAV
+│   ├── transcribe/          whisper.cpp wrapper, segments, mm:ss helper
+│   ├── summarize/           pluggable LLM backends (CLI + Anthropic API)
+│   ├── clip/                LLM clip selection + ffmpeg cut
+│   └── output/              markdown→Spotify HTML, clipboard
+├── scripts/install.sh       interactive setup
+├── Makefile
+└── CLAUDE.md                deep architecture / pipeline docs
+```
+
+## Spelling note
+
+Yes, `--summerize` is intentionally spelled with an "e" — sermon + summarize,
+the original name of the project. The internal Go package uses standard
+`summarize`; only the user-facing flag and binary keep the pun.
+
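
The backend order listed under Backends can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the code in this repo: the `backend` struct, `pickBackend`, `errNoBackend`, and the injected `lookPath` / `run` functions are hypothetical names, and the Metal entry is resolved by name through `lookPath` rather than by checking `~/.local/bin` directly. Only the binary names and probe commands are taken from the README.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"runtime"
)

// errNoBackend is returned when no backend passes both checks.
// (Hypothetical name, used only in this sketch.)
var errNoBackend = errors.New("no usable whisper.cpp backend found")

// backend describes one entry in the selection order.
type backend struct {
	name  string   // label for logging
	goos  string   // required OS; empty means any
	probe []string // runtime probe command and args; empty means no probe
	bins  []string // candidate binary names, tried in order
}

// backends mirrors the order documented in the README.
var backends = []backend{
	{name: "metal", goos: "darwin", bins: []string{"whisper-cli-metal"}},
	{name: "cuda", probe: []string{"nvidia-smi", "-L"}, bins: []string{"whisper-cli-cuda"}},
	{name: "rocm", probe: []string{"rocminfo"}, bins: []string{"whisper-cli-rocm"}},
	{name: "vulkan", probe: []string{"vulkaninfo", "--summary"}, bins: []string{"whisper-cli-vulkan"}},
	{name: "cpu", bins: []string{"whisper-cli", "whisper-cpp", "main"}},
}

// pickBackend walks the list and returns the first backend whose probe
// succeeds and whose binary can be located. Any failure falls through to
// the next entry. lookPath and run are injected so the walk can be tested
// without real GPUs or binaries.
func pickBackend(
	goos string,
	lookPath func(string) (string, error),
	run func(string, ...string) error,
) (name, path string, err error) {
	for _, b := range backends {
		if b.goos != "" && b.goos != goos {
			continue // wrong platform
		}
		if len(b.probe) > 0 && run(b.probe[0], b.probe[1:]...) != nil {
			continue // runtime probe failed
		}
		for _, bin := range b.bins {
			if p, lookErr := lookPath(bin); lookErr == nil {
				return b.name, p, nil
			}
		}
	}
	return "", "", errNoBackend
}

func main() {
	// Real wiring: PATH lookup plus running each probe command.
	run := func(cmd string, args ...string) error {
		return exec.Command(cmd, args...).Run()
	}
	name, path, err := pickBackend(runtime.GOOS, exec.LookPath, run)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("backend=%s bin=%s\n", name, path)
}
```

Injecting `lookPath` and `run` keeps the walk testable on a machine with no GPU; `main` wires in `exec.LookPath` and a thin `exec.Command(...).Run()` wrapper for real use.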