# publish

Turn a recorded church service into show-notes and a social hook clip in one
pass. Local transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp),
LLM summary via Claude, ffmpeg-cut portrait clip, all wired together in a single Go
CLI.

```
publish [--summerize] [--clip] [--post] [flags] <audio-or-video>
```

## What it does

Given an audio or video recording (mp4, m4a, mp3, wav, ...), `publish` will:

1. **Transcribe** the audio locally with whisper.cpp (CUDA / ROCm / Vulkan /
   Metal / CPU, picked automatically per machine).
2. **Summarize** (`--summerize`) the sermon into a Markdown document with
   speaker, scripture references (KJV), key points, and a memorable quote.
   Optionally also emit Spotify-for-Podcasters-friendly HTML.
3. **Clip** (`--clip`) a 60–90 second hook from the preaching, re-encoded to
   1080×1920 portrait (9:16) with a center-crop, capped at 1 GiB and ready to
   upload to Reels / Shorts / TikTok / X.
4. **Post** (`--post`): Spotify upload integration, not implemented yet.

The transcript is cached at `<input>.segments.json`, so running multiple modes
or re-tuning prompt parameters costs only one whisper run.

## Quick start

```bash
git clone <repo-url> ~/Git\ Repos/summerize
cd ~/Git\ Repos/summerize
make install
```

`make install` is interactive: it detects your OS and GPU, walks you through
installing system dependencies, builds whisper.cpp with the right backend,
downloads a ggml model, and links `publish` + `whisper-cli-<backend>` into
`~/.local/bin`. It is safe to re-run; each step is idempotent.

Then:

```bash
publish --summerize sermon.mp4
publish --clip sermon.mp4
publish --summerize --clip sermon.mp4   # both, one transcribe pass
```

Make sure `~/.local/bin` is on your `PATH`.

### Other Make targets

| target | what it does |
|---|---|
| `make` / `make build` | build `./publish` in the repo |
| `make link` | rebuild + link `./publish` into `~/.local/bin` |
| `make install` | interactive end-to-end setup |
| `make doctor` | print detected OS / GPU / dependencies and exit |
| `make uninstall` | remove the `publish` symlink |
| `make clean` | remove the local `publish` binary |
| `make test` | `go test ./...` |

## Modes

Modes are boolean flags; combine them freely. If no mode flag is set, `publish`
defaults to `--summerize`.

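As a rough illustration of the mode handling described above, the following sketch uses the standard `flag` package. It is an assumption about how the dispatch could look, not the project's `main.go`; the `modes` struct and `parseModes` function are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// modes holds the three mode switches described in this README.
type modes struct {
	Summerize, Clip, Post bool
}

// parseModes parses the mode flags from args and returns the remaining
// positional arguments. If no mode flag is given, Summerize is turned on.
func parseModes(args []string) (modes, []string, error) {
	var m modes
	set := flag.NewFlagSet("publish", flag.ContinueOnError)
	set.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	set.BoolVar(&m.Summerize, "summerize", false, "write a Markdown summary")
	set.BoolVar(&m.Clip, "clip", false, "cut a portrait hook clip")
	set.BoolVar(&m.Post, "post", false, "post to Spotify (stub)")
	if err := set.Parse(args); err != nil {
		return m, nil, err
	}
	if !m.Summerize && !m.Clip && !m.Post {
		m.Summerize = true // default mode
	}
	return m, set.Args(), nil
}

func main() {
	m, rest, err := parseModes([]string{"--clip", "sermon.mp4"})
	fmt.Println(m, rest, err)
}
```

Go's `flag` package accepts both `-clip` and `--clip`, and stops parsing at the first positional argument, which matches the `publish [flags] <audio-or-video>` usage line.
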
### `--summerize`

Markdown summary of the message.

```bash
publish --summerize sermon.mp4
publish --summerize --spotify sermon.html sermon.mp4
publish --summerize --copy sermon.mp4              # Spotify HTML -> clipboard
publish --summerize --prompt "$(cat notes.md)" sermon.mp4
```

Key flags:

| flag | purpose |
|---|---|
| `--md PATH` | Markdown output path; `-` = stdout, `""` = disable. Default `<input>.summary.md` |
| `--spotify PATH` | Also write Spotify-show-notes HTML (subset of HTML their editor accepts) |
| `--copy` | Copy the Spotify HTML to the clipboard (`wl-copy` / `xclip` / `pbcopy`) |
| `--prompt TEXT` | Producer's notes: pre-written framing the LLM treats as authoritative for title, speaker name, key points. The transcript expands and enriches it |

### `--clip`

Pick the best 60–90 second sermon clip and cut it to a portrait social video.

```bash
publish --clip sermon.mp4
publish --clip --min 75 --max 90 sermon.mp4
publish --clip --dry-run sermon.mp4        # show the picked window only
publish --clip --copy-codec sermon.mp4     # fast stream copy (skips 9:16 crop)
```

Key flags:

| flag | purpose |
|---|---|
| `--min` / `--max` | clip length bounds in seconds (default 60 / 90) |
| `--out PATH` | clip output path (default `<input>.clip<ext>`) |
| `--copy-codec` | use `ffmpeg -c copy`; fast, but **skips the 9:16 portrait crop** (stream copy can't apply video filters) |
| `--dry-run` | print the picked window but don't run ffmpeg |

Video clips are re-encoded to **1080×1920 portrait** with a safe center-crop
(`crop=min(iw,ih*9/16):min(ih,iw*16/9)`) that handles any source aspect ratio
without distortion, and capped at 1 GiB via ffmpeg's `-fs`.

### `--post`

Stub, not implemented yet. It will eventually push the Markdown summary to a
Spotify-for-Podcasters episode description.

## Shared flags

| flag | purpose | default |
|---|---|---|
| `--summarizer` | `claude-cli` (shells out to `claude -p`) or `claude-api` (direct Messages API) | `claude-cli` |
| `--model` | model name (Anthropic API path defaults to `claude-sonnet-4-6`) | empty |
| `--prompt-summary` | path to a file that overrides the bundled summary system prompt | bundled |
| `--prompt-clip` | override the bundled clip-selector system prompt | bundled |
| `--whisper-bin` | whisper.cpp binary; auto-detects best backend if empty | auto |
| `--whisper-model` | path to a ggml whisper model (.bin) | `~/.cache/whisper.cpp/ggml-base.en.bin` |
| `--whisper-lang` | force whisper language code | auto-detect |
| `--whisper-threads` | thread count | library default |
| `--segments` | segments JSON cache path | `<input>.segments.json` |
| `--keep-transcript` | also write `<input>.transcript.txt` | off |
| `--keep-wav` | keep the normalized 16 kHz WAV instead of using a tempdir | off |
| `-v` | verbose progress to stderr | off |

> **Note on `--prompt` vs `--prompt-summary`:**
> `--prompt` is **content** (producer's notes that anchor the summary).
> `--prompt-summary` is a **path** to a file that overrides the system prompt template.
> The two are different things; both are intentional.

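The `--summarizer` flag in the table above chooses between two backends. One way such a pluggable backend could be structured is sketched below. The `Summarizer` interface, `cliSummarizer`, and `newSummarizer` are hypothetical names, and passing the system prompt as the `-p` argument with the transcript on stdin is an assumption about the wiring; the `claude-api` path is left out of the sketch.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// Summarizer is a hypothetical interface for a pluggable LLM backend.
type Summarizer interface {
	Summarize(ctx context.Context, systemPrompt, transcript string) (string, error)
}

// cliSummarizer shells out to `claude -p` and feeds the transcript on stdin.
type cliSummarizer struct{ model string }

func (c cliSummarizer) Summarize(ctx context.Context, systemPrompt, transcript string) (string, error) {
	args := []string{"-p", systemPrompt}
	if c.model != "" {
		args = append(args, "--model", c.model)
	}
	cmd := exec.CommandContext(ctx, "claude", args...)
	cmd.Stdin = strings.NewReader(transcript)
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("claude -p: %w", err)
	}
	return out.String(), nil
}

// newSummarizer maps the --summarizer value to a backend.
func newSummarizer(name, model string) (Summarizer, error) {
	switch name {
	case "", "claude-cli":
		return cliSummarizer{model: model}, nil
	case "claude-api":
		return nil, fmt.Errorf("claude-api backend is omitted from this sketch")
	default:
		return nil, fmt.Errorf("unknown summarizer %q", name)
	}
}

func main() {
	s, err := newSummarizer("claude-cli", "")
	fmt.Printf("%T %v\n", s, err)
}
```

Keeping the backends behind one interface means the summary and clip-selection steps can share the same call site regardless of which `--summarizer` value is chosen.
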
## Backends

When `--whisper-bin` is not set, `publish` picks a whisper.cpp backend at
runtime by walking this order:

1. **Metal** (macOS): uses `~/.local/bin/whisper-cli-metal` if present
2. **CUDA**: `whisper-cli-cuda` if `nvidia-smi -L` succeeds
3. **ROCm**: `whisper-cli-rocm` if `rocminfo` succeeds
4. **Vulkan**: `whisper-cli-vulkan` if `vulkaninfo --summary` succeeds
5. **CPU**: `whisper-cli` / `whisper-cpp` / `main` on PATH

Each step requires both the binary and a working runtime probe; failures fall
through to the next backend. The chosen backend is logged on stderr; `-v`
adds diagnostics about which probes were skipped or failed.

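The fall-through order above can be sketched as a simple table walk. This is an illustrative reading of the list, not the project's detection code: the `backend` struct, `order` table, and `pickBackend` function are hypothetical, and treating "running on macOS" as the Metal probe is an assumption.

```go
package main

import "fmt"

// backend pairs whisper.cpp binary names with the command that probes the runtime.
type backend struct {
	Name  string
	Bins  []string // candidate binary names, first match wins
	Probe []string // probe argv; empty means no probe is needed
	OS    string   // required GOOS; empty means any
}

// order mirrors the walk order listed in this README.
var order = []backend{
	{Name: "metal", Bins: []string{"whisper-cli-metal"}, OS: "darwin"},
	{Name: "cuda", Bins: []string{"whisper-cli-cuda"}, Probe: []string{"nvidia-smi", "-L"}},
	{Name: "rocm", Bins: []string{"whisper-cli-rocm"}, Probe: []string{"rocminfo"}},
	{Name: "vulkan", Bins: []string{"whisper-cli-vulkan"}, Probe: []string{"vulkaninfo", "--summary"}},
	{Name: "cpu", Bins: []string{"whisper-cli", "whisper-cpp", "main"}},
}

// pickBackend returns the first backend whose probe succeeds and whose binary
// exists. hasBin and probeOK are injected; a real caller could back them with
// exec.LookPath and exec.Command(...).Run().
func pickBackend(goos string, hasBin func(string) bool, probeOK func([]string) bool) (name, bin string, err error) {
	for _, b := range order {
		if b.OS != "" && b.OS != goos {
			continue // wrong platform, fall through
		}
		if len(b.Probe) > 0 && !probeOK(b.Probe) {
			continue // runtime probe failed, fall through
		}
		for _, candidate := range b.Bins {
			if hasBin(candidate) {
				return b.Name, candidate, nil
			}
		}
	}
	return "", "", fmt.Errorf("no whisper.cpp backend found")
}

func main() {
	hasBin := func(bin string) bool { return bin == "whisper-cli-vulkan" || bin == "whisper-cli" }
	probeOK := func(argv []string) bool { return argv[0] == "vulkaninfo" }
	name, bin, err := pickBackend("linux", hasBin, probeOK)
	fmt.Println(name, bin, err)
}
```
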
`make install` builds the right backend for your machine. Per-platform build
recipes (CUDA, ROCm, Vulkan, Metal) live in [CLAUDE.md](./CLAUDE.md#backend-auto-detect).

## Platform support

| OS | tested | notes |
|---|---|---|
| Arch Linux | yes | `pacman` for system deps; CUDA / ROCm / Vulkan all supported |
| macOS (Apple Silicon) | install path | `brew` + Xcode CLT; Metal acceleration |
| Debian / Ubuntu | install path | `apt`; GPU runtime install is distro-specific, prints the package list |
| Fedora | install path | `dnf`; same caveat as Debian |
| Windows | no | not supported |

## Output

A typical `publish --summerize sermon.mp4` produces `sermon.summary.md` that
looks roughly like:

```markdown
# Fatal Sleep

**Speaker:** Rev. Hayford
**Scripture:** Romans 13:11–14, 1 Thessalonians 5:6–8, Ephesians 5:14

## Overview
A 2-4 sentence plain-English summary of the central message.

## Key Points
- ...

## Scripture & References
- Romans 13:11 — "And that, knowing the time, that now it is high time to awake out of sleep..."
- ...

## Application / Call to Action
- ...

## Memorable Quote
> "..."
```

`--spotify path.html` writes the same content as the small HTML subset that
Spotify-for-Podcasters' show-notes editor accepts (`<b>`, `<i>`, `<a>`,
`<ul>`, `<ol>`, `<li>`, `<p>`).

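As an illustration of what mapping the summary onto that tag subset might involve, here is a deliberately small converter. It is a sketch under stated assumptions, not the project's `internal/output` code: `toSpotifyHTML` and `inline` are hypothetical names, only headings, bullets, block quotes, bold text, and plain paragraphs are handled, and rendering headings as bold paragraphs is a guess at how the missing heading tags could be worked around.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

var bold = regexp.MustCompile(`\*\*(.+?)\*\*`)

// inline escapes HTML and turns **text** into <b>text</b>.
func inline(s string) string {
	return bold.ReplaceAllString(html.EscapeString(s), "<b>$1</b>")
}

// toSpotifyHTML converts a small Markdown subset into HTML that uses only
// <b>, <i>, <ul>, <li>, and <p>. Headings become bold paragraphs and block
// quotes become italic paragraphs, since heading tags are not in the subset.
func toSpotifyHTML(md string) string {
	var out strings.Builder
	inList := false
	closeList := func() {
		if inList {
			out.WriteString("</ul>")
			inList = false
		}
	}
	for _, line := range strings.Split(md, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			closeList()
		case strings.HasPrefix(line, "- "):
			if !inList {
				out.WriteString("<ul>")
				inList = true
			}
			out.WriteString("<li>" + inline(line[2:]) + "</li>")
		case strings.HasPrefix(line, "#"):
			closeList()
			out.WriteString("<p><b>" + inline(strings.TrimLeft(line, "# ")) + "</b></p>")
		case strings.HasPrefix(line, "> "):
			closeList()
			out.WriteString("<p><i>" + inline(line[2:]) + "</i></p>")
		default:
			closeList()
			out.WriteString("<p>" + inline(line) + "</p>")
		}
	}
	closeList()
	return out.String()
}

func main() {
	fmt.Println(toSpotifyHTML("# Fatal Sleep\n\n**Speaker:** Rev. Hayford\n\n- one\n- two"))
}
```

Escaping the text before applying the bold rule keeps any `<` or `&` in the transcript from being interpreted as markup by the show-notes editor.
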
`--clip` writes `<input>.clip.mp4` (or `.m4a` for audio inputs). Video clips
are 1080×1920 portrait, ≤1 GiB.

## Requirements

| tool | required for | install on Arch | install on macOS |
|---|---|---|---|
| `ffmpeg` | always | `pacman -S ffmpeg` | `brew install ffmpeg` |
| `whisper-cli` | transcription | `pacman -S whisper.cpp` (CPU) or build from source for GPU | `brew install whisper-cpp` (Metal) or build |
| ggml model | transcription | downloaded by `make install` | downloaded by `make install` |
| `claude` CLI | `--summarizer claude-cli` (default) | comes with [Claude Code](https://claude.com/claude-code) | same |
| `ANTHROPIC_API_KEY` | `--summarizer claude-api` | env var | env var |
| `wl-copy` / `xclip` / `pbcopy` | `--copy` flag | `wl-copy` is the default on Wayland; `pacman -S xclip` for X11 | `pbcopy` ships with macOS |

`make install` walks you through these.

## Building from source

```bash
go build -o publish .
```

Zero external Go dependencies: stdlib only. `go.sum` is empty.

```
.
├── main.go                  flat flagset, mode dispatch, orchestration
├── prompts/
│   ├── church-service.md    summary system prompt
│   └── clip-selector.md     clip-selector system prompt
├── internal/
│   ├── audio/               ffmpeg → 16kHz mono WAV
│   ├── transcribe/          whisper.cpp wrapper, segments, mm:ss helper
│   ├── summarize/           pluggable LLM backends (CLI + Anthropic API)
│   ├── clip/                LLM clip selection + ffmpeg cut
│   └── output/              markdown→Spotify HTML, clipboard
├── scripts/install.sh       interactive setup
├── Makefile
└── CLAUDE.md                deep architecture / pipeline docs
```

## Spelling note

Yes, `--summerize` is intentionally spelled with an "e": sermon + summarize,
the original name of the project. The internal Go package uses the standard
`summarize`; only the user-facing flag keeps the pun (the binary itself is
named `publish`).