adding readme

2026-05-10 13:43:17 -06:00
parent 54629aecad
commit 70c0eea31f

README.md Normal file

@@ -0,0 +1,245 @@
# publish
Turn a recorded church service into show-notes and a social hook clip in one
pass. Local transcription via [whisper.cpp](https://github.com/ggerganov/whisper.cpp),
LLM summary via Claude, ffmpeg-cut portrait clip — wired together in a single Go
CLI.
```
publish [--summerize] [--clip] [--post] [flags] <audio-or-video>
```
## What it does
Given an audio or video recording (mp4, m4a, mp3, wav, ...), `publish` will:
1. **Transcribe** the audio locally with whisper.cpp (CUDA / ROCm / Vulkan /
Metal / CPU — picked automatically per machine).
2. **Summarize** (`--summerize`) the sermon into a Markdown document with
speaker, scripture references (KJV), key points, and a memorable quote.
Optionally also emit Spotify-for-Podcasters-friendly HTML.
3. **Clip** (`--clip`) a 60–90 second hook from the preaching, re-encoded to
1080×1920 portrait (9:16) with a center-crop, capped at 1 GiB — ready to
upload to Reels / Shorts / TikTok / X.
4. **Post** (`--post`) — Spotify upload integration, not implemented yet.
The transcript is cached at `<input>.segments.json`, so running multiple modes
or re-tuning prompt parameters still costs only one whisper run.
## Quick start
```bash
git clone <repo-url> ~/Git\ Repos/summerize
cd ~/Git\ Repos/summerize
make install
```
`make install` is interactive: it detects your OS and GPU, walks you through
installing system dependencies, builds whisper.cpp with the right backend,
downloads a ggml model, and links `publish` + `whisper-cli-<backend>` into
`~/.local/bin`. Re-runnable; each step is idempotent.
Then:
```bash
publish --summerize sermon.mp4
publish --clip sermon.mp4
publish --summerize --clip sermon.mp4 # both, one transcribe pass
```
Make sure `~/.local/bin` is on your `PATH`.
### Other Make targets
| target | what it does |
|---|---|
| `make` / `make build` | build `./publish` in the repo |
| `make link` | rebuild + link `./publish` into `~/.local/bin` |
| `make install` | interactive end-to-end setup |
| `make doctor` | print detected OS / GPU / dependencies and exit |
| `make uninstall` | remove the `publish` symlink |
| `make clean` | remove the local `publish` binary |
| `make test` | `go test ./...` |
## Modes
Modes are boolean flags; combine them freely. If none is set, `publish` defaults to `--summerize`.
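
The default-mode rule can be sketched with Go's standard `flag` package. This is an illustration under assumptions, not the code in `main.go`: the `modes` struct and `parseModes` helper are hypothetical names. Go's `flag` package treats `-x` and `--x` as equivalent and stops parsing at the first non-flag argument, which fits the `publish [flags] <audio-or-video>` usage line.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// modes is a hypothetical container for the three mode flags.
type modes struct {
	Summerize, Clip, Post bool
}

// parseModes parses the mode flags and the single input path, and applies
// the "defaults to --summerize if none set" rule.
func parseModes(args []string) (modes, string, error) {
	flags := flag.NewFlagSet("publish", flag.ContinueOnError)
	flags.SetOutput(io.Discard) // keep the sketch quiet on bad input

	var m modes
	flags.BoolVar(&m.Summerize, "summerize", false, "write a Markdown summary")
	flags.BoolVar(&m.Clip, "clip", false, "cut a portrait hook clip")
	flags.BoolVar(&m.Post, "post", false, "post to Spotify (stub)")

	if err := flags.Parse(args); err != nil {
		return modes{}, "", err
	}
	if flags.NArg() != 1 {
		return modes{}, "", fmt.Errorf("expected exactly one input file, got %d", flags.NArg())
	}
	if !m.Summerize && !m.Clip && !m.Post {
		m.Summerize = true // no mode given: fall back to the summary
	}
	return m, flags.Arg(0), nil
}

func main() {
	for _, args := range [][]string{
		{"sermon.mp4"},
		{"--clip", "sermon.mp4"},
		{"--summerize", "--clip", "sermon.mp4"},
	} {
		m, input, err := parseModes(args)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v -> %+v input=%s\n", args, m, input)
	}
}
```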
### `--summerize`
Markdown summary of the message.
```bash
publish --summerize sermon.mp4
publish --summerize --spotify sermon.html sermon.mp4
publish --summerize --copy sermon.mp4 # Spotify HTML -> clipboard
publish --summerize --prompt "$(cat notes.md)" sermon.mp4
```
Key flags:
| flag | purpose |
|---|---|
| `--md PATH` | Markdown output path; `-` = stdout, `""` = disable. Default `<input>.summary.md` |
| `--spotify PATH` | Also write Spotify-show-notes HTML (subset of HTML their editor accepts) |
| `--copy` | Copy the Spotify HTML to the clipboard (`wl-copy` / `xclip` / `pbcopy`) |
| `--prompt TEXT` | Producer's notes — pre-written framing the LLM treats as authoritative for title, speaker name, key points. The transcript expands and enriches it |
### `--clip`
Pick the best 60–90 second sermon clip and cut it to a portrait social video.
```bash
publish --clip sermon.mp4
publish --clip --min 75 --max 90 sermon.mp4
publish --clip --dry-run sermon.mp4 # show the picked window only
publish --clip --copy-codec sermon.mp4 # fast stream copy (skips 9:16 crop)
```
Key flags:
| flag | purpose |
|---|---|
| `--min` / `--max` | clip length bounds in seconds (default 60 / 90) |
| `--out PATH` | clip output path (default `<input>.clip<ext>`) |
| `--copy-codec` | use `ffmpeg -c copy` — fast, but **skips the 9:16 portrait crop** (stream copy can't apply video filters) |
| `--dry-run` | print the picked window but don't run ffmpeg |
Video clips are re-encoded to **1080×1920 portrait** with a safe center-crop
(`crop=min(iw,ih*9/16):min(ih,iw*16/9)`) that handles any source aspect ratio
without distortion, and capped at 1 GiB via ffmpeg's `-fs`.
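
To see what the crop expression does to different sources, the sketch below evaluates it with integer arithmetic and assembles a plausible ffmpeg argument list. It is a rough illustration only: `cropSize` and `clipArgs` are hypothetical helpers, the `scale=1080:1920` step and the `-ss` / `-t` placement are assumptions about how the final size and window are applied, and ffmpeg's own rounding may differ from Go's integer division by a pixel.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxClipBytes is 1 GiB, the limit handed to ffmpeg's -fs option (in bytes).
const maxClipBytes = 1 << 30

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// cropSize evaluates crop=min(iw,ih*9/16):min(ih,iw*16/9) for a source of
// iw x ih pixels, using integer division.
func cropSize(iw, ih int) (w, h int) {
	return minInt(iw, ih*9/16), minInt(ih, iw*16/9)
}

// clipArgs builds a hypothetical ffmpeg argument list for one clip.
// Single quotes inside the filter protect the commas in min(...).
func clipArgs(in, out string, startSec, durSec float64) []string {
	vf := "crop='min(iw,ih*9/16)':'min(ih,iw*16/9)',scale=1080:1920"
	return []string{
		"-ss", strconv.FormatFloat(startSec, 'f', 3, 64),
		"-i", in,
		"-t", strconv.FormatFloat(durSec, 'f', 3, 64),
		"-vf", vf,
		"-fs", strconv.Itoa(maxClipBytes),
		out,
	}
}

func main() {
	for _, src := range [][2]int{{1920, 1080}, {1080, 1920}, {1000, 1000}} {
		w, h := cropSize(src[0], src[1])
		fmt.Printf("%dx%d -> crop %dx%d\n", src[0], src[1], w, h)
	}
	// prints:
	// 1920x1080 -> crop 607x1080
	// 1080x1920 -> crop 1080x1920
	// 1000x1000 -> crop 562x1000
	fmt.Println(clipArgs("sermon.mp4", "sermon.clip.mp4", 1325, 75))
}
```

A landscape source loses its sides, a source that is already 9:16 passes through untouched, and in every case the cropped region has a 9:16 aspect ratio before scaling, which is why no stretching occurs.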
### `--post`
Stub. Will eventually push the markdown summary to a Spotify-for-Podcasters
episode description.
## Shared flags
| flag | purpose | default |
|---|---|---|
| `--summarizer` | `claude-cli` (shells out to `claude -p`) or `claude-api` (direct Messages API) | `claude-cli` |
| `--model` | model name (Anthropic API path defaults to `claude-sonnet-4-6`) | empty |
| `--prompt-summary` | path to a file that overrides the bundled summary system prompt | bundled |
| `--prompt-clip` | override the bundled clip-selector system prompt | bundled |
| `--whisper-bin` | whisper.cpp binary; auto-detects best backend if empty | auto |
| `--whisper-model` | path to a ggml whisper model (.bin) | `~/.cache/whisper.cpp/ggml-base.en.bin` |
| `--whisper-lang` | force whisper language code | auto-detect |
| `--whisper-threads` | thread count | library default |
| `--segments` | segments JSON cache path | `<input>.segments.json` |
| `--keep-transcript` | also write `<input>.transcript.txt` | off |
| `--keep-wav` | keep the normalized 16kHz WAV instead of using a tempdir | off |
| `-v` | verbose progress to stderr | off |
> **Note on `--prompt` vs `--prompt-summary`:**
> `--prompt` is **content** (producer's notes that anchor the summary).
> `--prompt-summary` is a **path** to override the system prompt template.
> Different things; both are intentional.
## Backends
When `--whisper-bin` is not set, `publish` picks a whisper.cpp backend at
runtime by walking this order:
1. **Metal** (macOS): uses `~/.local/bin/whisper-cli-metal` if present
2. **CUDA**: `whisper-cli-cuda` if `nvidia-smi -L` succeeds
3. **ROCm**: `whisper-cli-rocm` if `rocminfo` succeeds
4. **Vulkan**: `whisper-cli-vulkan` if `vulkaninfo --summary` succeeds
5. **CPU**: `whisper-cli` / `whisper-cpp` / `main` on PATH
Each step requires both the binary and a working runtime probe; failures fall
through to the next backend. The chosen backend is logged on stderr; `-v`
adds diagnostics about which probes were skipped or failed.
`make install` builds the right backend for your machine. Per-platform build
recipes (CUDA, ROCm, Vulkan, Metal) live in [CLAUDE.md](./CLAUDE.md#backend-auto-detect).
## Platform support
| OS | tested | notes |
|---|---|---|
| Arch Linux | yes | `pacman` for system deps; CUDA / ROCm / Vulkan all supported |
| macOS (Apple Silicon) | install path | `brew` + Xcode CLT; Metal acceleration |
| Debian / Ubuntu | install path | `apt`; GPU runtime install is distro-specific, so the installer prints the package list |
| Fedora | install path | `dnf`; same caveat as Debian |
| Windows | no | not supported |
## Output
A typical `publish --summerize sermon.mp4` produces `sermon.summary.md` that
looks roughly like:
```markdown
# Fatal Sleep
**Speaker:** Rev. Hayford
**Scripture:** Romans 13:11–14, 1 Thessalonians 5:6–8, Ephesians 5:14
## Overview
A 2-4 sentence plain-English summary of the central message.
## Key Points
- ...
## Scripture & References
- Romans 13:11 — "And that, knowing the time, that now it is high time to awake out of sleep..."
- ...
## Application / Call to Action
- ...
## Memorable Quote
> "..."
```
`--spotify path.html` writes the same content as the small HTML subset that
Spotify-for-Podcasters' show-notes editor accepts (`<b>`, `<i>`, `<a>`,
`<ul>`, `<ol>`, `<li>`, `<p>`).
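
As a quick way to check hand-edited show-notes against that tag subset, something like the following could be used. It is a rough sketch and not part of `internal/output`: `disallowedTags` is a hypothetical helper, and it uses a simple regular-expression scan rather than a real HTML parser, so stray `<` characters in text may produce false positives.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// allowed lists the tags named in this README as accepted by the editor.
var allowed = map[string]bool{
	"b": true, "i": true, "a": true,
	"ul": true, "ol": true, "li": true, "p": true,
}

// tagRe matches the name of an opening or closing tag.
var tagRe = regexp.MustCompile(`</?\s*([A-Za-z][A-Za-z0-9]*)`)

// disallowedTags returns each tag name in html that is outside the
// allowed subset, lowercased, in order of first appearance.
func disallowedTags(html string) []string {
	var bad []string
	seen := map[string]bool{}
	for _, m := range tagRe.FindAllStringSubmatch(html, -1) {
		tag := strings.ToLower(m[1])
		if !allowed[tag] && !seen[tag] {
			seen[tag] = true
			bad = append(bad, tag)
		}
	}
	return bad
}

func main() {
	fmt.Println(disallowedTags(`<p><b>Fatal Sleep</b></p><ul><li>Romans 13:11</li></ul>`)) // prints "[]"
	fmt.Println(disallowedTags(`<h1>Fatal Sleep</h1><p>Overview</p>`))                     // prints "[h1]"
}
```
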
`--clip` writes `<input>.clip.mp4` (or `.m4a` for audio inputs). Video clips
are 1080×1920 portrait, ≤1 GiB.
## Requirements
| tool | required for | install on Arch | install on macOS |
|---|---|---|---|
| `ffmpeg` | always | `pacman -S ffmpeg` | `brew install ffmpeg` |
| `whisper-cli` | transcription | `pacman -S whisper.cpp` (CPU) or build from source for GPU | `brew install whisper-cpp` (Metal) or build |
| ggml model | transcription | downloaded by `make install` | downloaded by `make install` |
| `claude` CLI | `--summarizer claude-cli` (default) | comes with [Claude Code](https://claude.com/claude-code) | same |
| `ANTHROPIC_API_KEY` | `--summarizer claude-api` | env var | env var |
| `wl-copy` / `xclip` / `pbcopy` | `--copy` flag | wayland default; `pacman -S xclip` for X11 | `pbcopy` ships with macOS |
`make install` walks you through these.
## Building from source
```bash
go build -o publish .
```
Zero external Go dependencies — stdlib only. `go.sum` is empty.
```
.
├── main.go flat flagset, mode dispatch, orchestration
├── prompts/
│ ├── church-service.md summary system prompt
│ └── clip-selector.md clip-selector system prompt
├── internal/
│ ├── audio/ ffmpeg → 16kHz mono WAV
│ ├── transcribe/ whisper.cpp wrapper, segments, mm:ss helper
│ ├── summarize/ pluggable LLM backends (CLI + Anthropic API)
│ ├── clip/ LLM clip selection + ffmpeg cut
│ └── output/ markdown→Spotify HTML, clipboard
├── scripts/install.sh interactive setup
├── Makefile
└── CLAUDE.md deep architecture / pipeline docs
```
## Spelling note
Yes, `--summerize` is intentionally spelled with an "e": sermon + summarize,
the original name of the project. The internal Go package uses the standard
`summarize`; only the user-facing flag keeps the pun (the binary itself is `publish`).