Install Taters¶

Use a fresh virtual environment. No, really. It keeps your global Python env from turning into hash browns. Which are delicious, but messy.

A note about the instructions below¶

Okay so... I'm more of a psychologist than a computer scientist. I'm not a hardware guru, and when it comes to CUDA, I'm on the "consumer, not developer" side of things. The instructions below should all work just fine and dandy but, if not, I suspect that an LLM can help you out.

With all of that said, this is the process that I've been using on my server. If you're a desktop/laptop user without a CUDA-supported GPU and CUDA already installed, much of this will not be especially relevant to you. Most functions should still work, but anything that leans on CUDA for speed (e.g., Whisper, diarization, sentence-transformers) will run considerably slower.

0) Prerequisites¶

Python: 3.9+ (CPython recommended).
FFmpeg on your PATH (for audio/video I/O):
  macOS: brew install ffmpeg
  Ubuntu/Debian: sudo apt-get install ffmpeg
(Optional) NVIDIA GPU for faster diarization/embedding workloads.

Tip: For GPU setups, make sure your CUDA/cuDNN versions match the wheels you install. Mismatches are the #1 source of "it compiled but won't load" errors.
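If you'd rather check the basics from Python than by eye, here is a minimal sketch. The function name check_prerequisites is made up for this example (it is not part of Taters), and it only looks at the two hard requirements listed above: the Python version and whether an ffmpeg executable can be found on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)  # mirrors the "Python: 3.9+" prerequisite above


def check_prerequisites():
    """Return a dict describing whether the basic prerequisites are met."""
    ffmpeg_path = shutil.which("ffmpeg")  # None when ffmpeg is not on PATH
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "ffmpeg_path": ffmpeg_path,
        "ffmpeg_ok": ffmpeg_path is not None,
    }


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

If ffmpeg_ok comes back False, install FFmpeg with one of the commands above and open a new shell so the updated PATH is picked up.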

1) Create and activate a virtual environment¶

python -m venv venv-taters
# macOS/Linux
source venv-taters/bin/activate
# Windows PowerShell
# .\venv-taters\Scripts\Activate.ps1
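To confirm that the activation actually took, you can ask the interpreter itself. This is a small sketch using only the standard library; in_virtualenv is a name I picked for the example. It detects venv/virtualenv-style environments only, so a conda environment will report False here even when it is active.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

The printed interpreter path should sit inside venv-taters. If it points somewhere else, your shell is still using a different Python.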

2) Install Taters¶

Quick path (when available)¶

If you see a published build on PyPI (or your internal index):

pip install "taters[all]"

This pulls the core package plus all of the optional extras. You can also do a lighter install that pulls in only the optional dependencies (i.e., "extras") for the things you plan to use. For example:

pip install "taters[readability,cuda]"
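If you're not sure which extras exist, the installed package metadata can tell you. The sketch below uses the standard library's importlib.metadata and assumes the distribution is installed under the name "taters"; declared_extras is a helper name invented for this example. It returns an empty list when the package isn't installed rather than raising.

```python
from importlib import metadata


def declared_extras(distribution="taters"):
    """List the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        return []
    # Each extra shows up as a "Provides-Extra" header in the package metadata.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    print(declared_extras())
```

Whatever names it prints are the ones you can put inside the square brackets of the pip install command.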

Install diarization extras¶

If you're using the diarization wrapper, install these repos:

pip install git+https://github.com/MahmoudAshraf97/demucs.git
pip install git+https://github.com/oliverguhr/deepmultilingualpunctuation.git
pip install git+https://github.com/MahmoudAshraf97/ctc-forced-aligner.git
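After those three installs, a quick way to see whether they landed in the current environment is to look the modules up without importing them. Note that the import names below are my assumption based on the repository names (demucs, deepmultilingualpunctuation, ctc_forced_aligner), and missing_modules is a hypothetical helper, not something Taters ships.

```python
from importlib import util

# Import names are assumptions inferred from the repository names above.
DIARIZATION_MODULES = ("demucs", "deepmultilingualpunctuation", "ctc_forced_aligner")


def missing_modules(names=DIARIZATION_MODULES):
    """Return the module names that cannot be found in this environment."""
    # find_spec locates a top-level module without importing it,
    # so this stays fast even for heavy packages.
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the modules can be found; it does not prove that they import cleanly against your PyTorch build.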

Choose your PyTorch build¶

CUDA 12.4 (recommended for GPU users):

pip install --force-reinstall --no-cache-dir \
  torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
  --index-url https://download.pytorch.org/whl/cu124

Make sure your system CUDA runtime matches the cu124 wheels (CUDA 12.4; cuDNN 9).

CPU-only (works everywhere, slower):

pip install --force-reinstall --no-cache-dir \
  torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
  --index-url https://download.pytorch.org/whl/cpu

(If you don't plan on GPU acceleration, you can skip CUDA entirely.)
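Once PyTorch is in, it's worth confirming which build you actually got. The following is a rough sketch, with torch_build_report being a name chosen for this example. It uses standard PyTorch attributes and degrades gracefully if torch is missing or fails to load.

```python
def torch_build_report():
    """Summarize the installed PyTorch build without failing if it is absent."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        # ImportError: torch is not installed.
        # OSError: a shared library (e.g. CUDA/cuDNN) failed to load.
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "torch": torch.__version__,               # typically ends in +cu124 or +cpu
        "built_for_cuda": torch.version.cuda,     # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
    }


if __name__ == "__main__":
    for key, value in torch_build_report().items():
        print(f"{key}: {value}")
```

For the CUDA 12.4 install above, you'd expect built_for_cuda to read 12.4 and cuda_available to be True. If cuda_available is False on a GPU machine, the driver or runtime is the first place to look.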

3) Verify your setup¶

Sanity check: import & version¶

python - <<'PY'
import sys
import taters
print("python:", sys.version.split()[0])
print("taters version:", getattr(taters, "__version__", "unknown"))
PY

Check FFmpeg¶

ffmpeg -version
# Should print a version; if not, ensure FFmpeg is installed and on PATH

Optional: run a tiny pipeline (CLI)¶

python -m taters.helpers.text_gather --help
python -m taters.text.extract_sentence_embeddings --help

(Every module has a --help with the full set of options.)

4) Picking a device¶

Most commands accept a --device (or device= in Python) parameter:

cuda — fastest with a compatible GPU
cpu — safest for portability
auto — wrappers that support it will pick sensibly based on availability
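To make the three options concrete, here is one way an "auto" setting could be resolved. This is a sketch of the general idea, not Taters' actual implementation; resolve_device is a hypothetical helper, and the only assumption is that CUDA availability is decided by asking PyTorch.

```python
def resolve_device(requested="auto"):
    """Map 'auto' to 'cuda' or 'cpu'; pass explicit choices through unchanged."""
    if requested not in {"auto", "cuda", "cpu"}:
        raise ValueError(f"unknown device: {requested!r}")
    if requested != "auto":
        return requested
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch means no CUDA either
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("auto resolves to:", resolve_device("auto"))
```

Passing cuda explicitly on a machine without a working GPU setup will usually fail later, when a model is loaded, which is why auto or cpu is the safer choice for scripts you plan to share.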

5) Common installation pitfalls¶

CUDA/cuDNN loader errors: your runtime and wheel builds probably don't match. Keep CUDA 12.4, cu124 wheels, and cuDNN 9 aligned.
No ffmpeg found: install FFmpeg and ensure it's on PATH.
Mixed environments: if you see strange import errors, double-check that your shell is using the venv you created.