
# Backends (execution engines)

AbstractVision executes tasks via a VisionBackend adapter (../../src/abstractvision/backends/base_backend.py). VisionManager is intentionally thin and delegates to the configured backend (../../src/abstractvision/vision_manager.py).

See also:

- Getting started (REPL examples): docs/getting-started.md
- Configuration (env vars / CLI flags): docs/reference/configuration.md

## Support matrix (built-in backends)

| Backend | Implementation | Tasks implemented | Notes |
| --- | --- | --- | --- |
| OpenAI-compatible HTTP | openai_compatible.py | text_to_image, image_to_image (+ optional text_to_video, image_to_video) | Stdlib-only (urllib). Video is opt-in via configured paths. |
| Diffusers (local) | huggingface_diffusers.py | text_to_image, image_to_image | Requires abstractvision[diffusers]. Supports cache-only/offline mode. |
| stable-diffusion.cpp (local GGUF/checkpoints) | stable_diffusion_cpp.py | text_to_image, image_to_image | Uses external sd-cli if present, else abstractvision[sdcpp] Python bindings. Start with single-file Stable Diffusion models; Qwen/FLUX GGUF may need VAE + LLM components. |

Notes:

- multi_view_image (VisionManager.generate_angles) is part of the public API, but no built-in backend implements it yet (all raise CapabilityNotSupportedError today).
- Backends may also expose best-effort get_capabilities(), preload(), unload(), generate_image_with_progress(...), and edit_image_with_progress(...) hooks via the shared VisionBackend contract.
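
The shared contract can be pictured as a small structural interface. The sketch below is illustrative only: the real base class lives in base_backend.py, and the exact signatures and return types shown here are assumptions, not the library's API.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VisionBackendLike(Protocol):
    """Loose structural sketch of the adapter surface described above."""

    def get_capabilities(self) -> dict: ...
    def preload(self) -> None: ...
    def unload(self) -> None: ...


class DummyBackend:
    """Toy adapter: reports task support and no-ops the lifecycle hooks."""

    def get_capabilities(self) -> dict:
        # No built-in backend implements multi_view_image today.
        return {"text_to_image": True, "multi_view_image": False}

    def preload(self) -> None:
        pass  # a real backend would load weights or warm a connection here

    def unload(self) -> None:
        pass  # a real backend would free memory here


backend = DummyBackend()
assert isinstance(backend, VisionBackendLike)  # structural check, not inheritance
```

Because the check is structural (`runtime_checkable`), any object exposing these hooks satisfies the sketch; the real VisionBackend contract may of course require more than this.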

## OpenAI-compatible HTTP backend

When to use:

- You already run a service that exposes OpenAI-shaped endpoints (local or remote).
- You want to keep inference out-of-process.

Core config:

- base_url (required): points to a /v1-style root, e.g. http://localhost:1234/v1
- api_key (optional): sent as Authorization: Bearer ...
- model_id (optional): forwarded as model in requests
- models_path (default /models): provider catalog path for explicit model listing

Request shape:

- Unknown/local endpoints receive local extension fields when provided, including steps, seed, guidance_scale, negative_prompt, width, and height.
- Endpoints that look like the real OpenAI API, and known OpenAI image models, use the narrower OpenAI request shape; GPT image models do not receive unsupported local-only fields such as steps, seed, or guidance_scale.
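
The filtering rule above can be sketched as a small payload builder. This is not the library's actual implementation: the helper name, the `openai_shape` flag, and the model ids in the usage lines are all illustrative; only the field names come from the description above.

```python
def build_generation_payload(prompt, model=None, openai_shape=False, **extras):
    """Sketch of the request-shaping rule: local endpoints get extension
    fields, OpenAI-shaped endpoints get only the narrow OpenAI payload."""
    LOCAL_ONLY = {"steps", "seed", "guidance_scale",
                  "negative_prompt", "width", "height"}
    payload = {"prompt": prompt}
    if model:
        payload["model"] = model
    for key, value in extras.items():
        if value is None:
            continue  # omit unset fields entirely
        if openai_shape and key in LOCAL_ONLY:
            continue  # drop local-only extensions for real OpenAI models
        payload[key] = value
    return payload


# Unknown/local endpoint: extension fields pass through.
local = build_generation_payload("a red fox", model="sd15", steps=20, seed=7)

# OpenAI-shaped endpoint: the same call yields the narrower payload.
openai = build_generation_payload("a red fox", model="gpt-image-1",
                                  steps=20, seed=7, openai_shape=True)
```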

Provider model catalogs:

- OpenAICompatibleVisionBackend.list_provider_models(...) queries GET /models by default.
- VisionManager.list_provider_models(...) delegates to the configured backend.
- The AbstractCore plugin exposes the same catalog through llm.vision.list_provider_models(...).
- CLI examples: abstractvision provider-models --openai --task text_to_image and abstractvision provider-models --base-url http://localhost:1234/v1 --task text_to_image.
- Listing is explicit; AbstractVision does not use provider catalogs to silently select a model.
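
For reference, an OpenAI-style GET /models response wraps the catalog in a top-level "data" list. The sketch below shows only that response shape (the model ids are made up); the real backend fetches it with stdlib urllib rather than a helper like this.

```python
import json


def parse_models_response(body: str) -> list:
    """Extract model ids from an OpenAI-style GET /models response body."""
    doc = json.loads(body)
    return [entry["id"] for entry in doc.get("data", [])]


sample = '{"object": "list", "data": [{"id": "sdxl-turbo"}, {"id": "flux-schnell"}]}'
print(parse_models_response(sample))  # -> ['sdxl-turbo', 'flux-schnell']
```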

Code pointers:

- Config: OpenAICompatibleBackendConfig (../../src/abstractvision/backends/openai_compatible.py)
- Backend: OpenAICompatibleVisionBackend (../../src/abstractvision/backends/openai_compatible.py)

Video endpoints (optional):

OpenAICompatibleVisionBackend only enables:

- text_to_video if text_to_video_path is set
- image_to_video if image_to_video_path is set
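
The opt-in rule is simple enough to state as a tiny function. This is a sketch of the behavior, not the library's code, and the example path "/v1/videos" is hypothetical.

```python
def enabled_video_tasks(text_to_video_path=None, image_to_video_path=None):
    """Mirror of the opt-in rule: a video task is advertised only when
    its endpoint path is configured."""
    tasks = set()
    if text_to_video_path:
        tasks.add("text_to_video")
    if image_to_video_path:
        tasks.add("image_to_video")
    return tasks


assert enabled_video_tasks() == set()  # no paths configured, no video tasks
assert enabled_video_tasks(text_to_video_path="/v1/videos") == {"text_to_video"}
```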

## Diffusers backend (local)

When to use:

- You want local inference for Diffusers pipelines.
- Start with runwayml/stable-diffusion-v1-5 for the lowest-risk local test.
- Move to black-forest-labs/FLUX.2-klein-4B after that if you want a newer non-gated model and can install Diffusers main.

Install:

- pip install "abstractvision[diffusers]"
- For newer/unreleased pipeline classes: pip install "abstractvision[diffusers-dev]" plus Diffusers from source.

Code pointers:

- Config: HuggingFaceDiffusersBackendConfig (../../src/abstractvision/backends/huggingface_diffusers.py)
- Backend: HuggingFaceDiffusersVisionBackend (../../src/abstractvision/backends/huggingface_diffusers.py)

Offline / cache-only mode:

The Python backend and REPL are cache-only by default (allow_download=False). Pre-download model weights separately, or set allow_download=True / ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1 when runtime downloads are desired (see config/env in docs/reference/configuration.md).
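
The gating logic can be sketched as follows. This is an assumption about behavior, not the library's code: the env var name comes from the text above, but the helper name and the exact set of accepted truthy values are illustrative.

```python
import os


def downloads_allowed(config_allow_download=False):
    """Sketch of the cache-only default: downloads happen only when the
    config flag or the environment variable opts in."""
    env = os.environ.get("ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD", "")
    return config_allow_download or env.strip() in {"1", "true", "yes"}


os.environ.pop("ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD", None)
assert downloads_allowed() is False  # cache-only by default
```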

Config fields:

- model_id, device, torch_dtype
- allow_download, auto_retry_fp32
- cache_dir, revision, variant
- use_safetensors, low_cpu_mem_usage

## stable-diffusion.cpp backend (local GGUF/checkpoints)

When to use:

- You want to run single-file Stable Diffusion checkpoints/GGUF or component-based GGUF diffusion models locally.

Runtime modes (auto-selected):

- CLI mode via sd-cli (the stable-diffusion.cpp executable) when available in PATH
- Python mode via stable-diffusion-cpp-python when sd-cli is not available
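
The auto-selection boils down to a PATH lookup. A minimal sketch of that rule (helper name and the explicit `sd_cli_path` override are illustrative, not the backend's actual code):

```python
import shutil


def select_runtime_mode(sd_cli_path=None, executable="sd-cli"):
    """Sketch of the auto-selection rule: prefer the sd-cli executable,
    fall back to the Python bindings when it is absent."""
    candidate = sd_cli_path or shutil.which(executable)
    return "cli" if candidate else "python"


# An explicit path forces CLI mode; a missing executable falls back.
assert select_runtime_mode("/opt/sd/sd-cli") == "cli"
```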

Notes:

- If you care about GPU acceleration (macOS Metal, NVIDIA CUDA, etc.), prefer CLI mode via sd-cli.
- Python bindings run whatever backend the installed wheel was built with. On macOS, that often means CPU-only, so FLUX/Qwen-class models can be extremely slow.
- The optional Python binding is pinned below 0.4.6 because that sdist currently omits vendored CMake files needed by native Linux builds.
- REPL selection supports both /backend sdcpp <model.gguf|model.safetensors> [sd_cli_path] and /backend sdcpp <diffusion_model.gguf> <vae.safetensors> <llm.gguf> [sd_cli_path].
- Python code and AbstractCore plugin configuration can also pass component paths such as clip_l, clip_g, t5xxl, llm_vision, plus extra_args, timeout_s, and cwd.

Code pointers:

- Config: StableDiffusionCppBackendConfig (../../src/abstractvision/backends/stable_diffusion_cpp.py)
- Backend: StableDiffusionCppVisionBackend (../../src/abstractvision/backends/stable_diffusion_cpp.py)