basicmachines-co · phernandez · Jun 8, 2026 · Jun 7, 2026
diff --git a/docs/litellm-provider.md b/docs/litellm-provider.md
@@ -8,6 +8,14 @@ Cohere, Bedrock, NVIDIA NIM, and other LiteLLM-supported embedding providers.
 Use this page when you want to try a non-default embedding model, validate a provider,
 or tune LiteLLM-specific settings.
 
+> **Experimental — advanced users only.** The LiteLLM provider is experimental and
+> intended for users who are comfortable operating remote embedding backends. It makes
+> paid, networked API calls and requires per-model dimension and input-role configuration;
+> reindexing a real corpus can also be slow and spend provider quota (see
+> [Reindexing with a remote provider](#reindexing-with-a-remote-provider)). For most
+> users, the default local **FastEmbed** provider is the recommended choice. Use LiteLLM
+> only if you know what you're doing.
+
 ## Quick Start
 
 The default LiteLLM model is OpenAI `text-embedding-3-small` through the LiteLLM
@@ -101,6 +109,44 @@ those changes:
 bm reindex --embeddings
 ```
 
+## Reindexing with a remote provider
+
+Embedding a real corpus through a network API is far slower than local FastEmbed, and
+the defaults are tuned for the local case. Two things to know before you run a full
+reindex.
+
+**Raise the sync batch size.** `semantic_embedding_sync_batch_size` defaults to `2`, and
+it — not `semantic_embedding_batch_size` — governs throughput on the sync pipeline. With
+the default, a full reindex can take tens of seconds *per note* against a remote provider.
+Raising both to a larger value turns a multi-minute (or longer) reindex into well under a
+minute for the same corpus:
+
+```bash
+export BASIC_MEMORY_SEMANTIC_EMBEDDING_SYNC_BATCH_SIZE=32
+export BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE=64
+```
+
+Stay within the provider's per-request size and rate limits — Cohere v3, for example,
+accepts up to 96 inputs per embedding request.
+
+**Changing dimensions requires recreating the vector table.** Basic Memory sizes the
+vector table on first index and refuses to mix sizes. Switching to a model with a
+different dimension (for example FastEmbed 384 → OpenAI 1536 → Cohere 1024) makes a plain
+`bm reindex` raise an `Embedding dimension mismatch` error. Recreate the table with a full
+rebuild — files are the source of truth, so this re-indexes from disk and re-embeds
+everything:
+
+```bash
+bm reset --reindex
+```
+
+To trial a provider without disturbing your existing index, point Basic Memory at a
+throwaway config and database instead:
+
+```bash
+export BASIC_MEMORY_CONFIG_DIR=/tmp/bm-litellm-trial
+```
+
 ## Provider Setup Examples
 
 LiteLLM reads provider credentials from the environment. These are the examples

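Note on the `docs/litellm-provider.md` changes above: the trial setup and the batch-size tuning are shown as separate snippets, but they are likely to be used together. The following is a minimal sketch of one shell session that combines them, using only the environment variables and example values that appear in this diff. It assumes provider credentials are already exported, and it only prints the resulting settings; the reindex itself is left as a comment because whether a fresh config directory already has a project to index is not something this diff confirms.

```bash
# Sketch: isolate a LiteLLM trial from the existing index and raise batch sizes.
# The path and numeric values are the examples from the docs, not recommendations.
export BASIC_MEMORY_CONFIG_DIR=/tmp/bm-litellm-trial
export BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER=litellm
export BASIC_MEMORY_SEMANTIC_EMBEDDING_SYNC_BATCH_SIZE=32
export BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE=64

# Show the effective settings before spending provider quota.
env | grep '^BASIC_MEMORY_' | sort

# Next step (not run here): the reindex command from the docs, e.g.
#   bm reindex --embeddings
```

Unsetting `BASIC_MEMORY_CONFIG_DIR` (or opening a new shell) should return Basic Memory to the regular config and index, assuming the variable is only read from the environment.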
diff --git a/docs/semantic-search.md b/docs/semantic-search.md
@@ -99,7 +99,7 @@ All settings are fields on `BasicMemoryConfig` and can be set via environment va
 | Config Field | Env Var | Default | Description |
 |---|---|---|---|
 | `semantic_search_enabled` | `BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED` | Auto (`true` when semantic deps are available) | Enable semantic search. Required before vector/hybrid modes work. |
-| `semantic_embedding_provider` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER` | `"fastembed"` | Embedding provider: `"fastembed"` (local), `"openai"` (API), or `"litellm"` (multi-provider API). |
+| `semantic_embedding_provider` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER` | `"fastembed"` | Embedding provider: `"fastembed"` (local), `"openai"` (API), or `"litellm"` (multi-provider API, **experimental** — advanced users only). |
 | `semantic_embedding_model` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_MODEL` | `"bge-small-en-v1.5"` | Model identifier. Auto-adjusted per provider if left at default. |
 | `semantic_embedding_dimensions` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS` | Provider default | Vector dimensions. 384 for FastEmbed, 1536 for OpenAI/LiteLLM OpenAI. Required when using a non-default LiteLLM model. |
 | `semantic_embedding_forward_dimensions` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_FORWARD_DIMENSIONS` | Auto | LiteLLM-only override for whether configured dimensions are sent as a provider-side output-size request. |
@@ -140,6 +140,8 @@ export OPENAI_API_KEY=sk-...
 
 ### LiteLLM
 
+> **Experimental — advanced users only.** The LiteLLM provider is experimental and aimed at users comfortable operating remote embedding backends: paid API calls, per-model dimension and input-role configuration, and slower reindexing of large corpora. For most users, FastEmbed (local, default) is recommended. See [LiteLLM Provider](litellm-provider.md) for the caveats and tuning.
+
 Uses the LiteLLM SDK to call embedding models from providers such as OpenAI, Cohere, Azure, Bedrock, NVIDIA NIM, and other LiteLLM-supported backends. Requires the provider's API credentials.
 For the full option reference, provider setup examples, and live validation harness, see [LiteLLM Provider](litellm-provider.md).