# LLMs and Embeddings

A practical guide to configuring and using Large Language Models (LLMs) and embedders in MemOS.

## Overview

MemOS decouples model logic from runtime config via two Pydantic factories:

| Factory | Produces | Typical backends |
|---|---|---|
| `LLMFactory` | Chat-completion model | `ollama`, `openai`, `qwen`, `deepseek`, `huggingface` |
| `EmbedderFactory` | Text-to-vector encoder | `ollama`, `sentence_transformer`, `universal_api` |

Both factories accept a `*ConfigFactory.model_validate(...)` blob, so you can switch providers with a single `backend=` swap.
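For example, moving from a local Ollama model to an OpenAI-compatible endpoint is just a `backend` swap plus the backend-specific `config` payload. A minimal sketch (the model names and the `api_key` value are illustrative placeholders):

```python
# Two config blobs that differ only in `backend` and the backend-specific payload.
ollama_cfg = {
    "backend": "ollama",
    "config": {"model_name_or_path": "qwen3:0.6b"},
}
openai_cfg = {
    "backend": "openai",
    "config": {
        "model_name_or_path": "gpt-4o-mini",
        "api_key": "YOUR_API_KEY",  # placeholder credential
    },
}
# Either dict can be passed to LLMConfigFactory.model_validate(...).
```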

## LLM Module

### Supported LLM Backends

| Backend | Notes | Example Model Id |
|---|---|---|
| `ollama` | Local llama-cpp runner | `qwen3:0.6b` etc. |
| `openai` | Official or proxy | `gpt-4o-mini`, `gpt-3.5-turbo` etc. |
| `qwen` | DashScope-compatible | `qwen-plus`, `qwen-max-2025-01-25` etc. |
| `deepseek` | DeepSeek REST API | `deepseek-chat`, `deepseek-reasoner` etc. |
| `huggingface` | Transformers pipeline | `Qwen/Qwen3-1.7B` etc. |

### LLM Config Schema

Common fields:

| Field | Type | Default | Description |
|---|---|---|---|
| `model_name_or_path` | `str` | – | Model id or local tag |
| `temperature` | `float` | `0.8` | Sampling temperature |
| `max_tokens` | `int` | `1024` | Maximum tokens to generate |
| `top_p` / `top_k` | `float` / `int` | `0.9` / `50` | Nucleus / top-k sampling cutoffs |
| API-specific | e.g. `api_key`, `api_base` | – | OpenAI-compatible creds |
| `remove_think_prefix` | `bool` | `True` | Strip `/think` role content |
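Putting the common fields together, a fully spelled-out config blob might look like the sketch below. Every value is either a default from the table above or a placeholder; which optional fields a given backend actually accepts depends on that backend.

```python
# Illustrative config blob exercising the common schema fields.
llm_config = {
    "backend": "openai",
    "config": {
        "model_name_or_path": "gpt-4o-mini",       # model id
        "temperature": 0.8,                        # default
        "max_tokens": 1024,                        # default
        "top_p": 0.9,                              # default
        "top_k": 50,                               # default
        "api_key": "YOUR_API_KEY",                 # placeholder credential
        "api_base": "https://api.openai.com/v1",   # OpenAI-compatible endpoint
        "remove_think_prefix": True,               # default
    },
}
```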

### Factory Usage

```python
from memos.configs.llm import LLMConfigFactory
from memos.llms.factory import LLMFactory

cfg = LLMConfigFactory.model_validate({
    "backend": "ollama",
    "config": {"model_name_or_path": "qwen3:0.6b"},
})
llm = LLMFactory.from_config(cfg)
```

### LLM Core APIs

| Method | Purpose |
|---|---|
| `generate(messages: list)` | Return full string response |
| `generate_stream(messages)` | Yield streaming chunks |
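Both methods take an OpenAI-style chat message list. A minimal sketch (the `llm.generate(...)` call itself is commented out because it needs the configured backend from the factory example above):

```python
# OpenAI-style message list, as consumed by generate() / generate_stream().
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize MemOS in one sentence."},
]
# answer = llm.generate(messages)  # full response as a single string
```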

### Streaming & CoT

```python
messages = [{"role": "user", "content": "Let's think step by step: …"}]
for chunk in llm.generate_stream(messages):
    print(chunk, end="")
```

Full code: find all scenarios in `examples/basic_modules/llm.py`.

### Performance Tips

- Use `qwen3:0.6b` for a <2 GB footprint when prototyping locally.
- Combine with KV Cache (see the KVCacheMemory doc) to cut time-to-first-token (TTFT).

## Embedding Module

### Supported Embedder Backends

| Backend | Example Model | Vector Dim |
|---|---|---|
| `ollama` | `nomic-embed-text:latest` | 768 |
| `sentence_transformer` | `nomic-ai/nomic-embed-text-v1.5` | 768 |
| `universal_api` | `text-embedding-3-large` | 3072 |

### Embedder Config Schema

Shared keys: `model_name_or_path`, optional API creds (`api_key`, `base_url`), etc.

### Factory Usage

```python
from memos.configs.embedder import EmbedderConfigFactory
from memos.embedders.factory import EmbedderFactory

cfg = EmbedderConfigFactory.model_validate({
    "backend": "ollama",
    "config": {"model_name_or_path": "nomic-embed-text:latest"},
})
embedder = EmbedderFactory.from_config(cfg)
```
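The vector dimensions in the backend table matter when sizing a downstream vector store. A small lookup like the one below (values taken from that table; the `check_dim` helper is a hypothetical utility, not part of MemOS) can guard against dimension mismatches:

```python
# Vector dimension per example model (values from the table above).
EMBED_DIMS = {
    "nomic-embed-text:latest": 768,
    "nomic-ai/nomic-embed-text-v1.5": 768,
    "text-embedding-3-large": 3072,
}

def check_dim(model_name: str, vector: list) -> bool:
    """Return True if `vector` has the expected dimension for `model_name`."""
    return len(vector) == EMBED_DIMS[model_name]
```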