REST API Server
MemOS provides a REST API server built with FastAPI. All MemOS operations can be performed through its REST endpoints.
APIs supported by MemOS REST API Server
Features
- Register a new user: Register a new user with configuration and default cube.
- Get suggestion queries: Get suggestion queries for a specific user.
- Get all memories for user: Get all memories for a specific user.
- Add a new memory: Create a new memory for a specific user.
- Search memories: Search memories for a specific user.
- Chat with MemOS: Chat with MemOS for a specific user. Returns a Server-Sent Events (SSE) stream; see the sketch below.
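Because the chat endpoint streams SSE, a plain HTTP client must keep the connection open and read events as they arrive; curl's -N (no-buffer) flag does this. The /product/chat path and payload fields below are assumptions, so check http://localhost:8000/docs for the actual route and schema:
# Assumed endpoint and fields; verify against /docs
curl -N -X POST http://localhost:8000/product/chat \
  -H "Content-Type: application/json" \
  -d '{"user_id": "your-user-id", "query": "What do I like?"}'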
Running Locally
With Docker Compose
The development Docker Compose setup comes pre-configured with Qdrant and Neo4j. The only environment variable required to run the server is OPENAI_API_KEY.
Create a .env file in the root directory and set your environment variables. For example:
# OpenAI API key, used to initialize the default request user
OPENAI_API_KEY=your-openai-api-key
# OpenAI API base URL; defaults to https://api.openai.com/v1.
# Change this if you use a proxy or a self-hosted OpenAI-compatible service.
OPENAI_API_BASE=your-openai-ip
# Reranker backend: http_bge (BGE reranking via an HTTP service) or cosine_local (local cosine similarity)
MOS_RERANKER_BACKEND=cosine_local
# Embedder backend: universal_api (OpenAI-compatible chat and embeddings)
# or ollama (local Ollama embeddings)
MOS_EMBEDDER_BACKEND=universal_api
# Embedding model
MOS_EMBEDDER_MODEL=bge-m3
# Embedder API base URL: https://api.openai.com/v1 for OpenAI, or your endpoint for Azure
MOS_EMBEDDER_API_BASE=your-openai-ip
# API key for the embedding provider
MOS_EMBEDDER_API_KEY=EMPTY
# Embedding vector dimension
EMBEDDING_DIMENSION=1024
# Additional optional variables:
# MOS_SESSION_ID: session ID (used by the start_api routes)
# MOS_TOP_K: maximum number of retrieval/recall candidates (e.g. 30 or 50)
# MOS_MAX_TOKENS: maximum number of tokens the LLM may generate
# MOS_TOP_P / MOS_TOP_K (generation): sampling parameters (note: this top_k has a different meaning from the retrieval top_k above)
# MOS_CHAT_TEMPERATURE: generation temperature
# MOS_MAX_TURNS_WINDOW: number of conversation turns kept in the dialogue window
# MOS_EMBEDDER_PROVIDER: openai or azure
Start the Docker containers
Enter the docker folder
# Change into the docker directory in the repository
cd docker
Install Dependencies
pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt
# Or use the Aliyun mirror to install dependencies
pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
Start the containers using Docker Compose from the docker directory (make sure your VPN/network connection works):
# On the first run, you need to build the image
docker compose up --build
# On subsequent runs, you do not need to build the image
docker compose up
Access the API at http://localhost:8000/docs.
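To quickly check that the server is up, you can probe the OpenAPI spec, which FastAPI serves at /openapi.json by default:
# Exits non-zero if the server is not reachable yet
curl -sf http://localhost:8000/openapi.json > /dev/null && echo "server is up"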
Example process
Test case (register user → add user memory → query user memory)
Register a user: http://localhost:8000/product/users/register (POST)
# response
{
  "code": 200,
  "message": "User registered successfully",
  "data": {
    "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc",
    "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca"
  }
}
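The registration request body is not shown above. As a minimal sketch, you can POST an empty JSON object and let the server fall back to its defaults; any optional fields (for example a custom user configuration) are listed in the schema at http://localhost:8000/docs:
# Minimal request; see /docs for optional configuration fields
curl -X POST http://localhost:8000/product/users/register \
  -H "Content-Type: application/json" \
  -d '{}'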
Add user memory: http://localhost:8000/product/add (POST)
# request params
{
  "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc",
  "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca",
  "messages": [
    {
      "role": "user",
      "content": "I like strawberry"
    }
  ],
  "memory_content": "",
  "doc_path": "",
  "source": "",
  "user_profile": false
}
# response
{
  "code": 200,
  "message": "Memory created successfully",
  "data": null
}
Query user memory: http://localhost:8000/product/search (POST)
# request params
{
  "query": "What do I like",
  "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc",
  "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca"
}
# response
{
  "code": 200,
  "message": "Search completed successfully",
  "data": {
    "text_mem": [
      {
        "cube_id": "7231eda8-6c57-4f6e-97ce-98b699eebb98",
        "memories": [
          {
            "id": "2f40be8f-736c-4a5f-aada-9489037769e0",
            "memory": "[User viewpoint] Users like strawberries.",
            "metadata": {
              "user_id": "de8215e3-3beb-4afc-9b64-ae594d62f1ea",
              "session_id": "root_session",
              "status": "activated",
              "type": "fact",
              "key": "User preference for strawberries",
              "confidence": 0.99,
              "source": null,
              "tags": [
                "preference",
                "strawberry"
              ],
              "visibility": null,
              "updated_at": "2025-09-18T08:23:44.625479000+00:00",
              "memory_type": "UserMemory",
              "sources": [],
              "embedding": [],
              "created_at": "2025-09-18T08:23:44.625511000+00:00",
              "usage": [
                "{\"time\": \"2025-09-18T08:24:17.759748\", \"info\": {\"user_id\": \"de8215e3-3beb-4afc-9b64-ae594d62f1ea\", \"session_id\": \"root_session\"}}"
              ],
              "background": "The user expressed a preference for strawberries, indicating their inclination towards dietary preferences.",
              "relativity": 0.6349761312470591,
              "vector_sync": "success",
              "ref_id": "[2f40be8f]",
              "id": "2f40be8f-736c-4a5f-aada-9489037769e0",
              "memory": "[User viewpoint] Users like strawberries."
            },
            "ref_id": "[2f40be8f]"
          },
          ...
        ]
      }
    ],
    "act_mem": [],
    "para_mem": []
  }
}
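The same add and search calls can be issued from the command line. The payloads are exactly the ones documented above; substitute the user_id and mem_cube_id returned by registration:
# Add a memory for the registered user
curl -X POST http://localhost:8000/product/add \
  -H "Content-Type: application/json" \
  -d '{"user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca", "messages": [{"role": "user", "content": "I like strawberry"}], "memory_content": "", "doc_path": "", "source": "", "user_profile": false}'
# Search the user's memories
curl -X POST http://localhost:8000/product/search \
  -H "Content-Type: application/json" \
  -d '{"query": "What do I like", "user_id": "8736b16e-1d20-4163-980b-a5063c3facdc", "mem_cube_id": "b32d0977-435d-4828-a86f-4f47f8b55bca"}'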
Making changes to the server code or the library code will automatically reload the server.
With Docker
Create a .env file in the root directory and set your environment variables. For example:
OPENAI_API_KEY=your-openai-api-key
# Neo4j and Qdrant are started separately in Docker (see below)
QDRANT_HOST=host.docker.internal
NEO4J_URI=bolt://host.docker.internal:7687
Build the Docker image locally:
docker build -t memos-api-server .
Start Neo4j and Qdrant in Docker first, as sketched below.
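A minimal sketch using the official images; adjust versions, passwords, and volumes to your setup. The ports match the NEO4J_URI and QDRANT_HOST settings above:
# Neo4j, exposing the bolt port referenced by NEO4J_URI
docker run -d --name neo4j -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password neo4j:5
# Qdrant on its default HTTP port
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant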
Run the Docker container:
docker run --env-file .env -p 8000:8000 memos-api-server
Access the API at http://localhost:8000/docs.
For test cases (register user → add user memory → query user memory), refer to the Docker Compose test cases above.
Without Docker
Create a .env file in the root directory and set your environment variables. For example:
OPENAI_API_KEY=your-openai-api-key
OPENAI_API_BASE=your-openai-ip
MOS_RERANKER_BACKEND=cosine_local
MOS_EMBEDDER_BACKEND=universal_api
MOS_EMBEDDER_MODEL=bge-m3
MOS_EMBEDDER_API_BASE=your-openai-ip
MOS_EMBEDDER_API_KEY=EMPTY
EMBEDDING_DIMENSION=1024
# Neo4j and Qdrant are started separately in Docker
QDRANT_HOST=host.docker.internal
NEO4J_URI=bolt://host.docker.internal:7687
Install Poetry for dependency management:
curl -sSL https://install.python-poetry.org | python3 -
Add Poetry's bin directory to your PATH:
# To start using Poetry, ensure its bin directory (e.g. $HOME/.local/bin) is in your PATH.
# The default shell for modern macOS systems is zsh. You can confirm with the command below.
1. Check which shell you are using
echo $SHELL
# If the output is /bin/zsh or /usr/bin/env zsh, then you are using zsh.
# (On older systems you may still be using bash; in that case the output will be /bin/bash.)
2. Open the corresponding shell configuration file
# If using zsh (in the vast majority of cases):
# Using the Nano editor (recommended for beginners)
nano ~/.zshrc
# Or use the vim editor
# vim ~/.zshrc
# If you are using bash:
nano ~/.bash_profile
# Or
nano ~/.bashrc
3. Add the PATH environment variable
# At the end of the opened file, add a new line and paste the command suggested by the installer:
export PATH="/your-path/.local/bin:$PATH"
4. Save and exit the editor
# If you are using nano:
# Press Ctrl + O to write (save), then press Enter to confirm the filename.
# Then press Ctrl + X to exit the editor.
# If you are using vim:
# Press i to enter insert mode, paste the code, then press ESC to exit insert mode.
# Type :wq and press Enter to save and exit.
5. Apply the configuration immediately
# The modified configuration file will not take effect in the current open terminal window automatically; run one of the following commands to reload it:
# For zsh:
source ~/.zshrc
# For bash:
source ~/.bash_profile
6. Verify the installation
# Now, run the test command suggested above to verify everything is ready:
poetry --version
# It should show the version number, e.g., Poetry (version 2.2.0)
Install all project dependencies and development tools:
make install
Start Neo4j and Qdrant in Docker first (see the sketch in the With Docker section above).
Start the FastAPI server:
uvicorn memos.api.product_api:app --host 0.0.0.0 --port 8000 --reload
Once the server is running, you can test the API through the OpenAPI docs at http://localhost:8000/docs or http://127.0.0.1:8000/docs.
For test cases (register user → add user memory → query user memory), refer to the Docker Compose test cases above.
Start using PyCharm
Run start_api
1. Open MemOS/docker/Dockerfile and modify the run configuration:
# Start the API server
CMD ["uvicorn", "memos.api.start_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
2. Go to the MemOS/src/memos/api directory and run start_api.py directly.
Run product_api
1. Open MemOS/docker/Dockerfile and modify the run configuration:
# Start the API server
CMD ["uvicorn", "memos.api.product_api:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
2. Go to the MemOS/src/memos/api directory and run product_api.py directly.
Architecture
MemOS is made up of core modules that work together to turn your LLM into a truly memory-augmented system — from orchestration to storage to retrieval.
Examples
Congratulations - you've mastered the Quick Start and built your first working memory! Now it's time to see how far you can take MemOS by combining different memory types and features. Use these curated examples to inspire your own agents, chatbots, or knowledge systems.