| Category | Method | Function | Key components |
|----------|--------|----------|----------------|
| Context Management | `check_context` | 📊 Check context size | `ContextChecker` — checks whether context exceeds thresholds and splits messages |
| | `compact_memory` | 📦 Compact history into a summary | `Compactor` — a ReActAgent that generates structured context summaries |
| | `compact_tool_result` | ✂️ Compact long tool outputs | `ToolResultCompactor` — truncates long tool outputs, stores them in `tool_result/`, and keeps file references in the messages |
| | `pre_reasoning_hook` | 🔄 Pre-reasoning hook | `compact_tool_result` + `check_context` + `compact_memory` + `summary_memory` (async) |
| Long-term Memory | `summary_memory` | 📝 Persist important memory to files | `Summarizer` — a ReActAgent with file tools (read / write / edit) |
| | `memory_search` | 🔍 Semantic memory search | `MemorySearch` — hybrid retrieval with vectors + BM25 |
| Session Memory | `get_in_memory_memory` | 💾 Create an in-session memory instance | Returns `ReMeInMemoryMemory` with `dialog_path` configured for persistence |
| | `await_summary_tasks` | ⏳ Wait for async summary tasks | Blocks until all background summary tasks complete |
| - | `start` | 🚀 Start the memory system | Initializes file storage, the file watcher, and the embedding cache; cleans up expired tool result files |
| - | `close` | 📕 Shut down and clean up | Cleans up tool result files, stops the file watcher, and persists the embedding cache |
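To make the `check_context` row in the table concrete, here is a rough sketch of its splitting behavior: keep the most recent messages within the reserve budget verbatim and mark older ones as candidates for compaction. The character-based token estimate and the splitting rule are simplifications for illustration, not ReMe's actual implementation:

```python
def estimate_tokens(message: dict) -> int:
    # Crude heuristic: ~4 characters per token (an assumption, not ReMe's counter).
    return max(1, len(message.get("content", "")) // 4)


def check_context(messages, compact_threshold=90000, compact_reserve=10000):
    """Return (to_compact, to_keep, is_valid), mirroring check_context's outputs."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= compact_threshold:
        return [], list(messages), True  # under threshold: nothing to compact

    # Walk backwards, keeping the most recent messages within the reserve budget.
    kept, budget = [], compact_reserve
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if budget - cost < 0 and kept:
            break
        budget -= cost
        kept.append(msg)
    kept.reverse()
    return messages[: len(messages) - len(kept)], kept, False
```

The key design point this illustrates: compaction only triggers past the threshold, and the reserve protects the tail of the conversation from being summarized away.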
---
### 🚀 Quick start
#### Installation
**Install from source:**
```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install -e ".[light]"
```
**Update to the latest version:**
```bash
git pull
pip install -e ".[light]"
```
#### Environment variables
`ReMeLight` uses environment variables to configure the LLM and embedding backends:
| Variable | Description | Example |
|----------------------|-------------------------------|-----------------------------------------------------|
| `LLM_API_KEY` | LLM API key | `sk-xxx` |
| `LLM_BASE_URL` | LLM base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `EMBEDDING_API_KEY` | Embedding API key (optional) | `sk-xxx` |
| `EMBEDDING_BASE_URL` | Embedding base URL (optional) | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
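These variables can also be set programmatically before initializing `ReMeLight`. A minimal sketch — the endpoint is only an example, and treating the embedding variables as falling back to the LLM values is an assumption of this snippet, not documented behavior:

```python
import os

# Credentials for the LLM (replace with your real key and endpoint).
os.environ["LLM_API_KEY"] = "sk-xxx"
os.environ["LLM_BASE_URL"] = "https://dashscope.aliyuncs.com/compatible-mode/v1"

# Optional: embedding settings. Here we simply reuse the LLM values
# (an assumption for this sketch); point them at a separate provider if needed.
os.environ.setdefault("EMBEDDING_API_KEY", os.environ["LLM_API_KEY"])
os.environ.setdefault("EMBEDDING_BASE_URL", os.environ["LLM_BASE_URL"])
```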
#### Python usage
```python
import asyncio

from reme.reme_light import ReMeLight


async def main():
    # Initialize ReMeLight
    reme = ReMeLight(
        default_as_llm_config={"model_name": "qwen3.5-35b-a3b"},
        # default_embedding_model_config={"model_name": "text-embedding-v4"},
        default_file_store_config={"fts_enabled": True, "vector_enabled": False},
        enable_load_env=True,
    )
    await reme.start()

    messages = [...]  # List of conversation messages

    # 1. Check context size (token counting, determine if compaction is needed)
    messages_to_compact, messages_to_keep, is_valid = await reme.check_context(
        messages=messages,
        memory_compact_threshold=90000,  # Threshold to trigger compaction (tokens)
        memory_compact_reserve=10000,  # Token count to reserve for recent messages
    )

    # 2. Compact conversation history into a structured summary
    summary = await reme.compact_memory(
        messages=messages,
        previous_summary="",
        max_input_length=128000,  # Model context window (tokens)
        compact_ratio=0.7,  # Trigger compaction when exceeding max_input_length * 0.7
        language="zh",  # Summary language (e.g., "zh")
    )

    # 3. Compact long tool outputs (prevent tool results from blowing up the context)
    messages = await reme.compact_tool_result(messages)

    # 4. Pre-reasoning hook (auto compact tool results + check context + generate summaries)
    processed_messages, compressed_summary = await reme.pre_reasoning_hook(
        messages=messages,
        system_prompt="You are a helpful AI assistant.",
        compressed_summary="",
        max_input_length=128000,
        compact_ratio=0.7,
        memory_compact_reserve=10000,
        enable_tool_result_compact=True,
        tool_result_compact_keep_n=3,
    )

    # 5. Persist important memory to files (writes to memory/YYYY-MM-DD.md)
    summary_result = await reme.summary_memory(
        messages=messages,
        language="zh",
    )

    # 6. Semantic memory search (vector + BM25 hybrid retrieval)
    result = await reme.memory_search(query="Python version preference", max_results=5)

    # 7. Create an in-session memory instance (manages context for one conversation)
    memory = reme.get_in_memory_memory()  # Auto-configures dialog_path
    for msg in messages:
        await memory.add(msg)
    token_stats = await memory.estimate_tokens(max_input_length=128000)
    print(f"Current context usage: {token_stats['context_usage_ratio']:.1f}%")
    print(f"Message token count: {token_stats['messages_tokens']}")
    print(f"Estimated total tokens: {token_stats['estimated_tokens']}")

    # 8. Mark messages as compressed (auto-persists to dialog/YYYY-MM-DD.jsonl)
    # await memory.mark_messages_compressed(messages_to_compact)

    # Shutdown ReMeLight
    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())
```
> 📂 Full example: [test_reme_light.py](tests/light/test_reme_light.py)
> 📋 Sample run log: [test_reme_light_log.txt](tests/light/test_reme_light_log.txt) (223,838 tokens → 1,105 tokens, 99.5%
> compression)
### Architecture of the file-based ReMeLight memory system
#### Context data structure
```mermaid
flowchart TD
A[Context] --> B[compact_summary]
B --> C[dialog path guide + Goal/Constraints/Progress/KeyDecisions/NextSteps]
A --> E[messages: full dialogue history]
A --> F[File System Cache]
F --> G[dialog/YYYY-MM-DD.jsonl]
F --> H[tool_result/uuid.txt N-day TTL]
```
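As a plain-data sketch, the context structure in the diagram above might look like the following. The field names beyond those shown in the diagram are assumptions for illustration, not ReMe's actual classes:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CompactSummary:
    # Mirrors "dialog path guide + Goal/Constraints/Progress/KeyDecisions/NextSteps"
    dialog_path_guide: str = ""
    goal: str = ""
    constraints: str = ""
    progress: str = ""
    key_decisions: str = ""
    next_steps: str = ""


@dataclass
class Context:
    compact_summary: CompactSummary = field(default_factory=CompactSummary)
    messages: List[dict] = field(default_factory=list)  # full dialogue history
    # File-system cache locations:
    dialog_files: List[str] = field(default_factory=list)  # dialog/YYYY-MM-DD.jsonl
    tool_result_files: List[str] = field(default_factory=list)  # tool_result/<uuid>.txt, N-day TTL
```

The split matters for the design: the structured summary is what survives compaction in-context, while the full history and raw tool outputs live on disk and are only pulled back in on demand.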
---
[CoPaw MemoryManager](https://github.com/agentscope-ai/CoPaw/blob/main/src/copaw/agents/memory/reme_light_memory_manager.py)
inherits `ReMeLight` and integrates its memory capabilities into the agent reasoning loop:
```mermaid
graph LR
Agent[Agent] -->|Before each reasoning step| Hook[pre_reasoning_hook]
Hook --> TC[compact_tool_result