# vcli: Multimodal AI CLI

- **Repository Path**: lvhaodeyeye/vcli
- **Default Branch**: main
- **Created**: 2026-03-30
- **Last Updated**: 2026-03-31

A command-line tool that wraps the **Qwen3.5-Omni-Plus** multimodal model (Alibaba Bailian / DashScope) via its OpenAI-compatible API. It is built around an extensible provider design so that Gemini and other models can be supported in the future.

## Model: Qwen3.5-Omni-Plus

| Capability | Specification |
|------------|---------------|
| Video | ≤400 s at 720p (1 fps), including joint audio-video understanding |
| Audio | Long audio over 10 hours, 60+ input languages, speech output in 30+ languages |
| Image | Multi-image understanding |
| Text | Structured understanding, content creation, Q&A |

## Features

- Ask questions with optional video, image, or audio input
- **Audio auto-conversion**: DashScope does not support `input_audio`, so audio is automatically converted to a black-screen video and sent through the `video_url` channel
- **Auto-compression**: files larger than 4 MB are automatically compressed (lower resolution/bitrate) to fit the API's 6 MB body limit
- Streaming responses (required by the Qwen API)
- Local file upload via base64 encoding, or remote URL support
- Rotating file log at `~/.video_cli/logs/vcli.log`
- Config stored at `~/.video_cli/config.yaml` (auto-created on first run)
- Gemini provider stub (ready for future implementation)

## Tech Stack

- **Python 3.11+** with [uv](https://github.com/astral-sh/uv) for environment management
- **typer**: CLI framework
- **openai** SDK: OpenAI-compatible HTTP client
- **pyyaml**: config file serialization
- **httpx**: HTTP transport
- **ffmpeg**: audio-to-video conversion and video compression

## Setup

```bash
# 1. Create virtual environment
uv venv

# 2. Install in editable mode
uv pip install -e .

# 3.
# Set your DashScope API key
.venv/bin/vcli config set api_key
```

Or activate the venv first:

```bash
source .venv/bin/activate
vcli config set api_key
```

### Dependencies

- **ffmpeg**: required for audio processing and video compression

## Usage

### Ask a text question

```bash
vcli ask "What is the capital of France?"
```

### Ask about a video (local file)

```bash
vcli ask "Describe what happens in this video" --video /path/to/video.mp4
```

### Ask about a video (URL)

```bash
vcli ask "Summarize this video" --video https://example.com/video.mp4
```

### Ask about an image

```bash
vcli ask "What objects are in this image?" --image /path/to/photo.jpg
```

### Transcribe or analyze audio

```bash
vcli ask "Transcribe this audio" --audio /path/to/recording.mp3
vcli ask "Summarize this lecture" --audio /path/to/lecture.mp3
```

> Audio files are automatically converted to black-screen video internally
> and sent via the video channel, as DashScope's OpenAI-compatible API
> does not support the `input_audio` format.

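The audio-to-video conversion described in the note above could look roughly like the following. This is a minimal sketch, not the code vcli actually ships: the helper name `build_black_video_cmd`, the 320x240 frame size, and the 1 fps rate are assumptions chosen for illustration. It only builds the ffmpeg argument list, pairing ffmpeg's `color` source (via `lavfi`) with the audio file and using `-shortest` so the output ends when the audio does.

```python
def build_black_video_cmd(
    audio_path: str,
    out_path: str,
    size: str = "320x240",  # assumed frame size, not taken from vcli
    fps: int = 1,           # assumed frame rate, not taken from vcli
) -> list[str]:
    """Return an ffmpeg argv that muxes a black video track with the audio."""
    return [
        "ffmpeg",
        "-y",                                      # overwrite the output if it exists
        "-f", "lavfi",
        "-i", f"color=c=black:s={size}:r={fps}",   # synthetic black video source
        "-i", audio_path,                          # the real audio input
        "-shortest",                               # stop when the audio stream ends
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        out_path,
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would perform the conversion, provided ffmpeg is on the `PATH`.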
### Use a specific provider

```bash
vcli ask "hello" --provider qwen
```

### Configuration

```bash
# Show current config (API keys are masked)
vcli config show

# Set API key for a provider
vcli config set api_key --provider qwen

# List providers and their status
vcli providers
```

## Project Structure

```
video_reader/
├── pyproject.toml          # Project metadata and dependencies
├── README.md
└── src/
    └── vcli/
        ├── __init__.py
        ├── main.py         # CLI entry point (typer app)
        ├── config.py       # Config load/save from ~/.video_cli/config.yaml
        ├── logger.py       # Rotating file logger to ~/.video_cli/logs/
        ├── utils.py        # Base64 encode, MIME detection, URL check
        └── providers/
            ├── __init__.py
            ├── base.py     # Abstract BaseProvider, MediaInput, ProviderResponse
            ├── qwen.py     # QwenProvider: qwen3.5-omni-plus (full implementation)
            └── gemini.py   # GeminiProvider stub (NotImplementedError)
```

## Config File

Located at `~/.video_cli/config.yaml`, auto-created on first run:

```yaml
default_provider: qwen
providers:
  qwen:
    api_key: ""
    base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
    model: "qwen3.5-omni-plus"
  gemini:
    api_key: ""
    base_url: ""
    model: "gemini-2.5-pro"
```

## Logs

Logs are written to `~/.video_cli/logs/vcli.log`:

- **DEBUG** level and above in the file (rotating, max 10 MB, 5 backups)
- **ERROR** level only on the console

## Adding a New Provider

1. Create `src/vcli/providers/.py` implementing `BaseProvider`
2. Register it in the `PROVIDERS` dict in `main.py`
3. Add its config block to `DEFAULT_CONFIG` in `config.py`
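
The three steps above can be sketched as follows. This is an illustrative, self-contained example, not vcli's real code: the `BaseProvider` and `ProviderResponse` classes here are simplified stand-ins with an assumed interface (a single `ask` method), and `EchoProvider`, `echo-1`, and `get_provider` are hypothetical names. The actual signatures in `providers/base.py` may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ProviderResponse:
    """Stand-in for the real class in providers/base.py (fields assumed)."""
    text: str


class BaseProvider(ABC):
    """Stand-in base class; the real interface may differ."""

    def __init__(self, config: dict) -> None:
        self.config = config

    @abstractmethod
    def ask(self, prompt: str) -> ProviderResponse:
        ...


# Step 1: implement the provider (hypothetical example that echoes the prompt).
class EchoProvider(BaseProvider):
    def ask(self, prompt: str) -> ProviderResponse:
        return ProviderResponse(text=f"[{self.config['model']}] {prompt}")


# Step 2: register it under a name the CLI can look up.
PROVIDERS: dict[str, type[BaseProvider]] = {"echo": EchoProvider}

# Step 3: give it a config block with the same keys as the other providers.
DEFAULT_CONFIG = {
    "default_provider": "echo",
    "providers": {
        "echo": {"api_key": "", "base_url": "", "model": "echo-1"},
    },
}


def get_provider(name: str, config: dict = DEFAULT_CONFIG) -> BaseProvider:
    """Instantiate the registered provider with its own config block."""
    return PROVIDERS[name](config["providers"][name])
```

Keeping the registry as a plain dict of classes means a new provider needs no changes to the lookup logic, only a new entry and a matching config block.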