Version 0.5.0: Media Actions for Audio Flashcards
We’re excited to announce Anki MCP 0.5.0, which introduces media file management capabilities. Now you can upload audio files, images, and other media directly to Anki - perfect for creating immersive language learning flashcards with AI-generated audio!
What’s New
🎵 Media Actions Tool
A unified mediaActions tool that handles all media file operations in Anki:
Upload Media
- storeMediaFile - Upload audio files from base64 data, file paths, or URLs
- Supports all formats: MP3, WAV, OGG for audio; JPG, PNG, GIF for images
- Automatic filename handling with optional underscore prefix to prevent cleanup
Manage Media
- retrieveMediaFile - Download media files as base64
- getMediaFilesNames - List all media files with optional pattern filtering
- deleteMediaFile - Remove unwanted media files
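To make the four actions concrete, here is a minimal TypeScript sketch of how requests to a unified tool like this could be typed. The field names (action, filename, data, path, url, pattern) are assumptions modeled on the AnkiConnect actions of the same names; they are not taken from the documented mediaActions schema, and describeRequest is a hypothetical helper.

```typescript
// Hypothetical request shapes for a unified media tool.
// Field names are assumptions modeled on AnkiConnect, not a confirmed schema.
type MediaRequest =
  | { action: "storeMediaFile"; filename: string; data?: string; path?: string; url?: string }
  | { action: "retrieveMediaFile"; filename: string }
  | { action: "getMediaFilesNames"; pattern?: string }
  | { action: "deleteMediaFile"; filename: string };

// Summarize a request in one line; the switch narrows each variant.
function describeRequest(req: MediaRequest): string {
  switch (req.action) {
    case "storeMediaFile": {
      const source =
        req.data !== undefined ? "base64 data"
        : req.path !== undefined ? "file path"
        : req.url !== undefined ? "URL"
        : "no source";
      return `store ${req.filename} from ${source}`;
    }
    case "retrieveMediaFile":
      return `retrieve ${req.filename} as base64`;
    case "getMediaFilesNames":
      return `list files matching ${req.pattern ?? "*"}`;
    case "deleteMediaFile":
      return `delete ${req.filename}`;
  }
}
```

For example, describeRequest({ action: "getMediaFilesNames", pattern: "spanish_*" }) returns "list files matching spanish_*". Typing the request as a union means the compiler rejects a delete request that forgets its filename.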
🎤 Perfect for ElevenLabs Integration
The star use case? AI-generated audio flashcards for language learning.
With the ElevenLabs MCP server, you can now create a complete workflow:
- Generate native pronunciation audio with ElevenLabs
- Upload it to Anki with mediaActions
- Create flashcards with embedded audio references
- Study with authentic pronunciation
Example Workflow:
You: "Generate Spanish pronunciation for 'Buenos días, ¿cómo está usted?'"
AI with ElevenLabs: *Generates audio file with native speaker voice*
You: "Upload this to Anki as spanish_greeting.mp3"
AI with Anki MCP: *Stores audio in Anki's media collection*
You: "Create a flashcard with this audio"
AI: *Creates note with [sound:spanish_greeting.mp3] reference*
When you review the card in Anki, you hear perfect native pronunciation!
🌍 Multi-Language Support
ElevenLabs supports 29+ languages with its eleven_multilingual_v2 model:
- Spanish, French, German, Italian
- Portuguese, Dutch, Polish, Russian
- Japanese, Korean, Chinese
- Arabic, Turkish, Hindi
- And many more!
Each with authentic native speaker voices.
Getting Started with Audio Flashcards
Prerequisites
- Install Anki MCP (Desktop or Web Mode)
- Add ElevenLabs MCP Server
- Repository: github.com/elevenlabs/elevenlabs-mcp
- Get your API key from ElevenLabs
- Follow their setup guide
Basic Audio Card Workflow
Step 1: Generate Audio with AI
Ask your AI assistant (Claude, etc.) to generate pronunciation:
"Generate Spanish audio for 'Hola, ¿cómo estás?' using ElevenLabs"
The AI uses ElevenLabs to create native-quality audio and saves it locally.
Step 2: Upload to Anki
"Upload this audio file to Anki as spanish_hello.mp3"
The AI uses mediaActions to store the file in your Anki media collection.
Step 3: Create Flashcard
"Create a flashcard with:
- Front: [sound:spanish_hello.mp3] (click to hear)
- Back: 'Hello, how are you?' (English translation)"
The AI creates the card with the audio reference.
Step 4: Study!
Open Anki and study your deck. Click the audio icon to hear native pronunciation.
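Step 3 relies on Anki's [sound:...] tag to tie the card to the uploaded file. The sketch below shows how a note for that step might be assembled. The deck name, note type, and the BasicNote shape are illustrative assumptions, not the payload Anki MCP actually sends.

```typescript
// Build Anki's sound tag for a media file already stored in the collection.
function soundTag(filename: string): string {
  return `[sound:${filename}]`;
}

// Hypothetical note shape; deck, model, and field names are illustrative.
interface BasicNote {
  deckName: string;
  modelName: string;
  fields: { Front: string; Back: string };
}

// Put the audio on the front and the translation on the back.
function buildAudioNote(filename: string, translation: string): BasicNote {
  return {
    deckName: "Spanish",
    modelName: "Basic",
    fields: { Front: soundTag(filename), Back: translation },
  };
}
```

Calling buildAudioNote("spanish_hello.mp3", "Hello, how are you?") gives a Front field of [sound:spanish_hello.mp3], which matches the reference shown in the workflow above.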
Advanced: Custom Audio Models
For even better control, create a custom Anki note type:
Fields:
- Audio - Filename only (e.g., spanish_audio.mp3)
- Text - Original phrase in target language
- Translation - Translation in your native language
Front Template:
<div style="text-align: center; padding: 20px;">
<button onclick="var audio = document.createElement('audio');
audio.src='{{Audio}}'; audio.play(); return false;"
style="font-size: 36px; padding: 15px 30px;
background: #4CAF50; color: white; border: none;
border-radius: 8px; cursor: pointer;">
▶️ Listen
</button>
</div>
Back Template:
{{FrontSide}}
<hr id="answer">
<div class="target-text">{{Text}}</div>
<div class="translation"><b>Translation:</b> {{Translation}}</div>
This gives you:
- ✅ Audio-first learning - Hear before you see
- ✅ No autoplay - Active engagement with click-to-listen
- ✅ Clean interface - Big, friendly play button
- ✅ Full control - See text and translation after listening
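Because the Audio field in this custom note type holds a bare filename rather than a [sound:...] tag, it is easy to fill it in wrong. A small check like the following can catch that before a note is created. This is a sketch only: AudioNoteFields and checkAudioFields are hypothetical names, and the rules are assumptions drawn from the field descriptions above.

```typescript
// Field values for the custom note type described above.
interface AudioNoteFields {
  Audio: string; // bare filename, e.g. "spanish_audio.mp3"
  Text: string;
  Translation: string;
}

// Return a list of problems; an empty list means the fields look usable.
function checkAudioFields(fields: AudioNoteFields): string[] {
  const problems: string[] = [];
  if (fields.Audio.startsWith("[sound:")) {
    problems.push("Audio should be a bare filename, not a [sound:...] tag");
  }
  if (fields.Audio.includes("/") || fields.Audio.includes("\\")) {
    problems.push("Audio should not contain a directory path");
  }
  if (fields.Text.trim() === "") {
    problems.push("Text is empty");
  }
  return problems;
}
```

The template sets audio.src directly from the field, so a value such as [sound:spanish_audio.mp3] would not resolve to a playable file.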
Use Cases Beyond Language Learning
Pronunciation Practice
- Medical terms with correct pronunciation
- Scientific vocabulary
- Historical names and places
Music Education
- Chord progressions
- Rhythm examples
- Musical phrases
Accessibility
- Audio descriptions for visual learners
- Text-to-speech for any content
- Multi-modal learning experiences
Technical Details
Architecture
The mediaActions tool uses a dispatcher pattern:
- Single unified tool with action parameter
- Four specialized actions with dedicated implementations
- Runtime validation for required fields
- Full TypeScript type safety
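The dispatcher pattern listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the handler bodies are placeholders that only return strings, and the required-field lists and the dispatchMediaAction name are hypothetical.

```typescript
// Minimal dispatcher sketch: one entry point, one handler per action.
// Handler bodies are placeholders; a real server would call Anki here.
type Params = Record<string, unknown>;

interface ActionEntry {
  required: string[];             // fields checked at runtime before dispatch
  run: (params: Params) => string;
}

const actions = new Map<string, ActionEntry>([
  ["storeMediaFile", { required: ["filename"], run: (p) => `stored ${String(p["filename"])}` }],
  ["retrieveMediaFile", { required: ["filename"], run: (p) => `retrieved ${String(p["filename"])}` }],
  ["getMediaFilesNames", { required: [], run: () => "listed media files" }],
  ["deleteMediaFile", { required: ["filename"], run: (p) => `deleted ${String(p["filename"])}` }],
]);

// Look up the action, validate its required fields, then run the handler.
function dispatchMediaAction(action: string, params: Params): string {
  const entry = actions.get(action);
  if (entry === undefined) {
    throw new Error(`Unknown action: ${action}`);
  }
  for (const field of entry.required) {
    if (params[field] === undefined) {
      throw new Error(`Missing required field "${field}" for ${action}`);
    }
  }
  return entry.run(params);
}
```

Keeping validation in the dispatcher means each handler can assume its required fields are present, and adding a fifth action is a single new map entry.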
Experimental: Unified Tool Pattern
We’re testing a new approach with mediaActions to address a common pain point: tool approval fatigue.
Claude Desktop requires users to manually approve each tool on first use for security. With 27+ tools in our server, this means clicking “Allow” 27 times. By consolidating 4 media operations into a single mediaActions tool, users only need to approve once.
The Trade-off:
- ✅ Fewer approvals - One tool instead of four
- ✅ Simpler tool list - Less overwhelming for new users
- ❓ New pattern - Different from traditional one-tool-per-action approach
We’re actively gathering feedback on this pattern. If it works well, we may apply it to other tool groups (model management, note operations, etc.). If users prefer individual tools for better discoverability, we can split them.
Your feedback matters! Let us know which approach you prefer in our Community & Support channels.
Path Aliases
We’ve also modernized the codebase with TypeScript path aliases:
- @/mcp/* replaces relative imports like ../../../
- Cleaner imports across 40+ files
- Better IDE support and autocomplete
Testing
All media operations are thoroughly tested:
- 14 new tests for media actions
- 217 total tests passing
- 100% coverage of media workflows
Important Notes
Cost Consideration
ElevenLabs API calls incur costs based on your subscription:
- Check your plan limits before bulk generation
- Consider batching multiple cards at once
- Most efficient: Generate audio for an entire lesson in one session
Media Storage
- Anki stores media in the collection.media folder
- Files prefixed with _ won’t be auto-deleted during cleanup
- Use descriptive filenames for easy management
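The two filename tips above can be combined in a small helper. The naming scheme (language code, then a slug of the phrase) is only an illustration, and both function names are hypothetical.

```typescript
// Add the underscore prefix that protects a file from cleanup.
function protectFromCleanup(filename: string): string {
  return filename.startsWith("_") ? filename : `_${filename}`;
}

// Build a descriptive filename from a language code and a phrase.
function describeFilename(language: string, phrase: string, extension: string): string {
  const slug = phrase
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accents
    .replace(/[^a-z0-9]+/g, "_")     // collapse everything else to underscores
    .replace(/^_+|_+$/g, "");        // trim leading and trailing underscores
  return `${language}_${slug}.${extension}`;
}
```

For instance, describeFilename("es", "Hola, ¿cómo estás?", "mp3") gives es_hola_como_estas.mp3, and passing that to protectFromCleanup gives _es_hola_como_estas.mp3. Phrases written entirely in non-Latin scripts would produce an empty slug under this scheme, so a real helper would need a fallback.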
Get Started
Desktop Users (MCPB Bundle)
Download the latest bundle from GitHub, then drag & drop the .mcpb file into Claude Desktop.
Web Mode Users
Update with one command:
npm install -g anki-mcp-http@latest
Or use without installing:
npx anki-mcp-http@latest
What’s Next
We’re still thinking about new features - keep in touch!
We’re actively seeking user feedback to shape the future of Anki MCP. Share your ideas and suggestions with us in our Community & Support channels.
Feedback Welcome
Try creating audio flashcards and let us know how it works for you!
Happy learning! 📚🎵
Full Changelog: v0.5.0 on GitHub