Version 0.5.0: Media Actions for Audio Flashcards

October 6, 2025

We’re excited to announce Anki MCP 0.5.0, which introduces media file management capabilities. Now you can upload audio files, images, and other media directly to Anki - perfect for creating immersive language learning flashcards with AI-generated audio!

What’s New

🎵 Media Actions Tool

A unified mediaActions tool that handles all media file operations in Anki:

Upload Media

  • storeMediaFile - Upload media files from base64 data, file paths, or URLs
  • Supports common formats, including MP3, WAV, and OGG for audio and JPG, PNG, and GIF for images
  • Automatic filename handling with optional underscore prefix to prevent cleanup

Manage Media

  • retrieveMediaFile - Download media files as base64
  • getMediaFilesNames - List all media files with optional pattern filtering
  • deleteMediaFile - Remove unwanted media files
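
As a rough illustration of the three upload sources listed above, here is a minimal sketch of how a storeMediaFile request could be assembled. The field names (filename, data, path, url) mirror AnkiConnect's storeMediaFile parameters and are an assumption about mediaActions, not its documented schema. MediaSource and buildStoreRequest are hypothetical names.

```typescript
// Hypothetical request shape: field names mirror AnkiConnect's storeMediaFile
// parameters and are an assumption about mediaActions, not its real schema.
type MediaSource =
  | { kind: "base64"; data: string }
  | { kind: "path"; path: string }
  | { kind: "url"; url: string };

interface StoreMediaRequest {
  action: "storeMediaFile";
  filename: string;
  data?: string;
  path?: string;
  url?: string;
}

// Builds a request carrying exactly one of data, path, or url.
function buildStoreRequest(filename: string, source: MediaSource): StoreMediaRequest {
  if (filename.trim() === "") {
    throw new Error("filename must not be empty");
  }
  const base = { action: "storeMediaFile" as const, filename };
  switch (source.kind) {
    case "base64":
      return { ...base, data: source.data };
    case "path":
      return { ...base, path: source.path };
    case "url":
      return { ...base, url: source.url };
  }
}

// Example: upload a locally generated file by path (the path is illustrative).
const greetingRequest = buildStoreRequest("spanish_greeting.mp3", {
  kind: "path",
  path: "/tmp/spanish_greeting.mp3",
});
```

Modelling the source as a tagged union keeps a request from carrying two conflicting sources at once.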

🎤 Perfect for ElevenLabs Integration

The star use case? AI-generated audio flashcards for language learning.

With ElevenLabs MCP server, you can now create a complete workflow:

  1. Generate native pronunciation audio with ElevenLabs
  2. Upload it to Anki with mediaActions
  3. Create flashcards with embedded audio references
  4. Study with authentic pronunciation

Example Workflow:

You: "Generate Spanish pronunciation for 'Buenos días, ¿cómo está usted?'"
AI with ElevenLabs: *Generates audio file with native speaker voice*

You: "Upload this to Anki as spanish_greeting.mp3"
AI with Anki MCP: *Stores audio in Anki's media collection*

You: "Create a flashcard with this audio"
AI: *Creates note with [sound:spanish_greeting.mp3] reference*

When you review the card in Anki, you hear perfect native pronunciation!
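
The [sound:...] reference from the last step of the workflow is plain text in a note field. A minimal sketch of building it follows. soundTag is a hypothetical helper, and the set of rejected characters is a conservative assumption, not Anki's own validation rule.

```typescript
// Builds the [sound:...] reference that Anki renders as a play button.
// soundTag is a hypothetical helper, not part of Anki MCP.
function soundTag(filename: string): string {
  // Reject brackets (they would break the tag) and path separators
  // (media files live flat in the media folder). Conservative assumption.
  if (filename === "" || /[\[\]\/\\]/.test(filename)) {
    throw new Error(`invalid media filename: ${filename}`);
  }
  return `[sound:${filename}]`;
}

const greetingTag = soundTag("spanish_greeting.mp3");
```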

🌍 Multi-Language Support

ElevenLabs supports 29+ languages with their eleven_multilingual_v2 model:

  • Spanish, French, German, Italian
  • Portuguese, Dutch, Polish, Russian
  • Japanese, Korean, Chinese
  • Arabic, Turkish, Hindi
  • And many more!

Each with authentic native speaker voices.

Getting Started with Audio Flashcards

Prerequisites

  1. Install Anki MCP (Desktop or Web Mode)
  2. Add ElevenLabs MCP Server

Basic Audio Card Workflow

Step 1: Generate Audio with AI

Ask your AI assistant (Claude, etc.) to generate pronunciation:

"Generate Spanish audio for 'Hola, ¿cómo estás?' using ElevenLabs"

The AI uses ElevenLabs to create native-quality audio and saves it locally.

Step 2: Upload to Anki

"Upload this audio file to Anki as spanish_hello.mp3"

The AI uses mediaActions to store the file in your Anki media collection.

Step 3: Create Flashcard

"Create a flashcard with:
- Front: [sound:spanish_hello.mp3] (click to hear)
- Back: 'Hello, how are you?' (English translation)"

The AI creates the card with the audio reference.
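
For a sense of what the resulting note might look like on the wire, here is a sketch that assumes the card is ultimately created through an AnkiConnect-style addNote request. Anki MCP's own tool interface may differ. The deck name and the buildAudioNote helper are hypothetical.

```typescript
// Shape of an AnkiConnect addNote request (API version 6).
interface AddNoteRequest {
  action: "addNote";
  version: 6;
  params: {
    note: {
      deckName: string;
      modelName: string;
      fields: Record<string, string>;
      tags: string[];
    };
  };
}

// Hypothetical helper: puts the audio reference on the front and the
// translation on the back of a Basic note.
function buildAudioNote(deckName: string, audioFile: string, translation: string): AddNoteRequest {
  return {
    action: "addNote",
    version: 6,
    params: {
      note: {
        deckName,
        modelName: "Basic",
        fields: {
          Front: `[sound:${audioFile}] (click to hear)`,
          Back: translation,
        },
        tags: ["audio"],
      },
    },
  };
}

// "Spanish" is an illustrative deck name.
const helloNote = buildAudioNote("Spanish", "spanish_hello.mp3", "Hello, how are you?");
```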

Step 4: Study!

Open Anki and study your deck. Click the audio icon to hear native pronunciation.

Advanced: Custom Audio Models

For even better control, create a custom Anki note type:

Fields:

  • Audio - Filename only (e.g., spanish_audio.mp3)
  • Text - Original phrase in target language
  • Translation - Translation in your native language

Front Template:

<div style="text-align: center; padding: 20px;">
  <button onclick="var audio = document.createElement('audio');
    audio.src='{{Audio}}'; audio.play(); return false;"
    style="font-size: 36px; padding: 15px 30px;
    background: #4CAF50; color: white; border: none;
    border-radius: 8px; cursor: pointer;">
    ▶️ Listen
  </button>
</div>

Back Template:

{{FrontSide}}
<hr id="answer">
<div class="target-text">{{Text}}</div>
<div class="translation"><b>Translation:</b> {{Translation}}</div>

This gives you:

  • Audio-first learning - Hear before you see
  • No autoplay - Active engagement with click-to-listen
  • Clean interface - Big, friendly play button
  • Full control - See text and translation after listening

Use Cases Beyond Language Learning

Pronunciation Practice

  • Medical terms with correct pronunciation
  • Scientific vocabulary
  • Historical names and places

Music Education

  • Chord progressions
  • Rhythm examples
  • Musical phrases

Accessibility

  • Audio descriptions for visually impaired learners
  • Text-to-speech for any content
  • Multi-modal learning experiences

Technical Details

Architecture

The mediaActions tool uses a dispatcher pattern:

  • Single unified tool with action parameter
  • Four specialized actions with dedicated implementations
  • Runtime validation for required fields
  • Full TypeScript type safety
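
To make the dispatcher idea concrete, here is a minimal sketch of the pattern, not the project's actual implementation. MediaRequest, REQUIRED_FIELDS, HANDLERS, and dispatch are hypothetical names. An in-memory Map stands in for Anki's media collection, and pattern filtering is simplified to a substring match.

```typescript
type MediaAction =
  | "storeMediaFile"
  | "retrieveMediaFile"
  | "getMediaFilesNames"
  | "deleteMediaFile";

// Hypothetical request shape; the real tool's fields may differ.
interface MediaRequest {
  action: MediaAction;
  filename?: string;
  data?: string;
  pattern?: string;
}

// Fields each action needs. Checked at runtime because requests arrive as
// untyped JSON, where compile-time types give no guarantee.
const REQUIRED_FIELDS: Record<MediaAction, (keyof MediaRequest)[]> = {
  storeMediaFile: ["filename", "data"],
  retrieveMediaFile: ["filename"],
  getMediaFilesNames: [],
  deleteMediaFile: ["filename"],
};

// In-memory stand-in for the media collection (filename -> base64 data).
const mediaStore = new Map<string, string>();

// One dedicated handler per action.
const HANDLERS: Record<MediaAction, (req: MediaRequest) => unknown> = {
  storeMediaFile: (req) => {
    mediaStore.set(req.filename!, req.data!);
    return req.filename;
  },
  retrieveMediaFile: (req) => mediaStore.get(req.filename!) ?? null,
  // Simplification: substring match instead of a real pattern syntax.
  getMediaFilesNames: (req) =>
    Array.from(mediaStore.keys()).filter((name) => name.includes(req.pattern ?? "")),
  deleteMediaFile: (req) => mediaStore.delete(req.filename!),
};

// Single entry point: validate, then route to the matching handler.
function dispatch(req: MediaRequest): unknown {
  if (!(req.action in HANDLERS)) {
    throw new Error(`unknown action: ${String(req.action)}`);
  }
  const missing = REQUIRED_FIELDS[req.action].filter((field) => req[field] === undefined);
  if (missing.length > 0) {
    throw new Error(`${req.action} is missing: ${missing.join(", ")}`);
  }
  return HANDLERS[req.action](req);
}
```

Keeping the required-field table next to the handler table means that adding a fifth action touches two adjacent records and nothing else.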

Experimental: Unified Tool Pattern

We’re testing a new approach with mediaActions to address a common pain point: tool approval fatigue.

Claude Desktop requires users to manually approve each tool on first use for security. With 27+ tools in our server, this means clicking “Allow” 27 times. By consolidating 4 media operations into a single mediaActions tool, users only need to approve once.

The Trade-off:

  • Fewer approvals - One tool instead of four
  • Simpler tool list - Less overwhelming for new users
  • New pattern - Differs from the traditional one-tool-per-action approach

We’re actively gathering feedback on this pattern. If it works well, we may apply it to other tool groups (model management, note operations, etc.). If users prefer individual tools for better discoverability, we can split them.

Your feedback matters! Let us know which approach you prefer in our Community & Support channels.

Path Aliases

We’ve also modernized the codebase with TypeScript path aliases:

  • @/mcp/* replaces relative imports like ../../../
  • Cleaner imports across 40+ files
  • Better IDE support and autocomplete

Testing

All media operations are thoroughly tested:

  • 14 new tests for media actions
  • 217 total tests passing
  • 100% coverage of media workflows

Important Notes

Cost Consideration

ElevenLabs API calls incur costs based on your subscription:

  • Check your plan limits before bulk generation
  • Consider batching multiple cards at once
  • Most efficient: Generate audio for an entire lesson in one session
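
One way to check a batch against your limits before generating anything is to total the text length up front. The sketch below assumes your plan is metered per character of input text. Confirm that against your own subscription, since the source does not specify the billing unit. countCharacters and fitsBudget are hypothetical helpers.

```typescript
// Assumption: usage is metered per character of input text.
function countCharacters(phrases: string[]): number {
  // Array.from iterates by code point, so characters outside the Basic
  // Multilingual Plane are counted once rather than as two UTF-16 units.
  return phrases.reduce((total, phrase) => total + Array.from(phrase).length, 0);
}

// True when the whole batch fits within the remaining allowance.
function fitsBudget(phrases: string[], remainingCharacters: number): boolean {
  return countCharacters(phrases) <= remainingCharacters;
}

// Illustrative lesson and allowance; substitute your own numbers.
const lessonPhrases = ["Hola", "Gracias", "Adios"];
const lessonFits = fitsBudget(lessonPhrases, 10000);
```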

Media Storage

  • Anki stores media in the collection.media folder
  • Files prefixed with _ are skipped by Anki's unused-media cleanup, so they won't be auto-deleted
  • Use descriptive filenames for easy management
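
The naming advice above can be sketched as a small helper that derives a descriptive filename from the phrase and optionally adds the underscore prefix. mediaFilename and its naming scheme are hypothetical. The slug step only handles Latin-script text, so other scripts fall back to a generic name.

```typescript
// Hypothetical helper: "<language>_<slug>.<ext>", with an optional leading
// underscore to protect the file from unused-media cleanup.
function mediaFilename(language: string, phrase: string, ext: string, protect = false): string {
  const slug = phrase
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accents
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse everything else into underscores
    .replace(/^_+|_+$/g, ""); // trim leading and trailing underscores
  // Non-Latin phrases produce an empty slug; fall back to a generic name.
  const name = `${language}_${slug === "" ? "clip" : slug}.${ext}`;
  return protect ? `_${name}` : name;
}

const helloFile = mediaFilename("spanish", "Hola, ¿cómo estás?", "mp3");
const protectedFile = mediaFilename("spanish", "Hola, ¿cómo estás?", "mp3", true);
```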

Get Started

Desktop Users (MCPB Bundle)

Download the latest bundle from GitHub:

Download v0.5.0

Then drag & drop the .mcpb file into Claude Desktop.

Web Mode Users

Update with one command:

npm install -g anki-mcp-http@latest

Or use without installing:

npx anki-mcp-http@latest

What’s Next

We’re still thinking about new features - keep in touch!

We’re actively seeking user feedback to shape the future of Anki MCP. Share your ideas and suggestions with us in our Community & Support channels.

Feedback Welcome

Try creating audio flashcards and let us know how it works for you!

Happy learning! 📚🎵


Full Changelog: v0.5.0 on GitHub