Category landing page

Best media skills for OpenClaw

Audio, voice, TTS, video, and media-processing skills for AI agent workflows.


Matched skills

30

Real skills currently mapped into this landing page cluster.

Average security

60.9

A quick trust signal across the skills currently surfaced here.

Combined installs

10.6K

Adoption signal across the skills shown on this page.
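The two headline figures above can be reproduced from the per-skill values on the cards listed further down this page. The sketch below is only a cross-check under stated assumptions: the rows are transcribed by hand from the cards, identical cards are grouped, and "4.7K" is read as 4,700, so the install total is approximate. The name `cluster_summary` is hypothetical and not part of SkillsReview.

```python
# Hand-transcribed from the 30 cards on this page, grouped by identical values.
# Each row is (number_of_skills, security_score, installs_per_skill).
ROWS = [
    (4, 85, 300),    # ranks 1-4
    (3, 85, 0),      # ranks 5-7
    (2, 71, 0),      # ranks 8-9
    (2, 70, 4700),   # ranks 10-11, displayed as "4.7K" (assumed to mean 4,700)
    (19, 50, 0),     # ranks 12-30
]


def cluster_summary(rows):
    """Return (skill_count, average_security, combined_installs)."""
    skill_count = sum(count for count, _, _ in rows)
    security_total = sum(count * security for count, security, _ in rows)
    combined_installs = sum(count * installs for count, _, installs in rows)
    return skill_count, round(security_total / skill_count, 1), combined_installs


print(cluster_summary(ROWS))  # (30, 60.9, 10600)
```

Under these assumptions the result matches the 30 matched skills, the 60.9 average security score, and the 10.6K combined installs shown above.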

Why this page exists

Media-focused skills often rank well on long-tail search because users search for concrete jobs like text-to-speech, audio processing, screenshots, or video extraction.

This first-slice landing page groups those jobs into a scalable media cluster that can later expand into dedicated voice, audio, and video pages.

Top skills in this cluster

Ranked with live SkillsReview data and linked into detail pages that can convert search traffic into actual usage.

See full leaderboard →

Advanced filters

Combine security, update window, activity, and reputation filters. The current filter state stays in the query string so this category view is shareable.
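As a rough illustration of how filter state can live in the query string, here is a minimal sketch using Python's standard `urllib.parse`. The parameter names (`security`, `updated`, `activity`, `reputation`), the `"any"` default, and the example values are assumptions made for this example; the page does not document its actual parameter names or encoding.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical filter keys and defaults; the real page may use different names.
DEFAULTS = {
    "security": "any",
    "updated": "any",
    "activity": "any",
    "reputation": "any",
}


def filters_to_query(filters):
    """Serialize only the non-default filters, in a stable (sorted) key order."""
    active = {
        key: value
        for key, value in sorted(filters.items())
        if key in DEFAULTS and value != DEFAULTS[key]
    }
    return urlencode(active)


def query_to_filters(query):
    """Rebuild the full filter state from a query string, filling in defaults."""
    parsed = parse_qs(query)
    state = dict(DEFAULTS)
    for key in DEFAULTS:
        if key in parsed:
            state[key] = parsed[key][0]
    return state


query = filters_to_query({"security": "80", "updated": "30d"})
print(query)                    # security=80&updated=30d
print(query_to_filters(query))
```

Keeping defaults out of the query string keeps shared links short, and sorting the keys means the same filter state always produces the same URL.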

Security | Updated | Activity | Reputation

#1

camsnap

by openclaw

Security 85

Capture frames or clips from RTSP/ONVIF cameras.

Category: media
Installs: 300
Stars: 100
Reviews: 5


#2

sherpa-onnx-tts

by openclaw

Security 85

Local text-to-speech via sherpa-onnx (offline, no cloud)

Category: ai
Installs: 300
Stars: 100
Reviews: 5


#3

songsee

by openclaw

Security 85

Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.

Category: media
Installs: 300
Stars: 100
Reviews: 5


#4

summarize

by openclaw

Security 85

Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).

Category: utility
Installs: 300
Stars: 100
Reviews: 5


#5

Nano Pdf

by community

Security 85

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 5


#6

Video Frames

by community

Security 85

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 5


#7

Voice Call

by community

Security 85

OpenClaw skill indexed by SkillsReview.

Category: other
Installs: 0
Stars: 0
Reviews: 5


#8

Agent Profile Images

by community

Security 71

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 96


#9

OpenClaw 11-in-1 Visual Automation Suite (Windows Only)

Complete visual automation toolkit with 11 integrated modules.

### 💰 Price

One-time purchase: $2.99 (Lifetime access to all modules + future updates)

### 🚀 How to Purchase

1. Pay via PayPal Invoice: 🔗 [Click to pay $2.99](https://www.paypal.com/invoice/p/#V2RC9S8LVKJ434R9)
2. After payment, send your email to: 1215066513@qq.com
3. I will send the full download link within 12 hours.

### 🖥️ Compatibility

- Windows 10 / 11 only
- Not compatible with macOS / Linux

## 1. Product Basic Description

### 1.1 Core Functions

Provides professional universal computer vision automation capabilities covering the full-process visual automation scenarios such as environment initialization, full-screen automatic screenshot, OCR text recognition, template matching target localization, mouse click simulation, keyboard input simulation, and complete environment initialization & cleanup mechanisms. It supports custom task combination and cyclic execution.

### 1.2 Version & Directory Description

- Core Capability: Flexible invocation based on minimum executable units, supporting parameter customization, result variable inheritance, and custom skill saving. All functions can be used directly with the `call` command right after extracting the package.
- Directory Structure:
  - `claw.json` - Skill package configuration file
  - `skills/all_skills.claw` - All skill unit definitions
  - `templates/` - Directory for template images (place your template images here for matching)
  - Temporary file directory `temp/` (for storing screenshots like temp/screen.png) is automatically created after executing `init_env`; temporary screenshot files can be cleaned up via `clean_temp`.
- Version Info: Current version: 1.0.0; Compatible with OpenClaw >= 1.0.0

### 1.3 Paid Attribute

This automation skill system (vision-auto-tool-pro) is a paid professional toolkit. The document does not explicitly authorize commercial use of the toolkit. The paid permission only covers basic usage (non-commercial by default), and commercial use requires separate confirmation of authorization with the provider (e.g., purchasing a commercial license, signing a commercial agreement).

## 2. Complete Skill Invocation Manual

### Important Notes

Ensure sufficient time is reserved for the computer to respond to each click or operation. For example, add a 2-second wait after `mouse_click` to avoid operation failure due to slow system response.

### 2.1 List of All Minimum Executable Units

| Unit Name | Fixed Call Name | Function Description | Individual Call Method |
|---|---|---|---|
| Initialize Environment | `init_env` | Create directory structure, clear temporary files, check template directory | `call init_env` |
| Full Screen Screenshot | `screenshot_full` | Capture entire screen and save as temp/screen.png | `call screenshot_full` |
| Check Screenshot Validity | `check_screenshot_valid` | Check for black screen/freeze, wake up the interface if invalid | `call check_screenshot_valid` |
| Wake Interface | `wake_window` | Solve the problems of background non-rendering and black screenshot | `call wake_window` |
| OCR Recognition | `ocr_recognize` | Recognize all text on the screen and their corresponding coordinates | `call ocr_recognize` |
| Template Matching | `template_match` | Use template image to match and locate icons/buttons | `call template_match category template_name` |
| Unified Localization | `locate_target` | Prioritize OCR positioning; use template matching if not found, return coordinates | `call locate_target target_text OR category+template_name` |
| Mouse Click | `mouse_click` | Move to the specified coordinates and perform click operation | `call mouse_click X Y [click_type, default=single_click]` |
| Keyboard Input | `keyboard_input` | Input text after locating the input box | `call keyboard_input target_coords/description input_content` |
| Clean Temporary Files | `clean_temp` | Delete temporary screenshots and free up storage space | `call clean_temp` |
| Loop Restart | `loop_restart` | Wait 2 seconds then go back to the screenshot step and restart the process | `call loop_restart` |

### 2.2 Method for Invoking Individual Units

#### Invocation Format

```
call [unit_call_name] [parameter...]
```

#### Invocation Examples

- Initialize environment: `call init_env`
- Template match browser icon on desktop: `call template_match desktop web`
- Perform double-click at coordinates (100,200): `call mouse_click 100 200 double`

### 2.3 Combine into Custom New Tasks

By writing one call instruction per line in execution order, you can combine them into a custom new task, which supports variable inheritance, looping, and permanent saving.

#### Format Example (Open Browser)

```
# Task Name: Open Browser
call init_env
call screenshot_full
call check_screenshot_valid
call locate_target browser desktop Browser
call mouse_click {{resultX}} {{resultY}} double
call clean_temp
```

#### Combination Steps

1. Write task name and description first (for easier identification later)
2. In execution order, write one `call unit_name parameters` instruction per line
3. Coordinates can use variables `{{resultX}}`/`{{resultY}}` to inherit the output result of the previous unit
4. If cyclic execution is required, add `call loop_restart` at the end
5. Save custom skill: Use `save_skill skill_name instruction_list` to save the task permanently, then call it directly with `call skill_name`

### 2.4 Complete Main Flow Invocation Example

```
# General Main Flow: vision_auto_main
call init_env
call screenshot_full
call check_screenshot_valid
call ocr_recognize
# If template matching is needed, add this line: call template_match category name
call locate_target target_text
call mouse_click {{X}} {{Y}}
# If text input is needed, replace the above line with: call keyboard_input {{X}} {{Y}} input_content
call clean_temp
# Add this line if you need to loop: call loop_restart
```

### Important Notes

Ensure sufficient time is reserved for the computer to respond to each click or operation.

> For example, add a 2-second wait after `mouse_click` to avoid operation failure due to slow system response.

by community

Security 71

OpenClaw skill indexed by SkillsReview.

Category: web
Installs: 0
Stars: 0
Reviews: 99


#10

MiniMax Coding Plan Tool Patched

by community

Security 70

Use MiniMax Coding Plan API for real-time web search and image understanding (VLM). Based on yorch233/minimax-coding-plan-tool, patched to use api.minimax.io...

Category: development
Installs: 4.7K
Stars: 47
Reviews: 94


#11

Coding Tutorial Video

by community

Security 70

Coding Tutorial Video Maker — Create Programming Walkthroughs and Dev Tutorials. Works by connecting to the NemoVideo AI backend. Supports MP4, MOV, AVI, Web...

Category: development
Installs: 4.7K
Stars: 47
Reviews: 94


#12

(Google) Veo 3 Video Gen

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#13

04 Text To Image

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#14

05 Image To Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#15

06 Tts Voice

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: other
Installs: 0
Stars: 0
Reviews: 0


#16

08 Video Merge

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#17

1. The input is a Word template document. 2. Analyze the structure of this template, including: 1) font styles, sizes, etc., for headings, body text, etc.; 2) content: the general structure of each paragraph; 3) multimedia elements: images, text, tables (including metric data, etc.). 3. Generate documents that conform to the template structure based on the given data: 1) a knowledge base, or multiple documents; 2) relationship tables (or CSV files), SQL query results, etc.

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: data
Installs: 0
Stars: 0
Reviews: 0


#18

1031 Exchange Company Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#19

1031 Exchange Intermediary Company Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#20

1031 Exchange Service Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#21

401k Advisor Company Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#22

5g Network Company Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#23

A B Testing Company Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: coding
Installs: 0
Stars: 0
Reviews: 0


#24

A B Testing Platform Video

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: coding
Installs: 0
Stars: 0
Reviews: 0


#25

A Mathematics Problem-Solving Coach Skill Based on Socratic Dialogue Guidance

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#26

A skill that automates repurposing Chinese social videos (Douyin/Bilibili/Xiaohongshu) to international platforms (TikTok/YouTube/Instagram) via the Lumi API — handling translation, AI dubbing, and publishing in one workflow.

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: ai
Installs: 0
Stars: 0
Reviews: 0


#27

A4 To A3 Pdf

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: media
Installs: 0
Stars: 0
Reviews: 0


#28

ADP Global Invoice Extraction · Free API

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: other
Installs: 0
Stars: 0
Reviews: 0


#29

AI Content Brief, Script & Outline Generator — Research Assistant for Video & Image generation

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: web
Installs: 0
Stars: 0
Reviews: 0


#30

AI Image & Video Toolkit — Free Upscale, Face Enhance, BG Remove & Generation

by community

Security 50

OpenClaw skill indexed by SkillsReview.

Category: ai
Installs: 0
Stars: 0
Reviews: 0


Popular comparisons from this cluster

Internal comparison links help users evaluate adjacent options and give search engines deeper crawlable structure.

camsnap vs sherpa-onnx-tts
camsnap vs songsee
sherpa-onnx-tts vs songsee

Related landing pages

This internal-link graph is the scalable part of the rollout: each new template adds more crawlable surfaces without hand-copying content.

Explore more hubs →

Best Voice and Media Skills

For TTS, audio processing, video workflows, and media automation.

Best Browser Automation Skills

For browsing, scraping, QA, site control, and web task execution.

Best AI Skills for OpenClaw

Skills focused on models, LLM workflows, prompts, transcription, and AI-native automation.

FAQ

Is transcription included on the media page?

Often yes, especially when skills overlap between media processing and AI model usage.

Will there be separate TTS or audio pages later?

Yes. The taxonomy and templates were designed so narrower voice, TTS, and audio pages can be added without rewriting the system.