Category landing page

Best media skills for OpenClaw

Audio, voice, TTS, video, and media-processing skills for AI agent workflows.

Matched skills

693

Real skills currently mapped into this landing page cluster.

Average security

75.9

A quick trust signal across the skills currently surfaced here.

Combined installs

12.8K

Adoption signal across the skills shown on this page.

Why this page exists

Media-focused skills often rank well on long-tail search because users search for concrete jobs like text-to-speech, audio processing, screenshots, or video extraction.

This first-slice landing page groups those jobs into a scalable media cluster that can later expand into dedicated voice, audio, and video pages.

Top skills in this cluster

Ranked with live SkillsReview data and linked into detail pages that can convert search traffic into actual usage.

See full leaderboard →
#1

sag

by openclaw

Security 85

ElevenLabs text-to-speech with mac-style say UX.

Category
media
Installs
930
Stars
310
Reviews
12
#2

blucli

by openclaw

Security 85

BluOS CLI (blu) for discovery, playback, grouping, and volume.

Category
media
Installs
300
Stars
100
Reviews
5
#5

songsee

by openclaw

Security 85

Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.

Category
media
Installs
300
Stars
100
Reviews
5
#7

summarize

by openclaw

Security 85

Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).

Category
utility
Installs
300
Stars
100
Reviews
5
#9

xurl

by openclaw

Security 85

A CLI tool for making authenticated requests to the X (Twitter) API. Use this skill when you need to post tweets, reply, quote, search, read posts, manage followers, send DMs, upload media, or interac

Category
utility
Installs
300
Stars
100
Reviews
5
#13

OpenClaw 11-in-1 Visual Automation Suite (Windows Only)

Complete visual automation toolkit with 11 integrated modules.

### 💰 Price

One-time purchase: **$2.99** (Lifetime access to all modules + future updates)

### 🚀 How to Purchase

1. Pay via PayPal Invoice: 🔗 [Click to pay $2.99](https://www.paypal.com/invoice/p/#V2RC9S8LVKJ434R9)
2. After payment, send your email to: **1215066513@qq.com**
3. I will send the full download link within 12 hours.

### 🖥️ Compatibility

- Windows 10 / 11 only
- Not compatible with macOS / Linux

## 1. Product Basic Description

### 1.1 Core Functions

Provides professional universal computer vision automation capabilities covering the full-process visual automation scenarios such as environment initialization, full-screen automatic screenshot, OCR text recognition, template matching target localization, mouse click simulation, keyboard input simulation, and complete environment initialization & cleanup mechanisms. It supports custom task combination and cyclic execution.

### 1.2 Version & Directory Description

- Core Capability: Flexible invocation based on minimum executable units, supporting parameter customization, result variable inheritance, and custom skill saving. All functions can be used directly with the `call` command right after extracting the package.
- Directory Structure:
  - `claw.json` - Skill package configuration file
  - `skills/all_skills.claw` - All skill unit definitions
  - `templates/` - Directory for template images (place your template images here for matching)
  - Temporary file directory `temp/` (for storing screenshots like temp/screen.png) is automatically created after executing `init_env`; temporary screenshot files can be cleaned up via `clean_temp`.
- Version Info: Current version: 1.0.0; Compatible with OpenClaw >= 1.0.0

### 1.3 Paid Attribute

This automation skill system (vision-auto-tool-pro) is a paid professional toolkit. The document does not explicitly authorize commercial use of the toolkit. The paid permission only covers basic usage (non-commercial by default), and commercial use requires separate confirmation of authorization with the provider (e.g., purchasing a commercial license, signing a commercial agreement).

## 2. Complete Skill Invocation Manual

### Important Notes

Ensure sufficient time is reserved for the computer to respond to each click or operation. For example, add a 2-second wait after `mouse_click` to avoid operation failure due to slow system response.

### 2.1 List of All Minimum Executable Units

| Unit Name | Fixed Call Name | Function Description | Individual Call Method |
|---|---|---|---|
| Initialize Environment | `init_env` | Create directory structure, clear temporary files, check template directory | `call init_env` |
| Full Screen Screenshot | `screenshot_full` | Capture entire screen and save as temp/screen.png | `call screenshot_full` |
| Check Screenshot Validity | `check_screenshot_valid` | Check for black screen/freeze, wake up the interface if invalid | `call check_screenshot_valid` |
| Wake Interface | `wake_window` | Solve the problems of background non-rendering and black screenshot | `call wake_window` |
| OCR Recognition | `ocr_recognize` | Recognize all text on the screen and their corresponding coordinates | `call ocr_recognize` |
| Template Matching | `template_match` | Use template image to match and locate icons/buttons | `call template_match category template_name` |
| Unified Localization | `locate_target` | Prioritize OCR positioning; use template matching if not found, return coordinates | `call locate_target target_text OR category+template_name` |
| Mouse Click | `mouse_click` | Move to the specified coordinates and perform click operation | `call mouse_click X Y [click_type, default=single_click]` |
| Keyboard Input | `keyboard_input` | Input text after locating the input box | `call keyboard_input target_coords/description input_content` |
| Clean Temporary Files | `clean_temp` | Delete temporary screenshots and free up storage space | `call clean_temp` |
| Loop Restart | `loop_restart` | Wait 2 seconds then go back to the screenshot step and restart the process | `call loop_restart` |

### 2.2 Method for Invoking Individual Units

#### Invocation Format

```
call [unit_call_name] [parameter...]
```

#### Invocation Examples

- Initialize environment: `call init_env`
- Template match browser icon on desktop: `call template_match desktop web`
- Perform double-click at coordinates (100,200): `call mouse_click 100 200 double`

### 2.3 Combine into Custom New Tasks

By writing one call instruction per line in execution order, you can combine them into a custom new task, which supports variable inheritance, looping, and permanent saving.

#### Format Example (Open Browser)

```
# Task Name: Open Browser
call init_env
call screenshot_full
call check_screenshot_valid
call locate_target browser desktop Browser
call mouse_click {{resultX}} {{resultY}} double
call clean_temp
```

#### Combination Steps

1. **Write task name and description first** (for easier identification later)
2. **In execution order**, write one `call unit_name parameters` instruction per line
3. Coordinates can use variables `{{resultX}}`/`{{resultY}}` to inherit the output result of the previous unit
4. If cyclic execution is required, add `call loop_restart` at the end
5. **Save custom skill**: Use `save_skill skill_name instruction_list` to save the task permanently, then call it directly with `call skill_name`

### 2.4 Complete Main Flow Invocation Example

```
# General Main Flow: vision_auto_main
call init_env
call screenshot_full
call check_screenshot_valid
call ocr_recognize
# If template matching is needed, add this line: call template_match category name
call locate_target target_text
call mouse_click {{X}} {{Y}}
# If text input is needed, replace the above line with: call keyboard_input {{X}} {{Y}} input_content
call clean_temp
# Add this line if you need to loop: call loop_restart
```

### Important Notes

Ensure sufficient time is reserved for the computer to respond to each click or operation.

> For example, add a 2-second wait after `mouse_click` to avoid operation failure due to slow system response.
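The task format described above (one `call` instruction per line, with `{{resultX}}`/`{{resultY}}` inheriting an earlier result) can be illustrated with a small Python sketch. This is not the toolkit's interpreter, which is not documented here: `parse_task`, `fill_placeholders`, and the coordinate values are hypothetical names and data introduced only to show how such a script might be read and how placeholder substitution could work.

```python
import re

# Hypothetical helpers for illustration; the toolkit's real parser is not shown in the listing.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def parse_task(script: str) -> list[tuple[str, list[str]]]:
    """Turn a task script into (unit_name, args) pairs, skipping comments and blank lines."""
    steps = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        keyword, *rest = line.split()
        if keyword != "call" or not rest:
            raise ValueError(f"expected 'call <unit> [args...]', got: {line!r}")
        steps.append((rest[0], rest[1:]))
    return steps


def fill_placeholders(args: list[str], results: dict) -> list[str]:
    """Replace {{name}} placeholders with values taken from an earlier step's results."""
    return [PLACEHOLDER.sub(lambda m: str(results[m.group(1)]), a) for a in args]


task = """
# Task Name: Open Browser
call init_env
call screenshot_full
call check_screenshot_valid
call locate_target browser desktop Browser
call mouse_click {{resultX}} {{resultY}} double
call clean_temp
"""

steps = parse_task(task)
# Assumed output of locate_target; these coordinates are made up for the example.
results = {"resultX": 100, "resultY": 200}
resolved = [(unit, fill_placeholders(args, results)) for unit, args in steps]
print(resolved[4])  # ('mouse_click', ['100', '200', 'double'])
```

Keeping parsing separate from substitution means a script can be checked for malformed lines before any screenshot or click is attempted.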

by community

Security 71

OpenClaw skill indexed by SkillsReview.

Category
web
Installs
0
Stars
0
Reviews
99
Security 70

Use MiniMax Coding Plan API for real-time web search and image understanding (VLM). Based on yorch233/minimax-coding-plan-tool, patched to use api.minimax.io...

Category
development
Installs
4.7K
Stars
47
Reviews
94
#15

Coding Tutorial Video

by community

Security 70

Coding Tutorial Video Maker: Create Programming Walkthroughs and Dev Tutorials. Works by connecting to the NemoVideo AI backend. Supports MP4, MOV, AVI, Web...

Category
development
Installs
4.7K
Stars
47
Reviews
94

OpenClaw skill indexed by SkillsReview.

Category
data
Installs
0
Stars
0
Reviews
0

Popular comparisons from this cluster

Internal comparison links help users evaluate adjacent options and give search engines deeper crawlable structure.

Related landing pages

This internal-link graph is the scalable part of the rollout: each new template adds more crawlable surfaces without hand-copying content.

Explore more hubs →

FAQ

Is transcription included on the media page?

Often yes, especially when skills overlap between media processing and AI model usage.

Will there be separate TTS or audio pages later?

Yes. The taxonomy and templates were designed so narrower voice, TTS, and audio pages can be added without rewriting the system.