Agent Skills: Why Code Beats Markdown for Efficient LLM Scraping
Clip title: Agent Skills: Code Beats Markdown (Here’s Why)
Author / channel: Sam Witteveen
URL: https://www.youtube.com/watch?v=IjiaCOt7bP8
Summary
This video provides an in-depth look at Agent Skills (formerly Claude Skills) and best practices for developing efficient scraping skills for large language models (LLMs). The presenter highlights that Agent Skills have become a “killer tool” for helping both models and agent harnesses achieve tasks effectively. These skills have evolved into an open standard, adopted by major AI companies like Anthropic (Claude Code), OpenAI (Codex), and Google DeepMind (Antigravity, Gemini CLI), leading to a proliferation of available AI capabilities and marketplaces like skills.sh and skillsmp.com. The core mechanism behind Agent Skills’ effectiveness is “progressive disclosure” or “context engineering,” where a small “Skill Index” is always loaded, and more detailed instructions or scripts are only loaded into the model’s context window when triggered, optimizing token usage.
The structure of an Agent Skill typically includes a required SKILL.md
file for instructions and metadata, along with optional folders for scripts
(executable code), references (documentation/examples), and assets
(templates/resources). The video emphasizes the power of incorporating
scripts, which allow models to execute code in sandboxed environments,
rewrite scripts, retrieve more context, and interact with APIs, making them
significantly more efficient than solely relying on markdown instructions.
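The layout described above might look like the following sketch (a hypothetical news-scraping skill; the file and folder names follow the structure the video describes, but the specific contents are illustrative):

```
news-scraper/
├── SKILL.md          # required: metadata + instructions (the always-loaded index)
├── scripts/          # optional: executable code the model can run
│   └── scrape.py
├── references/       # optional: documentation/examples loaded on demand
│   └── site-notes.md
└── assets/           # optional: templates and other resources
    └── report-template.md
```

Only the metadata at the top of SKILL.md sits permanently in context; everything else is pulled in when the skill is triggered, which is what makes progressive disclosure cheap.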
A significant portion of the video is dedicated to common mistakes and best practices when building scraping skills for LLMs, focusing on optimizing token consumption and ensuring reliability. Key recommendations include:
- Stripping HTML: Instead of fetching and processing entire raw HTML pages, which can waste thousands of tokens, developers should pre-process and strip unnecessary elements (like scripts, styles, navigation, footers, ads) to retain only meaningful content.
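The stripping step can be sketched with nothing but the standard library. This is a minimal illustration, not the video's implementation; the tag list and function names are assumptions:

```python
from html.parser import HTMLParser

# Tags whose content is noise for an LLM (illustrative list).
SKIP_TAGS = {"script", "style", "nav", "footer", "aside", "header"}

class ContentStripper(HTMLParser):
    """Collects visible text while skipping boilerplate elements."""
    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # >0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def strip_html(raw: str) -> str:
    """Return only the meaningful text of a page."""
    parser = ContentStripper()
    parser.feed(raw)
    return "\n".join(parser.chunks)
```

Feeding the result to the model instead of the raw page is where the bulk of the token savings come from.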
- Prefilling Known Structures: Rather than making the LLM figure out a page’s structure in every run, hardcode known structures (e.g., CSS selectors for article titles, users, scores on a news site) into the script. The LLM can extract this information once, and the script can then apply these patterns for precise data extraction.
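The video frames these known structures as CSS selectors; as a stdlib-only sketch, the same idea can be shown with hardcoded regex patterns (the patterns and site markup here are hypothetical; a real skill would more likely store selectors for a parser library):

```python
import re

# Page structure the LLM extracted once, now frozen into the script.
PATTERNS = {
    "title": re.compile(r'<a class="storylink"[^>]*>([^<]+)</a>'),
    "score": re.compile(r'<span class="score">(\d+) points</span>'),
}

def extract_items(html: str) -> list[dict]:
    """Apply the prefilled patterns so the model never re-derives them."""
    titles = PATTERNS["title"].findall(html)
    scores = PATTERNS["score"].findall(html)
    return [{"title": t, "score": int(s)} for t, s in zip(titles, scores)]
```

The point is the division of labor: structure discovery happens once, and every subsequent run is pure mechanical extraction.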
- Using Scripts for Heavy Lifting: Delegate complex parsing and data conversion (e.g., to JSON) to the scripts, returning clean, structured data to the LLM. This reduces the LLM’s cognitive load and token usage.
- Defining Output Schemas: Establish a strict output schema for scraped data, ideally defined within the script (e.g., a JSON format for title, URL, date, summary, source). This ensures consistent output, simplifies downstream processing, and allows for easier comparison and validation.
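A schema of this kind can be sketched as follows; the field names match the video's example (title, URL, date, summary, source), while the helper names are illustrative:

```python
import json

# Every scraped item is normalized to the same fields before it
# reaches the model or a downstream consumer.
SCHEMA_FIELDS = ("title", "url", "date", "summary", "source")

def to_schema(item: dict) -> dict:
    """Coerce a raw parsed item into the fixed schema, failing loudly
    if a required field is missing."""
    missing = [f for f in SCHEMA_FIELDS if f not in item]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Drops extra keys and fixes field order for easy comparison.
    return {f: item[f] for f in SCHEMA_FIELDS}

def dump_report(items: list[dict]) -> str:
    """Serialize validated items as the skill's JSON report."""
    return json.dumps([to_schema(i) for i in items], indent=2)
```

Validating in the script means malformed items fail fast instead of silently producing inconsistent reports.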
- Batch Searches for Efficiency: For tasks involving multiple searches or fetches, execute them in parallel within the script to reduce round-trips and save time and cost, rather than performing sequential fetches.
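One common way to do this in Python is a thread pool; this is a sketch with a placeholder fetch function, not the video's code:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url: str) -> str:
    # Placeholder for a real HTTP GET (e.g., urllib.request.urlopen).
    return f"content of {url}"

def fetch_all(urls: list[str], max_workers: int = 8) -> dict[str, str]:
    """Run the fetches in parallel instead of one round-trip at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch, urls)
    return dict(zip(urls, results))
```

Because scraping is I/O-bound, threads are enough here; the savings come from overlapping network waits, not CPU parallelism.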
- Setting Hard Limits and Stop Conditions: Implement maximum call limits (e.g., max pages to fetch, max searches) and detect infinite-loop conditions in the SKILL.md to prevent excessive token consumption and resource usage, particularly when dealing with pagination. DataImpulse is introduced here as a solution for reliable proxies, which are crucial for preventing IP blocking during large-scale scraping operations.
- Designing for Incremental Runs: Incorporate logic to check for previous reports, skip already processed URLs, and only fetch genuinely new content. This minimizes redundant work and further optimizes token usage over time.
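The hard-limit and incremental-run ideas above can be combined in one loop. This is a simplified sketch: `MAX_PAGES` and the in-memory `seen` set are illustrative (a real skill would persist seen URLs between runs, e.g., in a state file):

```python
# Hypothetical hard cap; a real skill would also cap total searches.
MAX_PAGES = 5

def crawl(pages: list[list[str]], seen: set[str]) -> list[str]:
    """Walk paginated URL lists, stopping at the hard page limit and
    skipping URLs already processed in a previous run."""
    new_urls = []
    for page_num, page in enumerate(pages):
        if page_num >= MAX_PAGES:
            break  # hard stop condition: never fetch past the cap
        if not page:
            break  # empty page: treat as end of pagination
        for url in page:
            if url in seen:
                continue  # incremental run: skip previously seen content
            seen.add(url)
            new_urls.append(url)
    return new_urls
```

Together, the cap bounds the worst case and the seen-set makes repeat runs cheap, since only genuinely new URLs generate work.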
- Hardcoding Knowns: Store unchanging configuration details like base URLs, timeouts, categories, and selectors as hardcoded values or environment variables, preventing the LLM from repeatedly processing or deducing them.
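Hardcoded configuration can be as simple as module-level constants with environment-variable overrides. The names, defaults, and categories here are hypothetical:

```python
import os

# Unchanging configuration lives in the script, not in the prompt.
BASE_URL = os.environ.get("SCRAPER_BASE_URL", "https://news.example.com")
TIMEOUT_SECONDS = int(os.environ.get("SCRAPER_TIMEOUT", "15"))
CATEGORIES = ("tech", "science", "business")

def listing_url(category: str, page: int = 1) -> str:
    """Build a listing URL from hardcoded knowns; the model only
    supplies the category and page, never the base URL or layout."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return f"{BASE_URL}/{category}?page={page}"
```

Since these values never appear in the prompt, the model spends no tokens deducing or repeating them on each run.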
In conclusion, while Agent Skills offer powerful new capabilities for AI models, the video strongly emphasizes that effective development hinges on adhering to best practices focused on efficiency and token management. By being intentional about what goes into and comes out of the model’s context window, developers can create more cost-effective, reliable, and performant AI applications and agents.
Related Concepts
- Agent Skills
- LLM Efficiency
- Web Scraping
- Code-based scraping
- Markdown-based scraping
- Large Language Models
- Agent harnesses
- Context Engineering
- Progressive Disclosure
- Context Window Optimization
- Skill Index
- Data Extraction
- Token Consumption Optimization
- Structured Data
- HTML Stripping
- CSS Selectors
- Output Schema Definition
- Sandboxed Execution
Related Entities
- Sam Witteveen
- Anthropic
- OpenAI
- Google DeepMind
- Claude Code
- Codex
- Gemini CLI
- skills.sh
- skillsmp.com