URL to Markdown API — Convert Webpages at Scale to Clean Markdown
Send any URL and get back clean, structured markdown. JavaScript rendering, content extraction, and noise removal built in. Designed for LLM pipelines, RAG systems, and AI agents.
67% fewer tokens than HTML
JavaScript rendering included
Simple REST API
Credit-based pricing: pay per conversion
SIMPLE PROCESS
How the URL to Markdown API works
Send a URL
Make a POST request with any public webpage URL. The API handles JavaScript rendering and content extraction automatically.
Content extraction
The page is rendered in a real browser, navigation and ads are stripped, and the main content is identified and extracted.
Get clean markdown
Receive structured markdown with proper headings, lists, tables, code blocks, and links. Ready for LLMs, documentation, or any markdown-compatible system.
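The service's extraction logic is not documented on this page. Purely as an illustration of what "stripping navigation and ads" can mean, here is a toy filter that drops any text nested inside a fixed set of noise tags. The tag list, class name, and function name are all assumptions made for this sketch, and a production extractor would need to be far more careful.

```python
from html.parser import HTMLParser

# Tags treated as "noise" in this toy example (an assumption, not the service's rules).
NOISE_TAGS = {"nav", "script", "style", "aside", "footer", "header"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any noise tag."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # greater than 0 while inside a noise element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.parts.append(text)


def extract_main_text(html: str) -> str:
    """Return the visible text outside noise tags, one text node per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


sample = (
    "<html><nav>Home | About</nav><script>var x=1;</script>"
    "<main><h1>Title</h1><p>Body text.</p></main>"
    "<footer>Copyright</footer></html>"
)
print(extract_main_text(sample))
```

The sketch only shows the general idea of filtering by element type. It does not render JavaScript or produce markdown, both of which the API is described as handling for you.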
Features
Built for developers and AI teams
Everything you need to convert webpages to markdown at scale.
Use Cases
Who uses URL to Markdown?
From AI engineers building RAG systems to content teams archiving documentation — here's how teams use LeadMagic's URL to Markdown API.
AI Agents & LLM Pipelines
Give your AI agents the ability to read and understand any webpage. Clean markdown input produces better LLM outputs with fewer tokens.
RAG Systems
Feed web content into retrieval-augmented generation pipelines. Markdown preserves document structure for better chunking and retrieval.
Content & Documentation
Convert web-based documentation, knowledge bases, and articles into portable markdown files for static site generators and wikis.
Developer Tools
Build internal tools, browser extensions, and automations that convert web content to markdown for processing, archival, or analysis.
Website to Markdown API for Any Webpage
Convert any website or webpage to markdown with a single API call. Our endpoint handles the full pipeline: fetching the page, rendering JavaScript, extracting the main content, and returning clean, structured markdown.
Works with any webpage — blog posts, documentation, landing pages, product pages, news articles. The API normalizes all output to consistent markdown format regardless of the source site's HTML structure.
Web Scraping for LLM Applications
Purpose-built for teams doing web scraping for LLM pipelines, RAG systems, and AI agents. Markdown output is token-efficient, preserves semantic structure, and chunks cleanly for vector embeddings.
Unlike general web scraping tools, our API focuses on content quality over crawl breadth. One URL in, clean markdown out — ready for your LLM data pipeline.
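As a rough sketch of why heading structure helps with chunking, the function below splits a markdown string into one chunk per heading-delimited section, which could then be embedded individually. The function name and the chunk dictionary shape are hypothetical, and real pipelines often add size limits or overlap on top of this.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")  # ATX-style markdown headings
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split markdown into chunks, one per heading-delimited section."""
    chunks = []
    current = {"heading": None, "lines": []}
    in_fence = False

    def has_content(chunk):
        return chunk["heading"] is not None or any(l.strip() for l in chunk["lines"])

    for line in markdown.splitlines():
        # Track fenced code blocks so that "# comment" lines are not read as headings.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            if has_content(current):
                chunks.append(current)
            current = {"heading": match.group(2).strip(), "lines": []}
        else:
            current["lines"].append(line)

    if has_content(current):
        chunks.append(current)
    return [
        {"heading": c["heading"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Each chunk keeps its heading alongside its text, so the heading can be stored as metadata or prepended to the text before embedding.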
Developer Integration
API and MCP Server for AI Tools
Integrate with a simple REST call or connect directly as an MCP (Model Context Protocol) server — giving AI assistants like Claude, Cursor, and custom agents the ability to read any webpage.
REST API
curl -X POST https://api.web2md.app/api/scrape \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"url":"https://example.com"}'

One endpoint. Send a URL, get markdown back. JSON response with metadata.
Works with Python, Node.js, Go, Ruby — any language that can make HTTP requests.
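The curl call above translates to Python roughly as follows, using only the standard library. The endpoint, headers, and request body come from the curl example; the function names and the timeout value are assumptions for this sketch. The response's field names are not listed on this page, so the parsed JSON is returned as-is rather than guessing at keys.

```python
import json
import urllib.request

API_URL = "https://api.web2md.app/api/scrape"  # endpoint from the curl example


def build_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def url_to_markdown(page_url: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response.

    The response schema is not documented here, so no fields are assumed.
    """
    request = build_request(page_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `url_to_markdown("https://example.com", "YOUR_API_KEY")` would perform the same request as the curl command. Inspect the returned dictionary to see which keys hold the markdown and the metadata.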
MCP Server
{
  "mcpServers": {
    "web2md": {
      "url": "https://mcp.web2md.app/sse"
    }
  }
}

Add as an MCP server in Claude Desktop, Cursor, Windsurf, or any MCP-compatible client. Your AI assistant can then read any webpage as part of its workflow.
Model Context Protocol support means your AI tools get web reading capability without custom integration code.
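If a client already has other MCP servers configured, the web2md entry needs to be merged in rather than pasted over the existing file. A minimal sketch of that merge is shown below; the `other-tool` entry is a made-up placeholder, and where the configuration file lives (and whether a given client accepts a `url` entry) depends on the client, so check its documentation.

```python
import json

WEB2MD_ENTRY = {"url": "https://mcp.web2md.app/sse"}  # from the config shown above


def add_web2md_server(config: dict) -> dict:
    """Return a copy of config with the web2md server added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["web2md"] = dict(WEB2MD_ENTRY)
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-tool": {"command": "other-tool-server"}}}
print(json.dumps(add_web2md_server(existing), indent=2))
```

The function copies the input instead of mutating it, so the original configuration can still be written back unchanged if something goes wrong.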
Works with your existing AI stack
LangChain / LlamaIndex
Use as a document loader. Fetch URLs and feed clean markdown directly into your RAG pipeline or agent chain.
CrewAI / AutoGen
Give agents web browsing capability. Each agent can read and understand any public webpage as part of its task execution.
Claude / ChatGPT / Cursor
Via MCP integration, AI assistants can fetch and process web content in real time during conversations and coding sessions.
Custom Agents
Simple REST API works with any framework. Build web-aware agents without managing headless browsers or scraping infrastructure.
Why markdown instead of raw HTML?
Raw HTML is noisy. A typical webpage contains navigation, tracking scripts, ad containers, style declarations, and thousands of characters of markup that add zero value to the actual content. When you feed raw HTML to a language model, you're burning tokens on noise.
Markdown preserves document structure (headings, lists, tables, code blocks, links) while stripping everything else. In the comparison below, a page of roughly 15,000 HTML tokens comes out at roughly 5,000 markdown tokens, about 67% fewer for the same content, which means lower LLM costs and less noise for the model to work through.
For a deeper dive into the format differences, read our guide on Markdown vs HTML for AI applications.
Token comparison
Raw HTML
- ~15,000 tokens per page
- Navigation and ads included
- Script tags and styles
- Noisy for LLMs
Clean Markdown
- ~5,000 tokens per page
- Main content only
- Proper heading structure
- LLM-optimized
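The 67% figure follows directly from the two per-page estimates above. A short check of the arithmetic, where the function name is chosen only for this sketch and the token counts are the page's approximate figures rather than measurements:

```python
def token_reduction(html_tokens: int, markdown_tokens: int) -> float:
    """Fraction of tokens saved by using markdown instead of HTML."""
    return 1 - markdown_tokens / html_tokens


# Approximate per-page figures from the comparison above.
reduction = token_reduction(15_000, 5_000)
print(f"{reduction:.0%}")  # prints 67%
```

Actual savings will vary from page to page, depending on how much of the HTML is markup, scripts, and navigation.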
Free tools to try
Not ready for the API? Try our free browser-based converters.
Ready to convert webpages to markdown?
Sign up for LeadMagic and start converting URLs to clean markdown. Credit-based pricing, no contracts, credits roll over.