What is the easiest way to convert HTML to markdown?

The easiest way is to use a free online converter like LeadMagic's HTML to Markdown tool. Paste your HTML, click convert, and copy the output. No signup or installation needed.

Can I convert HTML to markdown with Python?

Yes. Use the markdownify or html2text Python libraries. Install with pip install markdownify, then call markdownify.markdownify(html_string) to convert HTML to markdown programmatically.

How do I convert HTML to markdown in JavaScript?

Use the Turndown library. Install with npm install turndown, create a TurndownService instance, and call turndownService.turndown(htmlString). It works in both Node.js and the browser.

What's the best HTML to markdown API?

LeadMagic's URL to Markdown API converts any live webpage to clean markdown with a single API call. It handles JavaScript rendering, content extraction, and noise removal. Credit-based pricing with no separate subscription.

Does converting HTML to markdown lose formatting?

Standard formatting such as headings, bold, italic, links, images, lists, and code blocks is preserved. Complex CSS styling, custom fonts, and layout-specific HTML are not, because markdown is a content format, not a presentation format.

Developer · 12 min read

How to Convert HTML to Markdown: Complete Guide (2026)

Learn how to convert HTML to Markdown with Python, JavaScript, Pandoc, and online tools. Code examples and comparison table included.

Patrick Spielmann

February 17, 2026

HTML is everywhere. It powers every webpage you've ever visited. But when you need to work with that content — drop it into a CMS, feed it to an LLM, store it in a knowledge base, or write documentation — HTML is the wrong format. Too much noise. Too many tags. Too little signal.

That's where converting HTML to markdown comes in. Markdown gives you the content without the cruft: headings, links, bold, code blocks, tables — all in a format that's human-readable and machine-friendly.

This guide covers every practical method to convert HTML to markdown, from pasting into a free online tool to running Python scripts to calling an API at scale. Pick the approach that fits your workflow.

How to Convert HTML to Markdown Online

The fastest way to convert HTML to markdown is with a browser-based tool. No installs, no dependencies, no accounts.

LeadMagic's free HTML to Markdown converter does exactly this. Paste your HTML into the left panel, get clean markdown in the right panel. Copy it, download it, done.

This works well for one-off conversions: grabbing content from a webpage, cleaning up an email template, converting a blog post draft, or stripping HTML from a CMS export. If you're converting a handful of pages, an online tool is all you need.

For live webpages, the URL to Markdown tool takes it a step further — paste a URL and it fetches, renders, and converts the page content automatically. No need to view-source and copy the HTML yourself.

When to use an online converter:

One-off or occasional conversions
Quick cleanup of HTML snippets
Non-technical users who don't want to install anything
Previewing what the markdown output will look like

Convert HTML to Markdown with Python

If you need to convert HTML to markdown programmatically — inside a script, a data pipeline, or a backend service — Python has two solid libraries.

markdownify

markdownify is the most popular Python library for HTML-to-markdown conversion. It wraps BeautifulSoup and handles most standard HTML elements out of the box.

pip install markdownify

from markdownify import markdownify

html = """
<h1>Project Update</h1>
<p>We shipped <strong>three features</strong> this week:</p>
<ul>
  <li>Email verification API</li>
  <li>Bulk CSV enrichment</li>
  <li>Webhook notifications</li>
</ul>
<p>Read the <a href="https://example.com/changelog">full changelog</a>.</p>
"""

markdown = markdownify(html, heading_style="ATX")
print(markdown)

Output:

# Project Update

We shipped **three features** this week:

* Email verification API
* Bulk CSV enrichment
* Webhook notifications

Read the [full changelog](https://example.com/changelog).

markdownify handles headings, lists, links, images, bold, italic, code, and blockquotes. You can customize the output with options like heading_style (ATX vs Setext), strip (remove specific tags), and convert (only process specific tags).

For batch processing, wrap it in a loop:

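import os

from bs4 import BeautifulSoup
from markdownify import markdownify

html_dir = "exported_pages"
output_dir = "markdown_output"

os.makedirs(output_dir, exist_ok=True)

for filename in os.listdir(html_dir):
    if not filename.endswith(".html"):
        continue
    with open(os.path.join(html_dir, filename), encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")
    # markdownify's strip= only unwraps a tag and keeps its text, so delete
    # <script> and <style> elements outright before converting.
    for tag in soup(["script", "style"]):
        tag.decompose()
    md = markdownify(str(soup), heading_style="ATX")
    # Swap only the trailing extension, not every ".html" in the name.
    md_filename = os.path.splitext(filename)[0] + ".md"
    with open(os.path.join(output_dir, md_filename), "w", encoding="utf-8") as f:
        f.write(md)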

html2text

html2text is another Python option, originally written by Aaron Swartz. It focuses on producing readable plain text with markdown formatting.

pip install html2text

import html2text

converter = html2text.HTML2Text()
converter.ignore_links = False
converter.ignore_images = False
converter.body_width = 0  # Don't wrap lines

html = "<h2>API Docs</h2><p>Send a <code>POST</code> request to <a href='/api/enrich'>/api/enrich</a>.</p>"
print(converter.handle(html))

markdownify vs html2text: markdownify gives you more control over output formatting and handles edge cases better (nested lists, complex tables). html2text is lighter and faster for simple conversions where you mostly want readable text. For most use cases, start with markdownify.

HTML to Markdown in JavaScript

For JavaScript and TypeScript projects, Turndown is the standard library. It runs in both Node.js and the browser.

Node.js

npm install turndown

const TurndownService = require("turndown");
const turndownService = new TurndownService({
  headingStyle: "atx",
  codeBlockStyle: "fenced",
});

const html = `
<article>
  <h2>Getting Started</h2>
  <p>Install the package with <code>npm install turndown</code>.</p>
  <pre><code class="language-js">const td = new TurndownService();
console.log(td.turndown("&lt;b&gt;hello&lt;/b&gt;"));</code></pre>
  <p>That's it. No config required.</p>
</article>
`;

const markdown = turndownService.turndown(html);
console.log(markdown);

Output:

## Getting Started

Install the package with `npm install turndown`.

```js
const td = new TurndownService();
console.log(td.turndown("<b>hello</b>"));
```

That's it. No config required.

Browser

Turndown also works client-side. This is useful for building in-browser converters or converting DOM elements directly:

import TurndownService from "turndown";

const turndownService = new TurndownService();

// Convert a DOM element directly
const article = document.querySelector("article");
const markdown = turndownService.turndown(article);

Custom Rules

Turndown lets you add custom rules for elements that need special handling:

turndownService.addRule("strikethrough", {
  filter: ["del", "s"],
  replacement: (content) => `~~${content}~~`,
});

turndownService.addRule("highlight", {
  filter: (node) => node.nodeName === "MARK",
  replacement: (content) => `==${content}==`,
});

This is particularly useful for converting custom HTML components or non-standard markup into markdown extensions.

HTML to Markdown with Pandoc

Pandoc is the Swiss Army knife of document conversion. If you already have it installed (or don't mind installing it), converting HTML to markdown is a one-liner.

pandoc input.html -f html -t markdown -o output.md

Pandoc supports multiple markdown flavors. Use the -t (--to) option to choose the output flavor that matches your target:

# GitHub-Flavored Markdown
pandoc input.html -f html -t gfm -o output.md

# CommonMark
pandoc input.html -f html -t commonmark -o output.md

# Pandoc's extended markdown (default)
pandoc input.html -f html -t markdown -o output.md

You can also pipe HTML directly:

curl -s https://example.com | pandoc -f html -t gfm
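
If you want to run the same Pandoc command over a folder of files from a script, a thin Python wrapper is one option. This is a minimal sketch, not an official Pandoc interface: the function names `pandoc_command` and `convert_directory` are made up for illustration, and it assumes the `pandoc` binary is already on your PATH.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(src: Path, dest: Path, flavor: str = "gfm") -> list[str]:
    """Build the same command line shown above for one input file."""
    return ["pandoc", str(src), "-f", "html", "-t", flavor, "-o", str(dest)]


def convert_directory(html_dir: str, output_dir: str, flavor: str = "gfm") -> int:
    """Convert every .html file in html_dir and return how many were converted."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(Path(html_dir).glob("*.html")):
        dest = out / (src.stem + ".md")
        # check=True raises CalledProcessError if pandoc exits non-zero.
        subprocess.run(pandoc_command(src, dest, flavor), check=True)
        count += 1
    return count
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.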

Pandoc is powerful but heavy. It's a standalone Haskell binary that has to be installed separately from your application's dependencies. Great for local document conversion, less practical for embedding in a web service or lightweight script. If you're already using Pandoc for other document workflows (LaTeX, DOCX, EPUB), adding HTML-to-markdown is trivial. If you're starting from scratch, a Python or JavaScript library is usually easier.

Convert HTML to Markdown via API

When you need to convert live webpages to markdown at scale — for LLM ingestion, content monitoring, competitive research, or automated documentation — you want an API.

The LeadMagic URL to Markdown API takes a URL and returns clean markdown. It handles the parts that trip up local tools: JavaScript rendering, dynamic content loading, navigation/footer removal, and content extraction.

curl -X POST https://api.web2md.app/api/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/blog/some-article"}'

Response:

{
  "success": true,
  "data": {
    "markdown": "# Some Article\n\nThe article content in clean markdown...",
    "title": "Some Article",
    "url": "https://example.com/blog/some-article"
  }
}
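
The same call can be made from Python with only the standard library. Treat this as a sketch under assumptions: the endpoint, header name, and response shape are copied from the curl example and sample response above, while the helper names (`build_request`, `extract_markdown`, `url_to_markdown`) and the 30-second timeout are illustrative choices, not part of the API.

```python
import json
import urllib.request

API_URL = "https://api.web2md.app/api/scrape"  # endpoint from the curl example


def build_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Mirror the curl call: POST a JSON body with an X-API-Key header."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def extract_markdown(payload: dict) -> str:
    """Pull the markdown out of a response shaped like the sample above."""
    if not payload.get("success"):
        # The error format is not shown in this guide, so surface the raw payload.
        raise RuntimeError(f"conversion failed: {payload!r}")
    return payload["data"]["markdown"]


def url_to_markdown(page_url: str, api_key: str, timeout: float = 30.0) -> str:
    """Send the request and return the markdown string."""
    request = build_request(page_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_markdown(json.load(resp))
```

Keeping request construction and response parsing in separate functions makes both easy to check without touching the network.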

Why use an API instead of a local library?

Local libraries like markdownify and Turndown work on raw HTML strings. If the page loads content with JavaScript (React, Vue, dynamic widgets), the raw HTML won't contain that content. An API renders the page in a browser first, then converts the fully-loaded DOM.
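
One rough way to tell whether a page falls into this category is to measure how much visible text the raw HTML actually contains. The sketch below is a heuristic, not a reliable detector: the class and function names are hypothetical, and the 200-character threshold is an arbitrary assumption you would tune for your own pages.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts text that sits outside script, style, noscript, and template tags."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0
        self.visible_chars = 0
        self.script_tags = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.visible_chars += len(data.strip())


def probably_needs_rendering(raw_html: str, min_chars: int = 200) -> bool:
    """True when the page ships scripts but almost no readable text."""
    counter = VisibleTextCounter()
    counter.feed(raw_html)
    counter.close()
    return counter.script_tags > 0 and counter.visible_chars < min_chars
```

A typical single-page-app shell (an empty root `<div>` plus a script bundle) trips the check, while a server-rendered article does not.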

LeadMagic's API also strips boilerplate — navbars, footers, sidebars, cookie banners — and extracts only the main content. This is critical for LLM use cases where you want article text, not navigation links.

When to use an API:

Converting live webpages (not static HTML files)
Batch processing hundreds or thousands of URLs
Pages with JavaScript-rendered content
When you need clean, noise-free markdown for AI/LLM pipelines
Production workflows where reliability and uptime matter

HTML to Markdown Converter Comparison

Method	Best For	Handles JS?	Tables?	Setup	Cost
Online tool	One-off conversions	No	Yes	None	Free
Python (markdownify)	Scripts & pipelines	No	Basic	`pip install`	Free
Python (html2text)	Simple text extraction	No	Basic	`pip install`	Free
JavaScript (Turndown)	Node.js / browser apps	No*	Plugin	`npm install`	Free
Pandoc	Document workflows	No	Yes	System install	Free
LeadMagic API	Live pages at scale	Yes	Yes	API key	Per-credit

*Turndown runs in the browser and can access the rendered DOM, but as a library it doesn't render pages on its own.

Bottom line: For static HTML you already have, use markdownify (Python) or Turndown (JavaScript). For converting live webpages, especially JavaScript-heavy ones, use an API.

HTML Table to Markdown

Tables are the trickiest part of HTML-to-markdown conversion. Markdown tables are limited — no colspan, no rowspan, no merged cells, no nested tables. Here's what works and what doesn't.

Simple tables convert cleanly:

<table>
  <thead>
    <tr><th>Name</th><th>Email</th><th>Role</th></tr>
  </thead>
  <tbody>
    <tr><td>Jane</td><td>jane@acme.com</td><td>CTO</td></tr>
    <tr><td>Alex</td><td>alex@acme.com</td><td>VP Eng</td></tr>
  </tbody>
</table>

Converts to:

| Name | Email          | Role   |
|------|----------------|--------|
| Jane | jane@acme.com  | CTO    |
| Alex | alex@acme.com  | VP Eng |

What breaks:

colspan and rowspan — markdown has no equivalent, so converters either flatten or skip them
Nested tables — inner tables get collapsed or lost
Complex formatting inside cells — images, lists, and multi-line content inside table cells rarely survive conversion
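
Before converting, it can help to scan the HTML for these trouble spots so you know which tables need manual attention. This is a minimal standard-library sketch; `TableAudit` and `audit_tables` are hypothetical names, and the check only covers the three cases listed above.

```python
from html.parser import HTMLParser


class TableAudit(HTMLParser):
    """Flags table features that have no markdown equivalent."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tables = 0  # number of <table> elements currently open
        self.issues: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.open_tables += 1
            if self.open_tables > 1:
                self.issues.add("nested table")
        elif self.open_tables and tag in ("td", "th"):
            for name, value in attrs:
                # A span of 1 is the default and converts fine.
                if name in ("colspan", "rowspan") and (value or "1").strip() != "1":
                    self.issues.add(name)
        elif self.open_tables and tag in ("ul", "ol", "img"):
            self.issues.add("list or image in table")

    def handle_endtag(self, tag):
        if tag == "table" and self.open_tables:
            self.open_tables -= 1


def audit_tables(html: str) -> list[str]:
    """Return a sorted list of conversion risks found in the markup."""
    audit = TableAudit()
    audit.feed(html)
    audit.close()
    return sorted(audit.issues)
```

An empty list means the tables use only features that markdown can represent.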

Tips for better table conversion:

Simplify your HTML tables before converting — remove merged cells if possible
Use GFM (GitHub-Flavored Markdown) output — it has the best table support
For complex data tables, consider converting to a code block or CSV instead
markdownify and Pandoc handle standard tables well; html2text also emits pipe tables, but with minimal formatting unless you enable its pad_tables option
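
For the CSV route mentioned in the tips, the standard library is enough for simple cases. The sketch below assumes well-formed markup with explicit closing tags and a single, non-nested table; `TableRows` and `table_to_csv` are illustrative names, not part of any library.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each cell, row by row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so multi-line cells stay on one line.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(table_html: str) -> str:
    """Serialize the collected rows as CSV text."""
    parser = TableRows()
    parser.feed(table_html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()
```

The csv module handles quoting for cells that contain commas or quotes, which is the main reason to prefer it over joining strings by hand.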

If your HTML has complex tables and you need accurate markdown output, the LeadMagic URL to Markdown API applies specialized table handling that preserves structure better than most local libraries.

When to Use HTML vs Markdown

This isn't really an either/or decision — it's about picking the right format for the job.

Use HTML when:

Building interactive web interfaces
You need precise layout control (CSS Grid, Flexbox)
The content includes forms, media embeds, or custom components
You're rendering directly in a browser

Use markdown when:

Writing documentation, READMEs, or knowledge bases
Storing content for static site generators and frameworks (Hugo, Astro, Next.js)
Feeding text to LLMs or AI pipelines
Collaborating on text-heavy content with non-technical contributors
You want version-controlled, diff-friendly content

The two formats complement each other. Most modern publishing workflows convert markdown to HTML for display. The reverse — converting HTML to markdown — is what you do when you want to extract content from the web and work with it in a more portable format.

For a deeper comparison of the two formats, read Markdown vs HTML.

Wrapping Up

Converting HTML to markdown boils down to three scenarios: paste-and-convert for one-off jobs (use the HTML to Markdown converter), run a script for batch processing (use markdownify or Turndown), or call an API for live webpages at scale (use the URL to Markdown API).

Pick the tool that matches how many pages you're converting and whether they require JavaScript rendering. For most developers building content pipelines, AI workflows, or documentation systems, an API that handles rendering and cleanup saves hours of edge-case debugging. For a broader look at extracting content from live pages, see our guide on how to extract text from any website.
