Extract readable main content from Xiaohongshu and other web pages.
Extracting clean, readable content from web pages is painful, especially on Xiaohongshu, WeChat, or JavaScript-heavy sites. This connector returns the main content of any publicly accessible URL as structured Markdown, along with the page title, canonical URL, and content length.
| Parameter | Format | Example | Required |
|---|---|---|---|
| url | string | https://www.xiaohongshu.com/explore/abcd1234 | Yes |
| Field | Type | Description |
|---|---|---|
| title | string | The page title extracted from the document |
| url | string | The canonical URL of the page |
| contentLength | number | Character count of the extracted main content |
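
The three output fields map naturally onto a small typed record. The sketch below assumes the connector's result arrives as a dictionary keyed by the field names in the table; the `ExtractResult` class and its `from_payload` helper are hypothetical names introduced for illustration, not part of the connector.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ExtractResult:
    """Hypothetical container for the documented output fields."""

    title: str
    url: str
    content_length: int  # mirrors the documented `contentLength` field

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "ExtractResult":
        # Assumption: the payload uses the exact field names from the table.
        required = ("title", "url", "contentLength")
        missing = [key for key in required if key not in payload]
        if missing:
            raise ValueError(f"missing fields: {', '.join(missing)}")
        return cls(
            title=str(payload["title"]),
            url=str(payload["url"]),
            content_length=int(payload["contentLength"]),
        )


# Sample values reuse the example note shown on this page.
result = ExtractResult.from_payload({
    "title": "10 Hidden Cafes in Shanghai You Must Visit",
    "url": "https://www.xiaohongshu.com/explore/abcd1234",
    "contentLength": 2340,
})
print(result.content_length)
```

Failing early on a missing field keeps a malformed result from propagating silently into later steps.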
url: "https://www.xiaohongshu.com/explore/abcd1234"
# URL Extract - Title: 10 Hidden Cafes in Shanghai You Must Visit - URL: https://www.xiaohongshu.com/explore/abcd1234 - Content Length: 2,340 characters
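
If a downstream step needs these values as data rather than text, the summary can be parsed with a few lines of Python. This is a minimal sketch that assumes the labels `Title`, `URL`, and `Content Length` appear exactly as in the example output above; `parse_summary` is a hypothetical helper, not something the connector ships.

```python
import re

LABELS = ("Title", "URL", "Content Length")

# Split at each "- " marker that is immediately followed by a known label.
# This works whether the fields sit on one line or on separate bullet lines.
_SPLIT = re.compile(
    r"(?:^|\s)-\s+(?=(?:Title|URL|Content Length):)", re.MULTILINE
)


def parse_summary(text: str) -> dict:
    """Turn the summary text into a dict with title, url, contentLength."""
    fields = {}
    for chunk in _SPLIT.split(text)[1:]:  # chunk 0 is the heading
        label, _, value = chunk.partition(":")  # split on the first colon only
        fields[label.strip()] = value.strip()

    missing = [label for label in LABELS if label not in fields]
    if missing:
        raise ValueError(f"summary is missing: {', '.join(missing)}")

    # "2,340 characters" -> 2340: keep only the digits.
    digits = re.sub(r"\D", "", fields["Content Length"])
    return {
        "title": fields["Title"],
        "url": fields["URL"],
        "contentLength": int(digits),
    }


sample = (
    "# URL Extract - Title: 10 Hidden Cafes in Shanghai You Must Visit "
    "- URL: https://www.xiaohongshu.com/explore/abcd1234 "
    "- Content Length: 2,340 characters"
)
print(parse_summary(sample))
```

Splitting only where a dash is followed by a known label means a title that itself contains " - " is left intact.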
The Xiaohongshu note was successfully extracted. The main content is 2,340 characters and covers hidden cafe recommendations in Shanghai.
Last tested: 2026-03-10
| Setting | Value |
|---|---|
| OAuth required | No |
| Managed auth | Included, no API key needed |
| Bring your own key (BYOK) | Not supported in v1 |
| Audit logged | Every request is logged with input, result, cost, and timestamps |
| Training acknowledgment | Not required |
Admins can restrict this connector via blacklist or whitelist in the admin panel.
Paste a note URL and get the title, content, and character count.
Feed a URL to this connector, then chain the output to a summarizer.
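
The chaining pattern described above might look like the following sketch. Both `extract` and `summarize` are hypothetical callables standing in for this connector and for whatever summarizer you use; neither signature is taken from a real API, and the stubs at the bottom exist only so the example runs on its own.

```python
from typing import Callable


def extract_then_summarize(
    url: str,
    extract: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Run extraction, then pass the resulting Markdown to a summarizer."""
    markdown = extract(url)
    if not markdown.strip():
        # Assumption: an empty result means there was nothing to summarize.
        raise ValueError(f"no content extracted from {url}")
    return summarize(markdown)


# Stand-in stubs (not real connector calls).
def fake_extract(url: str) -> str:
    return "# Sample note\n\nFirst paragraph of the note.\n\nSecond paragraph."


def first_paragraph(markdown: str) -> str:
    # Naive "summarizer": return the first paragraph that is not a heading.
    for block in markdown.split("\n\n"):
        if block.strip() and not block.lstrip().startswith("#"):
            return block.strip()
    return ""


summary = extract_then_summarize(
    "https://www.xiaohongshu.com/explore/abcd1234",
    fake_extract,
    first_paragraph,
)
print(summary)
```

Passing the two steps in as callables keeps the pipeline independent of any particular client library, so either side can be swapped without touching the glue code.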
Any publicly accessible web page. Xiaohongshu and WeChat articles are specifically tested and optimized.
No. It extracts the main readable content (similar to reader mode) and returns it as Markdown.
Extraction will fail for pages that require authentication. Only publicly accessible content is supported.