Extraction | sync | 4 credits | URL extract | web scraper | Xiaohongshu

URL Extract

Extract readable main content from Xiaohongshu and other web pages.

What it does

Extracting clean, readable content from web pages is painful, especially on Xiaohongshu, WeChat, or JavaScript-heavy sites. This connector returns the main content of a publicly accessible URL as structured Markdown, along with the title, canonical URL, and content length.

Inputs

| Parameter | Format | Example | Required |
|---|---|---|---|
| url | string | https://www.xiaohongshu.com/explore/abcd1234 | Yes |

Outputs

| Field | Type | Description |
|---|---|---|
| title | string | The page title extracted from the document |
| url | string | The canonical URL of the page |
| contentLength | number | Character count of the extracted main content |

Example output

Input

url: "https://www.xiaohongshu.com/explore/abcd1234"

Output

# URL Extract
- Title: 10 Hidden Cafes in Shanghai You Must Visit
- URL: https://www.xiaohongshu.com/explore/abcd1234
- Content Length: 2,340 characters

What it means

The Xiaohongshu note was successfully extracted. The main content is 2,340 characters and covers hidden cafe recommendations in Shanghai.
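
The example output above uses a simple `- Key: value` layout, so the summary fields can be read back into a typed record. The following is a minimal Python sketch, assuming the output always carries the three labels shown in the example (Title, URL, Content Length). The names `ExtractSummary` and `parse_summary` are illustrative and not part of the connector.

```python
import re
from dataclasses import dataclass


@dataclass
class ExtractSummary:
    title: str
    url: str
    content_length: int


def parse_summary(markdown: str) -> ExtractSummary:
    """Parse the '- Key: value' lines of the example output (assumed layout)."""
    fields = {}
    for line in markdown.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        match = re.match(r"^-\s*([^:]+):\s*(.+)$", line.strip())
        if match:
            fields[match.group(1).strip().lower()] = match.group(2).strip()

    missing = {"title", "url", "content length"} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    # "2,340 characters" -> 2340
    digits = re.search(r"[\d,]+", fields["content length"])
    if digits is None:
        raise ValueError("content length has no numeric part")

    return ExtractSummary(
        title=fields["title"],
        url=fields["url"],
        content_length=int(digits.group(0).replace(",", "")),
    )
```

Matching on the first colon keeps the `https://` in the URL value from being split, which a plain `line.split(":")` would get wrong.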

Last tested: 2026-03-10

How to install

Via CLI

$ vernclaw-cli invoke extract.url --url <value>
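
For scripted use, the same command can be launched from Python. This is a sketch under assumptions: it assumes `vernclaw-cli` is on the PATH, writes its result to standard output, and exits with a non-zero status on failure, none of which is confirmed on this page. `build_command` and `run_extract` are hypothetical helper names.

```python
import subprocess
from urllib.parse import urlparse


def build_command(url: str) -> list[str]:
    """Build the argv for the CLI invocation shown above."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return ["vernclaw-cli", "invoke", "extract.url", "--url", url]


def run_extract(url: str, timeout: float = 60.0) -> str:
    """Run the CLI and return its standard output.

    Assumption: the CLI prints the extracted Markdown to stdout and
    signals failure with a non-zero exit code.
    """
    completed = subprocess.run(
        build_command(url),
        capture_output=True,
        text=True,
        check=True,      # raises CalledProcessError on non-zero exit
        timeout=timeout,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.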

Via web

  1. Visit the connector catalog and find URL Extract.
  2. Click Install connector and acknowledge any training requirements.
  3. Complete authorization if prompted.
  4. Use the connector from Chat or the CLI.

Auth and permissions

OAuth required: No
Managed auth: Included, no API key needed
Bring your own key (BYOK): Not supported in v1
Audit logged: Every request is logged with input, result, cost, and timestamps
Training acknowledgment: Not required

Admins can restrict this connector via blacklist or whitelist in the admin panel.

Limits and edge cases

  • Pages behind login walls or CAPTCHAs may fail to extract.
  • Very large pages (over 100 KB of content) may be truncated.
  • Dynamic SPA pages that require extensive JavaScript rendering may return incomplete content.
  • Xiaohongshu and WeChat article URLs are specifically supported and tested.
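
The truncation limit above is stated in kilobytes, while the connector reports content length in characters, and the two can differ a lot for Chinese text. The sketch below flags extracted content that sits close to the limit. It assumes the limit means 100 * 1024 bytes of UTF-8 text, which this page does not confirm; `near_limit` and the 5% margin are hypothetical choices.

```python
LIMIT_BYTES = 100 * 1024  # assumption: the stated limit read as UTF-8 bytes


def utf8_size(text: str) -> int:
    """Size of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def near_limit(text: str, limit_bytes: int = LIMIT_BYTES, margin: float = 0.05) -> bool:
    """Return True when the content is within `margin` of the limit or above it.

    A True result is only a hint that the page may have been truncated.
    """
    return utf8_size(text) >= limit_bytes * (1 - margin)
```

Most Chinese characters take three bytes in UTF-8, so a note of roughly 33,000 characters can already be near the limit under this assumption.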

Common use cases

Read a Xiaohongshu note and extract key points

Paste a note URL and get the title, content, and character count.

Extract article content for summarization

Feed a URL to this connector, then chain the output to a summarizer.
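
The chaining step can be sketched as a small function that takes the two stages as callables. Both `extract` and `summarize` are placeholders supplied by the caller; this page does not define a Python API for either, so nothing below reflects an actual connector interface.

```python
from typing import Callable


def extract_then_summarize(
    url: str,
    extract: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Run extraction, then pass the resulting Markdown to a summarizer."""
    markdown = extract(url)
    if not markdown.strip():
        # Login walls, CAPTCHAs, or heavy JavaScript pages may yield nothing.
        raise ValueError(f"no content extracted from {url!r}")
    return summarize(markdown)
```

Keeping the stages as parameters makes the empty-content check easy to test with stand-in functions before wiring in real connectors.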

Frequently asked questions

Which sites are supported?

Any publicly accessible web page. Xiaohongshu and WeChat articles are specifically tested and optimized.

Does this return the full page HTML?

No. It extracts the main readable content (similar to reader mode) and returns it as Markdown.

What happens with login-required pages?

Extraction will fail for pages that require authentication. Only publicly accessible content is supported.
