Use LLMs to robustly extract structured data from HTML and markdown
Updated May 19, 2025 - TypeScript
GNewsScraper is a TypeScript package that scrapes article data from Google News for a given keyword or phrase. It returns the results as an array of JSON objects, which makes the scraped information easy to access and use.
RealShotPDF is a Chrome extension that simplifies creating PDF documents from web content. Users can navigate through selected webpages, parse and display their links in a tree view, and generate PDFs for the chosen pages. It runs locally and does not send any data to external servers.
Metadata extractor for the sprawling web ⚙️