Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters
Updated Dec 30, 2024 (Python)
A minimal yet powerful crawler for extracting all the internal, external, and fuzz-able links from a website.
A simple Git URL parser.
A type to represent, query, and manipulate a Uniform Resource Identifier.
Web scraping | Website cloner | Path Traversal Scanner
A website URL scraper built with Python.
Check whether the URLs contained in a Markdown file are down.
Extract information from URLs inside shell scripts
WebBriefs is an intelligent webpage summarizer API that extracts and condenses content into concise, readable Markdown. Perfect for quickly getting the gist of any website.
Crawl websites and extract meaningful information from HTML and site content
A command-line URL parser, written in Python
Simple URL builder
Collection of helper functions designed to facilitate efficient web scraping in Python
Bot to generate useful links to increase the ranking of products sold on Amazon
ImageSpace is a Python application that downloads images from web pages, filters out certain types of images, and stores the valid images in a SQLite database. It utilizes the FastAPI framework for providing an API endpoint to process web pages and extract images.
A Python library that resolves a URL to its IP address and country.
UrlShortner maps longer URLs to shorter ones. The app is written entirely in Python and uses a PostgreSQL database to store the URL mappings.