After debriefing this consult with @tomvannuenen:
#1816 Trouble bypassing captcha verification while doing data scraping
Let's consider how we can help people step back and look at the big picture and goals of what they're trying to accomplish, and whether there are more effective strategies than scraping, including:
- check if there is a simpler technical approach (e.g. use an API instead of scraping)
- check if an already-scraped database of the data is available (e.g. Reddit data on BigQuery)
- check if there is a low-tech solution (contact by email or sneakernet) for acquiring the data easily/quickly in bulk
- consider the legal/ethical implications of scraping (see Library Text Mining pages/workshops and CLTC Scraping for Research Purposes)
- mention the purpose of CAPTCHAs and the pitfalls of attempting to circumvent them