Add overview slide to consider the big picture/alternate strategies #6

Open
@aculich

After debriefing this consult with @tomvannuenen:
#1816 Trouble bypassing captcha verification while doing data scraping

Let's consider how we can help people think about the big picture and the goals of what they're trying to accomplish, and whether there are more effective strategies than scraping, including:

  • check whether there is a simpler technical approach (e.g. use an API instead of scraping)
  • check whether an already-scraped database of the data is available (e.g. Reddit data on BigQuery)
  • check whether there is a low-tech solution (contact by email, or sneakernet) for acquiring the data easily and quickly in bulk
  • consider the legal/ethical implications of scraping (see the Library Text Mining pages/workshops and CLTC Scraping for Research Purposes)
  • mention the purpose of captchas and the pitfalls of attempting to circumvent them

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
