pending anonymous user

Cake day: August 7th, 2023

  • I don’t have a single guide for you, but I can lay out a road map.

    1. A programming language. I prefer Python.
    2. Basic HTML syntax and CSS selectors
    3. HTTP, specifically methods, status codes (no need to memorize them all, since you can look them up), and cookies
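
    To show what "look it up" can mean for item 3, here is a small sketch using only the Python standard library. The helper names `describe_status` and `parse_cookie_header` are made up for illustration, and the cookie string in the usage note is sample data, not from a real site.

```python
from http import HTTPStatus
from http.cookies import SimpleCookie


def describe_status(code: int) -> str:
    """Look up the standard reason phrase for an HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{status.value} {status.phrase}"


def parse_cookie_header(header: str) -> dict[str, str]:
    """Split a Cookie header string into a name -> value dictionary."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}
```

    For example, `describe_status(404)` gives `"404 Not Found"`, and `parse_cookie_header("session=abc123; theme=dark")` gives a dictionary with the keys `session` and `theme`.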

    After you have those foundations ready, you can go on and try to build a web scraper. I advise against using Scrapy. Not because it is bad, but because it is too overwhelming and abstracted for a beginner. I would instead advise you to use requests for HTTP and BeautifulSoup4 for HTML parsing. You will build a more solid foundation and can transition to Scrapy later when you need its advanced features.
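
    A first scraper with requests and BeautifulSoup4 can be roughly this small. This is only a sketch: the function names `fetch_html` and `extract_links` are my own, `https://example.com` is a placeholder URL, and it assumes you have run `pip install requests beautifulsoup4`.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page and return its HTML, raising on 4xx/5xx status codes."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html: str) -> list[tuple[str, str]]:
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    # "a[href]" is a CSS selector: anchor tags that carry an href attribute.
    return [(a.get_text(strip=True), a["href"]) for a in soup.select("a[href]")]


if __name__ == "__main__":
    # Placeholder URL; swap in a page you are allowed to fetch.
    for text, href in extract_links(fetch_html("https://example.com")):
        print(text, href)
```

    Keeping the download and the parsing in separate functions lets you practice CSS selectors on a saved HTML string without hitting the network every time.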

    When you get stuck, don’t be afraid to pause your attempt and read the tutorials again. Head to the Python Community on Discord to get interactive help. We welcome noobs, as we were once noobs too. Just don’t ever mention scraping there, as they can’t help if they suspect you’re trying to do something inappropriate, malicious, or illegal. They are notoriously against yt-dlp, which frustrates me a bit. Phrase your question nicely and in a generic way. I will be there occasionally offering help.