Tristan Daniël Maat 8cb72464b4 | ||
---|---|---|
guangdong | ||
qinghai | ||
Readme.md | ||
flake.lock | ||
flake.nix | ||
Province article scraping
A couple of scripts to scrape article text from the websites of various provinces, for a university text analysis course.
We need:
The websites all differ in subtle ways, so each province simply gets its own folder of scripts (the scripts are simple enough that deduplication or anything more complex isn't needed). They are written in Python or JavaScript, as necessary, for educational purposes.
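For a rough idea of what one of these per-province scripts might boil down to, here is a minimal sketch using only the Python standard library. It is not taken from the repository: the `ParagraphExtractor` class and the `extract_article_text` function are hypothetical names, and the assumption that article text lives in `<p>` elements will not hold for every site, which is exactly why each province has its own scripts.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_p = False     # True while inside a <p> element
        self._chunks = []      # text pieces of the current paragraph
        self.paragraphs = []   # finished, whitespace-normalised paragraphs

    def flush(self):
        # Join the collected pieces and collapse runs of whitespace.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly ends any paragraph left unclosed.
            self.flush()
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)


def extract_article_text(html):
    """Return the paragraph text of an HTML page, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # pick up a trailing paragraph without a closing tag
    return "\n".join(parser.paragraphs)
```

Fetching the page is deliberately left out of the sketch; in practice the HTML string would come from whatever download step suits the site in question.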