Compare commits

No commits in common. "9030da9a0c34f4547070b8dae7160966fea341a9" and "4c73ace62d0c65b094ec002c4a0575012a635628" have entirely different histories.

5 changed files with 10 additions and 35 deletions

@@ -1,23 +0,0 @@
# Province article scraping
A couple of scripts to scrape article text from various provinces for
a text analysis university course.
We need:
Qinghai
: page 14-75
Ningxia
: page 11-42
Shanxi
: page 2-18
Xinjiang
: page 10-20
The websites all have subtle differences, so there's simply a folder +
scripts for each (the scripts are simple enough that there's no need
for deduplication or anything complex). Written in python/js where
necessary for educational purposes.

@@ -16,8 +16,6 @@
in {
devShell = pkgs.mkShell {
nativeBuildInputs = with pkgs; [
nodePackages.typescript-language-server
(python39.withPackages (pypkgs:
with pypkgs; [
beautifulsoup4