diff --git a/guangdong/README.md b/guangdong/README.md
new file mode 100644
index 0000000..a35ed9a
--- /dev/null
+++ b/guangdong/README.md
@@ -0,0 +1,20 @@
+## Guangdong scraping
+
+Zip of full article dump: [articles-guangdong.zip](./articles-guangdong.zip)
+
+Sorry, format for this one is a bit different. I got a bit more
+practiced after the first.
+
+Files that are likely just links to PDFs:
+
+```console
+.rw-r--r-- 119 tlater 9 Apr 04:07 ./2017-05-27_949.txt
+.rw-r--r-- 104 tlater 9 Apr 04:07 ./2017-05-27_950.txt
+.rw-r--r-- 85 tlater 9 Apr 04:07 ./2017-05-27_951.txt
+.rw-r--r-- 157 tlater 9 Apr 04:07 ./2017-05-27_952.txt
+.rw-r--r-- 164 tlater 9 Apr 04:07 ./2017-05-27_953.txt
+.rw-r--r-- 149 tlater 9 Apr 04:07 ./2017-05-27_954.txt
+.rw-r--r-- 85 tlater 9 Apr 04:07 ./2017-05-27_955.txt
+.rw-r--r-- 387 tlater 9 Apr 04:07 ./2017-08-14_888.txt
+.rw-r--r-- 355 tlater 9 Apr 04:07 ./2017-08-15_876.txt
+```
diff --git a/guangdong/articles-guangdong.zip b/guangdong/articles-guangdong.zip
new file mode 100644
index 0000000..0febee8
Binary files /dev/null and b/guangdong/articles-guangdong.zip differ