
Having fun using ChatGPT4 to build an archive of IRS PDF documents. Curious how XetHub's default deduplication performs on PDFs: 15+% feels pretty good!

initial commit

main
Rajat Arya 1 year ago
parent 0c4ca8efd0
commit 87a66fd759
4 changed files (0 B → 5.0 KiB)
  1. .gitignore (28)
  2. README.md (105)
  3. code/requirements.txt (7)
  4. code/scraper.py (35)