forked from rajatarya/irs-pdf
Having fun with ChatGPT4 to build an archive of IRS PDF documents. Curious about XetHub's default deduplication over PDFs. 15+% feels pretty good!
parent 0c4ca8efd0
commit 87a66fd759
4 changed files (0 B → 5.0 KiB)
.gitignore (0 B → 287 B)
README.md (0 B → 3.6 KiB)
code/requirements.txt (0 B → 126 B)
code/scraper.py (0 B → 1.0 KiB)