Assembled from URLs hosted at https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T
Add first batch of data from URLs
#2
Merged
zach merged 2 commits from first_batch into main 1 year ago
- 3 arxiv/arxiv_023827cd-7ee8-42e6-aa7b-661731f4c70f.jsonl
- 3 arxiv/arxiv_024de5df-1b7f-447c-8c3a-51407d8d6732.jsonl
- 3 arxiv/arxiv_03232e26-be3f-4a28-a5d2-ee1d8c0e9831.jsonl
- 3 arxiv/arxiv_034e819a-cfcb-43c6-ad25-0232ad48823c.jsonl
- 3 arxiv/arxiv_077ae8de-a68e-47e7-95a6-6d82f8f4eeb9.jsonl
- 3 arxiv/arxiv_0af50072-df4c-4084-a833-cebbd046e70e.jsonl
- 3 arxiv/arxiv_0de84cfc-c080-471f-b139-1bf061db4feb.jsonl
- 3 arxiv/arxiv_0fbdd8ad-32d8-4228-9a40-e09dde689760.jsonl
- 3 arxiv/arxiv_11c659c1-ffbf-4455-abfd-058f6bbf4bb2.jsonl
- 3 arxiv/arxiv_1958455d-6543-4307-a081-d86ce0637f9a.jsonl
- 3 arxiv/arxiv_1982fb29-c4ed-4dd3-855c-666e63bc62d9.jsonl
- 3 arxiv/arxiv_1caed86f-5625-4941-bdc1-cc57e4fec1cd.jsonl
- 3 arxiv/arxiv_1d3a0cd6-f0e6-4106-a080-524a4bd50016.jsonl
- 3 arxiv/arxiv_29d54f5a-1dd0-4e9a-b783-fb2eec9db072.jsonl
- 3 arxiv/arxiv_29fd3d99-53fb-43e2-a4a5-2fd01bf77258.jsonl
- 3 arxiv/arxiv_2b224cd9-286e-46ac-8c4e-c1e3befc8760.jsonl
- 3 arxiv/arxiv_2c131fca-2a05-4d5f-a805-59d2af3477e2.jsonl
- 3 arxiv/arxiv_2f28f1a7-6972-48ad-8997-65a5d52e4f1c.jsonl
- 3 arxiv/arxiv_30440198-cd90-48c6-82c1-ea871b8c21c5.jsonl
- 3 arxiv/arxiv_39367d6c-d7d4-45fc-a929-8a17184d1744.jsonl
- 3 arxiv/arxiv_393d19f2-1cd1-421f-be8a-78d955fdf602.jsonl
- 3 arxiv/arxiv_3a5d4f93-97ec-483a-88ef-324df9651b3f.jsonl
- 3 arxiv/arxiv_3c89ea11-69ff-4049-b775-f0c785997909.jsonl
- 3 arxiv/arxiv_3d5a011a-4bbe-4585-a2bd-ff3e943c8671.jsonl
- 3 arxiv/arxiv_3f805f4b-6f7f-42a8-a006-47c1e0401bd7.jsonl
- 3 arxiv/arxiv_3f9eb7ad-f266-4154-8d4d-54deeffde075.jsonl
- 3 arxiv/arxiv_400748d3-0076-4a04-8a1c-6055ba0b5a2d.jsonl
- 3 arxiv/arxiv_44e19375-3995-4dff-a3b6-8a25247a165c.jsonl
- 3 arxiv/arxiv_4a8cf52f-81d0-4875-9528-466b1cbc71e1.jsonl
- 3 arxiv/arxiv_4cc7015c-c39a-4bf6-9686-c00b3343edd9.jsonl
- 3 arxiv/arxiv_50757a42-079b-41ec-bcca-73759faffd62.jsonl
- 3 arxiv/arxiv_575ae832-e770-4a89-bfa7-c56f16dbca69.jsonl
- 3 arxiv/arxiv_580be642-bb73-4d0d-8b5e-f494722934cd.jsonl
- 3 arxiv/arxiv_5a02d9ee-12a0-437d-808f-d26f0eb2012b.jsonl
- 3 arxiv/arxiv_5d8d402b-8277-480a-b5fa-71169726864f.jsonl
- 3 arxiv/arxiv_5ee33ef7-455e-4fd5-9512-c4771dd802c1.jsonl
- 3 arxiv/arxiv_610c82ed-b9ee-449c-83b0-601205f3a74a.jsonl
- 3 arxiv/arxiv_629fe3ca-075f-4663-9b81-b807f3b42bf2.jsonl
- 3 arxiv/arxiv_64e5075e-e87e-4b2a-9e38-e5c102f6f2b1.jsonl
- 3 arxiv/arxiv_65dd2ff6-dae3-4a60-90d3-c3d7349fc92f.jsonl
- 3 arxiv/arxiv_6719ecd2-fe34-4078-a584-320d921cbf6f.jsonl
- 3 arxiv/arxiv_6938ee72-43ee-4ade-8840-151a402383b0.jsonl
- 3 arxiv/arxiv_73241940-66c1-481c-b53a-f5e8b9afe9fa.jsonl
- 3 arxiv/arxiv_751370b5-c7cb-44d8-a039-1468ee6747ab.jsonl
- 3 arxiv/arxiv_75af5d17-5ebb-4460-9f2a-dc9fe880a936.jsonl
- 3 arxiv/arxiv_79d50803-f7d9-4aa8-bf1a-d807980a40c6.jsonl
- 3 arxiv/arxiv_7b26046f-7c8d-405b-911b-df51e1a069fa.jsonl
- 3 arxiv/arxiv_7d1d69dc-bc8e-4817-9cab-afdc002ab7c4.jsonl
- 3 arxiv/arxiv_7ea7a996-b1bb-4773-a36a-461dce2de861.jsonl
- 3 arxiv/arxiv_8232f276-9e3f-463a-9350-362de1b501d1.jsonl
Some files were not shown because too many files have changed in this diff.