XetHub

Updated 1 year ago

Performance evaluation for working with large binary files with history

Updated 1 year ago

Repo for running benchmarks against Git LFS, DVC, and LakeFS.

Updated 3 months ago

Blog Authorship Corpus: over 600,000 posts from more than 19,000 bloggers. Obtained from Kaggle.

Updated 9 months ago

Updated 1 year ago

Updated 3 months ago

Updated 8 months ago

Try Meta's Code Llama models on your laptop or cloud VM in seconds.

Updated 8 months ago

Add custom views to your repository by following the instructions in this template.

Updated 2 months ago

An app to visually summarize any CSV data files stored in the data folder.

Updated 1 month ago

Falcon RefinedWeb is a massive English web dataset built by TII and released under an ODC-By 1.0 license.

Updated 8 months ago

19k+ players and 110 attributes extracted from the latest edition of FIFA. Obtained from Kaggle.

Updated 6 months ago

Simplify the LLM finetuning workflow in Google Colab with XetHub!

Updated 3 months ago

Stream the Flickr30k image dataset on XetHub in seconds. Flickr30k is the benchmark for sentence-based image description, containing 31,000 images collected from Flickr alongside annotations. Obtained from Kaggle.

Updated 2 weeks ago

URL and caption metadata for the LAION-400M dataset: 400M English (image, text) pairs built for research purposes, to enable testing model training at larger scale for the broad research community and other interested communities.

Updated 9 months ago

People