Fine-tune an LLM with everything tracked together.
Updated 4 months ago
Updated 2 weeks ago
Updated 7 months ago
Updated 1 week ago
A clone of the Visual Behavior Neuropixels dataset, collected over 153 sessions with 81 mice.
Updated 8 months ago
A small LangChain demo project for question answering (QA) about movies.
Updated 9 months ago
Falcon RefinedWeb is a massive English web dataset built by TII and released under an ODC-By 1.0 license.
Updated 8 months ago
Try Meta's Code Llama models on your laptop or cloud VM in seconds.
Updated 9 months ago
Updated 9 months ago
URL and caption metadata for the LAION-400M dataset: 400M English (image, text) pairs built for research purposes, so that researchers and other interested communities can test model training at larger scale.
Updated 10 months ago
Preserve generated Stable Diffusion images with comments and metadata. Duplicate this repository to store your generated images in your own account, and customize it to use your own code, tokens, and endpoints.
Updated 19 hours ago
Updated 9 months ago
Stream the Flickr30k image dataset on XetHub in seconds. Flickr30k is a standard benchmark for sentence-based image description, containing 31,000 images collected from Flickr alongside annotations. Obtained from Kaggle.
Updated 4 weeks ago
Assembled from URLs hosted at https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T
Updated 8 months ago
An app that visually summarizes any CSV files stored in the data folder.
Updated 2 months ago