
Introduction

XetHub is a collaborative ML development platform built for working with data, models, and artifacts. Use XetHub to track and share the evolution of large files and repositories, or as a unified versioning layer across your ML stack for end-to-end observability at scale.

It's perfect for teams who want Git-backed reliability and reproducibility for every piece of their ML workflow, without the hassle of running additional commands or managing remote servers. With a per-repository limit of over 100TB and no limits on file types or formats, XetHub combines performance with flexible access to naturally augment your existing workflows.

Why XetHub?

Existing software development tools were optimized for small code files and perform poorly once files grow beyond a few megabytes. ML tooling has worked around the problem by combining software and storage tools, but the result hasn't been pretty. If you've ever used DVC and forgotten to run dvc add on a file, or used Git LFS only to wait hours for each command to complete, you know that the workarounds can be painful. If your S3 buckets are full of fragile naming conventions and accidental overwrites, you know that object stores weren't made for versioning.

XetHub bridges the gap between software versioning and object store scale to enable modern ML development. Use XetHub to reinforce or replace pieces of your existing stack, adding guaranteed reproducibility and collaborative context to your workflow.

ML teams have better things to do than learn yet another set of commands. Use XetHub with your choice of Git, S3, Python, or CLI syntax to access any file, any time, anywhere.

How does it work?

DVC and Git LFS replace large files with pointers to files stored on remotes, but do nothing to optimize the storage of the files themselves. XetHub uses pointers as well, but also invisibly chunks the files into blocks for more efficient storage and transfer with no hit to developer experience. Our compute backend also enables rich views and context on top of large files that no other versioning system can support.
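
To make the idea of pointers plus chunking concrete, here is a minimal Python sketch of chunk-level deduplication. It is an illustration only: XetHub's actual chunking algorithm and storage format are not described on this page, and every name below (split_chunks, ChunkStore, the mask and min_size parameters) is hypothetical. The sketch cuts a file at content-defined boundaries, stores each chunk once under its SHA-256 hash, and represents the file as a pointer, here simply the ordered list of chunk hashes.

```python
import hashlib


def split_chunks(data: bytes, mask: int = 0x3F, min_size: int = 16) -> list[bytes]:
    """Cut data into variable-size chunks at content-defined boundaries."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        # Each older byte is shifted one more bit to the left, so the low
        # bits of `rolling` depend only on the last few bytes seen.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


class ChunkStore:
    """Hypothetical content-addressed store: chunk hash -> chunk bytes."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> tuple[list[str], int]:
        """Store a file; return (pointer, number of newly stored chunks)."""
        pointer, new_chunks = [], 0
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blocks:  # identical chunks are stored once
                self.blocks[digest] = chunk
                new_chunks += 1
            pointer.append(digest)
        return pointer, new_chunks

    def get(self, pointer: list[str]) -> bytes:
        """Rebuild a file from its pointer."""
        return b"".join(self.blocks[digest] for digest in pointer)


# Two versions of a dataset: the second appends rows to the first.
v1 = "".join(f"{i},{i * i}\n" for i in range(2000)).encode()
v2 = v1 + "".join(f"{i},{i * i}\n" for i in range(2000, 2100)).encode()

store = ChunkStore()
pointer_v1, new_v1 = store.put(v1)
pointer_v2, new_v2 = store.put(v2)
print(f"v1: {len(pointer_v1)} chunks, {new_v1} newly stored")
print(f"v2: {len(pointer_v2)} chunks, {new_v2} newly stored")
```

Because boundaries are chosen from the content itself, the second version reuses every chunk of the first except possibly the last one, so only the appended region needs to be stored or transferred. Fixed-size blocks would give the same result for appends, but an insertion in the middle of a file would shift every later block, which is why this sketch assumes content-defined boundaries.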

XetHub is free for academics and non-profits, with flexible plans for other use cases. Install Git-Xet now to get started with your first Xet repository.