Common Questions

What's your quick XetHub pitch?

Imagine that S3 and Git had a lovable baby. That's us! While our founding team built the ML and data infrastructure at Apple, we met with many ML teams that struggled with reproducibility, recovery, compliance, collaboration, and scale. We realized that Git had already solved all but the last, so we built XetHub to bring Git to S3 scale.

At large data scales, storage and versioning alone aren't enough: understanding how things change over time remains an unsolved problem. XetHub is invested in helping ML teams visualize files and their differences in the context in which they were made, with features like app deployment and visualization.

Ok, that sounds cool. But my code is already on Git* and my data is on S3. Do I have to move them in to use XetHub?

If your code is on GitHub, try our GitHub application, which adds Xet support (lite) to GitHub repositories. If you're on S3, try setting up an S3 sync to test out XetHub functionality without having to manually transfer data.

For a fully featured XetHub experience, however, we do recommend moving your files into our system.

What are your limits on repository size, file size, and number of files?

We have tested XetHub with repositories of over 100TB, and there are no limits on file size or number of files. Operations at the larger end of the spectrum may take longer as we continue to optimize our platform for scale. We recommend using our large repo access patterns to make development easier.

The data I work with is huge and I only need to work with parts of it. Can XetHub help?

Yes! Check out our large repo access patterns; lazy clone in particular may make your life easier.

What can I use XetHub for? Is it just for ML?

XetHub is for anything that you want to collaborate on, where reliable history and metadata are important. It is especially useful for iterative workflows where you want to quickly access different versions without fully downloading them, such as large asset development (Unreal and Unity models) or ML model training iterations (model checkpoints), or to replace workflows where you may currently be appending "_1", "_2", etc. to your file names to manually track versions.

What operating systems are supported right now?

macOS and Linux are fully supported. Windows is currently in preview.

Is it ready for production use?

Yes!

Ok, so how much does this cost?

XetHub is free for academics and non-profits, with flexible plans for other use cases. See our pricing page for more details.

You mentioned efficient storage. How does that work?

Magic! Read how Xet deduplication works for a high-level overview and some commands you can run to explore its internals.
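
For a rough intuition before you read the overview, here is a generic sketch of chunk-level deduplication. It is not Xet's actual algorithm: the ChunkStore class, the fixed 4-byte chunk size, and the choice of SHA-256 are all assumptions made purely for illustration. The idea it shows is that two versions of a file that share most of their bytes can also share most of their stored chunks.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, unrealistically small so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store (illustrative only, not Xet's implementation)."""

    def __init__(self):
        self.chunks = {}  # maps chunk hash -> chunk bytes

    def put(self, data):
        """Split data into fixed-size chunks and store each unique chunk once.

        Returns a manifest: the ordered list of chunk hashes for this file.
        """
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first copy of a chunk is kept; repeats are free.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get(self, manifest):
        """Rebuild a file from its manifest."""
        return b"".join(self.chunks[d] for d in manifest)

    def stored_bytes(self):
        """Total bytes actually held in the store."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
v1 = b"aaaabbbbccccdddd"  # first version of a file
v2 = b"aaaabbbbXXXXdddd"  # second version: one chunk changed

m1 = store.put(v1)
m2 = store.put(v2)

raw_bytes = len(v1) + len(v2)
print(raw_bytes, store.stored_bytes())
```

Both versions can still be rebuilt in full from their manifests, yet the store holds fewer bytes than the two files combined because the unchanged chunks are kept only once. Fixed-size chunks are the simplest choice for a sketch; the linked overview explains what Xet actually does.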