So you have an ML project.

Your code is versioned, your data lives in a warehouse, and your compute generates artifacts that are saved to a bucket. No problem!

Then your team starts growing…

Multiple teams and projects start referencing and modifying data. More teams lead to more files, more tools, and more security considerations. Provenance is no longer a given, and tracking changes across data sources and tools becomes a nightmare.

Take back control of your ML development.

XetHub brings software development best practices to ML by creating consolidated, versioned project views across all your tools — no workflow changes needed. Connect your sources and let XetHub provide fast access to assets across your stack while guaranteeing full reproducibility and lineage.

Does this sound familiar?

Your team's assets are spread across Git, object stores, and data lakes, while dependencies are tracked in yet another tool.

You review models, datasets, and notebooks with collaborators over Slack and email because there’s no easier way.

Your team’s datasets and models keep growing in size, leading to ever-longer transfer and training times.

XetHub has you covered.

ML development shouldn't be fractured

Let XetHub bridge the contextual gap between your tools so you can focus on development.

• Write or sync outputs to XetHub

• Version across tools and formats

• Time travel to any point in history

• CI/CD your full ML stack

Your one-stop shop for ML collaboration.

Keep everyone on the same page as projects evolve, with easy file- and project-level sharing.

• Branch for safe experimentation

• Add custom views for context

• Review changes in one place

• Granular access controls

Built-in performance and efficiency

Optimized terabyte-scale storage that seamlessly scales to fit your team’s needs.

• Stream files without downloading

• Mount repositories for fast exploration

• Leverage zero-cost clones and branches

• Deploy caches to speed transfers

Average upload time for 10 iterative column additions (seconds to upload):

• Git LFS: 8642s

• DVC: 7743s

• LakeFS: 2671s

• XetHub: 1500s

Upgrade your legacy versioning tools

Stop working around software-era size limitations with seamless versioning at scale. No extra commands or servers needed.

Git LFS

DVC

GB Scale

TB Scale

Clean UX

Managed

Deduped

Enhance your existing ML solutions

Bring your own stack. XetHub adds reproducibility and lineage across the tools you already use.

Object stores

Data lakes and warehouses

Workflow orchestrators

Analytics and reporting tools

Flexible features to fit your team’s needs

Instant access

Stream files and mount repos without waiting for downloads.

Diff tracking

Easily see how your work has evolved over time.

APIs

Programmatically interact with files for easy workflow access.

Git-integrated

Use the Git commands you know to manage files of any size.

Apps

Deploy Streamlit and Gradio apps for interactive exploration.

Actions

Automate your workflows with triggers and schedules.

Deduplication

Save on storage and transfer with automatic block-level dedupe.

Issues

Track concerns and review changes with issues and pull requests.
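
To make the deduplication feature above more concrete, here is a minimal Python sketch of the general idea behind block-level dedupe. It is not XetHub's implementation: the `BlockStore` class, the `BLOCK_SIZE` value, and the use of fixed-size blocks with SHA-256 are all assumptions chosen for illustration, and production systems often use content-defined chunking instead of fixed-size blocks.

```python
import hashlib

# Hypothetical, deliberately tiny block size so the example is easy to follow.
BLOCK_SIZE = 4


class BlockStore:
    """Toy content-addressed store that keeps each unique block only once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put(self, data):
        """Split data into fixed-size blocks and return the list of block digests."""
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Only store the block if an identical one is not already present.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a list of block digests."""
        return b"".join(self.blocks[digest] for digest in recipe)


store = BlockStore()
v1 = store.put(b"aaaabbbbcccc")      # first version of a file
v2 = store.put(b"aaaabbbbccccdddd")  # second version appends one block

raw_bytes = len(b"aaaabbbbcccc") + len(b"aaaabbbbccccdddd")
stored_bytes = sum(len(block) for block in store.blocks.values())
print(f"raw: {raw_bytes} bytes, stored after dedupe: {stored_bytes} bytes")
```

In this sketch the second version shares its first three blocks with the first, so only the one new block needs to be stored or transferred.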

"We rely on computer vision and ML to deliver on Gather AI's mission. XetHub has enabled our ML team to be more productive, collaborate efficiently, and iterate quickly."

Daniel Maturana

Co-founder and Chief ML Scientist

40%

reduction in repository size and transfer time

4

data silos eliminated by switching to XetHub

51%

cost savings over using EBS, Git LFS, and DVC

Connect to your favorite tools

  • Amazon S3, Azure, GCS, Snowflake, Databricks
  • Spark, DuckDB, Tableau, Label Studio, Argilla
  • Jupyter, PyTorch, TensorFlow, HuggingFace, XGBoost
  • MLflow, SageMaker, Flyte, ZenML, Metaflow
  • Aim, Comet, Weights & Biases, Giskard, Plotly
  • Shiny, Kubernetes, Docker, Streamlit, Gradio

Give your ML stack a boost

XetHub consolidates your workflows for accelerated development and delivery.
