
MyGPT Workshop: Build a ChatGPT For Your Own Data in One Hour



This repository provides a simple example of using Retrieval Augmented Generation (RAG) to answer questions about your personal documents.
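To make the idea of RAG concrete, here is a minimal sketch of the retrieve-then-prompt pattern in plain Python. It is not the code in src/ (which relies on the libraries listed further down); it swaps real embeddings for a simple bag-of-words cosine similarity, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Put the retrieved text in front of the question for a language model."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical documents standing in for the files in data/.
docs = [
    "Our team standup is at 9am on Mondays.",
    "The deploy script lives in the tools folder.",
]
prompt = build_prompt("Where does the deploy script live?", docs)
print(prompt)
```

A real pipeline would replace the word-count vectors with embeddings and send the prompt to a language model, but the shape stays the same: find the most relevant text, then ask the question with that text as context.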

Make your own private fork of this repository and clone it. Then put any text files you want into the data/ directory and run:

python src/train.py
gradio src/app.py

See below for instructions on exporting from Notion or Slack.

With XetHub, you can easily check in all your data and store everything in one place (code, data, embeddings).

To store everything, simply run:

git add .
git commit -a -m "adding all my data"
git push

And now you will be able to fetch and run your own personal question answering service from anywhere simply by cloning this repository.


We use langchain - index and openai. Install the dependencies with:

pip install -r requirements.txt



# Retrain from scratch
python src/train.py

Run the app

gradio src/app.py

Getting Data


Text files

  1. Download your text files; any directory structure works
  2. Put them into the data directory of this repository
  3. Train the app!
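Before training, it can help to check which files in the data directory will be visible as text. The sketch below walks the directory recursively and lists them. The set of suffixes is an assumption made for illustration; the actual training script may accept other file types.

```python
from pathlib import Path

# Assumption: these suffixes count as "text files" for this check.
TEXT_SUFFIXES = {".txt", ".md"}


def find_text_files(data_dir="data"):
    """Return a sorted list of text files found anywhere under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in TEXT_SUFFIXES
    )


for path in find_text_files():
    print(path)
```

Run it from the repository root; an empty listing means nothing matching those suffixes was found under data/.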


Notion

  1. Follow the steps here: https://www.notion.so/help/export-your-content#export-as-markdown-&-csv
  2. Unzip the downloaded archive
  3. Move the unzipped folder into the data directory of this repo, then train!
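The unzip-and-move steps above can also be scripted. The following is a minimal sketch using Python's standard zipfile module; the function name and the example archive name are hypothetical, and the archive is extracted into a subfolder of the data directory named after the zip file.

```python
import zipfile
from pathlib import Path


def unpack_export(zip_path, data_dir="data"):
    """Extract an exported archive into data_dir/<archive name>/ and return that path."""
    zip_path = Path(zip_path)
    target = Path(data_dir) / zip_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Example usage (hypothetical file name):
# unpack_export("notion-export.zip")
```

After extraction, retrain so the new documents are included.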


Slack

  1. Follow the steps here: https://slack.com/help/articles/201658943-Export-your-workspace-data

Sample Data

A sample dataset is provided in the sample-data directory. Copy the gen-ai folder into the data directory to get a very simple corpus of documents.
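If you prefer to script the copy, here is a small sketch using shutil from the standard library. The paths match the directory names mentioned above and are relative to the repository root; the function name is hypothetical.

```python
import shutil
from pathlib import Path


def copy_sample(src="sample-data/gen-ai", data_dir="data"):
    """Copy the sample folder into data_dir, keeping its folder name."""
    src = Path(src)
    dest = Path(data_dir) / src.name
    # dirs_exist_ok lets the copy be repeated without failing (Python 3.8+).
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest


# Example usage, run from the repository root:
# copy_sample()
```

Once the folder is in place, train and launch the app as described earlier.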


