MyGPT
This repository provides a simple example of using Retrieval Augmented Generation (RAG) to answer questions about your personal documents.
Make your own private fork of this repository and clone it. Then put any text files you want into the data/ directory and run:
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
python src/train.py
gradio src/app.py
See below for more instructions to export from Notion or Slack.
With XetHub, you can easily check in all your data and store everything in one place (code, data, embeddings).
To store everything, simply run:
git add .
git commit -a -m "adding all my data"
git push
You can then fetch and run your own personal question answering service from anywhere, simply by cloning this repository.
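For readers new to RAG, the idea can be sketched in a few lines: find the documents most similar to the question, then hand them to the language model as context. The sketch below is illustrative only and is not the code in src/; the word-count "embedding" and the function names embed, cosine, retrieve and build_prompt are assumptions standing in for a real embedding model and index.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda doc: cosine(query, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Combine the retrieved documents and the question into one prompt string."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real pipeline the prompt would be sent to a language model; here it is only assembled, so the sketch runs without an API key.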
Requirements
We use langchain - index and openai.
pip install -r requirements.txt
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
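If you want to fail early when the key is missing, a small check like the following could be run before training. This is a sketch, not part of the repository; require_api_key is a hypothetical helper name.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running src/train.py")
    return key
```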
Usage
Train
# Retrain from scratch
python src/train.py
Run the app
gradio src/app.py
Getting Data
All
- Download your text files; any directory structure works
- Put them into the data/ directory of this repository
- Train the app!
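To check what would be picked up from a nested directory structure, you can list the text files under data/ yourself. The sketch below assumes that "text files" means .txt and .md; the suffix set and the helper name find_text_files are assumptions, and src/train.py may select files differently.

```python
from pathlib import Path

# Assumption: which suffixes count as "text files"; adjust to match your data.
TEXT_SUFFIXES = {".txt", ".md"}


def find_text_files(data_dir="data"):
    """Recursively list text files under data_dir, at any nesting depth."""
    root = Path(data_dir)
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in TEXT_SUFFIXES
    )
```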
Notion
- Follow the steps here: https://www.notion.so/help/export-your-content#export-as-markdown-&-csv
- Unzip the downloaded archive
- Move the unzipped folder into the data/ directory of this repo, then train!
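The unzip and move steps above can also be scripted with the standard library. This is a minimal sketch; extract_export is a hypothetical helper, and the archive name depends on what Notion gives you.

```python
import zipfile
from pathlib import Path


def extract_export(archive_path, data_dir="data"):
    """Unzip an export archive into data_dir/<archive name> and return that folder."""
    archive = Path(archive_path)
    target = Path(data_dir) / archive.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```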
Slack
- Follow the steps here: https://slack.com/help/articles/201658943-Export-your-workspace-data
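A Slack export is JSON rather than plain text, so you may want to flatten it before training. The sketch below assumes the usual export layout (one folder per channel, one JSON file per day, each holding a list of messages with a "text" field); check your own export, since this layout is an assumption and slack_export_to_text is a hypothetical helper, not part of this repository.

```python
import json
from pathlib import Path


def slack_export_to_text(export_dir, data_dir="data/slack"):
    """Write one .txt file per channel containing the text of its messages."""
    out_root = Path(data_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    written = []
    channel_dirs = sorted(p for p in Path(export_dir).iterdir() if p.is_dir())
    for channel_dir in channel_dirs:
        lines = []
        # Day files are named by date, so sorting keeps messages in order.
        for day_file in sorted(channel_dir.glob("*.json")):
            messages = json.loads(day_file.read_text(encoding="utf-8"))
            lines.extend(
                m["text"] for m in messages
                if isinstance(m, dict) and m.get("text")
            )
        if lines:
            out_path = out_root / f"{channel_dir.name}.txt"
            out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
            written.append(out_path)
    return written
```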
Sample Data
A sample dataset is provided in the sample-data directory. Copy the gen-ai folder into the data directory to get a very simple corpus of documents.
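The copy can be done by hand or, as a sketch, with shutil. The default paths follow the directory names mentioned above; copy_sample_data is a hypothetical helper, and dirs_exist_ok requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def copy_sample_data(sample_dir="sample-data/gen-ai", data_dir="data"):
    """Copy the sample folder into data_dir, keeping its folder name."""
    source = Path(sample_dir)
    target = Path(data_dir) / source.name
    # dirs_exist_ok lets the copy be re-run without failing on an existing folder.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```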
About
MyGPT Workshop: Build a ChatGPT For Your Own Data in One Hour