XetHub

langchain_talk

forked from xdssio/langchain_demo

A langchain demo project for pygrunn talk.

xdssio


A langchain demo in xethub

We use LangChain indexes and OpenAI to answer questions about fairy tales by indexing text files.

Requirements

$ python -m venv .venv
$ . .venv/bin/activate
$ pip install -r requirements.txt
$ export OPENAI_API_KEY=YOUR_OPENAI_API_KEY

If you want to try searching the internet, sign up for a SerpAPI key first:

export SERPAPI_API_KEY=YOUR_SERP_API_KEY
pip install google-search-results

Check out the index and SQL database

git xet checkout -- model
git xet checkout -- data/imdb.db

Instructions

Search docs

To search within a group of files, we must first split them into small chunks, embed each chunk as a vector, and save the vectors to an index.
At query time, we find the most relevant chunks, retrieve them, and construct a prompt from the query and the retrieved context.
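
The steps above can be sketched in plain Python. This is only an illustration under simplifying assumptions: the embedding is a toy bag-of-words count (a real setup would call an embedding model), the sample story is made up for the example, and all function names here are hypothetical rather than part of this repo or of LangChain.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=10, overlap=2):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: word counts, skipping words shorter than 4 letters (crude stopword filter)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if len(t) > 3)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(text):
    """The 'index' is just a list of (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in split_into_chunks(text)]


def retrieve(index, query, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


story = ("Geppetto was a poor woodcarver. He carved a wooden puppet and named it Pinocchio. "
         "Pinocchio came alive and called Geppetto his father. "
         "A fox and a cat later stole his gold coins.")
index = build_index(story)
question = "Who was Pinocchio's father?"
top_chunks = retrieve(index, question)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The overlap between chunks is there so that a sentence cut at a chunk boundary still appears partly in the next chunk. The prompt would then be sent to the LLM, which this sketch leaves out.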

Usage

With langchain, building a doc search engine is as simple as:

from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator
import glob

# One loader per text file in the data folder
loaders = [TextLoader(file_path) for file_path in glob.glob('data/*.txt')]
# Chunk, embed, and store the documents in a vector index
index = VectorstoreIndexCreator().from_loaders(loaders)
index.query_with_sources("Who was Pinocchio's father?")

In this repo we made an Index helper that wraps this for training and querying workflows.

from src.index import Index

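# Build the index from the files in data/
index = Index(model_path='model').fit("data", reset=True)
# Load an existing index (path assumed to match model_path above)
index = Index.load('model')
index.query("Who was Pinocchio's father?")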

Running python src/train.py does the same.
If you clone the repo as is, the index is already populated.
Chunking-strategies

SQL queries

We saved a SQLite database with the IMDb dataset in the same data folder. With LangChain we can query it in natural language.

from langchain import OpenAI, SQLDatabase
from langchain.chains import SQLDatabaseSequentialChain

db_chain = SQLDatabaseSequentialChain.from_llm(llm=OpenAI(temperature=0),
                                               database=SQLDatabase.from_uri("sqlite:///data/imdb.db"),
                                               verbose=True)

db_chain.run("How many movies are there?")
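
To make the idea concrete, here is a rough sketch of the two pieces such a chain works with: the schema text that goes into the prompt, and the SQL that comes back and is executed. The table, its columns, and its rows are hypothetical stand-ins (the real schema of data/imdb.db is not shown here), and the SQL is hard-coded where the LLM would normally generate it.

```python
import sqlite3

# Hypothetical stand-in for data/imdb.db, built in memory with placeholder rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Movie A", 1990), ("Movie B", 2001), ("Movie C", 2015)],
)


def table_info(connection):
    """Return the CREATE statements, the kind of schema text that is placed in the prompt."""
    rows = connection.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return "\n".join(sql for (sql,) in rows)


question = "How many movies are there?"
# In the real chain the LLM writes this query from the question and the schema
generated_sql = "SELECT COUNT(*) FROM movies"
(count,) = conn.execute(generated_sql).fetchone()

print(table_info(conn))
print(f"{question} -> {count}")
```

Because the generated SQL is executed as is, it is worth pointing such a chain at a read-only copy of the database.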

Python

We can also run Python code with LangChain using PythonREPL (Read-Eval-Print Loop).

from langchain.agents import Tool
from langchain.utilities import PythonREPL

repl_tool = Tool(
    name="python_repl",
    description="A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.",
    func=PythonREPL().run
)
repl_tool.run("print('Hello World')")
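
The tool description asks for `print(...)` because the tool returns what the command writes to standard output. A minimal sketch of that behaviour, assuming nothing about LangChain's actual implementation (`run_python` is a hypothetical helper):

```python
import contextlib
import io


def run_python(command):
    """Execute a Python snippet and return whatever it printed, or the error message."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(command, {})  # fresh namespace for every call
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue()


print(run_python("print('Hello World')"))
print(repr(run_python("x = 21 * 2")))  # nothing was printed, so the result is empty
print(run_python("1 / 0"))
```

Like the real tool, this executes arbitrary code, so it should only be given input you trust.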

Search the internet

You'll need a SerpAPI key. SerpAPIWrapper reads it from the SERPAPI_API_KEY environment variable:
export SERPAPI_API_KEY=YOUR_SERP_API_KEY
pip install google-search-results

from langchain.utilities import SerpAPIWrapper

search = SerpAPIWrapper()
print(search.run("Obama's first name?"))

If you'd rather not use SerpAPI, you can use Wikipedia instead.

pip install wikipedia

from langchain.utilities import WikipediaAPIWrapper

print(WikipediaAPIWrapper().run("Who was Pinocchio's father?"))

Chatbots

A chatbot is built by keeping the last few messages plus a base system message and sending both to the model on every turn.

A quick example:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = OpenAI(temperature=0)
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=ConversationBufferMemory()
)

conversation.predict(input="how are you doing?")

A more ChatGPT-like assistant can be built by adjusting the system prompt:

from langchain import OpenAI, LLMChain, PromptTemplate
from langchain.memory import ConversationBufferWindowMemory
import pathlib

prompt = PromptTemplate(
    input_variables=["history", "human_input"],
    template=pathlib.Path("prompts/assistant.txt").read_text()  # I made her a bit sassier
)

chatgpt_chain = LLMChain(
    llm=OpenAI(temperature=0),
    prompt=prompt,
    verbose=True,
    memory=ConversationBufferWindowMemory(k=4),
)
print(chatgpt_chain.predict(human_input="Do you believe in the moon landing?"))
print(chatgpt_chain.predict(human_input="What is in area 51?"))
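
The windowed memory used above can be pictured as a fixed-size queue of past exchanges. The following is a simplified sketch of that idea, not LangChain's implementation; the `WindowMemory` class and the prompt layout are assumptions made for illustration.

```python
from collections import deque


class WindowMemory:
    """Keep a system message plus only the last k (human, ai) exchanges."""

    def __init__(self, system_message, k=4):
        self.system_message = system_message
        self.turns = deque(maxlen=k)  # the oldest exchange is dropped automatically

    def add(self, human, ai):
        self.turns.append((human, ai))

    def build_prompt(self, human_input):
        history = "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)
        return f"{self.system_message}\n{history}\nHuman: {human_input}\nAI:"


memory = WindowMemory("You are a helpful assistant.", k=2)
memory.add("Hi", "Hello!")
memory.add("Who carved Pinocchio?", "Geppetto.")
memory.add("Was he rich?", "No, he was poor.")

prompt = memory.build_prompt("Where did he live?")
print(prompt)
```

With k=2 the first exchange ("Hi") no longer appears in the prompt, which keeps the prompt length bounded no matter how long the conversation runs.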

Combining everything

We can use all of these capabilities as tools and provide them to our agent.

from langchain import OpenAI, SerpAPIWrapper
from langchain.agents import Tool, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool

llm = OpenAI(temperature=0)

tools = [
    Tool(
        name="Search",
        func=SerpAPIWrapper().run,
        description="useful for when you need to answer questions about current events. You should ask targeted questions"
    ),
    Tool(
        name="Python",
        func=PythonREPLTool().run,
        description="useful for when you need to calculate something using programming"
    ),
]
mrkl = initialize_agent(tools, llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
mrkl.run("What is the capital of France? and use python to get a hash of it")
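
Conceptually, the agent picks a tool by name, feeds it an input, reads the observation, and repeats. Here is a stripped-down sketch of that loop with the LLM's decisions replaced by a scripted plan. The `lookup` function is a hypothetical stand-in for the search tool, and the `{previous}` placeholder is an invention of this example, not a LangChain feature.

```python
import hashlib


def lookup(query):
    """Hypothetical stand-in for a search tool: a tiny hard-coded knowledge base."""
    facts = {"capital of France": "Paris"}
    return facts.get(query, "unknown")


def sha256_hex(text):
    """Stand-in for the Python tool: hash the given text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


TOOLS = {"Search": lookup, "Python": sha256_hex}


def run_plan(plan):
    """Run (tool_name, tool_input) steps; {previous} is replaced by the last observation."""
    observation = ""
    trace = []
    for tool_name, tool_input in plan:
        observation = TOOLS[tool_name](tool_input.replace("{previous}", observation))
        trace.append((tool_name, observation))
    return trace


# A real agent asks the LLM for the next step after each observation; this plan is scripted
plan = [("Search", "capital of France"), ("Python", "{previous}")]
for tool_name, observation in run_plan(plan):
    print(f"{tool_name} -> {observation}")
```

The tool descriptions in the real agent play the role of the hard-coded plan here: they are what the LLM reads to decide which tool to call next.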

Run the app

gradio app.py
