A database and collection of LLM results across models and questions.


# LLM Evolution Tracking

To run, first set up a new Python environment:

```sh
python -m venv .venv/
. .venv/bin/activate
pip install -r requirements.txt
```

Then run:

```sh
./runner.py
```

The models used may be changed in `llm_evolution/models.py`.
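The exact contents of `llm_evolution/models.py` are not shown here; a minimal sketch of what a swappable model list might look like (the model identifiers and function name are hypothetical):

```python
# Hypothetical sketch of llm_evolution/models.py: the model identifiers
# the runner iterates over when collecting responses.
MODELS = [
    "gpt-3.5-turbo",
    "gpt-4",
]

def get_models():
    """Return the configured model identifiers as a fresh list."""
    return list(MODELS)
```

Swapping models then only requires editing the `MODELS` list, leaving the runner untouched.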

This will generate a JSON output file containing all the responses.
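The schema of the output file is not documented here; assuming it is a JSON list of response records, a quick sanity check might look like:

```python
import json

def count_responses(path):
    """Count response records in a runner output file.

    Assumes (hypothetically) that the file contains a JSON list,
    one entry per model/prompt response.
    """
    with open(path) as f:
        return len(json.load(f))
```

For structured inspection, `./dump_output.py` (below) is the intended tool.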

To process the new results, run:

```sh
./dump_output.py <output_json>
```

Run `./dump_output.py --help` to see other options.

