
llm-performance

A compilation of performance metrics for Large Language Models.

Llama 3 model card

A model card, including evaluation details about performance, has been released for the Llama 3 family of models.

Extracted from the website:

This document contains additional context on the settings and parameters for how we evaluated the Llama 3 pre-trained and instruct-aligned models.
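For context, knowledge benchmarks such as MMLU referenced in that document are typically scored as exact-match accuracy over multiple-choice items. The sketch below is a minimal, hypothetical illustration of that scoring loop, not the harness Meta used; the model callable and the item format here are assumptions.

```python
from typing import Callable, List


def evaluate_multiple_choice(
    model: Callable[[str], str],
    items: List[dict],
) -> float:
    """Score a model on multiple-choice items by exact-match accuracy.

    Each item is a dict with a "prompt" (the question plus lettered
    choices) and an "answer" (the correct letter, e.g. "B").
    Assumed format for illustration only.
    """
    correct = 0
    for item in items:
        # Take the first character of the model's reply as its letter choice.
        prediction = model(item["prompt"]).strip().upper()[:1]
        if prediction == item["answer"]:
            correct += 1
    return correct / len(items)


# Example: a trivial stand-in "model" that always answers "A".
sample = [
    {"prompt": "2 + 2 = ?\nA) 4\nB) 5\nAnswer:", "answer": "A"},
    {"prompt": "Capital of France?\nA) Rome\nB) Paris\nAnswer:", "answer": "B"},
]
print(evaluate_multiple_choice(lambda p: "A", sample))  # 0.5
```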

MLCommons

It is hard to argue against MLCommons being the reference point for science-based benchmarking of AI/ML in general.

Extracted from the website:

Building trusted, safe, and efficient AI requires better systems for measurement and accountability. MLCommons’ collective engineering with industry and academia continually measures and improves the accuracy, safety, speed, and efficiency of AI technologies.
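MLCommons' MLPerf suites measure speed and efficiency alongside accuracy. As a rough illustration of the speed axis, the sketch below times a text-generation callable over a set of prompts; the generate function and the summary statistics chosen are assumptions for illustration, not the MLPerf methodology.

```python
import time
from statistics import mean
from typing import Callable, List


def measure_latency(
    generate: Callable[[str], str],
    prompts: List[str],
) -> dict:
    """Record wall-clock latency per request and report summary stats."""
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)  # hypothetical model call; output discarded
        latencies.append(time.perf_counter() - start)
    ordered = sorted(latencies)
    return {
        "mean_s": mean(latencies),
        "p99_s": ordered[int(0.99 * (len(ordered) - 1))],
    }


# Example: time a stand-in "model" that just echoes the prompt.
print(measure_latency(lambda p: p, ["hello", "world"]))
```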
