lechmazur

CEO, Advameg, Inc.

2 followers · 0 following

Popular repositories

confabulations Public

Hallucinations (Confabulations) Document-Based Benchmark for RAG

HTML · 51 stars · 2 forks
deception Public

Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation …

14 stars · 1 fork
ChessCounter Public

Estimate the number of legal chess positions

C++ · 11 stars
nyt-connections Public

Benchmark that evaluates LLMs using 436 NYT Connections puzzles

Python · 8 stars
lechmazur.github.io Public

HTML