    Repositories list

    • Code to reproduce the paper "Questioning the Survey Responses of Large Language Models"
      Jupyter Notebook
      MIT License
      1800 · Updated Dec 8, 2024
    • folktexts
      Get classification risk scores on tabular tasks using LLMs
      Jupyter Notebook
      MIT License
      01500 · Updated Dec 5, 2024
    • Code to reproduce the experiments in the paper "Training on the Test Task Confounds Evaluation and Emergence"
      Jupyter Notebook
      0700 · Updated Dec 3, 2024
    • Jupyter Notebook
      MIT License
      0000 · Updated Dec 2, 2024
    • Code to reproduce the paper "Do causal predictors generalize better to new domains?"
      Python
      Other
      7700 · Updated Oct 23, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      1.9k100 · Updated Sep 20, 2024
    • lawma
      Lawma: A lightly fine-tuned Llama model for legal classification tasks.
      Jupyter Notebook
      01500 · Updated Sep 14, 2024
    • BenchBench is a Python package to evaluate multi-task benchmarks.
      Python
      MIT License
      11300 · Updated Jul 18, 2024
    • Datasets derived from US census data
      Python
      MIT License
      1824253 · Updated May 15, 2024
    • Achieve error-rate fairness between societal groups for any score-based classifier.
      Python
      MIT License
      41601 · Updated Apr 26, 2024
    • tttlm
      Test-time-training on nearest neighbors for large language models
      Python
      MIT License
      43100 · Updated Apr 18, 2024
    • Code for "Is your model predicting the past?"
      Jupyter Notebook
      MIT License
      0100 · Updated Mar 10, 2024
    • whynot
      A Python sandbox for decision making in dynamics
      Python
      MIT License
      4341882 · Updated Aug 21, 2023