It's AI detector leaderboard submission #8
Add It's AI submission
Eval run succeeded! Link to run: link
Here are the results of the submission(s):
It's AI
Release date: 2024-09-04
I've committed detailed results of this detector's performance on the test set to this PR. On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an accuracy of 87.25%.
Congrats on the new SOTA! Merged.
Thanks! By the way, I've seen that in the paper you also scored commercial AI detectors, but I haven't found them on the leaderboard. Is there a way to compare our solution with theirs?
Yes. They aren't included on the leaderboard because, due to budget constraints, we were only able to run the commercial detectors on a small portion of our test set. We're currently working with these companies to get more credits so that we can add them to the leaderboard as well. We hope to add them soon.
Glad to hear that you are going to add them too. Anyway, great work with the RAID dataset. I'm really impressed with the amount and diversity of data that you've collected and with the first AI-detection leaderboard that you've made.
Thank you! And thanks for submitting :)
Thank you @liamdugan for the prompt update of the leaderboard. It is great to have a new SOTA method. Congrats @sergak0! I notice on your GitHub page (https://github.com/It-s-AI/llm-detection) that there are two baseline models: the perplexity-based logistic regression model and the fine-tuned DeBERTa model. Is the DeBERTa model the SOTA submission here, or is it the result from subnet 32? Btw, I love your idea of using a Bittensor subnet for the AI detection task.
@AIApprentice101 yeah, it's the result from subnet 32.
Thanks, hope to make our detector even better in the future.