RFC: Grades are coming for everything. Here's the draft rubric. #34
Replies: 9 comments 28 replies
-
Awesome, I think this mechanism is sound and good for C4. The only doubt I have is about the working code requirements: "Below 50 may be increasingly penalized in the future."
-
Interesting idea, and I like scoring the submissions. Poor-quality reports, in my opinion, don't merit a payout: finding vulnerabilities is useless if you can't explain or justify them. But I have a big issue with the payouts being distributed on a curve. All that does is bring more subjectivity into what should be an objective contest. Disqualify reports under a threshold, but distribution should be even for everyone who qualifies.
-
Will this also be implemented for QA? I believe it should.
-
It would be nice to have a clear statement of the intended effect of this proposed change. As discussed in another comment thread, how good or bad this change is depends strongly on the grading curve. If the grading is too granular and/or the curve is too steep, it will reduce audit quality: wardens may shift more of their effort into polishing their submissions to secure a higher spot on the curve rather than searching for more issues. A balance needs to be struck between requiring a minimum level of quality and capping the maximum useful quality.

I believe any submission that meets a certain set of criteria should be rewarded the same. Submissions that obviously lack quality, are malicious, and/or are spam should be penalized or sanctioned accordingly, but there should nevertheless be a set, objective threshold beyond which an auditor can feel certain that investing more effort will not lead to a larger reward for that issue. This lets wardens focus more strongly on finding more issues while still providing an incentive to meet a quality standard. If the total required effort is simply shifted from judges to wardens without a useful reduction in overall effort, it makes C4 less efficient. The more we can centralize and streamline different steps of the process, the more efficient and therefore competitive C4 can be.

I'd argue that a warden's core focus should mainly be finding high, medium, and low/non-critical issues, and creating submissions with a PoC, impact, and severity justification for each issue. Judging, creating mitigation recommendations, and final report compilation can all be done in a more centralized manner downstream. Specifically requiring wardens to create mitigation recommendations and make their submissions report-ready unnecessarily duplicates work per issue. Instead, these steps of the process should be centralized and separated to allow for specialization and higher efficiency. As an added note, I haven't mentioned gas issues yet, as I feel those could better be separated into their own offering.

TL;DR: If this proposal leads to judges spending much less time in exchange for wardens spending a little more on submissions, I'm all for it. Conversely, if it requires wardens to spend a lot more time on submissions while saving little for judges and not really contributing to overall audit quality, then it should be revised and/or better specified.
-
I think it's also important to put some emphasis on having the sponsor include basic test/deploy scripts (as they mostly do), and make them as understandable, extendable, and usable for the wardens as possible (they mostly are, but there's always room for improvement). This can save wardens a lot of time.
-
I would suggest that first-time offenders get a warning before permanent penalties are imposed. If the penalty applies only to the contest at hand, that would be fine. As a relatively new warden, I have made mistakes myself, and even repeated those mistakes because I had no feedback on my reports; I simply didn't know about them. One could argue I should have taken the time to go through the rules, but I was caught up between learning Solidity, coding, and submitting. In those times, how much I wished for some constructive feedback! It would be unfair to expect a first-time auditor to know all the rules and conventions for submitting. So for the first offense, show the stick, and from the second one onward, give a whack. :) I appreciate how the C4 team is working toward making C4 even better. The grade system is great. Though it might not benefit a beginner like me at this point, from a holistic perspective it's the right approach. Thank you. _/\_
-
Some judges do not currently upgrade QA to Med/High, and given that wardens won't know ahead of time which judge will be judging, this will lead to wardens submitting inflated severities. With the new scoring system, will judges be required to upgrade, will upgrades be done away with, or will it still be up to each judge to decide what they want to do?
-
I am completely, directionally on board with this proposal: grades are going to be a big improvement and I trust that we'll work out the details. (In fact, I am on board with adopting this as is and trusting judges to figure it out.) But I've noticed a couple specific edge cases over the past few weeks I want to call out.
If there's a theme in both of these, I think it's that combining "grading" and "scoring" may be more complicated than it appears. (But that should not stop us from designing a way to do it!)
-
Grades for everything
Soon™ we will also be asking judges to grade everything, including medium and high severity issues, on the 0 to 100 scale, just as QA and gas reports are now. Within a given set of duplicates, awards will be distributed on a curve based on the judges' grading. This approach gives judges the flexibility to have their own style. (Some prefer buckets, others prefer granularity; the 100-point scale and curve allow both.)
There are three aspects to the grading criteria for quality submissions (60+):
Only "passing" grades will be eligible to be included in awards. The minimum threshold for passing is 60.
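To make the mechanism concrete, here is a minimal sketch of how a curve-based split among duplicates could work. The threshold of 60 comes from the proposal above; the warden names, pot size, and the power-curve exponent are purely illustrative assumptions, not C4's actual award formula.

```python
# Hypothetical sketch of curve-based awards among duplicate findings.
# PASSING_THRESHOLD comes from the proposal; everything else
# (names, pot, exponent) is an illustrative assumption.

PASSING_THRESHOLD = 60

def split_pot(grades, pot, steepness=2.0):
    """Split `pot` among duplicate submissions graded 0-100.

    Grades below the passing threshold receive nothing; eligible
    submissions are weighted by grade ** steepness, so a steeper
    curve rewards higher grades disproportionately.
    """
    payouts = {wid: 0.0 for wid in grades}
    weights = {wid: g ** steepness
               for wid, g in grades.items() if g >= PASSING_THRESHOLD}
    total = sum(weights.values())
    for wid, w in weights.items():
        payouts[wid] = pot * w / total
    return payouts

# Three duplicates of the same finding, graded 95, 70, and 40:
# the 40 fails the threshold and gets nothing; the other two split
# the pot roughly ~648 / ~352 under this example curve.
print(split_pot({"alice": 95, "bob": 70, "carol": 40}, pot=1000.0))
```

Note that `steepness=1.0` gives a split simply proportional to grade, while giving all eligible submissions equal weight reproduces the flat even split some commenters prefer, so the same mechanism covers both positions depending on how the curve is tuned.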
Medium and high severity finding criteria
Medium and high severity issues will require some level of clear evidence and a justification for why they merit that severity within the criteria guidelines. The more complex the claimed vulnerability, the more working code is required to demonstrate the conclusion. (To support this, we will be asking sponsors to include their full code repo.)
Submitting a high severity issue and failing to include working code which demonstrates the impact is a risk wardens may take, but this may lead to a high severity issue being downgraded and/or deemed ineligible for awards.
To ensure folks are aware of this requirement, we will be adding the severity descriptions to the finding form when a severity level is selected, clearly stating that the issue will not be awarded if it does not include the appropriate evidence.
Draft rubric
Passing grades:
Borderline passing:
Borderline failing:
Below 50 may be increasingly penalized in the future: