Context
I'm implementing the use of Code Quality in GitLab for the company I work at.
Code Quality relies on JSON reports generated in a specific format, with each error having some metadata and a fingerprint.
My understanding is that in ruff, the fingerprint is a hash generated from:
Past changes
In the past, the line number at which the error was reported was included in the hash, but thankfully it was removed. If someone added code above an existing issue in the file, that issue would get a new fingerprint, and the Code Quality widget in GitLab would show every issue as both fixed and new, duplicating everything and making the tool unusable.
Relevant issues/discussions on this subject:
#3996 : initial discussion
#7203 : issue solving the above problem
#7159
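To illustrate why including the line number was fragile, here is a minimal Python sketch. The `fingerprint` helper and its inputs (rule code, message, path, optional line) are assumptions made for illustration only, not ruff's actual implementation.

```python
import hashlib
from typing import Optional


def fingerprint(rule: str, message: str, path: str, line: Optional[int] = None) -> str:
    """Hypothetical fingerprint: hash of rule, message, path and (optionally) line."""
    parts = [rule, message, path]
    if line is not None:
        parts.append(str(line))
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()[:16]


# The same diagnostic before and after three unrelated lines are inserted above it.
before = ("F841", "Local variable `x` is assigned to but never used", "app.py", 10)
after = ("F841", "Local variable `x` is assigned to but never used", "app.py", 13)

# With the line number in the hash, the fingerprint changes although nothing was fixed.
with_line_changed = fingerprint(*before) != fingerprint(*after)
# Without the line number, the fingerprint stays stable.
without_line_changed = fingerprint(*before[:3]) != fingerprint(*after[:3])

print(with_line_changed)     # True
print(without_line_changed)  # False
```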
Current problem
This is all positive and a clearly welcome improvement. However, I see a shortcoming with this approach, which is actually mentioned in the issue referenced above.
Let's assume we have a bunch of errors in a file that all have the same error message. I'll blatantly reuse the example by @MichaReiser in the thread linked above:
Let's say we have two unused variable diagnostics in a file:
x = 1
y = 2
There's a hash collision for x and y, so y performs a second hashing round. We now fix the first violation:
x = 1
y = 2
print(x)
There's no longer a hash collision for y, meaning that the diagnostic for y now gets the hash (fingerprint) of the violation that used to be for x.
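The collision handling described above can be sketched roughly as follows. This is only an approximation under stated assumptions: `base_hash` and `assign_fingerprints` are hypothetical names, and both diagnostics are given the same message so that they collide, as the example assumes.

```python
import hashlib


def base_hash(message: str, path: str, salt: str = "") -> str:
    """Hypothetical hash over path and message; `salt` carries the previous round's hash."""
    return hashlib.sha256(f"{salt}|{path}|{message}".encode()).hexdigest()[:12]


def assign_fingerprints(diagnostics, path):
    """Assign a unique fingerprint to each (label, message) pair, in file order."""
    used = set()
    result = {}
    for label, message in diagnostics:
        fp = base_hash(message, path)
        # On a collision, hash again using the colliding fingerprint as salt.
        while fp in used:
            fp = base_hash(message, path, salt=fp)
        used.add(fp)
        result[label] = fp
    return result


MSG = "unused variable"  # identical message for both diagnostics (assumption)

before = assign_fingerprints([("x", MSG), ("y", MSG)], "app.py")
after = assign_fingerprints([("y", MSG)], "app.py")  # x has been fixed

print(before["x"] != before["y"])  # True: y went through a second hashing round
print(after["y"] == before["x"])   # True: y inherited x's old fingerprint
```

The labels `x` and `y` exist only so the sketch can track which diagnostic is which; they are deliberately not part of the hash.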
If I do this with GitLab's Code Quality, assuming a report with these two errors was generated on my target merge branch, I might end up with a case where:
y's unused error is now marked fixed even though it isn't
x's unused error is still present and not marked as fixed even though it is.
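A small sketch of how this misattribution can arise, assuming the comparison between the target branch report and the merge request report is done purely on fingerprints. The `compare_reports` helper and the placeholder fingerprints `h1` and `h2` are hypothetical.

```python
def compare_reports(base: dict, head: dict):
    """Each report maps fingerprint -> description. Returns (fixed, new) descriptions."""
    fixed = sorted(base[fp] for fp in base.keys() - head.keys())
    new = sorted(head[fp] for fp in head.keys() - base.keys())
    return fixed, new


# Target branch: x got the first-round hash, y the second-round hash.
base = {"h1": "x is unused", "h2": "y is unused"}
# Merge request: x is fixed, so y now receives the first-round hash.
head = {"h1": "y is unused"}

fixed, new = compare_reports(base, head)
print(fixed)  # ['y is unused']
print(new)    # []
```

The fingerprint `h1` is present in both reports, so the finding it stood for on the target branch (x) is treated as unchanged, while `h2` (y) disappears and is reported as fixed.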
You can also imagine plenty of other scenarios where I fix a bunch of errors like the one above but introduce other identical errors later in the code. Following the same logic, the fixed ones will not be marked as fixed, and the new ones won't even be picked up and displayed as new errors.
Specific scenario
To bring this back to my specific scenario, I noticed problems while testing the feature: I would add docstrings to functions missing them, then implement new functions without a docstring, and end up with the same problem as above. Why is that? Because the message of the missing docstring rule for a public function (D103) doesn't mention the name of the function, so all missing docstring errors are the same.
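Here is a rough sketch of that D103 scenario, assuming the fingerprint depends only on the file path and the message, with a re-hash on collision. The `fingerprints` helper, the file name and the function names in the comments are all made up for illustration.

```python
import hashlib


def fingerprints(messages, path):
    """Return the set of fingerprints for a file's diagnostics (hypothetical scheme)."""
    used = set()
    for message in messages:
        fp = hashlib.sha256(f"{path}|{message}".encode()).hexdigest()
        # Re-hash with the colliding fingerprint until the value is unique.
        while fp in used:
            fp = hashlib.sha256(f"{fp}|{path}|{message}".encode()).hexdigest()
        used.add(fp)
    return used


D103 = "Missing docstring in public function"

# Target branch: load() and save() both lack a docstring.
base = fingerprints([D103, D103], "utils.py")
# Merge request: load() is now documented, save() still is not,
# and a new undocumented export() was added.
head = fingerprints([D103, D103], "utils.py")

print(base - head)  # set(): nothing is reported as fixed
print(head - base)  # set(): nothing is reported as new
```

Because both reports contain two identical D103 messages for the same file, they produce the same two fingerprints, and a fingerprint-based comparison sees no change at all.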
I understand the current implementation is still functional and the above examples are maybe fringe for most people, but I still wanted to log it as it represents a limitation in our project.
I also wanted to know if anyone has any idea of how this could be improved without re-adding the line number, which creates many more problems than the above.
We're interested in anyone's ideas on an approach to this that doesn't rely on line numbers and isn't prone to the above problem. I tried to find some guidance or best practices from GitLab but couldn't find any.
Do you use any other linters in your project that don't have the limitation you outlined above?