-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluate quantum kernel matrix elements for identical samples #432
Changes from 10 commits
309334c
707c938
48c2c40
e77e116
1af37fe
85a413e
48f0ac9
3b69199
d99a194
68d9ed8
dded7f7
82c09aa
63b6029
93274f4
9d38229
bcc4ab2
5b6857c
c83fcd7
97ca0a1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
--- | ||
features: | ||
- | | ||
Introduced a new parameter `evaluate_duplicates` in | ||
:class:`~qiskit_machine_learning.kernels.QuantumKernel`. This parameter defines a strategy how | ||
kernel matrix elements are evaluated if identical samples are found. | ||
adekusar-drl marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Possible values are: | ||
|
||
- ``all`` means that all kernel matrix elements are evaluated, even the diagonal ones when | ||
training. This may introduce additional noise in the matrix. | ||
- ``non_diagonal`` when training the matrix diagonal is set to `1`, the rest elements are | ||
fully evaluated, e.g., for two identical samples in the dataset. When inferring, all | ||
elements are evaluated. This is the default value. | ||
- ``none`` when training the diagonal is set to `1` and if two identical samples are found | ||
in the dataset the corresponding matrix element is set to `1`. When inferring, matrix | ||
elements for identical samples are set to `1`. | ||
fixes: | ||
- | | ||
Fixed quantum kernel evaluation when duplicate samples are found in the dataset. Originally, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's right. I used both terms in the same meaning. I agree, I was not consistent. In terms of language and docstrings/parameters what is better? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't find it confusing to have 2 terms, I think it's pretty clear that they refer to the same concept. I believe that maybe balancing them out and using duplicate a bit more should be enough (see suggestions below). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I would prefer to see duplicate used more in the descriptions when referring to elements managed by the evaluate_duplicates. I can see a duplicate as being another sample with an identical value as I mentioned above, but that begs another question I guess.... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replaced with |
||
kernel matrix elements were not evaluated for identical samples in the dataset and such elements | ||
were set wrongly to zero. Now we introduced a new parameter `evaluate_duplicates` that ensures | ||
that elements of the kernel matrix are evaluated correctly. See the feature section for more | ||
details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this just be of type
str
? If I pass inNone
it looks like it will work, since it does astr(None)
which ends up as"None"
and then lowercase makes it the same as passing"none"
in. I guess having both I find confusing - I mean the None/Optional aspect here and the "none" documented string value. (In the unit tests you always pass in "none" as a string.)On a side note, would we want to mix/match in the same function both the Union the new | way of doing things. I think it would be better to be consistent at least at this level, though I realize we are moving towards the newer way of doing things where some modules may now be in new style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
is a valid value as well. But I fine to keep juststr
as the type hint as well. Should I keep onlystr
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Edit] I think that the issue here is that, as Steve pointed out, on line 101 you are doing
str(evaluate_duplicates)
.If this line did not exist, then the issue would be clear, the type hint should be
str
, because we have a non-None
default, and therefore the argument is not really optional. In other words, without line 101, doing something likeevaluate_duplicates = None
would override the default, and later on raise an error.Now, because line 101 exists, then if you do explicitly
evaluate_duplicates = None
, it later on gets converted intonone
, and because this is one of the keyword names we accept, then the code goes on without raising an error. But I don't think this is a behaviour we want.So, I see 2 possible solutions:
None
as an input (not an empty argument, but an actualNone
), then we handle this case in the code.I would go for not accepting the explicit
None
, keeping onlystr
as a type hint, and replacing line 101 with:|eval_duplicates = evaluate_duplicates.lower()
I will make suggestions with this alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let me summarize:
QuantumKernel() -> evaluate_duplicates = off_diagonal
QuantumKernel(evaluate_duplicates=None) -> evaluate_duplicates = none
QuantumKernel(evaluate_duplicates="none") -> evaluate_duplicates = none
This is how it works now and this is exactly what I wanted to achieve. If you think this is a wrong way, let me know why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright, I thought it was not intentional to have the second case... I still find it a bit strange but I don't think it would happen a lot anyways, so if you believe it makes sense I would be ok with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to get rid of None and have it just as
str
type taking "all", "off_diagonal" or "none". None would feel to me it ought to be something different, but it ends up as just a synonym for "none" if you will so to me dropping it seems the thing to do.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, removed
None
.