Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate quantum kernel matrix elements for identical samples #432

Merged
merged 19 commits into from
Jul 31, 2022

Conversation

adekusar-drl
Copy link
Collaborator

@adekusar-drl adekusar-drl commented Jul 6, 2022

Summary

Introduced a new parameter evaluate_duplicates in QuantumKernel. This parameter defines a strategy how kernel matrix elements are evaluated if identical samples are found. Possible values are:

  • all means that all kernel matrix elements are evaluated, even the diagonal ones when training. This may introduce additional noise in the matrix.
  • non_diagonal when training the matrix diagonal is set to 1, the rest elements are fully evaluated, e.g., for two identical samples in the dataset. When inferring, all elements are evaluated. This is the default value.
  • none when training the diagonal is set to 1 and if two identical samples are found in the dataset the corresponding matrix element is set to 1. When inferring, matrix elements for identical samples are set to 1.

This PR fixes #425, #336.

@coveralls
Copy link

coveralls commented Jul 7, 2022

Pull Request Test Coverage Report for Build 2717867829

  • 20 of 22 (90.91%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.06%) to 87.241%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit_machine_learning/kernels/quantum_kernel.py 20 22 90.91%
Totals Coverage Status
Change from base Build 2717863385: -0.06%
Covered Lines: 2995
Relevant Lines: 3433

💛 - Coveralls

@@ -60,6 +61,7 @@ def __init__(
batch_size: int = 900,
quantum_instance: Optional[Union[QuantumInstance, Backend]] = None,
training_parameters: Optional[Union[ParameterVector, Sequence[Parameter]]] = None,
evaluate_duplicates: str | None = "non_diagonal",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be of type str? If I pass in None it looks like it will work, since it does a str(None) which ends up as "None" and then lowercase makes it the same as passing "none" in. I guess having both I find confusing - I mean the None/Optional aspect here and the "none" documented string value. (In the unit tests you always pass in "none" as a string.)

On a side note, would we want to mix/match in the same function both the Union the new | way of doing things. I think it would be better to be consistent at least at this level, though I realize we are moving towards the newer way of doing things where some modules may now be in new style.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I wanted to show that None is a valid value as well. But I fine to keep just str as the type hint as well. Should I keep only str?
  2. I changed the type hints across the class to the new way.

Copy link
Collaborator

@ElePT ElePT Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Edit] I think that the issue here is that, as Steve pointed out, on line 101 you are doing str(evaluate_duplicates).

If this line did not exist, then the issue would be clear, the type hint should be str, because we have a non-None default, and therefore the argument is not really optional. In other words, without line 101, doing something like evaluate_duplicates = None would override the default, and later on raise an error.

Now, because line 101 exists, then if you do explicitly evaluate_duplicates = None, it later on gets converted into none, and because this is one of the keyword names we accept, then the code goes on without raising an error. But I don't think this is a behaviour we want.

So, I see 2 possible solutions:

  1. We don't use "none" as a keyword, so this edge case never takes place (however, I can't think of a good alternative)
  2. We don't convert to string on line 101, we enforce this condition to the user, and if we really want to accept an explicit None as an input (not an empty argument, but an actual None), then we handle this case in the code.

I would go for not accepting the explicit None, keeping only str as a type hint, and replacing line 101 with:|

eval_duplicates = evaluate_duplicates.lower()

I will make suggestions with this alternative.

Copy link
Collaborator Author

@adekusar-drl adekusar-drl Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let me summarize:

QuantumKernel() -> evaluate_duplicates = off_diagonal
QuantumKernel(evaluate_duplicates=None) -> evaluate_duplicates = none
QuantumKernel(evaluate_duplicates="none") -> evaluate_duplicates = none

This is how it works now and this is exactly what I wanted to achieve. If you think this is a wrong way, let me know why.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, I thought it was not intentional to have the second case... I still find it a bit strange but I don't think it would happen a lot anyways, so if you believe it makes sense I would be ok with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to show that None is a valid value as well. But I fine to keep just str as the type hint as well. Should I keep only str?

My preference would be to get rid of None and have it just as str type taking "all", "off_diagonal" or "none". None would feel to me it ought to be something different, but it ends up as just a synonym for "none" if you will so to me dropping it seems the thing to do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, removed None.

qiskit_machine_learning/kernels/quantum_kernel.py Outdated Show resolved Hide resolved
qiskit_machine_learning/kernels/quantum_kernel.py Outdated Show resolved Hide resolved
elements for identical samples are set to `1`.
fixes:
- |
Fixed quantum kernel evaluation when duplicate samples are found in the dataset. Originally,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when duplicate samples above, apart from the parameter evaluate_duplicates it uses identical to describe samples, Here it uses duplicate like the parameter does, but this is the only place. I guess I should read duplicate and identical as meaning the same thing - i.e. two or more samples having an identical value and thus being in essence duplicates of one another.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. I used both terms in the same meaning. I agree, I was not consistent. In terms of language and docstrings/parameters what is better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find it confusing to have 2 terms, I think it's pretty clear that they refer to the same concept. I believe that maybe balancing them out and using duplicate a bit more should be enough (see suggestions below).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would prefer to see duplicate used more in the descriptions when referring to elements managed by the evaluate_duplicates.

I can see a duplicate as being another sample with an identical value as I mentioned above, but that begs another question I guess....
....when something is a duplicate is this an exact value match? - we have no tolerance here that can be set for when things are very very close that we consider them the same (a duplicate)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with duplicated, thanks to Elena

@adekusar-drl adekusar-drl requested a review from ElePT July 11, 2022 14:06
@adekusar-drl
Copy link
Collaborator Author

@ElePT while the PR is almost done, you review might be helpful.

Copy link
Collaborator

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me, except for minor changes explained in comments above (see suggestions below, sorry for the strange order).

adekusar-drl and others added 7 commits July 13, 2022 10:45
Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>
…f60.yaml

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>
Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>
Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>
@adekusar-drl adekusar-drl merged commit 1261ef1 into qiskit-community:main Jul 31, 2022
@adekusar-drl adekusar-drl deleted the qk_bug branch July 31, 2022 17:09
oscar-wallis pushed a commit that referenced this pull request Feb 16, 2024
* bug fix

* add reno

* format reno

* format reno

* format reno

* format reno

* format reno

* fix tests

* fix style

* fix mypy

* code review

* Update qiskit_machine_learning/kernels/quantum_kernel.py

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>

* Update releasenotes/notes/fix-quantum-kernel-duplicates-b75d6ed8a0f37f60.yaml

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>

* Update qiskit_machine_learning/kernels/quantum_kernel.py

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>

* Update qiskit_machine_learning/kernels/quantum_kernel.py

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>

Co-authored-by: ElePT <57907331+ElePT@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Quantum kernel - missing indices in regression tasks
4 participants