schema disagreement error attempting to insert data after the Scylla upgrade #1150
node-1 (the one being upgraded)
node-2 updates the schema:
node-1 notices the other nodes about 2 minutes later and gets the new schema from them:
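For context, this kind of schema disagreement can be observed by comparing the schema_version each node reports. Below is a minimal sketch using the Python cassandra-driver; the contact point is a placeholder, not taken from the test code:

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver

# Hypothetical contact point; use any reachable node of the cluster.
cluster = Cluster(["node-1"])
session = cluster.connect()

# Schema version of the coordinator itself.
local = session.execute("SELECT schema_version FROM system.local").one()
# Schema versions the coordinator currently sees for its peers.
peers = session.execute("SELECT peer, schema_version FROM system.peers")

versions = {str(local.schema_version)}
versions.update(str(row.schema_version) for row in peers)

# One version means the cluster agrees on the schema; more than one means
# inserts may fail with a schema disagreement error until gossip settles.
print("schema agreement" if len(versions) == 1 else f"disagreement: {versions}")
```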
@vponomaryov, I think this might be a k8s-related issue, and we'll need @scylladb/team-operator to take a closer look here.
@fruch
@fruch why is it marked as master/triage?
It was suspected to be a core issue, but it seems that's not the case.
Why do you think it's k8s-related? What's the condition you wait for before you issue an insert?
We are waiting like this, i.e. that the version is what we expect and that the CQL port is open:

def wait_till_scylla_is_upgraded_on_all_nodes(self, target_version: str) -> None:
    def _is_cluster_upgraded() -> bool:
        for node in self.db_cluster.nodes:
            node.forget_scylla_version()
            if node.scylla_version != target_version or not node.db_up:
                return False
        return True

    wait.wait_for(
        func=_is_cluster_upgraded,
        step=30,
        text="Waiting until all nodes in the cluster are upgraded",
        timeout=900,
        throw_exc=True,
    )

What else should we wait for before using the cluster?
In my view, you should look at ScyllaCluster.Status.Conditions - not keeping quorum through rollouts is a known issue on k8s - #1077
We will look at checking this status as well
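As a rough sketch of what that check could look like (using the official Python kubernetes client; the namespace, cluster name, and the exact condition types to gate on are assumptions that depend on the deployment and operator version):

```python
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()
api = client.CustomObjectsApi()

# Hypothetical namespace and name; adjust to the ScyllaCluster under test.
sc = api.get_namespaced_custom_object(
    group="scylla.scylladb.com",
    version="v1",
    namespace="scylla",
    plural="scyllaclusters",
    name="scylla-cluster",
)

status = sc.get("status", {})

# Conditions published by the operator; which type/status combination to
# wait on depends on the operator version, so inspect what is reported.
for cond in status.get("conditions") or []:
    print(cond.get("type"), cond.get("status"), cond.get("reason"))

# Per-rack readiness is another signal that a rollout has settled.
for rack, rack_status in (status.get("racks") or {}).items():
    print(f"{rack}: {rack_status.get('readyMembers')}/{rack_status.get('members')} ready")
```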
@mykaul if it's agreed this is an operator issue, can you help us move it there? @zimnx it seems there are some strong arguments about the suggested solution for #1077; is it still moving forward?
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
/lifecycle stale
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
/lifecycle rotten
The Scylla Operator project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
/close not-planned
@scylla-operator-bot[bot]: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Issue description
Impact
User cannot perform some queries.
How frequently does it reproduce?
It was reproduced 2 times out of 2 runs.
Installation details
Kernel Version: 5.15.0-1020-gke
Scylla version (or git commit hash): 5.0.5-20221009.5a97a1060 with build-id 5009658b834aaf68970135bfc84f964b66ea4dee
Relocatable Package: http://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-5.1/scylla-x86_64-package-5.1.2.0.20221225.4c0f7ea09893.tar.gz
Operator Image: scylladb/scylla-operator:1.8.0-rc.0
Operator Helm Version: 1.8.0-rc.0
Operator Helm Repository: https://storage.googleapis.com/scylla-operator-charts/latest
Cluster size: 3 nodes (n1-standard-8)
Scylla Nodes used in this run:
No resources left at the end of the run
OS / Image: N/A (k8s-gke: us-east1-b)
Test: upgrade-major-scylla-k8s-gke
Test id: 207bdbdc-673c-4c52-ac37-44faddabe464
Test name: scylla-operator/operator-1.8/upgrade/upgrade-major-scylla-k8s-gke
Test config file(s):
Running a Scylla upgrade from 5.0.5-0.20221009.5a97a1060 with build-id 5009658b834aaf68970135bfc84f964b66ea4dee to 5.1.2-0.20221225.4c0f7ea09893 with build-id 4817fe236d57eca203f35b1dbb4bfe43cab72590 on the K8S backend (GKE), we faced the following problem. Logs with error:

We ran lots of commands, but the same one failed in the same place in 2 different test runs. The second test run was using enterprise Scylla, upgrading from version 2021.1.17-0.20221221.5318a7fec with build-id d4378bd13d179b4bbcde7bdc82b92d8cc71c52d8 to 2022.1.3-0.20220922.539a55e35 with build-id d1fb2faafd95058a04aad30b675ff7d2b930278d.

$ hydra investigate show-monitor 207bdbdc-673c-4c52-ac37-44faddabe464
$ hydra investigate show-logs 207bdbdc-673c-4c52-ac37-44faddabe464

Logs:
Jenkins job URL