
minio buckets are missing #1853

Closed
jefflill opened this issue Aug 19, 2023 · 3 comments
Labels: bug (Identifies a bug or other failure), neon-kube (Related to our Kubernetes distribution)

Comments

@jefflill (Collaborator)

It looks like minio isn't working anymore because the version we're on is 1 year old. I've been trying to track down the cluster operator performance issue and found that minio-ss is pegging the CPU, loki-ingestor is also struggling (probably trying to do S3 operations), and loki-compactor and a few other mimir components are crashing.

Note that the cluster seemed fine with low CPU utilization at first, for maybe an hour or two while I was working, and then went bad overnight.

Cluster info: tiny Hyper-V, but with 16 GiB RAM.

minio-ss log:

λ neon logs -f -n=neon-system minio-ss-0-0                                                                                            
Formatting 1st pool, 1 set(s), 4 drives per set.                                                                                      
WARNING: Host local has more than 0 drives of set. A host failure will result in data becoming unavailable.                           
                                                                                                                                      
API: SYSTEM()                                                                                                                         
Time: 00:43:34 UTC 08/19/2023                                                                                                         
DeploymentID: 89588c77-6635-4097-8909-ab917b96a508                                                                                    
Error: Unable to initialize storage class config: Standard storage class parity 3 should be less than or equal to 2 (*fmt.wrapError)  
      10: internal/logger/logger.go:278:logger.LogIf()                                                                                
       9: cmd/config-current.go:725:cmd.applyDynamicConfigForSubSys()                                                                 
       8: cmd/config-current.go:756:cmd.applyDynamicConfig()                                                                          
       7: cmd/config-current.go:618:cmd.lookupConfigs()                                                                               
       6: cmd/config-current.go:871:cmd.loadConfig()                                                                                  
       5: cmd/config.go:235:cmd.initConfig()                                                                                          
       4: cmd/config.go:195:cmd.(*ConfigSys).Init()                                                                                   
       3: cmd/server-main.go:404:cmd.initConfigSubsystem()                                                                            
       2: cmd/server-main.go:371:cmd.initServer()                                                                                     
       1: cmd/server-main.go:552:cmd.serverMain()                                                                                     
                                                                                                                                      
 You are running an older version of MinIO released 1 year ago                                                                        
 Update: Run `mc admin update`                                                                                                        
                                                                                                                                      
                                                                                                                                      
MinIO Object Storage Server                                                                                                           
Copyright: 2015-2022 MinIO, Inc.                                                                                                      
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>                                                                      
Version: RELEASE.2022-07-04T21-02-54Z (go1.18.3 linux/amd64)                                                                          
                                                                                                                                      
Status:         4 Online, 0 Offline.                                                                                                  
API: http://minio.neon-system.svc.cluster.local                                                                                       
Console: http://10.254.120.231:9090 http://127.0.0.1:9090                                                                             
                                                                                                                                      
Documentation: https://docs.min.io                                                                                                    

This is a bit confusing though:

  • ERROR: Unable to initialize storage class config: Standard storage class parity 3 should be less than or equal to 2 (*fmt.wrapError)

    This could be bad, but it doesn't look like it was fatal. The message itself makes sense: with only 4 drives in the set, MinIO caps the standard storage class parity at half the drives (2), so EC:3 is rejected:

    https://blog.min.io/configurable-data-and-parity-drives-on-minio-server/

  • You are running an older version of MinIO released 1 year ago

    I think this is just informational. At first, I thought this might be fatal, but that would be crazy.

  • Status: 4 Online, 0 Offline.

    This seems to imply that things are fine (a quick sanity check is sketched below).
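
To double-check that the server really is healthy despite the storage class error, something like the following should work (a sketch; the neon-minio alias and the credentials are placeholders, not values from this cluster):

    # Register an mc alias for the in-cluster MinIO endpoint (alias/credentials are illustrative).
    mc alias set neon-minio http://minio.neon-system.svc.cluster.local <access-key> <secret-key>

    # Show servers, drives online/offline, and uptime.
    mc admin info neon-minio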

loki-distributor log:

level=warn ts=2023-08-19T12:22:39.740308871Z caller=logging.go:86 traceID=0a4dafcf332e1d4c orgID=tiny-hyperv msg="POST /loki/api/v1/push (500) 7.50222ms Response: \"at least 1 live replicas required, could only find 0 - unhealthy instances: 10.254.120.228:9095\\n\" ws: false; Content-Length: 199194; Content-Type: application/x-protobuf; User-Agent: GrafanaAgent/; X-Scope-Orgid: tiny-hyperv; "
level=warn ts=2023-08-19T12:22:39.940772339Z caller=logging.go:86 traceID=566db4ebbe001dfa orgID=tiny-hyperv msg="POST /loki/api/v1/push (500) 14.208149ms Response: \"at least 1 live replicas required, could only find 0 - unhealthy instances: 10.254.120.228:9095\\n\" ws: false; Content-Length: 198665; Content-Type: application/x-protobuf; User-Agent: GrafanaAgent/; X-Scope-Orgid: tiny-hyperv; "
level=warn ts=2023-08-19T12:22:40.695451414Z caller=logging.go:86 traceID=0750ae8ba4a8c8ce orgID=tiny-hyperv msg="POST /loki/api/v1/push (500) 9.165102ms Response: \"at least 1 live replicas required, could only find 0 - unhealthy instances: 10.254.120.228:9095\\n\" ws: false; Content-Length: 198665; Content-Type: application/x-protobuf; User-Agent: GrafanaAgent/; X-Scope-Orgid: tiny-hyperv; "
level=warn ts=2023-08-19T12:22:42.123810025Z caller=logging.go:86 traceID=17de6854879d1fa7 orgID=tiny-hyperv msg="POST /loki/api/v1/push (500) 5.910137ms Response: \"at least 1 live replicas required, could only find 0 - unhealthy instances: 10.254.120.228:9095\\n\" ws: false; Content-Length: 198665; Content-Type: application/x-protobuf; User-Agent: GrafanaAgent/; X-Scope-Orgid: tiny-hyperv; "
level=warn ts=2023-08-19T12:22:42.295139603Z caller=logging.go:86 traceID=247821e2a7647a45 orgID=tiny-hyperv msg="POST /loki/api/v1/push (500) 5.037447ms Response: \"at least 1 live replicas required, could only find 0 - unhealthy instances: 10.254.120.228:9095\\n\" ws: false; Content-Length: 199145; Content-Type: application/x-protobuf; User-Agent: GrafanaAgent/; X-Scope-Orgid: tiny-hyperv; "
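
These distributor errors mean it can't find a healthy ingester in the ring, which fits loki-ingestor struggling against the broken object store rather than a problem in the distributor itself. One way to confirm is the ring status page the distributor serves (a sketch; the namespace, deployment name, and port are assumptions for this cluster):

    # Port-forward the Loki distributor (namespace/name/port are illustrative).
    kubectl -n neon-monitor port-forward deploy/loki-distributor 3100:3100 &

    # The /ring page lists the ingester instances and whether each one is healthy.
    curl -s http://localhost:3100/ring
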
@jefflill added the bug and neon-kube labels on Aug 19, 2023
@jefflill (Collaborator, Author) commented Aug 19, 2023

info: https://blog.min.io/configurable-data-and-parity-drives-on-minio-server/

I see that we're setting the environment variable MINIO_STORAGE_CLASS_STANDARD=EC:3 in the Minio Helm values file:

    ## Add environment variables to be set in MinIO container (https://github.com/minio/minio/tree/master/docs/config)
    env:
      - name: GOGC
        value: "25"
      - name: CONSOLE_PROMETHEUS_URL
        value: http://mimir.neon-monitor.svc.cluster.local/prometheus
      - name: CONSOLE_PROMETHEUS_AUTH_TYPE
        value: public
      - name: CONSOLE_PROMETHEUS_JOB_ID
        value: minio
      - name: MINIO_PROMETHEUS_URL
        value: http://mimir.neon-monitor.svc.cluster.local/prometheus
      - name: MINIO_PROMETHEUS_AUTH_TYPE
        value: public
      - name: MINIO_PROMETHEUS_JOB_ID
        value: minio
      - name: MINIO_STORAGE_CLASS_STANDARD      # <--- the problem setting
        value: EC:3

The MinIO readme implies that minio defaults to the STANDARD storage class and that a reasonable EC:? value is selected automatically.

  • I'm going to try removing the MINIO_STORAGE_CLASS_STANDARD environment variable from the Helm values file and see whether the error goes away and whether I still see crashes in a new cluster. I suspect minio is just ignoring the EC:3 value and using its defaults instead.

    JEFF UPDATE: Removing the environment variable fixed the logged error (the resulting defaults can be read back as sketched below).
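
As a follow-up check, the effective storage class config can be read back from the server to see what parity MinIO chose on its own (a sketch, reusing the neon-minio alias from the earlier example):

    # Show the storage_class subsystem; with 4 drives the default should come back as EC:2 or lower.
    mc admin config get neon-minio storage_class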

@jefflill (Collaborator, Author) commented Aug 19, 2023

This is interesting: the MinIO dashboard indicates that minio is healthy, but there are no buckets?

[screenshots: MinIO console showing a healthy server but no buckets listed]

So the monitoring buckets must never have been created, or they were deleted? I must have broken something.
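
Listing the buckets directly confirms what the console shows (a sketch, reusing the neon-minio alias from the earlier example; bucket names vary by cluster):

    # List all buckets; the monitoring buckets Loki/Mimir are configured to use should show up here.
    mc ls neon-minio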

@jefflill changed the title from "minio is failing due to being 1 year old" to "minio buckets are missing" on Aug 19, 2023
@jefflill (Collaborator, Author)

My bad: the logic for the new [Controller(IgnoreWhenNotInPod = true)] attribute property was incorrect.
