store nodes are not available after upgrading golang to 1.20 #19581

Closed
siddarthkay opened this issue Apr 10, 2024 · 10 comments

@siddarthkay
Contributor

Problem

After upgrading Go to 1.20 in status-go and pointing status-mobile to that branch, we observe the following issue:

Trying to open community via a link fails.
Community link I tried: https://status.app/c/iw6AChwKCE5ldyBjb21tEgVkYXNzZBgBIgcjNDM2MERGAw==#zQ3shibjPWsBFi527PL5hyBWVq9tWjuXdSXwBscTKgEEx3YwM

Screenshots

What we see immediately:
Screenshot 2024-04-10 at 11 13 48 AM

What we see after a few seconds:
Screenshot 2024-04-10 at 11 14 56 AM

Metro logs

INFO  2024-04-10T05:42:56.633Z INFO [status-im.contexts.wallet.events:438] 
 - [wallet] Test network enabled:  true
 INFO  2024-04-10T05:42:56.634Z INFO [status-im.contexts.wallet.events:439] 
 - [wallet] Goerli network enabled:  false
 INFO  2024-04-10T05:43:34.595Z INFO [status-im.common.router:275] 
 - [router] uri  status-app://c/iw6AChwKCE5ldyBjb21tEgVkYXNzZBgBIgcjNDM2MERGAw==
 #zQ3shibjPWsBFi527PL5hyBWVq9tWjuXdSXwBscTKgEEx3YwM  matched  :community
  with  {:community-data "iw6AChwKCE5ldyBjb21tEgVkYXNzZBgBIgcjNDM2MERGAw==",
 :community-id "zQ3shibjPWsBFi527PL5hyBWVq9tWjuXdSXwBscTKgEEx3YwM"}
 INFO  2024-04-10T05:43:34.618Z INFO [native-module.core:319] 
 - [native-module] Deserializing and then compressing public key {:fn :deserialize-and-compress-key,
 :key "zQ3shibjPWsBFi527PL5hyBWVq9tWjuXdSXwBscTKgEEx3YwM"}
 INFO  2024-04-10T05:43:34.625Z INFO [legacy.status-im.mailserver.core:142] - fetched historical messages
 ERROR  2024-04-10T05:44:36.680Z ERROR [status-im.contexts.communities.events:230] 
 - {:message "Failed to request community info from mailserver",
 :error {:code -32000, :message "store node is not available"}}
@siddarthkay
Contributor Author

The error is triggered from here -> https://github.com/status-im/status-go/blob/759e5e5c7b7697168305eab6d69383a6914fd657/protocol/messenger_store_node_request_manager.go#L520

It seems that there is a timeout of 30 seconds.

Investigating further.

@siddarthkay
Contributor Author

siddarthkay commented Apr 11, 2024

I added a few logs here -> status-im/status-go@ec02f33

I see some interesting logs in logcat on an Android emulator (API 34):

INFO [04-11|06:58:27.305|github.com/status-im/status-go/protocol/
messenger_store_node_request_manager.go:80]                                
requesting community from store node     
community="{CommunityID:0x033ad301aab4414a8dc3d65094230e767a119d84249e86b5c499fd54498e89bc4c 
Shard:<nil>}" 
config="{WaitForResponse:true StopWhenDataFound:true InitialPageSize:4 FurtherPageSize:20}"
INFO [04-11|06:58:27.305|github.com/status-im/status-go/protocol/messenger_store_node_request_manager.go:497]                               
starting store node request              
requestID="{RequestType:2 
DataID:0x033ad301aab4414a8dc3d65094230e767a119d84249e86b5c499fd54498e89bc4c-shard-info}" 
pubsubTopic=/waku/2/rs/16/64 
contentTopic="[8 221 191 202]"
INFO [04-11|06:58:27.305|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:766]
Waiting for available store node with timeout: 31s 
INFO [04-11|06:58:27.305|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:779]
Mailserver is not available, waiting... 
ERROR[04-11|06:58:27.813|github.com/status-im/status-go/vendor/go.uber.org/zap/sugar.go:222]
failed to resolve local interface addresses error="route ip+net: netlinkrib: permission denied"
INFO [04-11|06:58:28.007|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:145]
Automatically switching mailserver 
INFO [04-11|06:58:28.008|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:235]
Finding a new mailserver... 
INFO [04-11|06:58:28.012|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:264]
connecting error                         
err="lookup store-02.do-ams3.shards.test.statusim.net on [::1]:53: 
read udp [::1]:35038->[::1]:53: read: connection refused"

@siddarthkay
Contributor Author

Some of these errors seem specific to Android. I wonder how the logs would differ on iOS, since the QA team reported earlier that this functionality fails on iOS as well.

Investigating further.

@siddarthkay
Contributor Author

I tried to reproduce this on iOS and was able to fetch the community properly there.
It seems that this log line is the major culprit: failed to resolve local interface addresses error="route ip+net: netlinkrib: permission denied"

Potential fix in golang repo is here -> golang/go#61089

@siddarthkay
Contributor Author

So error="route ip+net: netlinkrib: permission denied" is not the core issue: this log was already present before the Go upgrade and is not responsible for the store node issue.

The next step should be to add more logs to the waitForAvailableStoreNode implementation to figure out why it times out after 30 seconds.

@jakubgs
Member

jakubgs commented Apr 15, 2024

Some relevant links about DNS resolution issues when CGO is not used in Go projects on Android:

@siddarthkay
Contributor Author

The problematic logs are actually these:

INFO [04-15|13:08:09.725|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:265]
connecting error
err="lookup store-02.ac-cn-hongkong-c.shards.test.statusim.net on [::1]:53: 
read udp [::1]:54547->[::1]:53: read: connection refused"

INFO [04-15|13:08:09.725|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:265]
connecting error                         
err="lookup store-01.do-ams3.shards.test.statusim.net on [::1]:53: 
read udp [::1]:35352->[::1]:53: read: connection refused"

INFO [04-15|13:08:09.725|github.com/status-im/status-go/protocol/messenger_mailserver_cycle.go:265]
connecting error
err="lookup store-02.do-ams3.shards.test.statusim.net on [::1]:53: 
read udp [::1]:47643->[::1]:53: read: connection refused"
.
.
.

These logs indicate a DNS resolution failure, which probably occurs at the net.ResolveTCPAddr function call here -> https://github.com/status-im/status-go/blob/4c313c70322d0729fb4bfd6dab2bbad46a441d2e/vendor/github.com/status-im/tcp-shaker/socket.go#L14

These hints suggest that something in the net package changed between Go 1.19 and 1.20, and that the change only breaks on Android (and probably on Linux as well).

The Go 1.20 release notes also say that the go command now disables cgo by default on systems without a C toolchain, which could lead to this issue as well -> https://tip.golang.org/doc/go1.20#go-command

@jakubgs
Member

jakubgs commented Apr 16, 2024

@richard-ramos says he could probably quite easily upgrade go-waku to Go 1.22 since the main problematic dependency already has support for that version, though no release will be made for that fix:

According to Richard the tricky part might be upgrading status-go to use new go-waku.

@siddarthkay
Contributor Author

The DNS resolution issue was fixed by overriding DefaultResolver in the net package so that its Dial function always connects to Google DNS through dialer.DialContext, like this:

// Needs the "context" and "net" packages.
// DNS server that every lookup is sent to.
const bootstrapDNS = "8.8.8.8:53"

var dialer net.Dialer
net.DefaultResolver = &net.Resolver{
	PreferGo: false,
	// Ignore the network and address chosen by the resolver and
	// always dial bootstrapDNS over UDP.
	Dial: func(ctx context.Context, _, _ string) (net.Conn, error) {
		return dialer.DialContext(ctx, "udp", bootstrapDNS)
	},
}

Note: we only do this on Android devices, since DNS resolution works fine on iOS.

@jakubgs
Member

jakubgs commented May 6, 2024

We have to keep in mind that hard-coding Google DNS (8.8.8.8) is bad practice, from both a reliability and a privacy perspective.

But good work figuring out the fix.
