This repository has been archived by the owner on Nov 25, 2022. It is now read-only.

'No route to host?' with active Internet connection #2746

Closed
Filazapovich opened this issue Sep 19, 2019 · 17 comments
Labels
engine An issue related to the Defold engine

Comments

@Filazapovich

Filazapovich commented Sep 19, 2019

Some of the people who download our game (roughly 10%, by our estimate) are affected by a strange error even though they have an active and stable Internet connection:

ERROR:SCRIPT: Unable to create HTTP connection to [Some server]. No route to host?

Native gethostbyname works fine and resolves host IP. Logcat from device:

D/libc-netbsd(17157): [getaddrinfo]: mtk hostname=[Some server]; servname=(null); cache_mode=(null), netid=0; mark=0
D/libc-netbsd(17157): getaddrinfo( app_uid:10242
D/libc-netbsd(17157): getaddrinfo() uid prop:
D/libc-netbsd(17157): getaddrinfo() getuid():10242
D/libc-netbsd(17157): [getaddrinfo]: mtk ai_addrlen=0; ai_canonname=(null); ai_flags=0; ai_family=0
D/libc-netbsd(  356): [getaddrinfo]: mtk hostname=[Some server]; servname=(null); cache_mode=local, netid=122; mark=917626
D/libc-netbsd(  356): getaddrinfo( app_uid:0
D/libc-netbsd(  356): [getaddrinfo]: mtk ai_addrlen=0; ai_canonname=(null); ai_flags=0; ai_family=0
D/libc-netbsd(  356): res_queryN name = [Some server], class = 1, type = 28
D/libc-netbsd(  356): res_queryN name = [Some server], class = 1, type = 1
D/libc-netbsd(  356): res_queryN name = [Some server] succeed
D/libc-netbsd(17157): getaddrinfo: [Some server] get result from proxy >>

Thoughts about this:

Our QA team has two devices that show this error: Sony Xperia M5 5603 with Android 5.1 and Lenovo Vibe K5 Plus A6020a46 with Android 5.1.1. That is not case with Android 5.1.

Reproducible with Defold 1.2.161, 1.2.159, and earlier.


Build time: 2019-09-02T09:40:18.383211
Defold channel: stable
Defold editor sha: 45635ad26f85009c52905724e242cc92dd252146
Defold engine sha: 45635ad26f85009c52905724e242cc92dd252146
Defold version: 1.2.161
GPU: Intel(R) HD Graphics 630
GPU Driver: 4.5.0 - Build 23.20.16.4973
Java version: 11.0.1+13
OS arch: amd64
OS name: Windows 10
OS version: 10.0

@britzl
Contributor

britzl commented Sep 19, 2019

Thank you for reporting this. We haven't heard of this being an issue for anyone else but nevertheless we are currently investigating it.

@jhonnyking

jhonnyking commented Sep 19, 2019

"Lenovo Vibe K5 Plus A6020a46 with Android 5.1.1. That is not case with Android 5.1."
Does this mean that you can't reproduce it on Android 5.1 with that device?
Does the issue only happen on Android, or do you have similar issues on iOS?

I took a look at the link you provided, and if I understand correctly, the addition that was made lets you provide c-ares with a list of search locations that it should look through for the resolv.conf file. This has since been adopted into c-ares, but it seems strange that the default implementation cannot find the resolv.conf file. Could you look into the file system on those devices and see whether the /etc/resolv.conf file exists and, if possible, whether you have read access to it and what its contents are?

Edit: The resolv.conf file isn't used on Android. From the c-ares code:

/* Use the Android connectivity manager to get a list
 * of DNS servers. As of Android 8 (Oreo) net.dns#
 * system properties are no longer available. Google claims this
 * improves privacy. Apps now need the ACCESS_NETWORK_STATE
 * permission and must use the ConnectivityManager which
 * is Java only. */

@Filazapovich
Author

Filazapovich commented Sep 19, 2019

"Lenovo Vibe K5 Plus A6020a46 with Android 5.1.1. That is not case with Android 5.1."
Does this mean that you can't reproduce it on Android 5.1 with that device?
Does the issue only happen on Android, or do you have similar issues on iOS?

This case is 100% reproducible on the Sony Xperia M5 5603 and on the Lenovo Vibe K5 Plus A6020a46 from our bank of QA devices. Both run Android 5.1, but the error is not reproducible on our other Android 5.1 devices. I have no information about a similar issue on iOS.

vlaaad added the engine label Sep 20, 2019
@jhonnyking

I was able to run into a few issues with the tpacketcapture app. It fails on Android 8+ devices 100% of the time, but only occasionally on the 5.1 devices:

  • When it fails on the 5.1 devices, we get an SSL handshake failure after 300-1500 sequential requests. If I switch the requests to regular HTTP instead, it doesn't seem to fail at all. Yesterday I could do 60k+ sequential requests without any of them failing.
  • On the 8+ devices, we don't get any DNS servers at all when running in capture mode, so it's not surprising that it fails. For this case we discussed adding a default DNS server such as 8.8.8.8 to see if it helps.

So for now, until we can patch the DNS case, could you switch your request from https to http and see if it works better? Obviously not ideal, but consider it a workaround for now. I'll see if I can send you a WIP engine that you guys can try out this week.

@Filazapovich
Author

We added a fallback to the server's IP address as a temporary solution.
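
A minimal sketch of what such a fallback to IP could look like in a Defold script. This is an illustration, not the code used in the game: the function name request_with_fallback, the hostname, and the IP address are all hypothetical, and the request function is passed in as a parameter so the logic can also run outside the engine (in a game it would be http.request).

```lua
-- Try each URL in order; move on to the next one only when the request
-- fails at the connection level. `request` is injected (an assumption made
-- for testability); in Defold it would be http.request.
local function request_with_fallback(request, urls, method, on_done, index)
	index = index or 1
	request(urls[index], method, function(self, id, response)
		-- The logs in this thread show status 0 when no HTTP response
		-- was received at all.
		if response.status == 0 and index < #urls then
			request_with_fallback(request, urls, method, on_done, index + 1)
		else
			on_done(response, urls[index])
		end
	end)
end

-- Usage inside a Defold script (hostname and IP are placeholders):
-- request_with_fallback(http.request,
-- 	{ "http://game.example.com/ping", "http://203.0.113.10/ping" },
-- 	"GET", function(response, url) print(url, response.status) end)
```

Connecting by IP skips DNS resolution entirely, but a server that relies on the Host header or on certificate hostname checks may reject such requests, so this is only suitable as a stopgap.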

@Filazapovich
Author

I think a better solution to the missing DNS servers problem is a property in the game.project [network] section holding a comma-separated list of DNS servers to use as defaults when no DNS servers are found on the system. The default value for this property should be 8.8.8.8. This solution causes no problems with distribution in China.
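
The selection logic proposed here could be sketched roughly as follows. This is only an illustration under stated assumptions: no such property exists in game.project as of this thread, the names parse_dns_servers and choose_dns_servers are hypothetical, and in practice the engine would do this in native code next to c-ares rather than in Lua.

```lua
-- Built-in default suggested in this thread.
local DEFAULT_DNS_SERVERS = { "8.8.8.8" }

-- Split a comma-separated string into a list of trimmed, non-empty entries.
local function parse_dns_servers(value)
	local servers = {}
	for entry in string.gmatch(value or "", "[^,]+") do
		local trimmed = entry:match("^%s*(.-)%s*$")
		if trimmed ~= "" then
			servers[#servers + 1] = trimmed
		end
	end
	return servers
end

-- Prefer the servers found on the system, then the configured list,
-- then the built-in default.
local function choose_dns_servers(system_servers, configured_value)
	if system_servers and #system_servers > 0 then
		return system_servers
	end
	local configured = parse_dns_servers(configured_value)
	if #configured > 0 then
		return configured
	end
	return DEFAULT_DNS_SERVERS
end
```

Keeping the list configurable means a project can replace the default with servers that are reachable in the regions where it ships.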

@jhonnyking

jhonnyking commented Sep 26, 2019

I have a PR that sets the DNS servers to 8.8.8.8 and 8.8.4.4 (and the equivalent IPv6 variants) if ares didn't find anything in the device configuration. If you want, I could send you that version so you can test it and see if it resolves your issue. I have tested it on a device with Android 8 and the tpacketcapture app running, and there are no more DNS issues.

If the solution works, then maybe we could add something similar to the game.project, but it needs a round of discussions internally first.

@Filazapovich
Author

Ok. I'm ready to test.

@Filazapovich
Author

Filazapovich commented Sep 27, 2019

The test failed, but without the "No route to host?" error.
Code:

local function check_http(url, method)
	http.request(url, method, function(self, id, response)
		print(method .. " " .. url .. " : " .. response.status)
	end, nil, "")
end


local function run_batch_http(method)
	check_http("https://example.com", method)
	check_http("https://graph.facebook.com/me", method)
	check_http("[Some server]", method)
end


local function run_test()
	print("------------------------   run test   -------------------------")
	run_batch_http("GET")
	run_batch_http("POST")
end

Output (from Windows and from Android):

DEBUG:SCRIPT: GET [Some server] : 400
DEBUG:SCRIPT: GET https://graph.facebook.com/me : 400
WARNING:DLIB: Unhandled ssl status code: -78 (-FFFFFFB2)
ERROR:SCRIPT: HTTP request to '[Some server]' failed (http result: -1  socket result: -1000)
DEBUG:SCRIPT: POST [Some server] : 0
ERROR:SCRIPT: HTTP request to 'https://example.com' failed (http result: -1  socket result: -5)
DEBUG:SCRIPT: POST https://example.com : 0
DEBUG:SCRIPT: GET https://example.com : 304
ERROR:SCRIPT: HTTP request to 'https://graph.facebook.com/me' failed (http result: -1  socket result: -5)
DEBUG:SCRIPT: POST https://graph.facebook.com/me : 0

@britzl
Contributor

britzl commented Nov 19, 2019

Will investigate.

@britzl
Contributor

britzl commented Nov 19, 2019

Surprisingly the https request to graph.facebook.com/me fails with a socket error. The -78 is MBEDTLS_ERR_NET_SEND_FAILED. Doing the same with curl gives a response:

{"error":{"message":"An active access token must be used to query information about the current user.","type":"OAuthException","code":2500,"fbtrace_id":"Ai5KpDDPg2Pdeu-APCy2kPr"}}

We must be missing something in our config or in the way we set up timeouts.
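
For anyone matching the log line against the mbedtls headers, here is a small sketch of how the two numbers in the warning relate. mbedtls declares MBEDTLS_ERR_NET_SEND_FAILED as -0x004E, and FFFFFFB2 is the 32-bit two's-complement bit pattern of that same value. The helper name to_u32_hex is made up for this example.

```lua
-- mbedtls declares MBEDTLS_ERR_NET_SEND_FAILED as -0x004E.
local MBEDTLS_ERR_NET_SEND_FAILED = -0x4E

-- Format a (possibly negative) integer as its unsigned 32-bit bit pattern.
-- Lua's % takes the sign of the divisor, so the result is non-negative.
local function to_u32_hex(n)
	return string.format("%X", n % 4294967296)
end

print(MBEDTLS_ERR_NET_SEND_FAILED)             -- decimal form, as in the log
print(to_u32_hex(MBEDTLS_ERR_NET_SEND_FAILED)) -- hex form, as in the log
```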

@britzl
Contributor

britzl commented Nov 19, 2019

Correction and some additional observations:

The socket error doesn't show up 100% of the time (more like 50%).

The test provided in the example makes both GET and POST requests at the same time. If I only do GET requests the problem never happens.

If I do only a POST request the problem happens.

@britzl
Contributor

britzl commented Nov 19, 2019

Isolating the problem further:

http.request("https://graph.facebook.com/me", "POST", function(self, id, response)
	print("POST : " .. response.status)
end, nil, "")

Sometimes results in:

WARNING:DLIB: Unhandled ssl status code: -78 (-FFFFFFB2)

Aaaand it never happens if the request actually contains POST data:

http.request("https://graph.facebook.com/me", "POST", function(self, id, response)
	print("POST : " .. response.status)
end, nil, "foo=bar")

This makes sense from one point of view, but I guess a POST request should contain data.
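
Based on the two requests above, a script running on an affected engine version could work around the problem by never sending an empty POST body. The following is a rough sketch, not an official recommendation: the wrapper name safe_post is hypothetical, the request function is injected so the logic can be exercised outside Defold (in a game it would be http.request), and the placeholder body only helps if the server is willing to ignore it.

```lua
-- Body substituted for empty POSTs; "foo=bar" is the value used in the
-- request above that did not fail.
local PLACEHOLDER_BODY = "foo=bar"

-- Send a POST, replacing a nil or empty body with the placeholder.
-- `request` is injected (an assumption for testability); in Defold it
-- would be http.request.
local function safe_post(request, url, callback, headers, body)
	if body == nil or body == "" then
		body = PLACEHOLDER_BODY
	end
	request(url, "POST", callback, headers, body)
end

-- Usage inside a Defold script:
-- safe_post(http.request, "https://graph.facebook.com/me",
-- 	function(self, id, response) print("POST : " .. response.status) end)
```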

@britzl
Contributor

britzl commented Nov 20, 2019

We will include a fix for 0 content length POSTs on SSL connections in 1.2.165.

britzl closed this as completed Nov 20, 2019
@britzl
Contributor

britzl commented Nov 28, 2019

@Filazapovich The fix was included in 1.2.164. Can you please confirm that you no longer have any problems with HTTP(S) requests?

@Filazapovich
Author

We will not update Defold to 1.2.164+ until January.
