[leo_gateway] Crash when multiple clients requesting the same large object without disk cache #433

Closed
windkit opened this issue Nov 10, 2015 · 6 comments

Comments

@windkit
Contributor

windkit commented Nov 10, 2015

Description

The fix for issue #325 assumes that the disk cache is active.
When the disk cache is not configured, a 'read' mode client crashes because no cache worker is found.
Moreover, because the disk cache is not present, the 'read' mode client cannot read the object at all.
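
The failure mode can be modeled outside LeoFS. The sketch below is a Python illustration, not the gateway's Erlang code: `pick_worker` and `begin_transaction` are hypothetical names, and `zlib.crc32` merely stands in for `erlang:phash2`. It assumes that the second argument passed to `phash2` in the error log below is the number of cache workers, which is 0 when no cache is configured, and it shows one possible guard that returns an error tuple instead of hashing into an empty range.

```python
import zlib


def pick_worker(key: bytes, worker_count: int) -> int:
    """Map a key to a worker index in [0, worker_count)."""
    if worker_count < 1:
        # Mirrors the badarg case: there is no valid index to return.
        raise ValueError("no cache workers available")
    return zlib.crc32(key) % worker_count


def begin_transaction(key: bytes, worker_count: int):
    """Return ("ok", index) or ("error", reason) without raising."""
    if worker_count < 1:
        # Assumption: a disabled cache shows up as zero workers.
        return ("error", "cache disabled")
    return ("ok", pick_worker(key, worker_count))
```

With zero workers the guarded call returns an error tuple, so a caller could choose a fallback rather than crash.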

Error Log

[E] gateway_0@127.0.0.1 2015-11-10 09:46:33.617538 +0900    1447116393  null:null   0   gen_server <0.1874.0> terminated with reason: bad argument in call to erlang:phash2(<<"test/testfile">>, 0) in leo_cache_api:put_begin_tran/2 line 254
[E] gateway_0@127.0.0.1 2015-11-10 09:46:33.617842 +0900    1447116393  null:null   0   ["CRASH REPORT ",[80,114,111,99,101,115,115,32,"<0.1874.0>",32,119,105,116,104,32,"1",32,110,101,105,103,104,98,111,117,114,115,32,"exited",32,119,105,116,104,32,114,101,97,115,111,110,58,32,[["bad argument in call to ",["erlang",58,"phash2",40,["<<","\"test/testfile\"",">>"],44,32,"0",41]," in ",[["leo_cache_api",58,"put_begin_tran",47,"2"],[32,108,105,110,101,32,"254"]]]," in ",[["gen_server",58,"terminate",47,"7"],[32,108,105,110,101,32,"804"]]]]]
[E] gateway_0@127.0.0.1 2015-11-10 09:46:33.618125 +0900    1447116393  null:null   0   Ranch listener leo_gateway_s3_api had connection process started with cowboy_protocol:start_link/4 at <0.1872.0> exit with reason: {badarg,[{erlang,phash2,[<<"test/testfile">>,0],[]},{leo_cache_api,put_begin_tran,2,[{file,"src/leo_cache_api.erl"},{line,254}]},{leo_large_object_get_handler,put_begin_tran_with_retry,1,[{file,"src/leo_large_object_get_handler.erl"},{line,288}]},{leo_large_object_get_handler,handle_call,3,[{file,"src/leo_large_object_get_handler.erl"},{line,123}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}
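
The CRASH REPORT line is printed as a nested list of strings and character codes, which is hard to read. A minimal sketch for decoding it follows, assuming the payload after the log prefix is copied out as one string. In these logs that payload happens to be valid JSON, which is what the sketch relies on; this does not hold for Erlang terms in general. `flatten_iolist` and `decode_crash_report` are hypothetical helper names, and the sample is a shortened version of the line above.

```python
import json


def flatten_iolist(term):
    """Flatten nested lists of strings and character codes into one string."""
    if isinstance(term, str):
        return term
    if isinstance(term, int):
        # Integers in the list are character codes, e.g. 80 is "P".
        return chr(term)
    return "".join(flatten_iolist(item) for item in term)


def decode_crash_report(payload: str) -> str:
    """Parse the bracketed payload and return it as readable text."""
    return flatten_iolist(json.loads(payload))


# Shortened sample taken from the CRASH REPORT line above.
sample = (
    '["CRASH REPORT ",[80,114,111,99,101,115,115,32,"<0.1874.0>",32,'
    '[["bad argument in call to ",["erlang",58,"phash2"]]]]]'
)
decoded = decode_crash_report(sample)
```

Applied to the full payload, the same function yields the sentence that the first log line already shows in plain form, followed by the `gen_server:terminate/7` location.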
@windkit
Contributor Author

windkit commented Dec 16, 2015

Note that this problem still exists in 1.4.0-pre3.

Test Case
$  ./s3cmd put testfile s3://test/test
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
testfile -> s3://test/test  [part 1 of 7, 15MB]
 15728640 of 15728640   100% in    0s    62.58 MB/s  done
testfile -> s3://test/test  [part 2 of 7, 15MB]
 15728640 of 15728640   100% in    0s    59.67 MB/s  done
testfile -> s3://test/test  [part 3 of 7, 15MB]
 15728640 of 15728640   100% in    0s    49.73 MB/s  done
testfile -> s3://test/test  [part 4 of 7, 15MB]
 15728640 of 15728640   100% in    0s    60.85 MB/s  done
testfile -> s3://test/test  [part 5 of 7, 15MB]
 15728640 of 15728640   100% in    0s    47.69 MB/s  done
testfile -> s3://test/test  [part 6 of 7, 15MB]
 15728640 of 15728640   100% in    0s    59.48 MB/s  done
testfile -> s3://test/test  [part 7 of 7, 10MB]
 10485760 of 10485760   100% in    0s    51.72 MB/s  done

Then, from two consoles, request the same object at the same time:

$ ./s3cmd get s3://test/test dl1
s3://test/test -> dl1  [1 of 1]
 104857600 of 104857600   100% in    0s   135.81 MB/s  done
$ ./s3cmd get s3://test/test dl2
s3://test/test -> dl2  [1 of 1]
         0 of 104857600     0% in    0s     0.00 B/s  failed
WARNING: Retrying failed request: /test (EOF from S3!)
WARNING: Waiting 3 sec...
s3://test/test -> dl2  [1 of 1]
 104857600 of 104857600   100% in    0s   137.45 MB/s  done
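
The two-console step can be scripted so that both downloads start at nearly the same time. This is a minimal sketch, assuming `./s3cmd` is in the working directory and the object `s3://test/test` was uploaded as shown above; `run_concurrently` and `reproduce` are hypothetical helper names.

```python
import subprocess


def run_concurrently(commands):
    """Start every command before waiting on any; return exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


def reproduce(s3cmd="./s3cmd", url="s3://test/test"):
    """Issue two GETs for the same object, as in the two-console test."""
    return run_concurrently([
        [s3cmd, "get", url, "dl1"],  # dl1 and dl2 are the output names used above
        [s3cmd, "get", url, "dl2"],
    ])
```

Because s3cmd retries the failed request, both exit codes may be 0 even when the gateway crashed, so the gateway's error log still needs to be checked.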
Log
[E] gateway_0@127.0.0.1 2015-12-16 18:44:41.356829 +0900    1450259081  null:null   0   gen_server <0.1703.0> terminated with reason: no match of right hand value {error,"Invalid operation"} in leo_large_object_get_handler:handle_call/3 line 123
[E] gateway_0@127.0.0.1 2015-12-16 18:44:41.357258 +0900    1450259081  null:null   0   ["CRASH REPORT ",[80,114,111,99,101,115,115,32,"<0.1703.0>",32,119,105,116,104,32,"1",32,110,101,105,103,104,98,111,117,114,115,32,"exited",32,119,105,116,104,32,114,101,97,115,111,110,58,32,[["no match of right hand value ",[123,["error",44,"\"Invalid operation\""],125]," in ",[["leo_large_object_get_handler",58,"handle_call",47,"3"],[32,108,105,110,101,32,"123"]]]," in ",[["gen_server",58,"terminate",47,"7"],[32,108,105,110,101,32,"804"]]]]]
[E] gateway_0@127.0.0.1 2015-12-16 18:44:41.357724 +0900    1450259081  null:null   0   Ranch listener leo_gateway_s3_api had connection process started with cowboy_protocol:start_link/4 at <0.1701.0> exit with reason: {{badmatch,{error,"Invalid operation"}},[{leo_large_object_get_handler,handle_call,3,[{file,"src/leo_large_object_get_handler.erl"},{line,123}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}

@yosukehara yosukehara reopened this Dec 16, 2015
@yosukehara
Member

@windkit Which version of Leo's Gateway did you test?
It does not seem to be the latest v1.4 branch:

reason: no match of right hand value {error,"Invalid operation"} in leo_large_object_get_handler:handle_call/3 line 123

@windkit
Contributor Author

windkit commented Dec 17, 2015

The problem also exists when no memory cache is configured (cache.cache_ram_capacity = 0):

[E] gateway_0@127.0.0.1 2015-12-17 09:28:53.45679 +0900 1450312133  null:null   0   gen_server <0.2057.0> terminated with reason: bad argument in call to erlang:phash2(<<"test/test">>, 0) in leo_cache_api:put_begin_tran/2 line 251
[E] gateway_0@127.0.0.1 2015-12-17 09:28:53.46074 +0900 1450312133  null:null   0   ["CRASH REPORT ",[80,114,111,99,101,115,115,32,"<0.2057.0>",32,119,105,116,104,32,"1",32,110,101,105,103,104,98,111,117,114,115,32,"exited",32,119,105,116,104,32,114,101,97,115,111,110,58,32,[["bad argument in call to ",["erlang",58,"phash2",40,["<<","\"test/test\"",">>"],44,32,"0",41]," in ",[["leo_cache_api",58,"put_begin_tran",47,"2"],[32,108,105,110,101,32,"251"]]]," in ",[["gen_server",58,"terminate",47,"7"],[32,108,105,110,101,32,"804"]]]]]
[E] gateway_0@127.0.0.1 2015-12-17 09:28:53.48426 +0900 1450312133  null:null   0   Ranch listener leo_gateway_s3_api had connection process started with cowboy_protocol:start_link/4 at <0.2055.0> exit with reason: {badarg,[{erlang,phash2,[<<"test/test">>,0],[]},{leo_cache_api,put_begin_tran,2,[{file,"src/leo_cache_api.erl"},{line,251}]},{leo_large_object_get_handler,put_begin_tran_with_retry,1,[{file,"src/leo_large_object_get_handler.erl"},{line,289}]},{leo_large_object_get_handler,handle_call,3,[{file,"src/leo_large_object_get_handler.erl"},{line,120}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}

With memory cache but no disk cache:

[E] gateway_0@127.0.0.1 2015-12-17 09:31:48.48244 +0900 1450312308  null:null   0   gen_server <0.1653.0> terminated with reason: no match of right hand value {error,"Invalid operation"} in leo_large_object_get_handler:handle_call/3 line 120
[E] gateway_0@127.0.0.1 2015-12-17 09:31:48.48570 +0900 1450312308  null:null   0   ["CRASH REPORT ",[80,114,111,99,101,115,115,32,"<0.1653.0>",32,119,105,116,104,32,"1",32,110,101,105,103,104,98,111,117,114,115,32,"exited",32,119,105,116,104,32,114,101,97,115,111,110,58,32,[["no match of right hand value ",[123,["error",44,"\"Invalid operation\""],125]," in ",[["leo_large_object_get_handler",58,"handle_call",47,"3"],[32,108,105,110,101,32,"120"]]]," in ",[["gen_server",58,"terminate",47,"7"],[32,108,105,110,101,32,"804"]]]]]
[E] gateway_0@127.0.0.1 2015-12-17 09:31:48.48811 +0900 1450312308  null:null   0   Ranch listener leo_gateway_s3_api had connection process started with cowboy_protocol:start_link/4 at <0.1651.0> exit with reason: {{badmatch,{error,"Invalid operation"}},[{leo_large_object_get_handler,handle_call,3,[{file,"src/leo_large_object_get_handler.erl"},{line,120}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}
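
The crash reports in this thread differ mainly in reason, function, and line number (for example, line 123 versus line 120 for `handle_call/3`). A small sketch for pulling those fields out of the "terminated with reason" lines makes the comparison easier. The regular expression is an assumption based only on the log lines quoted here, and `parse_termination` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   gen_server <0.1653.0> terminated with reason: ... in mod:fun/3 line 120
TERMINATED = re.compile(
    r"gen_server (?P<pid><[\d.]+>) terminated with reason: "
    r"(?P<reason>.+) in (?P<mfa>\w+:\w+/\d+) line (?P<line>\d+)"
)


def parse_termination(log_line):
    """Return pid, reason, function, and line number, or None if no match."""
    match = TERMINATED.search(log_line)
    if match is None:
        return None
    return {
        "pid": match.group("pid"),
        "reason": match.group("reason"),
        "mfa": match.group("mfa"),
        "line": int(match.group("line")),
    }
```

Running it over each quoted log line gives one record per crash, which can then be compared across gateway builds.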

@windkit
Contributor Author

windkit commented Dec 17, 2015

@yosukehara Sorry, I had added a few lines to the source and forgot to remove them before the test.

yosukehara added a commit to leo-project/leo_gateway that referenced this issue Dec 17, 2015
@yosukehara
Member

I've fixed this issue. It would be great if you could check it again.

@mocchira
Member

LGTM.
