Weird problem when getting files ~25MB #469
Thank you for your report. I quickly tried to confirm this situation on my laptop, but I could not reproduce it, as shown below. To investigate the problem, it would be helpful if you could share the error logs of the storage node(s) and the gateway node.

Case-1: Put a 25MB object via s3cmd

$ ./leofs-adm status
[System Configuration]
-----------------------------------+----------
Item | Value
-----------------------------------+----------
Basic/Consistency level
-----------------------------------+----------
system version | 1.2.20
cluster Id | leofs_1
DC Id | dc_1
Total replicas | 2
number of successes of R | 1
number of successes of W | 1
number of successes of D | 1
number of rack-awareness replicas | 0
ring size | 2^128
-----------------------------------+----------
Multi DC replication settings
-----------------------------------+----------
max number of joinable DCs | 2
number of replicas a DC | 1
-----------------------------------+----------
Manager RING hash
-----------------------------------+----------
current ring-hash | 3923d007
previous ring-hash | 3923d007
-----------------------------------+----------
[State of Node(s)]
-------+--------------------------+--------------+----------------+----------------+----------------------------
type | node | state | current ring | prev ring | updated at
-------+--------------------------+--------------+----------------+----------------+----------------------------
S | storage_0@127.0.0.1 | running | 3923d007 | 3923d007 | 2016-03-27 22:07:10 +0900
S | storage_1@127.0.0.1 | running | 3923d007 | 3923d007 | 2016-03-27 22:07:10 +0900
S | storage_2@127.0.0.1 | running | 3923d007 | 3923d007 | 2016-03-27 22:07:09 +0900
S | storage_3@127.0.0.1 | running | 3923d007 | 3923d007 | 2016-03-27 22:07:10 +0900
G | gateway_0@127.0.0.1 | running | 3923d007 | 3923d007 | 2016-03-27 22:07:19 +0900
-------+--------------------------+--------------+----------------+----------------+----------------------------
$ dd if=/dev/zero of=25M.file bs=25600 count=1024
$ s3cmd mb s3://test/
$ s3cmd put ./25M.file s3://test/
$ leofs-adm whereis test/25M.file
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | # of chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
| storage_0@127.0.0.1 | 784626dc30fc8147c4fa7edad8d8109 | 25600K | b7bbfe5965 | 2 | 52f0782b26577 | 2016-03-27 22:09:47 +0900
| storage_1@127.0.0.1 | 784626dc30fc8147c4fa7edad8d8109 | 25600K | b7bbfe5965 | 2 | 52f0782b26577 | 2016-03-27 22:09:47 +0900
$ leofs-adm update-acl test 05236 public-read-write
$ curl -v -X GET http://test.localhost:8080/25M.file > 25M.file.1
...
$ curl -v -X GET http://test.localhost:8080/25M.file > 25M.file.7
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 127.0.0.1...
* Connected to test.localhost (127.0.0.1) port 8080 (#0)
> GET /25M.file HTTP/1.1
> Host: test.localhost:8080
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< connection: keep-alive
< date: Sun, 27 Mar 2016 13:17:35 GMT
< Content-Length: 26214400
< server: LeoFS
< Content-Type: application/octet-stream
< ETag: "b7bbfe5965698d52826c529d34425a1d"
< Last-Modified: Sun, 27 Mar 2016 13:09:47 GMT
<
$ ls -la | grep 25M.file
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:09 25M.file
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.1
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.2
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.3
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.4
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.5
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.6
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.7
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:17 25M.file.8
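All eight downloads completed with the full 26,214,400 bytes. As a hypothetical extra step (not part of the transcript above), comparing checksums would confirm the copies are byte-for-byte identical:

$ md5 25M.file 25M.file.1 25M.file.2   # BSD md5 on macOS; use md5sum on Linux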
Case-2: Put a 25MB object via DragonDisk

$ leofs-adm whereis test/25M-2.file
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
del? | node | ring address | size | checksum | # of chunks | clock | when
-------+--------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------------------
| storage_1@127.0.0.1 | 9fd3c2a0917d698ed50d2edba32ab2e | 25600K | bed3c0a4a1 | 5 | 52f07b2451c85 | 2016-03-27 22:23:05 +0900
| storage_2@127.0.0.1 | 9fd3c2a0917d698ed50d2edba32ab2e | 25600K | bed3c0a4a1 | 5 | 52f07b2451c85 | 2016-03-27 22:23:05 +0900
$ curl -v -X GET http://test.localhost:8080/25M-2.file > 25M-2.file.5
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 127.0.0.1...
* Connected to test.localhost (127.0.0.1) port 8080 (#0)
> GET /25M-2.file HTTP/1.1
> Host: test.localhost:8080
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 200 OK
< connection: keep-alive
< date: Sun, 27 Mar 2016 13:24:41 GMT
< Content-Length: 26214400
< server: LeoFS
< Content-Type: application/octet-stream
< ETag: "bed3c0a4a1407f584989b4009e9ce33f"
< Last-Modified: Sun, 27 Mar 2016 13:23:05 GMT
<
{ [16384 bytes data]
100 25.0M 100 25.0M 0 0 89.6M 0 --:--:-- --:--:-- --:--:-- 89.9M
$ ls -l | grep 25M-2.file
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:22 25M-2.file
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:24 25M-2.file.1
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:24 25M-2.file.2
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:24 25M-2.file.3
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:24 25M-2.file.4
-rw-r--r-- 1 yosukehara staff 26214400 3 27 22:24 25M-2.file.5
Thanks for your prompt response! I can easily reproduce my case. Please take a look at the attached log files as well as the case below.
I am also happy to provide SSH access if needed.
Here are the log files.
@gkyildirim Thank you for reporting the issue. I can now reproduce it on my Ubuntu 14.04 machine; I will track it down now.
This is because the function that reads the response body from the disk cache is incorrect:

...
file:sendfile(CacheObj#cache.file_path, Socket, 0, 0, [{chunk_size, SendChunkLen}]),
...
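Per the OTP documentation, file:sendfile/5 takes a raw file handle (from file:open/2 with the raw option) rather than a path, and Bytes = 0 means "send everything after Offset", which makes this call easy to get wrong. For illustration only, here is a minimal sketch of the alternative approach: read the cached file from disk and stream it to the socket in fixed-size chunks. This is not the actual leo_gateway patch; it uses assumed names and supposes Socket is a gen_tcp socket and SendChunkLen is the configured chunk length.

%% Sketch only (assumed names, not the actual patch): stream a
%% disk-cached file to the client in SendChunkLen-sized pieces.
send_cached_file(FilePath, Socket, SendChunkLen) ->
    case file:open(FilePath, [read, raw, binary]) of
        {ok, Fd} ->
            Ret = send_chunks(Fd, Socket, 0, SendChunkLen),
            _ = file:close(Fd),
            Ret;
        {error, Reason} ->
            {error, Reason}
    end.

%% Read ChunkLen bytes at Offset and push them to the socket,
%% advancing until end-of-file.
send_chunks(Fd, Socket, Offset, ChunkLen) ->
    case file:pread(Fd, Offset, ChunkLen) of
        {ok, Bin} ->
            case gen_tcp:send(Socket, Bin) of
                ok    -> send_chunks(Fd, Socket, Offset + byte_size(Bin), ChunkLen);
                Error -> Error
            end;
        eof ->
            ok;
        {error, Reason} ->
            {error, Reason}
    end.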
@gkyildirim I've also reproduced this on my environment. If you want to avoid this situation for now, you can modify the disk cache configuration of leo_gateway as below; setting the disk cache capacity to 0 disables the disk cache, so responses never go through the broken read path:

# https://github.com/leo-project/leo_gateway/blob/develop/priv/leo_gateway.conf#L138
cache.cache_disc_capacity = 0

We're going to fix this issue soon.
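For context, a hypothetical excerpt of that section of leo_gateway.conf (the RAM-cache key exists in the default config, but the value shown here is illustrative):

# RAM cache stays enabled; only the disk cache is turned off
cache.cache_ram_capacity  = 268435456
# A capacity of 0 disables the disk cache entirely
cache.cache_disc_capacity = 0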
I have created a PR for this, and with the fix applied I no longer see the issue:

$ curl -v -X GET http://localhost:8080/test/25M.file > 25.file.1
* Hostname was NOT found in DNS cache
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying ::1...
* connect to ::1 port 8080 failed: Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /test/25M.file HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< connection: keep-alive
< date: Mon, 28 Mar 2016 06:35:32 GMT
< Content-Length: 26214400
* Server LeoFS is not blacklisted
< server: LeoFS
< Content-Type: application/octet-stream
< ETag: "f4e4750fbe4ad8dbfb0e7f4d6a83ef56"
< Last-Modified: Mon, 28 Mar 2016 06:35:29 GMT
<
{ [data not shown]
100 25.0M 100 25.0M 0 0 353M 0 --:--:-- --:--:-- --:--:-- 357M
* Connection #0 to host localhost left intact
$ curl -v -X GET http://localhost:8080/test/25M.file > 25.file.2
* Hostname was NOT found in DNS cache
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying ::1...
* connect to ::1 port 8080 failed: Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /test/25M.file HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< connection: keep-alive
< date: Mon, 28 Mar 2016 06:35:32 GMT
< Content-Length: 26214400
* Server LeoFS is not blacklisted
< server: LeoFS
< Content-Type: application/octet-stream
< ETag: "f4e4750fbe4ad8dbfb0e7f4d6a83ef56"
< Last-Modified: Mon, 28 Mar 2016 06:35:29 GMT
< x-from-cache: True/via disk
<
{ [data not shown]
100 25.0M 100 25.0M 0 0 626M 0 --:--:-- --:--:-- --:--:-- 641M
* Connection #0 to host localhost left intact
Note: LeoFS v1.2.18 and v1.2.20 are adversely affected by this bug.
@windkit Thanks, I'll check out your pull request now.
To cover this situation, we're going to add test cases to leofs_client_test and leofs_test.
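As a sketch of what such a regression check might look like (hypothetical commands assembled from the transcripts above, not the actual leofs_test case): upload once, GET twice so that the second response is served from the disk cache, and verify that all checksums match.

$ dd if=/dev/zero of=25M.file bs=25600 count=1024
$ s3cmd put ./25M.file s3://test/
$ curl -s http://test.localhost:8080/25M.file -o 25M.get1    # first GET populates the disk cache
$ curl -s http://test.localhost:8080/25M.file -o 25M.get2    # second GET is served from the disk cache
$ md5sum 25M.file 25M.get1 25M.get2                          # all three digests must match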
We've checked this issue with integration tests and stress tests, and confirmed the bug is fixed.
@gkyildirim We fixed this issue, and LeoFS v1.2.21, which includes the fix, was released yesterday.
This is my first attempt to work with LeoFS. Please let me know if this is a well-known issue.
I am running leofs-1.2.20 on a single Ubuntu 14.04 machine. All configurations are default. I've created a new bucket and made it public-read-write.
I observe a weird problem. I put a 25MB file (with DragonDisk). I can get it successfully on my first attempt, but after that I can no longer get the same file. For example, curl exits with "curl: (18) transfer closed with 26246026 bytes remaining to read". The same happens in DragonDisk.
I ran a short test and observed the same issue with files between 3MB and 40MB. Files below 3MB and above 40MB have no problem.
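A quick way to see the failure (hypothetical commands, not from the original report) is to request the same object repeatedly and watch every attempt after the first come up short:

$ for i in 1 2 3; do curl -s -o /dev/null -w "attempt $i: http %{http_code}, %{size_download} bytes\n" http://test.localhost:8080/25M.file; done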