FoundationDB fails to run with a ZFS data dir #274

Closed
bobobo1618 opened this issue Apr 28, 2018 · 3 comments

@bobobo1618

When the data dir is on ZFS, my logs look like the following:

Time="1524944712.663276" Severity="10" LogGroup="default" Process="fdbmonitor": Started FoundationDB Process Monitor 5.1 (v5.1.5)
Time="1524944712.714362" Severity="10" LogGroup="default" Process="fdbmonitor": Watching conf file /etc/foundationdb/foundationdb.conf
Time="1524944712.714399" Severity="10" LogGroup="default" Process="fdbmonitor": Watching conf dir /etc/foundationdb (2)
Time="1524944712.714477" Severity="10" LogGroup="default" Process="fdbmonitor": Loading configuration /etc/foundationdb/foundationdb.conf
Time="1524944712.715834" Severity="10" LogGroup="default" Process="fdbmonitor": Starting backup_agent.1
Time="1524944712.716218" Severity="10" LogGroup="default" Process="fdbmonitor": Starting fdbserver.4500
Time="1524944712.716636" Severity="10" LogGroup="default" Process="backup_agent.1": Launching /usr/lib/foundationdb/backup_agent/backup_agent (18) for backup_agent.1
Time="1524944712.716769" Severity="10" LogGroup="default" Process="fdbserver.4500": Launching /usr/sbin/fdbserver (19) for fdbserver.4500
Time="1524944713.134326" Severity="40" LogGroup="default" Process="fdbserver.4500": ERROR: error creating or opening process id file `/var/lib/foundationdb/data/4500/processId'.
Time="1524944713.135767" Severity="40" LogGroup="default" Process="fdbserver.4500": Fatal Error: Disk i/o operation failed
Time="1524944713.187140" Severity="40" LogGroup="default" Process="fdbserver.4500": Process 19 exited 1, restarting in 0 seconds
Time="1524944713.187682" Severity="10" LogGroup="default" Process="fdbserver.4500": Launching /usr/sbin/fdbserver (23) for fdbserver.4500
Time="1524944713.250668" Severity="40" LogGroup="default" Process="fdbserver.4500": ERROR: error creating or opening process id file `/var/lib/foundationdb/data/4500/processId'.
Time="1524944713.251377" Severity="40" LogGroup="default" Process="fdbserver.4500": Fatal Error: Disk i/o operation failed
Time="1524944713.301838" Severity="40" LogGroup="default" Process="fdbserver.4500": Process 23 exited 1, restarting in 59 second

It's unclear what fdbserver is attempting to do or why it's failing, but it clearly isn't working.

The path it's attempting to access does not exist, but there is a file called /var/lib/foundationdb/data/4500/processId.part in the same directory.
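
For anyone triaging a log like the one above, a small filter can pull out just the failing lines. This is a minimal sketch, not part of FoundationDB: the function names are made up for illustration, and it assumes every line follows the `Key="value" ... Process="name": message` layout seen in the excerpt, with Severity 40 marking errors as it appears to do there.

```python
import re

# Matches the Key="value" header fields that each line in the excerpt starts with.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_line(line):
    """Split one log line into its header fields plus the trailing message.

    Returns None for lines that do not match the assumed layout.
    """
    # The header ends at the first '": ' (closing quote of Process, colon, space).
    header, sep, message = line.partition('": ')
    if not sep:
        return None
    # partition() consumed the closing quote, so put it back before matching.
    fields = dict(FIELD_RE.findall(header + '"'))
    fields["Message"] = message.strip()
    return fields


def errors(lines, min_severity=40):
    """Return parsed records whose Severity is at least min_severity."""
    found = []
    for line in lines:
        record = parse_line(line)
        if record and int(record.get("Severity", 0)) >= min_severity:
            found.append(record)
    return found
```

Feeding it the excerpt should leave only the `processId` open failures, the "Disk i/o operation failed" lines, and the restart notices.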

@MikeMcMahon

MikeMcMahon commented Apr 28, 2018

This happens because FoundationDB opens its files with the O_DIRECT flag, and ZFS does not support O_DIRECT.

openzfs/zfs#224

We at @wavefrontHQ looked at changing the flag to O_[A]SYNC to work around this, and FoundationDB does indeed run on ZFS after that change. However, we were not confident enough to pull the trigger, because the change has implications for data integrity.
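
A quick way to check whether a given data directory accepts O_DIRECT is to try the open yourself. The following is a rough sketch rather than anything FoundationDB ships: `supports_o_direct` is a hypothetical helper, and it assumes the filesystem reports the missing support as `EINVAL`, which is what open(2) documents. Results may differ between ZFS versions.

```python
import errno
import os
import tempfile


def supports_o_direct(directory):
    """Return True if a scratch file in `directory` can be opened with O_DIRECT."""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # The platform does not expose the flag at all (e.g. macOS).
        return False

    # Create the scratch file with a normal open, then reopen it with O_DIRECT.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        try:
            fd = os.open(path, os.O_RDWR | o_direct)
        except OSError as exc:
            if exc.errno == errno.EINVAL:
                return False  # filesystem rejected the flag
            raise  # anything else is a different problem
        os.close(fd)
        return True
    finally:
        os.unlink(path)
```

Calling `supports_o_direct("/var/lib/foundationdb/data")` as the user that runs fdbserver would show whether the open in the log above can be expected to fail for this reason.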

@davidscherer

I think there are two reasons for O_DIRECT: we don't want kernel page caching (since we are doing our own), and we need kernel AIO to work (and on supported filesystems, only direct I/O is actually asynchronous). O_SYNC doesn't necessarily do either of these things and unnecessarily requires durability for every write (which is potentially a huge performance hit; we call fsync() where we need to already), so it doesn't really seem like the right thing, although of course ZFS may do something very unexpected with these flags.
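
The difference between O_SYNC and calling fsync() only where it is needed can be sketched roughly like this. Both functions are illustrative and are not taken from the FoundationDB source; they use plain blocking writes, so they say nothing about kernel AIO or page caching.

```python
import os


def write_with_o_sync(path, chunks):
    """O_SYNC: every write() returns only once that chunk is durable."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        for chunk in chunks:
            os.write(fd, chunk)  # pays the durability cost per call
    finally:
        os.close(fd)


def write_then_fsync(path, chunks):
    """No O_SYNC: writes are buffered and durability is requested once."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        for chunk in chunks:
            os.write(fd, chunk)  # may return before data reaches the disk
        os.fsync(fd)  # single durability point for the whole batch
    finally:
        os.close(fd)
```

Both leave the same bytes on disk; the second pays for durability once per batch instead of once per write, which is the performance concern raised above.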

If ZFS does kernel AIO reliably without O_DIRECT, and if you can live with horrible cache pollution, you might be able to just take out O_DIRECT.
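
As an illustration of what "taking out O_DIRECT" could look like when done conditionally, here is a hedged sketch that prefers O_DIRECT and retries without it only when the filesystem rejects the flag. `open_prefer_direct` is a hypothetical name, not a FoundationDB function, and the sketch does nothing to verify that AIO stays asynchronous on the fallback path.

```python
import errno
import os


def open_prefer_direct(path, flags=os.O_RDWR | os.O_CREAT, mode=0o600):
    """Open `path`, using O_DIRECT when the filesystem allows it.

    Returns (fd, used_direct) so the caller can log or adapt to the
    buffered fallback and its double caching.
    """
    o_direct = getattr(os, "O_DIRECT", 0)  # 0 where the flag is unavailable
    if o_direct:
        try:
            return os.open(path, flags | o_direct, mode), True
        except OSError as exc:
            # EINVAL is the documented "O_DIRECT not supported" error;
            # anything else is re-raised unchanged.
            if exc.errno != errno.EINVAL:
                raise
    return os.open(path, flags, mode), False
```

Returning the flag alongside the descriptor keeps the fallback visible instead of silent, which matters given the caching and latency caveats in this thread.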

But the testing burden of making FDB production ready on a new filesystem is pretty large. You want to check:

(a) durability, using real-world power failure tests or a low level simulator;
(b) performance, especially latency curves, which might get very very bad if AIO is silently falling back to synchronous I/O that blocks the main fdbserver thread. The double page caching thing may be painful. Also ZFS's copy-on-write approach may not be a perfect fit for the ssd engine.
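
For the latency part of (b), a first rough look can come from timing individual write-plus-fsync calls and reading off the tail percentiles. This is only a sketch under stated assumptions: the function names, the 4096-byte write size, and the sample count are arbitrary choices, and it measures blocking I/O from one thread rather than FoundationDB's kernel AIO path, so it is a proxy at best.

```python
import math
import os
import time


def write_latencies(path, count=200, size=4096):
    """Time `count` write+fsync pairs and return the durations in seconds."""
    buf = b"\0" * size
    samples = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, buf)
            os.fsync(fd)
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return samples


def percentile(samples, p):
    """Nearest-rank percentile of `samples`, for p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]
```

Comparing `percentile(samples, 50)` with `percentile(samples, 99)` on ZFS and on a supported filesystem gives a crude view of how much worse the tail gets; it is no substitute for the power-failure and simulation testing described in (a).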

@hgray1
Contributor

hgray1 commented Apr 30, 2018

This sounds like a feature request, so could you please move this to the forums if you'd like to continue the discussion? We like to use GitHub Issues for code-specific, targeted tasks.

@hgray1 hgray1 closed this as completed Apr 30, 2018
vishesh added a commit to vishesh/foundationdb that referenced this issue Mar 12, 2019
- Some Linux filesystems don't support O_DIRECT, which is required by
Kernel AIO to function properly. Instead of using O_SYNC, EIO is
a much better option in terms of performance penalty.
- Some systems may not support AIO at all, e.g. Windows Subsystem for
Linux.

FIXES apple#842
RELATED apple#274
alexmiller-apple pushed a commit that referenced this issue Mar 13, 2019
- Some Linux filesystems don't support O_DIRECT, which is required by
Kernel AIO to function properly. Instead of using O_SYNC, EIO is
a much better option in terms of performance penalty.
- Some systems may not support AIO at all, e.g. Windows Subsystem for
Linux.

FIXES #842
RELATED #274
sfc-gh-jfu pushed a commit to sfc-gh-jfu/foundationdb that referenced this issue Jul 20, 2023