Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PoC] lib: os: fdtable: Add syscalls for POSIX read/write/close/ioctl operations #25443

Closed

Conversation

pfalcon
Copy link
Contributor

@pfalcon pfalcon commented May 19, 2020

Fdtable is a kernel-level structure, so functions accessing it, like
read(), write(), close(), should be syscalls. This conversion however
emphasizes a difference between "syscall" and "C standard library
function". C stdlib functions should be represented by just a
standard linkage symbol, and can be used even without prototype.
Syscall is however a special code sequence, which may require special
attributes and thus a prototype for usage. To deal with these
distinction, we define a convention that "sys_name" is a syscall,
while "name" is a normal C function (implemented as a wrapper around
syscall). (And we also need to define "_name" to interface with Newlib
(actually, just "_write".))

Fixes: #25407

Signed-off-by: Paul Sokolovsky paul.sokolovsky@linaro.org

@pfalcon pfalcon added area: POSIX POSIX API Library area: Sockets Networking sockets area: Userspace Userspace labels May 19, 2020
@zephyrbot zephyrbot added area: C Library C Standard Library area: API Changes to public APIs labels May 19, 2020
@pfalcon
Copy link
Contributor Author

pfalcon commented May 19, 2020

@andrewboie, long-standing technical debt.

Plan of work: conversion for read/write/close is pushed. Will see if CI passes. There're existing socket tests which could be easily (albeit in Do-Repeat-Yourself manner) converted to test read/write. But this test also calls more stuff, like fcntl. That in turn is based on ioctl. And ioctl is used for some kernel-level stuff. So, there's a bit of scope creep, and changes would be exactly "leaf in dependency graph" and "isolated", as there can be potential changes to socket family implementations (or maybe not, didn't look into that yet).

@cfriedt: FYI (and please review once finished, currently it's not, per above).

@pfalcon pfalcon force-pushed the read-write-syscalls branch 2 times, most recently from 5ad8073 to 3d50cab Compare May 19, 2020 16:14
lib/os/fdtable.c Outdated Show resolved Hide resolved
@pfalcon pfalcon marked this pull request as draft May 19, 2020 17:23
lib/os/fdtable.c Outdated
*/
ssize_t write(int fd, const void *buf, size_t sz)
{
return _write(fd, buf, sz);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting "undefined reference to _write" and "undefined reference to _read" in libc_nano.a . It doesn't jump out at me immediately why that might be the case.
https://app.shippable.com/github/zephyrproject-rtos/zephyr/runs/73259/5/console

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... But it should explain why this spent so much time in the backlog - dealing with all these naming (ab)uses across different targets is a bit boring (and I'm not particularly interested in kernel/userspace split ;-) ). Anyway, fixing it.

@pfalcon pfalcon force-pushed the read-write-syscalls branch 2 times, most recently from 6127e52 to 6c4206e Compare May 19, 2020 18:47
@pfalcon
Copy link
Contributor Author

pfalcon commented May 19, 2020

Ok, read/write/close syscall are now clean re: build. Will proceed with ioctl tomorrow, then test is the next step after that.

@pfalcon pfalcon force-pushed the read-write-syscalls branch from 6c4206e to c732f3d Compare May 20, 2020 08:59
@pfalcon
Copy link
Contributor Author

pfalcon commented May 20, 2020

Getting "undefined reference to _write" and "undefined reference to _read" in libc_nano.a . It doesn't jump out at me immediately why that might be the case.

For the record, to elaborate what happened there. Prebuilt Newlib, as shipped with Zephyr SDK is call into write to dump some error on stdout/stderr. The issue is that the actual function called is inconsistent among the builds for different architectures. For example, x86 calls write(), while ARM - _write(). My original idea was to make _write() a syscall, while write() a normal userspace func, that due to the above, I fell back to Linux-like convention of sys_write() being a syscall, and write()/_write() userspace funcs.

Thinking about it, ARM newlib build is right here, while x86 is wrong. The whole idea of underscore-prefixed functions is that they don't provide full POSIX semantics. So, if x86 called _write, we wouldn't need to define write() for !CONFIG_POSIX_API case, and @cfriedt would never get confused by calling write() on a socket in this case, and getting output dumped to console. Actually, that would be the right way to resolve the issue, so let me even try to submit zephyrproject-rtos/sdk-ng#221 (though I'm not sure who would be looking into that).

@pfalcon
Copy link
Contributor Author

pfalcon commented May 20, 2020

In the meantime, pushed a commit with ioctl() conversion to syscall.

@pfalcon pfalcon changed the title lib: os: fdtable: Add syscalls for POSIX read/write/close operations lib: os: fdtable: Add syscalls for POSIX read/write/close/ioctl operations May 20, 2020
@pfalcon pfalcon force-pushed the read-write-syscalls branch from c732f3d to 6dfe2eb Compare May 20, 2020 09:25
@pfalcon pfalcon force-pushed the read-write-syscalls branch from 6dfe2eb to ff160a3 Compare May 20, 2020 13:43
@zephyrbot zephyrbot added area: Networking area: Tests Issues related to a particular existing or missing test labels May 20, 2020
@pfalcon
Copy link
Contributor Author

pfalcon commented May 20, 2020

In the meantime, pushed a commit with ioctl() conversion to syscall.

And it's the usual suspect, @andrewboie, I'm sure there's nothing new in this #if 0, it was expected all along (and again, why it spent so much time in backlog).

@zephyrbot
Copy link
Collaborator

zephyrbot commented May 20, 2020

All checks passed.

checkpatch (informational only, not a failure)

-:11: WARNING:LONG_LINE: line over 80 characters
#11: FILE: include/posix/sys/ioctl.h:11:
+__syscall int sys_ioctl(int fd, unsigned long request, long n_args, uintptr_t *args);

-:211: WARNING:LONG_LINE: line over 80 characters
#211: FILE: lib/os/fdtable.c:250:
+	res = z_fdtable_call_ioctl(fdtable[fd].vtable, fdtable[fd].obj, ZFD_IOCTL_CLOSE, 0);

-:255: WARNING:LONG_LINE: line over 80 characters
#255: FILE: lib/os/fdtable.c:292:
+int z_impl_sys_ioctl(int fd, unsigned long request, long n_args, uintptr_t *args)

-:264: WARNING:LONG_LINE: line over 80 characters
#264: FILE: lib/os/fdtable.c:298:
+	return fdtable[fd].vtable->ioctl(fdtable[fd].obj, request, n_args, args);

-:268: WARNING:LONG_LINE: line over 80 characters
#268: FILE: lib/os/fdtable.c:302:
+ssize_t z_vrfy_sys_ioctl(int fd, unsigned long request, long n_args, uintptr_t *args)

-:539: WARNING:LONG_LINE: line over 80 characters
#539: FILE: subsys/net/lib/sockets/sockets.c:1679:
+static int sock_ioctl_vmeth(void *obj, unsigned long request, long n_args, uintptr_t *args)

-:652: WARNING:LONG_LINE: line over 80 characters
#652: FILE: subsys/net/lib/sockets/sockets_tls.c:1200:
+	ret = z_fdtable_call_ioctl(&sock_fd_op_vtable.fd_vtable, ctx, ZFD_IOCTL_CLOSE, 0);

-:661: WARNING:LONG_LINE: line over 80 characters
#661: FILE: subsys/net/lib/sockets/sockets_tls.c:1334:
+	err = z_fdtable_call_ioctl(&sock_fd_op_vtable.fd_vtable, child, ZFD_IOCTL_CLOSE, 0);

- total: 0 errors, 8 warnings, 764 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

Your patch has style problems, please review.

NOTE: Ignored message types: AVOID_EXTERNS BRACES CONFIG_EXPERIMENTAL CONST_STRUCT DATE_TIME FILE_PATH_CHANGES MINMAX NETWORKING_BLOCK_COMMENT_STYLE PREFER_KERNEL_TYPES PRINTK_WITHOUT_KERN_LEVEL SPDX_LICENSE_TAG SPLIT_STRING VOLATILE

NOTE: If any of the errors are false positives, please report
      them to the maintainers.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@pfalcon
Copy link
Contributor Author

pfalcon commented May 20, 2020

Otherwise, 3rd commit in series patches an existing socket test to add read()/write() testcases.

@andrewboie
Copy link
Contributor

andrewboie commented May 28, 2020

I'm not sure though what additional validation is needed on the individual arguments in the array though?

@dleach02 That's the problem: the individual arguments needs to be validated. They could contain pointer values, and if so every single one of them needs to be bounded and checked for appropriate access. We can't just unmarshal them on the kernel side and call it a day handing them off to the implementation; if the implementation treats any of the args as a pointer and not an integral value we have a potential exploit because none of these pointers have been validated.

What we have here is a variadic interface to send up to N arbitrary uintptr_t arguments to the kernel, with no knowledge on the kernel side what these arguments actually represent, since the verification is being done in a generic verifier for any ioctl.

This generic interface will not work, the particular semantics of the arguments entirely depend on what the particular function is. Which is why I asked to just demux ioctl calls in userspace and send them each as different syscalls.

Now that I think of it, there's also the problem that there's no accounting for what threads can use what file descriptors, unless I am missing something a malicious thread could manipulate any file descriptor active in the system in the current design. That's something that should probably be looked at separately.

@pfalcon pfalcon force-pushed the read-write-syscalls branch from 1f96bf4 to ebad591 Compare May 28, 2020 06:28
@cfriedt
Copy link
Member

cfriedt commented May 28, 2020

Now that I think of it, there's also the problem that there's no accounting for what threads can use what file descriptors, unless I am missing something a malicious thread could manipulate any file descriptor active in the system in the current design. That's something that should probably be looked at separately.

Yep. That's a biggie.

@pfalcon
Copy link
Contributor Author

pfalcon commented May 28, 2020

Yep. That's a biggie.

Indeed, that's a "biggie":

File descriptors are per process, not per thread. So whoever will implement support for multiple processes in Zephyr, will also make sure that fdtable is per-process.

@pfalcon pfalcon force-pushed the read-write-syscalls branch from ebad591 to 60a7af8 Compare May 28, 2020 12:37
@dleach02
Copy link
Member

dleach02 commented May 28, 2020

@andrewboie, I understand what you are saying. We should clarify (if it isn't already done), what the rules are for validating the parameters in a system call. It seems like the rule you are articulating is that all parameters must be properly typed and known at the time of the call and they must be able to be properly verified. Hiding the type of the parameter is not allowed.

I'm assuming the baseline problem is that the IOCTL is directed at different underlying drivers/interfaces and as such, the interpretation of the data is different depending on the driver needs. The generic interface of the IOCTL will not be able to be supported via a syscall. If the "syscall" happens lower where the user space code takes in the IOCTL and then breaks out the parameters of the veriadic that could potentially be a syscall point of entry that could validate the parameters?

@cfriedt
Copy link
Member

cfriedt commented May 28, 2020

Yep. That's a biggie.

File descriptors are per process, not per thread. So whoever will implement support for multiple processes in Zephyr, will also make sure that fdtable is per-process.

Of course. I didn't want to go on a diatribe, but obviously processes, permissions, pre-opened file descriptors (stdin, stdout, stderr), virtual devfs, console on serial port, etc. However, currently Zephyr does attempt to separate thread permissions based on (physical / virtual) address space and metadata. So even though the entire "process" has a flat set of file descriptors, certain threads really shouldn't be able to write to some file descriptors and until processes are implemented, fd's may need to have permissions associated with a thread or thread group.

@pfalcon
Copy link
Contributor Author

pfalcon commented May 28, 2020

@cfriedt: Show me a reference to the general purpose POSIX standard which says that, and we'll discuss it (later).

@pfalcon pfalcon force-pushed the read-write-syscalls branch from 60a7af8 to af60bba Compare May 28, 2020 16:32
@andrewboie
Copy link
Contributor

Yep. That's a biggie.

Permission accounting is a must-have in order to introduce these syscalls, you wouldn't want Minecraft to be able read Firefox's file descriptors at-will. I have a solution I'm testing to fix this for sockets (we used to have it but it broke, it uses kernel object permissions) that should be generalize-able

About ioctl(): consider for a moment that although having syscalls with arbitrary variadic argument types is a nightmare, nobody I know of against having an extensible vectored driver interface mechanism...there's no requirement to use ioctl()'s interface specifically, ioctls aren't portable anyway and you can always write wrappers

have a look at how Windows does it: https://docs.microsoft.com/en-us/windows/win32/api/ioapiset/nf-ioapiset-deviceiocontrol?redirectedfrom=MSDN their definition with a device, handle, and in/out buffers would be trivial to validate in a syscall

@pfalcon pfalcon force-pushed the read-write-syscalls branch 3 times, most recently from 224e5b1 to 648c847 Compare June 1, 2020 15:41
@d3zd3z
Copy link
Collaborator

d3zd3z commented Jun 1, 2020

As far as ioctl calls go, maybe we can glean something from how this is implemented in various Unixy platforms. The 'request' value is not a general number, but various bits within the number have meaning. Two of the bits indicate the direction that the data parameter moves, and then a range of the value indicates the size of the data pointer. It effectively maps the command number directly to the size and direction of the payload. The only calls that have to do additional validation are those that have additional pointers in the block being checked.

pfalcon added 5 commits June 11, 2020 11:39
Fdtable is a kernel-level structure, so functions accessing it, like
read(), write(), close(), should be syscalls. This conversion however
emphasizes a difference between "syscall" and "C standard library
function". C stdlib functions should be represented by just a
standard linkage symbol, and can be used even without prototype.
Syscall is however a special code sequence, which may require special
attributes and thus a prototype for usage. To deal with these
distinction, we define a convention that "sys_name" is a syscall,
while "name" is a normal C function (implemented as a wrapper around
syscall). (And we also need to define "_name" to interface with Newlib
(actually, just "_write".))

Fixes: zephyrproject-rtos#25407

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
This is similar to previously added syscalls for read/write/close.
sys_ioctl accepts va_list. fcntl() is clearly rewritten as a userspace
operation, delegating to sys_ioctl().

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
Just a test for send()/recv() duplicated to call write()/read()
instead. Duplication is not ideal, but to deal with it, we would
need some no-nonsense metagramming capabilities. Sadly, C classically
have problems with that. Standard way to instantiate different
functions based on the same template is to use preprecessor, but that
requires suffixing each line with a backslash, and thus is pretty
unwieldy. A more modern approach would be to use always-inline
function as a template, but always-inline functions aren't part of
the C standard, and just inline functions don't guarantee inlinability.

(Besides metaprogramming, a mundane-programming solution is possible,
but somehow that was considered cumersome too.)

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
…a_list

As we cannot pass va_list across userspace/kernel boundary (because
we don't control its internel representation, and it may be quite,
complex), just get rid of it on the lowest level of ioctl machinery.
Pass just explicit array of machine words (uintptr_t) to the ioctl
vmethod implementation. On the API convenience layer, there's still
support for variadic arguments calls (both on the level of userspace
ioctl() function, and for internal in-kernel usage), with explicit
marshalling from va_list to array of uintptr_t's.

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
Following a refactor to allow pass ioctl arguments across
userspace/kernel boundary.

Signed-off-by: Paul Sokolovsky <paul.sokolovsky@linaro.org>
@pfalcon pfalcon force-pushed the read-write-syscalls branch from 648c847 to 58a19b8 Compare June 11, 2020 08:56
@pfalcon
Copy link
Contributor Author

pfalcon commented Jun 11, 2020

Ok, so I split off changes for just read()/write() and added a dedicated test for them: #26117

@pfalcon pfalcon changed the title lib: os: fdtable: Add syscalls for POSIX read/write/close/ioctl operations [PoC] lib: os: fdtable: Add syscalls for POSIX read/write/close/ioctl operations Jun 12, 2020
@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Aug 12, 2020
@pfalcon pfalcon removed the Stale label Aug 12, 2020
@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Oct 12, 2020
@github-actions github-actions bot closed this Oct 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: C Library C Standard Library area: Networking area: POSIX POSIX API Library area: Sockets Networking sockets area: Tests Issues related to a particular existing or missing test area: Userspace Userspace Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No tests/samples covering socket read()/write() calls
6 participants