Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osx support (fixes #128) #1348

Closed
wants to merge 26 commits into from
Closed

Conversation

zuercher
Copy link
Member

Some highlights:

source/common/filesystem: adds a kqueue-based implementation of WatcherImpl. The conditional compiliation is hacky.

source/common/network: handle odd-ball FreeBSD/osx errors on get/setsockopt. Also, SOCK_NONBLOCK doesn't exist on FreeBSD or OSX. OSX has an extra byte field in struct sockaddr and derivatives that needs to be accounted for in some places.

source/common/runtime: use libuuid to generate UUIDs on Linux and OSX. Freebsd gets a hand-rolled implementation.

source/exe: hot restart is completely disabled for osx and freebsd

test/common/http: on OSX, struct tm configured for a date before 1901-12-13 20:45:52 becomes a time_t that represents 1969-12-31 23:59:59.

test/common/network: OSX doesn't like some of IPv6 addresses used in testing. By default only a single IPv4 loopback address exists in OSX (127.0.0.1).

Various and sundry explicit #include statements to pick up definitions that are obtained transitively on Linux.

Martin Conte Mac Donell and others added 17 commits July 27, 2017 16:29
When a connect() is performed on a non-blocking socket, it'll return
-1/EINPROGRESS immediatly and when select()'ed for writability it'll
succeed or fail but when it fails the reason is hidden.
`getsockopt(,,SO_ERROR,,)` solves this issue on some sytems but on some
cases such as FreeBSD getsockopt returns 0 even on `EAGAIN`. The
following trick issues a `getpeername()` which will return -1/ENOTCONN
if the socket is not connected and read(,,1) will produce the right errno.
Note that this implementation doesn't follow RFC4122. It's a
fully random generated UUID instead of time-based.
osx: invoke fcntl to set O_NONBLOCK (cannot be set via socket(2))
@zuercher zuercher mentioned this pull request Jul 28, 2017
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick skim so far. This is great, thanks for the contribution.

@@ -0,0 +1,95 @@
--- ./lib/libc/sys/open.2.orig 2016-09-28 16:25:57.000000000 -0700
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this file intended for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was in @Reflejo's original branch. I gather it's a FreeBSD path required for the kqueue WatcherImpl to work. I'm fine to remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a patch for freebsd11's kernel to enable O_SYMLINK

Copy link
Contributor

@PiotrSikora PiotrSikora Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we watch the directory (similarly to the inotify() implementation on Linux)? I don't think patching FreeBSD kernel and adding custom features to it is a viable prerequisite for Envoy on FreeBSD.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I didn't intended this to go to upstream. Honestly I don't think there is a way to do what envoy is doing for the symlink observation without this. But I didn't investigate this longer enough for a conclusive answer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To spin this differently... @mattklein123: now, that we have LDS and overall better xDS support, do you think that we still need the runtime configuration (and therfore Watcher)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need it. We will also support loading xDS APIs off of filesystem. Either way, I think we can ignore BSD for the purpose of this PR. Let's just get OSX working and move in. If someone like @Reflejo wants to add in BSD support after and can test on current master, that's fine.

#include "common/filesystem/watcher_impl.h"
// NOLINT(namespace-envoy)
#if defined(__linux__)
#include "common/filesystem/watcher_impl_linux.cc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use select based Bazel target srcs/hdrs to avoid this kind of compile time conditioning?

@@ -78,7 +82,14 @@ InstanceConstSharedPtr peerAddressFromFd(int fd) {
}

int InstanceBase::flagsFromSocketType(SocketType type) const {
#if defined(__FreeBSD__)
int flags = O_NONBLOCK;
#elif defined(__APPLE__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In various places you condition on both FreeBSD and APPLE. Why is this? Are you adding FreeBSD and OS X support? Is the FreeBSD part of the OS X support (but then why special APPLE conditions?)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep it since I got freeBSD to pass all tests as well. I'd double check that it's still the case. The delta between FreeBSD support and OSX is actually pretty small

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to remove the partial FreeBSD support per gitter conversation

int fd = ::socket(AF_INET, flagsFromSocketType(type), 0);
#ifdef __APPLE__
if (fd != -1) {
RELEASE_ASSERT(fcntl(fd, F_SETFL, O_NONBLOCK) != -1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining why this is a special case for APPLE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

// reason is hidden. `getsockopt(,,SO_ERROR,,)` solves this issue on some systems but on some
// cases such as FreeBSD getsockopt returns 0 even on `EAGAIN`. The following trick issues a
// `getpeername()` which will return -1/ENOTCONN if the socket is not connected and read(,,1)
// will produce the right errno.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems sketchy. Is there some reference to where this is shown to be accurate for all errnos?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a good summary here: https://cr.yp.to/docs/connect.html

Copy link
Contributor

@PiotrSikora PiotrSikora Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that connection error will be available in kq_errno on OSX & FreeBSD and it should be used instead of getsockopt(SO_ERROR) on those platforms, so this might not be necessary at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only offer that it works, empirically. This case is triggered regularly in testing on OS X. I think it makes sense to keep it disabled under Linux since it'll only add overhead and doesn't seem to be a problem there.

@@ -438,6 +450,27 @@ void ConnectionImpl::onWriteReady() {
ASSERT(0 == rc);
UNREFERENCED_PARAMETER(rc);

#if !defined(__linux__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true for all non-Linux or only for OS X?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's true for FreeBSD and OS X. I'll just change it to #ifdef __APPLE__.

uuid_unparse(uuid, generated_uuid);
#else
sprintf(generated_uuid, "%08lx-%04x-%04x-%04x-%04x%08lx",
static_cast<unsigned long>(arc4random()), static_cast<unsigned>(arc4random() & 0xffff),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to generate the minimum number of random numbers required here before the sprintf. Also, would prefer snprintf if using C formatters for safety.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just remove this since it's not used for Linux or OS X.

#else // __linux__

/**
* No-op implementation of HotRestart for everybody else.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We (@ Google) would find this useful even on Linux. @mrice32. Maybe there is some way to condition this at build time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this point, can we potentially just revert all the changes to hot_restart_impl.h/cc, and just add a new hot_restart_nop_impl.h/cc, and do select() compile? This will keep existing code cleaner and make it possible to compile out hot restart support. (I think MSFT people will want this too initially).

Copy link
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see build errors like this:
(on my merged HEAD from 1346 and 1348).

source/common/http/async_client_impl.cc:17:73: error: default initialization of an object of const type 'const AsyncStreamImpl::NullRetryPolicy' without a user-provided default constructor
const AsyncStreamImpl::NullRetryPolicy AsyncStreamImpl::RouteEntryImpl::retry_policy_;
                                                                        ^
                                                                                     {}

@@ -179,7 +180,7 @@ def envoy_cc_test(name,
copts = envoy_copts(repository, test = True),
# TODO(mattklein123): It's not great that we universally link against the following libs.
# In particular, -latomic is not needed on all platforms. Make this more granular.
linkopts = ["-pthread", "-latomic"],
linkopts = ["-pthread", "-latomic", "-luuid"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to install libuuid-dev for this in build container, and also add this to doc.

@@ -11,6 +11,10 @@
#include "spdlog/spdlog.h"
#include "tclap/CmdLine.h"

#if defined(__APPLE__)
#define uint64_t unsigned long
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this? uint64_t should just exists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in header <cstdint> included above. You might need to do using std::uint64_t; however.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint64_t exists but is typedef unsigned long long. Without this you get this error:

In file included from source/server/options_impl.cc:12:
In file included from external/envoy_deps/thirdparty/tclap/include/tclap/CmdLine.h:27:
In file included from external/envoy_deps/thirdparty/tclap/include/tclap/SwitchArg.h:30:
In file included from external/envoy_deps/thirdparty/tclap/include/tclap/Arg.h:54:
external/envoy_deps/thirdparty/tclap/include/tclap/ArgTraits.h:80:22: error: type 'unsigned long long' cannot be used prior to '::' because it has no members
    typedef typename T::ValueCategory ValueCategory;
                     ^
external/envoy_deps/thirdparty/tclap/include/tclap/ValueArg.h:403:37: note: in instantiation of template class 'TCLAP::ArgTraits<unsigned long long>' requested here
        ExtractValue(_value, val, typename ArgTraits<T>::ValueCategory());
                                           ^
external/envoy_deps/thirdparty/tclap/include/tclap/ValueArg.h:363:5: note: in instantiation of member function 'TCLAP::ValueArg<unsigned long long>::_extractValue' requested here
                                _extractValue( args[*i] );
                                ^
source/server/options_impl.cc:31:29: note: in instantiation of member function 'TCLAP::ValueArg<unsigned long long>::processArg' requested here
  TCLAP::ValueArg<uint64_t> base_id(
                            ^
1 error generated.

Swapping out the typedef I added for using std::uint64_t; produces an identical error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrrm, this is unfortunate. I don't understand why the define doesn't cause the same problem.

If we can't figure out a better solution please do add a detailed comment explaining why it's done this way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally figured this out. TCLAP provides implementations of ArgTraits<T> for various types, but the ones for unsigned long long and long long are gated by #ifdef HAVE_LONG_LONG, which isn't set. I think they were expecting users of libraries to use autoconf which would detect this and make the necessary #define. I'll post it with a few other things tomorrow (split out in a new PR).

Copy link
Contributor

@dnoe dnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments across stuff I have specific familiarity with.

@@ -16,6 +16,8 @@ void SignalAction::sigHandler(int sig, siginfo_t* info, void* context) {
#ifdef REG_RIP
// x86_64
error_pc = reinterpret_cast<void*>(ucontext->uc_mcontext.gregs[REG_RIP]);
#elif defined(__APPLE__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add a conditional for x86 32 bit here? No need to implement it now, but I'm guessing a 32 bit compile will fail because there's no __rip.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

int hidx = 0;
for (const auto& sig : FATAL_SIGS) {
RELEASE_ASSERT(sigaction(sig, &previous_handlers_[hidx++], nullptr) == 0);
}
}

#ifdef __APPLE__
#define MAP_STACK (0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add #ifndef MAP_STACK around this in case Darwin adds real map_stack in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@@ -57,13 +59,24 @@ void SignalAction::installSigHandlers() {
}

void SignalAction::removeSigHandlers() {
#if defined(__APPLE__)
// ss_flags contains SS_DISABLE, but Darwin still checks the size, contrary to the man page
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh, good catch!

@@ -47,7 +49,9 @@ namespace Envoy {
class SignalAction : NonCopyable {
public:
SignalAction()
: guard_size_(sysconf(_SC_PAGE_SIZE)), altstack_size_(guard_size_ * 4), altstack_(nullptr) {
: guard_size_(sysconf(_SC_PAGE_SIZE)),
altstack_size_(std::max(guard_size_ * 4, static_cast<size_t>(MINSIGSTKSZ))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the MINSIGSTKSZ case is hit then protections for the immediately adjacent pages still works OK, there's just a portion at the end which is not protected. The second mprotect call above could have the size adjusted to cover this case.

Copy link
Member Author

@zuercher zuercher Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's correct. We ultimately mmap altstack_size_ + 2 * guard_size_ bytes (see mapSizeWithGuards). altstack_size_ doesn't include the guard pages. The second mprotect calls protects guard_size_ bytes at an address that's guard_size_ + altstack_size_ bytes from the start of the mapped memory.

We call sigaltstack with altstack_ + guard_size_ (to skip the first guard page) and altstack_size_ (which doesn't count the guard pages, and so stops just before the second guard page).

Did I miss something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, all of this looks correct. Guard pages are always outside the altstack_size_, and MINSIGSTKSZ should always be a multiple of page size.

@@ -11,6 +11,10 @@
#include "spdlog/spdlog.h"
#include "tclap/CmdLine.h"

#if defined(__APPLE__)
#define uint64_t unsigned long
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in header <cstdint> included above. You might need to do using std::uint64_t; however.

@@ -2,6 +2,12 @@

set -e

if [[ `uname` != "Linux" ]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's probably a way this test could be completely disabled with a Bazel rule instead.

@htuch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I don't know how.

Normally, I could use select to choose either no test script or a different one. However, the output of select is only usable by bazel rules (because only at the point of rule execution is it possible to inspect the configuration to choose which subset of parameters to select).

envoy_sh_test is a macro and it attempts to access members of its srcs parameter to produce inputs for several rules. This doesn't work if srcs is assigned from select (whose output has no [] operator). To make it work, envoy_sh_test would have to become a rule and at that point I'm in over my head.

@@ -10,6 +10,15 @@

#include "spdlog/spdlog.h"

#ifdef __APPLE__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will end up getting used again elsewhere. Can we put this into common/common/byte_order.h or something?

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for doing this. Some quick comments from a skim. Will do a more full review after the next pass of comment fixes once everyone else has looked.

// Since we are using SOCK_NONBLOCK, there are cases (on FreeBSD for example) where the immediate
// connect response is EINPROGRESS but the socket failed and the calling code might not know about
// it yet.
if (-1 == rc && errno == ECONNRESET) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we guard this with free bsd only then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it since it doesn't seem to happen on OSX.

return ::socket(AF_INET, flagsFromSocketType(type), 0);
int fd = ::socket(AF_INET, flagsFromSocketType(type), 0);
#ifdef __APPLE__
if (fd != -1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any point in doing fd != -1 check here? (Same in other places). Will just crash anyway in RELEASE_ASSERT.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the check because the original code didn't assert (for AF_INET and AF_UNIX). I'll go ahead and just assert on the fd and then assert on the fcntl call.

return Address::InstanceConstSharedPtr{
new Address::Ipv4Instance(reinterpret_cast<sockaddr_in*>(&orig_addr))};

case AF_INET6:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these new cases and the throw have code coverage on linux CI? If not will need to refactor somehow to get coverage.

#else // __linux__

/**
* No-op implementation of HotRestart for everybody else.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this point, can we potentially just revert all the changes to hot_restart_impl.h/cc, and just add a new hot_restart_nop_impl.h/cc, and do select() compile? This will keep existing code cleaner and make it possible to compile out hot restart support. (I think MSFT people will want this too initially).

@@ -323,14 +323,14 @@ TEST(Ipv6CidrRange, InstanceConstSharedPtrAndLengthCtor) {

TEST(Ipv6CidrRange, StringAndLengthCtor) {
CidrRange rng;
rng = CidrRange::create("::ffff", 122); // Assignment operator.
rng = CidrRange::create("ff::ffff", 122); // Assignment operator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we losing any linux test coverage w/ these changes? cc @hennna

Copy link
Contributor

@hennna hennna Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not off the top of my head. I don't think these tests are dependent on the address value. Is the issue with OSX the leading ::?

Copy link
Member Author

@zuercher zuercher Jul 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of. Wikipedia tells me an old version of the IPv6 spec considered ::/96 addresses (that is addresses with 96 bits of leading zeros) to be IPv6-mapped IPv4 addresses. It's been deprecated for a decade, but when OSX converts, say ::ffff, to a string it comes out as "::0.0.255.255" and the tests fail because they do string comparisons. Linux only provides the "mapped IPv4" style addresses with trailing dotted quads for the ::ffff:0:0/96 range (which replaced ::/96). It seems to me that these test are concerned with how the low order bits of the address are manipulated and that using a different prefix doesn't change the meaning of the tests.

(I wrote a little program to round trip IP address strings to binary and back. https://gist.github.com/zuercher/0d0bc5a433fda423124c71afa2eb29e1)

@@ -102,6 +103,48 @@ TEST_F(RingHashLoadBalancerTest, Basic) {
TestLoadBalancerContext context(0);
EXPECT_EQ(cluster_.hosts_[5], lb_.chooseHost(&context));
}
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's really worth doing this. This is already not great and related to my TODO above. Seems like it might be better to just not run these tests on Apple until someone fixes the TODO to compile murmur3 or city into the code (which is not difficult).

void free(Stats::RawStatData& data) override;

private:
SharedMemory& shmem_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the NOP hot restart impl should not use shared memory or the process shared mutex stuff. Basically, I would just do exactly what https://github.com/lyft/envoy/blob/master/test/integration/server.cc#L93 does which is to use process local locks, the heap allocator, etc. I think you can probably revert all the changes related to shared memory and just move the test impl into prod code somewhere and use that.

@mattklein123
Copy link
Member

@zuercher general comment. Can we start to split out some of this change into self contained PRs? Some of the test fixes can go in and get reviewed independently, as can some of the trivial header changes, etc. That way we can start merging the not controversial stuff and trim down the size of the main PR. Thank you.

@zuercher
Copy link
Member Author

zuercher commented Aug 1, 2017

Split out #1363 and #1365 for the watcher and hot restart changes. (I'll split more things out tomorrow.)

@mattklein123
Copy link
Member

Going to close this out in lieu of the other PRs that are ongoing.

@zuercher zuercher deleted the osx-support branch August 10, 2017 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants