Broker check certificate load error and perform periodic reloading #781

Merged
merged 1 commit into from
Jan 11, 2021

Conversation

kleunen
Contributor

@kleunen kleunen commented Jan 4, 2021

The broker crashes if a non-existent certificate file is specified, because the error code was not checked on loading.

@kleunen kleunen force-pushed the master branch 2 times, most recently from 996a306 to e9fa33c Compare January 4, 2021 20:50
@codecov

codecov bot commented Jan 4, 2021

Codecov Report

Merging #781 (8999d08) into master (232651e) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #781   +/-   ##
=======================================
  Coverage   85.16%   85.16%           
=======================================
  Files          63       63           
  Lines        8747     8747           
=======================================
  Hits         7449     7449           
  Misses       1298     1298           

@redboltz
Owner

redboltz commented Jan 5, 2021

I'm not sure what the problem is.

The broker crashes if a non-existent certificate file is specified, because the error code was not checked on loading.

If this is the problem, then checking the error code and terminating the broker is a good solution.
The PR seems to have many unrelated modifications.
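
The "check the error code and terminate" idea can be sketched with the standard library alone. This is only an illustration: `require_regular_file` is a hypothetical helper, and `std::filesystem` stands in for the Boost.Asio calls the broker actually uses to load the certificate.

```cpp
#include <cassert>
#include <filesystem>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical helper (not part of mqtt_cpp): verifies that a configured
// file exists before it is handed to the TLS layer.
void require_regular_file(std::filesystem::path const& path, std::string const& what) {
    std::error_code ec;
    bool const ok = std::filesystem::is_regular_file(path, ec);
    if (!ok) {
        // ec is set when the status query itself failed (e.g. file not found).
        std::string const reason = ec ? ec.message() : std::string("not a regular file");
        throw std::runtime_error("Failed to load " + what + ": " + reason);
    }
}
```

The caller can let the exception propagate out of `main` to terminate the broker with a readable message instead of crashing later.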

@kleunen
Contributor Author

kleunen commented Jan 5, 2021

If you have a long-running server, you need to reload the certificate file, because the certificate is only valid for a certain period. For example, you may get a new certificate every 60 days.
https://letsencrypt.org/docs/faq/#:~:text=Our%20certificates%20are%20valid%20for,your%20certificates%20every%2060%20days.

@redboltz
Owner

redboltz commented Jan 5, 2021

Thank you for the elaboration. I understand the problem you want to solve.
I will take a look at the PR. Please wait a moment.

IF (MSVC AND MQTT_USE_STATIC_OPENSSL)
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32)
ENDIF ()

Owner

Why is this update required?

FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/server.crt.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR})
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/server.key.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR})
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/cacert.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR})

IF ("${CMAKE_CXX_COMPILER_ID}" STREQUAL "MSVC")
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/broker.conf DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/Release)
Owner

I noticed that this doesn't support debug builds. The problem already existed before this PR.
But if we used the current build type (Debug or Release), the code would be better.

Contributor Author

Yes. I also have one problem with this: it is a separate target, so if you build only the broker, the conf files are not updated. I also wanted to name the broker's key broker.key.pem instead of server.key.pem.

Owner

What do you mean?
If you update the cert file, you need to re-run cmake explicitly, otherwise the cert file is not updated?
I think that is natural for cmake.
Perhaps you can add your own dependency, but that is too difficult for me.

@@ -16,18 +16,44 @@
#include <fstream>

#if defined(MQTT_USE_TLS)
boost::asio::ssl::context init_ctx(boost::program_options::variables_map const& vm)
constexpr std::size_t certificate_reload_interval_seconds = 3600;
Owner

It should be a program option.
If the value is 0, no reload should happen, and 0 should be the default in the boost::program_options definition.

constexpr std::size_t certificate_reload_interval_seconds = 3600;

template<typename ServerPtr>
void reload_ctx(boost::asio::io_context &ioc, ServerPtr server, boost::program_options::variables_map const& vm, std::string const &name)
Owner

Why is name needed?

Contributor Author

For the info log message

Owner

Do WSS and TLS use different cert files?

Contributor Author

@kleunen kleunen Jan 6, 2021

No, they have different contexts. They don't share a reference to the SSL context; instead, the context is moved into the Server class. So if a new file is loaded, it needs to be loaded into two different SSL contexts: WSS and TLS.

https://github.com/redboltz/mqtt_cpp/blob/master/test/system/test_server_tls.hpp#L28-L33

https://github.com/redboltz/mqtt_cpp/blob/master/test/system/test_server_tls_ws.hpp#L28-L34

Contributor Author

That is also why this is needed:

e9fa33c#diff-555c49abb1b633967427e5b3374722f21d0dc4898b352d55a715168a1e761b00R62-R77

Which I don't like that much.

Owner

I understand why name is needed.

Owner

name is std::string const&, and you pass "WSS", so an implicit temporary std::string is created.
That is quite misleading.
Use char const* or std::string.

@@ -51,16 +77,18 @@ void run_broker(boost::program_options::variables_map const& vm)
#endif // defined(MQTT_USE_WS)

#if defined(MQTT_USE_TLS)
std::unique_ptr<test_server_tls> s_tls;
std::shared_ptr<test_server_tls> s_tls;
Owner

Why is shared_ptr needed? I think that unique_ptr is enough.

Contributor Author

I use a shared_ptr so I can bind it into the lambda for the timer callback. But maybe use a weak reference?

Owner

I think that you can use unique_ptr

Owner

unique_ptr can be move-captured. So far, I can't see why we couldn't use unique_ptr.

Contributor Author

yes, but then the reference in the 'run_broker' is invalidated. Maybe at some point you want to add a shutdown of the broker (shutdown all servers), or maybe request the state of all servers (for example get diagnostics of number of connected clients). I don't think you want to move the pointer into the lambda of the reload timer. The reload timer is not the owner of the server, the broker is.

Owner

Just keep unique_ptr and pass a reference to the pointee to reload_ctx.
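
A minimal sketch of that ownership split, using only the standard library. `fake_server` and `make_reload_callback` are hypothetical names for illustration; the real code would take the server type used by the broker.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a broker server; it only counts reloads.
struct fake_server {
    int reloads = 0;
};

// The callback borrows the server by reference; ownership stays with the
// caller's unique_ptr. The server must outlive the returned callback.
std::function<void()> make_reload_callback(fake_server& server) {
    return [&server] { ++server.reloads; };
}
```

With this shape, run_broker keeps the only owning pointer and can still shut the server down or query it later, which addresses the concern about moving the pointer into the timer lambda.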

Owner

I noticed that 827ffea#diff-a980e7b60a024a0aa2c4bccacbeb0369bb88da50f872f356a1a30095c941cc03 is not good.
I didn't notice at that time.

827ffea#diff-a980e7b60a024a0aa2c4bccacbeb0369bb88da50f872f356a1a30095c941cc03R41
shouldn't be a unique_ptr.
Just place it on the stack.
If it needs to be nullable, you should use optional, not unique_ptr.

I mean

        MQTT_NS::optional<test_server_no_tls> s;
        if (vm.count("tcp.port")) {
            s.emplace(ioc, b, vm["tcp.port"].as<uint16_t>());
        }
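
For readers who want to try the pattern in isolation, here is a small sketch. It assumes `std::optional` behaves like `MQTT_NS::optional` for this purpose; `fake_server` and `make_server_if_configured` are hypothetical, and a port of 0 plays the role of "option not given".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for test_server_no_tls; it only records its port.
struct fake_server {
    explicit fake_server(std::uint16_t port) : port_(port) {}
    std::uint16_t port() const { return port_; }
private:
    std::uint16_t port_;
};

// Constructs the server in place only when a port was configured.
// No heap allocation is involved, unlike with unique_ptr.
std::optional<fake_server> make_server_if_configured(std::uint16_t port) {
    std::optional<fake_server> s;
    if (port != 0) {
        s.emplace(port);
    }
    return s;
}
```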

IF (MSVC AND MQTT_USE_STATIC_OPENSSL)
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32)
ENDIF ()

Owner

Why is this update required?

Contributor Author

Ah yes. This is again maybe unrelated, but static OpenSSL on Windows requires linking crypt32.

Owner

Could you remove this?
Static linking is out of scope here.

Owner

I noticed that I already merged static support. Hmm...
Maybe it is required. What does 32 mean?

Owner

Unfortunately, I don't have much time to investigate it.
So if

    IF (MSVC AND MQTT_USE_STATIC_OPENSSL)
        TARGET_LINK_LIBRARIES (${source_file_we} Crypt32)
    ENDIF ()

is required for CI, then I will apply it.

Otherwise, if this works well for both 64-bit and 32-bit, then I will apply it.
If it works for only 32-bit or only 64-bit, I won't apply it.

What is the status?

Contributor Author

It works for both 64bit and 32bit

Contributor Author

@kleunen kleunen Jan 6, 2021

It is not needed for CI, because CI builds with dynamic OpenSSL. I changed it because I build locally with static OpenSSL on Windows.

Owner

It works for both 64bit and 32bit

Ok, I accept it.

@redboltz
Owner

redboltz commented Jan 6, 2021

If you have a long-running server, you need to reload the certificate file, because the certificate is only valid for a certain period. For example, you may get a new certificate every 60 days.
https://letsencrypt.org/docs/faq/#:~:text=Our%20certificates%20are%20valid%20for,your%20certificates%20every%2060%20days.

Just a question: does Let's Encrypt update not only the server certs but also the key file?

@redboltz
Owner

redboltz commented Jan 6, 2021

Test is required.
How about this?

Prepare valid_cert.crt and invalid_cert.crt (e.g. expired).

  1. Copy invalid_cert.crt to test_cert.crt.
  2. Run the broker with test_cert.crt and set certificate_reload_interval_seconds to 2.
  3. Client A connects to the broker and fails due to the expired cert.
  4. Copy valid_cert.crt to test_cert.crt.
  5. Client A waits 3 seconds.
  6. When 2 seconds have passed, the broker reloads test_cert.crt.
  7. Client A connects again and connects successfully.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

If you have a long-running server, you need to reload the certificate file, because the certificate is only valid for a certain period. For example, you may get a new certificate every 60 days.
https://letsencrypt.org/docs/faq/#:~:text=Our%20certificates%20are%20valid%20for,your%20certificates%20every%2060%20days.

Just a question: does Let's Encrypt update not only the server certs but also the key file?

Yes, I believe so. It can be revoked in case it gets stolen.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

Test is required.
How about this?

Prepare valid_cert.crt and invalid_cert.crt (e.g. expired).

  1. Copy invalid_cert.crt to test_cert.crt.
  2. Run the broker with test_cert.crt and set certificate_reload_interval_seconds to 2.
  3. Client A connects to the broker and fails due to the expired cert.
  4. Copy valid_cert.crt to test_cert.crt.
  5. Client A waits 3 seconds.
  6. When 2 seconds have passed, the broker reloads test_cert.crt.
  7. Client A connects again and connects successfully.

That will be difficult, because you need to alter the system time to make the certificate appear expired, or you need to generate a certificate every time you run the test. That is possible.

@redboltz
Owner

redboltz commented Jan 6, 2021

Test is required.
How about this?

Prepare valid_cert.crt and invalid_cert.crt (e.g. expired).

  1. Copy invalid_cert.crt to test_cert.crt.
  2. Run the broker with test_cert.crt and set certificate_reload_interval_seconds to 2.
  3. Client A connects to the broker and fails due to the expired cert.
  4. Copy valid_cert.crt to test_cert.crt.
  5. Client A waits 3 seconds.
  6. When 2 seconds have passed, the broker reloads test_cert.crt.
  7. Client A connects again and connects successfully.

That will be difficult, because you need to alter the system time to make the certificate appear expired, or you need to generate a certificate every time you run the test. That is possible.

But I need a test. The point is not expiration; an invalid cert is what matters. I said invalid. Expired is just an example.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

But I need a test. The point is not expiration; an invalid cert is what matters. I said invalid. Expired is just an example.

Then you need to generate the certificate at runtime. Your test approach is possible by generating a self-signed certificate based on the current timestamp.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

You can also update the name in the certificate and see if it is updated. Then you can pregenerate the certificates.

@redboltz
Owner

redboltz commented Jan 6, 2021

But I need a test. The point is not expiration; an invalid cert is what matters. I said invalid. Expired is just an example.

Then you need to generate the certificate at runtime. Your test approach is possible by generating a self-signed certificate based on the current timestamp.

Why? Just commit an invalid cert. I don't need to generate it at runtime.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

But SSL won't connect if the certificate is expired. But I think you can get the certificate even if the connect fails.

Your valid certificate will expire someday.

@redboltz
Owner

redboltz commented Jan 6, 2021

But SSL won't connect if the certificate is expired. But I think you can get the certificate even if the connect fails.

Your valid certificate will expire someday.

I'm not sure how long a validity period is allowed, but 10 years or 100 years is good enough.
It is only for the test.
And 10 years later, I will update the cert.
I don't want to introduce a complicated system, like runtime cert generation, for the test.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

But SSL won't connect if the certificate is expired. But I think you can get the certificate even if the connect fails.
Your valid certificate will expire someday.

I'm not sure how long a validity period is allowed, but 10 years or 100 years is good enough.
It is only for the test.
And 10 years later, I will update the cert.
I don't want to introduce a complicated system, like runtime cert generation, for the test.

10 years should be possible, maybe 100 years also.

@redboltz
Owner

redboltz commented Jan 6, 2021

10 years should be possible, maybe 100 years also

Thanks. I forgot the current cert's expiration date, but it is pretty far off.
Anyway, prepare a cert that is invalid in some way (invalid IP address, etc.) for the test.
Then we have both valid and invalid certs.
So you can write the test using the #781 (comment) scenario.
Maybe the test cert file copy/update can be done from the test code (no need to use cmake) using boost filesystem.
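
The copy steps of the scenario could look roughly like this when driven from test code. This sketch swaps in `std::filesystem` for Boost.Filesystem (the two APIs are close, but the names used here are the standard ones), and `install_cert` and `slurp` are hypothetical helpers.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Overwrites `target` with `source`, as the "copy ... to test_cert.crt"
// steps of the scenario require.
void install_cert(fs::path const& source, fs::path const& target) {
    fs::copy_file(source, target, fs::copy_options::overwrite_existing);
}

// Reads a whole file; used here only to check which file is installed.
std::string slurp(fs::path const& p) {
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Note that copy_file is not an atomic replacement, so a broker reloading at the wrong moment could see a partially written file; a real test would leave enough margin between the copy and the reload interval.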

@redboltz
Owner

redboltz commented Jan 6, 2021

Is the cert file update process atomic?

I personally use certbot to update Let's Encrypt cert files.
I'm not sure what kind of OS command is run (cp, mv, etc.).

Is there any moment during the update when access can fail? If so, the current code throws an exception. I know it would happen only with very rare timing.
If it is atomic, no problem.

}

auto reload_timer = std::make_shared<boost::asio::deadline_timer>(ioc, boost::posix_time::seconds(certificate_reload_interval_seconds));
reload_timer->async_wait([&, server = server, reload_timer = reload_timer, name = name](boost::system::error_code const &e) {
Owner

The reload timer shouldn't be allocated in reload_ctx. It should be allocated in run_broker() as a shared_ptr and passed by reference to reload_ctx. In the lambda capture, use the weak_ptr pattern.
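
The weak_ptr capture pattern mentioned here can be sketched without Asio. `fake_timer` and `make_handler` are hypothetical; in the broker the captured object would be the Asio timer and the handler would be passed to async_wait.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for the reload timer; it only counts firings.
struct fake_timer {
    int fired = 0;
};

// The handler captures a weak_ptr, so it never extends the timer's lifetime.
// It returns false when the timer has already been destroyed.
std::function<bool()> make_handler(std::shared_ptr<fake_timer> const& timer) {
    std::weak_ptr<fake_timer> weak = timer;
    return [weak] {
        if (auto sp = weak.lock()) {
            ++sp->fired;
            return true;
        }
        return false;
    };
}
```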

Owner

name shouldn't be copy capture. Use reference capture.

Owner

&e is a coding rule violation.

Owner

name shouldn't be copy-captured. Use reference capture.

If you use std::string (see https://github.com/redboltz/mqtt_cpp/pull/781/files#r552543783), then name should be move-captured.
If you use char const*, then name should be copy-captured. A copy capture should be written [name], not [name = name].
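
The two capture styles can be compared side by side in a small sketch. The function names are hypothetical and the closures only build a log string.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// std::string taken by value, then moved into the closure: one copy at most.
std::function<std::string()> make_logger_from_string(std::string name) {
    return [name = std::move(name)] { return "reload " + name; };
}

// char const* is a cheap pointer copy, so plain [name] is enough.
// The pointed-to characters must outlive the closure (a literal does).
std::function<std::string()> make_logger_from_literal(char const* name) {
    return [name] { return std::string("reload ") + name; };
}
```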

reload_timer->async_wait([&, server = server, reload_timer = reload_timer, name = name](boost::system::error_code const &e) {
BOOST_ASSERT(!e || e == boost::asio::error::operation_aborted);

if(!e) {
Owner

coding rule violation.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

Is the cert file update process atomic?

I personally use certbot to update Let's Encrypt cert files.
I'm not sure what kind of OS command is run (cp, mv, etc.).

Is there any moment during the update when access can fail? If so, the current code throws an exception. I know it would happen only with very rare timing.
If it is atomic, no problem.

There are symlinks to the individual files, so the update of the key or the certificate will be atomic by itself, but the key and certificate may have different versions:

lrwxrwxrwx 1 root root  39 Jan  4 19:02 cert.pem -> ../../archive/mqtt.kleunen.nl/cert1.pem
lrwxrwxrwx 1 root root  40 Jan  4 19:02 chain.pem -> ../../archive/mqtt.kleunen.nl/chain1.pem
lrwxrwxrwx 1 root root  44 Jan  4 19:02 fullchain.pem -> ../../archive/mqtt.kleunen.nl/fullchain1.pem
lrwxrwxrwx 1 root root  42 Jan  4 19:02 privkey.pem -> ../../archive/mqtt.kleunen.nl/privkey1.pem
-rw-r--r-- 1 root root 692 Jan  4 19:02 README

I don't think it will be an issue if the certificate is already updated and the key is not yet, or the other way around. There is an overlap period: keys/certificates stay valid for 90 days, but should be refreshed every 60 days with Let's Encrypt.

@kleunen
Contributor Author

kleunen commented Jan 6, 2021

To make a test, the reload_ctx and run_broker need to be moved into a header file. Also the servers (test_server_tls, test_server_no_tls, etc ...), they are part of the system test. Should this be cleaned up first ?

@@ -84,7 +116,8 @@ int main(int argc, char **argv) {
#endif // defined(MQTT_USE_LOG)
("certificate", boost::program_options::value<std::string>(), "Certificate file for TLS connections")
("private_key", boost::program_options::value<std::string>(), "Private key file for TLS connections")
;
("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(3600), "Reload interval for the certificate and private key files (seconds)")
Owner

Suggested change
- ("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(3600), "Reload interval for the certificate and private key files (seconds)")
+ ("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(0), "Reload interval for the certificate and private key files (seconds)")

Never reloading by default is better. 0 means no reload.

Owner

Not all users expect automatic reload. It should be optional functionality.

boost::asio::ssl::context init_ctx(boost::program_options::variables_map const& vm)

template<typename Server>
void reload_ctx(boost::asio::steady_timer &reload_timer, Server &server, boost::program_options::variables_map const& vm, char const *name)
Owner

@redboltz redboltz Jan 7, 2021

Suggested change
- void reload_ctx(boost::asio::steady_timer &reload_timer, Server &server, boost::program_options::variables_map const& vm, char const *name)
+ void reload_ctx(boost::asio::steady_timer& reload_timer, Server& server, boost::program_options::variables_map const& vm, char const* name) {

Please move { to the end of the line.

Owner

@redboltz redboltz Jan 7, 2021

Passing boost::program_options::variables_map const& vm is not good.
Analyzing program options is not reload_ctx()'s responsibility; reload_ctx() should focus on reloading the SSL context.
So certificate, private_key, and certificate_reload_interval should be passed.
Checking the combination of arguments should be done in run_broker().

Contributor Author

Does this still need to be updated?

Owner

No, the function itself has already been updated to take different parameters.

Contributor Author

Yes, I updated. Binding the variables_map to the timer is not a good idea.

Owner

No, the function itself has already been updated to take different parameters.

That was wrong. The answer is yes. I will add comments.

Owner

@redboltz redboltz Jan 11, 2021

I got confused between load_ctx and reload_ctx. I need some time to understand.

Contributor Author

ok

Owner

Now I understand. No update is required here. The parameters of reload_ctx() are OK.

Contributor Author

Only a coding rules update?

Passing as const& does result in a copy every time the timer callback is called.

if (ec) {
throw std::runtime_error("Failed to load private key file: " + ec.message());
}

Owner

If certificate_reload_interval is 0, then don't set the timer.
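
One way to express "0 means no reload" is to turn the option into an optional delay, so the caller arms the timer only when a value is present. `next_reload_delay` is a hypothetical helper, not code from the PR.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Returns the delay until the next reload, or nullopt when the interval
// is 0, which the discussion defines as "reloading disabled".
std::optional<std::chrono::seconds> next_reload_delay(unsigned int interval_seconds) {
    if (interval_seconds == 0) {
        return std::nullopt;
    }
    return std::chrono::seconds(interval_seconds);
}
```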

certificate=broker.crt.pem
private_key=broker.key.pem
certificate=server.crt.pem
private_key=server.key.pem
Owner

@redboltz redboltz Jan 7, 2021

The following comment should be added. Please update it to better English :)

# Reload interval for the certificate and private key files (seconds)
# It is useful for Let's Encrypt certs update.
# If not set or set to 0, then no reload.
#
# certificate_reload_interval=3600

@redboltz
Owner

redboltz commented Jan 7, 2021

To make a test, the reload_ctx and run_broker need to be moved into a header file. Also the servers (test_server_tls, test_server_no_tls, etc ...), they are part of the system test. Should this be cleaned up first ?

Ah, I mistakenly thought it was part of broker.hpp, but it is actually example/broker.cpp.
I think that the current responsibility mapping is good.
So you don't need to move reload_ctx to broker.hpp (or include/mqtt/broker/???.hpp).
It is part of the example. So far, manually checking that the reload mechanism works well is good enough.
Have you already checked it? (I mean updating from an invalid to a valid cert, with the connection failing and succeeding as expected.)

@kleunen
Contributor Author

kleunen commented Jan 7, 2021

To make a test, the reload_ctx and run_broker need to be moved into a header file. Also the servers (test_server_tls, test_server_no_tls, etc ...), they are part of the system test. Should this be cleaned up first ?

Ah, I mistakenly thought it was part of broker.hpp, but it is actually example/broker.cpp.
I think that the current responsibility mapping is good.
So you don't need to move reload_ctx to broker.hpp (or include/mqtt/broker/???.hpp).
It is part of the example. So far, manually checking that the reload mechanism works well is good enough.
Have you already checked it? (I mean updating from an invalid to a valid cert, with the connection failing and succeeding as expected.)

I still need to check manually if the reloading is actually done

@redboltz
Owner

redboltz commented Jan 8, 2021

I still need to check manually if the reloading is actually done

OK, after you finish checking, please let me know.
The SSL context sometimes behaves unexpectedly. ("Sometimes" doesn't mean unstable: some APIs work as expected, and some don't.)

Even if ssl::stream's parameter is a reference to ssl::context, some OpenSSL APIs don't reflect the update in the current connection. So I needed to control the ssl::context setup timing carefully.
I only experienced it on the client side.
Perhaps the server side needs to re-accept (or re-listen).
I hope the context update is reflected dynamically in the currently listening and accepting ssl::stream.

@kleunen
Contributor Author

kleunen commented Jan 8, 2021

It seems that when I update only the key and not the cert, I get the following error:
08:32:28.389845 T:0x000005e4 S:error C:mqtt_broker broker.cpp:102 Failed to load private key file: key values mismatch

So you would have to update both the key and the cert atomically:
https://www.tbs-certificates.co.uk/FAQ/en/520.html#:~:text=Openssl%3A%20key%20values%20mismatch&text=It%20means%20the%20private%20key,an%20error%20in%20the%20configuration.

So you would have to update a symlink to a directory containing both the key and the cert. I wonder why Let's Encrypt does not do this.

So maybe first load the certificate and private key into a temporary SSL context. If that succeeds, update the context in the server. If it fails, keep the old context in the server?
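
The "load into a temporary, swap only on success" idea can be sketched independently of OpenSSL. Everything here is hypothetical: `fake_ctx` stands in for an SSL context, and `try_load` treats a key and certificate as matching when their ids are equal, mimicking the "key values mismatch" failure.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for an SSL context: just the loaded ids.
struct fake_ctx {
    std::string cert_id;
    std::string key_id;
};

// Hypothetical loader: fails when the key and certificate do not belong
// together, like the "key values mismatch" error reported above.
std::optional<fake_ctx> try_load(std::string cert_id, std::string key_id) {
    if (cert_id != key_id) {
        return std::nullopt;
    }
    return fake_ctx{std::move(cert_id), std::move(key_id)};
}

// Replaces `current` only if the new pair loads; otherwise the old
// context stays in use. Returns whether the swap happened.
bool reload_or_keep(fake_ctx& current, std::string cert_id, std::string key_id) {
    auto fresh = try_load(std::move(cert_id), std::move(key_id));
    if (!fresh) {
        return false;
    }
    current = std::move(*fresh);
    return true;
}
```

A failed reload then costs nothing but a log message, and the next timer tick can retry once both files have been updated.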

@kleunen
Contributor Author

kleunen commented Jan 8, 2021

I have now bundled both the key and the certificate into a bundle.pem:
cat server.key.pem server.crt.pem > server.bundle.pem

Now the updating works. But I do see that it sometimes takes a while before the updated certificate is used. I think the following happens:

  1. The stream is listening for a connection with the old certificate
  2. The certificate is updated
  3. The stream accepts a connection with the old certificate
  4. The stream listens for a connection with the new certificate

So there will be one more connection with the old certificate before the new certificate is used. Otherwise, you need to restart listening when the certificate is updated.
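
The sequence described above can be reproduced with a toy model. This is not how Asio is implemented; `toy_listener` is a hypothetical class that simply assumes the pending accept is bound to whichever certificate was current when it was armed.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy model of a listener whose pending accept keeps the certificate
// that was current when the accept was armed.
struct toy_listener {
    std::string current_cert;
    std::string armed_cert;  // certificate bound to the pending accept

    explicit toy_listener(std::string cert)
        : current_cert(cert), armed_cert(std::move(cert)) {}

    void update_cert(std::string cert) { current_cert = std::move(cert); }

    // Completes the pending accept and returns the certificate it used,
    // then arms the next accept with the current certificate.
    std::string accept() {
        std::string used = armed_cert;
        armed_cert = current_cert;
        return used;
    }
};
```

In this model the first accept after an update still reports the old certificate and the second reports the new one, matching the observed delay. Re-arming the accept right after the update would remove that one stale connection.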

Perhaps the server side needs to re-accept (or re-listen).

Yes, it does.

@kleunen kleunen force-pushed the master branch 3 times, most recently from ab91861 to 10cdb65 Compare January 8, 2021 10:57
@kleunen kleunen force-pushed the master branch 2 times, most recently from 37a9082 to 9d87b9a Compare January 11, 2021 10:21

template<typename Server>
void reload_ctx(Server& server, boost::asio::steady_timer& reload_timer,
std::string const &certificate_filename,
Owner

Suggested change
- std::string const &certificate_filename,
+ std::string const& certificate_filename,

template<typename Server>
void reload_ctx(Server& server, boost::asio::steady_timer& reload_timer,
std::string const &certificate_filename,
std::string const &key_filename,
Owner

Suggested change
- std::string const &key_filename,
+ std::string const& key_filename,

Comment on lines 39 to 40
reload_timer.async_wait([&server, &reload_timer,
certificate_filename, key_filename, certificate_reload_interval, name](boost::system::error_code const &e) {
Owner

Suggested change
- reload_timer.async_wait([&server, &reload_timer,
- certificate_filename, key_filename, certificate_reload_interval, name](boost::system::error_code const &e) {
+ reload_timer.async_wait(
+ [&server, &reload_timer, certificate_filename, key_filename, certificate_reload_interval, name]
+ (boost::system::error_code const& e) {

Owner

The main point is const &e vs const& e. But I also suggested changing the line breaks. Please update it.

@redboltz
Owner

Thank you for updating!
Merged.

@redboltz redboltz merged commit 3cc8622 into redboltz:master Jan 11, 2021