-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Broker check certificate load error and perform periodic reloading #781
Conversation
996a306
to
e9fa33c
Compare
Codecov Report
@@ Coverage Diff @@
## master #781 +/- ##
=======================================
Coverage 85.16% 85.16%
=======================================
Files 63 63
Lines 8747 8747
=======================================
Hits 7449 7449
Misses 1298 1298 |
I'm not sure what the problem is.
If this is the problem, then check the error code and terminate the broker is good solution. |
If you have long running server, you need to reload the certificate file. Because the certificate is only valid for certain period. For example you may get a new certificate every 60 days. |
Thank you for elaboration. I understand what is the problem you want to solve. |
IF (MSVC AND MQTT_USE_STATIC_OPENSSL) | ||
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32) | ||
ENDIF () | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this update is required?
example/CMakeLists.txt
Outdated
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/server.crt.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR}) | ||
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/server.key.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR}) | ||
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/../test/certs/cacert.pem DESTINATION ${CMAKE_CURRENT_BINARY_DIR}) | ||
|
||
IF ("${CMAKE_CXX_COMPILER_ID}" STREQUAL "MSVC") | ||
FILE(COPY ${CMAKE_CURRENT_SOURCE_DIR}/broker.conf DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/Release) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that this doesn't support debug build. The problem is originally exists.
But if we get the current build type (Debug or Release), then the code would become better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Also one problem i have with this. It is a separate target. If you build only broker, the conf files are not updated. And also, i wanted to name the key for broker broker.key.pem instead of server.key.pem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean?
If you update the cert file, then you need to re-run cmake explicitly otherwise the cert file is not updated?
I think that it is natural for cmake.
Perhaps you can add your own dependency. But it is too difficult for me.
example/broker.cpp
Outdated
@@ -16,18 +16,44 @@ | |||
#include <fstream> | |||
|
|||
#if defined(MQTT_USE_TLS) | |||
boost::asio::ssl::context init_ctx(boost::program_options::variables_map const& vm) | |||
constexpr std::size_t certificate_reload_interval_seconds = 3600; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be a program option.
And if the value is 0 then no reload is happened and 0 should be default in the boost::program_options definition.
example/broker.cpp
Outdated
constexpr std::size_t certificate_reload_interval_seconds = 3600; | ||
|
||
template<typename ServerPtr> | ||
void reload_ctx(boost::asio::io_context &ioc, ServerPtr server, boost::program_options::variables_map const& vm, std::string const &name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why name is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the info log message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WSS and TLS use a different cert files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they have a different context. So they don't share a reference to the ssl context, but the context is moved into the Server class. So if a new file is loaded, it needs to be loaded into two different ssl context: WSS and TLS
https://github.com/redboltz/mqtt_cpp/blob/master/test/system/test_server_tls.hpp#L28-L33
https://github.com/redboltz/mqtt_cpp/blob/master/test/system/test_server_tls_ws.hpp#L28-L34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is also why this is needed:
e9fa33c#diff-555c49abb1b633967427e5b3374722f21d0dc4898b352d55a715168a1e761b00R62-R77
Which i don't like that much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why name is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name is std::string const&
, and you pass "WSS". Implicit temporary std::string
is created.
It is quite misleading.
Use char const*
of std::string
.
example/broker.cpp
Outdated
@@ -51,16 +77,18 @@ void run_broker(boost::program_options::variables_map const& vm) | |||
#endif // defined(MQTT_USE_WS) | |||
|
|||
#if defined(MQTT_USE_TLS) | |||
std::unique_ptr<test_server_tls> s_tls; | |||
std::shared_ptr<test_server_tls> s_tls; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why shared_ptr is needed? I think that unique_ptr is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use shared ptr to bind to lambda for timer callback. But maybe use weak reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you can use unique_ptr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unique_ptr can move capture. I can't see where is the point that we can't use unique_ptr, so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but then the reference in the 'run_broker' is invalidated. Maybe at some point you want to add a shutdown of the broker (shutdown all servers), or maybe request the state of all servers (for example get diagnostics of number of connected clients). I don't think you want to move the pointer into the lambda of the reload timer. The reload timer is not the owner of the server, the broker is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just keep unique_ptr and pass the reference of pointee to reload_ctx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that 827ffea#diff-a980e7b60a024a0aa2c4bccacbeb0369bb88da50f872f356a1a30095c941cc03 is not good.
I didn't notice at that time.
827ffea#diff-a980e7b60a024a0aa2c4bccacbeb0369bb88da50f872f356a1a30095c941cc03R41
shouldn't be a unique_ptr.
Just locate on the stack.
If nullable is required, you should use optional, not unique_ptr.
I mean
MQTT_NS::optional<test_server_no_tls> s;
if (vm.count("tcp.port")) {
s.emplace(ioc, b, vm["tcp.port"].as<uint16_t>());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IF (MSVC AND MQTT_USE_STATIC_OPENSSL) | ||
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32) | ||
ENDIF () | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this update is required?
IF (MSVC AND MQTT_USE_STATIC_OPENSSL) | ||
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32) | ||
ENDIF () | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this update is required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes. This is again maybe unrelated. But openssl static om Windows requires linking crypt32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove this?
static link is out of scope here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that I already merged static support. Hmm...
Maybe it is required. What does 32 mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I don't have much time to investigate it.
So if
IF (MSVC AND MQTT_USE_STATIC_OPENSSL)
TARGET_LINK_LIBRARIES (${source_file_we} Crypt32)
ENDIF ()
is required for CI, then I apply it.
Otherwise, if this works well for both 64bit and 32bit, then I apply it.
If it works either 32bit or 64bit only, I don't apply it.
Which is the status ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works for both 64bit and 32bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed for CI, because CI builds with dynamic openssl. I changed it because I build localy with static openssl on windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works for both 64bit and 32bit
Ok, I accept it.
Just a question. Let's encrypt updates not only server certs but also key file ? |
Test is required. Prepare valid_cert.crt and invalid_cert.crt(e.g. expired).
|
Yes I believe so. It can be revoked in case it gets stolen |
It will be difficult. Because you need to alter the system time to trick into thinking the certificate is expired. Or you need to generate a certificate every time you run the test. This is possible |
But I need test. The point is not expiration. Invalid cert is important. I said invalid. Expired is just an example. |
Then you need to generate the certificate at runtime. It is possible your test approach, by generating a self signed certificate based on current timestamp |
You can also update the name in the certificate. And see if it is updated. Then you can pregeneratr |
Why? Just commit invalid cert. I don't need generate it on the runtime |
But ssl wont connect if the certificate is expired. But i think you can get certificate even if connect fails. Your valid certificate will expire someday |
I'm not sure how long duration is allowed. But 10 years or 100 years is good enough. |
10 years should be possible, maybe 100 years also |
Thanks. I forgot the current cert expiration date but it is pretty long. |
Is cert file updating process is atomic ? I personally use certbot to update Let's Encrypt cert files. Is there any access failing timing during update? If it is, the current code throw exception. I know it happens very very rare timing. |
example/broker.cpp
Outdated
} | ||
|
||
auto reload_timer = std::make_shared<boost::asio::deadline_timer>(ioc, boost::posix_time::seconds(certificate_reload_interval_seconds)); | ||
reload_timer->async_wait([&, server = server, reload_timer = reload_timer, name = name](boost::system::error_code const &e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use single &
with async operation.
See https://github.com/redboltz/mqtt_cpp/wiki/Coding-Rules#details-of-lambda-expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reload timer shouldn't be allocated in the reload_ctx. It should be allocated at run_broker() as the shared_ptr. And passed by reference to reload_ctx. At the lambda capture, use weak_ptr pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name
shouldn't be copy capture. Use reference capture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&e
is coding rule violation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name
shouldn't be copy capture. Use reference capture.
If you use https://github.com/redboltz/mqtt_cpp/pull/781/files#r552543783 std::string, then name should be move capture.
If char const* then name should be copy capture. copy capture should be [name]
not [name = name]
.
example/broker.cpp
Outdated
reload_timer->async_wait([&, server = server, reload_timer = reload_timer, name = name](boost::system::error_code const &e) { | ||
BOOST_ASSERT(!e || e == boost::asio::error::operation_aborted); | ||
|
||
if(!e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coding rule violation.
There are symlinks to individual files, so the update of the key/certificate will be atomic by itself, but the key and certificate may have different version:
I don't think it will be an issue the certificate is already updated, and the key not yet, or the other way around. There is an overlap period. Keys/certificates stay valid for 90 days, but should be refreshed every 60 days with letsencrypt |
To make a test, the reload_ctx and run_broker need to be moved into a header file. Also the servers (test_server_tls, test_server_no_tls, etc ...), they are part of the system test. Should this be cleaned up first ? |
example/broker.cpp
Outdated
@@ -84,7 +116,8 @@ int main(int argc, char **argv) { | |||
#endif // defined(MQTT_USE_LOG) | |||
("certificate", boost::program_options::value<std::string>(), "Certificate file for TLS connections") | |||
("private_key", boost::program_options::value<std::string>(), "Private key file for TLS connections") | |||
; | |||
("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(3600), "Reload interval for the certificate and private key files (seconds)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(3600), "Reload interval for the certificate and private key files (seconds)") | |
("certificate_reload_interval", boost::program_options::value<unsigned int>()->default_value(0), "Reload interval for the certificate and private key files (seconds)") |
Never reload by default is better. 0 means no reload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all users expect automatic reload. It should be optional functionality.
example/broker.cpp
Outdated
boost::asio::ssl::context init_ctx(boost::program_options::variables_map const& vm) | ||
|
||
template<typename Server> | ||
void reload_ctx(boost::asio::steady_timer &reload_timer, Server &server, boost::program_options::variables_map const& vm, char const *name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void reload_ctx(boost::asio::steady_timer &reload_timer, Server &server, boost::program_options::variables_map const& vm, char const *name) | |
void reload_ctx(boost::asio::steady_timer& reload_timer, Server& server, boost::program_options::variables_map const& vm, char const* name) { |
Please move {
to the end of the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing boost::program_options::variables_map const& vm
is not good.
Analyze program options is not suitable for reload_ctx() responsibility. reload_ctx() should focus on reload ssl context.
So certificate, private_key, and certificate_reload_interval should be passed.
The arguments combination checking should be done at run_broker().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be updated ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the function itself already updated as different parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I updated. Binding the variables_map to the timer is not a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the function itself already updated as different parameters.
It is wrong. The answer is yes. I will add comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got confused between load_ctx and reload_ctx. I need a time to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I understand. No updated is required here. reload_ctx() 's parameter is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only coding rules update ?
passing as const & does result in copy everytime the callback timer is called
if (ec) { | ||
throw std::runtime_error("Failed to load private key file: " + ec.message()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If certificate_reload_interval
is 0 then don't set timer.
certificate=broker.crt.pem | ||
private_key=broker.key.pem | ||
certificate=server.crt.pem | ||
private_key=server.key.pem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following comment should be added. Please update to better English :)
# Reload interval for the certificate and private key files (seconds)
# It is useful for Let's Encrypt certs update.
# If not set or set to 0, then no reload.
#
# certificate_reload_interval=3600
Ah, I misunderstood it is a part of broker.hpp, but actually it is |
I still need to check manually if the reloading is actually done |
Ok, after you finish the checking, please let me know. Even if |
It seems when I update only the key and not the cert, i get the following error: So you would have to update atomicly both the key and the cert: So you would have to update a symlink to a directory containing both the key and the cert. I wonder why letsencrypt does not do this. So maybe first load the certificate and private key in a temporary ssl context. And if succeed, update the context in the server. If fails: keep old context in the server ? |
I bundled now both the key and the certificate in a bundle.pem: Now the updating works. But I do see that it sometimes takes a while before the updated certificate is used. I think the following happens:
So there will be 1 more connection with the old certificate, before the new certificate is used. Or you need to restart the listening when the certificate is updated.
yes, it does |
ab91861
to
10cdb65
Compare
37a9082
to
9d87b9a
Compare
example/broker.cpp
Outdated
|
||
template<typename Server> | ||
void reload_ctx(Server& server, boost::asio::steady_timer& reload_timer, | ||
std::string const &certificate_filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string const &certificate_filename, | |
std::string const& certificate_filename, |
example/broker.cpp
Outdated
template<typename Server> | ||
void reload_ctx(Server& server, boost::asio::steady_timer& reload_timer, | ||
std::string const &certificate_filename, | ||
std::string const &key_filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::string const &key_filename, | |
std::string const& key_filename, |
example/broker.cpp
Outdated
reload_timer.async_wait([&server, &reload_timer, | ||
certificate_filename, key_filename, certificate_reload_interval, name](boost::system::error_code const &e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reload_timer.async_wait([&server, &reload_timer, | |
certificate_filename, key_filename, certificate_reload_interval, name](boost::system::error_code const &e) { | |
reload_timer.async_wait( | |
[&server, &reload_timer, certificate_filename, key_filename, certificate_reload_interval, name] | |
(boost::system::error_code const& e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main point is const &e
vs const& e
. But I also suggested change line breaks. Please update it.
Thank you for updating! |
Broker crashes if non-existing certificate file was specified and error code was not checked on loading