Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogCleaner can not delete old logs #971

Closed
lingbin opened this issue Oct 10, 2023 · 4 comments · Fixed by #972
Closed

LogCleaner can not delete old logs #971

lingbin opened this issue Oct 10, 2023 · 4 comments · Fixed by #972
Labels
Milestone

Comments

@lingbin
Copy link
Contributor

lingbin commented Oct 10, 2023

glog version: 0.6.0

My simplified reproduction process is as follows:

  1. Old logs in '/tmp/glogs':
#ll -rt /tmp/glogs/
total 32776
-rw-r--r-- 1 root root 33554535 Oct  1 00:00 test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295
  1. Code that can reproduce the problem (simplified):
#include <string>

#include <glog/logging.h>

static constexpr const char* kLogPrefix = "test-cleaner";
static constexpr const char* kLogDirEnv = "LOG_DIR";
const std::string kLogContent(100, 'a');

int main() {
  // Init logging
  FLAGS_log_dir = std::getenv(kLogDirEnv);
  FLAGS_max_log_size = 32;  // 32 MB

  google::InitGoogleLogging(kLogPrefix);
  google::EnableLogCleaner(3);

  for (int i = 0; i < 1'000; ++i) {
    LOG(INFO) << kLogContent;
  }

  google::DisableLogCleaner();
  google::ShutdownGoogleLogging();
}
  1. Environment variables:LOG_DIR=/tmp/glogs

It is set above to delete files that are more than 3 days old, but they are not deleted.

@lingbin
Copy link
Contributor Author

lingbin commented Oct 10, 2023

I've traced the issue and it should be a bug of glog.

I will add the reason analysis later and submit a PR to fix it later.

@lingbin
Copy link
Contributor Author

lingbin commented Oct 10, 2023

Because FLAGS_log_dir specifies the output directory of the log, the following two methods should be completely equivalent. (Both are absolute paths and must be a directory ("dir" is explicitly used in the name of this flag))

FLAGS_log_dir=/tmp/glogs --> No '/' at the end
VS.
FLAGS_log_dir=/tmp/glogs/

But unfortunately, I found that both will indeed print logs to the "/tmp/glogs/" directory, but if there is no '/' at the end, LogCleaner cannot clean the logs correctly, ie, LogCleaner can only work correctly if the path specified by FLAGS_log_dir has '/' at the end.

  1. To find the cause, I first created some log files (following the naming rules of glog log files) and modified their mtime to a very old time.
#touch -mt 202310010000  /tmp/glogs/*

#ll -rt /tmp/glogs/
total 32776
-rw-r--r-- 1 root root 33554535 Oct  1 00:00 test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295
  1. Secondly, in order to track the code execution process, I hacked the glog code and added some output information. The output information in the two methods is as follows (only the required part is retained, we only need to care about the checking process of the "test-cleaner.28f8271ff892.root.log.INFO.20231010-130507.58195" file):
# ------------ FLAGS_log_dir=/tmp/glogs   (No '/' at the end)----------------
lingbin-debugging[Run:1383]. Into LogCleaner::Run(). base_filename_selected=0, base_filename=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[Run:1389]. dirs=[/tmp/glogs]
lingbin-debugging[GetOverdueLogNames:1416]. Into GetOverdueLogNames. days=3, log_directory=/tmp/glogs, base_filename=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[GetOverdueLogNames:1440]. Checking filepath=test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, possible_dir_delim=/, log_directory[log_directory.size() - 1]=s
lingbin-debugging[IsLogFromCurrentProject:1498]. Into IsLogFromCurrentProject(). filepath=test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, base_filename=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[IsLogFromCurrentProject:1527]. cleaned_base_filename=[/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.]
lingbin-debugging[IsLogFromCurrentProject:1533]. 'filepath' does not contain 'cleaned_base_filename', return
# ------------ FLAGS_log_dir=/tmp/glogs/    ----------------

lingbin-debugging[Run:1383]. Into LogCleaner::Run(). base_filename_selected=0, base_filename=/tmp/glogs//test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[Run:1389]. dirs=[/tmp/glogs/]
lingbin-debugging[GetOverdueLogNames:1416]. Into GetOverdueLogNames. days=3, log_directory=/tmp/glogs/, base_filename=/tmp/glogs//test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[GetOverdueLogNames:1440]. Checking filepath=test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, possible_dir_delim=/, log_directory[log_directory.size() - 1]=/
lingbin-debugging[GetOverdueLogNames:1462]. modified filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295
lingbin-debugging[IsLogFromCurrentProject:1498]. Into IsLogFromCurrentProject(). filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, base_filename=/tmp/glogs//test-cleaner.28f8271ff892.root.log.INFO., filename_extension=
lingbin-debugging[IsLogFromCurrentProject:1527]. cleaned_base_filename=[/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.]
lingbin-debugging[GetOverdueLogNames:1469]. Checking filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, isFromCurrProject=1
lingbin-debugging[IsLogLastModifiedOver:1597]. filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, last_modified_time=1696089600, current_time=1696915269, difftime=825669, configed_days=3
lingbin-debugging[GetOverdueLogNames:1476]. Checking filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295, isLogLastModifiedOver=1

lingbin-debugging[Run:1406]. Deleted /tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295

The biggest difference between the two is the value of filePath in line 6 above, which is also the reason for this bug.

# ------------ FLAGS_log_dir=/tmp/glogs   (No '/' at the end)----------------
Into IsLogFromCurrentProject(). filepath=test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295
   
VS

# ------------ FLAGS_log_dir=/tmp/glogs/    ---------------- 
Into IsLogFromCurrentProject(). filepath=/tmp/glogs/test-cleaner.28f8271ff892.root.log.INFO.20231001-000000.65295,

Short summary: here filePath should be a string with a path, not just the filename.

@lingbin
Copy link
Contributor Author

lingbin commented Oct 10, 2023

After further exploring the code, I found that the problem came from the following places:

glog/src/logging.cc

Lines 1430 to 1434 in b58718f

if (!log_directory.empty() &&
std::find(possible_dir_delim, dir_delim_end,
log_directory[log_directory.size() - 1]) != dir_delim_end) {
filepath = log_directory + filepath;
}

When FLAGS_log_dir=/tmp/glogs, the last character in log_directory="/tmp/glogs" is not '/', so the if condition here is not satisfied, and finally filepath will only contain A filename without a "path". In subsequent comparisons, it will be returned directly in the following line.

glog/src/logging.cc

Lines 1470 to 1473 in b58718f

// Return early if the filename doesn't start with `cleaned_base_filename`.
if (filepath.find(cleaned_base_filename) != 0) {
return false;
}

In order to solve this problem, we only need to modify the if statement above: we should ensure that filepath must eventually contain the log_directory part. That is to say: if log_directory is not empty, but the last character is not '/', then add one explicitly instead of ignoring it directly.

      if (!log_directory.empty()) {
        if (std::find(possible_dir_delim, dir_delim_end,
              log_directory[log_directory.size() - 1]) == dir_delim_end) {
          filepath = log_directory + '/' + filepath;
        } else {
          filepath = log_directory + filepath;
        }
      }

I will follow https://github.com/google/glog#how-to-contribute and submit a PR as soon as possible.

lingbin added a commit to lingbin/glog that referenced this issue Oct 10, 2023
@lingbin
Copy link
Contributor Author

lingbin commented Oct 10, 2023

Update:
With the fix above, my code will work correctly, but the unit test cleanup_with_absolute_prefix will fail with some debug information as follows:

lingbin-debugging[Run:1383]. Into LogCleaner::Run(). base_filename_selected=1, base_filename=test_cleanup_, filename_extension=.barfoo
lingbin-debugging[Run:1389]. dirs=['.']
lingbin-debugging[GetOverdueLogNames:1416]. Into GetOverdueLogNames. days=0, log_directory=., base_filename=test_cleanup_, filename_extension=.barfoo
lingbin-debugging[GetOverdueLogNames:1440]. Checking filepath=test_cleanup_20231010-221021.34375.barfoo, possible_dir_delim=/, log_directory[log_directory.size() - 1]=.
lingbin-debugging[GetOverdueLogNames:1450]. modified filepath=./test_cleanup_20231010-221021.34375.barfoo
lingbin-debugging[IsLogFromCurrentProject:1498]. Into IsLogFromCurrentProject(). filepath=./test_cleanup_20231010-221021.34375.barfoo, base_filename=test_cleanup_, filename_extension=.barfoo
lingbin-debugging[IsLogFromCurrentProject:1527]. cleaned_base_filename=[test_cleanup_]
lingbin-debugging[IsLogFromCurrentProject:1533]. 'filepath' does not contain 'cleaned_base_filename', return
lingbin-debugging[GetOverdueLogNames:1469]. Checking filepath=./test_cleanup_20231010-221021.34375.barfoo, isFromCurrProject=0

What needs to be noted here is that base_filename no longer contains "directory" information, but a value directly specified by the user. (In my reproduction code above, google::SetLogDestination() is not used, base_filename will contain "directory").

In this unit test, the following code is included:

google::SetLogDestination(GLOG_INFO, "test_cleanup_");

google::SetLogDestination(GLOG_INFO, "test_cleanup_");

After further exploration, I found that there are some difficulties in our current processing of directories, especially the judgment of whether it ends with '/', such as the following code:

glog/src/logging.cc

Lines 1388 to 1399 in b58718f

if (!base_filename_selected) {
dirs = GetLoggingDirectories();
} else {
size_t pos = base_filename.find_last_of(possible_dir_delim, string::npos,
sizeof(possible_dir_delim));
if (pos != string::npos) {
string dir = base_filename.substr(0, pos + 1);
dirs.push_back(dir);
} else {
dirs.emplace_back(".");
}
}

Whether FLAGS_log_dir ends with `'/'' is not handled consistently.

  • In line L1395, it is expected that dirs contains '/' (because pos+1 is used);
  • but in line L1397, '.' is used (it is not written as './' , that is, it will not end with '/');
  • In line L1389, whether the path added to dirs ends with '/' depends entirely on the user's own command line options.

However, regardless of the processing method here, I suggest that we should normalize the input value of FLAGS_log_dir when the user inputs. That is, in GetLoggingDirectories(), first ensure that the directory must end with '/', which can greatly reduce the trivial lines and complexity of subsequent processing.

This is exactly what I modified in the final PR I submitted. The problem I encountered will also be solved, and all existing unit tests will succeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants