Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CON-389] Content node log cleanup & reduction [pt 1] #3903

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion creator-node/src/apiHelpers.js
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ module.exports.handleResponseWithHeartbeat = (func) => {
}

const sendResponse = (module.exports.sendResponse = (req, res, resp) => {
const skipErrorLogStatusCodes = [403, 404] // skip logging response error
dmanjunath marked this conversation as resolved.
Show resolved Hide resolved
const duration = getDuration(req)
let logger = createChildLogger(req.logger, {
duration,
Expand All @@ -90,7 +91,9 @@ const sendResponse = (module.exports.sendResponse = (req, res, resp) => {
req.query
)
} else {
logger.info('Error processing request:', resp.object.error)
if (!skipErrorLogStatusCodes.includes(resp.statusCode)) {
logger.info('Error processing request:', resp.object.error)
}
}
}

Expand Down
9 changes: 4 additions & 5 deletions creator-node/src/fileManager.js
Original file line number Diff line number Diff line change
Expand Up @@ -194,7 +194,7 @@ async function fetchFileFromNetworkAndWriteToDisk({
}
},
logger,
logLabel: 'fetchFileFromTargetGatewayAndWriteToDisk',
log: false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this gets included in the saveFileForMultihashToFS decision tree

options: {
retries: numRetries,
minTimeout: 3000
Expand Down Expand Up @@ -479,7 +479,6 @@ async function saveFileForMultihashToFS(
async function removeTrackFolder({ logContext }, fileDir) {
const logger = genericLogger.child(logContext)
try {
logger.info(`Removing track folder at fileDir ${fileDir}...`)
if (!fileDir) {
throw new Error('Cannot remove null fileDir')
}
Expand Down Expand Up @@ -520,7 +519,6 @@ async function removeTrackFolder({ logContext }, fileDir) {
await fs.rmdir(curPath)
} else {
// Delete file inside /fileDir/
logger.info(`Removing ${curPath}`)
await fs.unlink(curPath)
}
}
Expand Down Expand Up @@ -651,8 +649,9 @@ function checkFileType(logger, { fileName, fileMimeType }) {
ALLOWED_UPLOAD_FILE_EXTENSIONS.includes(fileExtension) &&
AUDIO_MIME_TYPE_REGEX.test(fileMimeType)
) {
logger.info(`Filetype: ${fileExtension}`)
logger.info(`Mimetype: ${fileMimeType}`)
logger.info(
`fileManager#checkFileType - FileName: ${fileName}, Filetype: ${fileExtension}, Mimetype: ${fileMimeType}`
)
} else {
throw new Error(
`File type not accepted. Must be one of [${ALLOWED_UPLOAD_FILE_EXTENSIONS}] with mime type matching ${AUDIO_MIME_TYPE_REGEX}, got file ${fileExtension} with mime ${fileMimeType}`
Expand Down
3 changes: 2 additions & 1 deletion creator-node/src/monitors/MonitoringQueue.js
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ class MonitoringQueue {
}
})

this.logStatus('Completed')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this? and the Starting log on line 47?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i found it useful to find when the MonitoringQueue runs, otherwise we just don't know when the job starts and end. I was using this to debug #3971

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can tell this from bull dashboard right?
fine for now if you want it - how about debug level tho


done(null, {})
} catch (e) {
this.logStatus(`Error ${e}`)
Expand Down Expand Up @@ -84,7 +86,6 @@ class MonitoringQueue {
if (isFresh) return

const value = await monitor.func()
this.logStatus(`Computed value for ${monitor.name} ${value}`)

// Set the value
redis.set(key, value)
Expand Down
11 changes: 7 additions & 4 deletions creator-node/src/routes/files.js
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,7 @@ const getStoragePathQueryCacheKey = (path) => `storagePathQuery:${path}`

const logGetCIDDecisionTree = (decisionTree, req) => {
try {
req.logger.info(`[getCID] Decision Tree: ${JSON.stringify(decisionTree)}`)
req.logger.debug(`[getCID] Decision Tree: ${JSON.stringify(decisionTree)}`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is by far the largest win as far as loggly is concerned. basically we don't send these logs to loggly but they'll be around for local dev

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im gonna be removing these logs by adding prometheus metrics on this route anyway :D

} catch (e) {
req.logger.error(`[getCID] Decision Tree - Failed to print: ${e.message}`)
}
Expand Down Expand Up @@ -513,6 +513,7 @@ const getDirCID = async (req, res) => {
const dirCID = req.params.dirCID
const filename = req.params.filename
const path = `${dirCID}/${filename}`
const logPrefix = '[getDirCID]'

const cacheKey = getStoragePathQueryCacheKey(path)

Expand Down Expand Up @@ -542,10 +543,12 @@ const getDirCID = async (req, res) => {

// Attempt to stream file to client
try {
req.logger.info(`Retrieving ${storagePath} directly from filesystem`)
req.logger.info(
`${logPrefix} - Retrieving ${storagePath} directly from filesystem`
)
return await streamFromFileSystem(req, res, storagePath)
} catch (e) {
req.logger.info(`Failed to retrieve ${storagePath} from FS`)
req.logger.info(`${logPrefix} - Failed to retrieve ${storagePath} from FS`)
}

// Attempt to find and stream CID from other content nodes in the network
Expand All @@ -559,7 +562,7 @@ const getDirCID = async (req, res) => {
return await streamFromFileSystem(req, res, storagePath)
} catch (e) {
req.logger.error(
`Error calling findCIDInNetwork for path ${storagePath}`,
`${logPrefix} - Error calling findCIDInNetwork for path ${storagePath}`,
e
)
return sendResponse(req, res, errorResponseServerError(e.message))
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,9 @@ function makeOnCompleteCallback(
// Bull serializes the job result into redis, so we have to deserialize it into JSON
let jobResult: AnyDecoratedJobReturnValue
try {
logger.info(`Job successfully completed. Parsing result: ${resultString}`)
logger.debug(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was the gigantic log that printed stuff for all the bull queues

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to move to debug for now, i think we can just remove altogether tbh, since its already recorded in the bull dashboard
but our disk pressure isn't coming from logs so no concern there

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah actually @theoilie already made this change to remove the massive resultString separately

`Job successfully completed. Parsing result: ${resultString}`
)
jobResult = JSON.parse(resultString) || {}
} catch (e: any) {
logger.error(`Failed to parse job result string: ${e.message}`)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -363,7 +363,7 @@ async function _handleIssueSyncRequest({

return {
result: 'failure_issue_sync_request',
error: `${logMsgString} || Error issuing sync request: ${e.message}`,
error: `${logMsgString} || Error issuing sync request: ${e.message} - ${e.response?.data}`,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error msg isn't useful cause it's straight from axios, we see "Request failed with status code 500". one way to get more visibility for this is https://stackoverflow.com/questions/56577124/how-to-handle-500-error-message-with-axios

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've tried printing e.response.status and also did not get much context on the error. we can add this field to log and observe however

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i actually think some version of this is useful - its been a backlog item of mine to track the reason for issue_sync_request failure

this change is fine tho

syncReqsToEnqueue,
additionalSync
}
Expand Down
8 changes: 5 additions & 3 deletions creator-node/src/utils/cidUtils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -151,9 +151,11 @@ export async function findCIDInNetwork(
})
} catch (e: any) {
// Do not error and stop the flow of execution for functions that call it
logger.error(
`findCIDInNetwork error from ${endpoint} for ${cid} - ${e.message}`
)
// only log err message if legitimate error, not if content is not found
if (!e.message.includes('Content is not available'))
logger.error(
`findCIDInNetwork error from ${endpoint} for ${cid} - ${e.message}`
)
}
}

Expand Down
2 changes: 1 addition & 1 deletion creator-node/src/utils/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ export function verifySignature(data: any, sig: any) {

export async function timeout(ms: number, log = true) {
if (log) {
genericLogger.info(`starting timeout of ${ms}`)
genericLogger.debug(`starting timeout of ${ms}`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly considering just removing this. idk if this is useful for local dev. definitely not useful on stage or prod

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is anything is can be added as an event to a span

tracing.debug(`starting timeout of ${ms}`)

Copy link
Contributor Author

@dmanjunath dmanjunath Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonaylor89 oh yeah this would be great. do you want me to make this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 lets do it

}
return new Promise((resolve) => setTimeout(resolve, ms))
}
Expand Down