-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
682 dataset task list fetch data from all active resources #717
base: main
Are you sure you want to change the base?
682 dataset task list fetch data from all active resources #717
Conversation
WalkthroughThe pull request introduces significant modifications across several middleware files, focusing on enhancing issue handling and resource fetching. Key changes include the addition of new functions and the restructuring of existing ones to improve the accuracy and efficiency of issue tracking. New constants and updated SQL queries facilitate better filtering and categorisation of issues. Additionally, several test files have been updated to reflect these changes, ensuring the robustness of the middleware functions. Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
…set-task-list-fetch-data-from-all-active-resources
…set-task-list-fetch-data-from-all-active-resources
got the tasklist page working with all active endpoints
…set-task-list-fetch-data-from-all-active-resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (1)
test/unit/middleware/datasetTaskList.middleware.test.js (1)
61-70
: Extract test data setupConsider extracting common test data setup to reduce duplication and improve maintainability.
Apply this pattern:
const createTestRequest = (overrides = {}) => ({ parsedParams: { lpa: 'some-lpa', dataset: 'some-dataset' }, entities: ['entity1', 'entity2'], resources: [{ entry_count: 10 }], entryIssueCounts: [], entityIssueCounts: [], ...overrides })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/middleware/datasetTaskList.middleware.js
(3 hunks)src/services/performanceDbApi.js
(3 hunks)test/unit/middleware/datasetTaskList.middleware.test.js
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/services/performanceDbApi.js
🔇 Additional comments (3)
src/middleware/datasetTaskList.middleware.js (2)
169-179
: LGTM! Well-structured middleware chain
The middleware chain is properly ordered with dependencies, ensuring data is processed in the correct sequence.
53-57
:
Add validation for required request properties
The function assumes all required properties exist on the request object. Add validation to prevent potential runtime errors.
Apply this diff:
export const prepareTasks = (req, res, next) => {
+ if (!req.parsedParams?.lpa || !req.parsedParams?.dataset) {
+ return next(new Error('Missing required parsed parameters'))
+ }
+ if (!Array.isArray(req.entities) || !Array.isArray(req.resources)) {
+ return next(new Error('Missing or invalid entities/resources arrays'))
+ }
+ if (!Array.isArray(req.entryIssueCounts) || !Array.isArray(req.entityIssueCounts)) {
+ return next(new Error('Missing or invalid issue counts arrays'))
+ }
const { lpa, dataset } = req.parsedParams
const { entities, resources } = req
const { entryIssueCounts, entityIssueCounts } = req
Likely invalid or redundant comment.
test/unit/middleware/datasetTaskList.middleware.test.js (1)
96-98
:
Fix undefined mock responses
The test expects undefined responses from performanceDbApi.getTaskMessage
. Mock should return actual messages.
Apply this diff:
+ // Setup mock responses
+ vi.mocked(performanceDbApi.getTaskMessage)
+ .mockReturnValueOnce('1 issue found in field1')
+ .mockReturnValueOnce('1 issue found in field2')
expect(req.taskList).toEqual([
{
title: {
- text: undefined
+ text: '1 issue found in field1'
},
Likely invalid or redundant comment.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/middleware/common.middleware.js (1)
Line range hint
191-198
: Fix SQL injection vulnerability in fetchResources.The query uses string interpolation for parameters, which is unsafe.
Apply this diff to use parameterised queries:
export const fetchResources = fetchMany({ - query: ({ req }) => ` - SELECT DISTINCT rd.dataset, r.end_date, r.entry_date, r.mime_type, r.resource, r.start_date, rle.endpoint_url, rle.licence, rle.status, rle.latest_log_entry_date, rle.endpoint_entry_date from resource r - LEFT JOIN resource_organisation ro ON ro.resource = r.resource - LEFT JOIN resource_dataset rd ON rd.resource = r.resource - LEFT JOIN reporting_latest_endpoints rle ON r.resource = rle.resource - WHERE ro.organisation = '${req.params.lpa}' - AND rd.dataset = '${req.params.dataset}' - AND r.end_date = '' - ORDER BY start_date desc`, + query: ({ req }) => ({ + text: ` + SELECT DISTINCT rd.dataset, r.end_date, r.entry_date, r.mime_type, r.resource, + r.start_date, rle.endpoint_url, rle.licence, rle.status, + rle.latest_log_entry_date, rle.endpoint_entry_date + FROM resource r + LEFT JOIN resource_organisation ro ON ro.resource = r.resource + LEFT JOIN resource_dataset rd ON rd.resource = r.resource + LEFT JOIN reporting_latest_endpoints rle ON r.resource = rle.resource + WHERE ro.organisation = $1 + AND rd.dataset = $2 + AND r.end_date = '' + ORDER BY start_date desc`, + values: [req.params.lpa, req.params.dataset] + }), result: 'resources' })
🧹 Nitpick comments (3)
src/middleware/datasetTaskList.middleware.js (1)
82-88
: Improve error handling for task message generation.The error handling could be more robust by including more context in the error message and considering retry logic.
Apply this diff to enhance error handling:
let title try { title = performanceDbApi.getTaskMessage({ num_issues: count, rowCount, field, issue_type: type }) } catch (e) { - logger.warn('datasetTaskList::prepareTasks could not get task title so setting to default', { error: e, params: { num_issues: count, rowCount, field, issue_type: type } }) + logger.error('Failed to generate task message', { + error: e, + context: { + function: 'prepareTasks', + params: { num_issues: count, rowCount, field, issue_type: type }, + stack: e.stack + }, + type: 'TaskMessageError' + }) title = `${count} issue${count > 1 ? 's' : ''} of type ${type}` }test/unit/middleware/datasetTaskList.middleware.test.js (2)
59-122
: Add test cases for error scenarios and edge cases.While the basic test coverage is good, consider adding these scenarios:
- Invalid request properties
- Empty resources array
- Null/undefined values in issue counts
Add these test cases:
it('handles invalid request properties', () => { const req = { parsedParams: { lpa: 'some-lpa' }, // missing dataset entities: [], resources: [], entryIssueCounts: [], entityIssueCounts: [] } const res = { status: vi.fn() } const next = vi.fn() prepareTasks(req, res, next) expect(next).toHaveBeenCalledWith(expect.any(Error)) expect(next.mock.calls[0][0].message).toBe('Missing required request properties') }) it('handles empty resources array', () => { const req = { parsedParams: { lpa: 'some-lpa', dataset: 'some-dataset' }, entities: ['entity1'], resources: [], entryIssueCounts: [{ field: 'field1', issue_type: 'issue-type1', count: 1 }], entityIssueCounts: [] } const res = { status: vi.fn() } const next = vi.fn() prepareTasks(req, res, next) expect(req.taskList[0].title.text).toBe('1 issue of type issue-type1') })
124-167
: Enhance test coverage for special issue types.The test should verify more aspects of the special issue type handling.
Add these assertions:
expect(performanceDbApi.getTaskMessage).toHaveBeenCalledWith({ issue_type: 'reference values are not unique', num_issues: 1, rowCount: 10, field: 'field1' }) + // Verify that rowCount is correctly taken from resources + expect(req.taskList[0].title.text).toBeDefined() + + // Verify the status tag + expect(req.taskList[0].status.tag).toEqual({ + classes: 'govuk-tag--yellow', + text: 'Needs fixing' + })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/middleware/common.middleware.js
(7 hunks)src/middleware/datasetTaskList.middleware.js
(3 hunks)test/unit/middleware/datasetTaskList.middleware.test.js
(2 hunks)
🔇 Additional comments (9)
src/middleware/datasetTaskList.middleware.js (4)
4-10
: LGTM! Well-structured imports and constants.
The imports are properly organised and the constant SPECIAL_ISSUE_TYPES
is appropriately defined at module level.
Also applies to: 41-41
Line range hint 110-120
: LGTM! Clean and focused template parameter preparation.
The function follows good practices with clear destructuring and object construction.
169-178
: LGTM! Well-structured middleware chain.
The middleware functions are arranged in a logical order, ensuring proper data flow and dependencies.
55-59
:
Add input validation for required request properties.
The function should validate the presence of required request properties before accessing them.
Apply this diff to add validation:
export const prepareTasks = (req, res, next) => {
+ if (!req.parsedParams?.lpa || !req.parsedParams?.dataset || !req.entities || !req.resources || !req.entryIssueCounts || !req.entityIssueCounts) {
+ return next(new Error('Missing required request properties'))
+ }
const { lpa, dataset } = req.parsedParams
const { entities, resources } = req
const { entryIssueCounts, entityIssueCounts } = req
Likely invalid or redundant comment.
test/unit/middleware/datasetTaskList.middleware.test.js (1)
191-266
: LGTM! Comprehensive error handling test coverage.
The test suite effectively covers various error scenarios including API errors, invalid issue types, and missing fields.
src/middleware/common.middleware.js (4)
530-538
: LGTM! Well-structured helper function.
The getMostRecentResources
helper function is well-implemented with clear logic for finding the most recent resources.
397-411
:
Fix SQL injection vulnerability in fetchEntityIssuesForFieldAndType.
The query is vulnerable to SQL injection through string interpolation.
Apply this diff to use parameterised queries:
const fetchEntityIssuesForFieldAndType = fetchMany({
- query: ({ req, params }) => {
- const issueTypeClause = params.issue_type ? `AND i.issue_type = '${params.issue_type}'` : ''
- const issueFieldClause = params.issue_field ? `AND field = '${params.issue_field}'` : ''
- return `
+ query: ({ req, params }) => {
+ const values = [req.resources[0].resource]
+ let paramCount = 2
+ const clauses = []
+
+ if (params.issue_type) {
+ clauses.push(`AND i.issue_type = $${paramCount++}`)
+ values.push(params.issue_type)
+ }
+ if (params.issue_field) {
+ clauses.push(`AND field = $${paramCount++}`)
+ values.push(params.issue_field)
+ }
+ values.push(req.params.dataset)
+
+ return {
+ text: `
select *
from issue i
LEFT JOIN issue_type it ON i.issue_type = it.issue_type
- WHERE resource = '${req.resources[0].resource}'
- ${issueTypeClause}
+ WHERE resource = $1
+ ${clauses.join(' ')}
AND it.responsibility = 'external'
AND it.severity = 'error'
- ${issueFieldClause}
- AND i.dataset = '${req.params.dataset}'
+ AND i.dataset = $${paramCount}
AND entity != ''
- `
+ `,
+ values
+ }
},
result: 'issues'
})
Likely invalid or redundant comment.
541-562
:
Fix SQL injection vulnerability in fetchEntryIssueCounts.
The query is vulnerable to SQL injection through string interpolation.
Apply this diff:
export const fetchEntryIssueCounts = fetchMany({
- query: ({ req }) => {
- const datasetClause = req.params.dataset ? `AND i.dataset = '${req.params.dataset}'` : ''
+ query: ({ req }) => {
+ const values = []
+ let paramCount = 1
+ const clauses = []
const mostRecentResources = getMostRecentResources(req.resources)
const resourceIds = Object.values(mostRecentResources).map(resource => resource.resource)
+ values.push(resourceIds)
+
+ if (req.params.dataset) {
+ clauses.push(`AND i.dataset = $${++paramCount}`)
+ values.push(req.params.dataset)
+ }
- return `
+ return {
+ text: `
select dataset, field, i.issue_type, COUNT(CONCAT(resource, line_number)) as count
from issue i
LEFT JOIN issue_type it ON i.issue_type = it.issue_type
- WHERE resource in ('${resourceIds.join("', '")}')
+ WHERE resource = ANY($1::text[])
AND (entity = '' OR entity is NULL)
AND it.responsibility = 'external'
AND it.severity = 'error'
- ${datasetClause}
+ ${clauses.join(' ')}
GROUP BY field, i.issue_type, dataset
- `
+ `,
+ values
+ }
},
result: 'entryIssueCounts'
})
Likely invalid or redundant comment.
202-225
:
Fix SQL injection vulnerability and improve error handling.
The function has two issues that need addressing:
- SQL injection vulnerability in the query
- Error handling could be more robust
Apply this diff to fix both issues:
export const addEntityCountsToResources = async (req, res, next) => {
const { resources } = req
const promises = resources.map(resource => {
- const query = `SELECT entry_count FROM dataset_resource WHERE resource = "${resource.resource}"`
+ const query = {
+ text: 'SELECT entry_count FROM dataset_resource WHERE resource = $1',
+ values: [resource.resource]
+ }
return datasette.runQuery(query, resource.dataset)
})
try {
const datasetResources = await Promise.all(promises).catch(error => {
logger.error('Failed to fetch dataset resources', { error, type: types.App })
throw error
})
+ const missingResources = []
req.resources = resources.map((resource, i) => {
+ const entryCount = datasetResources[i]?.formattedData[0]?.entry_count
+ if (entryCount === undefined) {
+ missingResources.push(resource.resource)
+ }
- return { ...resource, entry_count: datasetResources[i]?.formattedData[0]?.entry_count }
+ return { ...resource, entry_count: entryCount ?? 0 }
})
+ if (missingResources.length > 0) {
+ logger.warn('Missing entry counts for resources', {
+ resources: missingResources,
+ type: types.App
+ })
+ }
next()
} catch (error) {
logger.error('Error in addEntityCountsToResources', {
error,
+ stack: error.stack,
type: types.App
})
next(error)
}
}
Likely invalid or redundant comment.
…set-task-list-fetch-data-from-all-active-resources
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/middleware/entryIssueDetails.middleware.js (1)
2-2
: Consider breaking down the imports for better maintainabilityThe import statement is quite lengthy. Consider breaking it down into logical groups using multiple import statements. This would improve readability and make future modifications easier.
-import { createPaginationTemplateParams, fetchDatasetInfo, fetchEntryIssues, fetchOrgInfo, fetchResources, getErrorSummaryItems, getSetBaseSubPath, getSetDataRange, prepareIssueDetailsTemplateParams, show404IfPageNumberNotInRange, validateQueryParams } from './common.middleware.js' +// Data fetching utilities +import { + fetchDatasetInfo, + fetchEntryIssues, + fetchOrgInfo, + fetchResources +} from './common.middleware.js' +// Template and pagination helpers +import { + createPaginationTemplateParams, + getErrorSummaryItems, + getSetBaseSubPath, + getSetDataRange, + prepareIssueDetailsTemplateParams, + show404IfPageNumberNotInRange +} from './common.middleware.js' +// Validation utilities +import { validateQueryParams } from './common.middleware.js'
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/middleware/entryIssueDetails.middleware.js
(3 hunks)
🔇 Additional comments (2)
src/middleware/entryIssueDetails.middleware.js (2)
64-68
: Proper implementation of empty checks
The code now correctly handles cases where entryIssues is empty or undefined, addressing the previous review comment.
125-125
: Appropriate placement in middleware chain
The fetchEntryIssues middleware is correctly positioned in the chain, ensuring that entry issues are available before they're needed by subsequent middleware functions.
const error = new Error('Missing required values on request object') | ||
error.status = 404 | ||
return next(error) | ||
} | ||
|
||
const issue = issues[pageNumber - 1] | ||
const issue = entryIssues[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add type validation for entry issues
The code assumes that the first entry issue has the expected structure. Consider adding type validation to ensure the object has the required properties.
+ if (!issue.entry_number || !issue.issue_type || typeof issue.line_number !== 'number') {
+ const error = new Error('Invalid entry issue structure')
+ error.status = 500
+ return next(error)
+ }
Committable suggestion skipped: line range outside the PR's diff.
|
||
try { | ||
const datasetResources = await Promise.all(promises).catch(error => { | ||
logger.error('Failed to fetch dataset resources', { error, type: types.App }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type: types.DataFetch
- including the exception instance in the info object doesn't really work - it's not a plain JS object and you need to extract the message and/or stack.
grep
forerrorMessage
anderrorStack
in the codebase for examples.
WHERE e.organisation_entity = ${req.orgInfo.entity} | ||
${issueTypeFilter} | ||
${issueFieldFilter}` | ||
select * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need everything (select * ...
) from that table? Even if we do, it's better to have that stated explicitly as a form of documentation.
const issueTypeClause = params.issue_type ? `AND i.issue_type = '${params.issue_type}'` : '' | ||
const issueFieldClause = params.issue_field ? `AND field = '${params.issue_field}'` : '' | ||
return ` | ||
select * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mentioned previously: select * ...
const mostRecentResourcesMap = {} | ||
resources.forEach(resource => { | ||
const currentRecent = mostRecentResourcesMap[resource.dataset] | ||
if (!currentRecent || new Date(currentRecent.start_date) < resource.start_date) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comparing (possibly invalid) date to a string here. JS will turn this into string comparison but one of the strings can be "Invalid date" etc. Use Date.getTime()
for comparison
* @param {*} res | ||
* @param {*} next | ||
* @returns { { templateParams: object }} | ||
* This function takes the request, response, and next middleware function as arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit verbose and states things that are true of almost any middleware. It could be shortened to Generates a list of tasks based on the issues found in the dataset
and details of what members of req object are used could be added to the @param
const promises = availableDatasets.map((dataset) => { | ||
return datasette.runQuery(query, dataset).catch(error => { | ||
logger.error('Query failed for dataset', { dataset, error, type: types.DataFetch }) | ||
return { formattedData: [] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, logging the exception object itself works differently than printing it in the console etc. You need to manually extract the data you want to see in the logs.
@@ -216,7 +216,7 @@ export default { | |||
} | |||
|
|||
let message | |||
if (entityCount && numIssues >= entityCount) { | |||
if (rowCount && numIssues >= rowCount) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the bare rowCount
to an explicit test (Number.isInteger(...)
or rowCount > 0
). Otherwise, if rowCount = 0
, this test will fail, and it may not be obvious it should.
What type of PR is this? (check all applicable)
Description
Updates the dataset task list to look at all active resources when looking at entity issues and only the latest resource when looking at entry issues
Related Tickets & Documents
QA Instructions, Screenshots, Recordings
no styling or changes should be present. however the data we are fetching should be different.
Before:
After:
Added/updated tests?
We encourage you to keep the code coverage percentage at 80% and above.
QA sign off
Summary by CodeRabbit
Summary by CodeRabbit
New Features
entryIssueGroups
for enhanced error handling of data entry issues.Bug Fixes
Documentation
Tests