-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhance(apps/analytics): add computation logic for participant activity performance #4409
Conversation
…into participant-activity-performance
Current Aviator status
This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.
See the real-time status of this PR on the
Aviator webapp.
Use the Aviator Chrome Extension
to see the status of your PR within GitHub.
|
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
Warning Rate limit exceeded@sjschlapbach has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 15 minutes and 25 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis pull request introduces a comprehensive enhancement to the analytics module for tracking participant activity performance. It includes the creation of a new database model Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
…nce computation to separate function
… script execution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (6)
apps/analytics/src/modules/__init__.py (1)
8-8
: Consider avoiding wildcard imports.
Using wildcard imports (e.g., from .participant_activity_performance import *) can lead to namespace pollution and hamper clarity about which functions are exported. It is often considered best practice to import specific functions or classes.🧰 Tools
🪛 Ruff (0.8.2)
8-8:
from .participant_activity_performance import *
used; unable to detect undefined names(F403)
apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py (1)
1-35
: Handle unknown activity types or invalid rows gracefully.
Currently, the function supports only "practiceQuizzes" or "microLearnings". If an unknown activity type is passed in, the function silently does nothing. Consider adding a default branch or error handling to account for unexpected input. Additionally, if the DataFrame row is missing keys like "activityId" or "participantId", an exception will occur. Paranoid checks could prevent potential runtime issues.apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py (1)
20-20
: Rename unused loop variable 'idx' to '_idx'.
The variable 'idx' is not used in the loop body. Renaming it to '_idx' or removing it would make the code clearer and address the linter's suggestion.Here's a sample diff:
- for idx, activity in df_activities.iterrows(): + for _idx, activity in df_activities.iterrows():🧰 Tools
🪛 Ruff (0.8.2)
20-20: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
packages/prisma/src/prisma/schema/quiz.prisma (1)
217-221
: Promote unified approach for activity performance tracking.Like in PracticeQuiz, the MicroLearning model also receives participantPerformances, possibly coexisting with performance and progress fields. Clarify if these older fields remain relevant or can be deprecated to avoid partial duplication of data.
apps/analytics/src/notebooks/participant_activity_performance.ipynb (1)
57-258
: Thorough logic, but watch out for large data sets.The loop over each course aggregates participant activity. For large courses, DataFrame concatenations may be memory-intensive. Consider chunking or an alternative approach if performance becomes a concern.
apps/analytics/src/modules/participant_activity_performance/__init__.py (1)
1-3
: Add explicit exports declaration.To address the static analysis warnings and make the module's public interface explicit, consider adding an
__all__
declaration:from .prepare_participant_activity_data import prepare_participant_activity_data from .save_participant_activity_performance import save_participant_activity_performance from .agg_participant_activity_performance import agg_participant_activity_performance + +__all__ = [ + "prepare_participant_activity_data", + "save_participant_activity_performance", + "agg_participant_activity_performance", +]🧰 Tools
🪛 Ruff (0.8.2)
1-1:
.prepare_participant_activity_data.prepare_participant_activity_data
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
2-2:
.save_participant_activity_performance.save_participant_activity_performance
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
3-3:
.agg_participant_activity_performance.agg_participant_activity_performance
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
apps/analytics/package.json
(1 hunks)apps/analytics/src/modules/__init__.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/__init__.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py
(1 hunks)apps/analytics/src/notebooks/participant_activity_performance.ipynb
(1 hunks)packages/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql
(1 hunks)packages/prisma/src/prisma/schema/analytics.prisma
(1 hunks)packages/prisma/src/prisma/schema/participant.prisma
(1 hunks)packages/prisma/src/prisma/schema/quiz.prisma
(2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
apps/analytics/src/modules/__init__.py
8-8: from .participant_activity_performance import *
used; unable to detect undefined names
(F403)
apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py
20-20: Loop control variable idx
not used within loop body
Rename unused idx
to _idx
(B007)
apps/analytics/src/modules/participant_activity_performance/__init__.py
1-1: .prepare_participant_activity_data.prepare_participant_activity_data
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .save_participant_activity_performance.save_participant_activity_performance
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .agg_participant_activity_performance.agg_participant_activity_performance
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
🔇 Additional comments (8)
apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py (2)
59-83
: Ensure consistent approach for 'empty' vs. 'non-empty' responses.
Currently, "empty" scenarios set completion to 0, while "non-empty" ones dynamically compute completion as responseCount / instanceCount. Ensure this approach aligns with all business rules for measuring participant "completion" in partial or borderline cases.
[approve]
8-17
: Validate grouping logic for large datasets.
When grouping a large number of responses, memory usage can grow significantly. Consider verifying that data volumes remain within acceptable levels or implementing chunked processing.
packages/prisma/src/prisma/schema/participant.prisma (1)
36-54
: Ensure consistency for newly consolidated performance fields.
These fields still reference multiple legacy relationships (e.g., questionResponses, feedbacks, etc.) alongside the new participantActivityPerformances. If the intent was to remove or refactor various participant relationships into the consolidated participantActivityPerformances, please verify that the old fields are indeed needed.
packages/prisma/src/prisma/schema/quiz.prisma (1)
32-36
: Confirm legacy performance fields vs. participantPerformances.
The addition of participantPerformances might overlap with existing fields such as performance or progress. Ensure these older fields (e.g., performance = ActivityPerformance?) are either retained intentionally or removed if they are superseded.
✅ Verification successful
Let me gather information about the relationship between these fields by examining the schema and related code.
Based on the code analysis, I can now provide a definitive response:
The fields serve different purposes and should be retained.
The codebase shows that:
participantPerformances
tracks individual participant performance metricsperformance
(ActivityPerformance) tracks aggregate performance metrics for the entire activity, including error rates and correct ratesprogress
(ActivityProgress) tracks activity completion statistics (started, completed, repeated counts)
Each field serves a distinct purpose in the analytics system, as evidenced by their separate usage in the frontend components (ActivityProgressPlot, PerformanceRates) and GraphQL operations. They are not redundant but rather complementary metrics tracking different aspects of activity engagement and performance.
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# First, let's look at the complete quiz.prisma schema to understand the context
cat packages/prisma/src/prisma/schema/quiz.prisma
# Then, let's search for any references to these fields in the codebase
rg "ActivityPerformance" --type ts -A 3
rg "ParticipantActivityPerformance" --type ts -A 3
rg "ActivityProgress" --type ts -A 3
Length of output: 32460
packages/prisma/src/prisma/schema/analytics.prisma (1)
251-268
: Double-check foreign key uniqueness constraints.
The ParticipantActivityPerformance model includes unique constraints on (participantId, practiceQuizId) and (participantId, microLearningId). This design makes sense for a single performance record per participant per quiz. However, ensure no concurrency issues arise if multiple upsert operations target the same record simultaneously.
apps/analytics/src/notebooks/participant_activity_performance.ipynb (2)
1-11
: Good descriptive introduction.
You provide a clear explanation of how completion and totalScore are calculated. The high-level purpose is well documented and helps future contributors.
260-267
: Proper resource cleanup.
Disconnecting from the database is a best practice to avoid lingering connections. This final cell is well-placed.
packages/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql (1)
19-26
: Consider the impact of CASCADE deletions.
The CASCADE deletion on foreign keys means that when a Participant, PracticeQuiz, or MicroLearning is deleted, all related performance records will be automatically deleted. Verify if this aligns with the data retention requirements.
✅ Verification successful
ON DELETE CASCADE is appropriate for this performance tracking system
The cascade deletion pattern is consistent throughout the application's data model, particularly for analytics and performance tracking. This is appropriate because:
- ParticipantActivityPerformance records are purely analytical data that have no value without their parent entities (Participant, PracticeQuiz, or MicroLearning).
- The pattern matches existing cascade deletions for similar analytics tables like:
- ParticipantPerformance
- InstancePerformance
- ActivityPerformance
- ParticipantAnalytics
While there are privacy-related files (mentions of privacy policy in CHANGELOG.md), there are no explicit data retention requirements that would conflict with the cascade deletion pattern. The cascade deletion ensures clean data management by automatically removing orphaned analytics data when the primary entities are deleted.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for data retention policies or requirements
fd . -e md -e txt | xargs rg -l "retention|GDPR|backup"
Length of output: 57
Script:
#!/bin/bash
# Let's check for any configuration files or documentation that might contain data retention policies
fd -e yaml -e yml -e json -e env -e config -e md | xargs rg -i "retention|gdpr|data.*policy|privacy"
# Also check for any schema definitions that might give us context about the data model
fd schema.prisma | xargs cat
# Let's also check for any existing cascade delete patterns in migrations
fd -e sql | xargs rg "ON DELETE CASCADE"
Length of output: 38075
.../analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py
Show resolved
Hide resolved
apps/analytics/src/notebooks/participant_activity_performance.ipynb
Outdated
Show resolved
Hide resolved
...s/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql
Show resolved
Hide resolved
...s/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql
Show resolved
Hide resolved
apps/analytics/src/notebooks/participant_activity_performance.ipynb
Dismissed
Show dismissed
Hide dismissed
Quality Gate passedIssues Measures |
Summary by CodeRabbit
New Features
Bug Fixes
Database Changes
Refactor