-
Notifications
You must be signed in to change notification settings - Fork 331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User session tracking #1114
User session tracking #1114
Conversation
dfc6462
to
5880ecc
Compare
5880ecc
to
0f92181
Compare
0f92181
to
98924a6
Compare
'summary_fields', | ||
'created', | ||
'modified', | ||
'session_id', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer it if we didn't make user's actual session IDs available to the public via the API. It seems like these could be used to potentially link searches back to individual users, or make it easier for attackers to spoof other people's sessions.
I propose storing SHA hashes of the session IDs in the database instead. That way we can still aggregate events by a unique session identifier, but we can't look up session IDs that are linked to a specific user's session without knowing the real session ID, which should only be available to the individual user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree. Considering current implementation of ACL (SessionEventAccess
class) it's a huge security issue.
def can_add(self, data): | ||
return True | ||
|
||
def can_change(self, obj, data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe make it so that you can only change an object if the session ID's match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
'summary_fields', | ||
'created', | ||
'modified', | ||
'session_id', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree. Considering current implementation of ACL (SessionEventAccess
class) it's a huge security issue.
model = models.SessionEvent | ||
|
||
def can_read(self, obj): | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to allow anybody access to our internal session tracking information?
I would say it should be push-only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we change SessionEvent
to store session hashes, does it matter if people can read the hashes?
def can_add(self, data): | ||
return True | ||
|
||
def can_change(self, obj, data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
class ValidSearchType(object): | ||
choices = [] | ||
|
||
def __init__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you initialize static class field in the __init__
method?
class ValidSearchType(object):
choices = [tp.value for tp in constances.EventType]
def __call__(self, value): | ||
if value not in self.choices: | ||
message = '%s is not a valid choice.' % value | ||
raise drf_serializers.ValidationError(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you could replace this whole class with simple function:
def valid_search_type(value):
try:
constants.EnumType(value)
except ValueError:
raise drf_serializers.ValidationError(...)
return {} | ||
|
||
def update(self, instance, validated_data): | ||
event_data = clean_search_data(copy.copy(instance.event_data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why copy is needed here?
} | ||
|
||
|
||
def append_list_to_list(source_data, target_data, label): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear what this function does. Docstring or some explanation is required.
Function name indicates common used util function, but instead it implements some context specific business logic.
Attempting to bring together #1111 and #1013 in a single solution to be flexible enough to capture any UI events we want to track, but initially focused on search.
Here's what it does so far:
/api/v1/events/
endpoint for creating UI eventsFor each user, each unique search is tracked, including query params, number of results returned, and subsequent clicks on next page, prev page, page size, and content item.
The resulting data structure for a search looks like the following: