Search analytics #6019

Merged
merged 52 commits into from
Aug 7, 2019

Conversation

@dojutsu-user (Member) commented Jul 31, 2019

Closes #5967
WIP

@dojutsu-user dojutsu-user added the PR: work in progress Pull request is not ready for full review label Jul 31, 2019
@dojutsu-user dojutsu-user requested a review from a team July 31, 2019 14:47
@ericholscher (Member) left a comment:

This looks great! I haven't heavily reviewed the graphing. I'd like to keep to the same libraries we're already using, so if you can use the same stuff as the ad code, that would be great. I don't feel strongly though.

readthedocs/search/utils.py (outdated, resolved)
readthedocs/search/utils.py (outdated, resolved)
readthedocs/settings/base.py (outdated, resolved)
_('Query'),
max_length=4092,
)
count = models.PositiveIntegerField(
Member:

I'm wondering if we want more data here. Should we be storing an object each time a search happens? That way we can show the frequency of a search over time. Currently, this only tells us how many times a search has happened.

I think if we plan to delete the data every 3 months, we can probably store every search query with its own timestamp. I'm fine with shipping this initially though, before we start storing a lot more data.
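The trade-off being discussed can be illustrated in plain Python (a sketch with made-up data, not the PR's actual code): storing one record per search, each with its own timestamp, still lets you recover the running total, and additionally gives you frequency over time, which a single counter field cannot.

```python
from collections import Counter
from datetime import date

# Hypothetical per-event records: one (query, timestamp) row per search,
# instead of a single running counter per query string.
events = [
    ("search analytics", date(2019, 8, 1)),
    ("search analytics", date(2019, 8, 1)),
    ("search analytics", date(2019, 8, 2)),
]

# The total count is still recoverable...
total = sum(1 for query, _ in events if query == "search analytics")

# ...and so is frequency over time, which a bare counter cannot provide.
per_day = Counter(day for query, day in events if query == "search analytics")
```

With a periodic purge (e.g. every 3 months, as suggested above) the table stays bounded while keeping the time dimension.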

Member Author:

I am not sure what you meant by more data.

Storing a search object every time a search is made is a better idea. I realised that the graphs were wrong before, and going this way makes them correct and easier to build.

That way we can show the frequency of a search over time.

Can you expand on this more?

  • Do we want this to be selected by the user? For example, the user selects a date and we show them the frequency of searches over time for that day.
  • Or do we just show this for today/yesterday?

Member:

Just that we will be able to see when each search was done, every time it was done. The current modeling only shows the number of times a search was done, with no time data about individual searches.

@dojutsu-user dojutsu-user requested review from ericholscher and a team August 2, 2019 18:11
@dojutsu-user dojutsu-user self-assigned this Aug 2, 2019
@dojutsu-user (Member Author) commented Aug 3, 2019

@ericholscher
I have updated the PR.
I have also added a download button which allows project admins to download all the data in CSV format.

Screenshot - https://ibb.co/xzX54Ty

@dojutsu-user dojutsu-user changed the title [WIP] Search analytics Search analytics Aug 5, 2019
@dojutsu-user dojutsu-user removed the PR: work in progress Pull request is not ready for full review label Aug 5, 2019
@ericholscher (Member) left a comment:

Looks good with a few small nits. I'll go ahead and merge this to get the modeling shipped, but we should clean up some of these tidbits.

project_slug
)
# data for plotting the doughnut-chart
distribution_of_top_queries = SearchQuery.generate_distribution_of_top_queries(
Member:

I'm a little worried this will be slow in production after we have a lot of data, but we can deal with it then.


response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = f'attachment; filename="{file_name}"'
template = loader.get_template('projects/search_analytics/csv_data_template.txt')
Member:

Why are we writing this with a template instead of a CSV library?

verbose_name=_('Version'),
related_name='search_queries',
on_delete=models.CASCADE,
)
Member:

Not sure if we really even want to cascade these deletes. Is there a reason we don't want to store Version here as a string, so we can keep them forever even if a version is deleted?

.order_by('created_date')
.annotate(count=Count('id'))
.values_list('created_date', 'count')
)
Member:

This looks really slow. We will see in prod, hopefully it won't be an issue.



@app.task(queue='web')
def record_search_query(project_slug, version_slug, query, total_results):
Member:

Hrm yea, that seems less than ideal. We should probably think more about the right approach for "search as you type" -- probably adding Autocomplete vs. search as you type in some cases.


project_qs = Project.objects.filter(slug=project_slug)

if not project_qs.exists():
Member:

This should probably log a warning.
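A minimal sketch of the suggested guard (the function signature and messages are illustrative, not the PR's code): rather than silently dropping the record when the slug doesn't match a project, emit a warning so the failure is visible in logs.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("readthedocs.search.tasks")


def record_search_query(project_exists: bool, project_slug: str) -> bool:
    """Hypothetical guard: warn instead of failing silently on a bad slug."""
    if not project_exists:
        log.warning(
            "Not recording search query for unknown project: %s", project_slug
        )
        return False
    # ... create the SearchQuery row here ...
    return True
```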

@@ -0,0 +1,3 @@
serial_no,date_time,query
{% for row in data %}{{ forloop.counter }},"{{ row.0|addslashes }}","{{ row.1|addslashes }}"
{% endfor %}
Member:

Definitely we should use the csv library for this.
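For illustration, a stdlib-based sketch of the same output (the helper name is hypothetical): the `csv` module handles quoting and escaping itself, so the `addslashes` workaround in the template becomes unnecessary. The resulting string can be written into Django's `HttpResponse` unchanged.

```python
import csv
import io


def render_search_csv(rows):
    """Render (date_time, query) pairs as CSV; serial numbers are generated."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["serial_no", "date_time", "query"])
    for serial_no, (date_time, query) in enumerate(rows, start=1):
        # csv.writer quotes and escapes fields as needed, unlike the
        # hand-rolled template with addslashes.
        writer.writerow([serial_no, date_time, query])
    return buf.getvalue()
```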

@ericholscher ericholscher merged commit f9f6c53 into readthedocs:master Aug 7, 2019
@dojutsu-user dojutsu-user deleted the search-analytics branch August 7, 2019 19:03
@dojutsu-user dojutsu-user restored the search-analytics branch August 7, 2019 19:04
@dojutsu-user dojutsu-user deleted the search-analytics branch August 8, 2019 08:00

Successfully merging this pull request may close these issues.

Provide search analytics to users.
2 participants