Summarize email thread #8508

ChristophWurst · 2023-06-01T06:42:28Z

Is your feature request related to a problem? Please describe.

As a user I receive long email threads and I want to efficiently get the gist of them without reading the details.

Describe the solution you'd like

Show a text box below (long) threads with a few sentences of summary of the messages above.

A reference concept can be seen at https://www.youtube.com/watch?v=6DaJVZBXETE&t=25s.

Implementation

- Add API to summarize text server#38578 (to be done by the Nextcloud integration team)
- Requires mockup and interaction specs from designers
  - Where to render the text
  - Starting at how many messages in a thread do we summarize? 1? 2?
- MVP
  - Make it work with integration_openai mainly because that is fast
  - Process summaries on demand when the user opens the thread (in a web request, no background processing)
  - Fetch thread message bodies in a service inside the Mail app and send the data for processing to https://docs.nextcloud.com/server/latest/developer_manual/digging_deeper/text_processing.html
- Cache processed threads
- Document Mail summary feature documentation#10917

An API will be provided to which we can send text and retrieve a summary. Therefore, the Mail app has to be changed to collect a thread's message texts, have the text summarized and show the results in the UI.
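
A minimal sketch of that flow, written in Python purely for illustration (the Mail app itself is not written in Python). The `Message` type and the `summarize` callable are hypothetical stand-ins: the real summary API and its signature are not specified in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Message:
    sender: str
    body: str


def summarize_thread(messages: Sequence[Message],
                     summarize: Callable[[str], str]) -> str:
    """Collect the thread's message texts and pass them to a summarizer."""
    # Join the non-empty bodies in thread order, separated by blank lines.
    text = "\n\n".join(m.body.strip() for m in messages if m.body.strip())
    return summarize(text)


def first_sentence(text: str) -> str:
    """Placeholder summarizer for the demo only: keeps the first sentence."""
    return text.split(".")[0].strip() + "."


thread = [
    Message("alice@example.com", "The invoice template is attached. Please review."),
    Message("bob@example.com", "Looks good to me."),
]
print(summarize_thread(thread, first_sentence))
# prints "The invoice template is attached."
```

Passing the summarizer in as a callable keeps the collection step independent of whichever provider ends up behind the API.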

Sync vs async processing

If the summary should show instantly, when loading the thread, the processing has to happen in the background and the result persisted with the thread data.
If the summary is generated async, the data can be processed on demand. Depending on the performance of the AI there can be a significant delay.

Opt-out

There will be people who don't want this feature. We should allow them to turn off the feature.

Describe alternatives you've considered

No response

Additional context

No response

ChristophWurst · 2023-06-01T12:11:18Z

Backend skills only required if the summary API is a PHP API. Then we need a small controller to invoke the API. If the API is exposed via OCS a frontender can query results directly.

ChristophWurst · 2023-06-29T17:26:54Z

The original idea was to do the processing in an AJAX request on-demand when the thread is loaded by the user. With the insights from nextcloud/server#38578 (comment) we'll have to remodel the architecture. It's not reasonable to process the summary on-demand.

The Mail app will continue to synchronize mailboxes in the background like it does now. As the last step of this process, threads are (re)built. We have to add logic to detect when threads are new or changed. Then we have to fire an async text processing task to build the summary and register a listener to process/store the result. The result can go into something like an oc_threads table.

If a thread changes, the previous summary has to be discarded immediately so we don't show an outdated summary to the user if they open the thread before the background task has finished.
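
The change detection and invalidation described above could be sketched roughly as follows. This is an in-memory illustration in Python, not the actual Mail app code: the `ThreadSummaries` class, its method names, and the use of a SHA-256 fingerprint over message IDs are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def fingerprint(message_ids: List[str]) -> str:
    """Hash the ordered message IDs so any change to the thread is detectable."""
    return hashlib.sha256("\n".join(message_ids).encode("utf-8")).hexdigest()


class ThreadSummaries:
    def __init__(self) -> None:
        # thread_id -> (fingerprint, summary)
        self._summaries: Dict[str, Tuple[str, str]] = {}
        # Queued background tasks as (thread_id, fingerprint) pairs.
        self.pending: List[Tuple[str, str]] = []

    def on_thread_synced(self, thread_id: str, message_ids: List[str]) -> None:
        """Called after the background sync has (re)built a thread."""
        fp = fingerprint(message_ids)
        stored = self._summaries.get(thread_id)
        if stored is not None and stored[0] == fp:
            return  # thread unchanged, keep the summary
        if (thread_id, fp) in self.pending:
            return  # a task for this exact version is already queued
        # Discard the stale summary immediately, then queue a new task.
        self._summaries.pop(thread_id, None)
        self.pending.append((thread_id, fp))

    def on_task_finished(self, thread_id: str, fp: str, summary: str) -> None:
        """Listener for finished tasks; ignores results for outdated versions."""
        queued = [f for t, f in self.pending if t == thread_id]
        if not queued or queued[-1] != fp:
            return
        self.pending = [(t, f) for t, f in self.pending if t != thread_id]
        self._summaries[thread_id] = (fp, summary)

    def get(self, thread_id: str) -> Optional[str]:
        stored = self._summaries.get(thread_id)
        return stored[1] if stored else None
```

With this shape, a user who opens a thread between the change and the finished task gets no summary rather than an outdated one.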

Storing thread summaries sounds expensive, so I'm thinking of limiting the feature to threads of x messages. E.g. only start to summarize if there are three or more messages.

@marcoambrosini also raised that if you have an organizational instance with emails sent between a group of people, the same summary might be processed n times. It's to be evaluated if an optimization is possible.
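
One possible shape for such an optimization, as a sketch only and not something that has been evaluated here: key the summary on a hash of the thread's Message-ID headers, so identical threads in different mailboxes share one result. The `SharedSummaryCache` class is hypothetical, and the sketch assumes every recipient sees the same set of messages with the same bodies, which does not hold in general (e.g. partial threads).

```python
import hashlib
from typing import Callable, Dict, List


class SharedSummaryCache:
    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize
        self._by_key: Dict[str, str] = {}
        self.calls = 0  # how often the summarizer actually ran

    @staticmethod
    def key(message_ids: List[str]) -> str:
        # Sort so the order in which a mailbox lists messages does not matter.
        joined = "\n".join(sorted(message_ids))
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    def get(self, message_ids: List[str], text: str) -> str:
        k = self.key(message_ids)
        if k not in self._by_key:
            self.calls += 1
            self._by_key[k] = self._summarize(text)
        return self._by_key[k]


cache = SharedSummaryCache(lambda text: text.upper())
cache.get(["<1@example.com>", "<2@example.com>"], "hello")  # first user
cache.get(["<2@example.com>", "<1@example.com>"], "hello")  # second user
print(cache.calls)  # prints 1
```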

nimishavijay · 2023-07-13T13:48:58Z

ChristophWurst · 2023-07-20T14:46:40Z

Attachments

What is the logic behind that? Does it show all attachments in the thread or does the LLM do some sort of selection?

nimishavijay · 2023-07-20T15:01:08Z

> What is the logic behind that? Does it show all attachments in the thread or does the LLM do some sort of selection?

Ideally, the LLM decides which attachments are relevant and shows those (e.g. if a colleague shared an updated invoicing template, that will be shown and not the original).

It would also be nice to show all attachments whose associated text is summarized and shown. For example:

- Alice shared the invoicing template
  [invoice_template.pdf]

- Bob shared the scripts and slides for the release presentation
  [script_hub5.md] [Frank's part .md] [Alice's part .md] [slides.pptx]

- Alice updated the invoice template
  [invoice_template (1).pdf]

Would any of those be in scope?

ChristophWurst · 2023-07-20T15:02:35Z

@DaphneMuller @marcelklehr would it? ^

ChristophWurst · 2023-07-20T15:04:54Z

How do we feed messages of a thread into the LLM so the LLM understands who said/shared/sent what? Do we feed attachments too?
E.g. in the example above the result contains information about who said what. If we only concatenate the message text without any sender information it won't be possible to generate such results.

Just thinking out loud.
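
To make the question concrete, here is one way sender and attachment information could be kept in the text that is sent for summarization. It is a sketch under assumptions: the `ThreadMessage` type, the `From:`/`Attachments:` labels, and the `---` separator are invented for illustration, and nothing in this issue says the model handles such a format well.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThreadMessage:
    sender: str
    body: str
    attachment_names: List[str] = field(default_factory=list)


def build_summary_input(messages: List[ThreadMessage]) -> str:
    """Prefix every message with its sender (and attachment names, if any)."""
    parts = []
    for m in messages:
        lines = [f"From: {m.sender}"]
        if m.attachment_names:
            lines.append("Attachments: " + ", ".join(m.attachment_names))
        lines.append(m.body.strip())
        parts.append("\n".join(lines))
    # A visible separator keeps message boundaries unambiguous.
    return "\n\n---\n\n".join(parts)


print(build_summary_input([
    ThreadMessage("Alice", "Here is the invoicing template.", ["invoice_template.pdf"]),
    ThreadMessage("Bob", "Thanks!"),
]))
```

Only attachment names are included here; feeding attachment contents would be a separate question.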

ChristophWurst · 2023-07-21T09:59:13Z

nextcloud/server#38578 is in.

@nimishavijay the LLM won't be able to provide links or lists; only plain text will be returned. Ref nextcloud/server#38578 (comment). We'll have to expect a simpler summarization for the first iteration.

ChristophWurst · 2023-07-21T10:15:42Z

Conversation summary heading

Could it be thread instead of conversation to avoid two terms for the same thing?

nimishavijay · 2023-07-21T12:46:29Z

> We'll have to expect a simpler summarization for the first iteration.

No worries, we can make design changes if needed based on the first version :)

> Could it be thread instead of conversation to avoid two terms for the same thing?

Works for me!

ChristophWurst · 2023-07-25T11:21:44Z

We will make the feature opt-in for admins so it doesn't put too much load on a system. Additionally we only summarize threads of three or more messages.
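
Expressed as a check, the rule could look like the sketch below. The function and constant names are hypothetical; only the admin opt-in and the three-message threshold come from the decision above.

```python
MIN_THREAD_MESSAGES = 3  # threshold stated above


def should_summarize(admin_enabled: bool, message_count: int) -> bool:
    """Summarize only if an admin opted in and the thread is long enough."""
    return admin_enabled and message_count >= MIN_THREAD_MESSAGES


print(should_summarize(True, 3))   # prints True
print(should_summarize(True, 2))   # prints False
print(should_summarize(False, 5))  # prints False
```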

hamza221 · 2023-07-26T14:56:42Z

After talking to @ChristophWurst and @jancborchardt, we agreed that actions are not really usable on the summary, so we're replacing them with a button to scroll to the newest message instead.

ChristophWurst · 2023-07-27T08:48:47Z

Short update on the text summarization experiments:

The local LLM of https://github.com/nextcloud/llm doesn't provide usable summaries at the moment. Short text stays as-is. Longer text processes for 20 minutes and runs into the process timeout.

We could look into OpenAI/ChatGPT instead for a proof of concept. The app doesn't support the text processing APIs yet: nextcloud/integration_openai#35.

DaphneMuller · 2023-07-27T08:59:05Z

We will probably support this in the OpenAI integration, but it has a lower priority than making Marcel's on-premise AI work.

marcelklehr · 2023-07-28T09:25:08Z

> The local LLM of nextcloud/llm doesn't provide usable summaries at the moment. Short text stays as-is. Longer text processes for 20 minutes and runs into the process timeout.

Timeout depends on the machine it runs on I'd guess. I've changed the model to a more up-to-date, lightweight one that should also give higher-quality output, increased the timeout and improved the summary algorithm. Also see nextcloud/server#39567 for a few fixes and improvements to the textprocessing feature.

ChristophWurst · 2023-08-09T16:01:25Z

MVP is in. Reopening for the follow-ups.

ChristophWurst · 2023-09-05T09:44:40Z

Planned follow-ups seem to be done.

ChristophWurst added the enhancement and 0. to triage labels Jun 1, 2023

ChristophWurst mentioned this issue Jun 1, 2023

Hub 6 nextcloud/groupware#70

Closed

10 tasks

ChristophWurst added the 1. to develop, design, blocked, skill:backend and skill:frontend labels and removed the 0. to triage label Jun 1, 2023

ChristophWurst mentioned this issue Jul 5, 2023

Introduce LanguageModel/TextProcessing OCP API nextcloud/server#38854

Merged

8 tasks

nimishavijay self-assigned this Jul 13, 2023

nimishavijay removed their assignment Jul 13, 2023

jancborchardt assigned nimishavijay and unassigned nimishavijay Jul 19, 2023

ChristophWurst mentioned this issue Jul 20, 2023

Add API to summarize text nextcloud/server#38578

Closed

ChristophWurst removed the blocked label Jul 21, 2023

ChristophWurst assigned hamza221 Jul 25, 2023

st3iny self-assigned this Jul 25, 2023

st3iny removed the 1. to develop label Jul 25, 2023

st3iny added the 2. developing label Jul 25, 2023

hamza221 mentioned this issue Jul 26, 2023

feat: Summarize email thread #8653

Merged

ChristophWurst mentioned this issue Aug 3, 2023

Document Mail summary feature nextcloud/documentation#10917

Closed

ChristophWurst added 3. to review and removed 2. developing labels Aug 8, 2023

ChristophWurst closed this as completed in #8653 Aug 9, 2023

ChristophWurst reopened this Aug 9, 2023

This was referenced Aug 10, 2023

Add unit tests for thread summary feature #8716

Merged

Add Caching to Thread summaries #8717

Merged

ChristophWurst mentioned this issue Aug 11, 2023

Ethical AI rating unclear with thread summaries #8719

Closed

ChristophWurst closed this as completed Sep 5, 2023

nimishavijay mentioned this issue Sep 8, 2023

Thread summaries "Nextcloud assistant" and "Go to newest message" have unbalanced padding #8837

Closed

ChristophWurst mentioned this issue Dec 1, 2023

Document limitations of Mail thread summaries nextcloud/documentation#11338

Closed
