[RFC] Email - Stage 1 Proposal #1219
We've recently removed Stage 4 from the RFC process. Updated proposal stages and their requirements: https://elastic.github.io/ecs/stages.html This PR initially targeted the now "legacy" stage 2, and I'm thinking we update it to target stage 1. Stage 1 (draft) fields will still be added as experimental fields, as was done with legacy stage 2 fields.
Thanks, @P1llus, for opening this. I listed some of the areas we want to go over and capture to move forward.
Field definitions
I suggest we replace any instance of `wildcard` with `keyword` since we've paused on introducing `wildcard` field support into ECS for now.
Can we create a directory using the RFC's number and add a YAML for the proposed set of field definitions? Here's an example: https://github.com/elastic/ecs/tree/master/rfcs/text/0009
Example data
We've added a couple of great examples from M365, and it'd be great to capture one or two more examples from other email data sources.
Concerns
Are there any new concerns that have come up to be captured? Or, do any of the existing concerns have updates or resolutions we can capture?
Feedback
Any individuals or teams we should ask for their feedback around the approach, fields, etc.?
The ECS team will review and determine the next steps to continue moving this forward. @jamiehynds, are you still willing to act as a sponsor?
Thanks @ebeahan - yes, I'll continue to sponsor the RFC.
rfcs/text/0010-email.md
Outdated
| `email.bcc` | keyword (array) | The email address(es) of the blind carbon copy (CC) recipient(s) | |
| `email.content_type` | keyword | Information about how the message is to be displayed. Typically a MIME type | |
| `email.message_id` | wildcard | Unique identifier for the email message that refers to a particular version of a particular message | |
| `email.reply_to` | keyword | Address that replies should be delivered to | |
It should be clarified that this is referring to RFC5322.ReplyTo and is not the same as Return-Path or RFC5321.MailFrom
Thanks @ericwbentley
So, something like this:
| `email.reply_to` | keyword | Stores the email address provided in the "Reply-To" originator field |
I would go as far as to explicitly state the RFC in the description to eliminate any confusion, or at least the name of the field in the headers.
| `email.reply_to` | keyword | Address that replies should be delivered to | | |
| `email.reply_to` | keyword | Address that replies should be delivered to (RFC5322.ReplyTo) | |
Or
| `email.reply_to` | keyword | Address that replies should be delivered to | | |
| `email.reply_to` | keyword | Address that replies should be delivered to (Reply-To)| |
Trying to improve and clarify this point in a856c0a. Let me know what you think.
rfcs/text/0010-email.md
Outdated
| `tls.*` | Used for TLS related information for the connection to for example a SMTP server over TLS | |
| `email.from` | keyword | Stores the `from` email address | |
It should be clarified that this is referring to the RFC5322.From or "Header From" address specifically
There should be another field for RFC5321.MailFrom "Envelope From" and/or Return-Path as RFC5321.MailFrom and RFC5322.From often differ
Thanks @ericwbentley
So, something like this:
| `email.from` | keyword | Stores the email address provided in the "From" originator field |
Reference: https://datatracker.ietf.org/doc/html/rfc5322#section-3.6.2
I'd reference the RFC in the description and add a second field for the envelope from address since they should be distinct from each other.
| `email.from` | keyword | Stores the `from` email address | | |
| `email.from` | keyword | Stores the header `from` email address (RFC5322.From) | | |
| `email.envelope_from` | keyword | Stores the envelope `from` email address (RFC5321.MailFrom) | |
@ericwbentley Makes sense to reference RFC 5322 - thanks for suggesting the distinction.
Like I put in the `Concerns` section, right now, this proposal focuses on the IMF from RFC5322. However, I like the idea of also introducing an `smtp` field set later that focuses on details of the protocol (and other email protocols, if helpful).
I see `smtp.*` fields pairing well with `email.*`, but I want to avoid increasing the scope here too much. Open to feedback, though.
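To illustrate the header-versus-envelope distinction discussed in this thread, here is a minimal sketch using Python's standard `email` package. It is only an illustration, not part of the proposal: the sample message and the `extract_sender_fields` helper are hypothetical, `email.envelope_from` is the name suggested in the review comment above rather than an agreed field, and the sketch assumes the receiving server recorded the RFC5321.MailFrom address in a `Return-Path` header.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; the addresses are made up for illustration.
RAW_MESSAGE = (
    "Return-Path: <bounce@mailer.example.net>\n"
    "From: Alice Example <alice@example.com>\n"
    "Reply-To: support@example.com\n"
    "To: bob@example.org\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)


def extract_sender_fields(raw):
    """Map sender-related headers onto the field names discussed in the thread."""
    msg = message_from_string(raw)
    fields = {
        # RFC5322.From ("Header From")
        "email.from": parseaddr(msg.get("From", ""))[1],
        # RFC5322 Reply-To originator field
        "email.reply_to": parseaddr(msg.get("Reply-To", ""))[1],
        # Assumption: Return-Path carries the RFC5321.MailFrom ("Envelope From")
        "email.envelope_from": parseaddr(msg.get("Return-Path", ""))[1],
    }
    # Drop fields whose header was missing or empty.
    return {name: value for name, value in fields.items() if value}


print(extract_sender_fields(RAW_MESSAGE))
```

In this sample the header From and the envelope sender differ, which is the case the reviewers want the field descriptions to make unambiguous.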
| `email.direction` | keyword | Direction of the message based on the sending and receiving domains | |
| `email.x_mailer` | keyword | What application was used to draft and send the original email. | |

### Additional event categorization allowed values
Can we consider adding another ID field, for example `email.local-id`? Exchange uses its own IDs in addition to message-id, for example `internal-message-id` or `network-message-id`. Other email servers will have their own session IDs as well.
What happened to this E-mail fieldset? #999 I see it was merged, but I don't see it in |
@peasead This picks up on the same work from #999 that @P1llus @jamiehynds started. @P1llus has been focused on other commitments, and I'm helping to move the proposal forward. The field changes proposed in #999 aren't anywhere in the schema yet. Once we have a consensus and merge this PR, the agreed changes will be added to the experimental schema.
Corrected
The
I added a nested |
Thanks, all, for the comments so far! I think I've addressed all the outstanding feedback. Would appreciate additional looks to see if this is set for stage one.
I think this looks good for a Stage 1.
LGTM 👍
I overlooked including field definition files for stage 1. I will include those with the stage 2 PR, but the proposed definitions are also now included in the experimental schema.
Hi @ebeahan, |
Thanks, Jamie. How would you like to see them reflected and where would do you think that the actual Something like this?
@peasead It's definitely on the right track and works for "from". However "to", "cc", and "bcc" can be a list of addresses and names, so maybe it would look like the email.attachments nested objects. I'm not sure about the naming standards. "to_addresses" seems long, but "tos" or "toes" doesn't seem right either.
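As a rough illustration of the point above, the sketch below parses multi-recipient headers into a list of name/address objects with Python's standard `email` package. The `recipients` helper, the sample message, and the `name`/`address` keys are hypothetical choices for this example only; the thread has not settled on a naming convention.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample message; the addresses are made up for illustration.
RAW_MESSAGE = (
    "From: Alice Example <alice@example.com>\n"
    "To: Bob Example <bob@example.org>, carol@example.org\n"
    "Cc: Dave Example <dave@example.net>\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)


def recipients(raw, header):
    """Return every address in the given header as a list of small objects."""
    msg = message_from_string(raw)
    # get_all returns one string per occurrence of the header (or the default).
    values = msg.get_all(header, [])
    return [
        {"name": name, "address": address}
        for name, address in getaddresses(values)
        if address
    ]


print(recipients(RAW_MESSAGE, "To"))
print(recipients(RAW_MESSAGE, "Cc"))
```

A single header label ("To", "Cc", "Bcc") can yield several entries, which is why a list of objects, similar in shape to the nested `email.attachments` objects mentioned above, is one way to represent it.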
@JamisonWhite @peasead In email the to/cc/bcc is always a singular label regardless of having multiples. I would vote not to include the |
Updated based on @wasserman feedback.
Thanks, @wasserman @JamisonWhite, for the feedback! I will incorporate the thinking from this discussion into the stage 2 PR #1593.
Moving the Email RFC to Stage 1
Preview of the RFC proposal
Criteria for Stage 1