Support fluentbit iso8601 time format for p_timestamp #685

Closed
sp0cket opened this issue Mar 5, 2024 · 2 comments

Comments

@sp0cket

sp0cket commented Mar 5, 2024

fluent-bit.conf

[OUTPUT]
    Name  http
    Match *
    host  127.0.0.1
    header Content-Type application/json
    header_tag X-P-Stream
    format json
    Port 8000
    URI /api/v1/ingest
    json_date_key p_timestamp
    json_date_format iso8601

I got this error:

Event Error: Stream Writer Failed: Schema error: Fail to merge schema field 'p_timestamp' because the from data_type = Utf8 does not equal Timestamp(Millisecond, None)

And I don't want to have an extra timestamp-like key for the time.
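
For context on the error: with json_date_format iso8601, Fluent Bit writes the date key as a string, which would explain why the value arrives typed as Utf8 rather than Timestamp(Millisecond, None). Below is a minimal Python sketch of that mismatch. The sample record and the helper name iso8601_to_millis are made up for illustration and are not taken from Fluent Bit or Parseable.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical record, shaped like what the http output above might send.
payload = '[{"p_timestamp": "2024-03-05T10:15:30.123456Z", "log": "hello"}]'

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso8601_to_millis(value: str) -> int:
    """Convert a UTC ISO 8601 string with fractional seconds to epoch milliseconds."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division on timedeltas avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


record = json.loads(payload)[0]
# The JSON value is a plain string, not a timestamp type.
print(type(record["p_timestamp"]).__name__)  # str
print(iso8601_to_millis(record["p_timestamp"]))
```

The conversion only shows what a millisecond timestamp for the same instant looks like. It is not a suggested workaround, since p_timestamp is reserved by the server.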

@nitisht
Member

nitisht commented Mar 5, 2024

p_timestamp is an internal field. If you want the log event's creation timestamp to be used for queries etc., #683 is the PR for that.

@sp0cket
Author

sp0cket commented Mar 5, 2024

This PR seems to be exactly what I want.

@nitisht nitisht added this to the Release v0.9.1 milestone Mar 5, 2024
nikhilsinhaparseable added a commit to nikhilsinhaparseable/parseable that referenced this issue Apr 20, 2024
…seablehq#683)

This PR adds an enhancement to use a user-provided timestamp for partitioning
when ingesting logs, instead of using server time.

The user needs to add the custom header X-P-Time-Partition (optional) at the
stream creation API to allow ingestion/query using a timestamp column from the
log data instead of the server time p_timestamp.

The time_partition field name is stored in stream.json and in the in-memory
STREAM_INFO in the ingest API. The server checks whether the timestamp column
name exists in the log event and, if not, throws an exception. It also checks
whether the timestamp value can be parsed into a datetime and, if not, throws
an exception.

The arrow file name gets the date, hr, mm from the timestamp field (if defined
in the stream); otherwise the file name gets the date, hr, mm from the server
time.

The parquet file name gets a random number attached to it. This is because a
lot of log data can have the same date, hr, mm value of the timestamp field,
and with this random number the parquet file will not get overwritten.

In the console, the query from and to dates are matched against the value of
the timestamp column of the log data (if defined in the stream); otherwise
they are matched against the p_timestamp column.

Fixes parseablehq#671 
Fixes parseablehq#685
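
The ingest-time checks described in the commit message can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the server's Rust code: the function names, the ISO 8601 input format, the "date=...hour=...minute=..." naming pattern, and the six-digit random suffix are all hypothetical choices made for the example.

```python
import random
from datetime import datetime, timezone
from typing import Optional


def event_time(event: dict, time_partition: Optional[str],
               server_time: datetime) -> datetime:
    """Pick the time used for partitioning one log event."""
    if time_partition is None:
        # No time partition defined on the stream: fall back to server time.
        return server_time
    if time_partition not in event:
        raise ValueError(f"field {time_partition!r} missing from log event")
    raw = event[time_partition]
    try:
        # Assumption: the value is ISO 8601; map a trailing Z to an offset.
        return datetime.fromisoformat(str(raw).replace("Z", "+00:00"))
    except ValueError as err:
        raise ValueError(
            f"field {time_partition!r} is not a datetime: {raw!r}") from err


def file_prefix(ts: datetime) -> str:
    """Date, hour and minute portion of a file name (illustrative pattern)."""
    return ts.strftime("date=%Y-%m-%d.hour=%H.minute=%M")


def parquet_name(ts: datetime, rng=random) -> str:
    """Append a random number so files for the same minute do not collide."""
    return f"{file_prefix(ts)}.{rng.randrange(10**6):06d}.parquet"


now = datetime.now(timezone.utc)
event = {"ts": "2024-03-05T10:15:30Z", "msg": "hello"}
print(file_prefix(event_time(event, "ts", now)))
print(parquet_name(event_time(event, None, now)))
```

A random suffix makes collisions unlikely rather than impossible, so a real implementation may need a stronger uniqueness guarantee than this sketch provides.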