add semantic search news article #507

Merged
merged 16 commits into from
Nov 6, 2024

Conversation

robertdhayanturner
Collaborator

notebook article on semantic search, news

@robertdhayanturner robertdhayanturner self-assigned this Sep 26, 2024
@robertdhayanturner robertdhayanturner added the stage: style review PR under review for style guide compliance ( https://hub.superlinked.com/contributing ) label Sep 26, 2024
Comment on lines 99 to 100
# we need to handle the timestamp being set in milliseconds
business_news["date"] = [
Collaborator Author

@robertdhayanturner robertdhayanturner Oct 4, 2024

@morkapronczay
Does this mean the original timestamp is in milliseconds?
On Kaggle, the column looks like plain dates, e.g. "date": "2022-09-23".
As I understand it, a Unix timestamp is in seconds.

Contributor

Not sure about the Kaggle text. I might have altered it when I filtered the dataset for myself (probably it was automatically converted to a date when I read it in, and I wrote it out like that).

Collaborator Author

@morkapronczay
The Kaggle dataset contains just dates.
How should we discuss the conversion from milliseconds to seconds?
Or does the following snippet take care of the conversion to seconds whether or not your dataset has been read in as milliseconds:

# we need to handle the timestamp being set in milliseconds
business_news["date"] = [
    int(date.replace(tzinfo=timezone.utc).timestamp()) for date in business_news.date
]
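
For reference, a minimal standalone sketch of what that list comprehension computes. It assumes the `date` column holds datetime-like values (pandas `Timestamp` is a `datetime` subclass, so the same calls apply); the sample dates and the names `dates`, `seconds`, and `millis` are hypothetical and not taken from the notebook. `datetime.timestamp()` returns seconds since the Unix epoch as a float, so the `int(...)` result is in whole seconds, not milliseconds.

```python
from datetime import datetime, timezone

# Hypothetical sample values standing in for business_news.date
dates = [datetime(2022, 9, 23), datetime(2022, 9, 24)]

# Same expression as in the snippet under review:
# mark each naive date as UTC, then take the epoch time in whole seconds.
seconds = [int(d.replace(tzinfo=timezone.utc).timestamp()) for d in dates]

# For comparison only: the same instants expressed in milliseconds.
millis = [s * 1000 for s in seconds]

print(seconds)  # 10-digit values, i.e. seconds
print(millis)   # 13-digit values, i.e. milliseconds
```

If this assumption about the column type holds, the output is in seconds regardless of how the dates were stored when the file was read, which suggests the code comment about milliseconds may need rewording.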

Collaborator Author

@morkapronczay no rush on this question above, just pinging you for when you get time. Thanks!

docs/articles/semantic_search_news.md: 6 outdated review threads, resolved
@robertdhayanturner robertdhayanturner added stage: ready to publish PR is ready to be published on hub.superlinked.com and removed stage: style review PR under review for style guide compliance ( https://hub.superlinked.com/contributing ) labels Nov 6, 2024
@AruneshSingh AruneshSingh merged commit 9128a34 into main Nov 6, 2024
1 check passed
@AruneshSingh AruneshSingh deleted the semantic-search-article---new branch November 6, 2024 20:51
Labels
stage: ready to publish PR is ready to be published on hub.superlinked.com
3 participants