This repository has been archived by the owner on Sep 7, 2020. It is now read-only.

[WIP] Add search function with Algolia #768

Closed

Conversation

Contributor

@thangngoc89 thangngoc89 commented Sep 20, 2016

This PR is far from complete, but I want to get early feedback.

  • Build index
  • Publish index to Algolia
  • Lazy update
  • Front-end implementation

@MoOx @DavidWells

If you clone this branch and build the docs, you'll get a search-index.json. We can upload it to Algolia to create an index. I dumped the content of search-index.json here

Online demo: https://www.algolia.com/realtime-search-demo/phenomic-docs-search

"This probably means you are playing with fire."
)
}
const collection = PhenomicLoaderWebpackPlugin.collection
Contributor

Woah cool. Is this how you can access the entire collection via webpack?

Owner

Yeah (currently), I hope to offer a better API soon. That's what @thangngoc89 pointed out here #621 (comment)

@rayrutjes

Looks good @thangngoc89
Then it will mainly come down to the index configuration as well.

Also, when it comes to splitting up content based on the DOM structure, there are two strategies.

Minimum amount of records

Avoids records that contain an H1 alone, in favor of records that always go as deep as possible in the hierarchy.

Verbose

Creates a record for each level in the DOM structure.

The second option lets you display several result suggestions for the same page, so users can jump to a specific anchor of that page. It results in a lot more records, though.

The choice between options 1 and 2 comes down to the type of site. For documentation, you probably want users to be able to jump right into a section of a page. For a blog or a corporate website, you probably just want people to land at the top of the page.
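To make the two strategies concrete, here is a minimal sketch. The flattened `page` shape, the `slugify` helper, the `buildRecords` function and the `"minimal"`/`"verbose"` strategy names are all hypothetical, made up for illustration. This is one simplified reading of the description above, not how Algolia or the DocSearch crawler actually splits content.

```javascript
// Hypothetical input: a page flattened into heading and paragraph nodes.
const page = {
  url: "/docs/setup/",
  nodes: [
    { tag: "h1", text: "Setup" },
    { tag: "p", text: "How to install Phenomic." },
    { tag: "h2", text: "Requirements" },
    { tag: "p", text: "Node.js and npm." },
  ],
};

// Turn heading text into a URL anchor (simplified; real slug rules vary).
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// strategy "minimal": records for content nodes only, each carrying its full
// heading hierarchy. strategy "verbose": additionally one record per heading.
function buildRecords(page, strategy) {
  const records = [];
  const hierarchy = []; // hierarchy[0] is the current h1, hierarchy[1] the current h2, ...
  let anchor = "";
  for (const node of page.nodes) {
    const heading = /^h([1-6])$/.exec(node.tag);
    if (heading) {
      const depth = Number(heading[1]);
      hierarchy.length = depth - 1; // forget headings at this level or deeper
      hierarchy[depth - 1] = node.text;
      anchor = "#" + slugify(node.text);
      if (strategy === "verbose") {
        records.push({
          url: page.url + anchor,
          hierarchy: hierarchy.filter(Boolean),
          content: null, // heading-only record
        });
      }
    } else {
      records.push({
        url: page.url + anchor,
        hierarchy: hierarchy.filter(Boolean),
        content: node.text,
      });
    }
  }
  return records;
}

console.log(buildRecords(page, "minimal").length); // 2
console.log(buildRecords(page, "verbose").length); // 4
```

With the verbose strategy, the heading-only records give each section its own search hit, which is what makes "jump to this anchor" suggestions possible.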

Hope that helps, I'd love to find some time giving your solution a try, looks gorgeous!

@vvo

vvo commented Sep 22, 2016

Then it will mainly come down to the index configuration as well.

This is doable all via the API, using https://github.com/algolia/algoliasearch-client-js

@thangngoc89
Contributor Author

thangngoc89 commented Sep 22, 2016

@vvo thank you. That's what I'm looking for.

@rayrutjes so you're saying that if I don't need to let users access a specific anchor of the page, I can just use 1 record for each post? What about the 10KB per record limit?

@rayrutjes

so you're saying that if I don't need to let users access a specific anchor of the page, I can just use 1 record for each post?

Not exactly, because if you index 1 record per post you will hit the 10KB limit. We actually recommend staying under 3KB ;)
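As a rough illustration of staying under a size budget, the sketch below splits a post into several records by paragraph. The `post` shape, `recordSize`, `splitPost` and the 3000-byte default are assumptions for illustration. Only `objectID` is Algolia's actual record identifier attribute, and how Algolia measures record size exactly is not covered here.

```javascript
// Approximate a record's size as the UTF-8 byte length of its JSON form.
function recordSize(record) {
  return new TextEncoder().encode(JSON.stringify(record)).length;
}

// Split a post (hypothetical shape: { url, title, body }) into records that
// each stay under maxBytes, cutting only at paragraph boundaries.
// A single paragraph larger than maxBytes is emitted as-is in this sketch.
function splitPost(post, maxBytes = 3000) {
  const paragraphs = post.body.split(/\n{2,}/).filter((p) => p.trim() !== "");
  const records = [];
  const makeRecord = (parts) => ({
    objectID: `${post.url}#${records.length}`,
    url: post.url,
    title: post.title,
    content: parts.join("\n\n"),
  });

  let current = [];
  for (const paragraph of paragraphs) {
    const candidate = makeRecord([...current, paragraph]);
    if (current.length > 0 && recordSize(candidate) > maxBytes) {
      // Adding this paragraph would exceed the budget: close the record.
      records.push(makeRecord(current));
      current = [paragraph];
    } else {
      current.push(paragraph);
    }
  }
  if (current.length > 0) records.push(makeRecord(current));
  return records;
}

// Usage: ten 1000-character paragraphs do not fit in one 3000-byte record.
const examplePost = {
  url: "/posts/a/",
  title: "A",
  body: Array.from({ length: 10 }, () => "a".repeat(1000)).join("\n\n"),
};
console.log(splitPost(examplePost).map(recordSize));
```

Every record repeats `url` and `title`, so each chunk stays searchable and can link back to the post on its own.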

What I'm saying is that in documentation websites, it makes sense to have a record pointing toward an H1 anchor of a page, so having a record containing only the H1 makes sense there.

On non-documentation websites, you don't need those records because you are only interested in the global relevance of the result. So having the H1 be part of another record that contains its children is OK.

When splitting posts/pages into multiple records, you can use the DISTINCT feature of Algolia to only get back the most relevant record for a given post/page.
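A small sketch of what that looks like. `attributeForDistinct` and `distinct` are real Algolia index settings, but the choice of `url` as the grouping attribute is an assumption, and `keepMostRelevantPerGroup` is a hypothetical local function that only mimics the visible effect. Algolia performs the real deduplication server-side.

```javascript
// Index settings that group records by the page they came from.
// With the JavaScript client these would be passed to index.setSettings(settings).
const settings = {
  attributeForDistinct: "url", // assumption: every record of a page shares its url
  distinct: true,
};

// Local illustration only: given hits already sorted by relevance,
// keep the first (most relevant) hit for each value of the distinct attribute.
function keepMostRelevantPerGroup(rankedHits, attribute) {
  const seen = new Set();
  return rankedHits.filter((hit) => {
    if (seen.has(hit[attribute])) return false;
    seen.add(hit[attribute]);
    return true;
  });
}

const rankedHits = [
  { objectID: "/posts/a/#1", url: "/posts/a/" },
  { objectID: "/posts/a/#0", url: "/posts/a/" },
  { objectID: "/posts/b/#0", url: "/posts/b/" },
];
console.log(keepMostRelevantPerGroup(rankedHits, settings.attributeForDistinct));
```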

@thangngoc89
Contributor Author

@rayrutjes thank you. I got it. I think I'll go with docsearch-configs https://github.com/algolia/docsearch-configs. It's pretty flexible, and users can choose which selectors they want to index. I'm learning some Python to see how docsearch-crawler works.

@MoOx
Owner

MoOx commented Sep 22, 2016

@thangngoc89 I guess DocSearch makes sense for #769, maybe not for all websites, right?

Thanks @rayrutjes @vvo for your input, it's really nice of you :)

@vvo

vvo commented Sep 22, 2016

I guess DocSearch makes sense for #769, maybe not for all websites, right?

Yes, it only makes sense for the Phenomic website docs. We will provide free crawling and indexing infrastructure for the documentation, like we do for React and React Native.

At least not the DocSearch "SaaS", where Algolia configures and hosts your index for free. We do provide free DocSearch installs even for websites where the free plan would not fit, as long as the website qualifies as what we call a community website: a website aimed at developer communities.

But if $BIGCORP uses Phenomic, then we won't provide a free DocSearch install, for instance.

Still, you could take the DocSearch crawler and index format to achieve what we do, yes. That may be a bit overengineered for a JavaScript project, and you may not want to maintain Python programs either.

So a simple approach could be to just do a small index like you did @thangngoc89 and use react-instantsearch (in alpha, @thangngoc89 has it), to provide a simple set of components inside Phenomic.

@thangngoc89
Contributor Author

I'll keep the surface of the PR small: I won't change the index builder algorithm in this PR. I'll focus on completing all of the tasks above. We can add more complex algorithms and options to the index builder later.

Also, it's a good time for a monorepo, because other Node.js SSGs could use what I have written here to implement their own index builder for Algolia.

Furthermore, Phenomic will have 2 more commands:

# set index config to Phenomic's recipe
phenomic search algolia:config --masterKey (or whatever)

# publish index to Algolia 
phenomic search algolia:publish (lazy update option will come later) 

@MoOx MoOx added the wip label Nov 8, 2016
@MoOx
Owner

MoOx commented May 23, 2017

Closing as this PR has to be completely rewritten (new engine, API etc). #142 is still open.

@MoOx MoOx closed this May 23, 2017
@MoOx MoOx deleted the algolia branch May 23, 2017 03:39
5 participants