Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getOrCreateIndex method #2

Closed
7 tasks done
curquiza opened this issue Apr 9, 2020 · 12 comments
Closed
7 tasks done

getOrCreateIndex method #2

curquiza opened this issue Apr 9, 2020 · 12 comments

Comments

@curquiza
Copy link
Member

curquiza commented Apr 9, 2020

We are thinking about creating a getOrCreateIndex method.

Description

This method returns an Index object. It checks if the index already exists and if it's not, it creates the index.

How

By trying to create the index, and catching the exception. In every situation, only one HTTP call is done.

An example in ruby:

require 'meilisearch'

URL = 'http://127.0.0.1:7700'

def get_or_create_index(index_uid, primary_key = nil)
  client = MeiliSearch::Client.new(URL)

  begin
    index = client.create_index(uid: index_uid, primaryKey: primary_key)
  rescue MeiliSearch::HTTPError => e
    raise e unless e.message.include? 'index already exists'
    index = client.index(index_uid)
  end

  return index
end

index = get_or_create_index('indexUID')
puts index.show

(In ruby the index method is the equivalent of getIndex in other SDKs, and this method does not do any HTTP call)

This example is out of the MeiliSearch library. Of course, this method must be included in the client class.

Tests

This method must be tested:

  • Test with no index created before
  • Test with an index already existing
  • Test with a primary-key in parameter

Steps

Do it in:

Removed from the to-do list:

  • Go -> not relevant since the client is divided into sub-clients that take an index themselves. The main client does not take any index.

Feedback

Don't hesitate to suggest other ideas to implement the method, or if I forgot any tests 🙂

@alexisvisco
Copy link

In go it does not seem very relevant to me since the client is divided into sub-clients who take an index themselves. The main client does not take any index.

@curquiza
Copy link
Member Author

curquiza commented May 4, 2020

You're right, I update the issue! Thanks

@ppamorim
Copy link

ppamorim commented May 6, 2020

I really don't like the idea to compare the error message string. Would be nice to have a type of reason code or something.

@curquiza
Copy link
Member Author

What do you suggest to not to check the message? 🙂
The create_index method could raise an exception if the index already exists, but it also means the method would have to check the error message to know if the error from MeiliSearch is about index existing...
What I mean it's if the weird check is not done in the get_or_create_index it would have done in create_index.

@shokme
Copy link

shokme commented May 13, 2020

Meilisearch should have a way to check the existence of an index.
an endpoint like: 127.0.0.1:7700/indexes/{name}/exists

then, you just have to create a method hasIndex(string $name): bool

@eskombro
Copy link
Member

eskombro commented May 14, 2020

I agree with @ppamorim about comparing error messages, seems really unreliable.
I like your idea @shokme, we could make an issue in the MeiliSearch repo to see if that is an option that pleases everyone.

Also, as for now, it seems that the possibilities of providing a method like this are:

  1. Try to create the index, catch the error (if exists), check the error message to realize index already exists.
  2. Try to request the index, catch the error (if it doesn't exist), check the error message and realize the index doesn't exist yet.
  3. List the existing indexes, look for your index inside the list to know if it exists.
  4. ... ?

None of those options seem to be reliable or sustainable in my opinion. The problem is that if we do not provide a method like this one, every SDK user is forced to check in it's own code every time if the index exists, which is clearly not ideal

Can I request your POV @Kerollmops ?

@Kerollmops
Copy link
Member

Kerollmops commented May 14, 2020

I prefer the one-call way proposed by @curquiza instead of the two-calls way proposed by @shokme as this is prone to races. A user can run multiple clients at the same time that which asks if the index exists or not, all the clients will see that the index doesn't exist and tries to create it at the same time. All the clients that were trying to create the index except the one that succeeded will return an error.

However, the error reporting story of MeiliSearch is not that good, we must introduce error codes as I already explained to @ppamorim. A unique code that helps SDKs understand the reported error without having to do text comparison (which isn't a reliable solution). The documentation website will understand this unique code and redirect the user to the appropriate page.

In conclusion, I think that we should rely on string comparison as this is the safest and less race condition prone solution we have. We will improve the error reporting of the engine and you can track that in the related issue.

@bidoubiwa
Copy link
Contributor

I'm building this method in the javascript library and I have a question about the primaryKey key.

The primary Key is set only once, either by inference either manually by the user. With this in mind, imagine the following scenario:

Scenario 1: existing index without no primary key

I have an index myIndex that already exists, this index has no primaryKey.

Then, I call this method the following way.

getOrCreateIndex("myIndex", "id")

The primary key that is given here is useless because your method does not use this parameter unless the index does not exists.
This is not explicit, users could imagine that the primaryKey will be updated.

Scenario 2: existing index existing primary key

I have an index myIndex that already exists, this index has a primaryKey that is id.

Then, I call this method the following way.

getOrCreateIndex("myIndex", "newId")

The primary key that is given here is useless as the primary key already exists and can not be changed and for reason in scenario 1.

Proposition

Taken the following parameters into account:

  • You only need an uid to create an index
  • Options when creating will evolve and there will be more than just primaryKey
  • because stating primaryKey as a parameter name in this function is not explicit in what will be done with it

I suggest creating a second parameter which would be createOptions in which for the moment there will only be primaryKey

a call in javascript would look like this

await meili.getOrCreateIndex("uid", {
"primaryKey": "newId"
})

@curquiza
Copy link
Member Author

curquiza commented Jun 9, 2020

I don't see the difference between your first scenario and second scenario example 😅

But, I think (I'm not sure) I understood your suggestion in your conclusion.

You want as definition:

def get_or_create_index(index_uid, options = nil)

and as usage:

client.get_or_create_index('uid', {'primaryKey': 'id'})

Right?

I like this kind of idea, but I would improve your idea with this goal: staying consistent with the create_index method behavior.
It leads obviously to a better user experience since the methods are completely linked.

Look at the usage in JS:

// Create an index
const index = await client.createIndex({ uid: 'books' })
// Create an index and give the primary-key
const index = await client.createIndex({ uid: 'books', primaryKey: 'book_id' })

=> the getOrCreateIndex method should expect for an object with at least uid, as the createIndex method does.

For the ruby:

# Create an index
client.create_index('books')
# Create an index and give the primary-key
client.create_index({uid: 'books', primaryKey: 'book_id'})

=> the get_or_create_index method should expect for a string or a hash table (= a dict = an object) with at least the uid field.

For the PHP: same behavior than ruby.
For the Go: same behavior than the JS for the moment.

For the python, it's a little bit different... It does not have the same logic than the others 😞

# Create an index
client.create_index(uid='books')
# Create an index and give the primary-key
client.create_index(uid='books', primary_key='book_id')

Options when creating will evolve and there will be more than just primaryKey

This is not possible here...

A solution is:

  • to break the python, but not too much, and to provide the same behavior than in the @bidoubiwa comment:
client.create_index(uid='indexUID', options={ primaryKey: 'id' })
client.get_or_create_index(uid='indexUID', options={ primaryKey: 'id' })
  • not to break the python and create a get_or_create_index method with the same behavior than the current one

Need your opinion/approval

What do you think guys about:

  • staying consistent with the create_index method depending on the package?
  • breaking or not the python?

@bidoubiwa @eskombro

Personally, I'm okay with breaking the python package.

Outdated, see the comment below.

@curquiza curquiza added need approval A final choice is suggested need vote When several solutions are suggested and removed ready to implement need approval A final choice is suggested labels Jun 9, 2020
@bidoubiwa
Copy link
Contributor

bidoubiwa commented Jun 9, 2020

staying consistent with the create_method depending on the package?

I really like your suggestion of staying consistent. I'm not against breaking the python. The first parameter is still UID which is the most important.

@curquiza
Copy link
Member Author

curquiza commented Jun 10, 2020

After discussing with @bidoubiwa, here are the final solution we thought about:

For all the SDKs, the createIndex method should have this kind of prototype:

createIndex(uid as string, options as object)

Same for the new getOrCreateIndex method:

getOrCreateIndex(uid as string, options as object)

The pros:

  • all the SDKs follow the "rule" method_name(mandatory_param1, options_as_object) which is much better
  • for the JS, the createIndex is now consistent with the getIndex (that expect only a string and not an object)
  • all the SDKs are consistent (for the moment, each SDK has its own behavior concerning the create_index method)

The cons:

  • big break for all the SDKs. Not a big deal from my POV, we are at the beginning, it's now or never :)

@eskombro and @bidoubiwa I need your approval on it 🚀

@curquiza curquiza added need approval A final choice is suggested and removed need vote When several solutions are suggested labels Jun 10, 2020
@eskombro
Copy link
Member

I really like the idea of encouraging consistence and coherence between all SDKs and between this two different but close methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants