Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds new_session_client method #100

Merged
merged 4 commits into from
Nov 7, 2016
Merged

adds new_session_client method #100

merged 4 commits into from
Nov 7, 2016

Conversation

marrony
Copy link
Contributor

@marrony marrony commented Oct 24, 2016

if observer is None:
observer = self.observer

return Client(self.domain, self.scheme, self.port, self.session.timeout, secret, observer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sharing the underlying http connection pool if you don't share session object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, and I don't think it's possible, because session object hold the headers with the authentication. I might be wrong though but I don't find good examples except ones that says it's thread safe but it's a different thing

@@ -91,6 +91,12 @@ def ping(self, scope=None, timeout=None):
"""
return self._execute("GET", "ping", query={"scope": scope, "timeout": timeout})

def new_session_client(self, secret, observer=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this is kind of a clunky title, and it's also not documented what this does or is for, so from both of these points, the user of this library will probably find this method a mystery.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anyone oppose, I'll use ruby driver as a reference to this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine by me, though I'm biased 😛

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, with_secret still a good name for this method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes.

@marrony marrony force-pushed the new_session_client branch from a503d9f to 1ef0bde Compare October 26, 2016 00:00
if observer is None:
observer = self.observer

return Client(self.domain, self.scheme, self.port, self.session.timeout, secret, observer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to rename this when #99 get closed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my previous question still applies here:

Are you sharing the underlying http connection pool if you don't share session object?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each session has it's own connection pool.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the thread pool? Is it the same for all http calls or each session object has its own?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I gather in the source code, it doesn't seem to have a thread pool. As far as I can tell, it just performs the request on the thread that made the request.

What are we trying to determine here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have explained that before.

The whole reason why we came up with this session client concept is that we could share the underlying http resources (connection pool and thread pool) with short live clients.

Those clients are usually used when you have a instance specific token that is required to validate a specific query.

In Java, for instance, if you just advise people to create new clients for those scenarios, they could run out or threads or file descriptors as each new client would create new pools.

What I'm trying to understand is if we are wasting resources or causing problems under high load by not sharing the underneath Session object between clients.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to share sessions between clients, we'll have to move the credentials to where we do each request, rather than setting it at the session level. We'll obviously also have to figure out where we want to put the shared session object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. That is true. Unless we follow the Scala approach, if that makes sense for python of course.

I guess what we need to figure out first is if we are causing any potential harm by creating several session objects. We need to understand how it uses its connection and thread pools, if any.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I was poking around and found some places that says it might not be a good idea to share the same Session object among multiple threads, at least for now.

There are some open discussions on GH
https://github.com/kennethreitz/requests/issues/1871
https://github.com/kennethreitz/requests/issues/2766

and some recommendations on SO
http://stackoverflow.com/questions/18188044/is-the-session-object-from-pythons-requests-library-thread-safe

this one was randomly found
http://toolbelt.readthedocs.io/en/latest/threading.html

A requests Session is documented as threadsafe but there are still a couple corner cases where it isn’t perfectly threadsafe. The best way to use a Session is to use one per thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is odd.

I was looking at the code for requests library. It seems like Session object is thread safe as long as we don't mutate its attributes after it's constructed.

It also feels weird to use a session per thread, as suggest by stack overflow, since each session will have its own connection pool. For short live threads, such as a http request, tcp connections won't be reused at all.

The default connection pool size is 10 by default. Which seems a little low to me.

IMO, the ideal implementation would be:

  • Provide a configuration so the users can change the number of connections as they initialize FaunaClient. Perhaps allow a little more than 10 by default?
  • Move the authentication to the request rather than the session object.
  • Share the session object between connections, making sure it's not mutated anywhere, so it reuses underlying tcp connections to fauna.

if observer is None:
observer = self.observer

return Client(self.domain, self.scheme, self.port, self.session.timeout, secret, observer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my previous question still applies here:

Are you sharing the underlying http connection pool if you don't share session object?

@@ -91,6 +91,21 @@ def ping(self, scope=None, timeout=None):
"""
return self._execute("GET", "ping", query={"scope": scope, "timeout": timeout})

def with_secret(self, secret, observer=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, when I read with_secret, I would expect the method to receive a lambda and perform the produced queries with a different secret.

That would avoid the need for a reference count in order to manage shared resources.

@marrony marrony force-pushed the new_session_client branch from 1ef0bde to 5187682 Compare November 2, 2016 21:51
@marrony marrony force-pushed the new_session_client branch from 5187682 to 5a6cf05 Compare November 3, 2016 15:46
@codecov-io
Copy link

codecov-io commented Nov 3, 2016

Current coverage is 99.79% (diff: 97.22%)

Merging #100 into master will increase coverage by 0.22%

@@             master       #100   diff @@
==========================================
  Files             9          9          
  Lines           462        489    +27   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            460        488    +28   
+ Misses            2          1     -1   
  Partials          0          0          

Powered by Codecov. Last update 397d587...a6c77bc

Callback that will be passed a :any:`RequestResult` after every completed request.
:return:
"""
if self.counter.get() > 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use increment here. There is a chance that get returns more than zero but immediately after some other thread decrements and close the session you're sharing with the new client.

@marrony
Copy link
Contributor Author

marrony commented Nov 3, 2016

do we have a verdict here on what should be the name? new_session_client or with_secret?

FYI: my last commit made the clients share the same Session object with a ref counter. About add an option to increase/decrease the threadpool, the only way I see was to add a new adapter, pretty much like this example https://github.com/kennethreitz/requests/blob/master/docs/user/advanced.rst#example-specific-ssl-version

@erickpintor
Copy link
Contributor

If you return a new client, I think new_session_client is a proper name. If you receive a lambda and just execute it on a contained method, with_secret seems better.

For the pool size, I think you can try to mount http and https to the same adapter but, with a different pool size.

def __str__(self):
return "Counter(%s)" % self.counter

def increment(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method not used anymore.

self.counter += 1
return self.counter

def get_and_increment(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a race condition in here:

  • Thread 1 calls decrement: decrements counter to 0 and close the session
  • Thread 2 calls get_and_increment: increments the counter to 1, returns 0, therefore fail to create a new session client
  • Thread 3 calls get_and_increment: increments the counter to 2, returns 1, allowing a new session client to be create with a closed session.

The knowledge of a live session must be composed by its ref count plus a flag that tells you if the session was closed already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you reasoning but, if decrement returns 0 means that no other client is holding this session anymore, also, at this point, if __del__ gets called means that this client is being garbage collected, hence, no client could possibly call new_session_client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Thread 1 would not decrement to zero if thread 2 and 3 are holding a client. Ignore me. :)

@marrony marrony force-pushed the new_session_client branch from ed07d7e to a9ffffe Compare November 3, 2016 19:19
except:
pass
if self.counter.decrement() == 0:
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try/except block was to attempt to handle some weird corner that came up in testing (I think it was when a test failed?), it's not a proper fix, and should be removed if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I really wanted to remove this but I was afraid to do, though, I removed and no exception was thrown

@marrony
Copy link
Contributor Author

marrony commented Nov 7, 2016

Can I have another review on this PR?

Copy link
Contributor

@erickpintor erickpintor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a little question.

})
self.session.timeout = timeout

self.auth = HTTPBasicAuth(secret, "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm: is this base64 encrypted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@marrony marrony merged commit 949c811 into master Nov 7, 2016
@marrony marrony deleted the new_session_client branch November 7, 2016 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants