rez-env Performance and socket.getfqdn() #617

mottosso · 2019-04-26T09:14:06Z

15% of the time spent calling rez env is spent calling socket.getfqdn(), under some circumstances.

Problem

This command is run alongside rez-env which gets you the toy.fritz.box part of the welcome banner.

$ rez env
resolved by manima@toy.fritz.box, on Fri Apr 26 09:56:24 2019, using Rez v2.29.1
...

However I found it can get quite costly, depending on your setup.

import time
import socket
t0 = time.time()
socket.getfqdn()
t1 = time.time()
print("%.2f seconds" % (t1 - t0))
# 0.27 seconds

That's on Windows 10 with wireless connection to router.

0.02 seconds

This on the other hand is on one connected via wire on another Windows 10 machine. I'm not entirely confident the numbers are based solely on wi-fi versus wired, but the former is my working environment which gets me a 0.27 delay whenever I enter an environment.

A Solution

Spontaneously, I would try and "bake" the value of that variable, like in an environment variable.

fqdn = os.getenv("REZ_CACHED_FQDN") or socket.getfqdn()

At least then there's the option of avoiding the penalty.

I'm also curious what your results are, and perhaps most importantly; do we need it? If so, does it need to be written in the terminal, or could it be included in logs etc? In which case maybe it could get called asynchronously.

Background

I found the issue by running Rez through PySpy.

$ set PYTHONPATH=%REZ_INSTALL_DIR%\Lib\site-packages\rez-2.29.0-py2.7.egg
$ c:\Python27\Scripts\py-spy.exe -- c:\Python27\python.exe -m rez.cli._main env -- exit

Collecting samples from 'c:\Python27\python.exe -m rez.cli._main env -- exit' (python v2.7.14)
Total Samples 100
GIL: 0.00%, Active: 100.00%, Threads: 1

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)
 77.78%  77.78%   0.700s    0.700s   meth (c:\Python27\lib\socket.py:228)
 17.78%  17.78%   0.260s    0.260s   getfqdn (c:\Python27\lib\socket.py:141)
  1.11%   1.11%   0.010s    0.010s   forward (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rez\vendor\yaml\reader.py:107)
  1.11%   1.11%   0.010s    0.010s   <module> (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rezplugins\package_repository\memory.py:7)
  1.11%   1.11%   0.010s    0.010s   exists (c:\Python27\lib\genericpath.py:26)
  1.11%   1.11%   0.010s    0.010s   find_module (c:\Python27\lib\pkgutil.py:186)
  0.00%   3.33%   0.000s    0.030s   get_plugin_class (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rez\plugin_managers.py:280)

The text was updated successfully, but these errors were encountered:

bpabel · 2019-04-26T15:56:13Z

For comparison, running socket.getfqdn() takes 4.5 seconds on my production machine!! And I'm not on wifi. TBH, maybe ditch that portion of the welcome banner entirely.

mottosso · 2019-04-26T16:14:02Z

You can do a quick patch in rez.__init__.py to get a sense of difference in time spent.

rez.__init__.py

...

import socket
socket._getfqdn = socket.getfqdn

def getfqdn():
  return "none.ya.beeswax"

socket.getfqdn = getfqdn

...

nerdvegas · 2019-04-26T23:04:14Z

That's nuts. On our prod network this is pretty much instant. What results do you get from `socket.gethostname`? We don't necessarily need this in the rez-env output I agree, however it isn't that simple. Rez-env is just printing out the current context, and contexts need this info, because contexts can be shared, and where the context was created is important info because that may have affected the resolve. Thx A ps - there's a hidden `--profile FILE` option available on all the cli tools. This will write out a cprofile file to the given path, which you can then view using your favorite cprofile viewer. I quite like https://pypi.org/project/RunSnakeRun/ - it looks like it's from the 90s, but that squaremap is wonderful.

…

On Sat, Apr 27, 2019 at 2:14 AM Marcus Ottosson ***@***.***> wrote: You can do a quick patch in rez.__init__.py to get a sense of difference in time spent. *rez.__init__.py* ... import socket socket._getfqdn = socket.getfqdn def getfqdn(): return "none.ya.beeswax" socket.getfqdn = getfqdn ... — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#617 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAMOUSRMMDOWD4X2EDMHTJDPSMS4ZANCNFSM4HIT4OQA> .

bpabel · 2019-04-26T23:26:21Z

socket.gethostname() is pretty much instant.

JeanChristopheMorinPerso · 2019-04-27T00:00:59Z

@bpabel @mottosso how long does it take to run a nslookup with your IP and same with your hostname? If everything is right with your network, it should be instant. If if it's not instant (or almost instant), something might be wrong with your network (more particularly DNS server).

mottosso · 2019-04-27T07:41:53Z

>>> timeit.timeit("socket.getfqdn()", setup="import socket", number=10)
2.576872299999998
>>> timeit.timeit("socket.gethostname()", setup="import socket", number=10)
0.001569600000010496

how long does it take to run a nslookup with your IP and same with your hostname?

How do I do that? :O

mottosso · 2019-04-27T07:44:52Z

Oh and also, I found what caused the slowdown on the above slow machine, my VPN. I'm connected from London to a company in Tokyo.

Without it, I'm getting near instant results.

>>> timeit.timeit("socket.getfqdn()", setup="import socket", number=10)
0.009264200000018263

bpabel · 2019-04-27T17:25:08Z

Regardless of how long it takes and what network changes can be made to improve the startup times, it seems like an unnecessary requirement to make network requests just to startup a Rez command.

nerdvegas · 2019-04-28T04:27:40Z

Brendan, see previous comment: *"""We don't necessarily need this in the rez-env output I agree, however it isn't that simple. Rez-env is just printing out the current context, and contexts need this info, because contexts can be shared, and where the context was created is important info because that may have affected the resolve."""* It seems like switching to gethostname() should do the trick in any case. Thx A

…

On Sun, Apr 28, 2019 at 3:25 AM Brendan Abel ***@***.***> wrote: Regardless of how long it takes and what network changes can be made to improve the startup times, it seems like an unnecessary requirement to make network requests just to startup a Rez command. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#617 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAMOUSXNLD2WDKYD6N3TJ63PSSD7LANCNFSM4HIT4OQA> .

bpabel · 2019-04-28T05:33:10Z

@nerdvegas It's still not clear to me why the context needs a fully qualified domain name. Why does the domain name have any bearing on what happens if a context is shared? When you say sharing contexts, are you talking about the feature that allows you to pass an .rxt file into rez-env? I just browsed through that code and it doesn't look like it's using the host to make any decisions about how to resolve the context.

Also, I'm willing to bet that the number of people actually sharing contexts can be counted on one hand (if there actually are any), so I don't find that use case a compelling reason not to improve the performance of a simple rez resolve, which is felt by every single person that uses rez.

nerdvegas · 2019-04-28T05:39:57Z

The fqdn is possibly more info than we need, which is why I suggested hostname instead. Some indication of where the context was created is necessary. So far it seems that gethostname is cheap, so there's not reason not to use it. Method shares contexts. Production tool envs are resolved on the fly, but we have an entire suite of tools available globally (which tend to be important utility tools that pipeline devs use). These are stored in suites (ie, shared contexts), because we don't need to update them often, and in that scenario contexts win out, because they're far more lightweight than running a resolve. When rez-env is run, a temporary context is created and is stored to a temp rxt file, pointed at by $REZ_RXT_FILE. It's feasible that someone somewhere would on occasion copy this rxt file and share it, so it has to be fully formed. Thus, a rez-env does actually need to establish what host that context was created on. A

…

On Sun, Apr 28, 2019 at 3:33 PM Brendan Abel ***@***.***> wrote: @nerdvegas <https://github.com/nerdvegas> It's still not clear to me why the context needs a fully qualified domain name. Why does the domain name have any bearing on what happens if a context is shared? When you say sharing contexts, are you talking about the feature that allows you to pass an .rxt file into rez-env? I just browsed through that code and it doesn't look like it's using the host to make any decisions about how to resolve the context. Also, I'm willing to bet that the number of people actually sharing contexts can be counted on one hand (if there actually are any), so I don't find that use case a compelling reason not to improve the performance of a simple rez resolve, which is felt by every single person that uses rez. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#617 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAMOUSQF7HW26RBEUC55ONTPSUZJPANCNFSM4HIT4OQA> .

bpabel · 2019-04-28T05:49:38Z

I agree, switching to hostname is probably enough to solve this issue.

But just for the sake of curiosity

It's feasible that someone somewhere
would on occasion copy this rxt file and share it, so it has to be fully
formed. Thus, a rez-env does actually need to establish what host that
context was created on.

I don't think rez-env actually does need to establish the host. What would happen, right now, if someone shared an rxt file from another host? As far as I can tell, rez doesn't use the host to make any decisions about how to load or resolve a context.

nerdvegas · 2019-04-28T05:54:38Z

Correct, rez doesn't use that info. It's there for debugging purposes. For example, a package might have late-bound requires that were affected by the host the resolve was run on (which would be bad practice, but I can't control what someone does in their late-bound package attributes). I would drop host info from contexts only as a last resort, hence the gethostname suggestion. A

…

On Sun, Apr 28, 2019 at 3:49 PM Brendan Abel ***@***.***> wrote: I agree, switching to hostname is probably enough to solve this issue. But just for the sake of curiosity It's feasible that someone somewhere would on occasion copy this rxt file and share it, so it has to be fully formed. Thus, a rez-env does actually need to establish what host that context was created on. I don't think rez-env actually does need to establish the host. What would happen, right now, if someone shared an rxt file from another host? As far as I can tell, rez doesn't use the host to make any decisions about how to load or resolve a context. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#617 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAMOUSSZU7HDRTCL7ESFDXDPSU3HJANCNFSM4HIT4OQA> .

JeanChristopheMorinPerso · 2019-04-28T11:27:13Z

I fear the hostname is not enough. Studios which have multiple sites, it's entire possible they will have multiple domain and so the the hostnames could be duplicated and cause confusion if fqdn is not used.

bpabel · 2019-04-28T15:56:24Z

@JeanChristopheMorinPerso In what scenario would you see this being confusing? In 99% of the cases, the host and fqdn are going to be the localhost.

@nerdvegas postulated a scenario where someone using a shared context could possibly write a package with a late-bound function that checked the host/fqdn where the context was created and make decisions about how the package loads based off the hostname/fqdn. While that's technically possible, I'd be surprised if anyone is actually doing that, and even if they were, there are almost certainly better ways of doing whatever they're doing based off other context metadata.

TBH, if having just the hostname in the context metadata would be confusing, I think that's more of an argument for removing the hostname entirely, not adding the fqdn. The fqdn and the hostname aren't used by rez (other than the welcome message).

I just can't envision a scenario where this would ever confuse someone.

nerdvegas · 2019-04-29T23:18:00Z

It's worth simply switching to gethostname to remove what appears to be a significant performance bottleneck for some people. If any studios require more info in their contexts, for debugging reasons or other, then I think it makes sense to add a new feature to allow for storing arbitrary metadata into contexts. There could be a global setting to default this to something, which would look something like this: # rezconfig.py def default_context_metadata(): import socket return { "fqdn": "socket.getfqdn()" } This approach would give the flexibility for studios to add the info they need, while removing any performance bottleneck for those who don't. A

…

On Mon, Apr 29, 2019 at 2:30 AM Marcus Ottosson ***@***.***> wrote: I'm with @bpabel <https://github.com/bpabel> on this one; these sound like the edges of edgecases. If you really need to universally and uniquely distinguish a location, then why not let people do that themselves via e.g. an environment variable REZ_UNIVERSAL_LOCATION_ID. Fast and unambiguous. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#617 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAMOUSX5WCSHMKZEPVL566LPSXGIVANCNFSM4HIT4OQA> .

mottosso · 2019-04-30T08:27:04Z

I thought of one more approach to this; caching.

The fqdn call could e.g. be made once per machine, and stored in a users home directory. That way, you'll get precision and speed all rolled into one, at the expense of the value eventually getting stale. I would imagine the value isn't something that changes very often, and when it does, one could simply delete the file. One could even set it up to get deleted on logout/login such that the cost is paid only once per boot.

nerdvegas · 2019-05-07T12:21:15Z

closed by https://github.com/nerdvegas/rez/releases/tag/2.30.0

bpabel mentioned this issue Apr 28, 2019

617 fqdn #621

Merged

nerdvegas closed this as completed May 7, 2019

ksg97031 mentioned this issue Feb 25, 2022

slow importing fortifyadmin/fortifyapi#34

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rez-env Performance and socket.getfqdn() #617

rez-env Performance and socket.getfqdn() #617

mottosso commented Apr 26, 2019

bpabel commented Apr 26, 2019

mottosso commented Apr 26, 2019

nerdvegas commented Apr 26, 2019 via email

bpabel commented Apr 26, 2019

JeanChristopheMorinPerso commented Apr 27, 2019 •

edited

Loading

mottosso commented Apr 27, 2019

mottosso commented Apr 27, 2019 •

edited

Loading

bpabel commented Apr 27, 2019

nerdvegas commented Apr 28, 2019 via email

bpabel commented Apr 28, 2019

nerdvegas commented Apr 28, 2019 via email

bpabel commented Apr 28, 2019

nerdvegas commented Apr 28, 2019 via email

JeanChristopheMorinPerso commented Apr 28, 2019

bpabel commented Apr 28, 2019

nerdvegas commented Apr 29, 2019 via email

mottosso commented Apr 30, 2019

nerdvegas commented May 7, 2019

rez-env Performance and socket.getfqdn() #617

rez-env Performance and socket.getfqdn() #617

Comments

mottosso commented Apr 26, 2019

Problem

A Solution

Background

bpabel commented Apr 26, 2019

mottosso commented Apr 26, 2019

nerdvegas commented Apr 26, 2019 via email

bpabel commented Apr 26, 2019

JeanChristopheMorinPerso commented Apr 27, 2019 • edited Loading

mottosso commented Apr 27, 2019

mottosso commented Apr 27, 2019 • edited Loading

bpabel commented Apr 27, 2019

nerdvegas commented Apr 28, 2019 via email

bpabel commented Apr 28, 2019

nerdvegas commented Apr 28, 2019 via email

bpabel commented Apr 28, 2019

nerdvegas commented Apr 28, 2019 via email

JeanChristopheMorinPerso commented Apr 28, 2019

bpabel commented Apr 28, 2019

nerdvegas commented Apr 29, 2019 via email

mottosso commented Apr 30, 2019

nerdvegas commented May 7, 2019

JeanChristopheMorinPerso commented Apr 27, 2019 •

edited

Loading

mottosso commented Apr 27, 2019 •

edited

Loading