Vectorize observe method and _correct_for_light_travel_time #526

Open · wants to merge 20 commits into base: master
Conversation

JoshPaterson
Contributor

This PR vectorizes mpc.comet_orbit and mpc.mpc_orbit while also allowing them to continue working the way they did previously for scalars.

I changed keplerlib.propagate so that returned arrays have shape (xyz, t, n) in light of the discussion in #520.

As far as I know, no further changes are necessary in mpc.py or keplerlib.py. Here are the problems I currently know about:

  • comets.at(t).xyz_position(ecliptic_frame) creates a broadcasting error in functions.mxv
  • earth.at(t).observe(comets) creates a broadcasting error in vectorlib._correct_for_light_travel_time

Should I try making changes directly where those problems are happening, or is there perhaps a different approach I'm not aware of?

@JoshPaterson JoshPaterson marked this pull request as draft January 2, 2021 06:42
@JoshPaterson JoshPaterson marked this pull request as ready for review January 2, 2021 06:43
@JoshPaterson JoshPaterson changed the title WIP Vectorize mpc.comet_orbit, mpc.mpc_orbit, etc. Vectorize mpc.comet_orbit, mpc.mpc_orbit, etc. Jan 2, 2021
@brandon-rhodes
Member

Should I try making changes directly where those problems are happening, or is there perhaps a different approach I'm not aware of?

It's hard to say without diving into each case. Could you create a little repro for each of them? Each could be turned into a little test case and we could solve those two problems before landing this PR.

@JoshPaterson
Contributor Author

I added some tests for multiple comets and multiple minor planets that trip up vectorlib._correct_for_light_travel_time.

@brandon-rhodes
Member

Thanks! I'll plan to take a look at them early next week, if I don't get to it this weekend.

@JoshPaterson
Contributor Author

This is the script I've been using with a debugger:

from skyfield.api import load
from skyfield.data import mpc
from skyfield.constants import GM_SUN_Pitjeva_2005_km3_s2 as GM_SUN

ts = load.timescale()

eph = load('de421.bsp')
sun = eph['sun']

with load.open(mpc.COMET_URL) as f:
    comets_df = mpc.load_comets_dataframe(f)

comets_df = (comets_df.sort_values('reference')
          .groupby('designation', as_index=False).last()
          .set_index('designation', drop=False))

observer = sun + mpc.comet_orbit(comets_df.iloc[:2], ts, GM_SUN)
time = ts.utc([2020, 2021, 2022, 2023])
comets = sun + mpc.comet_orbit(comets_df.iloc[20:25], ts, GM_SUN)

observer.at(time).observe(comets)

@JoshPaterson
Contributor Author

JoshPaterson commented Jan 3, 2021

I found it helpful to write down my reasoning for what should happen step-by-step in _correct_for_light_travel_time, so I thought it might be useful to you to read it.

The call to the observe method from the above script looks like this:

observer.at(time).observe(comets)

observer has length m, time has length t, and comets represents n objects. We should expect _correct_for_light_travel_time to return position, velocity, and light_time arrays with shapes (3, t, n, m), (3, t, n, m), and (t, n, m) respectively.

Once inside the function, t, whole, and tdb_fraction have shape (t, 1, 1), because they correspond to time. cposition and cvelocity have shape (3, t, 1, m) because they represent the position and velocity of each observer at time. tposition and tvelocity have shape (3, t, n, 1) because they represent the position and velocity of comets at time. distance, light_time, and delta should then have shape (t, n, m).
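This shape bookkeeping can be checked with a small NumPy sketch. The arrays below are placeholder zeros, and the last line merely stands in for the real light-time arithmetic; only the broadcast shapes matter here:

```python
import numpy as np

t, n, m = 4, 5, 2  # times, targets, observers

tdb_fraction = np.zeros((t, 1, 1))   # time-like quantities
cposition = np.zeros((3, t, 1, m))   # observer positions
tposition = np.zeros((3, t, n, 1))   # target positions

# The target-minus-observer difference broadcasts n against m:
separation = tposition - cposition
assert separation.shape == (3, t, n, m)

# Collapsing the leading xyz axis gives distance (and so light_time):
distance = np.sqrt((separation * separation).sum(axis=0))
assert distance.shape == (t, n, m)

# light_time has the full (t, n, m) shape, so the back-dated t2 does too:
t2 = tdb_fraction - distance  # placeholder for the light-time correction
assert t2.shape == (t, n, m)
```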

t2 is tricky. Unlike t, it is corrected for light travel time, so it will have the same shape as light_time: (t, n, m).

So far so good. Now we get to this line:

tposition, tvelocity, gcrs_position, message = target._at(t2)

We want to update tposition and tvelocity, which means we need arrays with shape (3, t, n, 1). Here is where I'm stuck. If t2 has shape (t, n, m), then wouldn't target._at(t2) return arrays with shape (t, n, m, n)? At the moment I can't figure out what should be done with target and t2 to produce arrays with shape (3, t, n, 1).

Edited to add:
I've thought about this some more. Now that t2 is different for each observer, that means tposition and tvelocity are also going to be different for each observer, so the number of dimensions in tposition and tvelocity should increase after executing this line so that their shapes are (3, t, n, m). We don't want the positions of all of the objects at all of the times, e.g., we only care about where comets[0] is at t2[:, 0, :], so this line should be a loop that calls _at() once for each object. I created the variables tposition_out and tvelocity_out for the versions of tposition and tvelocity with the extra dimension.

Inside of the loop, the argument used to call _at is a time with shape (t,m). _KeplerOrbit objects can't be propagated to a 2 dimensional array of times, so this array needs to be flattened before being given to target._at(), and then reshaped back to (t, m) before being placed into tposition_out and tvelocity_out.
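The flatten-and-reshape step can be sketched in isolation. `_at` below is just a stand-in for `target._at()` that insists on a 1-dimensional time array, the way `_KeplerOrbit` propagation does:

```python
import numpy as np

t, m = 4, 2

def _at(times_1d):
    # Stand-in for target._at(): accepts only a 1-D array of times
    # and returns a (3, len(times)) position array of zeros.
    assert times_1d.ndim == 1
    return np.zeros((3, len(times_1d)))

t2_slice = np.zeros((t, m))      # the (t, m) times for one object
flat = t2_slice.reshape(-1)      # flatten to shape (t*m,)
pos_flat = _at(flat)             # shape (3, t*m)
pos = pos_flat.reshape(3, t, m)  # back to (3, t, m) for tposition_out
assert pos.shape == (3, t, m)
```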

@JoshPaterson
Contributor Author

OK, I think that latest commit successfully vectorizes _correct_for_light_travel_time, but only when all the inputs are vectors, i.e., a _KeplerOrbit representing multiple objects observes another _KeplerOrbit representing multiple objects at multiple times. It needs to be tweaked so it will still work when they are all scalars or when just time is vectorized.

@JoshPaterson
Contributor Author

JoshPaterson commented Jan 5, 2021

The changes necessary to do this raise some interesting questions:

  • It might be nice to be able to iterate through _KeplerOrbit objects representing multiple orbits and get back objects representing a single orbit, so instead of this being necessary:
kepler = target.vector_functions[-1]
k_orbit = type(kepler)  # avoid import loop?
for i, (pos, vel, epoch) in enumerate(zip(kepler.position_at_epoch.au.T,
                                          kepler.velocity_at_epoch.au_per_d.T,
                                          kepler.epoch.tt)):
    target_i = target.vector_functions[0] + k_orbit(
        Distance(au=pos), Velocity(au_per_d=vel), ts.tt_jd(epoch),
        kepler.mu_au3_d2, center=10)

we could just have this:

for i, target_i in enumerate(target):
    ...
  • What assumptions can we make about using a vectorized skyfield object? I would suggest the following:
    • All of the astronomical objects in a single skyfield object should have the same center.
    • A VectorSum object should only contain one vectorized skyfield object. This would be a useful thing to assume or enforce if vectorized skyfield objects were to be iterables like I mentioned above
    • A vectorized skyfield object would usually be the last object added to a VectorSum, but as far as the math is concerned I don't think this needs to be explicitly enforced?
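A minimal sketch of the iteration protocol being proposed, using an invented stand-in class (not the real `_KeplerOrbit`) that stores per-orbit state column-wise:

```python
import numpy as np

class MultiOrbit:
    # Hypothetical stand-in for a _KeplerOrbit holding n orbits, with
    # per-orbit state stored as the columns of (3, n) arrays.
    def __init__(self, positions, velocities):
        self.positions = positions      # shape (3, n)
        self.velocities = velocities    # shape (3, n)

    def __iter__(self):
        # Yield a single-orbit object for each column.
        for i in range(self.positions.shape[1]):
            yield MultiOrbit(self.positions[:, i:i+1],
                             self.velocities[:, i:i+1])

orbits = MultiOrbit(np.zeros((3, 5)), np.zeros((3, 5)))
assert len(list(orbits)) == 5
```

With something like this in place, the verbose `zip()` over internal arrays shown above collapses to `for target_i in target: ...`.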

@JoshPaterson
Contributor Author

Though I suppose if it's going to be possible to observe multiple objects from a vectorized skyfield object, it would have to be possible for the resulting Astrometric object to have multiple centers.

@JoshPaterson
Contributor Author

OK, that last commit was the last patch necessary to allow the above script to run without errors.

@brandon-rhodes
Member

Thanks for looking into this problem! Here are my initial thoughts:

  1. This PR starts trying to solve one problem (the multi-Kepler object), then is forced to veer into an entirely different problem (making _correct_for_light_travel_time() work with extra dimensions), with the result that it makes a few too many changes at once. We should hit "pause" on this PR and have you start a new one that just tackles the light-time issue with a simple unit test that provides a vector input and looks for the right output. Once we have the light-time situation fixed, we can return to this PR with a more solid Skyfield behind it that's ready to support the full use case.
  2. So the rest of my comments will just focus on the light-time part.
  3. What is happening inside of the for loop? The light-time logic, I suspect, should just be calling at() methods (or _at() for now, which I think I did for efficiency?) without knowing any details about how positions are computed. Otherwise it won't work as newer ways of doing _at() are added.
  4. It looks like it’s reaching inside the vector function on the assumption that it's a vector sum (which isn't true of all vector functions in Skyfield?) and pulling off the final function. Why not just call _at() on the whole thing? What’s special about the last piece?
  5. Am I seeing correctly that it does not actually propagate the original orbit, but instead builds a whole new orbit using two new vectors? This puzzles me. Wouldn’t the original orbit give the same position given an input time, and do so without requiring a new object to be built?
  6. Does this mean that if this light-time function is used for, say, Jupiter, that instead of calling Jupiter’s own _at() for the back-dated position, that it will instead approximate Jupiter’s orbit as an ellipse, and use a position on that ellipse in place of the Jupiter position the ephemeris would produce? (Or, am I not yet supposed to be reading the code as Jupiter-ready, and it’s only meant for Kepler orbits right now? Apologies if I'm jumping ahead!)
  7. I only see vector functions 0 and -1. What about the ones in between? A satellite in orbit around Earth, for example, has 3 pieces, Barycenter→Earth/Moon Barycenter→Earth→Satellite.

Stepping back, I like the questions you are asking about vectors in general, that's a good starting point. Here's a first try at some responses:

What assumptions can we make about using a vectorized skyfield object? I would suggest the following: All of the astronomical objects in a single skyfield object should have the same center.

Yes, which sounds symmetric with the current design: the .center is always a single object, never a vector.

A VectorSum object should only contain one vectorized skyfield object. This would be a useful thing to assume or enforce if vectorized skyfield objects were to be iterables like I mentioned above

I would prefer that we not impose restrictions on the internals of the vector sum. While I suspect it's true that the vector sums we build in the near future will indeed have the internal property you outline here, I'm not sure that we'd want to ratchet down a requirement on them.

A vectorized skyfield object would usually be the last object added to a VectorSum, but as far as the math is concerned I don't think this needs to be explicitly enforced?

Agreed, I don't think this needs to be enforced. Vector sums should be treated by other code like simple vector functions that have at() and _at() like other vectors; other code shouldn't have opinions about how they generate positions.

Finally:

I think your questions about dimensions really go best to the heart of the problem; we might need to define more useful constraints on the dimensions users supply. To get that question underway, let's start with a test case for light-travel time. I'll go try writing one up, and then I'll post it as a comment here for us to discuss. Then, once we agree on a good shape for the problem, we can look at getting the light-travel routine ready for this problem!

@JoshPaterson JoshPaterson changed the title Vectorize mpc.comet_orbit, mpc.mpc_orbit, etc. Vectorize observe method and _correct_for_light_travel_time Jan 6, 2021
@JoshPaterson
Contributor Author

I moved the changes that were for vectorizing _KeplerOrbit creation into #532. I left the changes in _correct_for_light_travel_time here in this PR so they would stay with all of the discussion about them.

@brandon-rhodes
Member

Thanks for splitting this into two PRs!

The idea of a “vector of comets” (or minor planets, etc) introduces a novel semantic question for Skyfield. Imagine a vector function comets of length n. Everyone agrees that calling comets.at(t) for a scalar time t should return n positions, each of which gives the position of a different comet at the exact same time t.

But what if instead, t is not a scalar but also has length n?

  1. The result could be an array with shape 3, n, n giving the positions of every comet for every time in the array.
  2. Or, the rule could be that it applies each time to the corresponding comet — we could declare that in the scalar case, we were being friendly and broadcasting the single time across the n comets, but that otherwise the dimensions have to match because each time in a time array “belongs to” one comet.

I suspect that the second rule will be less surprising to users than the first rule. But, more importantly, if we adopt the first rule, then I think we make it impossible to apply n times to n comets pairwise, both for internal routines like light-travel time, and also for normal users who might want to do so. So I think there's a strong argument to be made that the second rule is the one that new "multi-body objects" should follow?

If someone really wanted n × n results, maybe we could try creating time arrays with shapes like 1, n? But that would probably break most time internals (like printing the times as strings etc); I haven't tried.
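The two rules correspond to two standard NumPy patterns; a sketch with plain 1-D arrays (arbitrary stand-in values, not orbital data):

```python
import numpy as np

n = 3
comet_params = np.arange(n, dtype=float)  # one parameter per comet
times = np.arange(n, dtype=float)         # an array of n times

# Rule 1: outer combination -- every comet at every time, shape (n, n).
every_pair = times[:, np.newaxis] - comet_params[np.newaxis, :]
assert every_pair.shape == (n, n)

# Rule 2: pairwise, NumPy-style broadcasting -- time i "belongs to"
# comet i, producing just n results.
pairwise = times - comet_params
assert pairwise.shape == (n,)
```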

@JoshPaterson
Contributor Author

if we adopt the first rule, then I think we make it impossible to apply n times to n comets pairwise

I am having trouble thinking of a use case for option 2 outside of _correct_for_light_travel_time. The downside of option 2 is that if you did want to generate an array of positions with shape (3, t, n) you would have to do it by looping over the times, but I think that would negate the potential speed increase from vectorizing the code. Also, it's already possible to generate an array like that in Skyfield as it exists today by looping over the target objects, so option 2 would just move people from looping over the objects they're interested in to looping over the times that they're interested in. On the other hand I think option 1 would be very useful for speeding up the example scripts in issues #490 and #496, and also in situations like generating the example plot in #511, and I think that kind of vectorization would be more useful in general.

Independent of whether we choose option 1 or 2, _correct_for_light_travel_time would still need to be able to use option 2. It's not so good to do this with a loop in _correct_for_light_travel_time as it currently is in this PR, so perhaps we could give _KeplerOrbit a private method for propagating in this way, since I believe it would only need to be used by the light travel calculation. It wouldn't be that hard to write a second version of the propagate function to do this.

I suspect that the second rule will be less surprising to users than the first rule.

Right now this PR attempts to vectorize observer.at(t).observe(target) with respect to observer, t, and target. Perhaps to keep things relatively simple we could vectorize it only with respect to t and target, but not observer. While the code for vectorizing all three is not much more complicated than vectorizing only t and target, the simpler version would be easier to explain to users. In that case the largest number of dimensions in a position array would be 3, t, n rather than 3, t, n, m.

@xmichaelx

Perhaps to keep things relatively simple we could vectorize it only with respect to t and target but not observer.

I believe that this would also be a pretty common use case for this functionality. At least coming from observation scheduling and results processing background this makes a lot of sense.

@brandon-rhodes
Member

A busy January has prevented me so far from following up with a knock-down drag-out essay here that would weigh every possible issue and reach a conclusion, so I'm instead going to try making a short and simple comment every day or two so that y’all are not kept waiting forever, and hope thereby to gradually answer the interesting points that you each have raised.

Today’s single point:

I do not envision Skyfield growing to contain any for loops in its code. Its front-page brag on the web site is that it uses NumPy array acceleration, and I suspect we will always keep its internal routines free of Python for loops because that would hide from users occasions on which we hadn't actually worked out a way to accelerate something with NumPy. Where acceleration winds up not being feasible, let’s instead share example code that puts the for loop out in user code. If something can't be done efficiently, I prefer to put users in control instead of our own code.

So, yeah. Single small points are definitely easier to write up; that one wasn't bad. I'll plan to continue to return to this issue a few times a week, and hopefully the comments will lead somewhere! Thanks for y'all's patience as I've not yet been forthcoming with a conclusion pronouncement :)

@brandon-rhodes
Member

A second thought is that I am not sure it is only comets that would need a second secret way to generate positions, if we break away from NumPy’s behavior and decree that n_comets._at(t_of_length_n) generates n×n positions instead of n positions. Wouldn’t it also affect stars, and planets, and Earth satellites?

Unless comets are a special case for a reason that I am not yet seeing, then all of the Skyfield vector functions — including those that might be written in the future — would need the secret method added that broadcasts like NumPy does when an array of n objects is asked for their position at n distinct times.

Which seems like a steep price in code and tests and future maintenance.

@JoshPaterson
Contributor Author

Ahh, I think up until now I've been a little confused when you talked about generating n×n positions as "break[ing] away from NumPy’s behavior". The way I've been thinking about it is that if we have n_comets._at(t_of_length_n), n_comets is like a row vector and t_of_length_n is like a column vector, so when we do any math with them the result is a 2d array. That's how the vectorization in #520 has been implemented, and it corresponds to option 1 from a few comments ago. It seems to me that this is also compatible with NumPy's behavior, and is not a breakaway, but an alternative way of thinking about what is happening. If I understand correctly, the way you've been thinking about it is as if both n_comets and t_of_length_n are row vectors and should therefore obey NumPy's broadcasting rules, and this corresponds to option 2 from a few comments ago. As far as I know Skyfield's existing behavior hasn't yet contradicted either of those conceptions, so they've been able to coexist, but now we've reached a point where it's necessary to choose between them, thus my confusion.

all of the Skyfield vector functions — including those that might be written in the future — would need the secret method added

Yes, that is what I was suggesting before, but I've since learned of a way to avoid this: suppose you have an array with shape 3, t, n that's been produced using option 1. If t == n, you can reduce it to the array that option 2 would have produced with np.diagonal(array, axis1=1, axis2=2). _correct_for_light_travel_time could use option 1 along with np.diagonal and would then not need another private method implementing option 2 on each object type.
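A small demonstration of the np.diagonal() trick, with arbitrary placeholder values standing in for positions:

```python
import numpy as np

n = 4
# Option-1 result: outer[:, i, j] = position of comet j at time i.
outer = np.arange(3 * n * n, dtype=float).reshape(3, n, n)

# Reduce it to the option-2 result: comet i observed at time i.
pairwise = np.diagonal(outer, axis1=1, axis2=2)
assert pairwise.shape == (3, n)

# Same result as picking out the diagonal by hand:
for i in range(n):
    assert (pairwise[:, i] == outer[:, i, i]).all()
```

One trade-off worth noting: this computes all n × n positions and then discards everything off the diagonal, so it buys API simplicity at the cost of roughly n times the arithmetic.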

Wouldn’t it also affect stars, and planets, and Earth satellites?

I think that's correct in that either option 1 or 2 should be the way vectorization works everywhere it's implemented. But I don't think we would have to implement vectorization for every object type.

If we do go ahead with option 2 I think we should evaluate whether the increase in complexity would be worthwhile. For the use cases I can think of I'm not sure it is, but there might be others I haven't thought of.

@nkelhos nkelhos mentioned this pull request Jan 20, 2021
@JoshPaterson
Contributor Author

One thing I want to mention is I hope I don't come across as adversarial in this discussion. I'm sure that whatever path you choose will turn out well and either way I'll be happy to help make it happen!

@brandon-rhodes
Member

Don’t worry, your comments have been helpful! And thanks for your patience — it will be another week or two before I am finally able to return to a normal schedule.

@JoshPaterson
Contributor Author

No worries! I've got lots to stay busy with :)

@brandon-rhodes
Member

I note that the new issue #539 also seems to ask for Skyfield to support arbitrary n×m vectorization combinations.

@JoshPaterson
Contributor Author

Perhaps these PRs having to do with vectorization could go into their own branch. Knowing more about what changes are necessary might make it easier to decide between the different kinds of vectorization.

@brandon-rhodes
Member

Perhaps these PRs having to do with vectorization could go into their own branch…

This might be an area where I’m not familiar enough with how folks use GitHub to know the advantages — are there operations you would like to perform that become possible, or easier, if instead of living on your own branch, they live on a branch in the main repository?

I did, by the way, spend time in late January looking at this, and as before ran into the fact that NumPy is simply backwards from how I would like a numeric library to work. So there are two possible approaches:

  1. Instead of ever using a * b, always call something like mul(a, b) where Skyfield would have its own mul() that matches dimensions starting at the top rather than the bottom. This approach already exists in routines like mxv() (matrix times vector) and mxm(). Thanks to the fact that einsum() lets you name exactly the axes over which your own routine needs to iterate, the “semantic” dimensions like x/y/z can remain as the top array dimensions, and the broadcasting happen down below them.
  2. Or there could be two tiers of Skyfield code: low-level code that does things the NumPy way and puts x/y/z as the bottom dimension, and then top-level arrays like position.km that put x/y/z at the top, and Skyfield’s position class could do a .T before calling something like _correct_for_light_travel_time() and then do a second .T before storing the results. That would be tolerable, as .T I don't think copies any data, it just returns a view that iterates in a different order across the exact same array.
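Approach 1 is the pattern already used by mxv() and mxm(): an einsum() call names the semantic axes explicitly, and the ellipsis lets broadcasting happen in the dimensions below them. A sketch in that spirit (written fresh for this example, not copied from Skyfield's source):

```python
import numpy as np

def mxv(M, v):
    # Matrix times vector with the "semantic" axes i, j kept at the TOP,
    # while any extra trailing dimensions (t, n, ...) broadcast freely.
    return np.einsum('ij...,j...->i...', M, v)

rotation = np.eye(3)             # identity rotation, just for the demo
positions = np.ones((3, 4, 5))   # xyz on top, then t and n beneath
rotated = mxv(rotation, positions)
assert rotated.shape == (3, 4, 5)
assert (rotated == positions).all()  # identity leaves the vectors alone
```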

While traveling a couple of weeks ago (quick drive to visit parents, whom I hadn't seen in a year), I did get far enough into _correct_for_light_travel_time() myself to run again into the above two choices, and since then have been pondering — or maybe just avoiding, because the code will forever after be a bit slower, or harder to read, or both.

But as I've just finished up, and published yesterday, the Ptolemy medieval cosmos model that I promised long ago in a PyCon Ireland talk:

https://rhodesmill.org/ptolemy/

— and as I just reached the chapter in re-reading The Lord of the Rings where Aragorn has to decide whether to go east or west from Rauros, it’s time to return to _correct_for_light_travel_time() and decide between one course and the other.

Actually, here’s what I should do: I should not choose, but instead implement a vectorized _correct_for_light_travel_time() both ways, taking both approaches all the way through to a full solution. Then (a) I don't have to choose yet, so I can get started immediately, and (b) I'll have actual working examples on which to base the choice! It will take a bit longer, but far less time than if I keep waiting to decide without being able to see both solutions.

I'll post back here with the results!

@brandon-rhodes
Member

@JoshPaterson — All right: it looks like the couple of weeks of thinking I've done since last attempting to upgrade _correct_for_light_travel_time() have helped! This time I simply asked of each line, “is NumPy going to do its usual upside-down broadcasting here?” and after adding lots of print statements it turned out that it was always subtraction that was triggering NumPy to do the opposite of what we want. So I created a little experimental replacement that broadcasts top-down rather than bottom-up:

def sub(a, b):
    return (a.T - b.T).T

And if you try running the new script (see the commit linked just above this comment), you'll see that fixing those subtractions produced a _correct_for_light_travel_time() that works exactly like the old one (I think?) except that it doesn’t break if beyond the dimensions xyz, t there are further dimensions, in this case xyz, t, 2 target objects.
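A quick demonstration of what sub() changes, using placeholder arrays shaped like the case described above:

```python
import numpy as np

def sub(a, b):
    # Subtract with dimensions matched from the TOP (leading axes)
    # rather than NumPy's default trailing-axis matching.
    return (a.T - b.T).T

observer_xyz = np.zeros((3, 4))    # xyz, t
target_xyz = np.zeros((3, 4, 2))   # xyz, t, 2 target objects

# Plain subtraction fails: NumPy aligns the observer's (3, 4) against
# the target's trailing (4, 2), which do not match.
try:
    target_xyz - observer_xyz
except ValueError:
    pass
else:
    raise AssertionError('expected a broadcasting error')

# sub() aligns (3, 4) with the leading (3, 4) instead:
assert sub(target_xyz, observer_xyz).shape == (3, 4, 2)
```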

Of course the design script now needs to be taken farther. What if there are n observers, too? And of course once _correct_for_light_travel_time() is working for all cases (but maybe this sub() covers them all), then all of Skyfield might need to be rewritten with calls like sub() if everything, every possible call and transform, is going to handle n observers, and m targets, and so forth, correctly; and it’s not clear to me yet what test code to write to see how extensive those changes might need to be.

But for now I'm going to stop and let you read over the code first and respond to the ideas there. Try running it, and check out the output of the print() statements compared to what you expect, and comment back here. Thanks!

brandon-rhodes added a commit that referenced this pull request Feb 19, 2021
The light travel time routine can survive almost unchanged if instead of
expecting NumPy broadcasting to make up for extra missing dimensions, we
add an extra dimension to the observer’s position before calling, to
avoid NumPy needing to do its own back-to-front dimension matching.
@brandon-rhodes
Member

@JoshPaterson — More good news: after working through the examples described above, it occurred to me that maybe we should just avoid having mismatched dimensions in the first place. If our calls to NumPy subtraction, for example, never ask NumPy to fix a mismatch between 2 dimensions in one array and 3 in the other, then Skyfield’s code will never even be sensitive to whether NumPy matches front-to-back or back-to-front. So maybe we just need checks in front of big operations that would stop and ask the user to make their dimensions match, before we proceed with vector operations?
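One possible shape for such a check; the function name and error message here are invented for this sketch. The key observation is that once two arrays have the same number of dimensions, the axes pair up identically whether matching runs front-to-back or back-to-front:

```python
import numpy as np

def require_matching_ndim(a, b):
    # Hypothetical guard: insist that both arrays have the same number
    # of dimensions, so the broadcasting direction question never arises.
    if a.ndim != b.ndim:
        raise ValueError(
            'arrays have %d and %d dimensions; add singleton dimensions '
            '(for example with np.newaxis) until they agree'
            % (a.ndim, b.ndim))

observer = np.zeros((3, 4, 1))  # xyz, t, observer axis made explicit
targets = np.zeros((3, 4, 2))   # xyz, t, 2 targets
require_matching_ndim(observer, targets)  # passes; subtraction is safe
assert (targets - observer).shape == (3, 4, 2)
```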

You can check out the commit I've just made (see the link above) to see where things stand, as I'll now go look at why the JPL ephemeris logic can't happily accept a time with 2 dimensions instead of 1.

@JoshPaterson
Contributor Author

are there operations you would like to perform that become possible, or easier, if instead of living on your own branch, they live on a branch in the main repository?

The problem I'm trying to solve is that this PR depends on #532, so that makes it hard to keep experimenting on this PR before that one is merged. The only advantage I had in mind for having the branch here versus in my fork was that it would be easier to discuss it here if the branch was also here. There might be a better way around this problem? Maybe I could keep the branch in my fork and then start a discussion in the discussions tab if it becomes necessary.

But as I've just finished up, and published yesterday, the Ptolemy medieval cosmos model that I promised long ago in a PyCon Ireland talk:
https://rhodesmill.org/ptolemy/

I'm glad you mentioned that here; it's super interesting, especially the scripts for generating the animation!

I'm taking a look at the different ways of vectorizing _correct_for_light_travel_time; I'll let you know what I think!

@JoshPaterson
Contributor Author

I changed the version of _correct_for_light_travel_time in this PR by removing the stuff that was there just to force it to work for one particular case. It needs some more adjustments to run properly when called from design/broadcasting.py, but I wanted to finish writing this comment before I do that.

To summarize the methods for vectorization so far:

  • (this PR) the function adds singleton dimensions so the number of dimensions is the same and they match up properly
  • (cfltt2) create sub() within which broadcasting matches dimensions in reverse
  • (cfltt3) use transposes so broadcasting matches dimensions in reverse
  • (hypothetical) user is responsible for adding singleton dimensions in the right places so that the number of dimensions is the same and they match up properly

I think any of these could work, but the last one seems to have the most potential advantages:

  • Keep Skyfield's code simpler, or even unchanged in some places?
  • Make the user's code more explicit and therefore more clear

Do you have any ideas about what user code that adds dimensions could look like?

@brandon-rhodes
Member

Do you have any ideas about what user code that adds dimensions could look like?

Good question — and, no. Not yet, because as I've worked through those light-time examples I've been working my way through the possibilities for the first time myself. My guess is that we'll want to make whatever (hopefully small) tweaks are necessary to get all position routines working in the n×m×p case (where there are arbitrary user-defined dimensions involved), and then add big informative messages when someone tries something like observe() that not only tell them that the dimensions don't match, but that show them the exact steps to make in their code to bring the dimensions into alignment.

I agree with you entirely that explicit dimension adjustments in user code is preferable to magic invisible dimension adjustments in Skyfield code, not only for keeping Skyfield simpler, but so user code explains itself as best it can.

I am glad you also see what might be a way forward here without for loops! I was hopeful that NumPy could be corralled into doing the right thing.

I plan to work on my design example more this week. Depending on how much free time I have, we might have a way forward ready by the end of the week!

@JoshPaterson
Contributor Author

One thing to look out for is that functions containing branches or loops that send different parts of arrays down different paths can't be vectorized with as few changes as _correct_for_light_travel_time needed. This is because if statements can't act on parts of arrays, but switching them to boolean indexing makes them stop working for scalars.

The changes in keplerlib.true_anomaly in #532 are a good example of how I've solved this problem so far by changing scalars to arrays and back to scalars after the math:

from numpy import atleast_1d, float64

def true_anomaly(e, M, p, gm):
    return_scalar = isinstance(e, (float, float64))
    e, M, p = atleast_1d(e, M, p)

    # All the math here is written for arrays using boolean indexing

    return v[0] if return_scalar else v

I looked ahead at the functions that need to be vectorized for the apparent() method, and I didn't see anything that would obviously need to be rewritten for arrays. Based on that, it's possible that _correct_for_light_travel_time is representative of the rest of the changes that will be necessary for Skyfield to be vectorized, but we should be aware that in some cases the changes might not be so minimal.
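The scalar-in, scalar-out pattern above can be shown with a toy function in place of the real true-anomaly math (which this sketch does not reproduce):

```python
import numpy as np

def safe_sqrt(x):
    # Same pattern as true_anomaly() above: promote scalars to arrays,
    # replace `if` statements with boolean masks, then demote back to a
    # scalar on the way out.
    return_scalar = isinstance(x, (float, np.float64))
    x = np.atleast_1d(x)
    out = np.zeros_like(x, dtype=float)
    positive = x > 0.0                 # boolean mask instead of `if`
    out[positive] = np.sqrt(x[positive])
    return out[0] if return_scalar else out

assert safe_sqrt(4.0) == 2.0   # scalar in, scalar out
assert (safe_sqrt(np.array([4.0, -1.0])) == np.array([2.0, 0.0])).all()
```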
