wrong stackless.current in schedule callback #42

ghost · 2013-12-27T15:28:25Z

Originally reported by: Anselm Kruis (Bitbucket: akruis, GitHub: akruis)

stackless.run() switches from the main tasklet to another tasklet T. If a schedule callback is invoked for this switch, the stackless.current is T. This is incorrect. Inspection of the code in slp_schedule_task_prepared() reveals, that Stackless calls the schedule callback before the switch.

I added a test case and I also have a preliminary fix.

Bitbucket: https://bitbucket.org/stackless-dev/stackless/issue/42

ghost · 2013-12-27T16:01:21Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

The test case is test_sched_cb.SchedulingCallbackTestCase.test2currentWithPrevAndNext, commit a70cc6e86ee1.

The preliminary fix is in commit 4878b93d5294. I'm not completely sure that this the right way to fix this problem. I would prefer a solution that avoids a wrong value of current altogether.

This bug renders the schedule callback almost useless, because many tasklet methods depend on the correct value of ts->st.current.

ghost · 2014-01-03T12:06:18Z

Original comment by Kristján Valur Jónsson (Bitbucket: krisvale, GitHub: kristjanvalur):

I don't think this is a bug. When switching, there is no such thing as a "current" tasklet. This is why the "old" and the "new" are provided as parameters. Whether the callback is called before or after the actual switch is in fact unspecified and an implementation detail.

In facts.stackless.get_current() should probably return None in this case :)

ghost · 2014-01-03T12:54:33Z

Original comment by Anonymous:

anselm: "This bug renders the schedule callback almost useless, because many tasklet methods depend on the correct value of ts->st.current."

Why do you think the schedule callback is rendered useless?
Do you want to call tasklet methods while switching?
But for that the prev and next arguments are explicitly passed.

We have no real notation what is current during a switch, and I fear
the behavior may be even slightly different between hard- and soft- switching.
I'm in favor of returning None, too, or even give a warning.

Not sure if it is important to define the exact phase when the callback has to happen.
Do you think this is important?

ghost · 2014-01-04T22:26:03Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

The problem is not the value of the Python expression
stackless.current. As Kristján correctly remarked we have the arguments prev and next. But in the current implementation the callback is called before the switch happens. In some cases the value of PyThreadState_GET()->st.current is tasklet "next" whereas other members - especially frame - of PyThreadState_GET() still belong to tasklet "prev". This leads to wrong results for all sorts of tasklet methods and computed attributes. Every method or computed attribute that contains a test of the form

ts = task->cstate->tstate;
if (ts->st.current == task) {
    /* tasklet is current, use values from *ts */
    ...
} else {
    /* tasklet is not current */
   ...
}

can return wrong results. Without this bug being fixed, application code can't reliable use code like
next.tracing_function = prev.tracing_function in the schedule callback.

I don't care when the callback gets called. I don't care, if stackless.current is prev or next, but I do care that the C-language data structures are in a consistent state when Stackless executes Python code.

I hope, you all agree, that this is a bug. The open question is, how to correctly fix this problem?

ghost · 2014-01-05T00:31:06Z

Original comment by Anonymous:

Whether stackless.current or ts->st.current, this is undefined during a switching callback.

Code that uses stackless.current or ts->st.current during a switch is wrong code.
If that code relies upon functions which depend on these, is wrongly designed.

The callback was always limited and just for monitoring use, not meant
for any modification. I was always against the existence of these functions,
and after long negotiations, I gave in after it was sworn to me that they would never
be abused.

Your test case breaks this agreement. This shows me that it was an error to rely on
promises from long ago, because the rules are forgotten, and I now get test cases
that are testing things that were never meant to work.

I am sorry but do not agree that this is a bug. The bug is that we supported this in the
first place without securing it from abuse. My preferences, highest score first:

remove that problem by removing that functionality completely
let those who requested the functionality propose and implement a cleaner solution (namely CCP).
In addition, replacing the various "current" during a switch callback by "None" makes sense.

If this functionality should continue to exist in a fool-proof way, then a lot more needs
to be done: To make this correct, the stackless machinery needs to disable any modification of the control flow. The way it is right now, anything could happen in a
channel-callback, because it i also exposed to the C level.

So I propose to discard the channel-callback altogether and come up with a good
design proposal that does not expose everything and all of the implementation,
it should be hidden instead.

ghost · 2014-01-05T02:08:27Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

Two points:

The schedule callback is a documented function and its limitations (you must not invoke tasklet examine the arguments) are neither documented not logical. I was not aware of any agreements to use the schedule callback for monitoring only.
If you discard the schedule callback, then I (s+c) need a replacement with a similar functionality for my main application too. Same for PyDev.

ghost · 2014-01-05T18:46:59Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

About schedule callbacks of the form "function(prev, next)"

Scheduling is always a two step process: decision and switch.
In the decision phase a tasklet gets selected. In the switch phase, the execution switches from the previous to the next tasklet.
Obviously there are two possible locations for a schedule callback: between decision and switch - that is the current implementation - or after the switch.

Now we are unhappy about the current callback, because the "current tasklet" is not well defined in our implementation but many C-functions depends on it.
We have at least two constructive options:

Either we keep the scheduling callback at its current position and define that the current tasklet is always the previous tasklet. (This is what my patch does.). This seems to work well for monitoring or as a debugger hook, but I guess there lurk many subtle problems.
Or we move the callback behind the switch phase. Here scheduling is over and we have a well defined situation. The implementation might be a bit tricky.

The more I reason about it, the more I prefer option 2.

ghost · 2014-01-05T20:57:56Z

Original comment by RMTEW FULL NAME (Bitbucket: rmtew, GitHub: rmtew):

Saying that the documentation doesn't specify that the usage you want to get out of the scheduling callback, isn't really a valid reason to make it work for what you want.

I understand you have a use case, which you and the company you work for wants to use this for, perhaps there's a way to do what you want and to leave the scheduling callback as is?

ghost · 2014-01-05T23:43:44Z

Original comment by Anonymous:

I am with Richard here: What is the real reason why you need to change this so urgently?

It looks like you want to use the callback for modifications, which I would prefer to explicitly forbid and even prevent. Otherwise I cannot understand the high priority
to have "current" always defined.

First I thought you want to (ab)use the callback to have a chance to cancel a switch.

Now that your preferred second version goes completely after the switch destroyed my theory. ;-)

The reason why we ask so much is a bit the fear that you want to add some constraints which are not clear now, and to force something to become defined that was not defined. That needs to have a clear advantage for the project, and making the rules
and new constraints harder to maintain.

Is your proposal really optimum, in every aspect that matters?

Is there a balance between effort, maintainability,
is the code less or more tricky,
is the advantage of a change only on your side or do we all perceive this as "better"?
are new rules and constraints an advantage, or can they strike back because they complicate trying other things?

That consideration goes first, before considering a non-trivial change for no obvious reason.

ghost · 2014-01-06T11:44:22Z

Original comment by Kristján Valur Jónsson (Bitbucket: krisvale, GitHub: kristjanvalur):

On the other hand, Anselm does make a valid point:
during the callback, tstate->frame is not the frame of tstate->st.current.
This means that the c runtime data structures are not in a consistent state which could cause problems for any query functions.

For example, assert(sys._getframe() == stackless.getcurrent().frame) would fail.

Fixing this to have a consistent internal state, while not specifying what the current state is one way or the other, is probably a good thing.

ghost · 2014-01-06T12:14:43Z

Original comment by Anonymous:

Ok then let's go for it, if it has no implicit strings attached ;-)

ghost · 2014-01-07T12:50:18Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

Hi Christian,

my proposal to move the callback after the switch is an attempt to make a constructive reaction to your "remove that problem by removing that functionality completely".

I'm fairly sure that my proposal is conceptionally clean, but I don't know, if it would fit the requirements of CCP. It is also obvious, that such a change is not trivial to implement and that it is probably to late for the current release cycle. But if we agree that a callback after the switch is conceptionally clean and that such a callback could suite our needs, then I can try to code a patch. Having a patch, we can discuss its merits and drawbacks more thoroughly.

ghost · 2014-01-07T17:32:50Z

Original comment by Anonymous:

Yes, sorry about my reaction.
My problem was that I still dislike these functions. They are minimally implemented
right now.

I agree that some consistency is worth some effort, if that consistency is needed.

But again, this is about balancing:

how important is the feature to whom
how much does tightening the constraints for a consistent behavior complicate the implementation
why should any "current" be defined at all.
I see no advantage in either a pre- or a post- switch right now.

If I would implement a better alternative, then this would be like the following, and that I would actually like:

The original intent of these callbacks is to signal """a transfer will happen from src to dst""". We already have that. A clean way to implement that without any constraint would IMHO be to split the whole functionality into two steps:
The """a transfer will happen from src to dst""" info is recorded during the transfer, with the same simplicity as today,
a callback is run late after that and just picks this info up, but in a context where "current" is consistently defined, again. That post-call would happen when the channel action or the tasklet schedule returns. There should the real callback be triggered.

So my proposal is to leave the logic as it is right now, but no longer call a callback, but to put the arguments somewhere.

Then we have a very defined context when to pick these arguments up and run a callback. And this time, it is clean and without any constraints.
This version is also allowed to raise exceptions and to do whatever, because we are in the right context now.

If the current behavior is important for CCP, we also could keep the current code and add the "callback-after-the-switch" as another callback to install.

cheers - chris

ghost · 2014-01-08T07:40:55Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

Christian, you proposal is a possible implementation of my "callback after the switch" concept. :-)

ghost · 2014-01-08T07:56:32Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

I mark this bug-report as resolved, because it is fixed (4878b93d5294) and we agreed, that this fix is OK for now. The discussion about a cleaner implementation of the schedule callback should be continued in issue #48.

ghost · 2016-11-06T20:20:53Z

Original comment by Anselm Kruis (Bitbucket: akruis, GitHub: akruis):

Removing milestone: 2.7.6-slp (automated comment)

ghost added major bug 2.7-slp stackless python labels Sep 24, 2017

ghost closed this as completed Sep 24, 2017

ghost mentioned this issue Sep 24, 2017

Clean implementation of the schedule callback #48

Closed

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

wrong stackless.current in schedule callback #42

wrong stackless.current in schedule callback #42

ghost commented Dec 27, 2013

ghost commented Dec 27, 2013

ghost commented Jan 3, 2014

ghost commented Jan 3, 2014

ghost commented Jan 4, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 6, 2014

ghost commented Jan 6, 2014

ghost commented Jan 7, 2014

ghost commented Jan 7, 2014

ghost commented Jan 8, 2014

ghost commented Jan 8, 2014

ghost commented Nov 6, 2016

wrong stackless.current in schedule callback #42

wrong stackless.current in schedule callback #42

Comments

ghost commented Dec 27, 2013

ghost commented Dec 27, 2013

ghost commented Jan 3, 2014

ghost commented Jan 3, 2014

ghost commented Jan 4, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 5, 2014

ghost commented Jan 6, 2014

ghost commented Jan 6, 2014

ghost commented Jan 7, 2014

ghost commented Jan 7, 2014

ghost commented Jan 8, 2014

ghost commented Jan 8, 2014

ghost commented Nov 6, 2016