Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tidy up Telnet and other documentation #3116

Closed
TerryE opened this issue May 18, 2020 · 12 comments
Closed

Tidy up Telnet and other documentation #3116

TerryE opened this issue May 18, 2020 · 12 comments

Comments

@TerryE
Copy link
Collaborator

TerryE commented May 18, 2020

I will move a series of posts by @pnkowl from #3078 (which is discussion about the error handling architecture in the Lua runtime) here. These posts relate to documentation issues.

@pnkowl, thanks for your contributions, but can you please follow our guidelines, and avoid hijacking an issue or PR with off-topic discussions. New topic = new issue. Thank-you. We would normally delete such offf-topic posts, but in this case since this is a lot of content, I've moved them here.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

3 days ago @pnkowl wrote:

  • node.startupcommand() added

  • several spelling fixes

Given the above discussion of error handling, would it not be appropriate to add additional clarity to the module reference documentation? When I read this,

Redirects the Lua interpreter output to a callback function. Optionally also prints it to the serial
console.

should I not believe what is says? There is no fine print, no mention or reference is made to what really happens.

I did search for "telnet" in issues.

I believe I had already "read" this post (#3078), and it did not seem to apply -- that is to say, how could a module like telnet be offered without the caveat stated boldly that, in too many cases, errors will not be show?

How could node.output() be offered without some referece to this issue?

Now I certainly understand that fixing an issue is way more satisfying than documenting where things fall short of our expectations, but I believe that the documentation should accurately reflect real life behavior, and not what we wish or for what we have plans. How to avoid the tl;dr? Perhaps have the module/method documentation page contain "td;lr" link(s) to separate pages which contain the caveats?

BTW, I do not feel that pointing telnet and node.output() to this page satisfies the need. Telnet, for example, might benefit, by having a description of each class of errors and whether they are printed or not, and an example of each class. Alternatively, a companion pcall wrapper (executes the pcall, then manages the error handling/printing) could be referenced. I could take a crack at all of these in isolation, but the current style of docs does not cater to this kind of thing... so more experienced folks should guide...

This all being said, the difference between having telnet in its present form versus not having it at all, is day and night.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

@TerryE wrote in reply

@pnkowl, time to do an ow-oooom or whatever. If you search the Issue and PR history then you will find 9 open and 57 closed threads which raise and discuss various aspects of this. Sorting out this sort of architectural feature is hard. Doing excellent documentation is hard. We (and this means mostly me in this case) have been sorting these step by step, and that involves at times up to maybe 8 hrs development effort a day unpaid and pro bono.

An open issue is where we collaborate and get feedback when identifying and fixing issues. The PR is where we post the change, and the "The code changes are reflected in the documentation at docs/*" checkbox is there for a reason.

In this case, the early implementations of output redirection were a bit of a botch. The runtime core handled redirection of stdout but stderr was just sent straight to the UART. Remember that using the net interface relies on heavy use of callbacks, so a lot of other features had to be put in place so that stderr content could also be reliably sent to the redirector.

This is community project, so all I can suggest here is that once the PR has been integrated into dev, then review the documentation and if you feel that you could improve this then raise the issue / PR and do this yourself.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

2 days ago @pnkowl wrote:

@pnkowl, time to do an ow-oooom or whatever. If you search the Issue and PR history then you will find 9 open and 57 closed threads which raise and discuss various aspects of this. Sorting out this sort of architectural feature is hard. Doing excellent documentation is hard. We (and this means mostly me in this case) have been sorting these step by step, and that involves at times up to maybe 8 hrs development effort a day unpaid and pro bono.

An open issue is where we collaborate and get feedback when identifying and fixing issues. The PR is where we post the change, and the "The code changes are reflected in the documentation at docs/*" checkbox is there for a reason.

In this case, the early implementations of output redirection were a bit of a botch. The runtime core handled redirection of stdout but stderr was just sent straight to the UART. Remember that using the net interface relies on heavy use of callbacks, so a lot of other features had to be put in place so that stderr content could also be reliably sent to the redirector.

This is community project, so all I can suggest here is that once the PR has been integrated into dev, then review the documentation and if you feel that you could improve this then raise the issue / PR and do this yourself.

then review the documentation and if you feel that you could improve this then raise the issue / PR and do this yourself.

Summary

  • Whether it be dedicated volunteers or folks getting paid, rarely are things as we would like. I am okay with "a bit of a botch" as long as the core reference docs reflect the actual state of the system.

  • in the dev branch docs, the node.output() "caution" not to print within the callback, is gone.

  • in the master branch docs, the caution is present

  • should it be expected that print is allowed in the callback now? Based on what I see above, that seems unlikely (but what do I know)

  • when doing some pcall investigation, I had the urge to add a print within the callback, but being forwarned, I did not (thanks)

  • why take time to remove a caveat? Why remove a helpful thought that may lead to a useful connection?

  • if I were to be making PRs that have meaning to me, they would be to add more clarity, more hyperlinks, more examples for those that come next and have no pretentions of becoming experts.

  • before investing much time, it would be comforting to know that at least my sensibilities are aligned with the overall docs and development strategy. I do not want to be in conflict over this. At present, I do not get warm fuzzies that this type of PR would be kindly received.

Details follow (tl;dr)

So this is what I have proactively done as a result of the offered suggestion.

I decided that the PR should contain

  • an additional "caution" regarding stderr in node.output of the docs and perhaps a helper function utilizing pcall that would surface the errors with the least extra typing and complication.

  • a hyperlink to Error Handling in Lua  #3078 as the connection to more information

I then headed out to the docs (and started in master), did the fork, found the location to anchor the changes then realized dev is the place for PRs

So off to dev branch docs

Guess what?

  • the "caution" block is gone
  • the telnet example is gone
  • reference to the telnet.lua exists with no hyperlink

With my limited skills I wade through History, Blame and Raw and can find no explanation as to why the "caution" block is gone (but my skills are limited).

This is not the only occurance of this type of removal in the node module.

Perhaps a print in the callback is allowed now. So

  • I try "print in the callback" in master and it panics. So far, so good.
  • I setup to try dev branch and telnet is broken (without LFS too), so I update issue Telnet Broken #3108

But the basic question still remains. How do new folks (or any folks for that matter) attach the tips, tricks, wisdom, and clarity where it is most likley to create meaningful connections from problems to solutions, especially for those who are, in fact, not experts, when it appears that at least some of these very connections are being systematically removed?

@marcelstoer replied

You can never have too much documentation. So, any correct and meaningful additions are welcome.

it appears that at least some of these very connections are being systematically removed?

Thanks for bringing it to our attention but what makes you think they are systematically removed? How about (documentation) bugs that just went unnoticed so far?

Git blame on node.md (for node.output()) leads straight to Terry's "Lua 5.1 to 5.3 realignement" PR #2836. It changed over a thousand lines of code and documentation.

14c1b8f#diff-45fdccb9fe7b1f8ae504702b0a034d6e is the specific diff for the dropped admonition block. Whether the removal was intentional or not only Terry can explain.

@TerryE replied:

In previous versions of NodeMCU, node.output() was calling was calling the Lua core recursively and there are internal health warnings not to do this, so it was a miracle that it worked at all. Now output spools to a pipe which is emptied on a subsequent task which is posted to empty the pipe. Actually printing inside a output CB does work, though this can easily cause a recursion loop and this will exhaust memory, so yes we probably should have kept the warning. Our challenge is that we have a team of 1 doing all of this work.

BTW I was referring to Tibetan OM chanting -- a chill-out joke.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

2 days ago @pnkowl wrote:

@ marcelstoer,

what makes you think they are systematically removed?

Okay, so maybe those could be interpreted as "fighting words". Let me try again.

I see the core documentation as having 3 basic functions (1) to tell it like it is (2) to bridge from the past (3) to foster the interconnection of all the disparate pieces, all using the best technology available.

It seemed to me, that the dev branch documentation had greater brevity than master. This is not a bad thing in itself, but it seemed to me that important connections had been lost. It also seems that there were places where notes and cautions disappeared. Notes and cautions are the evidence that blood has been spilled and IMO should not be discarded without cause or more blood will be spilled in the future.

So to be objective about this, let me take a brief walk down through the node module

Differences

  • node.input(). removed: Attention, This function only has an effect when invoked from a callback. Using it directly on the console does not work.

  • node.output() removed this Caution. Do not attempt to print() or otherwise induce the Lua interpreter to produce output from within the callback function. Doing so results in infinite recursion, and leads to a watchdog-triggered restart.

  • node.output() materially changed in a non backward compatible fashion

  • telnet example removed and a non linked reference added.

  • node.startupcommand() added

  • several spelling fixes

Please understand that I can be very directed in my activities. Although it seems obvious now that node.output is significantly changed, it was not obvious hours ago. It really should be made obvious.

Additionally, it would be helpful to have a set of actions required by end users when the new master appears, and for those attempting dev now. It is not the same old thing but will, if all is done right, possibly break your application. I have not looked very hard, but does this list exist?

Verdict: "systematic" now seems over the top given the evidence presented.

If requested, I could review all the docs, however, copy/paste/diff is likely not the most efficient way to accomplish this. That being said, this seemed to yield a much smaller set than I expected

It changed over a thousand lines of code and documentation.

Blame seems to have many more differences (as seen by the untrained), but that may be the nature of the beast (not side by side, but more a classic unix diff)

plus an addendum:

@TerryE

Our challenge is that we have a team of 1 doing all of this work.

So how do we change that?

Most of us have neither the skill, breadth of knowledge, nor the temperament to do the high value tasks. That being said, I believe there must be a path to a system state where a non trivial number of time consuming and necessary tasks can be carved out that mortals can do, without taking more of your time, hearding all the new volunteer cats trying to help, than is saved,

Where does a volunteer start? For example, I believe I know how to author a PR to put the cautions back and find the new telnet sample and make the link.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

Going through this content and distilling it to the essence:

  • Somehow the file names for lua_examples/telnet have got out of sync and telnet.lua is called telnet_pipe.lua. This breaks the documentation links to lua_examples/telnet/telnet.lua and this needs fixing.
  • telnet.lua is documented as per our conventions in the lua_examples/telnet\README.md. Maybe we should move this to lua_modules.
  • telnet.lua and the node.ouput() was changed to use the new Pipe interface, as was documented in the 2.2.1-master_20190405 release notes. Maybe we should add a "compatibility break" caveat to the node documentation.
  • I can't find any references to the obsolete version of the telnet code and as far as I can see apart from the name mix-up for telnet.lua everything here is consistent. I have no control over use of obsolete versions outside this project repository.
  • The reason that the node.input() caution was removed is that the limitation was removed. Try node.input('return "This works"\n') and it works fine. (Though there is an issue that the line must be CR terminated.)
  • The caution on printing in the node.output() was removed because this is no longer handled as a C stack iteration, but rather by adding to the stdout pipe, though yes this can out-of-memory so maybe we should keep it.

So AFAIK, this thread points to a couple of minor documentation changes and one example file rename.

@TerryE
Copy link
Collaborator Author

TerryE commented May 18, 2020

See closure of #3018

@TerryE TerryE closed this as completed May 18, 2020
@pnkowl
Copy link

pnkowl commented May 20, 2020

telnet.lua and the node.ouput() was changed to use the new Pipe interface, as was documented in the 2.2.1-master_20190405 release notes. Maybe we should add a "compatibility break" caveat to the node documentation.

A github search for "2.2.1-master_20190405" and did not find anything. I then did a google search and found this link
https://github.com/nodemcu/nodemcu-firmware/tree/2.2.1-master_20190405
Is this the page you were referenching?

I can't find any references to the obsolete version of the telnet code and as far as I can see apart from the name mix-up for telnet.lua everything here is consistent

telnet_fifosock.lua still in
https://github.com/nodemcu/nodemcu-firmware/tree/dev/lua_examples/telnet

I would give all the particulars, but should the example be in dev branch at all? So no reason to investigate the E:M below
`

print(wifi.sta.getip())
192.168.1.17 255.255.255.0 192.168.1.1
pnksrv=dofile("telnet_fifosock.lua").open()
Telnet server started (40712 mem free, 192.168.1.17)
=pnksrv
nil

after close of putty

=node.heap()
E:M 24592
`

@TerryE
Copy link
Collaborator Author

TerryE commented May 20, 2020

I ask @nwf to have a look at whether we should remove the telnet_fifosock.lua as this still assumes that the node.output() CB takes a string and not a pipe.

@nwf
Copy link
Member

nwf commented May 20, 2020

Yea, given the existence of pipe, telnet_fifosock.lua and possibly fifosock itself should be removed. fifo is more generically useful, tho'.

@TerryE
Copy link
Collaborator Author

TerryE commented May 20, 2020

I agree that fifo is generally useful and the module is also a great implementation example, but the pipe implementation has moved a subset of this functionality into C and its runtime resource use is less -- if your needs fit within its limitations, which they do in the telnet case. Let use proceed on this basis.

@HHHartmann
Copy link
Member

HHHartmann commented May 21, 2020

May I bring to your attention that https://github.com/nodemcu/nodemcu-firmware/blob/master/lua_modules/http/httpserver.lua requires fifosock.

@TerryE
Copy link
Collaborator Author

TerryE commented May 21, 2020

Ok, maybe we need to check whether it still works and raise a separate issue if necessary. I also note that the post handler doesn't use hold/unhold to large posts will overrun the net stack.

PS. Forget the last sentence TLS will shag is first. 🤣

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants