-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
screencast: let the user select the output #12
Comments
It would be nice to have something like this for screenshots as well. I also think it would be good to have a way to select whether to use a rectangle, a window, or the full output (or in the case of screenshot, the full screen). |
I think would also be interesting to investigate how to offer firefox (and other webrtc clients) a list of screens to share. Usually you can select the screen in the browser itself |
@jvanbruegge I don't think this is possible without a change to the WebRTC codebase. In fact, the whole reason that chromium still hides pipewire behind a build and a feature flag is the "dialog spam" of having to do output selection once in the xdp implementation and a second time in the browser. jgrulich, I believe, has been working on reducing the dialog complexity, but I don't know if those changes landed. From an old chat with jgrulich on #pipewire:
Old (on KDE, I think): Fixed:
Ironically, we don't experience the double request issue, because we don't allow for output selection at all at the time of invocation, but this will be an issue when we do, until this patch lands. |
The big difference being that WebRTC itself used to enumerate all screens and windows, and now xdp does it. There's no mechanism in the d-bus api to pass that info to the consumer. The more I think about it, it would not even just be a patch to WebRTC, but to the definition of the xdp screencast portal too. What I'm thinking for a final selector is a fullscreen overlay similar to slurp, but one which only lets you select entire outputs. I think it should be both keyboard and mouse activated. This can be done with the layer shell protocol, but I haven't gotten around to writing anything just yet. That said, I'm open to suggestion and PRs. |
so it wouldn't be possible to capture just part of the screen? |
slurp allows you to provide a set of rectangles on stdin that the user can choose from. |
@emersion The problem with slurp is that it only outputs coordinates, and for screencasting output selection, we need the output name or ID. Otherwise I would have just done something similar to this and called it a day (but with sway-ipc).
I suppose we could get both outputs and their coordinates, invoke slurp, get coordinates back, and map it back to outputs. But, I think it'd be nice to also include a little overlay that explains what's going on, how to select an output, etc... A user invoking slurp has a better grip on the causal relationship of what they're seeing than someone who clicked "share my screen" in the browser. The crosshair cursor and greyed out selection could be a bit confusing without the context. |
@tmccombs Screenshots, OTOH, could totally have region selection, as exemplified by slurp + grim. The question is, should invoking a screenshot portal immediately invoke slurp by default? Should a user-selected region be the typical behavior? I'd be curious to see how this works in other portal implementations (gtk, kde). My guess is that the portal should just provide a full screen capture, and the consuming application should do the cropping as needed. |
For single applications buffers, yes. although this is quite unfortunate and a regression from X. However, wlroots does appear to support capturing a region with the "capture_output_region" request in the wlr-screencopy-unstable-v1 protocol. With that it would be possible to use sway ipc to get the region for a particular window, and then capture that region. Although, handling the case of the window moving during a recording would be tricky, since you would need to watch for movement of the window.
I believe gnome opens a screenshot dialog, although I don't remember exactly what the options are in the dialog. |
Yeah, the window movement issue is the blocker for this for sure. It also isn't possible at all for dmabuf based casting, which we would like to implement eventually. |
I did a very dirty hack to test that: vesim987@45d5b42 but it is not working too well, especially on chromium. It is executing the setup_outputs function three times, so slurp is executed multiple times too. I don't think that there is a good approach for that w/o fixing the WebRTC spec for allowing output selection. |
There is already some work being done on the "dialog hell" problem as mentioned here[0]. There are a set of patches that seek to improve the situation [1][2] which are currently under review, (and, as a side effect, also improves the situation for #34 by supporting alpha channel pixel formats). The bigger problem with the idea behind your "dirty hack" is its reliance on swaymsg, which makes this patch sway-specific. We want xdpw to work for all wlroots based compositors. Using the data from the output protocol keeps things agnostic. [0] https://bugs.chromium.org/p/chromium/issues/detail?id=682122#c58 |
I know, doing that with only slurp is easier, there is even a list of the outputs(ctx->output_list) available, but I didn't know about the fixes to WebRTC and I wanted to make something fast to test this. I will try to create some proper patch around the weekend. |
Hey, no criticism, you got something working and that's excellent! Just wanted to point out what I'd be thinking about if we were to implement this (I'd also probably try to avoid having jq as a dependency). Hopefully those webrtc patches are accepted soon. Thanks for taking interest! |
I'm not sure why this means you can't select a region of an output to capture rather than the entire output. |
You totally can. Finding out the coordinates of that region in a non-Sway specific way is not currently possible. We could use sway-ipc, but then this would no longer work for wayfire/hikari/river/etc. |
Isn't that exactly what slurp does? |
Not exactly, slurp is just a region selection tool. If you look at slurp's README, you'll see the example they provide for getting a window's region relies on the output of swaymsg.
|
Right, but I'm not talking about selecting a window's region, I mean just draw a rectangle on the screen that you want to capture. |
Sure, that's absolutely possible. How should we enable that? Should the region selection immediately begin any time a window capture is initiated? What happens if the window moves once the capture is initiated? We can't detect that to allow the user to make an update. Same with a workspace change. It doesn't feel like it fits the expectation of window sharing. I think there's value in implementing this experimentally, but only as a way to prepare for the availability of a protocol that can signal window geometry. |
@danshick @tmccombs And also I've updated my patch to xdg-desktop-portal-wlr: vesim987@4418f85 now it depends only on slurp. But there is another issue, the screencasting in chromium is not working, I think that there are some timeouts in the WebRTC part of chromium. EDIT: For the xdg-desktop-portal-wlr part I will wait for the WebRTC patches in chromium. |
The name event in the xdg_output protocol is only available in v2. Maybe there is a problem with that? |
I think you would need to change https://github.com/vesim987/slurp/blob/output-selection/main.c#L624 to pass version 2 instead of 1 (or maybe the minimum of 2 and the version of the global object, to be more resilient?). |
@tmccombs I created PR emersion/slurp#60 in slurp. I think it is a better place to discuss that :) |
Only tangentially related, but I wanted to save these links somewhere. This is phoc's approach to screencopy of a given window blending foreign toplevel and screencopy (they use it for thumbnails). https://source.puri.sm/Librem5/phoc/-/blob/master/src/phosh.c#L322-506 https://source.puri.sm/Librem5/phoc/-/blob/master/src/render.c#L401-474 |
I think that is suggested for wlroots her: swaywm/wlr-protocols#93 (comment). |
emersion/slurp#64 has made it possible to implement this capability with slurp. |
I've implemented an output selector (needs cleanup) using a dmenu style of chooser currently using wofi or bemenu. It currently freezes all running screencasts while selecting the output of the next one. https://github.com/columbarius/xdg-desktop-portal-wlr/tree/dmenu-chooser I just added support for slurp, which is currently hardcoded and needs emersion/slurp#64. |
The patch for a graphical output chooser is ready and needs testing #59. So if you are interested, please build xdpw from https://github.com/columbarius/xdg-desktop-portal-wlr/tree/dmenu-chooser and try it.
If you want to quit a chooser you can press ESC and the next one will launch. |
We can split the select sources function right here and stash the Instead of calling setup_outputs directly at this line, we can call an asynchronous chooser function, where the pipes have O_NONBLOCK set on them. Then we can stash the pipes alongside the This should avoid the other streams blocking while we wait for the user to select an output. Definitely don't need to get it done this PR, but I wanted to write this down while I'm thinking about it. |
@danshick Sorry in advance for a somewhat off topic request, but would you mind opening an issue for "Screencast: Let the user select application window"? I realize this is currently impossible for you to implement because of deficiencies in other libraries/protocols, but it would be great to have an issue I can watch for updates. It's currently the only thing preventing me from using Sway full time, as I have a giant monitor and I have to screencast regularly for work. |
Anyone is welcome to create an issue. |
Looks like GNOME now is showing a dialog which allows you to select the appropriate screen or window: |
Any idea how that works? |
@tmccombs not really... I'm not familiar with their codebase, but I guess it was added in flatpak/xdg-desktop-portal-gtk@0d21a34 and the main file to look at is https://github.com/flatpak/xdg-desktop-portal-gtk/blob/master/src/screencastdialog.c |
It's been around for a little while now. IIRC, they use some Gnome-specific protocols to grab window buffers. It isn't something that would be portable to wlroots. |
No, this is "window sharing" as you're thinking. Chromium did need some patches because Gnome sends a buffer the size of the full screen with "crop" info to explain where the window is in the image. The image being sent is just that window's buffer painted onto a blank background the size of the screen. This is done so that they can support window moving and resizing by allocating a frame the full size of the screen no matter what (IIRC). Our challenge isn't support within pipewire or webrtc, but rather the ability to pull a single application's buffers for all surfaces. This is a wlroots limitation. The toplevel management wayland protocol gets us part of the way there, but I don't remember all of the specifics of what is missing. |
Is the problem that there isn't a protocol to get the buffers, in which case a protocol could be created? Or are there deeper problems that would prevent such a feature from being added. Of particular interest would be being able to get a screenshot or capture stream of a window that isn't currently visible on screen. That would allow use cases such as:
Is there an existing wlroots issue for something like this? |
Issue is here: swaywm/wlr-protocols#93 |
Hey everybody I would like to share my little solution for the problem. I am using xdg-desktop-portal-wlr only for screen sharing (for example in MS Teams). For selecting the correct output I am running this script here right now:
The script just catches all connected monitors, shows them via bemenu, let me select one monitor and then it respawns the xdg-desktop-portal-wlr process with the correct output. This is far from being perfect, I know. I would prefer to have the possibility to choose the correct output directly in the browser, but it solves the problem for me right now. |
@shibumi PR #59 implements a way for xdpw to invoke dmenu style programs or slurp to choose an output for each screensharing request without restarting xdpw. If you have the inclination and ability to compile @columbarius's branch yourself, helping to test that PR would be welcomed. |
FWIW, I have been tracking @columbarius' branch and using it in production for many months, and it has been working well for me, with both Firefox and (ungoogled) Chromium. |
Thanks for the feedback! Good to know it has been stable for you. |
Right now we have a CLI option, which is less than ideal. Replace this with a basic UI, probably slurp-based.
xdg-output can be used to obtain output geometry in logical coordinates. A list of rectangles can be fed to slurp on stdin.
The text was updated successfully, but these errors were encountered: