Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

screencast: let the user select the output #12

Closed
emersion opened this issue Mar 13, 2020 · 44 comments · Fixed by #59
Closed

screencast: let the user select the output #12

emersion opened this issue Mar 13, 2020 · 44 comments · Fixed by #59
Labels
enhancement New feature or request

Comments

@emersion
Copy link
Owner

emersion commented Mar 13, 2020

Right now we have a CLI option, which is less than ideal. Replace this with a basic UI, probably slurp-based.

xdg-output can be used to obtain output geometry in logical coordinates. A list of rectangles can be fed to slurp on stdin.

@emersion emersion added the enhancement New feature or request label Mar 13, 2020
@emersion emersion changed the title screencast: let the user select outpu screencast: let the user select the output Mar 13, 2020
@tmccombs
Copy link

Replace this with a basic UI, probably slurp-based.

It would be nice to have something like this for screenshots as well. I also think it would be good to have a way to select whether to use a rectangle, a window, or the full output (or in the case of screenshot, the full screen).

@jvanbruegge
Copy link

I think would also be interesting to investigate how to offer firefox (and other webrtc clients) a list of screens to share. Usually you can select the screen in the browser itself

@danshick
Copy link
Collaborator

danshick commented May 12, 2020

@jvanbruegge I don't think this is possible without a change to the WebRTC codebase. In fact, the whole reason that chromium still hides pipewire behind a build and a feature flag is the "dialog spam" of having to do output selection once in the xdp implementation and a second time in the browser. jgrulich, I believe, has been working on reducing the dialog complexity, but I don't know if those changes landed.

From an old chat with jgrulich on #pipewire:

originally, I had a patch that it didn't show the chromium dialog, it immediately showed the portal dialog and only if you cancelled it, then the Chrome dialog appears with option to share a tab, but Chromium developers didn't like it and they wanted to keep the Chromium dialog as it is

on Chromium side, I merged "Screen" and "Window" tabs into one, because we don't know what the user will choose and it doesn't make sense to create two capturer classes, then once you pick something in the portal dialog, it will show preview in the Chromium dialog and user just needs to confirm it, then you shouldn't get the portal dialog again, but you should see the screen cast content directly on the web page

Old (on KDE, I think):

https://jgrulich.fedorapeople.org/chromium-dialog-hell.webm

Fixed:

https://jgrulich.fedorapeople.org/chromium-sharing-screen.webm

Ironically, we don't experience the double request issue, because we don't allow for output selection at all at the time of invocation, but this will be an issue when we do, until this patch lands.

@danshick
Copy link
Collaborator

The big difference being that WebRTC itself used to enumerate all screens and windows, and now xdp does it. There's no mechanism in the d-bus api to pass that info to the consumer. The more I think about it, it would not even just be a patch to WebRTC, but to the definition of the xdp screencast portal too.

What I'm thinking for a final selector is a fullscreen overlay similar to slurp, but one which only lets you select entire outputs. I think it should be both keyboard and mouse activated. This can be done with the layer shell protocol, but I haven't gotten around to writing anything just yet.

That said, I'm open to suggestion and PRs.

@tmccombs
Copy link

a fullscreen overlay similar to slurp, but one which only lets you select entire outputs

so it wouldn't be possible to capture just part of the screen?

@emersion
Copy link
Owner Author

a fullscreen overlay similar to slurp, but one which only lets you select entire outputs

slurp allows you to provide a set of rectangles on stdin that the user can choose from.

@danshick
Copy link
Collaborator

danshick commented May 13, 2020

@emersion The problem with slurp is that it only outputs coordinates, and for screencasting output selection, we need the output name or ID. Otherwise I would have just done something similar to this and called it a day (but with sway-ipc).

swaymsg -t get_outputs | jq -r '.[] | select(.active) | .rect | "\(.x),\(.y) \(.width)x\(.height)"' | slurp

I suppose we could get both outputs and their coordinates, invoke slurp, get coordinates back, and map it back to outputs. But, I think it'd be nice to also include a little overlay that explains what's going on, how to select an output, etc... A user invoking slurp has a better grip on the causal relationship of what they're seeing than someone who clicked "share my screen" in the browser. The crosshair cursor and greyed out selection could be a bit confusing without the context.

@danshick
Copy link
Collaborator

@tmccombs
Screen casting will (likely) never handle anything other than entire outputs. There's no mechanism in wlroots for getting only a single application's buffer.

Screenshots, OTOH, could totally have region selection, as exemplified by slurp + grim. The question is, should invoking a screenshot portal immediately invoke slurp by default? Should a user-selected region be the typical behavior? I'd be curious to see how this works in other portal implementations (gtk, kde). My guess is that the portal should just provide a full screen capture, and the consuming application should do the cropping as needed.

@tmccombs
Copy link

Screen casting will (likely) never handle anything other than entire outputs.There's no mechanism in wlroots for getting only a single application's buffer.

For single applications buffers, yes. although this is quite unfortunate and a regression from X. However, wlroots does appear to support capturing a region with the "capture_output_region" request in the wlr-screencopy-unstable-v1 protocol. With that it would be possible to use sway ipc to get the region for a particular window, and then capture that region. Although, handling the case of the window moving during a recording would be tricky, since you would need to watch for movement of the window.

I'd be curious to see how this works in other portal implementations (gtk, kde)

I believe gnome opens a screenshot dialog, although I don't remember exactly what the options are in the dialog.

@danshick
Copy link
Collaborator

Yeah, the window movement issue is the blocker for this for sure. It also isn't possible at all for dmabuf based casting, which we would like to implement eventually.

@vesim987
Copy link

I did a very dirty hack to test that: vesim987@45d5b42 but it is not working too well, especially on chromium. It is executing the setup_outputs function three times, so slurp is executed multiple times too. I don't think that there is a good approach for that w/o fixing the WebRTC spec for allowing output selection.

@danshick
Copy link
Collaborator

There is already some work being done on the "dialog hell" problem as mentioned here[0]. There are a set of patches that seek to improve the situation [1][2] which are currently under review, (and, as a side effect, also improves the situation for #34 by supporting alpha channel pixel formats).

The bigger problem with the idea behind your "dirty hack" is its reliance on swaymsg, which makes this patch sway-specific. We want xdpw to work for all wlroots based compositors. Using the data from the output protocol keeps things agnostic.

[0] https://bugs.chromium.org/p/chromium/issues/detail?id=682122#c58
[1] https://webrtc-review.googlesource.com/c/src/+/160649
[2] https://chromium-review.googlesource.com/c/chromium/src/+/1942327

@vesim987
Copy link

vesim987 commented Aug 12, 2020

@danshick

The bigger problem with the idea behind your "dirty hack" is its reliance on swaymsg, which makes this patch sway-specific. We want xdpw to work for all wlroots based compositors. Using the data from the output protocol keeps things agnostic.

I know, doing that with only slurp is easier, there is even a list of the outputs(ctx->output_list) available, but I didn't know about the fixes to WebRTC and I wanted to make something fast to test this. I will try to create some proper patch around the weekend.

@danshick
Copy link
Collaborator

@danshick

The bigger problem with the idea behind your "dirty hack" is its reliance on swaymsg, which makes this patch sway-specific. We want xdpw to work for all wlroots based compositors. Using the data from the output protocol keeps things agnostic.

I know, doing that with only slurp is better, there is even a list of the outputs(ctx->output_list) available, but I didn't know about the fixes to WebRTC and I wanted to make something fast to test this. I will try to create some proper patch around the weekend.

Hey, no criticism, you got something working and that's excellent! Just wanted to point out what I'd be thinking about if we were to implement this (I'd also probably try to avoid having jq as a dependency).

Hopefully those webrtc patches are accepted soon.

Thanks for taking interest!

@tmccombs
Copy link

Screen casting will (likely) never handle anything other than entire outputs. There's no mechanism in wlroots for getting only a single application's buffer.

I'm not sure why this means you can't select a region of an output to capture rather than the entire output.

@danshick
Copy link
Collaborator

Screen casting will (likely) never handle anything other than entire outputs. There's no mechanism in wlroots for getting only a single application's buffer.

I'm not sure why this means you can't select a region of an output to capture rather than the entire output.

You totally can. Finding out the coordinates of that region in a non-Sway specific way is not currently possible. We could use sway-ipc, but then this would no longer work for wayfire/hikari/river/etc.

@tmccombs
Copy link

Finding out the coordinates of that region in a non-Sway specific way is not currently possible.

Isn't that exactly what slurp does?

@danshick
Copy link
Collaborator

Isn't that exactly what slurp does?

Not exactly, slurp is just a region selection tool. If you look at slurp's README, you'll see the example they provide for getting a window's region relies on the output of swaymsg.

swaymsg -t get_tree | jq -r '.. | select(.pid? and .visible?) | .rect | "\(.x),\(.y) \(.width)x\(.height)"' | slurp

@tmccombs
Copy link

Right, but I'm not talking about selecting a window's region, I mean just draw a rectangle on the screen that you want to capture.

@danshick
Copy link
Collaborator

Sure, that's absolutely possible. How should we enable that? Should the region selection immediately begin any time a window capture is initiated? What happens if the window moves once the capture is initiated? We can't detect that to allow the user to make an update. Same with a workspace change. It doesn't feel like it fits the expectation of window sharing.

I think there's value in implementing this experimentally, but only as a way to prepare for the availability of a protocol that can signal window geometry.

@vesim987
Copy link

vesim987 commented Aug 15, 2020

@danshick @tmccombs
So, I created vesim987/slurp@a7d33db for slurp, but I have no idea how to get the output name. xdg_output_unstable_v1.name for some reasons it not working in latest sway, so I hardcoded the output name to eDP-1 for testing.

And also I've updated my patch to xdg-desktop-portal-wlr: vesim987@4418f85 now it depends only on slurp. But there is another issue, the screencasting in chromium is not working, I think that there are some timeouts in the WebRTC part of chromium.

EDIT: For the xdg-desktop-portal-wlr part I will wait for the WebRTC patches in chromium.

@tmccombs
Copy link

tmccombs commented Aug 16, 2020

but I have no idea how to get the output name. xdg_output_unstable_v1.name for some reasons it not working in latest sway

The name event in the xdg_output protocol is only available in v2. Maybe there is a problem with that?

@tmccombs
Copy link

I think you would need to change https://github.com/vesim987/slurp/blob/output-selection/main.c#L624 to pass version 2 instead of 1 (or maybe the minimum of 2 and the version of the global object, to be more resilient?).

@vesim987
Copy link

@tmccombs I created PR emersion/slurp#60 in slurp. I think it is a better place to discuss that :)

@danshick
Copy link
Collaborator

Only tangentially related, but I wanted to save these links somewhere. This is phoc's approach to screencopy of a given window blending foreign toplevel and screencopy (they use it for thumbnails).

https://source.puri.sm/Librem5/phoc/-/blob/master/src/phosh.c#L322-506

https://source.puri.sm/Librem5/phoc/-/blob/master/src/render.c#L401-474

@tmccombs
Copy link

I think that is suggested for wlroots her: swaywm/wlr-protocols#93 (comment).

@danshick
Copy link
Collaborator

emersion/slurp#64 has made it possible to implement this capability with slurp.

@columbarius
Copy link
Collaborator

I've implemented an output selector (needs cleanup) using a dmenu style of chooser currently using wofi or bemenu. It currently freezes all running screencasts while selecting the output of the next one. https://github.com/columbarius/xdg-desktop-portal-wlr/tree/dmenu-chooser

I just added support for slurp, which is currently hardcoded and needs emersion/slurp#64.

@columbarius
Copy link
Collaborator

The patch for a graphical output chooser is ready and needs testing #59. So if you are interested, please build xdpw from https://github.com/columbarius/xdg-desktop-portal-wlr/tree/dmenu-chooser and try it.
Currently there are 3 default chooser supported in this order:

If you want to quit a chooser you can press ESC and the next one will launch.
While choosing an output all streams are paused, so please report if that's an issue.
It's structurally setup to support custom chooser's later.
So please test it and report any errors you encounter to #59.

@danshick
Copy link
Collaborator

While choosing an output all streams are paused, so please report if that's an issue.

We can split the select sources function right here and stash the msg in xdpw_state and move the reply portion that comes after that line to the setup_outputs function.

Instead of calling setup_outputs directly at this line, we can call an asynchronous chooser function, where the pipes have O_NONBLOCK set on them. Then we can stash the pipes alongside the msg pointer watch the pipe FDs in the event loop and call a revised setup_outputs function that accepts an output as an argument, checks for existing instances (like it does currently), and also replies to the d-bus message we stashed earlier.

This should avoid the other streams blocking while we wait for the user to select an output. Definitely don't need to get it done this PR, but I wanted to write this down while I'm thinking about it.

@Hubro
Copy link

Hubro commented Jan 28, 2021

@danshick Sorry in advance for a somewhat off topic request, but would you mind opening an issue for "Screencast: Let the user select application window"? I realize this is currently impossible for you to implement because of deficiencies in other libraries/protocols, but it would be great to have an issue I can watch for updates. It's currently the only thing preventing me from using Sway full time, as I have a giant monitor and I have to screencast regularly for work.

@danshick
Copy link
Collaborator

danshick commented Jan 28, 2021

@danshick Sorry in advance for a somewhat off topic request, but would you mind opening an issue for "Screencast: Let the user select application window"?

Anyone is welcome to create an issue.

@michelesr
Copy link

michelesr commented Jan 31, 2021

Looks like GNOME now is showing a dialog which allows you to select the appropriate screen or window:

https://bugzilla.mozilla.org/show_bug.cgi?id=1675764

screenshot
screenshot2

@tmccombs
Copy link

Any idea how that works?

@michelesr
Copy link

@tmccombs not really... I'm not familiar with their codebase, but I guess it was added in flatpak/xdg-desktop-portal-gtk@0d21a34 and the main file to look at is https://github.com/flatpak/xdg-desktop-portal-gtk/blob/master/src/screencastdialog.c

@danshick
Copy link
Collaborator

Any idea how that works?

It's been around for a little while now. IIRC, they use some Gnome-specific protocols to grab window buffers. It isn't something that would be portable to wlroots.

@tinywrkb
Copy link

So what this is?
Specifically here: Window sharing support
Is this just Chrome/Chromium windows?

@danshick
Copy link
Collaborator

danshick commented Feb 1, 2021

So what this is?
Specifically here: Window sharing support
Is this just Chrome/Chromium windows?

No, this is "window sharing" as you're thinking. Chromium did need some patches because Gnome sends a buffer the size of the full screen with "crop" info to explain where the window is in the image. The image being sent is just that window's buffer painted onto a blank background the size of the screen. This is done so that they can support window moving and resizing by allocating a frame the full size of the screen no matter what (IIRC).

Our challenge isn't support within pipewire or webrtc, but rather the ability to pull a single application's buffers for all surfaces. This is a wlroots limitation.

The toplevel management wayland protocol gets us part of the way there, but I don't remember all of the specifics of what is missing.

@tmccombs
Copy link

tmccombs commented Feb 1, 2021

Is the problem that there isn't a protocol to get the buffers, in which case a protocol could be created? Or are there deeper problems that would prevent such a feature from being added.

Of particular interest would be being able to get a screenshot or capture stream of a window that isn't currently visible on screen. That would allow use cases such as:

  • Sharing a window, and that window continues to be shown if another app covers it (and if it overlaps with another window, the other window isn't shown)
  • Building an expose-like app for sway or other wlroots based compositors (see https://github.com/morrolinux/i3expo-ng) that shows windows not currently being rendered.

Is there an existing wlroots issue for something like this?

@emersion
Copy link
Owner Author

emersion commented Feb 1, 2021

Issue is here: swaywm/wlr-protocols#93

@shibumi
Copy link

shibumi commented Mar 14, 2021

Hey everybody I would like to share my little solution for the problem. I am using xdg-desktop-portal-wlr only for screen sharing (for example in MS Teams). For selecting the correct output I am running this script here right now:

input=$(swaymsg -t get_outputs | jq -r '.[].name' | bemenu)
/usr/lib/xdg-desktop-portal -r & /usr/lib/xdg-desktop-portal-wlr -r -o "$input" &

The script just catches all connected monitors, shows them via bemenu, let me select one monitor and then it respawns the xdg-desktop-portal-wlr process with the correct output.

This is far from being perfect, I know. I would prefer to have the possibility to choose the correct output directly in the browser, but it solves the problem for me right now.

@danshick
Copy link
Collaborator

@shibumi PR #59 implements a way for xdpw to invoke dmenu style programs or slurp to choose an output for each screensharing request without restarting xdpw.

If you have the inclination and ability to compile @columbarius's branch yourself, helping to test that PR would be welcomed.

@piater
Copy link

piater commented Mar 15, 2021

FWIW, I have been tracking @columbarius' branch and using it in production for many months, and it has been working well for me, with both Firefox and (ungoogled) Chromium.

@danshick
Copy link
Collaborator

Thanks for the feedback! Good to know it has been stable for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.