Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seeking #176

Closed
mainrs opened this issue Aug 16, 2018 · 43 comments
Closed

Seeking #176

mainrs opened this issue Aug 16, 2018 · 43 comments

Comments

@mainrs
Copy link

mainrs commented Aug 16, 2018

Looks like seeking isn't supported for now. Do the used libraries allow seeking in general?
It's a feature I would need for an application and I would be willing to try to implement it. Not sure where though. The sink looks like a good place for it.

@est31
Copy link
Member

est31 commented Aug 16, 2018

Do the used libraries allow seeking in general?

Lewton does. Dunno about the other libraries.

@tomaka
Copy link
Collaborator

tomaka commented Aug 16, 2018

I guess it should first and foremost be implemented on the decoder struct, if the underlying libraries support it (which I'm not sure of).

@michaelmarconi
Copy link

Anyone get anywhere with this?
I need to skip 30 secs at a time through an MP3. Would it be possible to do this by Seeking in the underlying decoder?

@mainrs
Copy link
Author

mainrs commented Sep 5, 2019

I couldn't really get it working and gave up on it 😢

@Lisible
Copy link

Lisible commented Dec 25, 2019

It's a feature I need too, unfortunately I don't have much time to try to implement it.

@slmjkdbtl
Copy link

@recapitalverb I only had success with seeking with hound (hound::WavReader::seek) + cpal, fitting this into rodio might be some work, you can probably add a seek method to Source and only implement if for WavDecoder

@slmjkdbtl
Copy link

@michaelmarconi Yes you can seek any format that way but it'll take a lot of time to parse the whole audio file, it's the best way if you're fine with that

@est31
Copy link
Member

est31 commented Aug 4, 2020

Yeah it'd be very CPU and RAM intensive and not a good idea. Some people may want to play back 10 hours of background music or something all in one file...

@mainrs
Copy link
Author

mainrs commented Aug 4, 2020

Yeah it'd be very CPU and RAM intensive and not a good idea. Some people may want to play back 10 hours of background music or something all in one file...

Aren't (almost) all audio formats streamable? At least that's what I always thought... You could then load an interval of like 1min into RAM and keep it there at any given time and just continuously load new content and drop the old one.

@est31
Copy link
Member

est31 commented Aug 4, 2020

Already 1 minute of samples at 48000 samples per seconds with 2 bytes per sample would mean 5760k or 6 mb of memory use. If it's an opt-in feature, people can use it if they want but optimally one would have seeking support in all of the libraries that rodio uses, then make rodio abstract over it.

@TonalidadeHidrica
Copy link

TonalidadeHidrica commented Nov 4, 2020

Eager for the seeking feature for this library, I looked up for each underlying decoder library:

MP3

minimp3-rs, has no feature for seeking, though the original C library minimp3 does.

High-level mp3dec_ex_seek function supports precise seek to sample (MP3D_SEEK_TO_SAMPLE) using index and binary search

There is a fork that adds a wrapper to the seek function in the original library, last updated on May 2020, a half year ago. The implementation wraps mp3dec_ex_seek, which enables sample-level seeking.

WAV

hound does provide sample-level seeking.

Vorbis

lewton provides a seeking function OggStreamReader::seek_absgp_pg, which provides with per-page seeking.

Related: RustAudio/lewton#73, and maybe RustAudio/lewton#16 as well? Any updates? @est31

FLAC

claxon is used. According to this issue comment, "Claxon currently does not support seeking".

Some FLAC files have "SEEKTABLE" metadata, so this may help for implementing the feature. Even if there is no such metadata, one can still seek to arbitrary position and look for the frame header, which contains the frame position, so it is easy to implement binary searching.

@dvdsk
Copy link
Collaborator

dvdsk commented Jan 13, 2021

I needed seeking for my podcast player build on rodio. I now have it working (in the most basic way possible) for mp3 on my fork. For that I added:

  • a new member function for sources that can seek called request_pos which takes a float that is the duration from the start of a source to seek to (in seconds).
  • an Option<f32> field set_pos to the controls struct that is used by the sink to modify playback.
  • a member function set_pos() to the Sink that is used to set the field to some position.
  • a check if the field is not None within the periodic_access closure. It takes care of calling the member function request_pos() of the source and setting the field back to None.

However sources might not support seeking thus we can not simply add request_pos() to the Source trait. For now I only added seeking to the mp3 decoder. Sources that build upon a source, think of the Mix struct, complicate this further. What to do if only one of the two sources mixed supports seeking? For now I added the request_pos() function under a new trait SourceExt. I implemented if for the structs:

  • Decoder<I>
  • Pausable<I>
  • Stopable<I>
  • Amplify<I>

where <I> has the trait SourceExt. I do not know if something like this would work for the other sources/structs. I suspect not. Has anyone got an idea how I could improve this? Should I open a PR and work on it there?

PS: I forked the fork of minimp3-rs mentioned above, brought it up to date with master and extended it to allow decoding a whole frame. It is used as a git dependency here instead of minimp3-rs on crates.io

@dvdsk dvdsk mentioned this issue Jun 5, 2021
@probablykasper
Copy link

@est31 Any word on @dskleingeld's implementation?

@dskleingeld Do you think the seeking method could be added in a PR to the official minimp3-rs?

@dvdsk
Copy link
Collaborator

dvdsk commented Jun 6, 2021

@dskleingeld Do you think the seeking method could be added in a PR to the official minimp3-rs?

Should be doable, it adds a seperate Decoder object (SeekDecoder) that wraps around the high-level mp3dec_ex functions. However I feel some additional tests are needed. I am only using it for a side project right now so it has not been rigorously tested.

@est31
Copy link
Member

est31 commented Jun 6, 2021

@est31 Any word on @dskleingeld's implementation?

Haven't looked at it but in general seeking support would be a great addition. If it needs upstream changes, those need to be upstreamed first before it can be merged in rodio.

@aschey
Copy link
Contributor

aschey commented Jun 7, 2021

I have a branch here with a number of changes including seeking working. I was getting some strange runtime errors on Windows with the minimp3 fork with seek support (been a while and I can't remember what it was exactly. I think it was an access violation.) so I ended up using Symphonia instead. Symphonia supports MP3, FLAC, and MP4/AAC so I switched the existing MP3 and FLAC decoders over and also gained AAC support as a result.

Unfortunately Symphonia doesn't have seek implemented for AAC so I had to fork Symphonia as well and implement it there. The implementation is quick, hacky, and not particularly precise if you need sub-second precision as I don't really know much about audio codecs, but it seems to be fast enough and works okay if you can accept the limitations.

I don't think any of my changes are mergeable as is because I haven't thoroughly tested it aside from just using the basic playback capabilities so some features may not work as it is right now. I have been using it in my audio player project and the basic functionality seems to be working at least, but it should be considered very much a POC and not optimized.

The other change I added to this branch is discarding the silent samples from the very beginning of the stream to enable psuedo-gapless playback (seems to be gapless enough for the tracks I tested so far). I plan on making that an opt in feature, but haven't gotten around to it yet. I hope I can make all these changes mergeable eventually, but getting a non-hacky seek solution for all the codecs in Symphonia is an obstacle, there's some changes to Lewton that need to be merged first, and I'll need to do a lot more testing on the seek changes in Rodio to check for any regressions.

Here are all the changes I made in case anyone's curious:
Rodio Changes
Symphonia Changes

This also required this fork of Lewton with better seek support

Anyway, sorry for the essay, but I figured I'd let people know in case they really want this feature right now and are okay with a hacky solution until I can make this better or someone else creates a better solution.

@probablykasper
Copy link

@aschey This sounds awesome, excited to see progress on it!

@dvdsk
Copy link
Collaborator

dvdsk commented Jun 7, 2021

@aschey, seeking support for all formats, that is great! I see you added a seek method to the Source trait, that looks way cleaner then adding a new trait like I did. I thought this would be a problem for sources that can/do not support seeking. However right now these could fake a seek implementation (always returning Ok(current_pos)?).

How can I help get this in rodio?

@aschey
Copy link
Contributor

aschey commented Jun 7, 2021

@dskleingeld Yeah, that's basically how I handled it. I figured it's fine for a seek to be a no-op on situations where it doesn't make sense like a sine wave. If you want to run my branch and test it out a bit, that'd help me out a lot. PRs are welcome as well if you see something that can be improved.

Most of the source implementations are totally untested right now as I only used it for basic playback, none of the fancy effects or looping or anything. I just made my best guess as to how seek might work for the sources that I haven't used so there's a good chance that a lot of them still need tweaking and some definitely need to be cleaned up with proper error handling and whatnot.

In terms of getting this in for real, I think this needs to happen in a few phases. I was lazy and put all of this in one branch just to see if it would work at all, so I'll copy the different parts of my changes into new branches so they can be merged separately.

  1. Symphonia's decoder is not Send so I had to make a few modifications to get it to fit in with Rodio's requirements. Unfortunately the new Send requirement on MediaSource is technically a breaking change since that's a public trait that can be implemented by the user so I'll see if I can find a way to make this work without changing MediaSource directly. Looks like there's already a PR to add Send to MediaSource, but it hasn't gotten any response yet.
  2. Refactor Rodio to replace the existing MP3 and FLAC decoders with Symphonia.
  3. Add the seek function to the Source trait. Seek won't work for AAC and Ogg yet. I couldn't get seek to work in Lewton at all without TonalidadeHidrica's fork that I mentioned above, but I might've been using it wrong.
  4. Update Lewton and Symphonia with proper seek support for AAC and Ogg.

I've got some life stuff going on right now, so I might not make too much progress in the next few weeks, but I'll start working on part 1 and 2.

@est31
Copy link
Member

est31 commented Jun 7, 2021

I'm positive towards symphonia integration in general, but some users of rodio might have concerns about the copyleft license that symphonia uses (MPL-2.0). Many gamedevs have a disdain towards anything copyleft related, even if it's the weak copyleft of MPL-2.0. I think an optional backend based on symphonia would work (or maybe multiple, idk what's more elegant), but it shouldn't replace any existing "permissively" licensed backends.

@aschey
Copy link
Contributor

aschey commented Jun 7, 2021

Okay, fair enough. I can probably make them interchangeable and have it opt-in via a feature flag. Seek can just be a no-op for any decoders that don't support it.

@pdeljanov
Copy link

Hey all, developer of Symphonia here. I stumbled across this thanks to @aschey's recent PRs. I'm happy that Symphonia is getting some attention and thought I could address some of your concerns and add a few thoughts.

I'll preface this by saying that I am having difficultly finding time to work on Symphonia, hence the low activity on the repo, but I do hope to dedicate a few hours each week to fix bugs, merge PRs, and refactor things as necessary to improve the library and support different users. In the longer term, my goal is to complete the Vorbis decoder, and implement support for Opus.

So onto some thoughts:

  1. I believe most of the core traits: FormatReader, MetadataReader, Decoder, MediaSource, etc. can be marked Send without issue.
  2. I'm sitting on a couple commits that refactors the seek interface and improves MP3 seeking. These haven't been pushed to the public repository yet. I'll be bumping the versions of all crates to 0.3 before publishing so marking the traits in point 1 as Send should be fine.
  3. On the topic of seeking, to seek what users typically consider an AAC file (*.aac, *.m4a) would require both the ADTS and MP4 readers to implement seeking. The former would roughly be implemented the same way as MP3, while the latter is considerably more complex. This is something I am starting to look into to support the improvements in point 2.
  4. Just a heads up, the AAC decoder is not perfect. This may be something you want to consider since users will likely be filing the bugs to you and I don't think I'll be able to address them that quickly.
  5. Just a suggestion, but Symphonia currently supports all of Rodio's codecs except Vorbis. Perhaps it makes sense to just wrap lewton in a Symphonia Decoder and use Symphonia directly? This buys you a few things such as intelligent format detection (i.e., you can pipe bytes into Symphonia and it'll figure out which format and decoder to use), pretty thorough metadata support (including weird combos like ID3v2 on FLAC -- yes, I've seen this!), and support for many format-codec combinations such as FLAC-in-OGG. The downside is the risk of regressions.

Cheers!

@est31
Copy link
Member

est31 commented Jun 10, 2021

Thanks for reaching out @pdeljanov !

To 4: interesting, will keep it in mind. To 5: wrapping lewton might be a good idea anyways, but as I've said above, symphonia should rather be an option instead of a hard dependency. Having it as an option would be awesome.

@aschey
Copy link
Contributor

aschey commented Jun 11, 2021

Thanks for getting my PRs merged @pdeljanov. Wrapping lewton in a Decoder trait is an interesting idea that I'll definitely take a look at. I have a working branch with optional Symphonia integration here that I plan to make a PR for once the next Symphonia version is released.

@probablykasper
Copy link

Would want to have seeking support in redlux (which glues together rust-mp4 and fdk-aac c-bindings with rodio) for .aac and .m4a, which @pdeljanov made me realize should at least be doable for AAC by parsing the ADTS headers while keeping track of where in the file we are. For m4a I'm pretty clueless though, anybody got any ideas?

@pdeljanov
Copy link

Hey @aschey, just wanted to give you a heads up that I published 0.3.0 which includes your changes.

@probablykasper, for basic (non-fragmented) MP4 files, all you have to do is use the STTS atom to convert a sample timestamp to a sample number. Once you have the sample number, you just need to start reading from that sample number forward. I implemented seeking in MP4 (and ADTS) for v0.3 of Symphonia so you could get some inspiration from that albeit it's a bit more complicated because it also supports fragmented streams.

@aschey
Copy link
Contributor

aschey commented Jun 27, 2021

Hey @aschey, just wanted to give you a heads up that I published 0.3.0 which includes your changes.

@probablykasper, for basic (non-fragmented) MP4 files, all you have to do is use the STTS atom to convert a sample timestamp to a sample number. Once you have the sample number, you just need to start reading from that sample number forward. I implemented seeking in MP4 (and ADTS) for v0.3 of Symphonia so you could get some inspiration from that albeit it's a bit more complicated because it also supports fragmented streams.

Great, thanks for the heads up! Just created the PR to add Symphonia integration here: #376. Happy to accept feedback if you have ideas for improvements. I did briefly look into wrapping Lewton in a decoder trait, but it didn't seem like Lewton's API would be compatible with the way the decoder trait works since Lewton requires access to the underlying read/seek source.

Bringing this back to the intent of the original issue, I'll work on another PR to expose Symphonia's seek functionality to Rodio once the previously mentioned PR is merged.

@est31
Copy link
Member

est31 commented Jun 27, 2021

I did briefly look into wrapping Lewton in a decoder trait, but it didn't seem like Lewton's API would be compatible with the way the decoder trait works since Lewton requires access to the underlying read/seek source.

Lewton has two different APIs. A high level one in the inside_ogg module that indeed needs Read/Seek, as well as a lower level one which only needs access to &[u8] packets.

@Shnatsel
Copy link
Contributor

Patching Lewton might no longer be necessary, since Symphonia now has a Vorbis decoder.

It's 100% safe code and is at around 10% slower than ffmpeg. Correctness should be quite good as well - I've tested it against the reference implementation on over 200Gb of Vorbis from various sources, and all the mismatches with the reference implementation have been fixed.

@est31
Copy link
Member

est31 commented Jan 24, 2022

I've been wondering why have rodio at all, given that symphonia exists, has more codecs, etc. Why not use symphonia directly? There is one big difference: Symphonia has a copyleft license. For some people that's good. But I think many don't like copyleft.

So the target market of rodio is everything that's left. People who want to use copyleft free licensing. This is a big thing in e.g. games for example. This extends to the libraries that rodio is using. So whatever libraries rodio uses, they should be patched, if possible.

Generally it seems that symphonia has more maintainer momentum right now than rodio and related projects. The approach of having a monorepo is quite interesting, compared to rodio's approach of one-codec-per-crate. I still think that, beyond inertia, it's still valuable to maintain rodio and the libraries that rodio uses.

@Shnatsel
Copy link
Contributor

Shnatsel commented Jan 24, 2022

Naturally. I just wanted to point out that patching every single crate is not strictly necessary to provide some seeking support. Ideally all backends should support it, of course.

@rednaz1337
Copy link

Symphonia has a copyleft license. For some people that's good. But I think many don't like copyleft.

True, but Symphonia is licensed under the MPL, which is a weak copyleft license. Unlike GPL, it can be used in commercial/proprietary projects just fine. It is not infectious, even when statically linking. You only have to disclose changes you made to the library itself (if you made any).

@dvdsk
Copy link
Collaborator

dvdsk commented Jan 30, 2023

So the target market of rodio is everything that's left. People who want to use copyleft free licensing
...

I use Rodio as it is higher level then symphonia, Rodio's basic example is 11 lines of code whereas symphonia's getting started example comes in at 103 lines.

Getting back to the issue at hand, rodio should have seeking in files, this issue has been open for 5 years. We have a working fork (by aschey using symphonia as decoder). Since then symphonia has become an optional/default decoder for Rodio. To me the next step seems a PR integrating aschey changes into Rodio. Seeking methods would be kept behind the same feature flag as the symphonia backends. Possibly even behind an extra unstable flag (like done in tokio) to allow the API change later.

@est31 you seem the maintainer of rodio atm, would you be open to such a PR?
@aschey would you be interested in opening such an PR? If not would it be okay for someone else to continue on your work? I have no time right now (I do however, have a need for seeking in Rodio) but could pick it up in a few months.

@aschey
Copy link
Contributor

aschey commented Jan 30, 2023

Hey, sorry for abandoning this a while ago. I ended up using Symphonia directly rather than going through Rodio as I needed some additional functionality (unrelated to seeking) that would've been difficult to integrate with Rodio's architecture. Anyone's welcome to pick up my branch here and continue it. It's definitely a bit of a mess and has unrelated changes so you'd be better off just copying over the useful bits into a fresh branch.

As mentioned previously, I only got basic playback functionality working though and I didn't really test any of the other features that were affected so I wouldn't assume anything in there beyond basic playback is in a working state. Alternatively, it may be better to avoid adding the seek method to the Source trait and go about it in a way that would require less extensive changes.

@est31
Copy link
Member

est31 commented Jan 30, 2023

I'm fine with seeking additions. Ideally they'd be in a way that any backend can opt into it. lewton already has partial seeking as well, but it needs to be polished (to be nice). Given that there is a 0.15 cpal upgrade, we'll likely have to do a semver breaking update anyways.

@dvdsk
Copy link
Collaborator

dvdsk commented Jan 30, 2023

Hey, sorry for abandoning this a while ago. I ended up using Symphonia directly rather than going through Rodio as I needed some additional functionality (unrelated to seeking) that would've been difficult to integrate with Rodio's architecture.

No worries. Off topic, I wonder what other functionality you needed. I am building a podcast player I think I only need seeking and pause/play but I might have missed something there.

Unless any one else works on it (let us know here please!) I will probably work on this in one or two months. I will need seeking anyway and I rather add seeking to Rodio (where I will get some code review and feedback) then throw something together.

@aschey
Copy link
Contributor

aschey commented Jan 31, 2023

No worries. Off topic, I wonder what other functionality you needed. I am building a podcast player I think I only need seeking and pause/play but I might have missed something there.

I think you're good. Don't want to spam this thread so I'm happy to discuss more elsewhere, but I'm working on a music player daemon similar to MPD and I needed a way to retrieve the exact playback position of the stream in order to keep multiple clients synchronized.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 29, 2023

I'm gonna give this one ago, Ill open a draft PR soon.

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 30, 2023

Cleaned up my implementation build upon years old hacks, got a bit unhappy about the design.

So lets discuss our options there.


Right now I see 4:

  1. Sink gets an extra method append_seekable which only accepts Sources implementing Source<IsSeekable>. This is a design that provides type safety around non-seek-able vs seek-able sources/decoders. Unfortunately to actually get there we would need to add generics to quite some of rodio's types.

I have a start on this as my current implementation, though to me this design seems to lead to a high amount of new code bloat. I am thinking of abandon it in favor of one of the others.

edit 2023-10-02: There is a way to get a clean API but it needs the now still unstable Specialization feature.

  1. The other option is to add seek to the Source trait. This gives up a typed interface. Though most not all decoders can support seek atm.

Combined with Decoder::new auto selecting the right decoder this easily leads to a situation where seek does nothing and the crate user does not understand why.

  1. Make it a runtime error, make the seek function return an error when seeking is not supported..

That is doable but will add some complexity to Sink.

  1. Put seek behind (multiple) features. The seek function will only exist when all decoders available at compile time support seek. For now that would require no-default-features = true and the features symphonia or minimp3 to get seek though more decoders could be added in the future.

The rust API guidelines specify features should be additive, adding a decoder should not remove seeking.


To me a merge between 3 and 4 seems like a good option. Switch to an API with runtime errors when you add a feature which enables a decoder that does not support seek.

If we plan to add more features only supported by some decoders a good implementation of 1 on which we can build might make more sense.
This could be something along the lines of Source<Seekable> where Seekable is sealed trait with implementations IsSeekable and NotSeekable. The same happens for Decoder, Sink and Controls then only Control<IsSeekable> gets seek as an option.


edit 2023-10-02:
Should specialize skip_duration to seek if the duration is longer then some threshold, see: #443

@dvdsk
Copy link
Collaborator

dvdsk commented Sep 30, 2023

@est31 I would love you opinion on this if you have time.

edit: I went for approach 3, see draft PR #513

@dvdsk dvdsk mentioned this issue Oct 4, 2023
@dvdsk
Copy link
Collaborator

dvdsk commented Oct 12, 2023

The PR is ready now, it got pretty big, please help the maintainers out by reviewing it.

@shimekukuri
Copy link

#513 updated to latest commit.

@dvdsk
Copy link
Collaborator

dvdsk commented Apr 6, 2024

Its merged 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests