
A fair transcoding jobs list #1192

Closed
Jorropo opened this issue Oct 4, 2018 · 15 comments · Fixed by #3637

Comments

@Jorropo
Contributor

Jorropo commented Oct 4, 2018

Currently the transcoding job list is a basic FIFO queue. That works, but it causes problems.
On my instance one user uploaded over 2 days' worth of transcoding work. That is long, and now when another user uploads a video they have to wait 2 days before it gets transcoded.
So why don't I use the per-day upload limit? Because I run a small instance and most of the time it isn't transcoding anything. A per-day limit also doesn't provide enough control: if the instance is doing nothing, why block users?

So something that could work is the following (assume jobs arrive in this order):
job 1 for user A
job 2 for user A
job 3 for user A
job 4 for user B
job 5 for user B
job 6 for user C
job 7 for user A

First job 1, because it is first in the queue.
Then job 4, because user A already had a job executed.
Then job 6, because users A and B already had a job executed.
Then job 2, because we reached the end of the list, so we return to the start.
Then job 5, because user A already had a job executed.
Then job 3, because we reached the end of the list, so we return to the start.
Then job 7, because we reached the end of the list, so we return to the start.

Here is a Python 3 implementation; assume we have a function transcode that transcodes a video.
(This is just to illustrate the idea.)

listOfAlreadyTreatedUser = []
listOfJobs = [
    {"payloads": "some payload", "user": "A"},
    {"payloads": "some payload", "user": "A"},
    {"payloads": "some payload", "user": "B"},
    {"payloads": "some payload", "user": "B"},
    {"payloads": "some payload", "user": "C"},
]

def whatToTranscode():
    global listOfAlreadyTreatedUser
    # Pick the oldest job from a user that has not been served this round.
    for i, job in enumerate(listOfJobs):
        if job["user"] not in listOfAlreadyTreatedUser:
            listOfAlreadyTreatedUser.append(job["user"])
            del listOfJobs[i]
            return job
    # Every queued user was served this round: start a new round
    # with the oldest remaining job.
    listOfAlreadyTreatedUser = []
    job = listOfJobs.pop(0)
    listOfAlreadyTreatedUser.append(job["user"])
    return job

while len(listOfJobs) > 0:
    transcode(whatToTranscode())
@ghost

ghost commented Oct 6, 2018

I'd love to have other options for transcode queueing too. As I understand it, part of the technical challenge in implementing this right now is that the job queueing is handled by a generic library rather than logic that's been written specifically for video transcoding.

@Chocobozzz
Owner

Proposal:

  • Every time you create a transcoding job for a specific user, check how many videos they uploaded in the last 24 hours
  • Create the transcoding job with a priority of Math.max(100 - (10 * uploadedInTheLast24Hours), 0)
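For illustration, the proposed formula could be sketched like this in Python (the actual queue, Bull, is a Node.js library; this only shows how the priority value scales, and the function name is made up for this sketch):

```python
def transcoding_priority(uploaded_in_the_last_24_hours: int) -> int:
    """Priority value per the proposal:
    Math.max(100 - (10 * uploadedInTheLast24Hours), 0).
    The value shrinks by 10 per video uploaded in the last 24 h,
    floored at 0."""
    return max(100 - 10 * uploaded_in_the_last_24_hours, 0)
```

So a user with no recent uploads gets 100, and anyone with 10 or more recent uploads bottoms out at 0.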

@ghost

ghost commented Oct 8, 2018

@Chocobozzz, I'm not sure what "priority" would concretely mean in your suggestion, or what the reason would be to hardcode 10 videos per day.

As an aside, anything using "number of videos" as a metric will be quite bad. Duration would be better, or accumulated transcode CPU time would be best. Please remember that videos can be arbitrarily complex and/or long.

@rigelk
Collaborator

rigelk commented Oct 8, 2018

@scanlime I guess the "10 videos per day" was just to illustrate. More importantly, transcode CPU time is hard to guess too (even though that's exactly what would be required to run the algorithm in a fair way), as shown with #799.

Duration as a metric would be a good middle ground.

@ghost

ghost commented Oct 8, 2018 via email

@Chocobozzz
Owner

> I'm not sure what "priority" would concretely mean in your suggestion,

https://github.com/OptimalBits/bull/blob/master/REFERENCE.md#queueadd

> what the reason would be to hardcode 10 videos per day.

It's just an example... the duration or file sizes could be interesting too 👍

@Jorropo
Contributor Author

Jorropo commented Oct 10, 2018

I think priority could be good (and needs less work), but you can't use videos uploaded in the last day; that is not precise enough.
Maybe we can estimate transcoding time by transcoding 10 seconds of the video and multiplying by the total video duration in seconds divided by 10.
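The estimate described above is a simple linear extrapolation: time a short sample, then scale to the full duration. A minimal sketch (the function name is hypothetical, and as noted elsewhere in this thread, real transcode cost varies with content complexity, so this is only a rough guess):

```python
def estimate_transcode_seconds(sample_len: float,
                               sample_cost: float,
                               total_len: float) -> float:
    """If `sample_len` seconds of video took `sample_cost` seconds
    to transcode, scale linearly to the full `total_len` duration."""
    return sample_cost * (total_len / sample_len)
```

For example, if a 10 s sample took 25 s to transcode, a 600 s video would be estimated at 1500 s.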

@ghost

ghost commented Oct 10, 2018 via email

@vincib

vincib commented Oct 30, 2018

We could also have a script that could be launched to run either a particular job, any job for a specific video, or any job at all, and so "spread the load of transcoding"?
(maybe from anywhere, if we use NFS or a SAN to access the files? ;) )

@rigelk
Collaborator

rigelk commented Oct 30, 2018

@vincib the problem is that a "job" doesn't just do transcoding. It also means modifying entries in the database to change the hash, and potentially sending updates or chaining actions in response to the video transcoding.

@vincib

vincib commented Oct 30, 2018

sure, the remote job execution process could access the PostgreSQL database and the storage filesystem to do everything it needs to do. (just ensure the job is properly locked and can be relaunched in case of a crash...)
for bigger PeerTube instances, that could be very useful (transcoding is heavy CPU-wise...)

@kyrahabattoir

Round robin transcoding queue would certainly be nice.
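A per-user round-robin queue, a close cousin of the "already treated users" list sketched earlier in this thread, could look like this minimal sketch (illustrative names, not PeerTube code):

```python
from collections import OrderedDict, deque

class RoundRobinQueue:
    """One FIFO per user; pop cycles through users in arrival order."""

    def __init__(self):
        self._queues = OrderedDict()  # user -> deque of that user's jobs

    def push(self, user, job):
        self._queues.setdefault(user, deque()).append(job)

    def pop(self):
        if not self._queues:
            raise IndexError("pop from empty queue")
        # Take the oldest job of the user at the front of the rotation.
        user, jobs = next(iter(self._queues.items()))
        job = jobs.popleft()
        # Move the user to the back of the rotation (drop them if idle).
        del self._queues[user]
        if jobs:
            self._queues[user] = jobs
        return job
```

Pushing the seven jobs from the example at the top of this issue and popping repeatedly yields 1, 4, 6, 2, 5, 3, 7 — the same order as the fair-list walk-through.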

@emansom
Contributor

emansom commented Aug 5, 2022

Is a manual override on this job list possible? e.g. if I want to push a video to the front of the queue?

My usecase would be a video that has a fixed release schedule on social media.

Currently my instance is backpressured by some 200+ transcode jobs, most of them 3+ hour videos that each get four resolutions (360, 480, 720 and 1080), from importing a whole YouTube channel.

While that's going on, the YouTube channel in question is facing misused DMCA claims (patent trolls) on its videos and has to rely on PeerTube for sharing its latest video to subscribers, which is now backpressured by about three weeks of transcode jobs. Not ideal.

@Chocobozzz
Owner

@emansom you may be interested in #4771 and #4968

@vid-bin

vid-bin commented Sep 3, 2022

Posting under my new account now. Here's hoping 4.3.0 fixes #4968, because I'm in the same boat as @emansom.

I think this could be solved by having the new-resolution-hls jobs have a higher priority or having a separate job queue for them entirely.
