Add audio/video metadata #72

rperlin-ela · 2020-09-02T15:34:18Z

Description

We'd like to import metadata automatically from Youtube and/or Archive.org and show it under the language name above the audio/video embed in the dialog. Probably the most important thing would be what both Youtube and Archive.org call the Description, but maybe a few other fields like rights, attribution. We also have this info in spreadsheet form if that’s easier.

Resolution

@abettermap thinks Youtube may have an API we can use

abettermap · 2020-10-13T02:36:38Z

@rperlin-ela

YouTube

We will definitely need an API key for this. Here are the steps, which are very similar to my instructions for setting up the Sheets API token, except that you can reuse the same project and even the same API key; you just need to enable the YouTube API:

https://console.developers.google.com/apis/dashboard
Select your project if needed:
If needed, click the project in the popup.
Click this:
Type youtube
Click the result with v3 in it:
Click ENABLE

Should be all set. Go back to your Dashboard and confirm it's there then let me know it's ready:

Archive.org

Question: why don't any of the records use this for audio yet? It looks like ELA has quite a few, just wondering why they're not in the dataset.

Anyhoo, no API key required. Here's the meta for your "Kabardian comparative" item, for example:

https://archive.org/metadata/ela_kabardian_comparative/metadata

Which returns:

Or go straight for the kill with one level deeper: https://archive.org/metadata/ela_kabardian_comparative/metadata/description
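As a rough sketch of how those two URLs relate: the item ID goes right after `/metadata/`, and any extra path segments drill into the returned JSON. The helper name `archiveMetadataUrl` is hypothetical, not something from the project code.

```typescript
// Base of the Internet Archive metadata API, as seen in the URLs above.
const ARCHIVE_METADATA_BASE = "https://archive.org/metadata";

/**
 * Build a metadata API URL for an item, optionally drilling into a sub-path
 * such as ["metadata", "description"].
 * Hypothetical helper for illustration only.
 */
function archiveMetadataUrl(itemId: string, path: string[] = []): string {
  const segments = [itemId, ...path].map((segment) => encodeURIComponent(segment));
  return `${ARCHIVE_METADATA_BASE}/${segments.join("/")}`;
}

// Full record for the item, then just its description.
const fullRecordUrl = archiveMetadataUrl("ela_kabardian_comparative");
const descriptionUrl = archiveMetadataUrl("ela_kabardian_comparative", [
  "metadata",
  "description",
]);
```

Requesting only the description keeps the response small, at the cost of a second request if other fields turn out to be needed later.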

That's the info we need, right? Easy on my end, BUT how will those files be embedded as audio? I see a bunch of formats listed, but also archive.org's docs suggest an iframe, which is normally used for videos.

The audio embed I have set up in the code is not going to do anything with an API URL like that, and vice versa. So, will I need to do a check to see if the Audio url includes archive.org then process it accordingly to make it work with one of these?

The webplayer/video thing to use as an iframe embed (I don't know enough about these formats)
Sift through the full API results for that file to find the WAVE?

I see how to get the full list as well, but I assume we'd only need one of them?

Here's a full list for another of your audio items as a comparison: https://archive.org/download/mid-2003-06-13c

I'm not sure we'll get away without me doing some kind of check/processing on my end, so I could see this being the flow of code:

You provide me with a consistent URL in Audio column to the wav file or whatever
I check Audio to see if it contains archive.org
If so, parse it out so I can get just the item ID (I don't like this already)
Use the ID to hit their API so that I can get the Description
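Steps 2 and 3 of that flow might look something like the sketch below. It assumes the item ID is always the first path segment after `download`, `embed`, or `details`, which holds for the example URLs in this thread but is not guaranteed; `archiveItemId` is a made-up name.

```typescript
/**
 * Return the archive.org item ID from a download, embed, or details URL,
 * or null when the URL does not look like an archive.org item URL.
 * Hypothetical helper; the URL shapes it accepts are an assumption.
 */
function archiveItemId(url: string): string | null {
  const pattern =
    /^https?:\/\/(?:www\.)?archive\.org\/(?:download|embed|details)\/([^\/?#]+)/;
  const match = pattern.exec(url.trim());
  // Optional chaining covers both "no match" and "no capture group".
  return match?.[1] ?? null;
}
```

Returning `null` rather than throwing leaves it up to the caller to decide whether a non-archive URL is an error or just a different kind of media.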

OR, we treat it as a video/webplayer:

You populate the Video field with the embed URL, e.g. https://archive.org/embed/mid-2003-06-13c
I follow steps 3-4 above (3 is easier this way)

The obvious drawback of the "video" approach is that you couldn't have audio AND video for the same record.

OR:

you give me the metadata API URL and I sift through the results until I find the WAVE format (assuming it's always there).
I grab the Description while I'm there.
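For this third option, the sifting could be sketched as below. The interfaces cover only the fields used here and are an assumption about the response shape (a `files` array with `name` and `format`, plus a `metadata` object); the download URL pattern and the function name `pickWaveAndDescription` are likewise assumptions to verify against real responses.

```typescript
// Minimal, assumed shape of the parts of the metadata response used here.
interface ArchiveFile {
  name: string;
  format?: string;
}

interface ArchiveItem {
  metadata: { identifier: string; description?: string };
  files: ArchiveFile[];
}

/**
 * Pick the first file whose format is "WAVE" and pair it with the item's
 * description. audioUrl is null when the item has no WAVE file.
 */
function pickWaveAndDescription(
  item: ArchiveItem
): { audioUrl: string | null; description: string } {
  let audioUrl: string | null = null;
  for (const file of item.files) {
    if (file.format === "WAVE") {
      // encodeURI keeps "/" intact, so files inside sub-folders still resolve.
      audioUrl = `https://archive.org/download/${item.metadata.identifier}/${encodeURI(file.name)}`;
      break;
    }
  }
  return { audioUrl, description: item.metadata.description ?? "" };
}
```

Taking the first WAVE file is an arbitrary choice; for items with many clips, something would still have to decide which one to play.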

I'm going to be hitting the API no matter what so whatever is cleanest for that. Assuming there is enough consistency of results then I'm sure I could find whatever I need there and deliver it to the user in whatever format you want. Personally I would go with the "video" embed since it has a lot more options and could be useful for those languages that have like 100+ wave files:

If you've got instances of Video AND Audio though, then it would be a harder sell for the embed approach. 🤔

For Future Jason

here is the archive's metadata API: https://blog.archive.org/2013/07/04/metadata-api/

abettermap · 2020-10-13T02:37:06Z

@rperlin-ela

Hope that made sense, kind of just learning it as I go, so let me know what you think.

rperlin-ela · 2020-10-13T18:07:57Z

Great, thanks for these instructions. I’ll get on them once we’re ready to get started. Maybe this will be obvious once we’re working on it, but can we choose what metadata gets pulled by the API? Want to clarify this with Dan, whose request this was, but I think Description is the key thing, maybe one or two others. Re: Archive.org, same deal, and yes, we have some things on there. (Aren’t we using the Neo-Mandaic sample from there?) But we're going to have a TON more (all of our audio) within the next year or so; we just wanted to wait until we have the best samples up there with all the proper metadata. Cool that a full list can be pulled, which is probably more than we need, but I’d want to ask Dan; potentially a neat capability. In principle all the code flows you describe sound sensible, but the ability to have both video and audio for any given language community would be nice.


abettermap · 2020-10-14T00:07:11Z

Great, thanks for these instructions. I’ll get on them once we’re ready to get started.

Feel free to start any time, this might be one of the first things I work on since it's fresh in my brain.

Maybe this will be obvious once we’re working on it, but can we choose what metadata gets pulled by the API? Want to clarify this with Dan, whose request this was, but I think Description is the key thing, maybe one or two others.

If I'm understanding correctly, yes. Archive.org doesn't have more than what I showed in the screenshot I think, but as for YouTube there may be other metadata. You can skim through the options here: https://developers.google.com/youtube/v3/docs/videos#properties
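To make "choosing what gets pulled" a bit more concrete, here is a sketch for the YouTube side. It assumes the `videos.list` endpoint with `part=snippet`, and that title and description sit under `items[0].snippet`; the function names are invented for illustration and `VideoListResponse` lists only the fields touched here.

```typescript
// Subset of a videos.list response; real responses carry many more fields.
interface VideoListResponse {
  items?: { snippet?: { title?: string; description?: string } }[];
}

/** Build a videos.list request URL that asks only for the snippet part. */
function videoSnippetUrl(videoId: string, apiKey: string): string {
  const query = [
    "part=snippet",
    `id=${encodeURIComponent(videoId)}`,
    `key=${encodeURIComponent(apiKey)}`,
  ].join("&");
  return `https://www.googleapis.com/youtube/v3/videos?${query}`;
}

/** Return title and description, or null when the API found no such video. */
function pickTitleAndDescription(
  response: VideoListResponse
): { title: string; description: string } | null {
  const snippet = response.items?.[0]?.snippet;
  if (!snippet) return null;
  return {
    title: snippet.title ?? "",
    description: snippet.description ?? "",
  };
}
```

Any other field of interest from the properties list would be picked out of the response in the same way.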

Highlights from today's call

Embeds are fine. Aka the Archive.org web player, aka <iframe>, aka Jason can remove the audio player component.
Once the code is ready, Ross will point the Audio URL(s) to the embed URL format rather than the .wav. Example: https://archive.org/embed/ela_kabardian_comparative
Jason will assume all audio is from archive.org

Transcript files

Unless these are part of something in the videos or audio files already in existence, I don't think this is an easy option. If you had an external file somewhere, like Dropbox, we'd have to either:

Add a new column like Transcript in the spreadsheet
Throw it at the end of Description
Use a CMS like Sanity

abettermap · 2020-10-14T20:53:34Z

@rperlin-ela should I assume that all the archive.org instances will have a playlist? Or maybe a better question is, is this format ok when they don't?

I don't think it's harming anything when there's only a single item, and it keeps the UI consistent with the instances which do have playlists.

abettermap · 2020-10-14T21:12:03Z

Re: additional metadata fields, I think you're also going to want title for both YouTube and Archive instances. Up until now I think I've been using the Endo, so title would be a nice upgrade.

abettermap · 2020-10-15T00:03:39Z

...and I think that's about all that's useful in the YouTube API. I know this just looks like code but you get the idea:

rperlin-ela · 2020-10-15T00:07:22Z

Yes, I think what's in there is now just Language, but Title followed by Description would be great.

I reckon some archive.org instances will have playlists but not all (as with Youtube). Format from the screenshot seems fine to me and good to keep consistent.

I believe the Youtube API is now enabled — let me know if not or if you need anything else from my end.

In case it helps I uploaded a new tileset just now with what I think should be the correct embed URL from Archive for Neo-Mandaic. (Sorry if that was jumping the gun—I see for the time being it's giving an error.)

abettermap · 2020-10-15T01:06:15Z

Yes, I think what's in there is now just Language, but Title followed by Description would be great.

Awesome. Looks like YouTube API also has some kind of caption/transcript endpoint but I'm not going down that road because:

It doesn't look to be part of the regular endpoint that provides the Title/Descrip
...and may involve a second (or third) API call.
...which counts even more against your API account quota (I assume you didn't put any billing info in, which should be fine but I wouldn't want to thread the needle with 3-4 requests for each time a video is opened)
...and based on a skim of ELA's uploads, many do not have transcripts anyway.
...and since archive.org does not have the same API functionality, we'd really be mixing up the UI consistency even more
...only to basically be recreating YouTube, which already supports transcripts within the full view:
...and my understanding for the metadata request in this SOW was just that, metadata. Not a full-on YouTube replacement, which captions/transcript seems to be leaning towards in a small way.

I believe the Youtube API is now enabled — let me know if not or if you need anything else from my end.

Thanks for doing that although I'm not sure it's set up properly as I'm getting an error:

Requests to this API youtube method youtube.api.v3.V3DataVideoService.List are blocked.

Did you create a new key or just enable the YouTube API for the existing key? If you enabled for existing key then you should see both the Sheets and YouTube APIs in the restrictions list:

I'm 99% sure I have the correct API URL because I get what I expect when I put my own key in with the YouTube API enabled.

In case it helps I uploaded a new tileset just now with what I think should be the correct embed URL from Archive for Neo-Mandaic. (Sorry if that was jumping the gun—I see for the time being it's giving an error.)

Nice, I looked at your URL in Final Output and it looks correct and is working for me locally (I made some progress switching over to the embed-only approach), including with the playlist parameter appended to it (even though there's only one file for this guy):

Tomorrow I'd like to start working on parsing out your URLs so I can get the video IDs to use in the API.

rperlin-ela · 2020-10-15T01:30:43Z

I think I'm seeing captions on all the videos that have them; check Wakhi for example. So all seems good. But from other experiences with Youtube I wonder if people who haven't enabled some kind of setting will see it.

I followed what you had in the instructions, so I think all I had done was enable the key. But I see in the dashboard it's failing 100% of the 13 times it was tried. Sorry if this was off base, but something I saw made it seem like I needed to "create credentials", specifying a little further info about the use and then getting the opportunity to restrict key, but that seemed to result in a new API key, which I'm sending you by email. Kind of fumbling around in the dark, but hopefully I can unravel if necessary.

abettermap · 2020-10-15T01:57:07Z

I think I'm seeing captions on all the videos that have them — check Wakhi for example. So all seems good.

Yeah all the captions will be there, we might be talking about two different things. You mentioned something about transcripts and in YouTube the transcript is a textual list of placemarks within the video's captions/subtitles I think? Like:

or

But from other experiences with Youtube I wonder if people who haven't enabled some kind of setting will see it.

Yeah if you click the gear icon it has a cc option. This seems to coincide with but remain independent from cc subtitles etc.

Anyhoo probably not relevant I guess.

As for the new API key, sorry if my instructions steered you off course, I may have missed something as it's hard to know what's on your screen (I have multiple projects in Google so mine might look different). It's probably not a bad idea to have two API keys anyway (one youtube, one sheets), and it looks like the youtube key you just made is working so we should be set but I'll let you know if not!

rperlin-ela · 2020-10-15T02:02:44Z

All good if the API is working. My bad, anyway I hear you on the challenges and it being outside of what we’ve talked about… but this Transcript thing could theoretically be pulled via API right under our metadata, highlighting the relevant sentence as the video is playing? Something like that would be much, much better than a downloadable transcript.


abettermap · 2020-10-15T02:38:05Z

Definitely not going to do that but yes I think with an extra week of time something like that could be achieved. It's just a bunch of data in the API, someone would have to create a very very complex component to sync with the video.

Shouldn't have brought it up, it's so far beyond everything else.

rperlin-ela · 2020-10-15T02:40:15Z

Understood, cool to know about.


abettermap · 2020-10-15T02:40:35Z

it's just a bunch of XML, it's not like they give you a component or HTML to work with. someone would have to hand-roll all that.

abettermap · 2020-10-15T02:44:14Z

if it's of use, the cc can be forced on:

rperlin-ela · 2020-10-15T02:46:00Z

If that’s an easy switch to flip, yes by all means


abettermap · 2020-10-15T02:59:30Z

yeah should be, kinda falls along same lines as how i'm playlistifying the Archive videos. Just make sure to continue to leave all your video URLs as bare as possible. I think these should be the only three scenarios?

https://www.youtube.com/embed/VIDEO_ID
https://www.youtube.com/embed/videoseries?list=PLAYLIST_ID
https://archive.org/embed/ela_kabardian_comparative

I can append parameters to the URL as needed.
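A minimal sketch of how the three formats listed above could be told apart and the ID pulled out of each. The regular expressions are deliberately strict (bare URLs only, nothing extra on the end), which is an assumption about how the data will be entered; `parseEmbedUrl` and the `kind` labels are hypothetical names.

```typescript
interface ParsedEmbed {
  kind: "youtube-video" | "youtube-playlist" | "archive";
  id: string;
}

/** Return the first capture group of a pattern, or null when it doesn't match. */
function firstGroup(pattern: RegExp, text: string): string | null {
  return pattern.exec(text)?.[1] ?? null;
}

/** Classify a bare embed URL into one of the three supported formats. */
function parseEmbedUrl(url: string): ParsedEmbed | null {
  const bare = url.trim();

  // Playlists are checked first because they share the /embed/ prefix.
  const playlistId = firstGroup(
    /^https:\/\/www\.youtube\.com\/embed\/videoseries\?list=([\w-]+)$/,
    bare
  );
  if (playlistId !== null) return { kind: "youtube-playlist", id: playlistId };

  const videoId = firstGroup(/^https:\/\/www\.youtube\.com\/embed\/([\w-]+)$/, bare);
  // "videoseries" with no list parameter is a broken playlist, not a video.
  if (videoId !== null && videoId !== "videoseries") {
    return { kind: "youtube-video", id: videoId };
  }

  const itemId = firstGroup(/^https:\/\/archive\.org\/embed\/([\w.-]+)$/, bare);
  if (itemId !== null) return { kind: "archive", id: itemId };

  return null;
}
```

Anything that comes back as `null` is a URL outside the three agreed formats.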

The approach is kind of fragile since it relies on parsing on my end and consistency on yours, not to mention Google's APIs are kind of volatile. But in lieu of a CMS or additional column in the data, it's probably the best we can do for now.

As for embed parameters, let me know if there are any others of interest:

YouTube: https://developers.google.com/youtube/player_parameters#cc_lang_pref
Archive: https://archive.org/help/audio.php

None are difficult to add, it's not part of the API, just some extra cruft to append to the embed URLs.
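Appending that cruft could be as simple as the sketch below. `withParams` is a hypothetical helper; `cc_load_policy=1` is the YouTube player parameter for showing captions by default, and `playlist=1` is assumed here to be the Archive embed option behind the playlist view, so double-check both against the two docs pages above.

```typescript
/**
 * Append query parameters to an embed URL, using "&" when the URL already
 * has a query string (as the YouTube playlist format does) and "?" otherwise.
 */
function withParams(embedUrl: string, params: [string, string][]): string {
  if (params.length === 0) return embedUrl;
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  const separator = embedUrl.indexOf("?") === -1 ? "?" : "&";
  return embedUrl + separator + query;
}

// Ask the YouTube player to show captions by default.
const youtubeWithCaptions = withParams("https://www.youtube.com/embed/VIDEO_ID", [
  ["cc_load_policy", "1"],
]);

// Assumed Archive embed option for the playlist view.
const archiveWithPlaylist = withParams(
  "https://archive.org/embed/ela_kabardian_comparative",
  [["playlist", "1"]]
);
```

Keeping the spreadsheet URLs bare and adding parameters in code means a parameter can be changed in one place instead of in every row.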

abettermap · 2020-10-15T03:28:58Z

Not sure what the odds of this are but the video I was testing (Wakhi) appears to be the only one with an incorrect URL:

If it's a playlist then it needs videoseries like the others:

abettermap · 2020-10-15T04:25:11Z

I got the single-video API connection wired up, wasn't terribly hard and it definitely looks better than just the language for title.

For descrip, you thinking above the video or below? Here it is on laptop:

And mobile (whatever that is!):

I don't know how long the average descrip is and that might dictate placement a bit, so let me know what you think.

rperlin-ela · 2020-10-15T14:06:23Z

Thanks, fixing Wakhi for the next upload. I assume all the others you see are ok then? I’ll look at parameters and try to let you know if there’s anything ASAP.


abettermap · 2020-10-15T17:28:05Z

I assume all the others you see are ok then?

Ones I've looked at, yes, but that's something you'll want to QC on your end for this iteration and ongoing. If you're taking notes for the Big Comprehensive Manual, include the formats I mentioned above:

If any other variations are used accidentally, it won't work. If I'm missing any other scenarios, let me know. If Google changes their API, let Google let me know. :)

Re: QC in general- I think that's something that could be automated on your end. If you had a separate tab/sheet with formulas pointing to the main source tab, you could use formulas that automatically check for the usual Bad Data suspects:

Unacceptable values, e.g. "South Americaa" or "SOUTH AMERICA" in World Region. Bad example but you get the idea.
Trailing whitespace
Empty cells which should not be empty
Font Image Alt that does not start with https://www.dropbox.com/s/ and end with ?raw=1
Video/audio values that do not begin with "https://archive.org/embed/" or "https://www.youtube.com/embed/"
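The checks in that list are written with Sheets formulas in mind, but the same logic can be sketched in code to show what each one tests. The field names in `SheetRow` and the list of allowed regions passed in are placeholders, not the real column set or vocabulary.

```typescript
// Placeholder row shape; the real sheet has more columns.
interface SheetRow {
  worldRegion: string;
  fontImageAlt: string;
  video: string;
  audio: string;
}

const MEDIA_PREFIXES = ["https://archive.org/embed/", "https://www.youtube.com/embed/"];

/** Return a list of human-readable problems found in one row (empty if none). */
function checkRow(row: SheetRow, allowedRegions: string[]): string[] {
  const problems: string[] = [];
  const cells: [string, string][] = [
    ["World Region", row.worldRegion],
    ["Font Image Alt", row.fontImageAlt],
    ["Video", row.video],
    ["Audio", row.audio],
  ];

  // Leading or trailing whitespace in any cell.
  for (const [column, value] of cells) {
    if (value !== value.trim()) problems.push(`${column}: stray whitespace`);
  }

  // Required cell that is empty, or a value outside the allowed list.
  if (row.worldRegion.trim() === "") {
    problems.push("World Region: empty");
  } else if (allowedRegions.indexOf(row.worldRegion) === -1) {
    problems.push(`World Region: unexpected value "${row.worldRegion}"`);
  }

  // Dropbox links must start and end a particular way.
  const font = row.fontImageAlt;
  const fontOk =
    font.indexOf("https://www.dropbox.com/s/") === 0 && font.slice(-6) === "?raw=1";
  if (font !== "" && !fontOk) problems.push("Font Image Alt: not a raw Dropbox link");

  // Media links must use one of the embed prefixes.
  const media: [string, string][] = [["Video", row.video], ["Audio", row.audio]];
  for (const [column, value] of media) {
    const ok = MEDIA_PREFIXES.some((prefix) => value.indexOf(prefix) === 0);
    if (value !== "" && !ok) problems.push(`${column}: not an embed URL`);
  }

  return problems;
}
```

Each check maps to one formula in a QC tab, so starting with a single column as suggested would mean starting with a single one of these branches.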

Just brainstorming but those are several very-real things we've encountered on more than one occasion and it adds time to troubleshooting, communication, and data maintenance, so might be worth pursuing. Could just start with one column at a time to get some practice and see how it goes, and I think it would pay for itself in the long run.

abettermap · 2020-10-15T17:29:12Z

For description in the video modal, are you thinking of having it above the video or below? See my screenshots a few comments up.

rperlin-ela · 2020-10-16T16:13:21Z

The way you have it looks good to me: title on top and then description below. Length of Acehnese description is probably representative but it would be good to allow for the possibility of longer ones.

Thanks for the QC tips, you're obviously right— I just need someone a little more expert with Google Sheets, will ask.

rperlin-ela · 2020-10-16T16:15:43Z

(Sorry forgot that playlists and Archive were still to-do!)

abettermap · 2020-10-16T21:10:54Z

No worries. Have you had a chance to check on those extra parameters yet?

YouTube: https://developers.google.com/youtube/player_parameters#cc_lang_pref
Archive: https://archive.org/help/audio.php

rperlin-ela · 2020-10-16T21:14:58Z

Yeah, didn’t see anything critical — good just forcing on subs.


abettermap · 2020-10-19T20:30:28Z

Sounds good.

Questions

Youtube playlists meta title

The title differs from the video titles, so should there be some indication that it's a playlist? Aside from the playlist icon in the top-right, it's not immediately obvious that the title refers to a playlist:

Could be something subtle like this:

Maybe the (playlist) part would be less necessary when there is a description though (although I don't see many playlists with a description)? If the description mentions anything like "this collection of..." then the "(playlist)" suffix probably isn't needed.

ID/format issue?

I'm wondering if maybe this is a byproduct of standardizing the URLs, or a true typo, but this one isn't working (Neo-Aramaic). Your syntax is fine, just like I requested it:

https://www.youtube.com/embed/videoseries?list=PLcXFPx-z7B0pfN0ZhGj1bVpE9foHBR1nC

But that link doesn't work. If I found the correct playlist, then the URL would be this:

https://www.youtube.com/embed/videoseries?list=PL2BF759B18CCE5DD2

abettermap · 2020-10-19T21:01:13Z

I'd like to add some basic error catching for non-existent videos and playlists in case we come across another one, but I'm not sure which scenario to use:

If the API doesn't find the match (in other words we won't have a title or description) then should I assume the video won't play either?
...or should I trust that your URLs are 100% always perfect (except for Neo-Aramaic, ha) and try to load the embed anyway?

Option 1 could just show "video not found" or something and not even show the player, while Option 2 (which I'm currently using) shows this:

followed by this if i try to play the video:

If my URL parsing (to extract the playlist/video ID) is accurate for one video and one playlist, then it should be accurate for all of them, so I'm leaning towards assuming that if the API doesn't find anything for the given video/playlist ID, then your URL must be incorrect.

On my end (the code side, not what the user sees) I set up a Sentry message to let us know if the YouTube API returned nothing for that URL. This doesn't tell me whether it's my fault or yours, but at least it will silently tell us which URL failed without anything crashing on the user.

abettermap · 2020-10-19T21:15:49Z

If you want to see what it looks like, here is the deploy: https://deploy-preview-116--languagemapping.netlify.app/details?id=645

archive.org stuff not ready but YT vids and playlists are working.

rperlin-ela · 2020-10-20T03:31:54Z

All looking good. Error catching sounds good, go with your judgment on it.

Maybe the (playlist) part would be less necessary when there is a description though (although I don't see many playlists with a description)? If the description mentions anything like "this collection of..." then the "(playlist)" suffix probably isn't needed.

Yep, good call, this gives a whole new visibility to playlists, so I will work on filling in all the description for playlists, don't think anything additional is needed.

I'm wondering if maybe this is a byproduct of standardizing the URLs, or a true typo, but this one isn't working (Neo-Aramaic).

Fixed now — link was right, but playlist was temporarily/inadvertently private.

abettermap · 2020-10-20T06:55:31Z

Error handling

it was naive of me to ask "can we rely on 100% accurate data". the answer to that question in the absence of validation and QC/QA, is always NO when humans are responsible for entering the data. so, with that in mind i took Sentry up a couple notches with the err handling now catching 3 scenarios:

Incorrect URL format, which needs to be one of the three formats we discussed: YouTube playlist, embed, or Internet Archive embed. Could be a typo on your end, for example.
Format was fine but no video/embed/etc. was found, e.g. your Neo-Aramaic
General fetch failure catch-all for the scary unknown scenarios.

Great?

Yes it is. Big softball thrown in your direction to deal with video errors falling in one of those 3 scenarios. I will use the Sentry info I added today ("tags" relevant to the scenarios) to create an alert so you'll get notified if someone opens a bogus video URL. You will definitely want to subscribe to those alerts.

Hopefully there's a way in Sentry to pick and choose which alerts (instead of the full mess of error events) but it's worth it either way. If you don't want them to go to your shared ELA acct then we can add another acct as a separate user, but this will save me the trouble of notifying you any time there is a video issue. It's already an automated process so might as well take advantage of it.
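For reference, the three scenarios boil down to a decision like the sketch below. The tag names and the `LookupOutcome` shape are invented for illustration and are not the actual tags in the Sentry setup; the sketch only decides which tag applies and leaves the reporting call out.

```typescript
// Hypothetical tag values, one per scenario described above.
type VideoErrorTag = "bad-url-format" | "not-found" | "fetch-failed";

interface LookupOutcome {
  urlMatchedKnownFormat: boolean; // did the URL match one of the three formats?
  fetchSucceeded: boolean; // did the API request complete without an error?
  resultCount: number; // how many items the API returned
}

/** Decide which error scenario applies, or null when the lookup was fine. */
function classifyVideoError(outcome: LookupOutcome): VideoErrorTag | null {
  if (!outcome.urlMatchedKnownFormat) return "bad-url-format";
  if (!outcome.fetchSucceeded) return "fetch-failed";
  if (outcome.resultCount === 0) return "not-found";
  return null;
}
```

The order matters: a malformed URL is reported as a data-entry problem before any request is made, and a private or missing playlist like the Neo-Aramaic one would fall under "not-found".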

Closed captioning

...cannot be forced on for videos or playlists which don't already have it, including those where the cc is auto-generated.

Questions

HTML description support?

i turned it on since i saw a <span> in one of your Internet Archive examples but might be a good idea if i disable it since there's no guarantee it will match our UI:

Next steps

Sentry alerts

Will get these wired up in the UI tomorrow or soon, otherwise kind of a waste of time if I have to forward the emails to you about data stuff. I also have it broken down into production and deploy environments, so you can see if they occurred during our internal testing/PR stuff or live site.

I haven't tested it with much besides local dev so far, there's a lot of moving parts to all the error stuff so fingers crossed. it's working great on my local env though so no reason shouldn't be same in The Outside World.

Also can't really test much without a known error (i was leaning on your broken one to test), meaning i can't test the Sentry stuff including alerts either. might have to do a dummy one just so the setup doesn't get left in the dust until a real error happens.

More things I'm missing when

...it's not 12:54am? Most likely.

rperlin-ela · 2020-10-20T13:41:20Z

Ok, sounds good about error handling. No problem about closed captioning. HTML from Archive not important, better to stay consistent. Sorry if I wasn’t clear: I don’t think we need the word “Playlist” for the playlists. I will make sure the descriptions start with “A collection of…” I seem to be getting an error now with the deploy when I go to play any (non-playlist) video. Playlists and Archive looking good tho.

abettermap · 2020-10-20T20:50:02Z

HTML from Archive not important, better to stay consistent.

Ok I will remove it, just make sure to update your Archive descrips otherwise you'll see the HTML as HTML.

Sorry if I wasn’t clear — I don’t think we need the word “Playlist” for the playlists.

No worries, I'll remove it.

I seem to be getting an error now with the deploy when I go to play any (non-playlist) video.

Is it like this?

If so, it's happening regardless of where it's played, e.g. paste the URL into browser bar vs in the project, but if I remove cc_load_policy it seems to work fine:

Non-ELA working example: https://www.youtube.com/embed/M7lc1UVf-VE?cc_load_policy=1
ELA non-working example: https://www.youtube.com/embed/DwLRM-BTDvQ&cc_load_policy=1
ELA working example: https://www.youtube.com/embed/DwLRM-BTDvQ
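
One thing worth ruling out while debugging: as typed above, the non-working URL joins its parameter with `&` rather than `?`. Whether that reflects the real URL the app builds or just how it was pasted here isn't known, so treat this as a diagnostic sketch only. The helper name is made up; it only uses the standard `URL` API to show what a browser would treat as path versus query parameters.

```typescript
// Hypothetical helper: splits an embed URL into its path and parsed query params.
function embedParams(raw: string): { path: string; params: Record<string, string> } {
  const url = new URL(raw)
  const params: Record<string, string> = {}
  url.searchParams.forEach((value, key) => {
    params[key] = value
  })
  return { path: url.pathname, params }
}

// With "?", cc_load_policy is parsed as a query parameter.
const withQuestionMark = embedParams('https://www.youtube.com/embed/M7lc1UVf-VE?cc_load_policy=1')

// With "&" and no "?", the parameter text ends up inside the path instead.
const withAmpersand = embedParams('https://www.youtube.com/embed/DwLRM-BTDvQ&cc_load_policy=1')
```

In the second case the video ID effectively becomes `DwLRM-BTDvQ&cc_load_policy=1`, which would be one way to end up with a "video unavailable" style error.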

I'm wondering if it has something to do with not setting the language via URL? I'm assuming it defaults to en, but that video for example only has English auto-generated, with Quechua legit:

How important is it to force the cc on? Since it's super inconsistent anyway should we just leave it up to the user?

abettermap · 2020-10-20T20:52:15Z

The language parameter is cc_lang_pref but might not work anymore (see highlighted text):

abettermap · 2020-10-20T20:53:58Z

Messing with the iframe parameters has nothing to do with the YouTube API or the metadata we are adding, so in order to stay focused on the SOW I'm going to say let's drop the cc efforts for now. If it's important to you then feel free to create a wishlist issue.

rperlin-ela · 2020-10-20T20:55:02Z

Ok, all sounds good. Yes, that’s the error. Yep, if cc_load_policy is causing any trouble, let’s remove it, no problem, no questions asked. It rings a bell that this is something no longer supported— that’s why we used to tag for this in the past but stopped doing so. A miracle way to do this via API parameter seemed too good to be true.

abettermap · 2020-10-20T21:25:29Z

Sooo guess what. The problem was actually from the playlist parameter I'm throwing onto the URL for Archive player. I was just being lazy leaving it in for YouTube because it "worked" but I must have only been testing on playlists. 🤦

With the playlist parameter removed from the YouTube URLs, cc_load_policy doesn't break anything.
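
A minimal sketch of the fix, assuming the parameters get appended in one place: only add `playlist` for Archive embeds and keep it off YouTube URLs. The function name and the `'1'` values are assumptions for illustration, and whether cc_load_policy stays at all is a separate decision.

```typescript
// Hypothetical helper: appends player parameters based on the embed host.
function withPlayerParams(raw: string): string {
  const url = new URL(raw)
  const host = url.hostname.replace(/^www\./, '')

  if (host === 'archive.org') {
    // Archive player only; this is the parameter that was breaking YouTube.
    url.searchParams.set('playlist', '1')
  } else if (host === 'youtube.com') {
    // Optional: request captions where the video already has them.
    url.searchParams.set('cc_load_policy', '1')
  }
  // Any other host is returned unchanged.
  return url.toString()
}
```

Using `searchParams.set` also means the `?` and `&` separators are handled by the URL API rather than by string concatenation.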

abettermap added the labels effort: 2 🥈 (med), priority: 0 (wishlist), type: ✨ enhancement, and comm: ☝️ good first issue on Sep 17, 2020

abettermap added the labels focus: ⚙️ functionality and priority: 3 (high), and removed priority: 0 (wishlist), on Oct 13, 2020

abettermap mentioned this issue Oct 15, 2020

Video/media: add metadata to modals #116

Merged

abettermap pinned this issue Oct 20, 2020

abettermap mentioned this issue Oct 20, 2020

Do more with Sentry #90

Open

6 tasks

abettermap closed this as completed in #116 Oct 20, 2020

abettermap unpinned this issue Oct 22, 2020
