
Musical creation and transcription assistance via generative AI #3854

Open · 9 tasks
walterbender opened this issue Apr 8, 2024 · 8 comments

walterbender (Member) commented Apr 8, 2024

Ticket Contents

Description

Many people have musical ideas but struggle to articulate them. Generative AI shows promise in helping people transcribe their musical ideas. The goal would be to create a generative AI tool that assists users such that they could sing (or perform on an instrument) their idea -- as well as speak or type instructions without music -- and be presented with some possible transcriptions of their idea, output as Music Blocks code so that it can be further refined and manipulated.

Specifically, we would be working toward accomplishing the following:

  • Create an in-app interface for a user to record or upload an audio file of their musical sample
  • Create a server-side LLM service that analyzes the sound file and outputs Music Blocks data
  • Create an API to communicate data between the server and the client
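The client/server API above could exchange payloads along these lines. This is a minimal sketch only: the class names, field names, and data shapes are assumptions, not a settled design.

```python
import base64
import json
from dataclasses import dataclass, field, asdict

# Hypothetical request/response shapes for the client <-> server API.
# All names and fields here are illustrative assumptions.

@dataclass
class TranscriptionRequest:
    audio_base64: str          # recorded or uploaded audio, base64-encoded
    audio_format: str          # e.g. "wav" or "mp3"
    instructions: str = ""     # optional spoken/typed guidance from the user

@dataclass
class TranscriptionResponse:
    candidates: list = field(default_factory=list)  # candidate transcriptions as Music Blocks data
    model: str = "unknown"                          # which server-side model produced them

def encode_request(raw_audio: bytes, fmt: str, instructions: str = "") -> str:
    """Serialize a request into the JSON the server would receive."""
    req = TranscriptionRequest(
        audio_base64=base64.b64encode(raw_audio).decode("ascii"),
        audio_format=fmt,
        instructions=instructions,
    )
    return json.dumps(asdict(req))

payload = encode_request(b"\x00\x01", "wav", "transcribe this melody")
print(json.loads(payload)["audio_format"])  # -> wav
```

Base64-encoding the audio keeps the payload plain JSON; a real implementation might instead use a multipart upload for large files.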

Goals & Mid-Point Milestone

Goals

  • In-app recording
  • Research applicable models for music transcription
  • Research applicable models for conversion of transcript to Music Blocks code
  • Backend services
  • Frontend UX

Goals Achieved By Mid-point Milestone

  • In-app recording
  • Research applicable models for music transcription
  • Research applicable models for conversion of transcript to Music Blocks code

Setup/Installation

No response

Expected Outcome

It is expected that a user could sing or perform some melody and have that transcribed into a Music Blocks project.

Acceptance Criteria

  • input from the user (and accompanying UX)
  • transcription
  • import of transcription into a Music Blocks project

Implementation Details

Sourcing the audio file need not be integrated into Music Blocks itself, nor does the interface between the audio file and the transcript. But the output needs to be a Music Blocks project that can either be imported into the application or stored in the Planet.
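As a rough illustration of the final step (turning a transcript into something importable), the sketch below maps a list of (pitch, duration) pairs to a nested structure. The actual Music Blocks project format is defined by the application; the shape used here is purely a placeholder assumption.

```python
import json

# Hypothetical transcript: (note name, duration as a fraction of a whole note).
transcript = [("sol", 1 / 4), ("sol", 1 / 4), ("la", 1 / 2)]

def transcript_to_project(notes):
    """Map a transcript to a placeholder project structure.

    The real Music Blocks project format is defined by the
    application; this nested-dict shape is illustrative only.
    """
    return {
        "name": "Transcribed melody",
        "blocks": [
            {"type": "note", "pitch": pitch, "duration": dur}
            for pitch, dur in notes
        ],
    }

print(json.dumps(transcript_to_project(transcript), indent=2))
```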

Mockups/Wireframes

No response

Product Name

Music Blocks

Organisation Name

Sugar Labs

Domain

Education

Tech Skills Needed

Artificial Intelligence, Docker, JavaScript, Python

Mentor(s)

@pikurasa @walterbender

Category

Documentation, Machine Learning, Research, AI

@walterbender walterbender changed the title [DMP 2024]: Musical creation and transcription assistance via generative AI Musical creation and transcription assistance via generative AI Apr 12, 2024
falgun143 (Contributor) commented Apr 18, 2024

Hello @walterbender, will this project be heavily based on ML algorithms, or on training a base model (like Llama 2, Gemma, etc.) with custom data? And will this project be under DMP?

falgun143 (Contributor) commented

@walterbender, any thoughts?

walterbender (Member, Author) commented

@falgun143 the project won't be offered under DMP this year.
While I am not wedded to any particular implementation details, I suspect it will leverage LLMs with some measure of training data.

gitjeet commented Apr 24, 2024

@walterbender For the conversion of recorded music to MB, instead of relying on a server-side LLM, I believe we could integrate this functionality directly into MB itself. This would involve converting the recorded sound to MIDI format and then to MB.

Once that is done, I think we could send either format (ABC or MIDI) to an LLM server for improvisation.
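The audio -> MIDI step proposed here hinges on mapping detected fundamental frequencies to MIDI note numbers. Below is a minimal sketch of that mapping only; pitch detection itself is a separate, harder problem, and this assumes frequencies have already been extracted cleanly.

```python
import math

def freq_to_midi(freq_hz: float) -> int:
    """Map a frequency in Hz to the nearest MIDI note number.

    Uses the standard equal-temperament relation with A4 = 440 Hz
    (MIDI note 69): midi = 69 + 12 * log2(f / 440).
    """
    return round(69 + 12 * math.log2(freq_hz / 440.0))

# A4 -> 69, middle C (~261.63 Hz) -> 60, A5 -> 81
print(freq_to_midi(440.0), freq_to_midi(261.63), freq_to_midi(880.0))  # -> 69 60 81
```

Rounding to the nearest semitone discards microtonal detail, which is usually the right trade-off for transcription into discrete note blocks.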

falgun143 (Contributor) commented
@gitjeet, since MB is already a large, long-standing project, I feel that integrating this LLM project directly into the MB codebase may make MB slower.

falgun143 (Contributor) commented

Nope, I don't think it would have any issue with time complexity, and it would be easy and cost-effective. To be clear, what I meant is not integrating the LLM into MB (client-side). What I meant is only the conversion of mp3 -> MIDI -> MB, which does not require an LLM; it is a pure graph algorithm. MB already has conversions for MB -> MIDI and MB -> ABC. Send the MIDI or ABC to the LLM (server-side, not client-side). PS: I don't think it's an old project.

Good insight. Then there would not be any frontend, right? But tell me: how can we upload an external audio file, extract musical features, and then send them to the LLM? Also, the project description clearly states that we have to create an in-app interface.
