๐ŸŽ›๏ธ fix: Improve Frontend Practices for Audio Settings #3624

Merged
merged 4 commits into from
Aug 13, 2024


danny-avila
Owner

Summary

Closes #3622

  • Refactored the Dropdown component to use TypeScript in useCallback for better type safety.
  • Updated the Speech component to remember the last selected voice, improving user experience.
  • Maintained speechTab selections
  • Removed await calls inside useCallbacks and relied on updates for dropdowns to enhance performance.
  • Updated Dropdown component styles to match the header theme for visual consistency.
  • Refactored useTextToSpeech hook to improve voice selection and management.
  • Updated useTextToSpeechBrowser, useTextToSpeechEdge, and useTextToSpeechExternal hooks for better state management and error handling.
  • Modified the store to use undefined as the default value for the voice atom.
  • Updated various components to use common Option type for dropdown options.
  • Improved error handling and logging throughout the audio-related components.
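
A minimal sketch of how a shared dropdown option type and an `undefined` voice default might fit together. The `Option` shape and the `toOption` and `resolveVoice` helpers are hypothetical names used only for illustration, not the code merged in this PR, and the voice names are placeholders.

```typescript
// Hypothetical shared option shape for dropdowns (assumed, not taken from the PR).
type Option = { value: string; label: string };

// Normalize plain strings or Option objects into a single shape,
// so every dropdown can consume the same type.
function toOption(item: string | Option): Option {
  return typeof item === "string" ? { value: item, label: item } : item;
}

// Decide which voice to use. `stored` is the remembered selection and is
// `undefined` when nothing has been chosen yet (mirroring an undefined default).
// Falls back to the first available voice, or undefined if the list is empty.
function resolveVoice(
  stored: string | undefined,
  available: Option[],
): string | undefined {
  if (stored !== undefined && available.some((o) => o.value === stored)) {
    return stored;
  }
  return available[0]?.value;
}

// Placeholder voice names, for illustration only.
const voices: Option[] = [
  toOption("voice-a"),
  toOption({ value: "voice-b", label: "Voice B" }),
];

const remembered = resolveVoice("voice-b", voices); // keeps the last selection
const firstRun = resolveVoice(undefined, voices); // falls back to the first voice
```

Using `undefined` rather than an empty string as the "nothing selected" value lets the fallback distinguish "never chosen" from a real voice name.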

Testing:
I thoroughly tested the changes by manually interacting with the audio settings in the application. I verified that voice selection, speech generation, and playback work as expected across different browsers and scenarios. Additionally, I ensured that the UI updates correctly when changing audio settings.

To test these changes:

  1. Navigate to the audio settings in the application.
  2. Try selecting different voices and verify that the last selected voice is remembered.
  3. Generate speech using various engines (browser, edge, external) and ensure playback works correctly.
  4. Test error scenarios by intentionally causing failures (e.g., disconnecting internet during speech generation) and verify that error messages are displayed correctly.

@danny-avila danny-avila merged commit 0569623 into main Aug 13, 2024
3 checks passed
@danny-avila danny-avila deleted the fix/voice-settings branch August 13, 2024 06:42
@danny-avila
Owner Author

danny-avila commented Aug 13, 2024

@berry-13 it's a mistake how we are currently managing speech on the frontend.

The current design tries to handle every provider, plus both speech-to-text AND text-to-speech, in a single piece of logic spread across many abstracted values and hooks. It's very unclear what's what, which leads to many unexpected behaviors.

Ideally, each setting (STT and TTS) has its own components, each provider (external, edge, browser) also has its own audio components, and each of those uses its own respective hooks.
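
A rough sketch of that separation, assuming the voice selection is kept per provider rather than in one shared value. `ProviderName`, `VoiceSettings`, and the voice names are hypothetical and exist only to illustrate the idea. This is plain TypeScript, not the actual hooks or components.

```typescript
// The three providers named in this thread.
type ProviderName = "browser" | "edge" | "external";

// Hypothetical per-provider voice state: each provider remembers its own
// selection, so switching provider never overwrites another provider's voice.
class VoiceSettings {
  private selected: Partial<Record<ProviderName, string>> = {};

  // Remember the voice chosen for one provider only.
  select(provider: ProviderName, voice: string): void {
    this.selected[provider] = voice;
  }

  // Return the remembered voice if the provider still offers it,
  // otherwise the provider's first voice (undefined if it has none).
  current(provider: ProviderName, available: string[]): string | undefined {
    const stored = this.selected[provider];
    if (stored !== undefined && available.indexOf(stored) !== -1) {
      return stored;
    }
    return available[0];
  }
}

// Placeholder voice lists, for illustration only.
const settings = new VoiceSettings();
settings.select("external", "voice-b");

const externalVoice = settings.current("external", ["voice-a", "voice-b"]);
const browserVoice = settings.current("browser", ["system-voice"]);
```

Keying state by provider is one way to keep a selection made for one engine from leaking into another, which is the kind of cross-talk a single shared value makes easy.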

@danny-avila
Owner Author

danny-avila commented Aug 13, 2024

I've patched up some of the bugs, between this PR and #3627, that came from managing stt/tts/edge/browser/external this way, but then it should not create this many warnings (Firefox only):

image

This is indicative of bad design, and I'd rather remove msedge completely to avoid these issues, not to mention the npm vulnerabilities it creates

npm audit
# npm audit report

axios  >=1.3.2
Severity: high
Server-Side Request Forgery in axios - https://github.com/advisories/GHSA-8hc4-vh64-cxmj
fix available via `npm audit fix --force`
Will install axios@1.3.1, which is a breaking change
node_modules/axios
  msedge-tts  >=1.2.0
  Depends on vulnerable versions of axios
  Depends on vulnerable versions of crypto-browserify
  node_modules/msedge-tts

@danny-avila
Owner Author

I'm also starting to see these errors on chrome:

WebSocket connection to 'wss://speech.platform.bing.com/consumer/speech/synthesize/readaloud/edge/v1?TrustedClientToken=6A5AA1D4EAFF4E9FB37E23D68491D6F4' failed: Insufficient resources
_initClient @ vendor-C9dhCTCX.js:170
(anonymous) @ vendor-C9dhCTCX.js:170
(anonymous) @ vendor-C9dhCTCX.js:170
__awaiter @ vendor-C9dhCTCX.js:170
_send @ vendor-C9dhCTCX.js:170
_ws.onopen @ vendor-C9dhCTCX.js:170

danny-avila added a commit that referenced this pull request Aug 17, 2024
* refactor: do not call await inside useCallbacks, rely on updates for dropdown

* fix: remember last selected voice

* refactor: Update Speech component to use TypeScript in useCallback

* refactor: Update Dropdown component styles to match header theme
kenshinsamue pushed a commit to intelequia/LibreChat that referenced this pull request Sep 17, 2024
* refactor: do not call await inside useCallbacks, rely on updates for dropdown

* fix: remember last selected voice

* refactor: Update Speech component to use TypeScript in useCallback

* refactor: Update Dropdown component styles to match header theme
Development

Successfully merging this pull request may close these issues.

[Bug]: Selecting the voice type does not work.