Voice search
The widget offers voice search as an alternative to typing. Transcription happens in the user's browser via the Web Speech API (`webkitSpeechRecognition` / `SpeechRecognition`); the widget then sends the resulting text to `/api/widget/search`. The component is `VoiceSearchModal`.
How it works
- The user clicks the "voice" button next to the input. If `features.voiceSearch = false` or the browser does not support the Web Speech API, the button is not rendered (`widget/src/core/voiceSearch.ts:96-99` checks `'SpeechRecognition' in window`).
- The modal opens and the browser requests microphone permission (first time).
- The browser transcribes in real time. The modal applies:
  - `MAX_RECORDING_SECONDS = 60` — recording is cut off after one minute.
  - Silence detection: 1.5 s of silence after at least 2 s recorded → cut and process (`VoiceSearchModal.tsx:32-35`).
- When transcription ends, the widget calls `POST /api/widget/search` with `query = transcript` (`useModalSearchHandlers.handleVoiceSearch:120-156`). No audio is uploaded: STT lives entirely on the client.
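The support check described above can be sketched as a small pure helper. This is illustrative only — `isVoiceSearchSupported` is a hypothetical name, and the real implementation in `widget/src/core/voiceSearch.ts` may differ in detail:

```typescript
// Hypothetical helper mirroring the check described above; the real
// code in widget/src/core/voiceSearch.ts may be structured differently.
type MaybeSpeechWindow = {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
};

function isVoiceSearchSupported(
  win: MaybeSpeechWindow,
  voiceSearchEnabled: boolean,
): boolean {
  // The button is hidden when the feature flag is off…
  if (!voiceSearchEnabled) return false;
  // …or when neither the standard nor the webkit-prefixed API exists.
  return 'SpeechRecognition' in win || 'webkitSpeechRecognition' in win;
}
```

Because this check gates rendering of the button itself, unsupported browsers (for example Firefox) never see a broken control — the feature is simply absent.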
Design decision: the Web Speech API gives us zero network latency for STT and zero server-side cost, at the price of limited support (Chromium-based + Safari). There is no server-side fallback — if the browser does not support Web Speech API, the feature is hidden.
Browser support
| Browser | Support |
|---|---|
| Chrome / Edge (≥ 90) | yes |
| Safari (≥ 14.1) | yes |
| iOS Safari (≥ 14.5) | yes — explicit permission required |
| Android Chrome | yes |
| Firefox | no — Firefox does not implement `webkitSpeechRecognition` or `SpeechRecognition`. The button is hidden. |
| Others | graceful fallback: the button is hidden if the API is missing. |
Modal states
The modal cycles through these states (i18n keys under `voice.*`):
- `tapToStart` — initial prompt.
- `recording` — recording and streaming transcription.
- `processing` — confirming the transcript and dispatching the search.
- `success` — transcript ready.
- Errors: `microphonePermissionDenied`, `microphoneAccessFailed`, `microphoneBlockedByPolicy`, `noAudioDetected`, `noVoiceDetected`, `audioProcessError`, `recordingError`, `browserNotSupported`.
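Assuming the i18n keys simply mirror the state names under the `voice.*` namespace (as the list above suggests), resolving a key from a state is a one-liner. The union type and helper below are illustrative, not the widget's actual types:

```typescript
// Illustrative: assumes i18n keys mirror state names one-to-one
// under the voice.* namespace, as listed above.
type VoiceModalState =
  | 'tapToStart'
  | 'recording'
  | 'processing'
  | 'success'
  | 'microphonePermissionDenied'
  | 'microphoneAccessFailed'
  | 'microphoneBlockedByPolicy'
  | 'noAudioDetected'
  | 'noVoiceDetected'
  | 'audioProcessError'
  | 'recordingError'
  | 'browserNotSupported';

function voiceI18nKey(state: VoiceModalState): string {
  return `voice.${state}`;
}
```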
Browser permissions
- `getUserMedia()` requires a secure context (HTTPS). As an exception, it also works on localhost without HTTPS.
- If your host applies a `Permissions-Policy` header, make sure it does not block microphone access for the widget's origin. Recommended: `Permissions-Policy: microphone=(self "https://cdn.neuroon.ai")`.
A server-side endpoint exists (but the widget does not use it)
There is a `POST /api/widget/search/audio` endpoint (`WidgetSearchController.java:185`) that accepts `multipart/form-data` with an `audio` field. It is available if you want to build a custom client with server-side STT, but the official widget does not use it today.
```shell
curl -X POST https://api.neuroon.ai/api/widget/search/audio \
  -H "X-Widget-Token: $WIDGET_TOKEN" \
  -F "audio=@recording.webm" \
  -F "locale=en"
```
Returns `AudioSearchResponseDTO` with both the transcript and the results. Useful for server-side integrations or browsers without the Web Speech API.
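A custom client can issue the same request with `fetch` and `FormData`. The URL, header, and field names below follow the curl example above; `buildAudioSearchRequest` itself is a hypothetical helper, not part of the widget:

```typescript
// Sketch of a custom client for POST /api/widget/search/audio.
// URL, header, and multipart field names follow the curl example above;
// buildAudioSearchRequest is a hypothetical helper, not shipped code.
function buildAudioSearchRequest(
  token: string,
  audio: Blob,
  locale = 'en',
): { url: string; init: { method: 'POST'; headers: Record<string, string>; body: FormData } } {
  const form = new FormData();
  form.append('audio', audio, 'recording.webm'); // the endpoint's "audio" field
  form.append('locale', locale);
  return {
    url: 'https://api.neuroon.ai/api/widget/search/audio',
    init: {
      method: 'POST',
      // Do not set Content-Type: fetch adds the multipart boundary itself.
      headers: { 'X-Widget-Token': token },
      body: form,
    },
  };
}

// Usage: const { url, init } = buildAudioSearchRequest(token, blob);
// const dto = await (await fetch(url, init)).json(); // AudioSearchResponseDTO
```

Returning a request description instead of calling `fetch` directly keeps the helper testable without a network; the caller performs the actual request and parses the JSON body.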
Further reading
- Text search — the flow the transcript feeds into.
- Accessibility — the modal has a focus trap and closes on `Esc`.
- Reference → Errors.