Voice search

The widget offers voice search as an alternative to typing. Transcription happens in the user's browser via the Web Speech API (webkitSpeechRecognition / SpeechRecognition); the widget then sends the resulting text to /api/widget/search. The component is VoiceSearchModal.

How it works

  1. The user clicks the "voice" button next to the input. If features.voiceSearch = false or the browser does not support the Web Speech API, the button is not rendered (widget/src/core/voiceSearch.ts:96-99 checks 'SpeechRecognition' in window).
  2. The modal opens and the browser requests microphone permission (first time).
  3. The browser transcribes in real time. The modal applies:
    • MAX_RECORDING_SECONDS = 60 — cuts off after one minute.
    • Silence detection: 1.5 s of silence after at least 2 s recorded → cut and process (VoiceSearchModal.tsx:32-35).
  4. When transcription ends, the widget calls POST /api/widget/search with query = transcript (useModalSearchHandlers.handleVoiceSearch:120-156). No audio is uploaded: STT lives entirely on the client.
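The detection and cut-off rules above can be sketched as pure functions. This is a hedged illustration, not the widget's actual code: the two timing constants and the `'SpeechRecognition' in window` check come from the docs, while the helper names are hypothetical.

```typescript
const MAX_RECORDING_SECONDS = 60;   // hard cap on a recording
const MIN_RECORDING_MS = 2_000;     // silence only counts after this much audio
const SILENCE_CUTOFF_MS = 1_500;    // stop after this much uninterrupted silence

// Feature detection mirroring the check in widget/src/core/voiceSearch.ts:
// the button renders only when one of the two constructors exists on `window`.
function isVoiceSearchSupported(w: Record<string, unknown>): boolean {
  return 'SpeechRecognition' in w || 'webkitSpeechRecognition' in w;
}

// Decide whether an in-progress recording should stop, given total elapsed
// time and the time since the last interim transcription result.
function shouldStopRecording(elapsedMs: number, silentForMs: number): boolean {
  if (elapsedMs >= MAX_RECORDING_SECONDS * 1000) return true; // one-minute cap
  return elapsedMs >= MIN_RECORDING_MS && silentForMs >= SILENCE_CUTOFF_MS;
}
```

Note that the silence rule only fires after at least two seconds of audio, so a slow start does not immediately cancel the recording.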

Design decision: the Web Speech API gives us zero network latency for STT and zero server-side cost, at the price of limited support (Chromium-based + Safari). There is no server-side fallback — if the browser does not support Web Speech API, the feature is hidden.

Browser support

| Browser | Support |
| --- | --- |
| Chrome / Edge (≥ 90) | yes |
| Safari (≥ 14.1) | yes |
| iOS Safari (≥ 14.5) | yes — explicit permission required |
| Android Chrome | yes |
| Firefox | no — Firefox implements neither webkitSpeechRecognition nor SpeechRecognition; the button is hidden |
| Others | graceful fallback: the button is hidden if the API is missing |

Modal states

The modal cycles through these states (i18n keys under voice.*):

  • tapToStart — initial prompt.
  • recording — recording and streaming transcription.
  • processing — confirming transcript and dispatching the search.
  • success — transcript ready.
  • Errors: microphonePermissionDenied, microphoneAccessFailed, microphoneBlockedByPolicy, noAudioDetected, noVoiceDetected, audioProcessError, recordingError, browserNotSupported.
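The happy-path cycle can be modeled as a small state machine. A minimal sketch, assuming the state names map one-to-one onto the i18n keys above; the `nextState` helper is hypothetical, not part of the widget:

```typescript
// Happy-path states; error states are terminal and omitted here.
type VoiceSearchState = 'tapToStart' | 'recording' | 'processing' | 'success';

// Advance the modal one step along the happy path.
function nextState(s: VoiceSearchState): VoiceSearchState {
  switch (s) {
    case 'tapToStart': return 'recording';   // user tapped the mic button
    case 'recording':  return 'processing';  // silence/timeout cut the recording
    case 'processing': return 'success';     // transcript confirmed, search sent
    case 'success':    return 'tapToStart';  // ready for another query
  }
}

// i18n keys live under the voice.* namespace.
const i18nKey = (s: VoiceSearchState): string => `voice.${s}`;
```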

Browser permissions

  • getUserMedia() requires a secure context (HTTPS). localhost is exempt, so plain HTTP works there during development.
  • If your host sets a Permissions-Policy header, make sure it does not block the microphone (e.g. microphone=()). Recommended: Permissions-Policy: microphone=(self "https://cdn.neuroon.ai").
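For example, on an nginx-fronted host the recommended header could be set like this (an illustrative config fragment, not a required setup):

```nginx
# Allow the embedding page (self) and the widget CDN origin to use the
# microphone, instead of blocking it site-wide.
add_header Permissions-Policy 'microphone=(self "https://cdn.neuroon.ai")' always;
```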

A server-side endpoint exists (but the widget does not use it)

There is a POST /api/widget/search/audio endpoint (WidgetSearchController.java:185) that accepts multipart/form-data with an audio field. It is available if you want to build a custom client with server-side STT, but the official widget does not use it today.

```shell
curl -X POST https://api.neuroon.ai/api/widget/search/audio \
  -H "X-Widget-Token: $WIDGET_TOKEN" \
  -F "audio=@recording.webm" \
  -F "locale=en"
```

Returns AudioSearchResponseDTO with both the transcript and the results. Useful for server-side integrations or browsers without Web Speech API.
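A custom client could build the same request in TypeScript. The endpoint path, `X-Widget-Token` header, and `audio`/`locale` fields come from the docs; the helper below is an illustrative sketch, not an official SDK:

```typescript
// Build the pieces of a multipart request to the server-side STT endpoint.
// Requires an environment with FormData and Blob (browsers, Node 18+).
function buildAudioSearchRequest(
  baseUrl: string,
  widgetToken: string,
  audio: Blob,
  locale: string,
): { url: string; method: string; headers: Record<string, string>; body: FormData } {
  const body = new FormData();
  body.append('audio', audio, 'recording.webm'); // the recorded clip
  body.append('locale', locale);                 // transcription language hint
  return {
    url: `${baseUrl}/api/widget/search/audio`,
    method: 'POST',
    headers: { 'X-Widget-Token': widgetToken },
    body,
  };
}

// Usage:
//   const req = buildAudioSearchRequest('https://api.neuroon.ai', token, clip, 'en');
//   const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```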

Further reading