Suggestions

The widget consumes two distinct endpoints for suggestions, with different purposes. It is important not to confuse them.

| Endpoint | Latency | Source | Purpose |
| --- | --- | --- | --- |
| `GET /api/widget/suggestions` | < 50 ms (p95) | precomputed index | traditional autocomplete: past queries, categories, brands, top searched products |
| `GET /api/widget/suggest` | 200–600 ms | LLM | conversational completion / inline ghost text |

/api/widget/suggestions (instant)

GET /api/widget/suggestions

```shell
curl "https://api.neuroon.ai/api/widget/suggestions?q=zap&limit=8" \
  -H "X-Widget-Token: $WIDGET_TOKEN"
```

Controller at WidgetSearchController.java.

Returns an array of Suggestion objects.
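A minimal client sketch for the instant endpoint. The `Suggestion` field names here are assumptions for illustration, not the documented schema; only the path, query parameters, and `X-Widget-Token` header come from the examples above.

```typescript
// Hypothetical shape of one Suggestion item — field names are an
// assumption, not confirmed by the API reference.
interface Suggestion {
  text: string;
  type: "query" | "category" | "brand" | "product";
}

// Build the request URL for the instant endpoint.
function suggestionsUrl(base: string, q: string, limit = 8): string {
  const url = new URL("/api/widget/suggestions", base);
  url.searchParams.set("q", q);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Fetch suggestions with the widget token header.
async function fetchSuggestions(
  base: string,
  token: string,
  q: string
): Promise<Suggestion[]> {
  const res = await fetch(suggestionsUrl(base, q), {
    headers: { "X-Widget-Token": token },
  });
  if (!res.ok) throw new Error(`suggestions failed: ${res.status}`);
  return res.json();
}
```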

/api/widget/suggest (LLM)

GET /api/widget/suggest

```shell
curl "https://api.neuroon.ai/api/widget/suggest?q=zapatos%20mujer%20talla" \
  -H "X-Widget-Token: $WIDGET_TOKEN"
```

Controller at WidgetSearchController.java.

Returns a single completion that the widget shows as ghost text inside the input (IDE-style): the user presses Tab to accept.
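The ghost-text mechanics can be sketched as two pure helpers: one computes the suffix to render after the caret, the other applies it on Tab. The case-insensitive prefix match is an assumption about how the widget reconciles the completion with what the user typed.

```typescript
// Given the user's current input and the completion returned by
// /api/widget/suggest, compute the ghost-text suffix to render after
// the caret. Returns "" when the completion does not extend the input.
function ghostSuffix(input: string, completion: string): string {
  if (!completion.toLowerCase().startsWith(input.toLowerCase())) return "";
  return completion.slice(input.length);
}

// On Tab, the widget replaces the input with the full completion;
// if there is no valid ghost text, Tab leaves the input unchanged.
function acceptGhost(input: string, completion: string): string {
  const suffix = ghostSuffix(input, completion);
  return suffix ? input + suffix : input;
}
```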

When to use each

  • Autocomplete while typing → /suggestions. Cheap, p95 < 50 ms, no LLM cost.
  • Completing the rest of the query → /suggest. More expensive (200–600 ms, LLM-backed) but contextual.

The widget combines both: the dropdown at the bottom comes from /suggestions; the inline ghost text comes from /suggest.
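One way the combination could be routed per keystroke: refresh the dropdown on every keystroke (the index endpoint is cheap), and only consult the LLM endpoint once the query has enough context to complete. The two-word threshold below is an illustrative assumption, not documented widget behavior.

```typescript
// Which endpoint(s) a keystroke should trigger.
type Route = { dropdown: boolean; ghost: boolean };

function routeKeystroke(q: string): Route {
  const words = q.trim().split(/\s+/).filter(Boolean);
  return {
    dropdown: q.trim().length > 0, // /api/widget/suggestions (every keystroke)
    ghost: words.length >= 2,      // /api/widget/suggest (assumed threshold)
  };
}
```

In practice the /suggest call would also be debounced, since at 200–600 ms per call a fresh keystroke usually invalidates the in-flight completion.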

Configuration

  • features.suggestions: false disables both completers and removes the dropdown.
  • Currently you cannot disable only one; if you need that fine-grained control, open an issue with support.
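For reference, the flag might appear in the widget configuration like this. Only `features.suggestions` is documented above; the surrounding object shape is an assumption.

```typescript
// Hypothetical widget initialization object — only `features.suggestions`
// is taken from the documentation; the rest is illustrative.
const widgetConfig = {
  token: "WIDGET_TOKEN",
  features: {
    suggestions: false, // disables both /suggestions and /suggest, removes the dropdown
  },
};
```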

Rate limits

| Endpoint | Per-minute limit |
| --- | --- |
| `GET /api/widget/suggestions` | 300 |
| `GET /api/widget/suggest` | 100 |
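A client can mirror these caps locally so it fails fast instead of waiting for 429s. This is a minimal sliding-window limiter sketch, not part of the widget API; the clock is injected so it can be tested without real time.

```typescript
// Sliding one-minute window limiter. `limit` is the per-minute cap from
// the table above; `now` defaults to wall-clock time.
class MinuteLimiter {
  private stamps: number[] = [];
  constructor(private limit: number, private now: () => number = Date.now) {}

  // Returns true and records the call if under the cap, false otherwise.
  tryAcquire(): boolean {
    const t = this.now();
    this.stamps = this.stamps.filter((s) => t - s < 60_000);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(t);
    return true;
  }
}

// One limiter per endpoint, matching the documented caps.
const suggestionsLimiter = new MinuteLimiter(300);
const suggestLimiter = new MinuteLimiter(100);
```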

Further reading