Suggestions

The widget consumes two distinct endpoints for suggestions, with different purposes. It is important not to confuse them.

| Endpoint | Latency | Source | Purpose |
| --- | --- | --- | --- |
| `GET /api/widget/suggestions` | < 50 ms (p95) | precomputed index | traditional autocomplete: past queries, categories, brands, top searched products |
| `GET /api/widget/suggest` | 200–600 ms | LLM | conversational completion / inline ghost text |

/api/widget/suggestions (instant)

GET /api/widget/suggestions

```shell
curl "https://api.neuroon.ai/api/widget/suggestions?q=zap&limit=8" \
  -H "X-Widget-Token: $WIDGET_TOKEN"
```

Controller at WidgetSearchController.java.

Returns an array of Suggestion objects.
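A minimal client sketch for the instant endpoint. The `Suggestion` field names here are assumptions for illustration, not the documented schema; only the path, query parameters, and `X-Widget-Token` header come from the examples above.

```typescript
// Hypothetical shape of one Suggestion item — field names are an
// assumption, not confirmed by the API reference.
interface Suggestion {
  text: string;
  type: "query" | "category" | "brand" | "product";
}

// Build the request URL for the instant endpoint.
function suggestionsUrl(base: string, q: string, limit = 8): string {
  const url = new URL("/api/widget/suggestions", base);
  url.searchParams.set("q", q);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Fetch suggestions with the widget token header.
async function fetchSuggestions(
  base: string,
  token: string,
  q: string
): Promise<Suggestion[]> {
  const res = await fetch(suggestionsUrl(base, q), {
    headers: { "X-Widget-Token": token },
  });
  if (!res.ok) throw new Error(`suggestions failed: ${res.status}`);
  return res.json();
}
```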

/api/widget/suggest (LLM)

GET /api/widget/suggest

```shell
curl "https://api.neuroon.ai/api/widget/suggest?q=zapatos%20mujer%20talla" \
  -H "X-Widget-Token: $WIDGET_TOKEN"
```

Controller at WidgetSearchController.java.

Returns a single completion that the widget shows as ghost text inside the input (IDE-style): the user presses Tab to accept.
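The ghost-text mechanics can be sketched as two pure helpers: one computes the suffix to render after the caret, the other applies it on Tab. The case-insensitive prefix match is an assumption about how the widget reconciles the completion with what the user typed.

```typescript
// Given the user's current input and the completion returned by
// /api/widget/suggest, compute the ghost-text suffix to render after
// the caret. Returns "" when the completion does not extend the input.
function ghostSuffix(input: string, completion: string): string {
  if (!completion.toLowerCase().startsWith(input.toLowerCase())) return "";
  return completion.slice(input.length);
}

// On Tab, the widget replaces the input with the full completion;
// if there is no valid ghost text, Tab leaves the input unchanged.
function acceptGhost(input: string, completion: string): string {
  const suffix = ghostSuffix(input, completion);
  return suffix ? input + suffix : input;
}
```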

When to use each

  • Autocomplete while typing → /suggestions. Cheap, p95 < 50 ms, no LLM cost.
  • Completing the rest of the query → /suggest. More expensive (200–600 ms, LLM-backed) but contextual.

The widget combines both: the dropdown at the bottom comes from /suggestions; the inline ghost text comes from /suggest.
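One way the combination could be routed per keystroke: refresh the dropdown on every keystroke (the index endpoint is cheap), and only consult the LLM endpoint once the query has enough context to complete. The two-word threshold below is an illustrative assumption, not documented widget behavior.

```typescript
// Which endpoint(s) a keystroke should trigger.
type Route = { dropdown: boolean; ghost: boolean };

function routeKeystroke(q: string): Route {
  const words = q.trim().split(/\s+/).filter(Boolean);
  return {
    dropdown: q.trim().length > 0, // /api/widget/suggestions (every keystroke)
    ghost: words.length >= 2,      // /api/widget/suggest (assumed threshold)
  };
}
```

In practice the /suggest call would also be debounced, since at 200–600 ms per call a fresh keystroke usually invalidates the in-flight completion.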

Configuration

  • features.suggestions: false disables both completers and removes the dropdown.
  • Currently you cannot disable only one; if you need that fine-grained control, open an issue with support.
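For reference, the flag might appear in the widget configuration like this. Only `features.suggestions` is documented above; the surrounding object shape is an assumption.

```typescript
// Hypothetical widget initialization object — only `features.suggestions`
// is taken from the documentation; the rest is illustrative.
const widgetConfig = {
  token: "WIDGET_TOKEN",
  features: {
    suggestions: false, // disables both /suggestions and /suggest, removes the dropdown
  },
};
```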

Rate limits

| Endpoint | Per-minute limit |
| --- | --- |
| `GET /api/widget/suggestions` | 300 |
| `GET /api/widget/suggest` | 100 |
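A client can mirror these caps locally so it fails fast instead of waiting for 429s. This is a minimal sliding-window limiter sketch, not part of the widget API; the clock is injected so it can be tested without real time.

```typescript
// Sliding one-minute window limiter. `limit` is the per-minute cap from
// the table above; `now` defaults to wall-clock time.
class MinuteLimiter {
  private stamps: number[] = [];
  constructor(private limit: number, private now: () => number = Date.now) {}

  // Returns true and records the call if under the cap, false otherwise.
  tryAcquire(): boolean {
    const t = this.now();
    this.stamps = this.stamps.filter((s) => t - s < 60_000);
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(t);
    return true;
  }
}

// One limiter per endpoint, matching the documented caps.
const suggestionsLimiter = new MinuteLimiter(300);
const suggestLimiter = new MinuteLimiter(100);
```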

Further reading