Implementation:Langgenius Dify HitTesting

Knowledge Sources	Domains	Last Updated
Dify	RAG, Knowledge_Management, Frontend	2026-02-12 00:00 GMT

Overview

Description

hitTesting is a frontend service function that executes a retrieval test against a Dify knowledge base (dataset). It submits a query string along with a full retrieval configuration and returns the matching segments ranked by relevance score. The response includes segment content, relevance scores, t-SNE coordinates for visualization, and child chunk details for hierarchical datasets.

This function is the primary interface for the "Hit Testing" feature in the Dify knowledge base UI, enabling users to evaluate and tune retrieval quality before deploying a dataset in production applications.

Usage

Call hitTesting from the knowledge base hit testing panel to execute a test query against the dataset.
Pass different retrieval_model configurations to compare search methods and tuning parameters without altering the dataset's default settings.
Use the returned records array to render a ranked list of matching segments with their scores.
Use the tsne_position coordinates to render an embedding visualization scatter plot.
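
The last two usage points can be sketched as a pair of small mapping helpers. This is a minimal illustration rather than Dify's own UI code: the types are simplified local stand-ins for the response shape documented on this page, and `RankedRow`, `ScatterPoint`, `toRankedRows`, and `toScatterPoints` are hypothetical names introduced only for this example.

```typescript
// Simplified local stand-ins for the response types described on this page.
type TsnePosition = { x: number; y: number }
type HitRecord = {
  segment: { id: string; content: string }
  score: number
  tsne_position: TsnePosition
}
type HitTestingResult = {
  query: { content: string; tsne_position: TsnePosition }
  records: HitRecord[]
}

// Hypothetical view models for a result list and a scatter plot.
type RankedRow = { rank: number; score: number; preview: string }
type ScatterPoint = { x: number; y: number; label: string; kind: 'query' | 'segment' }

// Sort a copy of the records by descending score and keep a short preview.
function toRankedRows(result: HitTestingResult, previewLength = 100): RankedRow[] {
  return [...result.records]
    .sort((a, b) => b.score - a.score)
    .map((record, index) => ({
      rank: index + 1,
      score: record.score,
      preview: record.segment.content.slice(0, previewLength),
    }))
}

// Put the query first, followed by one point per matched segment.
function toScatterPoints(result: HitTestingResult): ScatterPoint[] {
  const queryPoint: ScatterPoint = {
    x: result.query.tsne_position.x,
    y: result.query.tsne_position.y,
    label: result.query.content,
    kind: 'query',
  }
  const segmentPoints = result.records.map((record): ScatterPoint => ({
    x: record.tsne_position.x,
    y: record.tsne_position.y,
    label: record.segment.id,
    kind: 'segment',
  }))
  return [queryPoint, ...segmentPoints]
}

// Made-up sample data, not real API output.
const sampleResult: HitTestingResult = {
  query: { content: 'How do I configure single sign-on?', tsne_position: { x: 0.1, y: 0.2 } },
  records: [
    { segment: { id: 'seg-1', content: 'SSO overview' }, score: 0.71, tsne_position: { x: 0.3, y: 0.4 } },
    { segment: { id: 'seg-2', content: 'Configuring SAML' }, score: 0.88, tsne_position: { x: 0.15, y: 0.25 } },
  ],
}

const rankedRows = toRankedRows(sampleResult)
const scatterPoints = toScatterPoints(sampleResult)
console.log(rankedRows)
console.log(scatterPoints)
```

Sorting a copy keeps the original `records` array untouched, which matters if the same response object is also held in component state.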

Code Reference

Source Location

web/service/datasets.ts, lines 191-193.

Signature

export const hitTesting = (
  { datasetId, queryText, retrieval_model }: {
    datasetId: string
    queryText: string
    retrieval_model: RetrievalConfig
  }
): Promise<HitTestingResponse> => {
  return post<HitTestingResponse>(
    `/datasets/${datasetId}/hit-testing`,
    { body: { query: queryText, retrieval_model } }
  )
}
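
To make the mapping from function arguments to the HTTP request explicit, the sketch below reproduces only the path and body construction from the signature above. It does not call Dify's `post` helper or touch the network; `buildHitTestingRequest` and `RetrievalConfigLike` are hypothetical names used for illustration.

```typescript
// Loose stand-in for RetrievalConfig; the real type lives in the Dify codebase.
type RetrievalConfigLike = { search_method: string; top_k: number; [key: string]: unknown }

type HitTestingParams = {
  datasetId: string
  queryText: string
  retrieval_model: RetrievalConfigLike
}

// Hypothetical helper: returns the path and body that the signature hands to post().
function buildHitTestingRequest({ datasetId, queryText, retrieval_model }: HitTestingParams) {
  return {
    path: `/datasets/${datasetId}/hit-testing`,
    // queryText is sent under the key "query"; retrieval_model keeps its name.
    body: { query: queryText, retrieval_model },
  }
}

const hitTestingRequest = buildHitTestingRequest({
  datasetId: 'ds-abc123',
  queryText: 'How do I configure single sign-on?',
  retrieval_model: { search_method: 'semantic_search', top_k: 5 },
})

console.log(hitTestingRequest.path)
console.log(JSON.stringify(hitTestingRequest.body))
```

The point to notice is the rename: callers pass `queryText`, while the request body carries it as `query`.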

Import

import { hitTesting } from '@/service/datasets'
import type { RetrievalConfig } from '@/types/app'

I/O Contract

Inputs

Parameter	Type	Required	Description
`datasetId`	`string`	Yes	The ID of the dataset to query.
`queryText`	`string`	Yes	The natural language query string.
`retrieval_model`	`RetrievalConfig`	Yes	Full retrieval configuration object.

RetrievalConfig structure:

Field	Type	Description
`search_method`	`RETRIEVE_METHOD`	Search strategy: `semantic_search`, `full_text_search`, `hybrid_search`, or `keyword_search`.
`top_k`	`number`	Maximum number of segments to return.
`score_threshold_enabled`	`boolean`	Whether to filter results below the score threshold.
`score_threshold`	`number`	Minimum relevance score for returned segments (0.0 to 1.0).
`reranking_enable`	`boolean`	Whether to apply reranking to initial results.
`reranking_model`	`object`	Reranking model configuration with `reranking_provider_name` and `reranking_model_name`.
`reranking_mode`	`RerankingModeEnum`	Optional. Reranking approach: `reranking_model` or `weighted_score`.
`weights`	`object`	Optional. Weighted score configuration with `weight_type`, `vector_setting`, and `keyword_setting`.
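
The fields above interact: `score_threshold` only matters when `score_threshold_enabled` is true, and `reranking_model` only matters when reranking is on. The sketch below shows one way a caller might sanity-check a configuration before submitting it. The checks are illustrative assumptions, not rules enforced by Dify, and `RetrievalConfigSketch` and `checkRetrievalConfig` are hypothetical names; the `weights` field is omitted for brevity.

```typescript
// Simplified local stand-in for the fields in the table above (weights omitted).
type RetrievalConfigSketch = {
  search_method: 'semantic_search' | 'full_text_search' | 'hybrid_search' | 'keyword_search'
  top_k: number
  score_threshold_enabled: boolean
  score_threshold: number
  reranking_enable: boolean
  reranking_model: { reranking_provider_name: string; reranking_model_name: string }
  reranking_mode?: 'reranking_model' | 'weighted_score'
}

// Hypothetical client-side checks; returns a list of human-readable problems.
function checkRetrievalConfig(config: RetrievalConfigSketch): string[] {
  const problems: string[] = []

  // top_k is a count of segments, so it should be a whole number of at least 1.
  if (Math.floor(config.top_k) !== config.top_k || config.top_k < 1)
    problems.push('top_k must be a positive integer')

  // The threshold is only checked when filtering is switched on.
  if (config.score_threshold_enabled && (config.score_threshold < 0 || config.score_threshold > 1))
    problems.push('score_threshold must be between 0 and 1')

  // A reranking model is assumed to be needed unless weighted scoring is selected.
  const usesRerankModel = config.reranking_enable && config.reranking_mode !== 'weighted_score'
  const { reranking_provider_name, reranking_model_name } = config.reranking_model
  if (usesRerankModel && (!reranking_provider_name || !reranking_model_name))
    problems.push('reranking is enabled but no reranking model is selected')

  return problems
}

const validConfig: RetrievalConfigSketch = {
  search_method: 'semantic_search',
  top_k: 5,
  score_threshold_enabled: true,
  score_threshold: 0.6,
  reranking_enable: false,
  reranking_model: { reranking_provider_name: '', reranking_model_name: '' },
}

const brokenConfig: RetrievalConfigSketch = {
  search_method: 'hybrid_search',
  top_k: 0,
  score_threshold_enabled: true,
  score_threshold: 1.5,
  reranking_enable: true,
  reranking_model: { reranking_provider_name: '', reranking_model_name: '' },
  reranking_mode: 'reranking_model',
}

console.log(checkRetrievalConfig(validConfig))
console.log(checkRetrievalConfig(brokenConfig))
```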

Outputs

Returns Promise<HitTestingResponse>:

Field	Type	Description
`query`	`object`	The submitted query with `content` (string) and `tsne_position` (`{ x: number, y: number }`).
`records`	`HitTesting[]`	Array of matching segments ranked by relevance.

HitTesting record structure:

Field	Type	Description
`segment`	`Segment`	The matched segment, including `id`, `content`, `position`, `word_count`, `tokens`, `keywords`, `hit_count`, and nested `document` reference.
`score`	`number`	Relevance score (higher is more relevant).
`tsne_position`	`TsnePosition`	2D t-SNE coordinates (`{ x: number, y: number }`) for embedding visualization.
`child_chunks`	`HitTestingChildChunk[] | null`	Child chunk matches for hierarchical datasets, each with `id`, `content`, `position`, and `score`.
`files`	`Attachment[]`	Attached files associated with the segment.
`summary`	`string`	Optional. Segment summary (hierarchical mode).
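
Because `child_chunks` may be `null`, code that reads it needs a guard. The following sketch picks the highest-scoring child chunk of a record, assuming the field shapes listed above; `ChildChunk`, `HitRecordWithChildren`, and `bestChildChunk` are simplified, hypothetical names rather than Dify's own types.

```typescript
// Simplified stand-ins for HitTestingChildChunk and a HitTesting record.
type ChildChunk = { id: string; content: string; position: number; score: number }
type HitRecordWithChildren = {
  segment: { id: string; content: string }
  score: number
  child_chunks: ChildChunk[] | null
}

// Return the highest-scoring child chunk, or null when the record has none.
function bestChildChunk(record: HitRecordWithChildren): ChildChunk | null {
  const chunks = record.child_chunks ?? []
  if (chunks.length === 0)
    return null
  return chunks.reduce((best, chunk) => (chunk.score > best.score ? chunk : best))
}

// Made-up sample records, not real API output.
const hierarchicalRecord: HitRecordWithChildren = {
  segment: { id: 'seg-1', content: 'Parent segment text' },
  score: 0.82,
  child_chunks: [
    { id: 'c-1', content: 'First child', position: 1, score: 0.64 },
    { id: 'c-2', content: 'Second child', position: 2, score: 0.79 },
  ],
}
const flatRecord: HitRecordWithChildren = {
  segment: { id: 'seg-2', content: 'Segment without children' },
  score: 0.55,
  child_chunks: null,
}

console.log(bestChildChunk(hierarchicalRecord))
console.log(bestChildChunk(flatRecord))
```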

Usage Examples

Running a semantic search test

import { hitTesting } from '@/service/datasets'

const result = await hitTesting({
  datasetId: 'ds-abc123',
  queryText: 'How do I configure single sign-on?',
  retrieval_model: {
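    // If RETRIEVE_METHOD is declared as a TypeScript enum, pass the enum member
    // whose value is 'semantic_search' rather than the raw string literal.
    search_method: 'semantic_search',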
    top_k: 5,
    score_threshold_enabled: true,
    score_threshold: 0.6,
    reranking_enable: false,
    reranking_model: {
      reranking_provider_name: '',
      reranking_model_name: '',
    },
  },
})

console.log(`Query position: (${result.query.tsne_position.x}, ${result.query.tsne_position.y})`)
for (const record of result.records) {
  console.log(`Score: ${record.score}, Content: ${record.segment.content.substring(0, 100)}...`)
}

Comparing hybrid search with reranking

import { hitTesting } from '@/service/datasets'

const hybridResult = await hitTesting({
  datasetId: 'ds-abc123',
  queryText: 'What are the API rate limits?',
  retrieval_model: {
    search_method: 'hybrid_search',
    top_k: 10,
    score_threshold_enabled: false,
    score_threshold: 0,
    reranking_enable: true,
    reranking_model: {
      reranking_provider_name: 'cohere',
      reranking_model_name: 'rerank-english-v2.0',
    },
    reranking_mode: 'reranking_model',
  },
})

// Display results with child chunks for hierarchical datasets
for (const record of hybridResult.records) {
  console.log(`[${record.score.toFixed(3)}] ${record.segment.content.substring(0, 80)}`)
  if (record.child_chunks) {
    for (const child of record.child_chunks) {
      console.log(`  -> Child [${child.score.toFixed(3)}]: ${child.content.substring(0, 60)}`)
    }
  }
}
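
Once two test runs have been collected with different `retrieval_model` settings, their results can be compared by segment ID. The helper below is a minimal sketch of such a comparison; `ScoredSegment` and `compareRuns` are hypothetical names, and the sample data is invented.

```typescript
// Only the fields needed for a comparison are modelled here.
type ScoredSegment = { segment: { id: string }; score: number }

type RunComparison = { shared: string[]; onlyInA: string[]; onlyInB: string[] }

// Split segment IDs into those returned by both runs and those unique to one run.
function compareRuns(runA: ScoredSegment[], runB: ScoredSegment[]): RunComparison {
  const idsA = runA.map(record => record.segment.id)
  const idsB = runB.map(record => record.segment.id)
  return {
    shared: idsA.filter(id => idsB.indexOf(id) !== -1),
    onlyInA: idsA.filter(id => idsB.indexOf(id) === -1),
    onlyInB: idsB.filter(id => idsA.indexOf(id) === -1),
  }
}

// Made-up records standing in for semanticResult.records and hybridResult.records.
const semanticRecords: ScoredSegment[] = [
  { segment: { id: 's1' }, score: 0.91 },
  { segment: { id: 's2' }, score: 0.84 },
  { segment: { id: 's3' }, score: 0.62 },
]
const hybridRecords: ScoredSegment[] = [
  { segment: { id: 's2' }, score: 0.95 },
  { segment: { id: 's4' }, score: 0.73 },
]

const comparison = compareRuns(semanticRecords, hybridRecords)
console.log(comparison)
```

Result lists are bounded by `top_k`, so the linear `indexOf` lookups are adequate here; a `Set` would be the usual choice for larger inputs.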
