Implementation:Langgenius Dify HitTesting
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| Dify | RAG, Knowledge_Management, Frontend | 2026-02-12 00:00 GMT |
Overview
Description
hitTesting is a frontend service function that executes a retrieval test against a Dify knowledge base (dataset). It submits a query string along with a full retrieval configuration and returns the matching segments ranked by relevance score. The response includes segment content, relevance scores, t-SNE coordinates for visualization, and child chunk details for hierarchical datasets.
This function is the primary interface for the "Hit Testing" feature in the Dify knowledge base UI, enabling users to evaluate and tune retrieval quality before deploying a dataset in production applications.
Usage
- Call `hitTesting` from the knowledge base hit testing panel to execute a test query against the dataset.
- Pass different `retrieval_model` configurations to compare search methods and tuning parameters without altering the dataset's default settings.
- Use the returned `records` array to render a ranked list of matching segments with their scores.
- Use the `tsne_position` coordinates to render an embedding visualization scatter plot.
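As a rough illustration of comparing configurations, the sketch below diffs the segment ids returned by two hit testing runs. The `HitLike` type and the `compareHits` helper are hypothetical names introduced here; they mirror only the `segment.id` and `score` fields documented in the I/O Contract and are not part of the Dify codebase.

```typescript
// Hypothetical minimal shape: only the fields this sketch needs.
type HitLike = { segment: { id: string }; score: number }

type HitComparison = {
  shared: string[] // segment ids returned by both runs
  onlyInA: string[] // segment ids returned only by the first run
  onlyInB: string[] // segment ids returned only by the second run
}

// Compare the segment ids returned by two hit testing runs.
function compareHits(a: HitLike[], b: HitLike[]): HitComparison {
  const idsA = Array.from(new Set(a.map(r => r.segment.id)))
  const idsB = Array.from(new Set(b.map(r => r.segment.id)))
  const inA = new Set(idsA)
  const inB = new Set(idsB)
  return {
    shared: idsA.filter(id => inB.has(id)),
    onlyInA: idsA.filter(id => !inB.has(id)),
    onlyInB: idsB.filter(id => !inA.has(id)),
  }
}
```

Passing the `records` arrays from two `hitTesting` calls that differ only in `retrieval_model` would show which segments a configuration change adds or drops.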
Code Reference
Source Location
`web/service/datasets.ts`, lines 191-193.
Signature
```typescript
export const hitTesting = (
  { datasetId, queryText, retrieval_model }: {
    datasetId: string
    queryText: string
    retrieval_model: RetrievalConfig
  }
): Promise<HitTestingResponse> => {
  return post<HitTestingResponse>(
    `/datasets/${datasetId}/hit-testing`,
    { body: { query: queryText, retrieval_model } }
  )
}
```
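To make the request mapping explicit, here is a minimal standalone sketch of the URL and body that the signature above hands to `post`. Note that `queryText` is sent under the key `query`. The `buildHitTestingRequest` helper and the `RetrievalConfigLike` type are hypothetical and exist only for illustration; the real function calls `post` directly.

```typescript
// Hypothetical stand-in for RetrievalConfig; the real type is imported from '@/types/app'.
type RetrievalConfigLike = Record<string, unknown>

type HitTestingRequest = {
  url: string
  options: { body: { query: string; retrieval_model: RetrievalConfigLike } }
}

// Build the same URL and body shape that the signature passes to post().
function buildHitTestingRequest(args: {
  datasetId: string
  queryText: string
  retrieval_model: RetrievalConfigLike
}): HitTestingRequest {
  const { datasetId, queryText, retrieval_model } = args
  return {
    url: `/datasets/${datasetId}/hit-testing`,
    // queryText is renamed to `query` in the request body
    options: { body: { query: queryText, retrieval_model } },
  }
}
```

For example, calling the helper with `datasetId: 'ds-abc123'` yields the URL `/datasets/ds-abc123/hit-testing`.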
Import
```typescript
import { hitTesting } from '@/service/datasets'
import type { RetrievalConfig } from '@/types/app'
```
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| `datasetId` | `string` | Yes | The ID of the dataset to query. |
| `queryText` | `string` | Yes | The natural language query string. |
| `retrieval_model` | `RetrievalConfig` | Yes | Full retrieval configuration object. |
RetrievalConfig structure:
| Field | Type | Description |
|---|---|---|
| `search_method` | `RETRIEVE_METHOD` | Search strategy: `semantic_search`, `full_text_search`, `hybrid_search`, or `keyword_search`. |
| `top_k` | `number` | Maximum number of segments to return. |
| `score_threshold_enabled` | `boolean` | Whether to filter results below the score threshold. |
| `score_threshold` | `number` | Minimum relevance score for returned segments (0.0 to 1.0). |
| `reranking_enable` | `boolean` | Whether to apply reranking to initial results. |
| `reranking_model` | `object` | Reranking model configuration with `reranking_provider_name` and `reranking_model_name`. |
| `reranking_mode` | `RerankingModeEnum` | Optional. Reranking approach: `reranking_model` or `weighted_score`. |
| `weights` | `object` | Optional. Weighted score configuration with `weight_type`, `vector_setting`, and `keyword_setting`. |
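When building configurations for repeated test runs, it can help to normalize values before submitting them. The sketch below is one possible approach, not Dify's own validation: `BasicRetrievalConfig` is a hypothetical local mirror of the required fields, `normalizeRetrievalConfig` is an illustrative helper, and the default values (including `top_k: 3`) are arbitrary choices made for this example. The 0.0 to 1.0 range for `score_threshold` comes from the table above; the lower bound of 1 for `top_k` is an assumption.

```typescript
// Hypothetical local mirror of the required RetrievalConfig fields.
type BasicRetrievalConfig = {
  search_method: 'semantic_search' | 'full_text_search' | 'hybrid_search' | 'keyword_search'
  top_k: number
  score_threshold_enabled: boolean
  score_threshold: number
  reranking_enable: boolean
  reranking_model: { reranking_provider_name: string; reranking_model_name: string }
}

// Merge overrides onto illustrative defaults and clamp the numeric fields.
function normalizeRetrievalConfig(
  overrides: Partial<BasicRetrievalConfig>,
): BasicRetrievalConfig {
  // Defaults are arbitrary values for this sketch, not Dify's defaults.
  const defaults: BasicRetrievalConfig = {
    search_method: 'semantic_search',
    top_k: 3,
    score_threshold_enabled: false,
    score_threshold: 0,
    reranking_enable: false,
    reranking_model: { reranking_provider_name: '', reranking_model_name: '' },
  }
  const merged = { ...defaults, ...overrides }
  return {
    ...merged,
    // Assumption: top_k should be a whole number of at least 1.
    top_k: Math.max(1, Math.floor(merged.top_k)),
    // Keep the threshold inside the documented 0.0 to 1.0 range.
    score_threshold: Math.min(1, Math.max(0, merged.score_threshold)),
  }
}
```

The result can be passed as `retrieval_model`, provided the real `RetrievalConfig` type accepts the same field values.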
Outputs
Returns Promise<HitTestingResponse>:
| Field | Type | Description |
|---|---|---|
| `query` | `object` | The submitted query with `content` (string) and `tsne_position` (`{ x: number, y: number }`). |
| `records` | `HitTesting[]` | Array of matching segments ranked by relevance. |
HitTesting record structure:
| Field | Type | Description |
|---|---|---|
| `segment` | `Segment` | The matched segment, including `id`, `content`, `position`, `word_count`, `tokens`, `keywords`, `hit_count`, and nested `document` reference. |
| `score` | `number` | Relevance score (higher is more relevant). |
| `tsne_position` | `TsnePosition` | 2D t-SNE coordinates (`{ x: number, y: number }`) for embedding visualization. |
| `child_chunks` | `HitTestingChildChunk[] \| null` | Child chunk matches for hierarchical datasets, each with `id`, `content`, `position`, and `score`. |
| `files` | `Attachment[]` | Attached files associated with the segment. |
| `summary` | `string` | Optional. Segment summary (hierarchical mode). |
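The response shape above can be turned into plot-ready data with a small amount of mapping. The following is a minimal sketch under stated assumptions: the `*Like` types are hypothetical local mirrors of a subset of the documented fields, and `toScatterPoints` and `bestChildScore` are illustrative helpers rather than functions from the Dify frontend.

```typescript
// Hypothetical local mirrors of a subset of the documented response fields.
type TsnePosition = { x: number; y: number }
type HitRecordLike = {
  segment: { id: string; content: string }
  score: number
  tsne_position: TsnePosition
  child_chunks: { id: string; content: string; position: number; score: number }[] | null
}
type HitTestingResponseLike = {
  query: { content: string; tsne_position: TsnePosition }
  records: HitRecordLike[]
}

type ScatterPoint = {
  kind: 'query' | 'segment'
  label: string
  x: number
  y: number
  score?: number // present only for segment points
}

// Flatten the query and every record into one list of 2D points.
function toScatterPoints(res: HitTestingResponseLike): ScatterPoint[] {
  const queryPoint: ScatterPoint = {
    kind: 'query',
    label: res.query.content,
    x: res.query.tsne_position.x,
    y: res.query.tsne_position.y,
  }
  const segmentPoints = res.records.map((r): ScatterPoint => ({
    kind: 'segment',
    label: r.segment.id,
    x: r.tsne_position.x,
    y: r.tsne_position.y,
    score: r.score,
  }))
  return [queryPoint, ...segmentPoints]
}

// Highest child chunk score for a record, or null when there are no child chunks.
function bestChildScore(r: HitRecordLike): number | null {
  if (!r.child_chunks || r.child_chunks.length === 0)
    return null
  return Math.max(...r.child_chunks.map(c => c.score))
}
```

Because `child_chunks` may be `null`, the sketch checks for that case explicitly before reading child scores.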
Usage Examples
Running a semantic search test
```typescript
import { hitTesting } from '@/service/datasets'

const result = await hitTesting({
  datasetId: 'ds-abc123',
  queryText: 'How do I configure single sign-on?',
  retrieval_model: {
    search_method: 'semantic_search',
    top_k: 5,
    score_threshold_enabled: true,
    score_threshold: 0.6,
    reranking_enable: false,
    reranking_model: {
      reranking_provider_name: '',
      reranking_model_name: '',
    },
  },
})

console.log(`Query position: (${result.query.tsne_position.x}, ${result.query.tsne_position.y})`)
for (const record of result.records) {
  console.log(`Score: ${record.score}, Content: ${record.segment.content.substring(0, 100)}...`)
}
```
Comparing hybrid search with reranking
```typescript
import { hitTesting } from '@/service/datasets'

const hybridResult = await hitTesting({
  datasetId: 'ds-abc123',
  queryText: 'What are the API rate limits?',
  retrieval_model: {
    search_method: 'hybrid_search',
    top_k: 10,
    score_threshold_enabled: false,
    score_threshold: 0,
    reranking_enable: true,
    reranking_model: {
      reranking_provider_name: 'cohere',
      reranking_model_name: 'rerank-english-v2.0',
    },
    reranking_mode: 'reranking_model',
  },
})

// Display results with child chunks for hierarchical datasets
for (const record of hybridResult.records) {
  console.log(`[${record.score.toFixed(3)}] ${record.segment.content.substring(0, 80)}`)
  if (record.child_chunks) {
    for (const child of record.child_chunks) {
      console.log(`  -> Child [${child.score.toFixed(3)}]: ${child.content.substring(0, 60)}`)
    }
  }
}
```