Implementation:HKUDS AI Trader Jina Search Tool
| Knowledge Sources | |
|---|---|
| Domains | MCP_Tools, Web_Search, Information_Retrieval |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
MCP tool service that provides web search and scraping capabilities to trading agents via the Jina AI Reader and Search APIs.
Description
The tool_jina_search.py module implements a FastMCP tool server exposing a get_information function that trading agents call to search the web for financial news and information. It first queries the Jina AI Search API (s.jina.ai) for URLs matching the query, then scrapes each page through the Jina Reader API (r.jina.ai). Results are filtered by publication date against the simulation date (TODAY_DATE from config) so that content published after that date does not leak into a backtest. The module includes a WebScrapingJinaTool class, which handles the search-then-scrape pipeline, and a parse_date_to_standard utility, which normalizes various date formats to YYYY-MM-DD HH:MM:SS.
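The two-step flow described above can be sketched as a pair of request builders. This is a minimal illustration, not the module's actual code: the helper names build_search_request and build_scrape_request are hypothetical, and the Bearer-token header, the JSON Accept header, and the path-style URL layout are assumptions. Only the two hostnames come from the description. No network call is made.

```python
from typing import Dict, Tuple
from urllib.parse import quote

SEARCH_BASE = "https://s.jina.ai/"  # search host named in the description
READER_BASE = "https://r.jina.ai/"  # reader host named in the description


def _headers(api_key: str) -> Dict[str, str]:
    # Assumption: Bearer-token auth and a JSON response format.
    return {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}


def build_search_request(query: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Step 1 (hypothetical helper): URL and headers for finding matching pages."""
    # Assumption: the percent-encoded query is appended to the search host.
    return SEARCH_BASE + quote(query, safe=""), _headers(api_key)


def build_scrape_request(url: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Step 2 (hypothetical helper): URL and headers for reading one found page."""
    # Assumption: the target URL is appended to the reader host.
    return READER_BASE + url, _headers(api_key)


search_url, headers = build_search_request("AAPL earnings", "demo-key")
scrape_url, _ = build_scrape_request("https://example.com/news", "demo-key")
print(search_url)
print(scrape_url)
```

Keeping request construction separate from the HTTP call makes each step easy to check without an API key or network access.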
Usage
This tool runs as a standalone MCP HTTP service (default port 8001) started by the MCPServiceManager. Trading agents invoke it via the MCP protocol to gather market news and analysis during their decision-making process.
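The port selection can be sketched as follows. The default of 8001 and the SEARCH_HTTP_PORT variable name come from this page; the helper name resolve_port and the range check are assumptions added for illustration, and the real module may read the variable differently.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 8001  # default port stated on this page


def resolve_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: pick the service port from SEARCH_HTTP_PORT."""
    env = os.environ if env is None else env
    raw = env.get("SEARCH_HTTP_PORT", "").strip()
    if not raw:
        return DEFAULT_PORT  # variable unset or empty: use the default
    port = int(raw)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        # Assumption: reject values outside the valid TCP port range.
        raise ValueError(f"SEARCH_HTTP_PORT out of range: {port}")
    return port


print(resolve_port({}))
print(resolve_port({"SEARCH_HTTP_PORT": "9001"}))
```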
Code Reference
Source Location
- Repository: HKUDS_AI_Trader
- File: agent_tools/tool_jina_search.py
- Lines: 1-280
Signature
def parse_date_to_standard(date_str: str) -> str:
    """Convert various date formats to standard YYYY-MM-DD HH:MM:SS format."""

class WebScrapingJinaTool:
    def __init__(self):
        """Initialize with JINA_API_KEY from environment."""

    def __call__(self, query: str) -> List[Dict[str, Any]]:
        """Search and scrape web content for query."""

    def _jina_scrape(self, url: str) -> Dict[str, Any]:
        """Scrape a single URL via Jina Reader API."""

    def _jina_search(self, query: str) -> List[str]:
        """Search for URLs matching query via Jina Search API."""

@mcp.tool()
def get_information(query: str) -> str:
    """MCP-exposed tool: search web and return structured content."""
Import
from agent_tools.tool_jina_search import WebScrapingJinaTool, parse_date_to_standard
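To illustrate the kind of normalization parse_date_to_standard performs, here is a minimal sketch under stated assumptions. It is not the module's implementation: the function name normalize_date, the list of accepted formats, the conversion of timezone-aware inputs to UTC, and the "unknown" fallback are all hypothetical choices. Only the YYYY-MM-DD HH:MM:SS target format comes from the signature above.

```python
from datetime import datetime, timezone

# Assumption: a few non-ISO formats a news page might use.
_EXTRA_FORMATS = ("%b %d, %Y", "%B %d, %Y", "%d %b %Y", "%Y/%m/%d")


def normalize_date(date_str: str) -> str:
    """Hypothetical stand-in: return YYYY-MM-DD HH:MM:SS, or "unknown"."""
    text = date_str.strip()
    parsed = None
    try:
        # Handles ISO 8601 inputs; a trailing "Z" is rewritten as a UTC offset.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        for fmt in _EXTRA_FORMATS:
            try:
                parsed = datetime.strptime(text, fmt)
                break
            except ValueError:
                continue
    if parsed is None:
        return "unknown"  # Assumption: unparseable input gets a sentinel value.
    if parsed.tzinfo is not None:
        # Assumption: timezone-aware inputs are converted to UTC.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(normalize_date("2025-10-30T14:05:00Z"))
print(normalize_date("Oct 30, 2025"))
```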
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| query | str | Yes | Search terms to find matching web content |
| JINA_API_KEY | env var | Yes | Jina AI API key for authentication |
| TODAY_DATE | config | No | Simulation date for filtering future content |
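The role of TODAY_DATE can be sketched as a cutoff filter. This is an illustration under assumptions, not the module's code: the function name drop_future_results, the publish_time field name, the YYYY-MM-DD HH:MM:SS cutoff format, the choice to drop undated items, and the pass-through when no date is configured are all hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

FMT = "%Y-%m-%d %H:%M:%S"  # assumed format for both cutoff and publish times


def drop_future_results(
    results: List[Dict[str, Any]], today_date: Optional[str]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep only items published on or before today_date."""
    if today_date is None:
        # Assumption: with no simulation date configured, nothing is filtered.
        return list(results)
    cutoff = datetime.strptime(today_date, FMT)
    kept = []
    for item in results:
        try:
            published = datetime.strptime(item.get("publish_time"), FMT)
        except (TypeError, ValueError):
            # Assumption: undated items are dropped, the conservative choice.
            continue
        if published <= cutoff:
            kept.append(item)
    return kept


sample = [
    {"url": "a", "publish_time": "2025-10-29 09:00:00"},
    {"url": "b", "publish_time": "2025-11-02 09:00:00"},
    {"url": "c", "publish_time": "unknown"},
]
print([r["url"] for r in drop_future_results(sample, "2025-10-30 00:00:00")])
```

Dropping undated items trades recall for safety: in a backtest, an article of unknown age could still postdate the simulation date.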
Outputs
| Name | Type | Description |
|---|---|---|
| result | str | Formatted string with URL, title, description, publish time, and content (first 1000 chars) for each result |
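The output shape can be sketched as below. The url, title, and content keys appear in the usage example on this page and the 1000-character limit comes from the table above. The description and publish_time key names, the field labels, and the helper name format_results are assumptions, so the real output layout may differ.

```python
from typing import Any, Dict, List

MAX_CONTENT_CHARS = 1000  # truncation length stated in the Outputs table


def format_results(results: List[Dict[str, Any]]) -> str:
    """Hypothetical helper: render scraped results as one text block per item."""
    blocks = []
    for item in results:
        content = str(item.get("content", ""))[:MAX_CONTENT_CHARS]
        blocks.append(
            f"URL: {item.get('url', '')}\n"
            f"Title: {item.get('title', '')}\n"
            f"Description: {item.get('description', '')}\n"
            f"Publish Time: {item.get('publish_time', 'unknown')}\n"
            f"Content: {content}"
        )
    # Assumption: items are separated by a blank line.
    return "\n\n".join(blocks)


demo = [{"url": "https://example.com", "title": "Demo", "content": "x" * 1500}]
text = format_results(demo)
print(text[:80])
```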
Usage Examples
# Direct Python usage
from agent_tools.tool_jina_search import WebScrapingJinaTool
tool = WebScrapingJinaTool()
results = tool("AAPL earnings report Q3 2025")
for result in results:
    print(f"Title: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Content: {result['content'][:200]}...")
# Running as MCP service
# python agent_tools/tool_jina_search.py
# Starts HTTP server on port 8001 (configurable via SEARCH_HTTP_PORT env var)