CTI_SEARCH_CONFIG = {
    "max_results": 5,
    "search_depth": "advanced",
    "include_raw_content": True,
    "include_domains": [
        "*.cisa.gov",
        "*.us-cert.gov",
        "*.crowdstrike.com",
        "*.mandiant.com",
        "*.trendmicro.com",
        "*.securelist.com",
        "*.cert.europa.eu",
        "*.ncsc.gov.uk",
    ],
}
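
# Illustrative sketch (not part of the original configuration): one way to check
# a result URL against CTI_SEARCH_CONFIG["include_domains"] on the client side.
# It assumes each "*.example.com" pattern should match both the bare domain and
# any subdomain; how the search backend itself interprets these wildcards is not
# specified here. The helper name `_host_matches_patterns` is hypothetical.
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def _host_matches_patterns(url, patterns):
    """Return True if the URL's host matches any wildcard domain pattern."""
    host = urlparse(url).hostname or ""  # urlparse lowercases the hostname
    for pattern in patterns:
        # Treat "*.cisa.gov" as also covering the bare domain "cisa.gov".
        bare = pattern[2:] if pattern.startswith("*.") else pattern
        if host == bare or fnmatchcase(host, pattern):
            return True
    return False


# Example: _host_matches_patterns(url, CTI_SEARCH_CONFIG["include_domains"])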

MODEL_NAME = "google_genai:gemini-2.0-flash"

CTI_PLANNER_PROMPT = """You are a Cyber Threat Intelligence (CTI) researcher planning
to retrieve actual threat intelligence from CTI reports.

Your goal is to create a research plan that finds CTI reports and EXTRACTS the actual
intelligence - specific IOCs, technique details, actor information, and attack patterns.

IMPORTANT GUIDELINES:
1. Search for actual CTI reports from reputable sources
2. Prioritize recent reports (2024-2025)
3. ALWAYS fetch full report content to extract intelligence
4. Extract SPECIFIC intelligence: actual IOCs, technique IDs, actor names, attack details
5. Focus on retrieving CONCRETE DATA that can be used by other analysis agents
6. Use at most 4 tasks, with only one web search

Available tools:
(1) SearchCTIReports[query]: Searches for CTI reports, threat analyses, and security advisories.
- Prefer more specific search queries (add APT names, CVE IDs, "IOC", "MITRE", "report")
- Use specific queries with APT names, technique IDs, CVEs
- Examples: "APT29 T1566.002 report 2025", "Scattered Spider IOCs"

(2) ExtractURL[search_result, index]: Extract a specific URL from search results JSON.
- search_result: JSON string from SearchCTIReports
- index: Which report URL to extract (default: 0 for first)
- ALWAYS use this to get the actual report URL from search results

(3) FetchReport[url]: Retrieves the full content of a CTI report using its real URL.
- ALWAYS use this to get actual report content for intelligence extraction
- Essential for retrieving specific IOCs and details

(4) ExtractIOCs[report_content]: Extracts actual Indicators of Compromise from reports.
- Returns specific IPs, domains, hashes, URLs, file names
- Provides concrete IOCs that can be used for detection

(5) IdentifyThreatActors[report_content]: Extracts threat actor details from reports.
- Returns specific actor names, aliases, and campaign names
- Provides attribution information and targeting details
- Includes motivation and operational patterns

(6) ExtractMITRETechniques[report_content, framework]: Extracts MITRE ATT&CK techniques from reports.
- framework: "Enterprise", "Mobile", or "ICS" (default: "Enterprise")
- Returns specific technique IDs (T1234) with descriptions
- Maps malware behaviors to MITRE framework
- Provides structured technique analysis

(7) LLM[instruction]: Synthesis and correlation of extracted intelligence.
- Combine intelligence from multiple sources
- Identify patterns across findings
- Correlate IOCs with techniques and actors
- DO NOT USE FOR ANY OTHER PURPOSE

PLAN STRUCTURE:
Each plan step should be: Plan: [description] #E[N] = Tool[input]

Example for task "Find threat intelligence about APT29 using T1566.002":

Plan: Search for recent APT29 campaign reports with IOCs
#E1 = SearchCTIReports[APT29 T1566.002 spearphishing IOCs 2025]

Plan: Search for detailed technical analysis of APT29 spearphishing
#E2 = SearchCTIReports[APT29 spearphishing technical analysis filetype:pdf]

Plan: Fetch the most detailed technical report for intelligence extraction
#E3 = FetchReport[top ranked URL from #E1 with most technical detail]

Plan: Extract all specific IOCs from the fetched report
#E4 = ExtractIOCs[#E3]

Plan: Extract threat actor details and campaign information from the report
#E5 = IdentifyThreatActors[#E3]

Plan: If first report lacks detail, fetch second report for additional intelligence
#E6 = FetchReport[second best URL from #E1]

Plan: Extract IOCs from second report to enrich intelligence
#E7 = ExtractIOCs[#E6]

Plan: Correlate and consolidate all extracted intelligence
#E8 = LLM[Consolidate intelligence from #E4, #E5, #E6, and #E7. Present specific
IOCs, technique IDs, actor details, and attack patterns. Identify overlaps and unique findings.]

Now create a detailed plan for the following task:
Task: {task}"""

# Separator rules are written out literally so the template stays safe to use with str.format().
CTI_SOLVER_PROMPT = """You are a Cyber Threat Intelligence analyst creating a final intelligence report.

Below are the COMPLETE results from your CTI research. Each section contains the full output from extraction tools.

{structured_results}

================================================================================
EXECUTION PLAN OVERVIEW:
================================================================================
{plan}

================================================================================
ORIGINAL TASK: {task}
================================================================================

Create a comprehensive threat intelligence report with the following structure:

## Intelligence Sources
[List reports analyzed with titles and sources]

## Threat Actors & Attribution
[Names, aliases, campaigns, and attribution details from IdentifyThreatActors results]

## MITRE ATT&CK Techniques Identified
[All technique IDs from ExtractMITRETechniques results, with descriptions]

## Indicators of Compromise (IOCs) Retrieved
[All IOCs from ExtractIOCs results, organized by type]

### IP Addresses
### Domains
### File Hashes
### URLs
### Email Addresses
### File Names
### Other Indicators

## Attack Patterns & Campaign Details
[Specific attack flows, timeline, targeting from reports]

## Key Findings Summary
[3-5 critical bullet points]

## Intelligence Gaps
[What information was not available]

**INSTRUCTIONS:**
- Extract ALL data from results above - don't summarize, list actual values
- Parse JSON if present in results
- If results are in Q&A format, extract all answers
- Be comprehensive and specific
"""

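# Captures, per plan step: (1) description, (2) step ID such as #E1, (3) tool name, (4) tool input.
CTI_REGEX_PATTERN = r"Plan:\s*(.+)\s*(#E\d+)\s*=\s*(\w+)\s*\[([^\]]+)\]"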
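
# Illustrative sketch (an assumption about usage, not the original pipeline code):
# applying a pattern such as CTI_REGEX_PATTERN to planner output with re.findall
# yields one (description, step_id, tool, tool_input) tuple per plan step.
# The helper name `_parse_plan_steps` and the variable `planner_output` in the
# example below are hypothetical.
import re


def _parse_plan_steps(plan_text, pattern):
    """Return (description, step_id, tool, tool_input) tuples found in plan_text."""
    return [
        (desc.strip(), step_id, tool, tool_input.strip())
        for desc, step_id, tool, tool_input in re.findall(pattern, plan_text)
    ]


# Example: _parse_plan_steps(planner_output, CTI_REGEX_PATTERN)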

IOC_EXTRACTION_PROMPT = """Extract all Indicators of Compromise (IOCs) from the content below.

**Instructions:** List ONLY the actual IOCs found. No explanations, no summaries - just the indicators.

**Content:**
{content}

**Extract and list:**

**IP Addresses:**
[List IPs, or write "None found"]

**Domains:**
[List domains, or write "None found"]

**URLs:**
[List malicious URLs, or write "None found"]

**File Hashes:**
[List hashes with type (MD5/SHA1/SHA256), or write "None found"]

**Email Addresses:**
[List emails, or write "None found"]

**File Names:**
[List malicious files/paths, or write "None found"]

**Registry Keys:**
[List registry keys, or write "None found"]

**Other Indicators:**
[List mutexes, user agents, etc., or write "None found"]

If no specific IOCs found, respond: "No extractable IOCs in content."
"""

THREAT_ACTOR_PROMPT = """Extract threat actor information from the content below.

**Instructions:** Provide concise answers. Include brief descriptions where relevant.

**Content:**
{content}

**Answer these questions:**

**Q: What threat actor/APT group is discussed?**
A: [Name and aliases, e.g., "APT29 (Cozy Bear, The Dukes)" or "None identified"]

**Q: What is this actor known for?**
A: [1-2 sentence description of their typical activities/focus, or "No attribution details"]

**Q: What campaigns/operations are mentioned?**
A: [List campaign names with timeframes, e.g., "NobleBaron (2024-Q2)" or "None mentioned"]

**Q: What is their suspected origin/attribution?**
A: [Nation-state/origin and confidence level, e.g., "Russian state-sponsored (High confidence)" or "Unknown"]

**Q: Who/what do they target?**
A: [Industries and regions, e.g., "Government agencies in Europe, Defense sector in North America" or "Not specified"]

**Q: What is their motivation?**
A: [Primary objective, e.g., "Espionage and intelligence collection" or "Not specified"]

If no specific threat actor information found, respond: "No threat actor attribution in content."
"""

MITRE_EXTRACTION_PROMPT = """Extract MITRE ATT&CK {framework} techniques from the content below.

**Instructions:**
1. Identify behaviors described in the content
2. Map to MITRE technique IDs (main techniques only: T#### not T####.###)
3. Provide a brief description of what each technique means
4. List final technique IDs on the last line

**Content:**
{content}

**Identified Techniques:**

[For each technique found, format as:]
**T####** - [Technique Name]: [1 sentence: what this technique is and why it was identified in the content]

[Continue for all techniques...]

**Final Answer - Technique IDs:**
T####, T####, T####

[If no valid techniques found, respond: "No MITRE {framework} techniques identified in content."]
"""

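
# Illustrative sketch (not from the original code): collecting main technique IDs
# from a model response that follows the format requested by
# MITRE_EXTRACTION_PROMPT. It assumes IDs look like "T" plus four digits and
# drops sub-technique suffixes such as ".002". The helper name
# `_extract_technique_ids` is hypothetical.
import re


def _extract_technique_ids(response_text):
    """Return unique T#### IDs in order of first appearance."""
    seen = []
    for technique_id in re.findall(r"\bT\d{4}\b", response_text):
        if technique_id not in seen:
            seen.append(technique_id)
    return seen
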
REPLAN_PROMPT = """The previous CTI research step failed to retrieve quality intelligence.

ORIGINAL TASK: {task}

FAILED STEP:
Plan: {failed_step}
{step_name} = {tool}[{tool_input}]

RESULT: {results}

PROBLEM: {problem}

COMPLETED STEPS SO FAR:
{completed_steps}

Create an IMPROVED plan for this specific step that will retrieve ACTUAL CTI intelligence.

Available tools:
(1) SearchCTIReports[query]: Searches for CTI reports, threat analyses, and security advisories.
- Use specific queries with APT names, technique IDs, CVEs
- Examples: "APT29 T1566.002 report 2024", "Scattered Spider IOCs"

(2) ExtractURL[search_result, index]: Extract a specific URL from search results JSON.
- search_result: JSON string from SearchCTIReports
- index: Which report URL to extract (default: 0 for first)
- ALWAYS use this to get the actual report URL from search results

(3) FetchReport[url]: Retrieves the full content of a CTI report.
- ALWAYS use this to get actual report content for intelligence extraction
- Essential for retrieving specific IOCs and details

(4) ExtractIOCs[report_content]: Extracts actual Indicators of Compromise from reports.
- Returns specific IPs, domains, hashes, URLs, file names
- Provides concrete IOCs that can be used for detection

(5) IdentifyThreatActors[report_content]: Extracts threat actor details from reports.
- Returns specific actor names, aliases, and campaign names
- Provides attribution information and targeting details
- Includes motivation and operational patterns

(6) ExtractMITRETechniques[report_content, framework]: Extracts MITRE ATT&CK techniques from reports.
- framework: "Enterprise", "Mobile", or "ICS" (default: "Enterprise")
- Returns specific technique IDs (T1234) with descriptions
- Maps malware behaviors to MITRE framework

(7) LLM[instruction]: Synthesis and correlation of extracted intelligence.
- Combine intelligence from multiple sources
- Identify patterns across findings
- Correlate IOCs with techniques and actors

Consider:
1. More specific search queries (add APT names, CVE IDs, "IOC", "MITRE", "report")
2. Alternative CTI sources (CISA advisories, vendor reports, not news articles)
3. Different tool combinations (search → extract URL → fetch → extract IOCs/techniques)

Provide ONLY the corrected step in this format:
Plan: [improved description]
#E{step} = Tool[improved input]"""