from typing import Any, Dict, List, Optional

from wandb_mcp_server.utils import get_rich_logger
from wandb_mcp_server.weave_api.service import TraceService
from wandb_mcp_server.weave_api.models import QueryResult
from wandb_mcp_server.api_client import WandBApiManager

logger = get_rich_logger(__name__)


def get_trace_service():
    """
    Get a TraceService instance with the current request's API key.

    This creates a new TraceService for each request to ensure the correct
    API key is used from the context.
    """
    # Get the API key from context (set by auth middleware) or environment
    api_key = WandBApiManager.get_api_key()
    return TraceService(api_key=api_key)


QUERY_WEAVE_TRACES_TOOL_DESCRIPTION = """
Query Weave traces, trace metadata, and trace costs with filtering and sorting options.

---

**Cost Calculation and Sorting Enhancements:**

- For each model in the `costs` dictionary, a new field `total_cost` is computed as the sum of
  `completion_tokens_total_cost` and `prompt_tokens_total_cost`.
- You can post-hoc sort traces by any of: `total_cost`, `completion_cost`, or `prompt_cost`
  (summed across all models if there is more than one).
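For example, a `costs` entry for a single model might look like the following sketch (the
model name and dollar values are illustrative only):

```python
costs = {
    "gpt-4o": {
        "prompt_tokens_total_cost": 0.010,
        "completion_tokens_total_cost": 0.025,
        # Derived field added by this tool:
        "total_cost": 0.035,  # 0.010 + 0.025
    }
}
```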
---

**IMPORTANT PRODUCT DISTINCTION:**

W&B offers two distinct products with different purposes:

1. W&B Models: A system for ML experiment tracking, hyperparameter optimization, and model
   lifecycle management. Use `query_wandb_tool` for questions about:
   - Experiment runs, metrics, and performance comparisons
   - Artifact management and model registry
   - Hyperparameter optimization and sweeps
   - Project dashboards and reports

2. W&B Weave: A toolkit for LLM and GenAI application observability and evaluation. Use
   `query_weave_traces_tool` (this tool) for questions about:
   - Execution traces and paths of LLM operations
   - LLM inputs, outputs, and intermediate results
   - Chain-of-thought visualization and debugging
   - LLM evaluation results and feedback

FYI: The Weights & Biases platform is owned by CoreWeave. Queries that mention W&B, wandb, or
weave together with CoreWeave may relate to W&B products or features that leverage CoreWeave's
GPU or compute infrastructure.

**USE CASE SELECTOR - READ FIRST:**
- For runs, metrics, experiments, artifacts, sweeps etc. → use query_wandb_tool
- For traces, LLM calls, chain-of-thought, LLM evaluations, AI agent traces, AI apps etc. → use query_weave_traces_tool

=====================================================================
⚠️ TOOL SELECTION WARNING ⚠️
This tool is ONLY for WEAVE TRACES (LLM operations), NOT for run metrics or experiments!
=====================================================================

**KEYWORD GUIDE:**
If the user question contains:
- "runs", "experiments", "metrics" → Use query_wandb_tool
- "traces", "LLM calls" etc. → Use this tool

**COMMON MISUSE CASES:**
❌ "Looking at metrics of my latest runs" - Do NOT use this tool, use query_wandb_tool instead
❌ "Compare performance across experiments" - Do NOT use this tool, use query_wandb_tool instead

If the user asks for data about "runs" or "experiments" or anything about "experiment
tracking", use `query_wandb_tool` instead.

query_traces_tool can return a lot of data; below are some usage tips for this function to
avoid overwhelming an LLM's context window with too much data. Returning all Weave trace data
can overwhelm the LLM context window if there are 100s or 1000s of logged Weave traces
(depending on how many child traces each has), and can also result in a lot of data from, or
calls to, the Weave API.

So, depending on the user query, consider doing the following to return enough data to answer
the user query without overwhelming the LLM context window:

- Return only the root traces using the `trace_roots_only` boolean filter if you only need the
  top-level/parent traces and don't need the data from all child traces. For example, if a user
  wants to know the number of successful traces in a project but doesn't need the data from all
  child traces, or wants to visualise the number of parent traces over time.
- Return only the truncated values of the trace data keys in order to first give a preview of
  the data that can then inform more targeted Weave trace queries from the user. In the extreme
  you can set `truncate_length` to 0 in order to return only the keys, but not the values, of
  the trace data.
- Return only the metadata for all the traces (set `metadata_only = True`) if the query doesn't
  need to know anything about the structure or content of the individual Weave traces. Note
  that this still requires requesting all the raw trace data from the Weave API, so it can
  still result in a lot of data and/or a lot of calls being made to the Weave API.
- Return only the columns needed using the `columns` parameter. In Weave, the `inputs` and
  `output` columns of a trace can contain a lot of data, so avoiding these columns can help.
  Note that you have to explicitly specify the columns you want returned if there are certain
  columns you don't want. It's almost always a good idea to specify the columns needed.

If `metadata_only = True`, this returns only metadata about the traces, such as trace counts,
token counts, trace types, time range, status counts and the distribution of op names. If
`metadata_only = False`, the trace data is returned either in full or truncated to
`truncate_length` characters, depending on whether `return_full_data` is True or False
respectively. If `return_full_data = False`, the trace data is truncated to `truncate_length`
characters (default 200); otherwise the trace data is returned in full.

Remember, LLM context window is precious; only return the minimum amount of data needed to
complete an analysis.

- Exploratory queries: For generic exploratory or initial queries about a set of Weave traces
  in a project, it can be a good idea to start by returning just metadata or truncated data.
  Consider asking the user for clarification, and warn them that returning a lot of Weave
  trace data might overwhelm the LLM context window. No need to warn them multiple times;
  once is enough.
- Project size: Consider using the count_traces_tool to get an estimate of the number of
  traces in a project before querying for them, as query_traces_tool can return a lot of data.
- Partial op name matching: Use the `op_name_contains` filter if a user has only given a
  partial op name or is unsure of the exact op name.
- Weave Evaluations: If asked about Weave evaluations or evals traces (see the sketch after
  this list):
  - Evals are complicated to query; prompt the user with follow-up questions if needed.
  - First, always try to orient yourself: pull a summary of the evaluation, get all of the
    top-level column names in the eval, and always get a count of the total number of child
    traces in this eval by filtering by parent_id and using the count_traces tool.
  - As part of orienting yourself, pull just a subset of child traces from the eval, maybe 3
    to 5, to understand the column structure and values.
  - Always be explicit about the amount of data returned and the limits used in your query;
    return to the user the count of traces analysed.
  - Always stay filtered on the evaluation id (filter by `parent_id`) unless specifically
    asked questions across different evaluations; e.g. if a parent id (or parentId) is
    provided, ensure you use that filter in the query.
  - Filter for traces with `op_name_contains = "Evaluation.evaluate"` as a first step. These
    ops are parent traces that contain aggregated stats and scores about the evaluation. The
    child traces of these ops are the actual evaluation results for each sample in an
    evaluation dataset. If asked about individual rows in an evaluation, use the parent_ids
    filter to return the child traces.
  - For questions where both a child call name of an evaluation and an evaluation id or name
    are provided, always ensure that you first correctly get the evaluation id, and then use
    it as the parent_id in the query for the child traces. Otherwise there is a risk of
    returning traces that do not belong to the evaluation that was given.
- Weave nomenclature: Note that users might refer to Weave ops as "traces" or "calls", or to
  "traces" as "ops".
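A minimal sketch of the two-step evaluation workflow described above (the entity, project,
and ids are placeholders):

```python
# Step 1: find the evaluation parent trace and note its call id.
evals = query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={"op_name_contains": "Evaluation.evaluate"},
    columns=["id", "op_name", "started_at"],
)

# Step 2: fetch the per-sample child traces of one evaluation,
# staying filtered on its id via `parent_ids`.
rows = query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={"parent_ids": ["<evaluation_call_id>"]},
    columns=["id", "op_name", "inputs", "output"],
)
```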
Parameters
----------
entity_name : str
    The Weights & Biases entity name (team or username)
project_name : str
    The Weights & Biases project name
filters : dict
    Dict of filter conditions, supporting:
    - display_name : str or regex pattern
        Filter by the display name seen in the Weave UI
    - op_name : str or regex pattern
        Filter by Weave op name, a long URI starting with 'weave:///'
    - op_name_contains : str
        Filter for op_name containing this substring (easier than regex)
    - trace_roots_only : bool
        Boolean to filter for only top-level/parent traces. Useful when you don't need to
        return the data from all child traces.
    - trace_id : str
        Filter by a specific `trace_id` (e.g., "01958ab9-3c67-7c72-92bf-d023fa5a0d4d"). A
        `trace_id` groups multiple calls/spans. Use this if the user explicitly says they
        provided a "trace_id" for a group of operations. Always first try to filter by
        `call_ids` if a user provides an ID, before trying to filter by `trace_id`.
    - call_ids : str or list of str
        Filter by specific `call_id`s (also known as span IDs) (string or list of strings,
        e.g., ["01958ab9-3c68-7c23-8ccd-c135c7037769"]).
        **GUIDANCE**: `call_id` (span ID) identifies a *single* operation/span and is
        typically found in Weave UI URLs. If a user provides an ID for a specific item
        they're viewing, **prefer `call_ids`**. Format as a list:
        `{"call_ids": ["user_provided_id"]}`.
    - parent_ids : str or list of str
        Return traces that are children of the given parent trace ids (string or list of
        strings). Ensure you use this if given an evaluation trace id or name.
    - status : str
        Filter by trace status, defined as whether or not the trace raised an exception. Can
        be `success` or `error`.
        NOTE: When users ask for "failed", "wrong", or "incorrect" traces, use
        `status: 'error'` or `has_exception: True` as the filter.
    - time_range : dict
        Dict with "start" and "end" datetime strings. Datetime strings should be in ISO
        format (e.g. `2024-01-01T00:00:00Z`)
    - attributes : dict
        Dict of the Weave attributes of the trace. Supports nested paths (e.g.,
        "metadata.model_name") via dot notation. Each value can be:
        * A literal for exact equality (e.g., `"status": "success"`)
        * A dictionary with a comparison operator: `$gt`, `$lt`, `$eq`, `$gte`, `$lte`
          (e.g., `{"token_count": {"$gt": 100}}`)
        * A dictionary with the `$contains` operator for substring matching on string
          attributes (e.g., `{"model_name": {"$contains": "gpt-3"}}`)
        **Warning:** The `$contains` operator performs simple substring matching only; full
        regular expression matching (e.g., via `$regex`) is **not supported** for attributes.
        Do not attempt to use `$regex`.
    - has_exception : bool, optional
        Optional[bool] to filter traces by exception status:
        - None (or key not present): Show all traces regardless of exception status
        - True: Show only traces that have exceptions (exception field is not null)
        - False: Show only traces without exceptions (exception field is null)
sort_by : str, optional
    Field to sort by (started_at, ended_at, op_name, etc.). Defaults to 'started_at'
sort_direction : str, optional
    Sort direction ('asc' or 'desc'). Defaults to 'desc'
limit : int, optional
    Maximum number of results to return. Defaults to None
include_costs : bool, optional
    Include tracked API cost information in the results. Defaults to True
include_feedback : bool, optional
    Include Weave annotations (human labels/feedback). Defaults to True
columns : list of str, optional
    List of specific columns to include in the results. It's almost always a good idea to
    specify the columns needed. Defaults to None (all columns). Available columns are:
        id:
        project_id:
        op_name:
        display_name: typing.Optional[str]
        trace_id:
        parent_id: typing.Optional[str]
        started_at:
        attributes: dict[str, typing.Any]
        inputs: dict[str, typing.Any]
        ended_at: typing.Optional[datetime.datetime]
        exception: typing.Optional[str]
        output: typing.Optional[typing.Any]
        summary: typing.Optional[SummaryMap]  # Contains nested data like 'summary.weave.status' and 'summary.weave.latency_ms'
        status: typing.Optional[str]  # Synthesized from summary.weave.status if requested
        latency_ms: typing.Optional[int]  # Synthesized from summary.weave.latency_ms if requested
        wb_user_id: typing.Optional[str]
        wb_run_id: typing.Optional[str]
        deleted_at: typing.Optional[datetime.datetime]
expand_columns : list of str, optional
    List of columns to expand in the results. Defaults to None
truncate_length : int, optional
    Maximum length for string values in Weave traces. Defaults to 200
return_full_data : bool, optional
    Whether to include full untruncated trace data. If True, the `truncate_length` parameter
    is ignored and trace data is returned in full. If False, string values are truncated to
    `truncate_length` characters (a `truncate_length` of 0 returns only the column keys, with
    no values). Defaults to True.
metadata_only : bool, optional
    Return only metadata without traces. Defaults to False

Returns
-------
str
    JSON string containing either full trace data or metadata only, depending on parameters

```python
# Get an overview of the traces in a project
query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={"trace_roots_only": True},
    metadata_only=True,
    return_full_data=False
)

# Get failed traces with costs and feedback
query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={"status": "error"},
    include_costs=True,
    include_feedback=True
)

# Get specific columns for traces whose op name (i.e. trace name) contains a specific substring
query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={"op_name_contains": "Evaluation.summarize"},
    columns=["id", "op_name", "started_at", "costs"]
)
```
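One more illustrative sketch, combining the `attributes` operators with a `time_range` filter
(the attribute names, model string, and dates are placeholders):

```python
# Get recent traces for a given model, using nested attribute paths
# with comparison and substring operators
query_traces_tool(
    entity_name="my-team",
    project_name="my-project",
    filters={
        "attributes": {
            "metadata.model_name": {"$contains": "gpt-4"},
            "token_count": {"$gt": 100},
        },
        "time_range": {
            "start": "2024-01-01T00:00:00Z",
            "end": "2024-02-01T00:00:00Z",
        },
    },
    columns=["id", "op_name", "started_at", "attributes"],
)
```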
"""


def query_traces(
    entity_name: str,
    project_name: str,
    filters: Dict[str, Any] = {},
    sort_by: str = "started_at",
    sort_direction: str = "desc",
    limit: int = 100,
    offset: int = 0,
    include_costs: bool = True,
    include_feedback: bool = True,
    columns: List[str] = [],
    expand_columns: List[str] = [],
    return_full_data: bool = True,
    api_key: str = "",
    query_expr: Any = None,  # We ignore this in the new implementation
    request_timeout: int = 10,
    retries: int = 3,
) -> List[Dict[str, Any]]:
    """
    This maintains the original signature of query_traces from query_weave.py,
    but delegates to our new implementation.
    """
    # If api_key was provided, create a new service with that key
    service = get_trace_service()
    if api_key:
        service = TraceService(
            api_key=api_key,
            retries=retries,
            timeout=request_timeout,
        )

    # Query traces
    result = service.query_traces(
        entity_name=entity_name,
        project_name=project_name,
        filters=filters,
        sort_by=sort_by,
        sort_direction=sort_direction,
        limit=limit,
        offset=offset,
        include_costs=include_costs,
        include_feedback=include_feedback,
        columns=columns,
        expand_columns=expand_columns,
        return_full_data=return_full_data,
        # Match original behavior
        metadata_only=False,
    )

    # Match the return type of the original function (List[Dict])
    if not result.traces:
        return []

    # Convert WeaveTrace objects to dictionaries if needed
    traces_as_dicts = []
    for trace in result.traces:
        if hasattr(trace, "model_dump"):
            # Pydantic model - convert to dict
            traces_as_dicts.append(trace.model_dump())
        elif isinstance(trace, dict):
            # Already a dict
            traces_as_dicts.append(trace)
        else:
            # Unknown type, try to convert to dict
            try:
                traces_as_dicts.append(dict(trace))
            except Exception:
                # If all else fails, record the failure as a string
                traces_as_dicts.append(
                    {"error": f"Could not convert {type(trace)} to dict"}
                )
    return traces_as_dicts
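# Example usage of query_traces (a minimal sketch; the entity/project names are
# placeholders, and a W&B API key is assumed to be available via the request
# context or environment):
#
#     failed_traces = query_traces(
#         entity_name="my-team",
#         project_name="my-project",
#         filters={"status": "error", "trace_roots_only": True},
#         columns=["id", "op_name", "started_at", "exception"],
#         limit=10,
#     )
#     logger.info(f"Fetched {len(failed_traces)} failed root traces")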
async def query_paginated_weave_traces(
    entity_name: str,
    project_name: str,
    chunk_size: int = 20,
    filters: Dict[str, Any] = {},
    sort_by: str = "started_at",
    sort_direction: str = "desc",
    target_limit: Optional[int] = None,
    include_costs: bool = True,
    include_feedback: bool = True,
    columns: List[str] = [],
    expand_columns: List[str] = [],
    truncate_length: Optional[int] = 200,
    return_full_data: bool = True,
    metadata_only: bool = False,
    api_key: Optional[str] = None,
    retries: int = 3,
    debug_raw_traces: bool = False,
) -> QueryResult:
    """
    Query Weave traces with pagination and return results as a Pydantic model.

    This maintains the original signature of query_paginated_weave_traces from
    query_weave.py, but delegates to our new implementation and returns a
    Pydantic QueryResult model directly.

    Example:
    ```python
    result = await query_paginated_weave_traces(
        entity_name="my-entity",
        project_name="my-project"
    )
    # Access Pydantic model properties directly
    print(f"Total traces: {result.metadata.total_traces}")
    ```

    Args:
        entity_name: Weights & Biases entity name.
        project_name: Weights & Biases project name.
        chunk_size: Number of traces to retrieve in each chunk.
        filters: Dictionary of filter conditions.
        sort_by: Field to sort by.
        sort_direction: Sort direction ('asc' or 'desc').
        target_limit: Maximum total number of results to return.
        include_costs: Include tracked API cost information in the results.
        include_feedback: Include Weave annotations in the results.
        columns: List of specific columns to include in the results.
        expand_columns: List of columns to expand in the results.
        truncate_length: Maximum length for string values.
        return_full_data: Whether to include full untruncated trace data.
        metadata_only: Whether to only include metadata without traces.
        api_key: Optional API key to use for authentication.
        retries: Number of retry attempts for API calls.
        debug_raw_traces: Include raw traces in the response for debugging.

    Returns:
        QueryResult: A Pydantic model containing the query results
    """
    # If api_key was provided, create a new service with that key
    service = get_trace_service()
    if api_key:
        service = TraceService(
            api_key=api_key,
            retries=retries,
        )

    # Query traces with pagination
    result = service.query_paginated_traces(
        entity_name=entity_name,
        project_name=project_name,
        chunk_size=chunk_size,
        filters=filters,
        sort_by=sort_by,
        sort_direction=sort_direction,
        target_limit=target_limit,
        include_costs=include_costs,
        include_feedback=include_feedback,
        columns=columns,
        expand_columns=expand_columns,
        truncate_length=truncate_length,
        return_full_data=return_full_data,
        metadata_only=metadata_only,
    )

    # Add raw traces for debugging if requested
    if debug_raw_traces and result.traces:
        # Round-trip through a dict copy to avoid modifying the original result
        result_dict = result.model_dump()
        result_dict["raw_traces"] = result.traces
        # Convert back to QueryResult
        result = QueryResult.model_validate(result_dict)

    assert isinstance(result, QueryResult), (
        f"Result type must be a QueryResult, found: {type(result)}"
    )
    return result
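# A minimal smoke-test sketch (assumes a W&B API key is available to the auth
# context; the entity/project names below are placeholders, not real projects):
if __name__ == "__main__":
    import asyncio

    async def _demo() -> None:
        # Metadata-only query over root traces keeps the output small
        result = await query_paginated_weave_traces(
            entity_name="my-team",
            project_name="my-project",
            filters={"trace_roots_only": True},
            metadata_only=True,
        )
        logger.info(f"Total traces: {result.metadata.total_traces}")

    asyncio.run(_demo())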