Implement Run Detail and Trace Detail screens with full navigation
Day 3 (Guide): Run Detail + Trace Detail screens completed
Run Detail Screen (Screen 3):
- Enhanced with 3 tabs: Overview, Test Cases, Performance
- Overview tab: Run metadata with gradient card styling
- Test Cases tab: Interactive dataframe with click-to-trace navigation
- Performance tab: 4-chart dashboard (response time histogram, token usage, cost, success/failure pie)
- Added create_performance_charts() function for performance visualizations
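The success/failure pie in the Performance tab has to tolerate a `success` column that holds either booleans or "✅"/"❌" display strings. A minimal pure-Python sketch of that counting logic (a simplification of the app's pandas-based handling; `success_counts` is a hypothetical helper, not a function from the codebase):

```python
def success_counts(statuses):
    """Count (successes, failures) for the success/failure pie chart.

    Accepts a mix of booleans and '✅'/'❌' display strings, mirroring
    how the Performance tab normalizes its 'success' column.
    """
    successes = sum(1 for s in statuses if s is True or s == "✅")
    return successes, len(statuses) - successes


print(success_counts(["✅", "❌", "✅"]))
print(success_counts([True, False, True]))
```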
Trace Detail Screen (Screen 4):
- Created complete screen with 5 tabs plus an "Ask About This Trace" accordion:
* Thought Graph: Network visualization of agent reasoning flow
* Waterfall: Interactive timeline diagram of span execution
* GPU Metrics: Time series dashboard + raw metrics data (2 sub-tabs)
* Span Details: Detailed table with tokens, cost, duration per span
* Raw Data: JSON view of OpenTelemetry trace data
* Ask About This Trace: Accordion with Q&A placeholder (for MCP integration)
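The Waterfall tab is driven by per-span timing: each span becomes a bar positioned at its offset from the trace start. A hedged sketch of the kind of data preparation `create_span_visualization()` performs before plotting (the `waterfall_rows` helper and the `startTimeUnixNano`/`endTimeUnixNano` field names are assumptions based on common OTLP JSON, not the actual implementation):

```python
def waterfall_rows(spans):
    """Convert OTel-style span dicts into (name, offset_ms, duration_ms) rows.

    Assumes spans carry nanosecond Unix timestamps, as in OTLP JSON;
    real traces may use different field names.
    """
    timed = [
        (s.get("name", "unknown"), int(s["startTimeUnixNano"]), int(s["endTimeUnixNano"]))
        for s in spans
        if "startTimeUnixNano" in s and "endTimeUnixNano" in s
    ]
    if not timed:
        return []
    t0 = min(start for _, start, _ in timed)  # trace start = earliest span start
    return [
        (name, (start - t0) / 1e6, (end - start) / 1e6)  # nanoseconds -> ms
        for name, start, end in sorted(timed, key=lambda t: t[1])
    ]


spans = [
    {"name": "agent", "startTimeUnixNano": 0, "endTimeUnixNano": 5_000_000},
    {"name": "llm", "startTimeUnixNano": 1_000_000, "endTimeUnixNano": 3_000_000},
]
print(waterfall_rows(spans))
```

Rows like these map directly onto horizontal bars in a Plotly figure, which gives the zoomable timeline described above.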
Components Added:
- components/thought_graph.py: Network graph visualization of agent reasoning
- screens/trace_detail.py: All trace visualization functions
* create_span_visualization(): Waterfall chart with color-coded spans
* create_gpu_metrics_dashboard(): Multi-panel GPU metrics time series
* create_gpu_summary_cards(): HTML summary cards for GPU metrics
* process_trace_data(): Trace data processor with timestamp handling
* create_span_table(): JSON view of span details
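The thought graph treats spans as nodes and span parentage as edges. A minimal sketch of the edge derivation at its core (`reasoning_edges` and the `span_id`/`parent_span_id` field names are assumptions about the trace schema; the real component also lays out the graph with networkx and colors nodes by OpenInference span kind):

```python
def reasoning_edges(spans):
    """Build (parent_name, child_name) edges from span parent/child links.

    Assumes each span dict has a 'span_id', an optional 'parent_span_id',
    and a 'name'. Spans without a resolvable parent become roots.
    """
    by_id = {s["span_id"]: s for s in spans if "span_id" in s}
    edges = []
    for s in spans:
        parent = by_id.get(s.get("parent_span_id"))
        if parent is not None:
            edges.append((parent.get("name", "?"), s.get("name", "?")))
    return edges


spans = [
    {"span_id": "1", "name": "agent"},
    {"span_id": "2", "parent_span_id": "1", "name": "llm_call"},
    {"span_id": "3", "parent_span_id": "1", "name": "tool_call"},
]
print(reasoning_edges(spans))
```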
Navigation Handlers:
- on_test_case_select(): Navigate from Run Detail to Trace Detail
- go_back_to_run_detail(): Back button from Trace Detail to Run Detail
- create_trace_metadata_html(): Trace metadata HTML generator
- create_span_details_table(): Span details dataframe generator
Event Wiring:
- test_cases_table.select → on_test_case_select (loads trace, switches screens)
- back_to_run_detail_btn.click → go_back_to_run_detail (returns to run detail)
- Integrated all 11 trace detail outputs (graphs, tables, JSON)
Navigation Flow:
Leaderboard (Screen 1) → Run Detail (Screen 3) → Trace Detail (Screen 4)
- Click a DrillDown row → navigate to Run Detail with 3 tabs
- Click a Test Case row → navigate to Trace Detail with 5 tabs
- Back buttons work correctly between all screens
File Stats:
- app.py: 832 → 1193 lines (+361 lines)
- New files: components/thought_graph.py, screens/trace_detail.py
- All functions compile and type-check successfully
- app.py +447 -85
- components/thought_graph.py +398 -0
- screens/trace_detail.py +721 -0
|
@@ -21,8 +21,339 @@ from components.analytics_charts import (
|
|
| 21 |
create_cost_efficiency_scatter
|
| 22 |
)
|
| 23 |
from components.report_cards import generate_leaderboard_summary_card
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 24 |
from utils.navigation import Navigator, Screen
|
| 25 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 26 |
# Initialize data loader
|
| 27 |
data_loader = create_data_loader_from_env()
|
| 28 |
navigator = Navigator()
|
|
@@ -265,30 +596,8 @@ def on_html_table_row_click(row_index_str):
|
|
| 265 |
|
| 266 |
results_df = data_loader.load_results(results_dataset)
|
| 267 |
|
| 268 |
-
#
|
| 269 |
-
|
| 270 |
-
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
| 271 |
-
padding: 20px; border-radius: 10px; color: white; margin-bottom: 20px;">
|
| 272 |
-
<h2 style="margin: 0 0 10px 0;">📊 Run Detail: {run_data.get('model', 'Unknown')}</h2>
|
| 273 |
-
<div style="display: grid; grid-template-columns: 1fr 1fr 1fr; gap: 20px; margin-top: 15px;">
|
| 274 |
-
<div>
|
| 275 |
-
<strong>Agent Type:</strong> {run_data.get('agent_type', 'N/A')}<br>
|
| 276 |
-
<strong>Provider:</strong> {run_data.get('provider', 'N/A')}<br>
|
| 277 |
-
<strong>Success Rate:</strong> {run_data.get('success_rate', 0):.1f}%
|
| 278 |
-
</div>
|
| 279 |
-
<div>
|
| 280 |
-
<strong>Total Tests:</strong> {run_data.get('total_tests', 0)}<br>
|
| 281 |
-
<strong>Successful:</strong> {run_data.get('successful_tests', 0)}<br>
|
| 282 |
-
<strong>Failed:</strong> {run_data.get('failed_tests', 0)}
|
| 283 |
-
</div>
|
| 284 |
-
<div>
|
| 285 |
-
<strong>Total Cost:</strong> ${run_data.get('total_cost_usd', 0):.4f}<br>
|
| 286 |
-
<strong>Avg Duration:</strong> {run_data.get('avg_duration_ms', 0):.0f}ms<br>
|
| 287 |
-
<strong>Submitted By:</strong> {run_data.get('submitted_by', 'Unknown')}
|
| 288 |
-
</div>
|
| 289 |
-
</div>
|
| 290 |
-
</div>
|
| 291 |
-
"""
|
| 292 |
|
| 293 |
# Format results for display
|
| 294 |
display_df = results_df.copy()
|
|
@@ -358,30 +667,8 @@ def load_run_detail(run_id):
|
|
| 358 |
|
| 359 |
results_df = data_loader.load_results(results_dataset)
|
| 360 |
|
| 361 |
-
#
|
| 362 |
-
|
| 363 |
-
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
| 364 |
-
padding: 20px; border-radius: 10px; color: white; margin-bottom: 20px;">
|
| 365 |
-
<h2 style="margin: 0 0 10px 0;">📊 Run Detail: {run_data.get('model', 'Unknown')}</h2>
|
| 366 |
-
<div style="display: grid; grid-template-columns: 1fr 1fr 1fr; gap: 20px; margin-top: 15px;">
|
| 367 |
-
<div>
|
| 368 |
-
<strong>Agent Type:</strong> {run_data.get('agent_type', 'N/A')}<br>
|
| 369 |
-
<strong>Provider:</strong> {run_data.get('provider', 'N/A')}<br>
|
| 370 |
-
<strong>Success Rate:</strong> {run_data.get('success_rate', 0):.1f}%
|
| 371 |
-
</div>
|
| 372 |
-
<div>
|
| 373 |
-
<strong>Total Tests:</strong> {run_data.get('total_tests', 0)}<br>
|
| 374 |
-
<strong>Successful:</strong> {run_data.get('successful_tests', 0)}<br>
|
| 375 |
-
<strong>Failed:</strong> {run_data.get('failed_tests', 0)}
|
| 376 |
-
</div>
|
| 377 |
-
<div>
|
| 378 |
-
<strong>Total Cost:</strong> ${run_data.get('total_cost_usd', 0):.4f}<br>
|
| 379 |
-
<strong>Avg Duration:</strong> {run_data.get('avg_duration_ms', 0):.0f}ms<br>
|
| 380 |
-
<strong>Submitted By:</strong> {run_data.get('submitted_by', 'Unknown')}
|
| 381 |
-
</div>
|
| 382 |
-
</div>
|
| 383 |
-
</div>
|
| 384 |
-
"""
|
| 385 |
|
| 386 |
# Format results for display
|
| 387 |
display_df = results_df.copy()
|
|
@@ -458,30 +745,8 @@ def on_drilldown_select(evt: gr.SelectData, df):
|
|
| 458 |
|
| 459 |
results_df = data_loader.load_results(results_dataset)
|
| 460 |
|
| 461 |
-
#
|
| 462 |
-
|
| 463 |
-
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
| 464 |
-
padding: 20px; border-radius: 10px; color: white; margin-bottom: 20px;">
|
| 465 |
-
<h2 style="margin: 0 0 10px 0;">📊 Run Detail: {run_data.get('model', 'Unknown')}</h2>
|
| 466 |
-
<div style="display: grid; grid-template-columns: 1fr 1fr 1fr; gap: 20px; margin-top: 15px;">
|
| 467 |
-
<div>
|
| 468 |
-
<strong>Agent Type:</strong> {run_data.get('agent_type', 'N/A')}<br>
|
| 469 |
-
<strong>Provider:</strong> {run_data.get('provider', 'N/A')}<br>
|
| 470 |
-
<strong>Success Rate:</strong> {run_data.get('success_rate', 0):.1f}%
|
| 471 |
-
</div>
|
| 472 |
-
<div>
|
| 473 |
-
<strong>Total Tests:</strong> {run_data.get('total_tests', 0)}<br>
|
| 474 |
-
<strong>Successful:</strong> {run_data.get('successful_tests', 0)}<br>
|
| 475 |
-
<strong>Failed:</strong> {run_data.get('failed_tests', 0)}
|
| 476 |
-
</div>
|
| 477 |
-
<div>
|
| 478 |
-
<strong>Total Cost:</strong> ${run_data.get('total_cost_usd', 0):.4f}<br>
|
| 479 |
-
<strong>Avg Duration:</strong> {run_data.get('avg_duration_ms', 0):.0f}ms<br>
|
| 480 |
-
<strong>Submitted By:</strong> {run_data.get('submitted_by', 'Unknown')}
|
| 481 |
-
</div>
|
| 482 |
-
</div>
|
| 483 |
-
</div>
|
| 484 |
-
"""
|
| 485 |
|
| 486 |
# Format results for display
|
| 487 |
display_df = results_df.copy()
|
|
@@ -697,23 +962,95 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
|
|
| 697 |
# Hidden textbox for row selection (JavaScript bridge)
|
| 698 |
selected_row_index = gr.Textbox(visible=False, elem_id="selected_row_index")
|
| 699 |
|
| 700 |
-
# Screen 3: Run Detail
|
| 701 |
with gr.Column(visible=False) as run_detail_screen:
|
| 702 |
# Navigation
|
| 703 |
with gr.Row():
|
| 704 |
back_to_leaderboard_btn = gr.Button("⬅️ Back to Leaderboard", variant="secondary", size="sm")
|
| 705 |
-
|
| 706 |
-
# Run
|
| 707 |
-
|
| 708 |
-
|
| 709 |
-
|
| 710 |
-
|
| 711 |
-
|
| 712 |
-
|
| 713 |
-
|
| 714 |
-
|
| 715 |
-
|
| 716 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 717 |
# Event handlers
|
| 718 |
app.load(
|
| 719 |
fn=load_leaderboard,
|
|
@@ -812,6 +1149,31 @@ with gr.Blocks(title="TraceMind-AI", theme=theme) as app:
|
|
| 812 |
outputs=[leaderboard_screen, run_detail_screen]
|
| 813 |
)
|
| 814 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 815 |
# HTML table row click handler (JavaScript bridge via hidden textbox)
|
| 816 |
selected_row_index.change(
|
| 817 |
fn=on_html_table_row_click,
|
|
|
|
| 21 |
create_cost_efficiency_scatter
|
| 22 |
)
|
| 23 |
from components.report_cards import generate_leaderboard_summary_card
|
| 24 |
+
from screens.trace_detail import (
|
| 25 |
+
create_span_visualization,
|
| 26 |
+
create_span_table,
|
| 27 |
+
create_gpu_metrics_dashboard,
|
| 28 |
+
create_gpu_summary_cards
|
| 29 |
+
)
|
| 30 |
from utils.navigation import Navigator, Screen
|
| 31 |
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
# Trace Detail handlers and helpers
|
| 35 |
+
|
| 36 |
+
def create_span_details_table(spans):
|
| 37 |
+
"""
|
| 38 |
+
Create table view of span details
|
| 39 |
+
|
| 40 |
+
Args:
|
| 41 |
+
spans: List of span dictionaries
|
| 42 |
+
|
| 43 |
+
Returns:
|
| 44 |
+
DataFrame with span details
|
| 45 |
+
"""
|
| 46 |
+
try:
|
| 47 |
+
if not spans:
|
| 48 |
+
return pd.DataFrame(columns=["Span Name", "Kind", "Duration (ms)", "Tokens", "Cost (USD)", "Status"])
|
| 49 |
+
|
| 50 |
+
rows = []
|
| 51 |
+
for span in spans:
|
| 52 |
+
name = span.get('name', 'Unknown')
|
| 53 |
+
kind = span.get('kind', 'INTERNAL')
|
| 54 |
+
|
| 55 |
+
# Get attributes
|
| 56 |
+
attributes = span.get('attributes', {})
|
| 57 |
+
if isinstance(attributes, dict) and 'openinference.span.kind' in attributes:
|
| 58 |
+
kind = attributes.get('openinference.span.kind', kind)
|
| 59 |
+
|
| 60 |
+
# Calculate duration
|
| 61 |
+
start = span.get('startTime') or span.get('startTimeUnixNano', 0)
|
| 62 |
+
end = span.get('endTime') or span.get('endTimeUnixNano', 0)
|
| 63 |
+
duration = (end - start) / 1000000 if start and end else 0 # Convert to ms
|
| 64 |
+
|
| 65 |
+
status = span.get('status', {}).get('code', 'OK') if isinstance(span.get('status'), dict) else 'OK'
|
| 66 |
+
|
| 67 |
+
# Extract tokens and cost information
|
| 68 |
+
tokens_str = "-"
|
| 69 |
+
cost_str = "-"
|
| 70 |
+
|
| 71 |
+
if isinstance(attributes, dict):
|
| 72 |
+
# Check for token usage
|
| 73 |
+
prompt_tokens = attributes.get('gen_ai.usage.prompt_tokens') or attributes.get('llm.token_count.prompt')
|
| 74 |
+
completion_tokens = attributes.get('gen_ai.usage.completion_tokens') or attributes.get('llm.token_count.completion')
|
| 75 |
+
total_tokens = attributes.get('llm.usage.total_tokens')
|
| 76 |
+
|
| 77 |
+
# Build tokens string
|
| 78 |
+
if prompt_tokens is not None and completion_tokens is not None:
|
| 79 |
+
total = int(prompt_tokens) + int(completion_tokens)
|
| 80 |
+
tokens_str = f"{total} ({int(prompt_tokens)}+{int(completion_tokens)})"
|
| 81 |
+
elif total_tokens is not None:
|
| 82 |
+
tokens_str = str(int(total_tokens))
|
| 83 |
+
|
| 84 |
+
# Check for cost
|
| 85 |
+
cost = attributes.get('gen_ai.usage.cost.total') or attributes.get('llm.usage.cost')
|
| 86 |
+
if cost is not None:
|
| 87 |
+
cost_str = f"${float(cost):.6f}"
|
| 88 |
+
|
| 89 |
+
rows.append({
|
| 90 |
+
"Span Name": name,
|
| 91 |
+
"Kind": kind,
|
| 92 |
+
"Duration (ms)": round(duration, 2),
|
| 93 |
+
"Tokens": tokens_str,
|
| 94 |
+
"Cost (USD)": cost_str,
|
| 95 |
+
"Status": status
|
| 96 |
+
})
|
| 97 |
+
|
| 98 |
+
return pd.DataFrame(rows)
|
| 99 |
+
|
| 100 |
+
except Exception as e:
|
| 101 |
+
print(f"[ERROR] create_span_details_table: {e}")
|
| 102 |
+
import traceback
|
| 103 |
+
traceback.print_exc()
|
| 104 |
+
return pd.DataFrame(columns=["Span Name", "Kind", "Duration (ms)", "Tokens", "Cost (USD)", "Status"])
|
| 105 |
+
|
| 106 |
+
|
| 107 |
+
def create_trace_metadata_html(trace_data: dict) -> str:
|
| 108 |
+
"""Create HTML for trace metadata display"""
|
| 109 |
+
trace_id = trace_data.get('trace_id', 'Unknown')
|
| 110 |
+
spans = trace_data.get('spans', [])
|
| 111 |
+
if hasattr(spans, 'tolist'):
|
| 112 |
+
spans = spans.tolist()
|
| 113 |
+
elif not isinstance(spans, list):
|
| 114 |
+
spans = list(spans) if spans is not None else []
|
| 115 |
+
|
| 116 |
+
metadata_html = f"""
|
| 117 |
+
<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
| 118 |
+
padding: 20px; border-radius: 10px; color: white; margin-bottom: 20px;">
|
| 119 |
+
<h3 style="margin: 0 0 10px 0;">Trace Information</h3>
|
| 120 |
+
<div style="display: grid; grid-template-columns: 1fr 1fr; gap: 15px;">
|
| 121 |
+
<div>
|
| 122 |
+
<strong>Trace ID:</strong> {trace_id}<br>
|
| 123 |
+
<strong>Total Spans:</strong> {len(spans)}
|
| 124 |
+
</div>
|
| 125 |
+
</div>
|
| 126 |
+
</div>
|
| 127 |
+
"""
|
| 128 |
+
return metadata_html
|
| 129 |
+
|
| 130 |
+
|
| 131 |
+
def on_test_case_select(evt: gr.SelectData, df):
|
| 132 |
+
"""Handle test case selection in run detail - navigate to trace detail"""
|
| 133 |
+
global current_selected_run, current_selected_trace
|
| 134 |
+
|
| 135 |
+
print(f"[DEBUG] on_test_case_select called with index: {evt.index}")
|
| 136 |
+
|
| 137 |
+
# Check if we have a selected run
|
| 138 |
+
if current_selected_run is None:
|
| 139 |
+
print("[ERROR] No run selected - current_selected_run is None")
|
| 140 |
+
gr.Warning("Please select a run from the leaderboard first")
|
| 141 |
+
return {}
|
| 142 |
+
|
| 143 |
+
try:
|
| 144 |
+
# Get selected test case
|
| 145 |
+
selected_idx = evt.index[0]
|
| 146 |
+
if df is None or df.empty or selected_idx >= len(df):
|
| 147 |
+
gr.Warning("Invalid test case selection")
|
| 148 |
+
return {}
|
| 149 |
+
|
| 150 |
+
test_case = df.iloc[selected_idx].to_dict()
|
| 151 |
+
trace_id = test_case.get('trace_id')
|
| 152 |
+
|
| 153 |
+
print(f"[DEBUG] Selected test case: {test_case.get('task_id', 'Unknown')} (trace_id: {trace_id})")
|
| 154 |
+
|
| 155 |
+
# Load trace data
|
| 156 |
+
traces_dataset = current_selected_run.get('traces_dataset')
|
| 157 |
+
if not traces_dataset:
|
| 158 |
+
gr.Warning("No traces dataset found in current run")
|
| 159 |
+
return {}
|
| 160 |
+
|
| 161 |
+
trace_data = data_loader.get_trace_by_id(traces_dataset, trace_id)
|
| 162 |
+
|
| 163 |
+
if not trace_data:
|
| 164 |
+
gr.Warning(f"Trace not found: {trace_id}")
|
| 165 |
+
return {}
|
| 166 |
+
|
| 167 |
+
current_selected_trace = trace_data
|
| 168 |
+
|
| 169 |
+
# Get spans and ensure it's a list
|
| 170 |
+
spans = trace_data.get('spans', [])
|
| 171 |
+
if hasattr(spans, 'tolist'):
|
| 172 |
+
spans = spans.tolist()
|
| 173 |
+
elif not isinstance(spans, list):
|
| 174 |
+
spans = list(spans) if spans is not None else []
|
| 175 |
+
|
| 176 |
+
print(f"[DEBUG] Loaded trace with {len(spans)} spans")
|
| 177 |
+
|
| 178 |
+
# Create visualizations
|
| 179 |
+
span_viz_plot = create_span_visualization(spans, trace_id)
|
| 180 |
+
span_details_json = create_span_table(spans).value
|
| 181 |
+
|
| 182 |
+
# Create thought graph
|
| 183 |
+
from components.thought_graph import create_thought_graph as create_network_graph
|
| 184 |
+
thought_graph_plot = create_network_graph(spans, trace_id)
|
| 185 |
+
|
| 186 |
+
# Create span details table
|
| 187 |
+
span_table_df = create_span_details_table(spans)
|
| 188 |
+
|
| 189 |
+
# Load GPU metrics (if available)
|
| 190 |
+
gpu_summary_html = "<div style='padding: 20px; text-align: center;'>⚠️ No GPU metrics available (expected for API models)</div>"
|
| 191 |
+
gpu_plot = None
|
| 192 |
+
gpu_json_data = {}
|
| 193 |
+
|
| 194 |
+
try:
|
| 195 |
+
if 'metrics_dataset' in current_selected_run and current_selected_run['metrics_dataset']:
|
| 196 |
+
metrics_dataset = current_selected_run['metrics_dataset']
|
| 197 |
+
gpu_metrics_data = data_loader.load_metrics(metrics_dataset)
|
| 198 |
+
|
| 199 |
+
if gpu_metrics_data is not None and not gpu_metrics_data.empty:
|
| 200 |
+
gpu_plot = create_gpu_metrics_dashboard(gpu_metrics_data)
|
| 201 |
+
gpu_summary_html = create_gpu_summary_cards(gpu_metrics_data)
|
| 202 |
+
gpu_json_data = gpu_metrics_data.to_dict('records')
|
| 203 |
+
except Exception as e:
|
| 204 |
+
print(f"[WARNING] Could not load GPU metrics: {e}")
|
| 205 |
+
|
| 206 |
+
# Return dictionary with visibility updates and data
|
| 207 |
+
return {
|
| 208 |
+
run_detail_screen: gr.update(visible=False),
|
| 209 |
+
trace_detail_screen: gr.update(visible=True),
|
| 210 |
+
trace_title: gr.update(value=f"# 🔍 Trace Detail: {trace_id}"),
|
| 211 |
+
trace_metadata_html: gr.update(value=create_trace_metadata_html(trace_data)),
|
| 212 |
+
trace_thought_graph: gr.update(value=thought_graph_plot),
|
| 213 |
+
span_visualization: gr.update(value=span_viz_plot),
|
| 214 |
+
span_details_table: gr.update(value=span_table_df),
|
| 215 |
+
span_details_json: gr.update(value=span_details_json),
|
| 216 |
+
gpu_summary_cards_html: gr.update(value=gpu_summary_html),
|
| 217 |
+
gpu_metrics_plot: gr.update(value=gpu_plot),
|
| 218 |
+
gpu_metrics_json: gr.update(value=gpu_json_data)
|
| 219 |
+
}
|
| 220 |
+
|
| 221 |
+
except Exception as e:
|
| 222 |
+
print(f"[ERROR] on_test_case_select failed: {e}")
|
| 223 |
+
import traceback
|
| 224 |
+
traceback.print_exc()
|
| 225 |
+
gr.Warning(f"Error loading trace: {e}")
|
| 226 |
+
return {}
|
| 227 |
+
|
| 228 |
+
|
| 229 |
+
|
| 230 |
+
def create_performance_charts(results_df):
|
| 231 |
+
"""
|
| 232 |
+
Create performance analysis charts for the Performance tab
|
| 233 |
+
|
| 234 |
+
Args:
|
| 235 |
+
results_df: DataFrame with test results
|
| 236 |
+
|
| 237 |
+
Returns:
|
| 238 |
+
Plotly figure with performance metrics
|
| 239 |
+
"""
|
| 240 |
+
import plotly.graph_objects as go
|
| 241 |
+
from plotly.subplots import make_subplots
|
| 242 |
+
|
| 243 |
+
try:
|
| 244 |
+
if results_df.empty:
|
| 245 |
+
fig = go.Figure()
|
| 246 |
+
fig.add_annotation(text="No performance data available", showarrow=False)
|
| 247 |
+
return fig
|
| 248 |
+
|
| 249 |
+
# Create 2x2 subplots
|
| 250 |
+
fig = make_subplots(
|
| 251 |
+
rows=2, cols=2,
|
| 252 |
+
subplot_titles=(
|
| 253 |
+
"Response Time Distribution",
|
| 254 |
+
"Token Usage per Test",
|
| 255 |
+
"Cost per Test",
|
| 256 |
+
"Success vs Failure"
|
| 257 |
+
),
|
| 258 |
+
specs=[[{"type": "histogram"}, {"type": "bar"}],
|
| 259 |
+
[{"type": "bar"}, {"type": "pie"}]]
|
| 260 |
+
)
|
| 261 |
+
|
| 262 |
+
# 1. Response Time Distribution (Histogram)
|
| 263 |
+
if 'execution_time_ms' in results_df.columns:
|
| 264 |
+
fig.add_trace(
|
| 265 |
+
go.Histogram(
|
| 266 |
+
x=results_df['execution_time_ms'],
|
| 267 |
+
nbinsx=20,
|
| 268 |
+
marker_color='#3498DB',
|
| 269 |
+
name='Response Time',
|
| 270 |
+
showlegend=False
|
| 271 |
+
),
|
| 272 |
+
row=1, col=1
|
| 273 |
+
)
|
| 274 |
+
fig.update_xaxes(title_text="Time (ms)", row=1, col=1)
|
| 275 |
+
fig.update_yaxes(title_text="Count", row=1, col=1)
|
| 276 |
+
|
| 277 |
+
# 2. Token Usage per Test (Bar)
|
| 278 |
+
if 'total_tokens' in results_df.columns:
|
| 279 |
+
test_indices = list(range(len(results_df)))
|
| 280 |
+
fig.add_trace(
|
| 281 |
+
go.Bar(
|
| 282 |
+
x=test_indices,
|
| 283 |
+
y=results_df['total_tokens'],
|
| 284 |
+
marker_color='#9B59B6',
|
| 285 |
+
name='Tokens',
|
| 286 |
+
showlegend=False
|
| 287 |
+
),
|
| 288 |
+
row=1, col=2
|
| 289 |
+
)
|
| 290 |
+
fig.update_xaxes(title_text="Test Index", row=1, col=2)
|
| 291 |
+
fig.update_yaxes(title_text="Tokens", row=1, col=2)
|
| 292 |
+
|
| 293 |
+
# 3. Cost per Test (Bar)
|
| 294 |
+
if 'cost_usd' in results_df.columns:
|
| 295 |
+
test_indices = list(range(len(results_df)))
|
| 296 |
+
fig.add_trace(
|
| 297 |
+
go.Bar(
|
| 298 |
+
x=test_indices,
|
| 299 |
+
y=results_df['cost_usd'],
|
| 300 |
+
marker_color='#E67E22',
|
| 301 |
+
name='Cost',
|
| 302 |
+
showlegend=False
|
| 303 |
+
),
|
| 304 |
+
row=2, col=1
|
| 305 |
+
)
|
| 306 |
+
fig.update_xaxes(title_text="Test Index", row=2, col=1)
|
| 307 |
+
fig.update_yaxes(title_text="Cost (USD)", row=2, col=1)
|
| 308 |
+
|
| 309 |
+
# 4. Success vs Failure (Pie)
|
| 310 |
+
if 'success' in results_df.columns:
|
| 311 |
+
# Convert to boolean if needed
|
| 312 |
+
success_series = results_df['success']
|
| 313 |
+
if success_series.dtype == object:
|
| 314 |
+
success_series = success_series == "✅"
|
| 315 |
+
|
| 316 |
+
success_count = int(success_series.sum())
|
| 317 |
+
failure_count = len(results_df) - success_count
|
| 318 |
+
|
| 319 |
+
fig.add_trace(
|
| 320 |
+
go.Pie(
|
| 321 |
+
labels=['Success', 'Failure'],
|
| 322 |
+
values=[success_count, failure_count],
|
| 323 |
+
marker_colors=['#2ECC71', '#E74C3C'],
|
| 324 |
+
showlegend=True
|
| 325 |
+
),
|
| 326 |
+
row=2, col=2
|
| 327 |
+
)
|
| 328 |
+
|
| 329 |
+
# Update layout
|
| 330 |
+
fig.update_layout(
|
| 331 |
+
height=700,
|
| 332 |
+
showlegend=False,
|
| 333 |
+
title_text="Performance Analysis Dashboard",
|
| 334 |
+
title_x=0.5
|
| 335 |
+
)
|
| 336 |
+
|
| 337 |
+
return fig
|
| 338 |
+
|
| 339 |
+
except Exception as e:
|
| 340 |
+
print(f"[ERROR] create_performance_charts: {e}")
|
| 341 |
+
import traceback
|
| 342 |
+
traceback.print_exc()
|
| 343 |
+
fig = go.Figure()
|
| 344 |
+
fig.add_annotation(text=f"Error creating charts: {str(e)}", showarrow=False)
|
| 345 |
+
return fig
|
| 346 |
+
|
| 347 |
+
|
| 348 |
+
|
| 349 |
+
def go_back_to_run_detail():
|
| 350 |
+
"""Navigate from trace detail back to run detail"""
|
| 351 |
+
return {
|
| 352 |
+
run_detail_screen: gr.update(visible=True),
|
| 353 |
+
trace_detail_screen: gr.update(visible=False)
|
| 354 |
+
}
|
| 355 |
+
|
| 356 |
+
|
| 357 |
# Initialize data loader
|
| 358 |
data_loader = create_data_loader_from_env()
|
| 359 |
navigator = Navigator()
|
|
|
|
| 596 |
|
| 597 |
results_df = data_loader.load_results(results_dataset)
|
| 598 |
|
| 599 |
+
# Generate performance chart
|
| 600 |
+
perf_chart = create_performance_charts(results_df)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 601 |
|
| 602 |
# Format results for display
|
| 603 |
display_df = results_df.copy()
|
|
|
|
| 667 |
|
| 668 |
results_df = data_loader.load_results(results_dataset)
|
| 669 |
|
| 670 |
+
# Generate performance chart
|
| 671 |
+
perf_chart = create_performance_charts(results_df)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 672 |
|
| 673 |
# Format results for display
|
| 674 |
display_df = results_df.copy()
|
|
|
|
| 745 |
|
| 746 |
results_df = data_loader.load_results(results_dataset)
|
| 747 |
|
| 748 |
+
# Generate performance chart
|
| 749 |
+
perf_chart = create_performance_charts(results_df)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 750 |
|
| 751 |
# Format results for display
|
| 752 |
display_df = results_df.copy()
|
|
|
|
| 962 |
# Hidden textbox for row selection (JavaScript bridge)
|
| 963 |
selected_row_index = gr.Textbox(visible=False, elem_id="selected_row_index")
|
| 964 |
|
| 965 |
+
# Screen 3: Run Detail (Enhanced with Tabs)
|
| 966 |
with gr.Column(visible=False) as run_detail_screen:
|
| 967 |
# Navigation
|
| 968 |
with gr.Row():
|
| 969 |
back_to_leaderboard_btn = gr.Button("⬅️ Back to Leaderboard", variant="secondary", size="sm")
|
| 970 |
+
|
| 971 |
+
run_detail_title = gr.Markdown("# 📊 Run Detail")
|
| 972 |
+
|
| 973 |
+
with gr.Tabs():
|
| 974 |
+
with gr.TabItem("📋 Overview"):
|
| 975 |
+
gr.Markdown("*Run metadata and summary*")
|
| 976 |
+
run_metadata_html = gr.HTML("")
|
| 977 |
+
|
| 978 |
+
with gr.TabItem("✅ Test Cases"):
|
| 979 |
+
gr.Markdown("*Individual test case results*")
|
| 980 |
+
test_cases_table = gr.Dataframe(
|
| 981 |
+
headers=["Task ID", "Status", "Tool", "Duration", "Tokens", "Cost", "Trace ID"],
|
| 982 |
+
interactive=False,
|
| 983 |
+
wrap=True
|
| 984 |
+
)
|
| 985 |
+
gr.Markdown("*Click a test case to view detailed trace (including Thought Graph)*")
|
| 986 |
+
|
| 987 |
+
with gr.TabItem("⚡ Performance"):
|
| 988 |
+
gr.Markdown("*Performance metrics and charts*")
|
| 989 |
+
performance_charts = gr.Plot(label="Performance Analysis", show_label=False)
|
| 990 |
+
|
| 991 |
+
# Screen 4: Trace Detail with Sub-tabs
|
| 992 |
+
with gr.Column(visible=False) as trace_detail_screen:
|
| 993 |
+
with gr.Row():
|
| 994 |
+
back_to_run_detail_btn = gr.Button("⬅️ Back to Run Detail", variant="secondary", size="sm")
|
| 995 |
+
|
| 996 |
+
trace_title = gr.Markdown("# 🔍 Trace Detail")
|
| 997 |
+
trace_metadata_html = gr.HTML("")
|
| 998 |
+
|
| 999 |
+
with gr.Tabs():
|
| 1000 |
+
with gr.TabItem("🧠 Thought Graph"):
|
| 1001 |
+
gr.Markdown("""
|
| 1002 |
+
### Agent Reasoning Flow
|
| 1003 |
+
|
| 1004 |
+
This interactive network graph shows **how your agent thinks** - the logical flow of reasoning steps,
|
| 1005 |
+
tool calls, and LLM interactions.
|
| 1006 |
+
|
| 1007 |
+
**How to read it:**
|
| 1008 |
+
- 🟣 **Purple nodes** = LLM reasoning steps
|
| 1009 |
+
- 🟠 **Orange nodes** = Tool calls
|
| 1010 |
+
- 🔵 **Blue nodes** = Chains/Agents
|
| 1011 |
+
- **Arrows** = Flow from one step to the next
|
| 1012 |
+
- **Hover** = See tokens, costs, and timing details
|
| 1013 |
+
""")
|
| 1014 |
+
trace_thought_graph = gr.Plot(label="Thought Graph", show_label=False)
|
| 1015 |
+
|
| 1016 |
+
with gr.TabItem("📊 Waterfall"):
|
| 1017 |
+
gr.Markdown("*Interactive waterfall diagram showing span execution timeline*")
|
| 1018 |
+
gr.Markdown("*Hover over spans for details. Drag to zoom, double-click to reset.*")
|
| 1019 |
+
span_visualization = gr.Plot(label="Trace Waterfall", show_label=False)
|
| 1020 |
+
|
| 1021 |
+
with gr.TabItem("🖥️ GPU Metrics"):
|
| 1022 |
+
gr.Markdown("*Performance metrics for GPU-based models (not available for API models)*")
|
| 1023 |
+
gpu_summary_cards_html = gr.HTML(label="GPU Summary", show_label=False)
|
| 1024 |
+
|
| 1025 |
+
with gr.Tabs():
|
| 1026 |
+
with gr.TabItem("📈 Time Series Dashboard"):
|
| 1027 |
+
gpu_metrics_plot = gr.Plot(label="GPU Metrics Over Time", show_label=False)
|
| 1028 |
+
|
| 1029 |
+
with gr.TabItem("📋 Raw Metrics Data"):
|
| 1030 |
+
gpu_metrics_json = gr.JSON(label="GPU Metrics Data")
|
| 1031 |
+
|
| 1032 |
+
with gr.TabItem("📝 Span Details"):
|
| 1033 |
+
gr.Markdown("*Detailed span information with token and cost data*")
|
| 1034 |
+
span_details_table = gr.Dataframe(
|
| 1035 |
+
headers=["Span Name", "Kind", "Duration (ms)", "Tokens", "Cost (USD)", "Status"],
|
| 1036 |
+
interactive=False,
|
| 1037 |
+
wrap=True,
|
| 1038 |
+
label="Span Breakdown"
|
| 1039 |
+
)
|
| 1040 |
+
|
| 1041 |
+
with gr.TabItem("🔍 Raw Data"):
|
| 1042 |
+
gr.Markdown("*Raw OpenTelemetry trace data (JSON)*")
|
| 1043 |
+
span_details_json = gr.JSON()
|
| 1044 |
+
|
| 1045 |
+
with gr.Accordion("🤖 Ask About This Trace", open=False):
|
| 1046 |
+
trace_question = gr.Textbox(
|
| 1047 |
+
label="Question",
|
| 1048 |
+
placeholder="e.g., Why was the tool called twice?",
|
| 1049 |
+
lines=2
|
| 1050 |
+
)
|
| 1051 |
+
trace_ask_btn = gr.Button("Ask", variant="primary")
|
| 1052 |
+
trace_answer = gr.Markdown("*Ask a question to get AI-powered insights*")
|
| 1053 |
+
|
| 1054 |
# Event handlers
|
| 1055 |
app.load(
|
| 1056 |
fn=load_leaderboard,
|
|
|
|
| 1149 |
outputs=[leaderboard_screen, run_detail_screen]
|
| 1150 |
)
|
| 1151 |
|
| 1152 |
+
# Trace detail navigation
|
| 1153 |
+
test_cases_table.select(
|
| 1154 |
+
fn=on_test_case_select,
|
| 1155 |
+
inputs=[test_cases_table],
|
| 1156 |
+
outputs=[
|
| 1157 |
+
run_detail_screen,
|
| 1158 |
+
trace_detail_screen,
|
| 1159 |
+
trace_title,
|
| 1160 |
+
trace_metadata_html,
|
| 1161 |
+
trace_thought_graph,
|
| 1162 |
+
span_visualization,
|
| 1163 |
+
span_details_table,
|
| 1164 |
+
span_details_json,
|
| 1165 |
+
gpu_summary_cards_html,
|
| 1166 |
+
gpu_metrics_plot,
|
| 1167 |
+
gpu_metrics_json
|
| 1168 |
+
]
|
| 1169 |
+
)
|
| 1170 |
+
|
| 1171 |
+
back_to_run_detail_btn.click(
|
| 1172 |
+
fn=go_back_to_run_detail,
|
| 1173 |
+
outputs=[run_detail_screen, trace_detail_screen]
|
| 1174 |
+
)
|
| 1175 |
+
|
| 1176 |
+
|
| 1177 |
# HTML table row click handler (JavaScript bridge via hidden textbox)
|
| 1178 |
selected_row_index.change(
|
| 1179 |
fn=on_html_table_row_click,
|
|
@@ -0,0 +1,398 @@
"""
Thought Graph Visualization Component
Visualizes agent reasoning flow as an interactive network graph
"""

import math

import plotly.graph_objects as go
import networkx as nx
from typing import List, Dict, Any, Tuple


def create_thought_graph(spans: List[Dict[str, Any]], trace_id: str = "Unknown") -> go.Figure:
    """
    Create an interactive thought graph showing agent reasoning flow

    This is different from the waterfall chart - it shows the logical flow
    of the agent's thinking process (LLM calls, tool calls, etc.) as a
    directed graph rather than a timeline.

    Args:
        spans: List of OpenTelemetry span dictionaries
        trace_id: Trace identifier

    Returns:
        Plotly figure with interactive network graph
    """
    # Ensure spans is a list (it may arrive as a numpy array)
    if hasattr(spans, 'tolist'):
        spans = spans.tolist()
    elif not isinstance(spans, list):
        spans = list(spans) if spans is not None else []

    if not spans:
        # Return an empty figure with a message
        fig = go.Figure()
        fig.add_annotation(
            text="No reasoning steps to display",
            xref="paper", yref="paper",
            x=0.5, y=0.5, xanchor='center', yanchor='middle',
            showarrow=False,
            font=dict(size=20)
        )
        return fig

    # Build graph from spans
    G = nx.DiGraph()

    # First pass: add all nodes and build span_map
    span_map = {}
    for span in spans:
        span_id = span.get('spanId') or span.get('span_id') or span.get('spanID')
        if not span_id:
            continue

        # Get span details
        name = span.get('name', 'Unknown')
        kind = span.get('kind', 'INTERNAL')
        attributes = span.get('attributes', {})

        # Prefer the OpenInference span kind when present
        if isinstance(attributes, dict) and 'openinference.span.kind' in attributes:
            openinference_kind = attributes.get('openinference.span.kind', kind)
            if openinference_kind:  # only call .upper() if not None
                kind = openinference_kind.upper()

        # Extract metadata for the node
        node_data = {
            'span_id': span_id,
            'name': name,
            'kind': kind,
            'attributes': attributes,
            'status': span.get('status', {}).get('code', 'OK')
        }

        # Add token and cost info if available
        if isinstance(attributes, dict):
            # Token info
            if 'gen_ai.usage.prompt_tokens' in attributes:
                node_data['prompt_tokens'] = attributes['gen_ai.usage.prompt_tokens']
            if 'gen_ai.usage.completion_tokens' in attributes:
                node_data['completion_tokens'] = attributes['gen_ai.usage.completion_tokens']

            # Cost info
            if 'gen_ai.usage.cost.total' in attributes:
                node_data['cost'] = attributes['gen_ai.usage.cost.total']
            elif 'llm.usage.cost' in attributes:
                node_data['cost'] = attributes['llm.usage.cost']

            # Model info
            if 'gen_ai.request.model' in attributes:
                node_data['model'] = attributes['gen_ai.request.model']
            elif 'llm.model' in attributes:
                node_data['model'] = attributes['llm.model']

            # Tool info
            if 'tool.name' in attributes:
                node_data['tool_name'] = attributes['tool.name']

        # Add node to graph
        G.add_node(span_id, **node_data)
        span_map[span_id] = span

    # Second pass: add all edges (every node now exists in span_map)
    for span in spans:
        span_id = span.get('spanId') or span.get('span_id') or span.get('spanID')
        if not span_id:
            continue

        parent_id = span.get('parentSpanId') or span.get('parent_span_id') or span.get('parentSpanID')
        if parent_id and parent_id in span_map:
            G.add_edge(parent_id, span_id)
            print(f"[DEBUG] Added edge: {parent_id} → {span_id}")

    print(f"[DEBUG] Graph created: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")

    if G.number_of_nodes() == 0:
        # Return an empty figure with a message
        fig = go.Figure()
        fig.add_annotation(
            text="No valid spans to display",
            xref="paper", yref="paper",
            x=0.5, y=0.5, xanchor='center', yanchor='middle',
            showarrow=False,
            font=dict(size=20)
        )
        return fig

    # Calculate layout
    try:
        # Spring layout as the default
        pos = nx.spring_layout(G, k=2, iterations=50, seed=42)

        # If the graph is a DAG, use a hierarchical layout instead
        if nx.is_directed_acyclic_graph(G):
            # Level = longest path from any root to the node
            levels = {}
            for node in G.nodes():
                try:
                    roots = [n for n in G.nodes() if G.in_degree(n) == 0]
                    max_depth = 0
                    for root in roots:
                        if nx.has_path(G, root, node):
                            paths = list(nx.all_simple_paths(G, root, node))
                            max_depth = max(max_depth, max(len(p) for p in paths) if paths else 0)
                    levels[node] = max_depth
                except Exception:
                    levels[node] = 0

            pos = create_hierarchical_layout(G, levels)
    except Exception as e:
        print(f"[DEBUG] Layout calculation error: {e}")
        # Fall back to a circular layout
        pos = nx.circular_layout(G)

    # Extract node positions
    node_x = []
    node_y = []
    node_text = []
    node_colors = []
    node_sizes = []
    hover_text = []

    for node in G.nodes():
        x, y = pos[node]
        node_x.append(x)
        node_y.append(y)

        # Get node data
        node_data = G.nodes[node]
        name = node_data.get('name', 'Unknown')
        kind = node_data.get('kind', 'INTERNAL')

        # Create label (shortened)
        label = shorten_label(name, max_length=20)
        node_text.append(label)

        # Assign color based on kind and status
        color = get_node_color(kind, node_data.get('status', 'OK'))
        node_colors.append(color)

        # Size based on importance (LLM, AGENT, and CHAIN nodes are larger)
        size = 40 if kind in ['LLM', 'AGENT', 'CHAIN'] else 30
        node_sizes.append(size)

        # Create detailed hover text
        hover = f"<b>{name}</b><br>"
        hover += f"Type: {kind}<br>"
        hover += f"Status: {node_data.get('status', 'OK')}<br>"

        if 'model' in node_data:
            hover += f"Model: {node_data['model']}<br>"
        if 'tool_name' in node_data:
            hover += f"Tool: {node_data['tool_name']}<br>"
        if 'prompt_tokens' in node_data or 'completion_tokens' in node_data:
            prompt = node_data.get('prompt_tokens', 0)
            completion = node_data.get('completion_tokens', 0)
            hover += f"Tokens: {prompt + completion} (p:{prompt}, c:{completion})<br>"
        if 'cost' in node_data and node_data['cost'] is not None:
            hover += f"Cost: ${node_data['cost']:.6f}<br>"

        hover_text.append(hover)

    # Draw edges
    edge_traces = []
    print(f"[DEBUG] Drawing {G.number_of_edges()} edges")
    for edge in G.edges():
        x0, y0 = pos[edge[0]]
        x1, y1 = pos[edge[1]]
        print(f"[DEBUG] Edge from ({x0:.2f}, {y0:.2f}) to ({x1:.2f}, {y1:.2f})")

        # Edge line (thicker and darker for visibility)
        edge_trace = go.Scatter(
            x=[x0, x1, None],
            y=[y0, y1, None],
            mode='lines',
            line=dict(width=3, color='#555'),
            hoverinfo='none',
            showlegend=False
        )
        edge_traces.append(edge_trace)

        # Arrow head
        edge_traces.append(create_arrow_annotation(x0, y0, x1, y1))

    # Create node trace
    node_trace = go.Scatter(
        x=node_x,
        y=node_y,
        mode='markers+text',
        marker=dict(
            size=node_sizes,
            color=node_colors,
            line=dict(width=2, color='white')
        ),
        text=node_text,
        textposition='bottom center',
        textfont=dict(size=10, color='#333'),
        hovertext=hover_text,
        hoverinfo='text',
        showlegend=False
    )

    # Create figure
    fig = go.Figure(data=edge_traces + [node_trace])

    # Update layout with better visibility settings
    fig.update_layout(
        title={
            'text': f"🧠 Agent Thought Graph: {trace_id}",
            'x': 0.5,
            'xanchor': 'center',
            'font': {'size': 20}
        },
        showlegend=False,
        hovermode='closest',
        margin=dict(t=100, b=40, l=40, r=40),
        height=600,
        xaxis=dict(
            showgrid=False,
            zeroline=False,
            showticklabels=False,
            range=[-0.1, 1.1]  # padding so edges at the boundary stay visible
        ),
        yaxis=dict(
            showgrid=False,
            zeroline=False,
            showticklabels=False,
            range=[-0.1, 1.1]  # padding so edges at the boundary stay visible
        ),
        plot_bgcolor='white',     # pure white background for maximum contrast
        paper_bgcolor='#f8f9fa',  # light gray paper
        annotations=[
            dict(
                text="💡 Hover over nodes to see details | Arrows show execution flow",
                xref="paper", yref="paper",
                x=0.5, y=-0.05, xanchor='center', yanchor='top',
                showarrow=False,
                font=dict(size=11, color='#666')
            )
        ]
    )

    # Add legend for node types
    legend_items = create_legend_items()
    fig.add_annotation(
        text=legend_items,
        xref="paper", yref="paper",
        x=1.0, y=1.0, xanchor='right', yanchor='top',
        showarrow=False,
        font=dict(size=10),
        align='left',
        bgcolor='white',
        bordercolor='#ccc',
        borderwidth=1,
        borderpad=8
    )

    return fig


def create_hierarchical_layout(G: nx.DiGraph, levels: Dict[str, int]) -> Dict[str, Tuple[float, float]]:
    """Create a hierarchical (top-to-bottom) layout for the graph"""
    pos = {}

    # Group nodes by level
    level_nodes = {}
    for node, level in levels.items():
        level_nodes.setdefault(level, []).append(node)

    # Assign positions: levels top to bottom, nodes spread evenly left to right
    max_level = max(levels.values()) if levels else 0
    for level, nodes in level_nodes.items():
        y = 1.0 - (level / max(max_level, 1))
        num_nodes = len(nodes)
        for i, node in enumerate(nodes):
            x = (i + 1) / (num_nodes + 1)
            pos[node] = (x, y)

    return pos


def get_node_color(kind: str, status: str) -> str:
    """Get the color for a node based on its kind and status"""
    # Error status overrides kind color
    if status == 'ERROR':
        return '#DC143C'  # Crimson

    # Color by kind
    color_map = {
        'LLM': '#9B59B6',        # Purple
        'AGENT': '#1ABC9C',      # Turquoise
        'CHAIN': '#3498DB',      # Light Blue
        'TOOL': '#E67E22',       # Orange
        'RETRIEVER': '#F39C12',  # Yellow-Orange
        'EMBEDDING': '#8E44AD',  # Dark Purple
        'CLIENT': '#4169E1',     # Royal Blue
        'SERVER': '#2E8B57',     # Sea Green
        'INTERNAL': '#95A5A6',   # Gray
    }
    return color_map.get(kind, '#4682B4')  # Steel Blue default


def shorten_label(text: str, max_length: int = 20) -> str:
    """Shorten a label for display"""
    if len(text) <= max_length:
        return text
    return text[:max_length - 3] + '...'


def create_arrow_annotation(x0: float, y0: float, x1: float, y1: float) -> go.Scatter:
    """Create an arrow-head trace pointing along the edge from (x0, y0) to (x1, y1)"""
    # Place the arrow 70% along the line, closer to the end
    arrow_x = x0 + 0.7 * (x1 - x0)
    arrow_y = y0 + 0.7 * (y1 - y0)

    # Angle of the edge, for arrow direction
    angle = math.atan2(y1 - y0, x1 - x0)

    # Arrow head (larger for visibility)
    arrow_size = 0.03
    arrow_dx = arrow_size * math.cos(angle + 2.8)
    arrow_dy = arrow_size * math.sin(angle + 2.8)

    arrow_trace = go.Scatter(
        x=[arrow_x - arrow_dx, arrow_x, arrow_x + arrow_size * math.cos(angle - 2.8)],
        y=[arrow_y - arrow_dy, arrow_y, arrow_y + arrow_size * math.sin(angle - 2.8)],
        mode='lines',
        line=dict(width=2, color='#555'),  # match edge color
        fill='toself',
        fillcolor='#555',
        hoverinfo='none',
        showlegend=False
    )

    return arrow_trace


def create_legend_items() -> str:
    """Create an HTML legend for node types"""
    legend = "<b>Node Types:</b><br>"
    legend += "🟣 LLM Call<br>"
    legend += "🟠 Tool Call<br>"
    legend += "🔵 Chain/Agent<br>"
    legend += "⚪ Other<br>"
    legend += "🔴 Error"
    return legend
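The two-pass build in `create_thought_graph` (register every node first, then add edges only when the parent is already known) is what prevents orphan edges when a span's parent lies outside the trace. A dependency-free sketch of the same idea, using plain dicts instead of `networkx`:

```python
# Dependency-free sketch of the two-pass node/edge build used above.
# Spans whose parentSpanId is missing or unknown simply become roots.

def build_edges(spans):
    """Return (node_ids, edges) where edges only link known spans."""
    # Pass 1: collect every span id
    known = {s["spanId"] for s in spans if s.get("spanId")}
    # Pass 2: emit an edge only if the parent was seen in pass 1
    edges = [
        (s["parentSpanId"], s["spanId"])
        for s in spans
        if s.get("spanId") and s.get("parentSpanId") in known
    ]
    return known, edges

spans = [
    {"spanId": "root"},
    {"spanId": "llm-1", "parentSpanId": "root"},
    {"spanId": "tool-1", "parentSpanId": "missing-parent"},  # edge dropped
]
nodes, edges = build_edges(spans)
assert edges == [("root", "llm-1")]
assert "tool-1" in nodes  # the node survives even though its edge does not
```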
@@ -0,0 +1,721 @@ screens/trace_detail.py (new file)
"""
Screen 4: Trace Detail View
Shows detailed OpenTelemetry trace visualization
"""

import gradio as gr
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from datetime import datetime
import pandas as pd
from typing import Optional, Callable, Dict, Any, List
from components.thought_graph import create_thought_graph


def create_trace_detail_screen(
    trace_data: dict,
    on_back: Optional[Callable] = None,
    mcp_qa_enabled: bool = True
) -> gr.Blocks:
    """
    Create the trace detail screen UI

    Args:
        trace_data: OpenTelemetry trace data
        on_back: Callback for the back button
        mcp_qa_enabled: Enable the MCP Q&A tool

    Returns:
        Gradio Blocks for the trace detail screen
    """
    with gr.Blocks() as trace_detail:
        with gr.Row():
            if on_back:
                back_btn = gr.Button("⬅️ Back to Run Detail", variant="secondary", size="sm")

        gr.Markdown(f"# 🔍 Trace Detail: {trace_data.get('trace_id', 'Unknown')}")

        # Safely extract spans
        spans = trace_data.get('spans', [])
        if hasattr(spans, 'tolist'):
            spans = spans.tolist()
        elif not isinstance(spans, list):
            spans = list(spans) if spans is not None else []

        # Trace metadata
        with gr.Row():
            gr.Markdown(f"""
            **Trace ID:** `{trace_data.get('trace_id', 'N/A')}`
            **Total Spans:** {len(spans)}
            """)

        # Tabs for different visualizations
        with gr.Tabs() as tabs:
            # Tab 1: Thought Graph (star feature!)
            with gr.Tab("🧠 Thought Graph"):
                gr.Markdown("""
                ### Agent Reasoning Flow
                This graph visualizes how your agent thinks, showing the flow of reasoning steps,
                tool calls, and LLM interactions as a network.

                **Node Colors:**
                - 🟣 Purple: LLM reasoning steps
                - 🟠 Orange: Tool calls
                - 🔵 Blue: Chains/Agents
                - 🔴 Red: Errors
                """)

                # Create and display the thought graph
                thought_graph_plot = gr.Plot(
                    value=create_thought_graph(spans, trace_data.get('trace_id', 'Unknown')),
                    label=""
                )

            # Tab 2: Execution Timeline (Waterfall)
            with gr.Tab("⏱️ Execution Timeline"):
                gr.Markdown("""
                ### Waterfall Chart
                Timeline view showing when each span executed and for how long.
                """)

                # Span visualization
                span_viz = gr.Plot(
                    value=create_span_visualization(spans, trace_data.get('trace_id', 'Unknown')),
                    label=""
                )

            # Tab 3: Span Details
            with gr.Tab("📋 Span Details"):
                gr.Markdown("""
                ### Detailed Span Information
                Raw span data with attributes, status, and metadata.
                """)

                # Span details table
                span_table = create_span_table(spans)

        # MCP Q&A tool (below the tabs)
        gr.Markdown("---")
        if mcp_qa_enabled:
            with gr.Accordion("🤖 Ask About This Trace", open=False):
                question_input = gr.Textbox(
                    label="Question",
                    placeholder="e.g., Why was the tool called twice? What tool did the agent use first?",
                    lines=2
                )
                ask_btn = gr.Button("Ask", variant="primary")
                answer_output = gr.Markdown("*Ask a question to get AI-powered insights*")

                # Wire up MCP Q&A (placeholder for now)
                ask_btn.click(
                    fn=lambda q: f"**Answer:** This is a placeholder. MCP integration coming soon.\n\n**Your question:** {q}",
                    inputs=[question_input],
                    outputs=[answer_output]
                )

        # Wire up events
        if on_back:
            back_btn.click(fn=on_back, inputs=[], outputs=[])

    return trace_detail

def process_trace_data(spans: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Process trace spans for the waterfall visualization"""
    # Ensure spans is a list
    if hasattr(spans, 'tolist'):
        spans = spans.tolist()
    elif not isinstance(spans, list):
        spans = list(spans) if spans is not None else []

    if not spans:
        return []

    # Helper to get a timestamp from a span (handles different field names)
    def get_timestamp(span, field_name):
        """Get a timestamp, handling different OpenTelemetry field-name variations"""
        variations = [
            field_name,                                             # e.g., 'startTime'
            field_name.lower(),                                     # e.g., 'starttime'
            field_name.replace('Time', 'TimeUnixNano'),             # e.g., 'startTimeUnixNano'
            field_name[0].lower() + field_name[1:],                 # e.g., 'startTime'
            # snake_case variations
            field_name.replace('Time', '_time').lower(),            # e.g., 'start_time'
            field_name.replace('Time', '_time_unix_nano').lower(),  # e.g., 'start_time_unix_nano'
        ]

        for var in variations:
            if var in span:
                value = span[var]
                # Handle both string and numeric timestamps
                if isinstance(value, str):
                    return int(value)
                return value

        # Not found
        return 0

    # Calculate relative times
    start_times = [get_timestamp(span, 'startTime') for span in spans]
    min_start = min(start_times) if start_times else 0
    max_start = max(start_times) if start_times else 0

    # Check whether we have any actual timing data
    has_timing_data = min_start > 0 or max_start > 0

    # Debug: print the first span's raw timestamps
    if spans:
        first_span = spans[0]
        print("[DEBUG] First span raw data sample:")
        print(f"  startTime field: {first_span.get('startTime', 'NOT FOUND')}")
        print(f"  endTime field: {first_span.get('endTime', 'NOT FOUND')}")
        print(f"  startTimeUnixNano field: {first_span.get('startTimeUnixNano', 'NOT FOUND')}")
        print(f"  endTimeUnixNano field: {first_span.get('endTimeUnixNano', 'NOT FOUND')}")
        print(f"  HAS_TIMING_DATA: {has_timing_data}")
        if 'attributes' in first_span:
            attrs = first_span['attributes']
            print(f"  Sample attributes: {list(attrs.keys())[:5] if isinstance(attrs, dict) else 'N/A'}")
            if isinstance(attrs, dict):
                # Check for cost fields
                cost_fields = [k for k in attrs.keys() if 'cost' in k.lower() or 'price' in k.lower()]
                if cost_fields:
                    print(f"  Cost-related fields found: {cost_fields}")

    # Auto-detect the timestamp unit from its magnitude:
    #   > 1e15 → nanoseconds, > 1e12 → microseconds, > 1e9 → milliseconds, else seconds
    time_divisor = 1000000  # default: assume nanoseconds, convert to milliseconds
    if start_times and min_start > 0:
        if min_start > 1e15:
            time_divisor = 1000000  # nanoseconds to milliseconds
            time_unit = "nanoseconds"
        elif min_start > 1e12:
            time_divisor = 1000     # microseconds to milliseconds
            time_unit = "microseconds"
        elif min_start > 1e9:
            time_divisor = 1        # already in milliseconds
            time_unit = "milliseconds"
        else:
            time_divisor = 0.001    # seconds to milliseconds
            time_unit = "seconds"
        print(f"[DEBUG] Auto-detected timestamp unit: {time_unit} (min_start={min_start}, divisor={time_divisor})")

    processed_spans = []
    for idx, span in enumerate(spans):
        start_time = get_timestamp(span, 'startTime')
        end_time = get_timestamp(span, 'endTime')

        # Relative start within the trace
        relative_start = (start_time - min_start) / time_divisor if has_timing_data else 0

        # Duration: prefer an explicit duration_ms field when available
        if 'duration_ms' in span and span['duration_ms'] is not None:
            actual_duration = float(span['duration_ms'])
        else:
            actual_duration = (end_time - start_time) / time_divisor

        # Debug: print the first few durations
        if idx < 3:
            duration_source = 'duration_ms' if 'duration_ms' in span else 'calculated'
            print(f"[DEBUG] Span {idx}: start={start_time}, end={end_time}, duration={actual_duration:.3f}ms ({duration_source})")

        # Handle span ID variations
        span_id = span.get('spanId') or span.get('span_id') or span.get('spanID') or f'span_{idx}'
        parent_id = span.get('parentSpanId') or span.get('parent_span_id') or span.get('parentSpanID')

        # Span kind: check both the top level and OpenInference attributes
        span_kind = span.get('kind', 'INTERNAL')
        attributes = span.get('attributes', {})

        # Map OpenInference kinds (CHAIN, TOOL, LLM, RETRIEVER, EMBEDDING, AGENT, ...)
        # onto the kind field for consistency
        if isinstance(attributes, dict) and 'openinference.span.kind' in attributes:
            openinference_kind = attributes.get('openinference.span.kind')
            if openinference_kind:
                span_kind = openinference_kind.upper()

        # Extract token and cost information from attributes
        token_info = {}
        cost_info = {}
        if isinstance(attributes, dict):
            # Helper to safely extract numeric values
            def safe_numeric(value):
                """Safely convert to numeric, returning None if invalid"""
                if value is None:
                    return None
                try:
                    if isinstance(value, (int, float)):
                        return value
                    return float(value)
                except (ValueError, TypeError):
                    return None

            # Token usage (various formats)
            prompt_tokens = None
            completion_tokens = None

            if 'gen_ai.usage.prompt_tokens' in attributes:
                prompt_tokens = safe_numeric(attributes['gen_ai.usage.prompt_tokens'])
            if 'gen_ai.usage.completion_tokens' in attributes:
                completion_tokens = safe_numeric(attributes['gen_ai.usage.completion_tokens'])
            if 'llm.token_count.prompt' in attributes and prompt_tokens is None:
                prompt_tokens = safe_numeric(attributes['llm.token_count.prompt'])
            if 'llm.token_count.completion' in attributes and completion_tokens is None:
                completion_tokens = safe_numeric(attributes['llm.token_count.completion'])

            # Store valid token counts
            if prompt_tokens is not None:
                token_info['prompt_tokens'] = int(prompt_tokens)
            if completion_tokens is not None:
                token_info['completion_tokens'] = int(completion_tokens)

            # Calculate total tokens
            if 'prompt_tokens' in token_info and 'completion_tokens' in token_info:
                token_info['total_tokens'] = token_info['prompt_tokens'] + token_info['completion_tokens']
            elif 'llm.usage.total_tokens' in attributes:
                total = safe_numeric(attributes['llm.usage.total_tokens'])
                if total is not None:
                    token_info['total_tokens'] = int(total)

            # Cost information (various formats)
            if 'gen_ai.usage.cost.total' in attributes:
                cost = safe_numeric(attributes['gen_ai.usage.cost.total'])
                if cost is not None:
                    cost_info['total_cost'] = cost
            elif 'llm.usage.cost' in attributes:
                cost = safe_numeric(attributes['llm.usage.cost'])
                if cost is not None:
                    cost_info['total_cost'] = cost

            # Debug: print cost info for LLM spans
            if idx < 2 and span_kind == 'LLM':
|
| 297 |
+
print(f"[DEBUG] LLM Span {idx} cost extraction:")
|
| 298 |
+
print(f" gen_ai.usage.cost.total: {attributes.get('gen_ai.usage.cost.total', 'NOT FOUND')}")
|
| 299 |
+
print(f" llm.usage.cost: {attributes.get('llm.usage.cost', 'NOT FOUND')}")
|
| 300 |
+
print(f" cost_info: {cost_info}")
|
| 301 |
+
|
| 302 |
+
# Store actual duration for tooltip, use minimum for visualization
|
| 303 |
+
display_duration = max(actual_duration, 0.1) # Minimum width for visibility
|
| 304 |
+
|
| 305 |
+
processed_spans.append({
|
| 306 |
+
'span_id': span_id,
|
| 307 |
+
'parent_id': parent_id,
|
| 308 |
+
'name': span.get('name', 'Unknown'),
|
| 309 |
+
'kind': span_kind,
|
| 310 |
+
'start_time': relative_start,
|
| 311 |
+
'duration': display_duration, # For bar width
|
| 312 |
+
'actual_duration': actual_duration, # For tooltip
|
| 313 |
+
'end_time': relative_start + actual_duration, # Use actual for end time
|
| 314 |
+
'attributes': attributes,
|
| 315 |
+
'status': span.get('status', {}).get('code', 'UNKNOWN'),
|
| 316 |
+
'tokens': token_info,
|
| 317 |
+
'cost': cost_info
|
| 318 |
+
})
|
| 319 |
+
|
| 320 |
+
print(f"[DEBUG] Total spans in input: {len(spans)}")
|
| 321 |
+
print(f"[DEBUG] Processed spans: {len(processed_spans)}")
|
| 322 |
+
|
| 323 |
+
# Debug: Show span kinds and statuses detected
|
| 324 |
+
span_kinds = {}
|
| 325 |
+
span_statuses = {}
|
| 326 |
+
durations = []
|
| 327 |
+
spans_with_tokens = 0
|
| 328 |
+
spans_with_cost = 0
|
| 329 |
+
for span in processed_spans:
|
| 330 |
+
kind = span['kind']
|
| 331 |
+
status = span['status']
|
| 332 |
+
span_kinds[kind] = span_kinds.get(kind, 0) + 1
|
| 333 |
+
span_statuses[status] = span_statuses.get(status, 0) + 1
|
| 334 |
+
durations.append(span['actual_duration'])
|
| 335 |
+
if span['tokens']:
|
| 336 |
+
spans_with_tokens += 1
|
| 337 |
+
if span['cost']:
|
| 338 |
+
spans_with_cost += 1
|
| 339 |
+
|
| 340 |
+
print(f"[DEBUG] Span kinds detected: {span_kinds}")
|
| 341 |
+
print(f"[DEBUG] Span statuses detected: {span_statuses}")
|
| 342 |
+
if durations:
|
| 343 |
+
print(f"[DEBUG] Duration range: {min(durations):.3f}ms - {max(durations):.3f}ms")
|
| 344 |
+
print(f"[DEBUG] Spans with token info: {spans_with_tokens}/{len(processed_spans)}")
|
| 345 |
+
print(f"[DEBUG] Spans with cost info: {spans_with_cost}/{len(processed_spans)}")
|
| 346 |
+
|
| 347 |
+
return processed_spans
|
| 348 |
+
|
| 349 |
+
|
| 350 |
+
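The token-extraction fallback order above (prefer the `gen_ai.usage.*` attribute convention, then fall back to `llm.token_count.*`) can be sketched standalone. This is a minimal illustration, not the module's function; the sample attribute values are hypothetical.

```python
# Minimal sketch of the attribute-fallback order used in process_trace_data:
# gen_ai.usage.* keys win, llm.token_count.* keys fill the gaps.
def extract_tokens(attributes: dict) -> dict:
    def safe_numeric(value):
        # Tolerate strings and None, as raw trace attributes may carry either
        try:
            return float(value) if value is not None else None
        except (ValueError, TypeError):
            return None

    prompt = safe_numeric(attributes.get('gen_ai.usage.prompt_tokens'))
    completion = safe_numeric(attributes.get('gen_ai.usage.completion_tokens'))
    if prompt is None:
        prompt = safe_numeric(attributes.get('llm.token_count.prompt'))
    if completion is None:
        completion = safe_numeric(attributes.get('llm.token_count.completion'))

    info = {}
    if prompt is not None:
        info['prompt_tokens'] = int(prompt)
    if completion is not None:
        info['completion_tokens'] = int(completion)
    if 'prompt_tokens' in info and 'completion_tokens' in info:
        info['total_tokens'] = info['prompt_tokens'] + info['completion_tokens']
    return info

# Hypothetical mixed-convention attributes (string and int values)
print(extract_tokens({'llm.token_count.prompt': '120', 'gen_ai.usage.completion_tokens': 34}))
# → {'prompt_tokens': 120, 'completion_tokens': 34, 'total_tokens': 154}
```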
def create_span_visualization(spans: List[Dict[str, Any]], trace_id: str = "Unknown") -> go.Figure:
    """Create an interactive Plotly waterfall visualization of spans"""
    processed_spans = process_trace_data(spans)

    print(f"[DEBUG] create_span_visualization - Received {len(spans)} spans")
    print(f"[DEBUG] create_span_visualization - Processed {len(processed_spans)} spans")

    if not processed_spans:
        # Return an empty figure with a message
        fig = go.Figure()
        fig.add_annotation(
            text="No spans to display",
            xref="paper", yref="paper",
            x=0.5, y=0.5, xanchor='center', yanchor='middle',
            showarrow=False,
            font=dict(size=20)
        )
        return fig

    # Sort spans by start time for better visualization
    processed_spans.sort(key=lambda x: x['start_time'])

    # Create unique labels for each span (include the index to ensure uniqueness)
    for idx, span in enumerate(processed_spans):
        span['display_name'] = f"{span['name']} [{idx}]"

    # Create colors based on span status and kind
    colors = []
    color_map = {}  # Track which colors are assigned to which kinds
    for span in processed_spans:
        status = span['status']
        kind = span['kind']

        # Only show red for actual errors (ERROR status)
        if status == 'ERROR':
            color = '#DC143C'  # Crimson for errors
        else:
            # Color by span kind (supports both OpenTelemetry and OpenInference)
            if kind == 'SERVER':
                color = '#2E8B57'  # Sea Green
            elif kind == 'CLIENT':
                color = '#4169E1'  # Royal Blue
            elif kind == 'LLM':
                color = '#9B59B6'  # Purple for LLM calls
            elif kind == 'TOOL':
                color = '#E67E22'  # Orange for tool calls
            elif kind == 'CHAIN':
                color = '#3498DB'  # Light Blue for chains
            elif kind == 'AGENT':
                color = '#1ABC9C'  # Turquoise for agents
            elif kind == 'RETRIEVER':
                color = '#F39C12'  # Yellow-Orange for retrievers
            elif kind == 'EMBEDDING':
                color = '#8E44AD'  # Dark Purple for embeddings
            else:
                color = '#4682B4'  # Steel Blue for INTERNAL/unknown

        colors.append(color)
        if kind not in color_map:
            color_map[kind] = color

    print(f"[DEBUG] Color assignments: {color_map}")

    # Create the waterfall chart
    fig = go.Figure()

    # Prepare custom data for hover tooltips
    customdata = []
    for span in processed_spans:
        # Build token info string
        token_str = ""
        if span['tokens']:
            tokens = span['tokens']
            if 'total_tokens' in tokens:
                token_str = f"<br>Tokens: {tokens['total_tokens']}"
                if 'prompt_tokens' in tokens and 'completion_tokens' in tokens:
                    token_str += f" (prompt: {tokens['prompt_tokens']}, completion: {tokens['completion_tokens']})"
            elif 'prompt_tokens' in tokens or 'completion_tokens' in tokens:
                parts = []
                if 'prompt_tokens' in tokens:
                    parts.append(f"prompt: {tokens['prompt_tokens']}")
                if 'completion_tokens' in tokens:
                    parts.append(f"completion: {tokens['completion_tokens']}")
                token_str = f"<br>Tokens: {', '.join(parts)}"

        # Build cost info string
        cost_str = ""
        if span['cost'] and 'total_cost' in span['cost']:
            cost_str = f"<br>Cost: ${span['cost']['total_cost']:.6f}"

        customdata.append([
            span['name'],
            span['kind'],
            span['span_id'],
            span['end_time'],
            span['actual_duration'],  # Show actual duration, not display duration
            token_str,
            cost_str
        ])

    # Add bars for each span (use display_name for unique y-axis labels)
    fig.add_trace(go.Bar(
        y=[span['display_name'] for span in processed_spans],
        x=[span['duration'] for span in processed_spans],  # Display duration (min 0.1 ms)
        base=[span['start_time'] for span in processed_spans],
        orientation='h',
        marker_color=colors,
        hovertemplate=(
            "<b>%{customdata[0]}</b><br>" +
            "Type: %{customdata[1]}<br>" +
            "Span ID: %{customdata[2]}<br>" +
            "Duration: %{customdata[4]:.3f} ms<br>" +  # Actual duration, 3 decimal places
            "Start: %{base:.2f} ms<br>" +
            "End: %{customdata[3]:.2f} ms" +
            "%{customdata[5]}" +  # Token info (already formatted)
            "%{customdata[6]}" +  # Cost info (already formatted)
            "<extra></extra>"
        ),
        customdata=customdata,
        name="Spans"
    ))

    # Update layout for better visualization
    fig.update_layout(
        title={
            'text': f"OpenTelemetry Trace: {trace_id}",
            'x': 0.5,
            'xanchor': 'center'
        },
        xaxis_title="Time (milliseconds)",
        yaxis_title="Spans",
        showlegend=False,
        height=400 + len(processed_spans) * 30,  # Dynamic height based on span count
        bargap=0.2,
        hovermode='closest'
    )

    return fig

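Two small layout choices above are easy to check in isolation: y-axis labels are de-duplicated by appending the span index, and figure height grows linearly with span count. A standalone sketch on hypothetical span names:

```python
# Sketch of the label-uniqueness and dynamic-height choices in
# create_span_visualization (span names here are hypothetical).
spans = [{'name': 'llm_call'}, {'name': 'llm_call'}, {'name': 'tool_call'}]

# Duplicate names would collapse onto one horizontal bar row; the index suffix keeps them apart
labels = [f"{s['name']} [{i}]" for i, s in enumerate(spans)]
print(labels)  # → ['llm_call [0]', 'llm_call [1]', 'tool_call [2]']

# 400 px baseline plus 30 px per span keeps rows readable for deep traces
print(400 + len(spans) * 30)  # → 490
```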
def create_span_table(spans: List[Dict[str, Any]]) -> gr.JSON:
    """Create detailed span information display"""

    # Ensure spans is a list
    if hasattr(spans, 'tolist'):
        spans = spans.tolist()
    elif not isinstance(spans, list):
        spans = list(spans) if spans is not None else []

    # Helper to get a timestamp (same field-name handling as process_trace_data)
    def get_timestamp(span, field_name):
        variations = [
            field_name,
            field_name.lower(),
            field_name.replace('Time', 'TimeUnixNano'),
            field_name[0].lower() + field_name[1:],
        ]
        for var in variations:
            if var in span:
                value = span[var]
                if isinstance(value, str):
                    return int(value)
                return value
        return 0

    # Simplify span data for display
    simplified_spans = []
    for span in spans:
        start_time = get_timestamp(span, 'startTime')
        end_time = get_timestamp(span, 'endTime')
        # Timestamps are Unix nanoseconds; convert the difference to milliseconds
        duration_ms = (end_time - start_time) / 1000000 if (end_time and start_time) else 0

        # Handle span ID variations
        span_id = span.get('spanId') or span.get('span_id') or span.get('spanID') or 'N/A'
        parent_id = span.get('parentSpanId') or span.get('parent_span_id') or span.get('parentSpanID') or 'root'

        simplified_spans.append({
            "Span ID": span_id,
            "Parent": parent_id,
            "Name": span.get('name', 'N/A'),
            "Kind": span.get('kind', 'N/A'),
            "Duration (ms)": round(duration_ms, 2),
            "Attributes": span.get('attributes', {}),
            "Status": span.get('status', {}).get('code', 'UNKNOWN')
        })

    return gr.JSON(value=simplified_spans, label="Span Details")

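The `get_timestamp` helper tolerates the field-name spellings that different OTLP exporters emit (`startTime`, `starttime`, `startTimeUnixNano`, and a lowered-first-letter variant) and coerces string-encoded nanosecond values to `int`. A standalone sketch with a hypothetical span dict:

```python
# Sketch of the timestamp field-name normalization used by create_span_table
# and process_trace_data (the sample span is hypothetical).
def get_timestamp(span: dict, field_name: str):
    variations = [
        field_name,                                  # e.g. 'startTime'
        field_name.lower(),                          # 'starttime'
        field_name.replace('Time', 'TimeUnixNano'),  # 'startTimeUnixNano'
        field_name[0].lower() + field_name[1:],      # lowers a leading capital
    ]
    for var in variations:
        if var in span:
            value = span[var]
            # OTLP JSON often encodes uint64 nanoseconds as strings
            return int(value) if isinstance(value, str) else value
    return 0  # sentinel when no variant is present

span = {'startTimeUnixNano': '1700000000000000000'}
print(get_timestamp(span, 'startTime'))  # → 1700000000000000000
print(get_timestamp(span, 'endTime'))    # → 0 (no end-time variant present)
```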
# GPU Metrics Visualization Functions

def extract_metrics_data(metrics_df):
    """
    Extract and prepare GPU metrics data for visualization

    Args:
        metrics_df: DataFrame with flat metrics structure (from the HuggingFace dataset).
            Expected columns: timestamp, gpu_utilization_percent, gpu_memory_used_mib,
            gpu_temperature_celsius, gpu_power_watts, co2_emissions_gco2e

    Returns:
        DataFrame ready for visualization
    """
    if metrics_df is None or metrics_df.empty:
        return pd.DataFrame()

    # Ensure timestamp is datetime, then sort chronologically
    if 'timestamp' in metrics_df.columns:
        if not pd.api.types.is_datetime64_any_dtype(metrics_df['timestamp']):
            metrics_df['timestamp'] = pd.to_datetime(metrics_df['timestamp'])
        metrics_df = metrics_df.sort_values('timestamp')

    return metrics_df

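The coerce-then-sort step above can be seen in miniature: string timestamps become `datetime64` and rows come back in chronological order. The sample readings below are hypothetical:

```python
# Sketch of extract_metrics_data's timestamp handling on hypothetical data.
import pandas as pd

df = pd.DataFrame({
    'timestamp': ['2024-01-01T00:00:02', '2024-01-01T00:00:00', '2024-01-01T00:00:01'],
    'gpu_utilization_percent': [85.0, 10.0, 40.0],
})

# Coerce ISO-8601 strings to datetime64, then sort chronologically
if not pd.api.types.is_datetime64_any_dtype(df['timestamp']):
    df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.sort_values('timestamp').reset_index(drop=True)

print(df['gpu_utilization_percent'].tolist())  # → [10.0, 40.0, 85.0]
```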
def create_gpu_summary_cards(df):
    """
    Create summary cards for GPU metrics

    Args:
        df: DataFrame with flat metrics structure (columns: gpu_utilization_percent, etc.)

    Returns:
        HTML string with summary cards
    """
    if df is None or df.empty:
        return "<div style='padding: 20px; text-align: center;'>⚠️ No GPU metrics available (expected for API models)</div>"

    # Get the latest row (assumes df is sorted by timestamp)
    latest = df.iloc[-1]

    # Extract values (with safe fallbacks)
    utilization = latest.get('gpu_utilization_percent', 0)
    memory_used = latest.get('gpu_memory_used_mib', 0)
    temperature = latest.get('gpu_temperature_celsius', 0)
    co2_emissions = latest.get('co2_emissions_gco2e', 0)
    power = latest.get('gpu_power_watts', 0)

    # Also get total memory, if available, to compute a usage percentage
    memory_total = latest.get('gpu_memory_total_mib', 0)
    memory_percent = (memory_used / memory_total * 100) if memory_total > 0 else 0

    cards_html = f"""
    <div style="display: grid; grid-template-columns: repeat(4, 1fr); gap: 15px; margin: 20px 0;">
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); padding: 20px; border-radius: 10px; color: white; text-align: center;">
            <h3 style="margin: 0 0 10px 0; font-size: 1em;">GPU Utilization</h3>
            <h2 style="margin: 0; font-size: 2em;">{utilization:.1f}%</h2>
        </div>
        <div style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); padding: 20px; border-radius: 10px; color: white; text-align: center;">
            <h3 style="margin: 0 0 10px 0; font-size: 1em;">GPU Memory</h3>
            <h2 style="margin: 0; font-size: 2em;">{memory_used:.0f} MiB</h2>
            <p style="margin: 5px 0 0 0; font-size: 0.8em; opacity: 0.9;">{memory_percent:.1f}% of {memory_total:.0f} MiB</p>
        </div>
        <div style="background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); padding: 20px; border-radius: 10px; color: white; text-align: center;">
            <h3 style="margin: 0 0 10px 0; font-size: 1em;">GPU Temperature</h3>
            <h2 style="margin: 0; font-size: 2em;">{temperature:.0f}°C</h2>
        </div>
        <div style="background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); padding: 20px; border-radius: 10px; color: white; text-align: center;">
            <h3 style="margin: 0 0 10px 0; font-size: 1em;">CO2 Emissions</h3>
            <h2 style="margin: 0; font-size: 2em;">{co2_emissions:.4f} g</h2>
            <p style="margin: 5px 0 0 0; font-size: 0.8em; opacity: 0.9;">Power: {power:.1f} W</p>
        </div>
    </div>
    """

    return cards_html

def create_gpu_metrics_dashboard(metrics_df):
    """
    Create a combined dashboard with GPU metric charts

    Args:
        metrics_df: DataFrame with flat metrics structure (from the HuggingFace dataset)

    Returns:
        Plotly figure with GPU metrics time series
    """
    if metrics_df is None or metrics_df.empty:
        # Return an empty figure with a message
        fig = go.Figure()
        fig.add_annotation(
            text="No GPU metrics available (expected for API models)",
            xref="paper", yref="paper",
            x=0.5, y=0.5, xanchor='center', yanchor='middle',
            showarrow=False,
            font=dict(size=16)
        )
        return fig

    # Prepare data
    df = extract_metrics_data(metrics_df)

    if df.empty:
        return None

    # Create subplots for GPU metrics:
    # utilization, memory, temperature, power, CO2 emissions, and power cost
    fig = make_subplots(
        rows=3, cols=2,
        subplot_titles=[
            'GPU Utilization (%)',
            'GPU Memory (MiB)',
            'GPU Temperature (°C)',
            'GPU Power (W)',
            'CO2 Emissions (g)',
            'Power Cost (USD)'
        ],
        vertical_spacing=0.10,
        horizontal_spacing=0.12,
        specs=[[{}, {}], [{}, {}], [{}, {}]]
    )

    colors = ['#667eea', '#f093fb', '#4facfe', '#FFE66D', '#43e97b', '#FF6B6B']

    # Define metrics to plot: (column name, title, subplot row, subplot col, color)
    metrics_config = [
        ('gpu_utilization_percent', 'GPU Utilization (%)', 1, 1, colors[0]),
        ('gpu_memory_used_mib', 'GPU Memory (MiB)', 1, 2, colors[1]),
        ('gpu_temperature_celsius', 'GPU Temperature (°C)', 2, 1, colors[2]),
        ('gpu_power_watts', 'GPU Power (W)', 2, 2, colors[3]),
        ('co2_emissions_gco2e', 'CO2 Emissions (g)', 3, 1, colors[4]),
        ('power_cost_usd', 'Power Cost (USD)', 3, 2, colors[5]),
    ]

    for col_name, title, row, col, color in metrics_config:
        if col_name in df.columns:
            fig.add_trace(
                go.Scatter(
                    x=df['timestamp'],
                    y=df[col_name],
                    mode='lines+markers',
                    name=title,
                    line=dict(color=color, width=3),
                    marker=dict(size=6, color=color),
                    hovertemplate=(
                        f"<b>{title}</b><br>" +
                        "Time: %{x}<br>" +
                        "Value: %{y:.2f}<br>" +
                        "<extra></extra>"
                    )
                ),
                row=row, col=col
            )

    # Add total memory as a dashed reference line if available
    if 'gpu_memory_total_mib' in df.columns:
        total_memory = df['gpu_memory_total_mib'].iloc[0]
        fig.add_hline(
            y=total_memory,
            line_dash="dash",
            line_color="gray",
            annotation_text=f"Total: {total_memory:.0f} MiB",
            annotation_position="right",
            row=1, col=2
        )

    fig.update_layout(
        title_text="GPU Metrics Over Time",
        height=900,
        template="plotly_white",
        showlegend=False,
        hovermode='x unified'
    )

    # Update x-axes to show time format
    fig.update_xaxes(tickformat='%H:%M:%S')

    return fig