In this walkthrough, we’ll play out a realistic conversation between a user and the Quantum Network Monitor Assistant to retrieve live and historical data, visualize short-term trends, and troubleshoot an incident. Along the way, you’ll see the exact function calls the assistant makes and how their results drive the next step. You can try these flows yourself using the Quantum Network Monitor Assistant at https://readyforquantum.com/?assistant=open.
Step 1: Get the lay of the land (which hosts are being monitored?)
User: I need a quick inventory of my monitored endpoints. What do I have under watch?
Assistant: I’ll list your monitored hosts so we know which assets to query. I’ll include configuration detail so we can see endpoint types and locations.
We use get_host_list to retrieve the monitored hosts, paginating through the results if the inventory is large.
Function call
{
  "name": "functions.get_host_list",
  "arguments": {
    "detail_response": true,
    "page_number": 1,
    "page_size": 10
  }
}
How we interpret the response
- The function returns a list of hosts with fields like id, address, endpoint, enabled, port, and agent_location.
- Example (summarized):
- id: 42, address: api.acme.internal, endpoint: https, port: 443, agent_location: London - UK
- id: 57, address: db.acme.internal, endpoint: icmp, agent_location: London - UK
- We’ll use these IDs and addresses in subsequent data queries.
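If you script this step outside the chat, the pagination pattern looks like the sketch below. Here `call_function` is a hypothetical stand-in for whatever transport delivers the assistant's function calls; only the get_host_list arguments mirror the call shown above.

```python
def list_all_hosts(call_function, page_size=10):
    """Page through get_host_list until an empty page comes back.

    `call_function(name, arguments)` is an assumed transport, not part of
    the documented API; the argument names mirror the call above.
    """
    hosts, page = [], 1
    while True:
        batch = call_function("get_host_list", {
            "detail_response": True,
            "page_number": page,
            "page_size": page_size,
        })
        if not batch:  # empty page means we've seen everything
            break
        hosts.extend(batch)
        page += 1
    return hosts
```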
Step 2: Fetch live status (latest data snapshot)
User: Show me the current status for api.acme.internal.
Assistant: I’ll pull the latest dataset (dataset_id = 0). I’ll request detail so we get response time stats, status, and any alert flags.
We use get_host_data with dataset_id set to 0 to fetch the latest reading for this host.
Function call
{
  "name": "functions.get_host_data",
  "arguments": {
    "detail_response": true,
    "dataset_id": 0,
    "address": "api.acme.internal"
  }
}
How we interpret the response
- Typical fields include status (up/down), response_time_ms, uptime, last_checked, alert_flag, and for HTTPS endpoints possibly ssl_days_remaining.
- Example (summarized):
- status: up
- response_time_ms: 230
- p50_ms: 210, p95_ms: 410, p99_ms: 520 (from recent rolling window)
- alert_flag: false
- last_checked: 2026-03-29T02:59:50Z
- Conclusion: The API is up, with moderate latency and no active alert.
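The same snapshot fields can drive a simple automated verdict. A minimal Python sketch, assuming the illustrative field names from the example above (status, alert_flag, p95_ms) rather than a guaranteed schema:

```python
def summarize_snapshot(sample, latency_warn_ms=400):
    """Turn one latest-data sample (dataset_id = 0) into a one-line verdict.

    Field names follow the summarized example in the walkthrough and are
    assumptions, not a documented schema.
    """
    if sample.get("status") != "up":
        return "DOWN - investigate immediately"
    if sample.get("alert_flag"):
        return "UP but alerting - check thresholds"
    if sample.get("p95_ms", 0) > latency_warn_ms:
        return "UP with elevated latency"
    return "UP and healthy"
```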
Step 3: Pull historical data for trend analysis
User: Great. Can you show me the last 24 hours so I can spot trends?
Assistant: I’ll request a 24-hour range ending now. With detailed data, I can compute rolling stats (min/avg/p95) and highlight any spikes.
We use get_host_data with a date range (omit dataset_id) to fetch historic measurements.
Function call
{
  "name": "functions.get_host_data",
  "arguments": {
    "detail_response": true,
    "address": "api.acme.internal",
    "date_start": "2026-03-28T03:00:00",
    "date_end": "2026-03-29T03:00:00"
  }
}
How we interpret the response
- The function returns time-series points with timestamps and latency values, plus summarized stats if available.
- We calculate:
- Overall min/avg/max latency for the period.
- p95 latency for peak characterization.
- Hourly breakdowns to see time-of-day patterns.
- Example insight:
- 24h average: 195 ms; p95: 420 ms.
- Notable spike 09:05–09:20 UTC with several samples >600 ms.
Assistant visualization: I can render a quick textual trend summary and optionally output CSV-style rows. For a fast glance, I’ll provide hourly aggregates and mark the spikes; paste the CSV into a dashboard or spreadsheet to get the full chart.
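The hourly aggregation described above takes only a few lines of Python. The nearest-rank p95 and the (timestamp, latency_ms) point shape are assumptions based on the summarized example, not a guaranteed response format:

```python
from datetime import datetime

def percentile(values, pct):
    """Nearest-rank percentile; good enough for monitoring summaries."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

def hourly_p95(points):
    """Bucket (ISO-8601 timestamp, latency_ms) points by hour, p95 each.

    The point shape is assumed from the summarized example above.
    """
    buckets = {}
    for ts, latency in points:
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(hour, []).append(latency)
    return {h: percentile(v, 95) for h, v in sorted(buckets.items())}
```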
Step 4: Correlate two services side-by-side (parallel fetch)
User: That spike might be tied to the database. Compare the API and db.acme.internal over the same 24 hours.
Assistant: I’ll fetch both datasets in parallel for a clean comparison.
We use multi_tool_use.parallel to run two get_host_data calls at once.
Function call
{
  "name": "multi_tool_use.parallel",
  "arguments": {
    "tool_uses": [
      {
        "recipient_name": "functions.get_host_data",
        "parameters": {
          "detail_response": true,
          "address": "api.acme.internal",
          "date_start": "2026-03-28T03:00:00",
          "date_end": "2026-03-29T03:00:00"
        }
      },
      {
        "recipient_name": "functions.get_host_data",
        "parameters": {
          "detail_response": true,
          "address": "db.acme.internal",
          "date_start": "2026-03-28T03:00:00",
          "date_end": "2026-03-29T03:00:00"
        }
      }
    ]
  }
}
How we interpret the response
- We align the two time series by timestamp and compare:
- Did DB ICMP latency spike at the same time as API HTTPS latency?
- Is there a lead/lag (e.g., DB spike precedes API spike by a few minutes)?
- Example insight:
- DB ICMP RTT median: 4 ms; p95: 12 ms across most hours.
- 09:06–09:15 UTC: DB p95 jumps to 120 ms; API p95 jumps to 650 ms with a 2–3 minute lag.
- Conclusion: The API latency spike correlates strongly with DB network degradation, suggesting a shared network issue or DB host resource contention.
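The alignment and lead/lag check can be done with plain Python: join the two series on shared timestamps, then slide one against the other and keep the lag with the strongest correlation. This is a sketch, assuming each series arrives as a {timestamp: value} mapping:

```python
def align(series_a, series_b):
    """Join two {timestamp: value} dicts on their shared timestamps."""
    shared = sorted(set(series_a) & set(series_b))
    return [series_a[t] for t in shared], [series_b[t] for t in shared]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def best_lag(xs, ys, max_lag=5):
    """Shift ys against xs by up to max_lag samples in each direction.

    Returns (lag, correlation); a positive lag means ys leads xs,
    e.g. a DB spike preceding the API spike by `lag` samples.
    """
    best = (0, pearson(xs, ys))
    for lag in range(1, max_lag + 1):
        for sign in (1, -1):
            if sign == 1:
                a, b = xs[lag:], ys[:-lag]
            else:
                a, b = xs[:-lag], ys[lag:]
            if len(a) > 1:
                r = pearson(a, b)
                if abs(r) > abs(best[1]):
                    best = (sign * lag, r)
    return best
```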
Step 5: Zero in on active incidents (alerts)
User: Are there any active alerts right now that I should triage?
Assistant: I’ll query for hosts with alert_flag set. This highlights assets currently in an alert state.
We use get_host_data filtering by alert_flag to find active issues.
Function call
{
  "name": "functions.get_host_data",
  "arguments": {
    "detail_response": true,
    "alert_flag": true,
    "dataset_id": 0
  }
}
How we interpret the response
- The function returns the latest data points for hosts that are currently alerting.
- Example (summarized):
- host: cache.acme.internal (icmp), alert_flag: true, status: down, last_checked: 2026-03-29T02:58:30Z
- host: api.acme.internal (https), alert_flag: true, status: up, reason: latency_threshold_exceeded
- Next step: drill into one host’s recent history (past 2 hours) for root-cause clues.
Deep-dive a single alert for the past 2 hours
User: Pull the last two hours for cache.acme.internal.
Assistant: I’ll request a narrower range to spot when it went down and if it flapped.
Function call
{
  "name": "functions.get_host_data",
  "arguments": {
    "detail_response": true,
    "address": "cache.acme.internal",
    "date_start": "2026-03-29T01:00:00",
    "date_end": "2026-03-29T03:00:00"
  }
}
How we interpret the response
- We look for status transitions (up→down), packet loss trends, and last success timestamp.
- Example insight:
- Host transitioned down at 02:41:13 UTC with 100% packet loss since then.
- Prior 30 minutes showed rising packet loss (5%→18%→42%), indicating degrading link quality before outage.
- Action: forward evidence to network ops, or check the hosting node if you manage it.
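Spotting transitions and flapping in the returned samples is straightforward. A minimal sketch, assuming each sample reduces to a (timestamp, status) pair:

```python
def find_transitions(samples):
    """Detect status transitions (e.g. up->down) in time-ordered samples.

    Each sample is an assumed (timestamp, status) pair distilled from
    the response fields described above.
    """
    transitions, prev = [], None
    for ts, status in samples:
        if prev is not None and status != prev:
            transitions.append((ts, f"{prev}->{status}"))
        prev = status
    return transitions

def is_flapping(samples, threshold=3):
    """More than `threshold` transitions in one window suggests flapping."""
    return len(find_transitions(samples)) > threshold
```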
Step 6: Acknowledge and clear alerts after remediation
User: We’ve fixed the cache server. Please clear the alert so I can track fresh incidents.
Assistant: I’ll reset the monitor alerts for the cache host so new events are obvious.
We use reset_alerts to clear the alert flag for the specific host. If you prefer, you can reset all alerts for your account at once.
Function call
{
  "name": "functions.reset_alerts",
  "arguments": {
    "address": "cache.acme.internal"
  }
}
How we interpret the response
- The function confirms the alert reset succeeded.
- After a couple of fresh checks, the host should show alert_flag: false if the issue is resolved.
- If predictive alerts are also used and you want to clear them, set alert_type to "predict".
Step 7: Visualizing trends quickly (from retrieved data)
User: Can you give me a quick visualization for the API’s last 24 hours and mark the spike?
Assistant: From the historical dataset we retrieved earlier, I’ll compute hourly aggregates and highlight outliers. I can:
- Provide a compact textual trend summary with hourly p95 values and call out hours exceeding thresholds.
- Supply CSV-style rows (timestamp, latency_ms) so you can drop it into your BI tool.
- If you want a rapid inline sketch, I can render a simple ASCII sparkline of p95 latency across hours and annotate peak intervals.
Example trend summary (derived from the 24h data we fetched)
- Hours with elevated p95 (>400 ms): 09:00–10:00 UTC, 14:00–14:15 UTC.
- Biggest spike window: 09:05–09:20 UTC (p95 ~650 ms).
- Baseline outside spikes: p95 ~280–320 ms.
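For the inline-sketch option, a Unicode sparkline takes only a few lines of Python; feed it the hourly p95 values from the trend summary:

```python
def sparkline(values):
    """Render a list of numbers as a Unicode sparkline, one glyph each."""
    blocks = "▁▂▃▄▅▆▇█"
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid divide-by-zero on flat series
    return "".join(
        blocks[int((v - lo) / span * (len(blocks) - 1))] for v in values
    )
```

For example, `sparkline([280, 300, 650, 620, 310, 290])` renders the spike window as the tallest glyphs in the run.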
Putting it all together: A simple operational loop you can reuse
- Inventory monitored assets:
- Call get_host_list with detail_response = true.
- Live snapshots for quick health checks:
- Call get_host_data with dataset_id = 0 and address for the target host.
- Historical trend analysis:
- Call get_host_data with date_start/date_end for 2h, 24h, 7d windows as needed.
- Correlation across services:
- Use multi_tool_use.parallel to fetch multiple hosts over the same window; compare aligned time series.
- Incident triage:
- Call get_host_data with alert_flag = true (and dataset_id = 0) to surface active issues.
- Drill into a host’s recent window for root cause analysis.
- Post-remediation hygiene:
- Call reset_alerts for the specific host (or all) to clear flags and observe clean state.
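The whole loop condenses into a short script. As before, `call_function` is a hypothetical transport for the assistant's functions; only the argument shapes mirror the calls used throughout this walkthrough.

```python
def triage_pass(call_function):
    """One pass of the operational loop: inventory, then active alerts.

    `call_function(name, arguments)` is an assumed transport; the
    argument dictionaries mirror the function calls shown above.
    """
    hosts = call_function("get_host_list", {
        "detail_response": True, "page_number": 1, "page_size": 10,
    })
    alerting = call_function("get_host_data", {
        "detail_response": True, "alert_flag": True, "dataset_id": 0,
    })
    # Reduce each alerting host to an (address, status) pair for triage.
    report = [(item["address"], item.get("status")) for item in alerting]
    return {"monitored": len(hosts), "alerting": report}
```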
Function cheat sheet we used
- get_host_list: Lists your monitored hosts and configurations.
- get_host_data: Retrieves live (dataset_id = 0) or historic (date range) monitoring data, optionally with detailed stats.
- multi_tool_use.parallel: Executes multiple tool calls at once (great for side-by-side comparisons).
- reset_alerts: Clears monitor or predictive alert flags after you’ve acknowledged or fixed an issue.
You can run these exact flows in your own environment with the Quantum Network Monitor Assistant at https://readyforquantum.com/?assistant=open. It’s the fastest way to fetch live/historic data, visualize performance trends, and troubleshoot incidents—right from a chat.