Getting Started with Metrics API

Authors:

Kirill Gaiduk, Girish Padmanabha

Changed on:

19 June 2024

Overview

This guide gives implementers an introduction to the Metrics API within the Fluent Big Inventory product, describing what it does and how to use it within the Fluent Platform.

Pre-requisites:

You should have knowledge of How Metrics works.
You should have knowledge of Metrics usage for Platform Observability.
You should have knowledge of GraphQL.
You should configure your User role with `METRICS_VIEW` permission for Metrics data access.

Key points

Metrics data is fetched and visualized to enable Fluent Platform Observability.
Use the `metricInstant` and `metricRange` GraphQL queries to retrieve Metrics data.
Metrics API queries are written in the Prometheus Query Language (PromQL).

What is Metrics API for?

The Metrics API is a part of the API that acts as a proxy between the Fluent Platform and the Metrics workspace. It provides the Metrics data needed for visualization with the Fluent OMX UX framework, including:

Charts
Diagrams
Tables
Dashboards

Queries

The following Metrics API query operations are currently available to retrieve customer Metrics data:

`metricInstant`

Designed to execute an instant query at a specific time, returning a single data point for the specified Metric at that exact time.

`metricRange`

Executes a range query over a predefined duration, outputting a series of data points for the Metric from start to end at intervals determined by the step value.
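As an illustration of the two operations, the following sketch builds GraphQL request bodies for `metricInstant` and `metricRange`. The field selections follow the Metrics API Schema in this guide; the operation names, the sample PromQL string, the timestamps, and the `5m` step format are assumptions made for the example, and no request is sent because the endpoint and authentication depend on your account.

```python
import json

# Selection sets follow the MetricInstant / MetricRange types in the schema.
METRIC_INSTANT_QUERY = """
query MetricInstantExample($query: String!, $time: DateTime) {
  metricInstant(query: $query, time: $time) {
    status
    errorType
    error
    warnings
    data { resultType result { metric value } }
  }
}
"""

METRIC_RANGE_QUERY = """
query MetricRangeExample($query: String!, $start: DateTime, $end: DateTime, $step: String) {
  metricRange(query: $query, start: $start, end: $end, step: $step) {
    status
    errorType
    error
    warnings
    data { resultType result { metric values } }
  }
}
"""


def build_request_body(document, variables):
    """Return the JSON body of a GraphQL POST request."""
    # Drop unset optional variables so the server applies its own defaults.
    cleaned = {name: value for name, value in variables.items() if value is not None}
    return json.dumps({"query": document, "variables": cleaned})


# Hypothetical PromQL string, used only for illustration.
promql = "sum(rubix_event_runtime_seconds_count)"

instant_body = build_request_body(METRIC_INSTANT_QUERY, {"query": promql, "time": None})
range_body = build_request_body(
    METRIC_RANGE_QUERY,
    {
        "query": promql,
        "start": "2024-06-01T00:00:00Z",  # assumed RFC 3339 format
        "end": "2024-06-01T06:00:00Z",
        "step": "5m",  # assumed duration format
    },
)
print(instant_body)
print(range_body)
```

Leaving `time`, `start`, `end`, or `step` out of the variables lets the defaults listed under Query Parameters apply.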

Query Selection Principles

Use `metricInstant` when:
- You want the value of a Metric at a precise historical moment.
- You need to retrieve exact historical values.
- You are comparing values between two distinct points in time.

Use `metricRange` when:
- You want to visualize a Metric trend over a time span.
- You are analyzing how a Metric evolves over a duration.
- You are generating graphs or charts that need data at regular intervals.

Query Parameters

Available Metrics API queries are configurable with the following parameters:

| Query Name | Parameter | Mandatory/Optional | Type | Format | Default value (when not specified) | Description |
| --- | --- | --- | --- | --- | --- | --- |
| `metricInstant` | `query` | Mandatory | `String` | Prometheus Query Language (PromQL) | | The query string for a Metric. See the details in the Query Syntax section below. |
| `metricInstant` | `time` | Optional | `DateTime` | RFC 3339 (`"2023-07-06T19:20:30Z"`) or UNIX timestamp (`1696473068`) | Current server time in UTC | The timestamp for a Metric query evaluation. |
| `metricRange` | `query` | Mandatory | `String` | PromQL | | The query string for a Metric. See the details in the Query Syntax section below. |
| `metricRange` | `start` | Optional | `DateTime` | | 30 minutes before the current server time in UTC | Start of fetch range. |
| `metricRange` | `end` | Optional | `DateTime` | | Current server time in UTC | End of fetch range. |
| `metricRange` | `step` | Optional | `String` | | 1 minute | Retrieval interval within the range. |
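The `time` parameter accepts either an RFC 3339 string or a UNIX timestamp. A minimal sketch of producing both forms from one Python `datetime` is shown here; the helper names are illustrative and not part of the Metrics API.

```python
from datetime import datetime, timezone


def to_rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_unix_timestamp(moment):
    """Return whole seconds since the UNIX epoch."""
    return int(moment.timestamp())


# Use a timezone-aware value; naive datetimes are interpreted as local time.
moment = datetime(2023, 7, 6, 19, 20, 30, tzinfo=timezone.utc)
print(to_rfc3339(moment))         # 2023-07-06T19:20:30Z
print(to_unix_timestamp(moment))  # 1688671230
```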

Info

The maximum query time range is 32 days, measured between the `start` time and the `end` time of a PromQL query.
The `step` parameter defines the interval at which the PromQL operators or functions are computed.
For example, a function queried with a `step` of 10 minutes is computed at every 10-minute interval within the start-end range.

In essence, the combination of `start`, `end`, and `step` helps you to shape the granularity and duration of your data retrieval. Depending on your specific monitoring requirements, you can use them to get a broad overview or a detailed insight into the Metric behavior over time.
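To make the interaction of `start`, `end`, and `step` concrete, the sketch below lists the evaluation timestamps for a range and rejects ranges longer than 32 days. It assumes both endpoints are included, as in Prometheus range queries; the function name and the client-side validation are illustrative and not part of the Metrics API.

```python
from datetime import datetime, timedelta, timezone

MAX_RANGE = timedelta(days=32)  # maximum query time range stated in this guide


def evaluation_timestamps(start, end, step):
    """List the timestamps a range query is evaluated at (endpoints assumed inclusive)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    if end < start:
        raise ValueError("end must not be earlier than start")
    if end - start > MAX_RANGE:
        raise ValueError("range exceeds the 32-day maximum")
    points = []
    current = start
    while current <= end:
        points.append(current)
        current += step
    return points


# The documented defaults: the last 30 minutes with a 1-minute step.
end = datetime(2024, 6, 19, 12, 0, tzinfo=timezone.utc)
start = end - timedelta(minutes=30)
points = evaluation_timestamps(start, end, timedelta(minutes=1))
print(len(points))  # 31
```

A larger `step` over the same range yields fewer, coarser data points; a smaller `step` yields more detail at the cost of a larger response.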

Query Syntax

The Metrics API uses the Prometheus Query Language (PromQL), a functional query language designed for real-time selection and aggregation of time series Metrics data.

Valid PromQL Syntax should be inserted into the `metricInstant` and `metricRange` Metrics API queries as the `query` parameter value.
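PromQL label matchers contain double quotes, which must be escaped when the query is embedded in a GraphQL string. One way to avoid escaping by hand, sketched below, is to pass the PromQL text as a GraphQL variable and let the JSON encoder handle the quotes. The label name `source` is an assumption made for the example.

```python
import json

# PromQL with a quoted label value; the label name is assumed for illustration.
promql = 'sum(rubix_event_runtime_seconds_count{source!="internal"})'

document = "query ($q: String!) { metricInstant(query: $q) { status } }"

# json.dumps escapes the inner quotes, so the PromQL text stays unmodified.
body = json.dumps({"query": document, "variables": {"q": promql}})
print(body)

# Decoding the body returns the original PromQL string.
print(json.loads(body)["variables"]["q"] == promql)  # True
```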

PromQL Querying Basic Guidelines

Basic querying guidelines can be found at Querying basics.
A detailed list of operators is available at Operators.
Comprehensive functions used in queries are explained at Query functions.
For practical query examples, refer to Querying examples.

Find a sample Metrics API query explanation through an example in the Reference Query section below.

Query Response

The Metrics API query Response can be displayed in the UI with OMX UX framework components.

Note

For more details, check the UX Configure Overview.

Metrics API Schema

The Metrics workspace server returns the query result, which is mapped to the Metrics API query Response according to the following schema.

scalar Json

type Query {
    # An instant query at a single point in time
    metricInstant(query: String!, time: DateTime): MetricInstant
    # An expression query over a range of time
    metricRange(query: String!, start: DateTime, end: DateTime, step: String): MetricRange
}

type MetricInstant {
    status: String!
    data: MetricInstantData
    errorType: String
    error: String
    warnings: [String]
}

type MetricInstantData {
    resultType: String
    result: [VectorInstant]
}

type VectorInstant {
    metric: Json
    value: [Json]
}

type MetricRange {
    status: String!
    data: MetricRangeData
    errorType: String
    error: String
    warnings: [String]
}

type MetricRangeData {
    resultType: String
    result: [VectorRange]
}

type VectorRange {
    metric: Json
    values: [[Json]]
}

Response Data Structure

Metrics API query Response is provided to the user with the following Data Structure (based on the Metrics API schema):

| Structure Name | Field | Type | Mandatory/Optional | Description |
| --- | --- | --- | --- | --- |
| `MetricInstant` | status | String | Mandatory | Query status (e.g., "success", "error"). |
| `MetricInstant` | data | MetricInstantData | Optional | Query result. |
| `MetricInstant` | errorType | String | Optional | Error type. |
| `MetricInstant` | error | String | Optional | Detailed error. |
| `MetricInstant` | warnings | Array of String | Optional | Warning messages. |
| `MetricInstantData` | resultType | String | Optional | Usually "vector". |
| `MetricInstantData` | result | Array of VectorInstant | Optional | Metric data. |
| `VectorInstant` | metric | Json | Optional | Metric labels/values. |
| `VectorInstant` | value | Array of Json | Optional | Timestamp and value. |
| `MetricRange` | status | String | Mandatory | Range query status. |
| `MetricRange` | data | MetricRangeData | Optional | Query result. |
| `MetricRange` | errorType | String | Optional | Error type. |
| `MetricRange` | error | String | Optional | Detailed error. |
| `MetricRange` | warnings | Array of String | Optional | Warning messages. |
| `MetricRangeData` | resultType | String | Optional | Result type. |
| `MetricRangeData` | result | Array of VectorRange | Optional | Data points within range. |
| `VectorRange` | metric | Json | Optional | Metric labels/values. |
| `VectorRange` | values | Array of Array of Json | Optional | Timestamps and values. |

Response Value Field Array

The first element of the array is the UNIX timestamp of the data point.
The second element is the Metric value at that timestamp.

Info

For Instant Metrics, the `value` field contains a single data point for the specified Metric at the requested time.
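The following sketch shows one way to turn the `value` and `values` arrays into typed Python pairs. The sample responses are hypothetical, and the assumption that the Metric value may arrive as a string (as in the Prometheus HTTP API) is handled by converting with `float()`.

```python
from datetime import datetime, timezone


def parse_sample(pair):
    """Convert a [timestamp, value] array into (datetime, float)."""
    timestamp, raw_value = pair
    moment = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
    return moment, float(raw_value)


def parse_instant(response):
    """Return (labels, sample) for each VectorInstant in the result."""
    data = response.get("data") or {}
    return [(item.get("metric") or {}, parse_sample(item["value"]))
            for item in data.get("result") or []]


def parse_range(response):
    """Return (labels, [samples]) for each VectorRange in the result."""
    data = response.get("data") or {}
    return [(item.get("metric") or {}, [parse_sample(p) for p in item["values"]])
            for item in data.get("result") or []]


# Hypothetical responses shaped like the schema in this guide.
instant_response = {
    "status": "success",
    "data": {"resultType": "vector",
             "result": [{"metric": {"source": "batch"}, "value": [1696473068, "42"]}]},
}
range_response = {
    "status": "success",
    "data": {"resultType": "matrix",
             "result": [{"metric": {}, "values": [[1696473000, "40"], [1696473060, "42"]]}]},
}

print(parse_instant(instant_response))
print(parse_range(range_response))
```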

No Precise Data Match Handling

For a `metricInstant` query targeting a specific time without an exact data match:
- Stale Data Management: The platform identifies data as "stale" if no new data point emerges within the default span (5 minutes). Stale data is excluded from outputs.
- Data Interpolation: The platform will automatically fetch data from the latest noted point preceding the queried timestamp.

Explanation through an Example

Metrics data points are recorded at 10:01 and 10:03.
The Metrics data is queried for 10:02.
The 10:01 value is returned.
However, a query for a later time returns no value when no data point falls within the staleness window before it.
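The lookup described above can be simulated locally. This is a sketch of the behavior as described in this guide, not the platform's implementation; the sample values and the exact handling of the window boundary are assumptions.

```python
from datetime import datetime, timedelta

STALENESS_WINDOW = timedelta(minutes=5)  # default span stated in this guide


def value_at(samples, query_time, window=STALENESS_WINDOW):
    """Return the latest value at or before query_time within the window, else None."""
    candidates = [(t, v) for t, v in samples if query_time - window < t <= query_time]
    if not candidates:
        return None  # every earlier data point is stale
    return max(candidates, key=lambda sample: sample[0])[1]


# Data points recorded at 10:01 and 10:03 (the values are made up).
samples = [
    (datetime(2024, 6, 19, 10, 1), 7.0),
    (datetime(2024, 6, 19, 10, 3), 9.0),
]

print(value_at(samples, datetime(2024, 6, 19, 10, 2)))   # 7.0
print(value_at(samples, datetime(2024, 6, 19, 10, 30)))  # None
```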

For a `metricRange` query from start to finish with a defined step:
- Continuous Data Interpolation: When a distinct timestamp within the range lacks data, the platform defaults to the closest prior data point for that timestamp. This guarantees continuous data points in the range query, even if some values are interpolated.
- Range Data Structure: The `values` field in `VectorRange` holds an array of data points, reflecting the sequence of values throughout the requested time range.
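A range query can be pictured as the same lookup repeated at every step. The sketch below fills each step with the closest prior data point; applying the 5-minute staleness window to range queries as well is an assumption carried over from the instant-query behavior, and the sample values are made up.

```python
from datetime import datetime, timedelta


def fill_range(samples, start, end, step, window=timedelta(minutes=5)):
    """Return (timestamp, value) per step, using the closest prior data point."""
    filled = []
    current = start
    while current <= end:
        prior = [(t, v) for t, v in samples if current - window < t <= current]
        if prior:
            # The most recent data point at or before this step wins.
            filled.append((current, max(prior, key=lambda sample: sample[0])[1]))
        current += step
    return filled


samples = [
    (datetime(2024, 6, 19, 10, 0), 1.0),
    (datetime(2024, 6, 19, 10, 2), 3.0),
]
values = fill_range(
    samples,
    start=datetime(2024, 6, 19, 10, 0),
    end=datetime(2024, 6, 19, 10, 3),
    step=timedelta(minutes=1),
)
print([value for _, value in values])  # [1.0, 1.0, 3.0, 3.0]
```

The steps at 10:01 and 10:03 have no data point of their own, so they repeat the previous value.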

Inaccurate Metric Querying Result

Querying a Metric name that does not exist (for example, `core_event_received_incorrect`) does not return an error.
Instead, the Response has a success status and an empty result.
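Because a misspelled Metric name looks like a successful query, client code may want to treat an empty result as a separate case. A minimal sketch follows, with hypothetical responses and an illustrative function name:

```python
def classify_response(response):
    """Return 'error', 'empty', or 'ok' for a Metrics API query Response."""
    if response.get("status") != "success":
        return "error"
    data = response.get("data") or {}
    if not data.get("result"):
        # Success with no series: the Metric name or label filters may be wrong.
        return "empty"
    return "ok"


# Hypothetical responses shaped like the schema in this guide.
empty = {"status": "success", "data": {"resultType": "vector", "result": []}}
failed = {"status": "error", "errorType": "bad_data", "error": "parse error"}
found = {"status": "success",
         "data": {"resultType": "vector",
                  "result": [{"metric": {}, "value": [1696473068, "3"]}]}}

print(classify_response(empty))   # empty
print(classify_response(failed))  # error
print(classify_response(found))   # ok
```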

Reference Query

The Sources Dashboard visualizes Metrics and displays user-friendly information by querying inventory-related Metrics data. The following example, which powers the dashboard gauge chart, serves as a reference for building and configuring Metrics API queries.

Explanation through an Example

1. `gaugeChart` represents Instant Metric data for a gauge chart.

The given example is one of the use cases.

A wide range of visualization options is available.
For Example:

Numeric format,
Bar charts and other graphical representations,
Tabular format.

2. `metricInstant` is an Instant Query at a single point in time.

Alternative

Consider `metricRange` (an expression query over a range of time) as an alternative.

3. `query` is a mandatory Metrics API query parameter containing a query string for a Metric.

Note

See the available parameters and their configuration options in the Query Parameters section above.

4. `$gaugeChartQuery` is a variable holding the PromQL query string, which captures either the difference in counts over a specified time range or the total count for specific sources and types.

Info

Variables serve as input parameters for the query. They make the query dynamic and can be adjusted based on user needs or UI interactions.

5. `sum()` aggregates data by summing up all matching time series.

Note

For more details and configuration options, check the Aggregation operators.

6. `rubix_event_runtime_seconds_count` is a counter Metric (derived from the `rubix_event_runtime_seconds` histogram Metric) that shows the number of observed events executed by the engine.

Alternative

Three series of data are available as "child" Metrics for the `rubix_event_runtime_seconds` and `rubix_event_inflight_latency_seconds`:

`_bucket`: This series provides a cumulative count of observed values for each bucket.
So, for the `rubix_event_runtime_seconds`, you might have buckets like `<0.005`, `<0.01`, `<0.1`, etc., indicating how many events had run times less than those thresholds.
`_sum`: This represents the total cumulative sum of all observed values.
For `rubix_event_inflight_latency_seconds`, this would be the sum of all inflight latencies observed.
`_count`: This provides the total count of events observed.

For more details, check the Metrics Types section here.

7. Source Filter value `!=\"internal\"` ensures that internal events are excluded from the count.

Alternative

Filtering options availability depends on the labels saved as a part of the specific Metric.
For Example:

Source Filter specifies whether events from certain sources are included or excluded.
Status Filter determines which statuses of events are considered.
Retailer Filter allows you to fetch a retailer-specific Metrics data based on the `retailer_id`.

Filters are available to be applied individually or in combination.

For more details, check the Metrics Labels section here.

8. Inventory Entities to be considered in the query are specified with the `entity_type`.

The given Gauge Chart query for the Sources Dashboard is one of the use cases.

Similar dashboards could be constructed using the Metrics API for other domains, like:

Products,
Orders.

Taking Metrics as a foundational example and understanding their structure allows you to adapt and develop dashboards for different domains.

For instance, the entities to consider when creating a Product Dashboard include:

`PRODUCT_CATALOGUE`,
`PRODUCT`,
and `CATEGORY`.

Ensure you adjust and replace the relevant Entity Types to tailor the dashboard to the specific domain you're focusing on.

9. Subtraction helps to calculate the difference in counts over a specified time range.

Note

For more details and options available, check the Arithmetic binary operators.

10. `offset` shifts the selector it is applied to 480 minutes into the past, so the subtraction yields the difference between the Metric count now and the count 480 minutes ago.

Info

The Offset modifier lets you shift the start time of a query to a specific time in the past.

This is valuable when assessing the state of Metrics at a previous moment or contrasting present Metric values against historical data.

For more details and examples, check the Offset modifier.

11. The `or` clause returns the current count of `rubix_event_runtime_seconds_count` when the difference in counts (now vs 480 minutes ago) has no data.

Note

For more details and options available, check the Logical/set binary operators.
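Putting the eleven parts together, a query of this shape can be assembled as sketched below. This is a reconstruction from the walkthrough, not the dashboard's actual configuration: the label name `source`, the regex matcher on `entity_type`, the entity type values, and the selected response fields are all assumptions.

```python
import json


def build_gauge_query(metric, entity_types, offset="480m"):
    """Build 'current - previous or current' for a counter Metric."""
    # Label names and the regex matcher are assumed for illustration.
    selector = '{source!="internal",entity_type=~"%s"}' % "|".join(entity_types)
    current = f"sum({metric}{selector})"
    previous = f"sum({metric}{selector} offset {offset})"
    # '-' binds tighter than 'or', so this parses as (current - previous) or current.
    return f"{current} - {previous} or {current}"


# 'gaugeChart' is an alias for the metricInstant result.
DOCUMENT = """
query ($gaugeChartQuery: String!) {
  gaugeChart: metricInstant(query: $gaugeChartQuery) {
    status
    data { result { value } }
  }
}
"""

gauge_query = build_gauge_query(
    "rubix_event_runtime_seconds_count",
    ["INVENTORY_POSITION", "INVENTORY_CATALOGUE"],  # placeholder entity types
)
body = json.dumps({"query": DOCUMENT, "variables": {"gaugeChartQuery": gauge_query}})
print(gauge_query)
```

Swapping the entity types or the offset adapts the same pattern to other domains or comparison windows.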

The given example is one of the use cases.

The Metrics API is flexible and can be used in a wide range of Fluent Platform Observability scenarios. For example:

Representing a range of Metrics data for a bar chart showing complete, failed, or no match events.
Showing a range of Metrics data for a bar chart categorized by source.
Representing Instant Metric data on how long it takes to complete a queue.
Displaying the latest time a certain event was received.
Showing Instant Metric data on the total number of failures.
Preparing data for tabular representation, including totals, failures, completions, update timestamps, and any analytical calculations based on them.
And many more.
