Writing an Istio WASM Plugin in Go for migrating 100s of services to new auth strategy (Part 4)

Shane Hender
Zendesk Engineering
4 min read · Jul 13, 2023


Part 4: Adding Metrics

This article will mostly be about the building of the WASM plugin itself and less on the reasons why we needed it, but a 1-liner explanation is “We had to route our service-to-service traffic through an Nginx proxy to acquire an auth JWT from an Auth Service, and wanted to remove that extra network hop by having the WASM plugin call the Auth Service directly instead of Nginx”.

This section is all about adding telemetry to the plugin. To refer back to earlier parts use the links below:

Getting started with metrics in our plugin is easy, though adding tags to those metrics is a bit harder.

Generally we need to define a metric with the Envoy host the first time we use it, then memoize the returned metric ID so we can re-use it.

I’m just going to paste the code for our Metrics struct, which simply keeps an in-memory map of metric names (including tags) to their corresponding IDs. MetricCounter is just a uint32 type. proxywasm supports histograms and gauges too.

package internal

import (
	"fmt"

	"github.com/tetratelabs/proxy-wasm-go-sdk/proxywasm"
)

type Metrics struct {
	// Our memoized map of metric names to ID
	counters map[string]proxywasm.MetricCounter
}

// We want to group all our metrics under a common prefix
const MetricPrefix = "envoy_wasm_auth_plugin"

func NewMetrics() *Metrics {
	return &Metrics{
		counters: make(map[string]proxywasm.MetricCounter),
	}
}

// Increment function increments the specific metric name by 1
func (m *Metrics) Increment(name string) {
	fullName := metricName(name)
	if _, exists := m.counters[fullName]; !exists {
		// We haven't seen this metric name before, so define it in the Envoy Host
		m.counters[fullName] = proxywasm.DefineCounterMetric(fullName)
	}
	m.counters[fullName].Increment(1)
}

// metricName prepends the common prefix to the metric name
func metricName(name string) string {
	fullName := fmt.Sprintf("%s.%s", MetricPrefix, name)

	return fullName
}

Each time a metric is used, we first check whether we’ve registered it before. If we haven’t, we call DefineCounterMetric to register it and get the ID to use with the Increment function. This is all we need if we only want simple metric names without tags.
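To make the define-once behaviour concrete, here is a minimal standalone sketch of the same memoization that runs without an Envoy host. The counterID type, memoCounters struct and injected define function are hypothetical stand-ins for proxywasm.MetricCounter and proxywasm.DefineCounterMetric, used here only so the number of define calls can be counted.

```go
package main

import "fmt"

// counterID is a stand-in for proxywasm.MetricCounter (an ID handed out by the host).
type counterID uint32

// memoCounters caches metric IDs by name. define is a hypothetical stand-in
// for proxywasm.DefineCounterMetric so the sketch runs outside Envoy.
type memoCounters struct {
	ids    map[string]counterID
	define func(name string) counterID
}

func newMemoCounters(define func(string) counterID) *memoCounters {
	return &memoCounters{ids: make(map[string]counterID), define: define}
}

// lookup returns the cached ID for name, defining the metric on first use only.
func (m *memoCounters) lookup(name string) counterID {
	id, exists := m.ids[name]
	if !exists {
		id = m.define(name)
		m.ids[name] = id
	}
	return id
}

func main() {
	defineCalls := 0
	counters := newMemoCounters(func(name string) counterID {
		defineCalls++
		return counterID(defineCalls)
	})

	// Three uses of the same name should define the metric only once.
	for i := 0; i < 3; i++ {
		counters.lookup("requests_intercepted")
	}
	counters.lookup("auth_requests")

	fmt.Println("define calls:", defineCalls) // two distinct names, so 2
}
```

In the real plugin the cached value is the MetricCounter itself, so the lookup and the Increment(1) call happen together.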

Adding in a metric to our RequestHandler:

func (r *RequestHandler) doSomethingWithRequest(reqHeaderMap map[string]string, xRequestID string) types.Action {
	r.Metrics.Increment("requests_intercepted")
	...
	...
}

After deploying this, we can verify that it creates the metric correctly by querying the Envoy Prometheus endpoint:

$ kubectl exec "$(kubectl get pod -l app=helloworld -o jsonpath='{.items[0].metadata.name}')" -c istio-proxy -- curl -sS 0:15020/stats/prometheus | grep envoy_wasm_auth_plugin
# TYPE envoy_wasm_auth_plugin_requests_intercepted counter
envoy_wasm_auth_plugin_requests_intercepted{} 2

Unfortunately, since the metric functionality exposed by Envoy is based on a very basic StatsD format, tags are a bit of an afterthought. So we need regex-based tag extractors defined in Envoy to separate the tags from the metric name.

For me, it was easier to check which tag extractors were already defined in the Istio configuration than to go through the process of adding a new tag regex every time I added another tag to a metric.

We can query the Envoy sidecar to see what tags are already available. Omit the jq filter at the end if you don’t have it; it just makes the extremely long config_dump output easier to read.

$ kubectl exec "$(kubectl get pod -l app=helloworld -o jsonpath='{.items[0].metadata.name}')" -c istio-proxy -- curl -sS 0:15000/config_dump | jq ".configs[0].bootstrap.stats_config.stats_tags"
[
  {
    "tag_name": "cluster_name",
    "regex": "^cluster\\.((.+?(\\..+?\\.svc\\.cluster\\.local)?)\\.)"
  },
  {
    "tag_name": "tcp_prefix",
    "regex": "^tcp\\.((.*?)\\.)\\w+?$"
  },
  {
    "tag_name": "response_code",
    "regex": "(response_code=\\.=(.+?);\\.;)"
  },
  {
    "tag_name": "response_code",
    "regex": "_rq(_(\\d{3}))$"
  },
  {
    "tag_name": "response_code_class",
    "regex": "_rq(_(\\dxx))$"
  },
  ...
  ...
  ...
]

The regex next to each tag_name shows the format our metric name needs to match. For example, if we wanted to track the response status from our Auth Service call, we’d format a metric like:

envoy_wasm_auth_plugin.auth_requests_response_code=.=200;.;
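To see why that name format works, here is a minimal sketch of how a tag extractor is applied, written as standalone Go rather than anything Envoy actually runs. It assumes Envoy's documented behaviour for tag regexes: the first capture group is the part removed from the stat name, and the second capture group is the tag value. The extractTag helper is hypothetical and exists only for illustration; the regex is the response_code entry from the config_dump output above.

```go
package main

import (
	"fmt"
	"regexp"
)

// responseCodeExtractor is the response_code regex from the Istio bootstrap config,
// with the JSON escaping removed.
var responseCodeExtractor = regexp.MustCompile(`(response_code=\.=(.+?);\.;)`)

// extractTag is a hypothetical helper mimicking a tag extractor: capture group 1
// is cut out of the stat name and capture group 2 becomes the tag value.
func extractTag(re *regexp.Regexp, stat string) (name, value string, ok bool) {
	loc := re.FindStringSubmatchIndex(stat)
	if loc == nil {
		return stat, "", false
	}
	// loc[2]:loc[3] spans group 1, loc[4]:loc[5] spans group 2.
	name = stat[:loc[2]] + stat[loc[3]:]
	value = stat[loc[4]:loc[5]]
	return name, value, true
}

func main() {
	stat := "envoy_wasm_auth_plugin.auth_requests_response_code=.=200;.;"
	name, value, ok := extractTag(responseCodeExtractor, stat)
	fmt.Println(name, value, ok)
	// Prints: envoy_wasm_auth_plugin.auth_requests_ 200 true
}
```

The underscore left at the end of the name is the separator placed in front of the tag; it is not part of the extractor's first capture group, so it stays in the metric name.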

We first have to add tag support into our main Metrics struct:

package internal

...
...
...

func (m *Metrics) Increment(name string, tags [][2]string) {
	fullName := metricName(name, tags)
	...
	...
}

func metricName(name string, tags [][2]string) string {
	fullName := fmt.Sprintf("%s_%s", MetricPrefix, name)

	for _, t := range tags {
		fullName += fmt.Sprintf("_%s=.=%s;.;", t[0], t[1])
	}
	return fullName
}
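As a quick sanity check before deploying, the name building can be exercised on its own. The sketch below is a standalone copy of the same logic (taggedName and metricPrefix are illustrative names, not part of the plugin) that uses strings.Builder instead of repeated concatenation and prints the raw stat name the plugin would hand to Envoy.

```go
package main

import (
	"fmt"
	"strings"
)

// metricPrefix mirrors MetricPrefix from the plugin's Metrics code.
const metricPrefix = "envoy_wasm_auth_plugin"

// taggedName builds "<prefix>_<name>" and appends each tag in the
// "_<key>=.=<value>;.;" format that the predefined extractors match.
func taggedName(name string, tags [][2]string) string {
	var b strings.Builder
	b.WriteString(metricPrefix)
	b.WriteString("_")
	b.WriteString(name)
	for _, t := range tags {
		fmt.Fprintf(&b, "_%s=.=%s;.;", t[0], t[1])
	}
	return b.String()
}

func main() {
	fmt.Println(taggedName("auth_requests", [][2]string{{"response_code", "200"}}))
	// Prints: envoy_wasm_auth_plugin_auth_requests_response_code=.=200;.;
	fmt.Println(taggedName("requests_intercepted", nil))
	// Prints: envoy_wasm_auth_plugin_requests_intercepted
}
```

Note that every distinct tag value yields a distinct stat name, so each value gets its own entry in the memoized counters map.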

Then add a counter into our auth_client.go file to increment the metric with the response status tag:

d.Metrics.Increment("auth_requests", [][2]string{{"response_code", authResponseHeaders[":status"]}})

This will then allow a metric named envoy_wasm_auth_plugin.auth_requests with the tag response_code:200 to be exposed as a Prometheus metric. Let’s verify that with Envoy as well:

$ kubectl exec "$(kubectl get pod -l app=helloworld -o jsonpath='{.items[0].metadata.name}')" -c istio-proxy -- curl -sS 0:15020/stats/prometheus | grep envoy_wasm_auth_plugin
# TYPE envoy_wasm_auth_plugin_auth_requests_ counter
envoy_wasm_auth_plugin_auth_requests_{response_code="200"} 1
# TYPE envoy_wasm_auth_plugin_requests_intercepted counter
envoy_wasm_auth_plugin_requests_intercepted{} 1

If you want to add tags that aren’t pre-defined, you’ll have to add an Envoy stats tag definition, which I thankfully didn’t have to figure out because Istio’s default Envoy stats_tags configuration already covered the tags I needed.

Success kid for tags I needed correlated to Istio predefined list
https://imgflip.com/i/7o2j2k

In the next post we’ll show how to test this plugin.
