It's not a secret that we use Zabbix to monitor the infra. That's even a reason why we (re)build it for some other architectures, including aarch64,ppc64,ppc64le on CBS and also armhfp

There are really cool things in Zabbix, including Low-Level Discovery. With such discovery, you can create items/prototypes/triggers that will be applied "automagically" for each discovered network interface, or mounted filesystem. For example, the default template (if you still use it) has such item prototypes and also graph for each discovered network interface and show you the bandwidth usage on those network interfaces.

But what happens if you suddenly want to for example to create some calculated item on top of those ? Well, the issue is that from one node to the other, interface name can be eth0, or sometimes eth1, and with CentOS 7 things started to also move to the new naming scheme, so you can have something like enp4s0f0. I wanted to create a template that would fit-them-all, so I had a look at calculated item and thought "well, easy : let's have that calculated item use a user macro that would define the name of the interface we really want to gather stats from ..." .. but it seems I was wrong. Zabbix user macros can be used in multiple places, but not everywhere. (It seems that I wasn't the only one not understanding the doc coverage for this, but at least that bug report will have an effect on the doc to clarify this)

That's when I discussed this in #zabbix (on that RichLV pointed me to something that could be interesting for my case : Alias. I must admit that it's the first time I was hearing about it, and I don't even know when it landed in Zabbix (or if I just overlooked it at first sight).

So cool, now I can just have our config mgmt pushing for example a /etc/zabbix/zabbix_agentd.d/interface-alias.conf file that looks like this and reload zabbix-agent :


That means that now, whatever the interface name will be (as puppet in our case will create that file for us) , we'll be able to get values from net.if.default.out and keys, automatically. Cool

That also means that if you want to aggregate all this into a single key for a group of nodes (and so graph that too), you can do something always referencing those new keys (example for the total outgoing bandwidth for a group of hosts) :

grpsum["Your group name","net.if.default.out",last,0]

And from that point, you can easily also configure triggers, and graphs too. Now going back to work on some other calculated items for total bandwith usage for a period of time and triggers based on some max_bw_usage user macro.