About MCaaS
Monitors
Overview
A metric monitor provides alerts and notifications if a specific metric is above or below a certain threshold. This page provides instructions for setting up a metric monitor to alert on low disk space.
Prerequisites
Before getting started, you need a Datadog account linked to a host with the Datadog Agent installed. To verify, check your Infrastructure List in Datadog.
Setup
To create a metric monitor in Datadog, use the main navigation: Monitors → New Monitor → Metric.
Choose the detection method
When you create a metric monitor, Threshold Alert is automatically selected as the detection method. A threshold alert compares metric values against user-defined thresholds. The goal for this monitor is to alert on a static threshold, so no change is necessary.
Define the metric
To get an alert on low disk space, use the system.disk.in_use metric from the Disk integration and average the metric over host and device.
After this is set, the monitor automatically updates to a Multi Alert that triggers a separate alert for each host, device reporting your metric.
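As a rough illustration, the same metric definition can be written as a Datadog monitor query string. The sketch below assumes a 5-minute evaluation window and a threshold of 0.9; neither value comes from this page, and the helper name build_disk_query is hypothetical.

```python
def build_disk_query(threshold, window="last_5m"):
    """Build a metric monitor query string for system.disk.in_use.

    The window and threshold are illustrative assumptions, not values
    prescribed by this guide.
    """
    # avg(<window>) aggregates over time; avg:<metric> aggregates over sources.
    # "by {host,device}" groups the results, which makes this a multi alert.
    return (
        f"avg({window}):avg:system.disk.in_use{{*}} "
        f"by {{host,device}} > {threshold}"
    )


query = build_disk_query(0.9)
print(query)
```

Grouping by host and device is what produces one alert per host and device combination.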
Set alert conditions
According to the Disk integration documentation, system.disk.in_use is the amount of disk space in use as a fraction of the total. So, when this metric is reporting a value of 0.7, the device is 70% full.
To alert on low disk space, the monitor should trigger when the metric is above the threshold. The threshold values are based on your preference. For this metric, values between 0 and 1 are appropriate.
For this example, the other settings in this section are left on the defaults. For more details, see the Metric Monitors documentation.
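To make the threshold logic concrete, here is a minimal sketch of how a reported value maps to a monitor state. The warning and alert thresholds (0.8 and 0.9) are assumed for illustration only, and classify_disk_usage is a hypothetical helper rather than part of Datadog.

```python
def classify_disk_usage(in_use, warning=0.8, critical=0.9):
    """Map a system.disk.in_use value (a fraction from 0 to 1) to a state.

    The default thresholds are illustrative assumptions; choose your own.
    """
    if not 0.0 <= in_use <= 1.0:
        raise ValueError("system.disk.in_use is a fraction between 0 and 1")
    # The monitor triggers when the metric is above the threshold.
    if in_use > critical:
        return "alert"
    if in_use > warning:
        return "warn"
    return "ok"


for value in (0.7, 0.85, 0.95):
    print(f"{value:.0%} full -> {classify_disk_usage(value)}")
```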
Say what’s happening
Before a monitor can be saved, it must have a title and message.
Title
The title must be unique for each monitor. Since this is a multi alert monitor, names are available for each group element (host and device) with message template variables:
Disk space is low on {{device.name}} / {{host.name}}
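The following sketch shows how such a title expands for one host and device pair. It is a simplified stand-in for Datadog's own template rendering, not its actual implementation; the render_title helper and the sample values /dev/sda1 and web-01 are hypothetical.

```python
import re


def render_title(template, group):
    """Replace {{name}} placeholders with values from the group mapping.

    Unknown placeholders are left untouched. This only imitates the
    substitution for illustration.
    """
    def substitute(match):
        return group.get(match.group(1), match.group(0))

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


title = "Disk space is low on {{device.name}} / {{host.name}}"
# Hypothetical group values for a single triggered alert.
print(render_title(title, {"device.name": "/dev/sda1", "host.name": "web-01"}))
```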
Message
Use the message to tell your team how to resolve the issue, for example:
Steps to free up disk space:
1. Remove unused packages
2. Clear APT cache
3. Uninstall unnecessary applications
4. Remove duplicate files
For different messages based on alert vs. warning thresholds, see the Notification documentation.
Notify your team
Use this section to send notifications to your team through Email, Slack, PagerDuty, etc. You can search for team members and connected accounts with the drop-down box. When an @notification is added to this box, the notification is automatically added to the message box.
Removing the @notification from either section removes it from both sections.
Tag your Monitors
Add tags to your monitors to help group and search for monitors in the UI. Tags are also helpful when creating dashboard widgets for your monitors. The team asks that you refrain from starting your tags with MCaaS. We encourage using the short codes for tenants, modules, applications, and services as tags.
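For teams that manage monitors as code, the settings from this page can be gathered into a monitor definition shaped like the payload for the Datadog Monitors API. This is a sketch under stated assumptions: the thresholds, the 5-minute window, the sample tags, and the case-insensitive check on the MCaaS prefix are all illustrative, and the helper names are hypothetical.

```python
import json


def validate_tags(tags):
    """Reject tags that start with "MCaaS".

    Comparing case-insensitively is an assumption made for this sketch.
    """
    bad = [tag for tag in tags if tag.lower().startswith("mcaas")]
    if bad:
        raise ValueError(f"tags must not start with MCaaS: {bad}")
    return list(tags)


def build_monitor_payload(tags, warning=0.8, critical=0.9):
    """Assemble a metric monitor definition as a plain dictionary."""
    return {
        "name": "Disk space is low on {{device.name}} / {{host.name}}",
        "type": "metric alert",
        # The threshold in the query is kept consistent with options.thresholds.
        "query": f"avg(last_5m):avg:system.disk.in_use{{*}} by {{host,device}} > {critical}",
        "message": "Steps to free up disk space:\n1. Remove unused packages\n2. Clear APT cache",
        "tags": validate_tags(tags),
        "options": {"thresholds": {"warning": warning, "critical": critical}},
    }


# Hypothetical short-code tags; substitute your own tenant and service codes.
payload = build_monitor_payload(["tenant:abc", "service:web"])
print(json.dumps(payload, indent=2))
```

The sketch only builds and prints the definition; it does not send anything to Datadog.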