About MCaaS

Setup

Initial Setup

Once you’re logged in to Datadog, you should land on the quick-start guide by default. If it doesn’t appear, copy the URL below to start or review the initial integrations.

`https://fcs-mcaas-<tenant-name>.ddog-gov.com/help/quick_start`

Instrumentation

Set up your application to send traces using one of the following official Datadog tracing libraries:

Java

Python

Ruby

Go

Node.js

PHP

Nginx

C++

.NET

Note

More information is available at https://docs.datadoghq.com/tracing/setup/

Pod Tags

MCaaS adds the following annotations to each pod; they appear as tags on logs, metrics, and traces. The values come from the application’s Flux config.

<table>
<caption>Available Pod Tags</caption>
<col width="50%" />
<col width="50%" />
<tbody>
<tr class="odd">
<td align="left">Tag Name</td>
<td align="left">Tag Value from Flux Config</td>
</tr>
<tr class="even">
<td align="left">tenant-short-code</td>
<td align="left">mcaasLabels tenantShortCode</td>
</tr>
<tr class="odd">
<td align="left">module-short-code</td>
<td align="left">mcaasLabels moduleShortCode</td>
</tr>
<tr class="even">
<td align="left">application-short-code</td>
<td align="left">mcaasLabels applicationShortCode</td>
</tr>
<tr class="odd">
<td align="left">service</td>
<td align="left">mcaasLabels serviceShortCode</td>
</tr>
<tr class="even">
<td align="left">env</td>
<td align="left">mcaasLabels environment</td>
</tr>
<tr class="odd">
<td align="left">version</td>
<td align="left">generated image tag</td>
</tr>
</tbody>
</table>

To search logs, metrics, and traces by these or any other tags, enter `<Tag Name>:<Tag Value>` in the search bar.
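As a minimal sketch of the tag search syntax, the hypothetical helper below joins several tag pairs into one search string. The function name `build_tag_query` and the example values (`abc`, `dev`) are assumptions for illustration and are not part of MCaaS or Datadog. It also assumes that space-separated terms are combined with AND, which is the usual default in Datadog search bars but worth confirming for the view you are searching in.

```python
def build_tag_query(tags: dict[str, str]) -> str:
    """Join tag pairs into a `name:value name:value` search string."""
    terms = []
    for name, value in tags.items():
        if not name or not value:
            raise ValueError("tag names and values must be non-empty")
        terms.append(f"{name}:{value}")
    return " ".join(terms)


# Hypothetical values; real ones come from the application's Flux config.
query = build_tag_query({"tenant-short-code": "abc", "env": "dev"})
print(query)
```

The resulting string can be pasted into the search bar as-is.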

Integrations

Datadog officially lists more than 400 integrations.
Custom integrations are available via the Datadog API.
The Agent is open source.
Once integrations are configured, Datadog treats all data the same way, whether it comes from a datacenter or an online service.

Log Management

Datadog Log Management lets you send and process every log produced by your applications and infrastructure. You can observe your logs in real time with Live Tail, without indexing them. You can ingest all of the logs from your applications and infrastructure, decide dynamically what to index using filters, and then store them in an archive.

APM & Distributed Tracing

Datadog Application Performance Monitoring (APM or tracing) provides you with deep insight into your application’s performance—from automatically generated dashboards for monitoring key metrics, like request volume and latency, to detailed traces of individual requests—side by side with your logs and infrastructure monitoring. When a request is made to an application, Datadog can see the traces across a distributed system, and show you systematic data about precisely what is happening to this request.

Spans

A span represents a logical unit of work in a distributed system for a given time period. A trace is made up of multiple spans. Datadog APM allows you to customize your traces with span tags to include any additional information you might need to maintain observability into your application. For language-specific instructions on setting span tags, see https://docs.datadoghq.com/tracing/guide/add_span_md_and_graph_it/?tab=java#instrument-your-code-with-custom-span-tags

After you add span tags to the application code, you can analyze them on Datadog’s APM Traces page: https://docs.datadoghq.com/tracing/guide/add_span_md_and_graph_it/?tab=java#leverage-your-custom-span-tags-with-app-analytics
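To make the relationship between spans, traces, and span tags concrete, here is a small conceptual model in plain Python. It is not the Datadog tracing library: the `Span` class, its fields, and the tag names (`customer.tier`, `db.table`) are all hypothetical. In a real application the tracing library creates the spans, and you would set tags through that library as described in the instructions linked above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Conceptual stand-in for a span; not the Datadog tracer's own class."""
    trace_id: int
    span_id: int
    name: str
    start_ms: int
    duration_ms: int
    parent_id: Optional[int] = None
    tags: dict = field(default_factory=dict)

    def set_tag(self, key: str, value: str) -> None:
        self.tags[key] = value


def spans_with_tag(trace: list, key: str, value: str) -> list:
    """Return the spans in a trace that carry the given tag."""
    return [s for s in trace if s.tags.get(key) == value]


# One request produces one trace: several spans sharing the same trace_id.
root = Span(trace_id=1, span_id=10, name="web.request", start_ms=0, duration_ms=120)
query = Span(trace_id=1, span_id=11, name="db.query", start_ms=20,
             duration_ms=45, parent_id=10)

# Custom span tags (hypothetical names) attach extra context to a span.
root.set_tag("customer.tier", "gold")
query.set_tag("db.table", "orders")

trace = [root, query]
print([s.name for s in spans_with_tag(trace, "customer.tier", "gold")])
```

Filtering by tag, as `spans_with_tag` does here, is the same idea as searching traces by a custom span tag in the APM Traces page.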

Infrastructure

All machines show up in the infrastructure list.
You can see the tags applied to each machine. Tagging allows you to indicate which machines have a particular purpose.
Datadog attempts to automatically categorize your servers. If a new machine is tagged, you can immediately see the stats for that machine based on what was previously set up for that tag.

Host Map

The Host Map can be found under the Infrastructure menu. It offers the ability to:

Quickly visualize your environment
Identify outliers
Detect usage patterns
Optimize resources

Events

The Event Stream is based on the same conventions as a blog:

Any event in the stream can be commented on.
It helps distributed teams keep an investigation focused.
You can filter by user, source, tag, host, status, priority, and incident.

For each incident, users can:

Increase/decrease priority
Comment
See similar incidents
Notify team members with @; they receive an email
Use @support-datadog to ask for assistance

Dashboards

Dashboards contain graphs with real-time performance metrics.

Mouse movement is synchronized across all graphs in a screenboard.
Vertical bars are events. They put a metric into context.
Click and drag on a graph to zoom in on a particular timeframe.
As you hover over the graph, the event stream moves with you.
Display by zone, host, or total usage.
Datadog exposes a JSON editor for the graph, allowing for arithmetic and functions to be applied to metrics.
Share a graph snapshot that appears in the stream.
Graphs can be embedded in an iframe. This lets you give a third party access to a live graph without also giving access to your data or any other information.

Monitors

Monitors provide alerts and notifications based on metric thresholds, integration availability, network endpoints, and more.

Use any metric reporting to Datadog
Set up multi-alerts (by device, host, etc.)
Use @ in alert messages to direct notifications to the right people
Schedule downtimes to suppress notifications for system shutdowns, offline maintenance, etc.
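The points above can be sketched as a monitor definition built in Python. Nothing is sent to Datadog here; the function only assembles a dictionary. The field names and query syntax are intended to mirror the monitor definition accepted by the Datadog API, but treat them as an assumption and check the API reference before relying on them. The function name `cpu_monitor`, the metric `system.cpu.user`, the threshold, and the address `oncall@example.com` are hypothetical.

```python
import json


def cpu_monitor(env: str, threshold: float, notify: list) -> dict:
    """Build a metric monitor definition as a plain dict (nothing is sent)."""
    handles = " ".join(f"@{h}" for h in notify)
    return {
        "name": f"High CPU on {env} hosts",
        "type": "metric alert",
        # `by {host}` makes this a multi-alert: one alert per reporting host.
        "query": f"avg(last_5m):avg:system.cpu.user{{env:{env}}} by {{host}} > {threshold}",
        # @-handles in the message direct notifications to the right people.
        "message": f"CPU is above {threshold}% on {{{{host.name}}}}. {handles}",
        "tags": [f"env:{env}"],
    }


monitor = cpu_monitor("dev", 90, ["oncall@example.com"])
print(json.dumps(monitor, indent=2))
```

Scoping the query with the `env` pod tag keeps the monitor limited to one environment, and grouping by host yields a separate alert for each host that crosses the threshold.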

Network Performance Monitoring

Datadog Network Performance Monitoring (NPM) gives you visibility into your network traffic across any tagged object in Datadog: from containers to hosts, services, and availability zones. Group by anything, from datacenters to teams to individual containers, and use tags to filter traffic by source and destination. The filtered traffic is aggregated into flows, each showing traffic between one source and one destination, which you can explore through a customizable network page and network map. Each flow carries network metrics such as throughput, bandwidth, and TCP retransmit count, along with source and destination information down to the IP, port, and PID levels.
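The aggregation of traffic into flows can be illustrated with a short, purely conceptual sketch. It does not call Datadog or reflect how NPM is implemented; the record layout, service names, and numbers are hypothetical. It only shows the idea of summing traffic volume and retransmits per source and destination pair.

```python
from collections import defaultdict

# Hypothetical connection records:
# (source service, destination service, bytes sent, TCP retransmits)
connections = [
    ("web", "api", 1200, 0),
    ("web", "api", 800, 2),
    ("api", "db", 500, 1),
]


def aggregate_flows(records):
    """Sum traffic volume and retransmits per (source, destination) pair."""
    flows = defaultdict(lambda: {"bytes": 0, "retransmits": 0})
    for source, destination, sent, retransmits in records:
        flow = flows[(source, destination)]
        flow["bytes"] += sent
        flow["retransmits"] += retransmits
    return dict(flows)


flows = aggregate_flows(connections)
for (source, destination), stats in flows.items():
    print(f"{source} -> {destination}: {stats}")
```

Grouping by a different tag, such as availability zone or team, would only change the key used for each flow.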

Real User Monitoring

Datadog Real User Monitoring (RUM) enables you to visualize and analyze the real-time activities and experiences of individual users to prioritize engineering work on the features with the highest business impact. You can visualize load times, frontend errors, and page dependencies, and then correlate business and application metrics so that you can troubleshoot quickly with application, infrastructure, and business metrics in a single dashboard.
