Q: What is the metric data resolution and delay?
A: Data is collected from the Microsoft portal every 5 minutes. Data available through the Microsoft APIs is typically delayed by 15-30 minutes. However, some metrics are available only at 1-day resolution and always reflect the previous day's status:
- license consumption and activations
- mailbox counts, including which mailboxes are active and inactive, and which are in specific quota categories
- storage, including storage for mail, OneDrive, and SharePoint
Q: Why does the extension need 'x.read.all' permissions on audit logs, or is it possible to specify more limited permissions?
A: The extension uses audit logs to count users of the O365 services. We didn’t find a more granular way to specify the required permissions. The extension parses the audit logs in memory to obtain these counts.
Q: What guarantees can you provide that the extension will only access the minimum required data, and that these permissions are truly the only option available?
A: The extension gets a list of active users from the audit logs, summarizes it, and then discards it. All of this happens in memory; nothing is exported as logs, and no other information is read from the audit logs. You can verify this in the extension code: download the extension package from the public hub and inspect the contents of the zip file.
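The "read, summarize, discard" flow described above can be illustrated with a minimal sketch. This is not the extension's actual code. The function name `count_active_users` is hypothetical, and the record fields `UserId` and `Workload` are assumed to follow the common schema of Office 365 audit records.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def count_active_users(records: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count distinct users per workload, keeping only the counts.

    Assumption: each audit record is a mapping with "UserId" and
    "Workload" keys. Records lacking either key are skipped.
    """
    users_by_workload = defaultdict(set)
    for record in records:
        user = record.get("UserId")
        workload = record.get("Workload")
        if user and workload:
            # Normalize case so the same user is not counted twice.
            users_by_workload[workload].add(user.lower())
    # Only the per-workload totals leave this function; the user
    # identifiers are discarded together with the local sets.
    return {workload: len(users) for workload, users in users_by_workload.items()}
```

In this sketch only aggregate numbers survive the call, which mirrors the claim that no audit log content is stored or exported.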
Q: Since the extension has access to the audit logs, can I use it to monitor change events in my O365 environment?
A: No. The extension uses audit logs only to count service users, and it does not record or export any logs. You need to set up audit log forwarding to your Dynatrace tenant separately. Setting up log monitoring deliberately, rather than as a side effect of using this extension, ensures that the security and privacy requirements it brings are met.
Q: Why are some metric timestamps skewed from when the event actually occurred?
A: Metrics may not be accurate to the minute at which they are reported in Dynatrace. The Office 365 API has to process the data before the extension can retrieve it. As a result, timestamps for metrics from the Graph API can differ by 5-15 minutes from the time of the actual event as shown in Office 365. Metrics reported from the Management API are typically 2-4 hours behind the actual time shown in Office 365, and the delay can be as long as 6 hours.
Q: Why is the email activity count flat over the day and delayed by a day?
A: The email activity metric is a tricky one; we are looking for ways to provide near-real-time email activity summaries. Any advice in this regard is welcome. Please use the Community thread to provide feedback.
Q: What does the metric "office365.tenant.service.health" represent?
A: The metric "office365.tenant.service.health" represents the current health of the different M365 services. The specific service recorded by a metric line can be determined by looking at the "service" dimension. The value of the "Service Health Status" metric is calculated by mapping each possible service status to a numerical value:
| Number | Status | Description |
| ------ | ------ | ----------- |
| 0 | serviceOperational | The service is healthy and no issues have been identified. |
| 1 | falsePositive | After a detailed investigation, the service is confirmed to be healthy and operating as designed. No impact to the service was observed or the cause of the incident originated outside of the service. Incidents and advisories with this status appear in the history view until they expire (after the period of time stated in the final post for that event). |
| 2 | serviceRestored | The corrective action has resolved the underlying problem and the service has been restored to a healthy state. To find out what went wrong, view the issue details. |
| 3 | postIncidentReviewPublished | A post-incident report for a specific issue that includes root cause information has been published, with next steps to ensure a similar issue doesn't reoccur. |
| 4 | verifyingService | The action has been taken to mitigate the issue and we are verifying that the service is healthy. |
| 5 | restoringService | The cause of the issue has been identified, and action is being taken to bring the service back to a healthy state. |
| 6 | extendedRecovery | This status indicates that corrective action is in progress to restore the service to most users but will take some time to reach all the affected systems. You might also see this status if a temporary fix is made to reduce impact while a permanent fix is waiting to be applied. |
| 7 | investigating | A potential issue was identified and more information is being gathered about what's going on and the scope of impact. |
| 8 | investigationSuspended | If our detailed investigation of a potential issue results in a request for additional information from customers to allow the service team to investigate further, you'll see this status. If the service team needs you to act, they'll let you know what data or logs they need. |
| 9 | serviceDegradation | An issue is confirmed that may affect use of a service or feature. You might see this status if a service is performing more slowly than usual, there are intermittent interruptions, or if a feature isn't working, for example. |
| 10 | serviceInterruption | You'll see this status if an issue is determined to affect the ability for users to access the service. In this case, the issue is significant and can be reproduced consistently. |
These values increase as the corresponding service status becomes worse. The values fall into three tiers: 0-4 indicate a healthy status, 5-7 a warning state, and 8 and above an error status. These tiers are used in the default dashboard.
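For anyone building their own dashboards or alerts on this metric, the mapping and tiers above can be expressed as a small lookup. This is a minimal sketch in Python. The names `STATUS_VALUES` and `health_tier` are illustrative and not part of the extension.

```python
# Status-to-number mapping, copied from the table above.
STATUS_VALUES = {
    "serviceOperational": 0,
    "falsePositive": 1,
    "serviceRestored": 2,
    "postIncidentReviewPublished": 3,
    "verifyingService": 4,
    "restoringService": 5,
    "extendedRecovery": 6,
    "investigating": 7,
    "investigationSuspended": 8,
    "serviceDegradation": 9,
    "serviceInterruption": 10,
}


def health_tier(value: int) -> str:
    """Classify a metric value into the three tiers described above."""
    if not 0 <= value <= 10:
        raise ValueError(f"unexpected service health value: {value}")
    if value <= 4:
        return "healthy"
    if value <= 7:
        return "warning"
    return "error"
```

For example, `health_tier(STATUS_VALUES["extendedRecovery"])` returns `"warning"`, while `health_tier(STATUS_VALUES["serviceDegradation"])` returns `"error"`.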
Q: What APIs are used by the extension?
A: The extension collects data from the Microsoft Graph API and the Office 365 Management API. Microsoft's authentication endpoint is also used to retrieve authorization tokens. The specific API endpoint that gets used depends on the type of Office 365 tenant that is being monitored. The extension currently supports the Enterprise tenant.
Enterprise Tenant:
Q: Does this extension monitor Azure AD or Active Directory on-prem?
A: This extension does not monitor Azure AD or Active Directory on-prem. This extension provides only two metrics related to O365 user interactions with AD: number of logons and number of failed logons.
Use Active Directory monitoring extensions to monitor Active Directory on-prem. Note that these extensions don't support Azure AD.
Q: What is the DDU Consumption of this extension?
A: The formula for DDU consumption of the extension is:
40 * 525.6 DDUs/year per monitored M365 tenant
Typically, only one tenant is monitored: your enterprise M365 tenant. However, the extension allows for monitoring of multiple tenants.
The DDU cost above does not include any log events or custom events triggered by the extension. For more information, see the DDU log event cost and DDU custom event cost pages.
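As a worked example of the formula above, the following sketch computes the yearly metric DDU estimate for one or more tenants. It rests on two assumptions not stated in the text: that 40 is the number of metric data points reported per minute for each tenant, and that 525.6 is the yearly cost of one such metric (525,600 minutes per year at 0.001 DDU per data point). The function name `yearly_metric_ddus` is illustrative.

```python
# Assumption: 40 metric data points are reported per minute per tenant.
METRICS_PER_TENANT = 40
# Assumption: 525,600 minutes/year x 0.001 DDU per data point = 525.6.
DDU_PER_METRIC_PER_YEAR = 525.6


def yearly_metric_ddus(tenants: int = 1) -> float:
    """Estimate yearly metric DDUs; log and custom events are excluded."""
    if tenants < 0:
        raise ValueError("tenants must be non-negative")
    return METRICS_PER_TENANT * DDU_PER_METRIC_PER_YEAR * tenants


print(round(yearly_metric_ddus(1), 1))  # 21024.0
print(round(yearly_metric_ddus(3), 1))  # 63072.0
```

Under these assumptions a single tenant comes to about 21,024 DDUs per year, and the estimate scales linearly with the number of monitored tenants.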