Site24x7 monitors your critical resources around the clock and presents the resulting statistics and trends in comprehensive reports. This article explains the availability and performance parameters that Site24x7 captures during monitoring, and the calculations our monitoring engine uses to arrive at the values that matter most to your business.
The table below defines the variables used in calculating the different performance metrics.
| Variables used in calculations | Definition |
| --- | --- |
| Monitoring Period | The total time period for which monitoring is enabled |
| Maintenance Period | The total time period within the Monitoring Period for which the monitor is marked as under MAINTENANCE |
| UpTime | The total amount of time during which the monitor is in UP status |
| DownTime | The total amount of time during which the monitor is in DOWN status |
| Response Time | The time taken to complete a single poll |
| Number of Outages | The number of polls that have failed |
| Down Percentage | The percentage of time that the monitor is down outside of the Maintenance Period |
| Maintenance Percentage | The percentage of time the monitor is under maintenance |
| Availability | The percentage of time that the monitor is UP outside of the Maintenance Period |
| API Time | The point of time at which the API call is made by the monitor |
| DNS Time | The point of time at which the DNS request is resolved completely |
| ConnStartTime | The point of time at which the API establishes connection with the website |
| ConnEndTime | The point of time at which the connection to the website socket is successfully established |
| Response Start time | The point of time at which the first response starts coming in for the base page |
| Response End | The point of time at which the response has been completely read |
Maintenance Period
Whenever a monitor needs to be updated or fixed, it can be marked as being under maintenance. Marking a period as maintenance ensures that the monitor is not shown as DOWN in the final reports for that period, which gives an accurate view of the actual downtime. However, you can always include the maintenance period as UPTIME in your uptime calculation using the "MAINTENANCE AS UPTIME" rocker button in your Availability Summary Report. To calculate UPTIME, Site24x7 uses all the outages logged in our monitoring engine to arrive at the actual DOWN percentage. The UPTIME is then derived from this outage value.
Uptime and Downtime
The uptime/downtime of a monitor gives an approximation of the total time the website has been available for customers' use. It is the amount of time (in days, hours, and minutes) that the server, network, or website has been running (UP) or has been unavailable (DOWN). Uptime is usually expressed as a percentage, such as 99.9% uptime for a given period of time. The uptime for a website can be viewed under Availability, above the Events Timeline in the web client.
See the example below to understand how the availability percentage values are determined.
In this example, the time period chosen is Last one month (taken as 30 days) and the downtime is 43 minutes 48 seconds. Converted into seconds:
MonitoringPeriod = 30*24*60*60 seconds = 2592000 seconds
DownTime = (43*60) + 48 seconds = 2628 seconds
Therefore,
DownPercentage = (2628/2592000)*100 ≈ 0.1%
In the case of a monitor group, the total uptime is the sum of the individual monitors' uptimes. For example, for a group of 10 monitors, a 30-day report can show up to 300 days of uptime. The total uptime percentage is the average of the individual monitors' uptime percentages: two monitors, one down all the time and the other up all the time, give 50% uptime.
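The group rules can be sketched in a few lines of Python. This is only an illustration of the arithmetic described here, not Site24x7's implementation; the function names `group_total_uptime` and `group_uptime_percentage` are hypothetical.

```python
def group_total_uptime(uptimes_days: list[float]) -> float:
    """Total group uptime: the sum of each monitor's uptime."""
    return sum(uptimes_days)


def group_uptime_percentage(percentages: list[float]) -> float:
    """Group uptime percentage: the average of the monitors' percentages."""
    if not percentages:
        raise ValueError("at least one monitor is required")
    return sum(percentages) / len(percentages)


# 10 monitors, each UP for the whole 30-day report period
total_days = group_total_uptime([30.0] * 10)

# Two monitors: one always DOWN (0%), one always UP (100%)
group_pct = group_uptime_percentage([0.0, 100.0])

print(total_days)  # 300.0
print(group_pct)   # 50.0
```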
Calculating Availability
The Availability of a website indicates whether the website is available for the customer to use. It is represented as either UP or DOWN for the current instant, and as a percentage for a selected time period. To calculate uptime, Site24x7's monitoring engine has to detect the actual downtime. Downtime may or may not include the maintenance period.
In our above example, maintenance is treated as UP. Therefore, the formula to calculate Availability will be:
AvailabilityPercentage = 100 - DownPercentage
AvailabilityPercentage = 100 - 0.1 = 99.9%
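The single-monitor calculation can be reproduced with a short Python sketch. It assumes, as in the example, that maintenance is treated as UP, so only the downtime is subtracted; the function names are illustrative and not part of any Site24x7 API.

```python
def down_percentage(monitoring_period_s: float, downtime_s: float) -> float:
    """Share of the monitoring period spent DOWN, as a percentage."""
    if monitoring_period_s <= 0:
        raise ValueError("monitoring period must be positive")
    return downtime_s / monitoring_period_s * 100


def availability_percentage(monitoring_period_s: float, downtime_s: float) -> float:
    """Availability with maintenance treated as UP: 100 minus the down percentage."""
    return 100 - down_percentage(monitoring_period_s, downtime_s)


monitoring_period = 30 * 24 * 60 * 60   # 30 days in seconds = 2592000
downtime = 43 * 60 + 48                 # 43 min 48 sec = 2628 seconds

down = down_percentage(monitoring_period, downtime)
availability = availability_percentage(monitoring_period, downtime)

# Reports show values rounded to two decimal places
print(f"{down:.2f}% down, {availability:.2f}% available")
# prints: 0.10% down, 99.90% available
```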
Only a rounded value (rounded to two decimal places) is shown. For monitor groups, the group availability is the sum of the individual monitors' availability divided by the number of monitors in the group.
For instance, suppose an availability report is generated on July 1st with the time period Last 30 days, which is 2592000000 ms, or 720 hrs. The period June 1st to June 30th will be considered for the calculation. The number of monitors selected from the monitor group is 10.
Consider that one monitor had a downtime of one day. The Total Downtime is the sum of all downtimes; hence, the Total Downtime is one day, which is 86400000 ms.
Additionally, let's consider that the monitor has a suspension period of 2 days, which is 172800000 ms.
The total suspended time has to be deducted from the total monitoring period.
Total monitoring period = (Monitoring period * Number of monitors selected from the monitor group) - Total Suspended Time
Therefore, total monitoring period = (10 * 2592000000 ms) - 172800000 ms = 25920000000 ms (300 days) - 172800000 ms (2 days) = 25747200000 ms (298 days)
Total Uptime = Total monitoring period - Total Downtime
Total Uptime = 25747200000 ms - 86400000 ms = 25660800000 ms (297 days)
Availability percentage = (Uptime/total monitoring period)*100
Availability percentage = (25660800000/25747200000)*100 = 99.66%
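The group example above can be checked with the following Python sketch. It follows the three formulas as written (deduct suspended time, subtract downtime, divide); the function and parameter names are assumptions made for illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86400000 ms


def group_availability(period_ms: int, monitor_count: int,
                       total_downtime_ms: int, total_suspended_ms: int) -> float:
    """Group availability percentage over a report period."""
    # Total monitoring period, with suspended time deducted
    total_period_ms = period_ms * monitor_count - total_suspended_ms
    if total_period_ms <= 0:
        raise ValueError("total monitoring period must be positive")
    # Total uptime is whatever remains after removing downtime
    total_uptime_ms = total_period_ms - total_downtime_ms
    return total_uptime_ms / total_period_ms * 100


result = group_availability(
    period_ms=30 * MS_PER_DAY,          # Last 30 days
    monitor_count=10,                   # monitors selected from the group
    total_downtime_ms=1 * MS_PER_DAY,   # one monitor down for one day
    total_suspended_ms=2 * MS_PER_DAY,  # two days of suspension
)
print(round(result, 2))  # 99.66
```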
Based on the total downtime and uptime of the monitor, MTTR and MTBF can also be calculated.
- Mean Time To Repair (MTTR): The average time taken to bring the server back UP once it is down. This must be as low as possible. MTTR will be equal to ZERO in case there are no outages.
MTTR = Actual DownTime / Number of Outages
- Mean Time Between Failures (MTBF): The average time that a device or a system worked without failure or the average time taken for a failure to happen. The term can also mean the length of time a user may reasonably expect a device or system to work before an incapacitating fault occurs. This must be as high as possible. MTBF will be equal to the Total Uptime in case there are no outages.
MTBF = Actual UpTime / Number of Outages
In our example above, the time period selected is one month, and the number of outages is one. Hence,
MTTR = (43 min 48 sec / 1) = 43 min 48 sec
MTBF = (29 days 23 hours 16 min 12 sec / 1) = 29 days 23 hours 16 min 12 sec
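A minimal Python sketch of both formulas, including the zero-outage cases described above, might look like this. The helper names `mttr` and `mtbf` are illustrative; the uptime is taken as the 30-day period minus the downtime.

```python
from datetime import timedelta


def mttr(downtime: timedelta, outages: int) -> timedelta:
    """Mean Time To Repair: downtime per outage, zero if there were no outages."""
    if outages == 0:
        return timedelta(0)
    return downtime / outages


def mtbf(uptime: timedelta, outages: int) -> timedelta:
    """Mean Time Between Failures: uptime per outage, total uptime if no outages."""
    if outages == 0:
        return uptime
    return uptime / outages


period = timedelta(days=30)
downtime = timedelta(minutes=43, seconds=48)
uptime = period - downtime

print(mttr(downtime, 1))  # 0:43:48
print(mtbf(uptime, 1))    # 29 days, 23:16:12
```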
Response Time
Response time comprises four major components: DNS time, connection time, first byte time, and last byte (download) time.
How is it calculated?
DNSResolveTime = DNSTime - APITime
ConnTime = ConnEndTime - ConnStartTime
FirstByteTime = ResponseStart - ConnEndTime
Download Time = ResponseEnd - ResponseStart
ResponseTime = DNSResolveTime + ConnTime + FirstByteTime + Download Time
The response time of the website, as monitored from all the monitoring locations for the chosen time period, is calculated and shown in a line graph. The maximum, minimum, and average response times can be read from this graph. The average values depend on the time period chosen.
In the above example, for the point of time selected the values for the different components of response time are:
DNSResolveTime = 64 ms
ConnTime = 222 ms
FirstByteTime = 129 ms
Download Time = 11 ms
Therefore, for the point in time selected:
ResponseTime = 64 + 222 + 129 + 11 = 426 ms
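The same breakdown can be derived from raw timestamps, as in the Python sketch below. It assumes each component is the later timestamp minus the earlier one (so every value is non-negative), following the definitions in the variables table. The `PollTimestamps` class and the sample timestamps are hypothetical, chosen only so that the components match the values above.

```python
from dataclasses import dataclass


@dataclass
class PollTimestamps:
    """Points in time for one poll, in ms (hypothetical container)."""
    api_time: int
    dns_time: int
    conn_start: int
    conn_end: int
    response_start: int
    response_end: int


def response_breakdown(t: PollTimestamps) -> dict[str, int]:
    """Split one poll into its four components and their sum."""
    dns = t.dns_time - t.api_time                # DNS resolution
    conn = t.conn_end - t.conn_start             # connection setup
    first_byte = t.response_start - t.conn_end   # wait for first byte
    download = t.response_end - t.response_start # read the full response
    return {
        "DNSResolveTime": dns,
        "ConnTime": conn,
        "FirstByteTime": first_byte,
        "DownloadTime": download,
        "ResponseTime": dns + conn + first_byte + download,
    }


# Sample timestamps (assumed) relative to the moment the API call is made
poll = PollTimestamps(api_time=0, dns_time=64, conn_start=64,
                      conn_end=286, response_start=415, response_end=426)
breakdown = response_breakdown(poll)
print(breakdown["ResponseTime"])  # 426
```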
Min: Minimum value of all the entries during the selected period
Max: Maximum value of all the entries during the selected period
Average: Sum of Response time of all entries / Total number of entries
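As a brief illustration, the three summary values can be computed as follows. The function name and the sample response times are made up for this sketch and are not taken from a real report.

```python
def summarize_response_times(entries_ms: list[float]) -> dict[str, float]:
    """Min, max, and average over all entries in the selected period."""
    if not entries_ms:
        raise ValueError("no entries in the selected period")
    return {
        "min": min(entries_ms),
        "max": max(entries_ms),
        # Average: sum of response times divided by number of entries
        "average": sum(entries_ms) / len(entries_ms),
    }


# Hypothetical response times (ms) from three polls
summary = summarize_response_times([426, 380, 512])
print(summary["min"], summary["max"], round(summary["average"], 2))
```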