Key Performance Testing Metrics
These are the most commonly used and informative performance testing metrics:
Response Time
Measures the total time it takes a system to respond to a user request. It is one of the most critical metrics because it reflects whether the system is responsive and meeting user expectations. There are four subcategories of response time:
- Minimum Response Time - Measures the shortest amount of time the system takes to respond to a user request. It represents the best-case scenario.
When to Use: Use when assessing the baseline performance of the system under optimal conditions.
- Maximum Response Time - Measures the longest amount of time the system takes to respond to a user request. It represents the worst-case scenario.
When to Use: Use when identifying potential performance bottlenecks or issues under stress or peak load conditions.
- Average Response Time - Measures the sum of all the response times divided by the total number of requests. It represents the typical response time the user will experience.
When to Use: Use when evaluating overall user experience and performance efficiency under normal operating conditions.
- Percentile Response Time - The response time within which the fastest X% of requests complete. It represents the time required for X% of requests to be completed successfully.
When to Use: Use when focusing on responsiveness for the majority of users, identifying outliers, and ensuring consistent performance.
For example, let's say we are evaluating the performance of a video streaming platform's 'play video' functionality. Following the test, we review the results and find that the 90th percentile response time for playing a video is 5 seconds. This means that 90% of the video playback requests completed within 5 seconds, while the remaining 10% exceeded this duration, indicating that some users experienced longer wait times before the video started playing.
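As a rough illustration, here is a small Python sketch that computes these four statistics from a list of recorded request durations; the sample timings are made up for the example:

```python
import statistics

# Hypothetical response times (in seconds) collected during a test run
response_times = [1.2, 0.9, 1.5, 4.8, 1.1, 2.3, 0.8, 5.2, 1.4, 1.0]

minimum = min(response_times)                          # best-case scenario
maximum = max(response_times)                          # worst-case scenario
average = sum(response_times) / len(response_times)    # typical experience
p90 = statistics.quantiles(response_times, n=10)[-1]   # 90th percentile

print(f"Min: {minimum:.2f}s  Max: {maximum:.2f}s  "
      f"Avg: {average:.2f}s  90th percentile: {p90:.2f}s")
```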
Throughput
Throughput = Total no. of requests / Total time taken. Measures the number of requests that can be processed by a system in a given time. It is generally measured in units of bytes per second or transactions per second.
When to Use: Use when assessing the system's capacity to handle a specific workload and the overall efficiency of request processing.
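Applying the formula directly, with assumed request counts and test duration:

```python
total_requests = 12_000     # requests completed during the test window (assumed)
total_time_seconds = 600    # length of the test window in seconds (assumed)

throughput = total_requests / total_time_seconds
print(f"Throughput: {throughput:.1f} requests/second")  # 20.0 requests/second
```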
Error Rate
Error Rate = (Number of failed requests / Total number of requests) x 100. Also known as an error percentage. It measures the percentage of requests that failed or didn’t receive a response. It is an important metric because it identifies the issues and bottlenecks that affect the performance of the system.
When to Use: Use when evaluating the stability and reliability of the system, identifying potential issues, and assessing the impact of errors on user experience.
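A minimal sketch of the error rate calculation, with assumed numbers:

```python
total_requests = 10_000    # assumed totals from a test run
failed_requests = 150

error_rate = (failed_requests / total_requests) * 100
print(f"Error rate: {error_rate:.2f}%")  # 1.50%
```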
CPU Utilization
CPU Utilization (%) = (1 - (Idle time / Total time)) * 100. Measures the percentage of CPU capacity utilized while processing the requests.
When to Use: Use when assessing the system's CPU load, identifying potential performance bottlenecks related to CPU usage, and optimizing resource utilization.
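One way to sample CPU utilization on the machine under test is the third-party psutil library; this is only an illustrative sketch, assuming psutil is installed:

```python
import psutil

# Average CPU utilization measured over a 1-second sampling interval
cpu_percent = psutil.cpu_percent(interval=1)
print(f"CPU utilization: {cpu_percent:.1f}%")
```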
Memory Utilization
Memory Utilization (%) = (Used memory / Total memory) * 100. Measures the amount of memory that is being used by a system or application, compared to the total amount of memory available.
When to Use: Use when evaluating memory usage patterns, identifying memory-related issues, and optimizing memory allocation for improved performance.
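Similarly, a small sketch reading memory usage with psutil (assuming it is installed):

```python
import psutil

mem = psutil.virtual_memory()
used_percent = mem.percent   # equivalent to (used memory / total memory) * 100
print(f"Memory utilization: {used_percent:.1f}% "
      f"({mem.used / 1e9:.2f} GB of {mem.total / 1e9:.2f} GB)")
```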
Average Latency Time
Latency = Processing time + Network transit time. Often referred to simply as "latency". It measures the time it takes for a system or application to respond to a user's request and is generally measured in milliseconds.
When to Use: Use when assessing the time it takes for a user's request to be processed and optimizing response times for enhanced user experience.
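A simple way to measure end-to-end latency for a single HTTP request is to time it with the requests library; the URL here is a placeholder, not a real endpoint:

```python
import time
import requests

url = "https://example.com/api/health"   # placeholder endpoint

start = time.perf_counter()
response = requests.get(url, timeout=10)
latency_ms = (time.perf_counter() - start) * 1000  # processing + network transit

print(f"Status: {response.status_code}, latency: {latency_ms:.1f} ms")
```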
Network Latency
Network Latency = Total response time - Server processing time. Also known as "network delay" or "lag", it refers to the delay that occurs while data is transmitted over a network. It can be caused by factors such as the distance between sender and receiver, limited bandwidth, and the type of network technology used.
When to Use: Use when evaluating the impact of network delays on overall system performance and identifying potential optimizations to reduce latency.
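One rough way to gauge network latency separately from server processing is to time how long a TCP connection takes to open; the host and port below are placeholders, and this is an approximation rather than a precise measurement:

```python
import socket
import time

host, port = "example.com", 443   # placeholder target

start = time.perf_counter()
with socket.create_connection((host, port), timeout=5):
    connect_ms = (time.perf_counter() - start) * 1000

print(f"TCP connect time (approx. network latency): {connect_ms:.1f} ms")
```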
Wait Time
Wait Time indicates how much time elapses from the moment a request is sent to the server until the first byte is received.
Wait time can be viewed from two perspectives: the user's and the application's.
When to Use: Use when assessing the time users spend waiting for responses and identifying delays in processing times from both user and application perspectives.
Wait time = Response time - Processing time (user's perspective). From the user's perspective, wait time is the time spent waiting for the system to respond to a request, for example the time taken to load a page, perform a search, or complete a transaction.
Wait time = Processing time - Queue time (application's perspective). From the application's perspective, wait time is the time the system takes to process a user request after it has been received; delays here can stem from network latency, resource contention, or database performance issues.
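Plugging assumed numbers into the two formulas above:

```python
# User's perspective: time spent waiting beyond the server's processing time
response_time = 2.5     # seconds, assumed total response time
processing_time = 1.8   # seconds, assumed server processing time
user_wait = response_time - processing_time
print(f"Wait time (user's perspective): {user_wait:.2f}s")

# Application's perspective, per the formula above
queue_time = 0.4        # seconds, assumed time spent queued
app_wait = processing_time - queue_time
print(f"Wait time (application's perspective): {app_wait:.2f}s")
```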
Concurrent User Capacity
Refers to the maximum number of users that can use a system or application simultaneously without degrading performance or causing errors.
When to Use: Use when determining the system's capacity to handle concurrent users, identifying potential performance limitations, and optimizing for scalability.
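A very simplified sketch of probing a given concurrency level with simulated users; the request function and URL are placeholders, and a real load-testing tool would add ramp-up, error handling, and reporting:

```python
import time
from concurrent.futures import ThreadPoolExecutor
import requests

URL = "https://example.com/api/health"   # placeholder endpoint

def simulated_user() -> float:
    """Send one request and return its response time in seconds."""
    start = time.perf_counter()
    requests.get(URL, timeout=10)
    return time.perf_counter() - start

concurrent_users = 50   # assumed concurrency level to probe
with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
    times = list(pool.map(lambda _: simulated_user(), range(concurrent_users)))

print(f"{concurrent_users} concurrent users, "
      f"avg response time: {sum(times) / len(times):.2f}s")
```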
Transaction Pass/Fail
Transaction pass rate = (No. of successful transactions / Total transactions) x 100. A transaction passes when it completes as expected without any error or delay.
Transaction fail rate = (No. of failed transactions / Total transactions) x 100. A transaction fails when it is initiated and attempted but cannot complete because of an error. For example, a user enters incorrect payment details, which causes the payment to fail.
When to Use: Use when evaluating the success rate of transactions, identifying potential issues, and ensuring transactional reliability.
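With assumed transaction counts, the two rates are computed as:

```python
total_transactions = 5_000      # assumed totals from a test run
successful_transactions = 4_870
failed_transactions = total_transactions - successful_transactions

pass_rate = (successful_transactions / total_transactions) * 100
fail_rate = (failed_transactions / total_transactions) * 100
print(f"Pass rate: {pass_rate:.2f}%  Fail rate: {fail_rate:.2f}%")
```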
Standard Deviation
Refers to a statistical measure of the dispersion or variability of a dataset relative to its mean. It provides insights into the spread of data points around the mean value, indicating how much individual data points deviate from the average.
When to Use: Use to assess the consistency and stability of performance metrics over multiple test iterations. Standard deviations help in identifying outliers or anomalies in the data that may indicate potential performance issues or fluctuations in system behavior. Additionally, standard deviation can be useful for comparing the variability of performance metrics across different test scenarios or configurations, enabling testers to optimize system performance and ensure reliability under varying conditions.
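For example, a quick sketch of computing the standard deviation of response times across test iterations with Python's statistics module; the sample values are made up:

```python
import statistics

# Hypothetical average response times (seconds) from repeated test iterations
iteration_averages = [1.21, 1.18, 1.25, 1.19, 1.90, 1.22]

mean = statistics.mean(iteration_averages)
stdev = statistics.stdev(iteration_averages)   # sample standard deviation

print(f"Mean: {mean:.2f}s, standard deviation: {stdev:.2f}s")
# A deviation that is large relative to the mean flags inconsistent runs
# (here, the 1.90s iteration is an outlier worth investigating).
```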