What exactly is the 90th percentile?
Let's understand it with an example. Suppose you have 10 sheep, and each sheep eats a few kilograms of grass every day. One day you weigh the grass and note each sheep's intake, as shown in the table below:
Sheep# S1 S2 S3 S4 S5 S6 S7 S8 S9 S10
Grass(kg) 3 3.2 4 4.8 3.6 2.9 3.4 3 3.8 3.9
Now you need to find the amount of grass that 90% of the sheep stay at or below. To do that, sort the sheep by the amount of grass consumed, in increasing order, and ignore the last value.
1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
Sheep# S6 S1 S8 S2 S7 S5 S9 S10 S3 S4
Grass(kg) 2.9 3 3 3.2 3.4 3.6 3.8 3.9 4 4.8
With 10 entries, the 90th percentile is the 9th value, which is 4 kg, so just ignore S4 with 4.8 kg (keep it hungry for some days, it eats so much).
The conclusion is that 90% of the sheep eat 4 kg of grass or less, which gives you an upper limit for grass consumption. In performance testing terms, you sort the response times of a particular transaction or request in increasing order and then ignore the 10% of values at the high end. The highest of the remaining values is the 90th percentile.
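The sheep example can be sketched in a few lines of Python. This is only an illustration of the sort-and-ignore idea described above; the variable names are my own, and the data is taken from the table.

```python
# Grass intake per sheep in kg, taken from the table above.
grass_kg = {
    "S1": 3.0, "S2": 3.2, "S3": 4.0, "S4": 4.8, "S5": 3.6,
    "S6": 2.9, "S7": 3.4, "S8": 3.0, "S9": 3.8, "S10": 3.9,
}

# Sort the sheep by intake, lowest first.
ranked = sorted(grass_kg.items(), key=lambda item: item[1])

# Ignore the top 10% of entries (1 sheep out of 10).
ignored = len(ranked) // 10
kept = ranked[:len(ranked) - ignored]

# The highest remaining value is the 90th percentile.
sheep, p90 = kept[-1]
print(sheep, p90)  # S3 4.0
```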
Example:
A performance test script is executed for 25 iterations. The response time of the login transaction of each iteration is:
S. No. Iteration No. Login (Response Time (in sec))
1 1 1.5
2 2 1.6
3 3 1.1
4 4 0.9
5 5 2.1
6 6 1.9
7 7 1.4
8 8 1
9 9 0.8
10 10 1.5
11 11 1.8
12 12 1.1
13 13 1.6
14 14 1.7
15 15 1.3
16 16 0.9
17 17 1
18 18 1.5
19 19 2.3
20 20 1.9
21 21 1.8
22 22 1.2
23 23 1.4
24 24 0.9
25 25 1.5
Now, sort the list in increasing order with respect to response time.
S. No. Iteration No. Login (Response Time)
1 9 0.8
2 4 0.9
3 16 0.9
4 24 0.9
5 8 1
6 17 1
7 3 1.1
8 12 1.1
9 22 1.2
10 15 1.3
11 7 1.4
12 23 1.4
13 1 1.5
14 10 1.5
15 18 1.5
16 25 1.5
17 2 1.6
18 13 1.6
19 14 1.7
20 11 1.8
21 21 1.8
22 6 1.9
23 20 1.9
24 5 2.1
25 19 2.3
Now, 90% of the number of transactions (25) is 22.5:
=> 25 x (90/100) = 22.5
Round this up to 23. So the 23rd value, 1.9 seconds, is the 90th percentile. It means that 90% of the iterations have a response time of 1.9 seconds or less. You can calculate other percentile values, such as the 70th, 80th or 95th percentile, in the same way.
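The same calculation can be sketched in Python for any percentile. The helper name percentile_by_rank is hypothetical, and the sketch assumes that a fractional position is always rounded up (22.5 becomes 23). Note that Python's built-in round() would turn 22.5 into 22, so math.ceil is used instead.

```python
import math

# Login response times in seconds for iterations 1 to 25, from the table above.
login_times = [
    1.5, 1.6, 1.1, 0.9, 2.1, 1.9, 1.4, 1.0, 0.8, 1.5,
    1.8, 1.1, 1.6, 1.7, 1.3, 0.9, 1.0, 1.5, 2.3, 1.9,
    1.8, 1.2, 1.4, 0.9, 1.5,
]

def percentile_by_rank(values, pct):
    """Return the value at position ceil(count * pct / 100) in the sorted list.

    Hypothetical helper; assumes fractional positions are rounded up.
    """
    ordered = sorted(values)
    rank = math.ceil(len(ordered) * pct / 100)  # 1-based position
    rank = max(rank, 1)                         # guard for very small pct
    return ordered[rank - 1]

for pct in (70, 80, 90, 95):
    print(pct, percentile_by_rank(login_times, pct))
```

For the 25 login values this gives 1.6, 1.8, 1.9 and 2.1 seconds for the 70th, 80th, 90th and 95th percentile respectively.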
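How is the 90th percentile calculated in MS Excel?
MS Excel uses the formula below to find the position (rank) of the 90th percentile in the sorted list:
90th Percentile Rank = 0.9 * (Number of Values – 1) + 1
When the rank is not a whole number, Excel interpolates between the two neighbouring values, so its result can differ slightly from the one obtained by rounding the rank. For the 25 login values above, the rank is 0.9 * 24 + 1 = 22.6, which falls between the 22nd and 23rd values; both are 1.9 seconds, so the result is again 1.9 seconds.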
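As a rough sketch of how this formula can be applied, the Python snippet below treats its result as a 1-based position in the sorted list and interpolates linearly when that position is fractional. This reflects my understanding of what Excel's PERCENTILE function does; the helper name percentile_interpolated is hypothetical.

```python
import math

def percentile_interpolated(values, pct):
    """Percentile using rank = pct/100 * (n - 1) + 1 with linear interpolation.

    Hypothetical helper, assumed to mirror the Excel formula above.
    """
    ordered = sorted(values)
    rank = pct / 100 * (len(ordered) - 1) + 1  # 1-based, may be fractional
    lower = math.floor(rank)                   # position of the value just below
    fraction = rank - lower                    # how far towards the next value
    if lower >= len(ordered):                  # rank points at the last value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

# Grass intake of the 10 sheep in kg, from the first example.
grass_kg = [3.0, 3.2, 4.0, 4.8, 3.6, 2.9, 3.4, 3.0, 3.8, 3.9]
print(round(percentile_interpolated(grass_kg, 90), 2))
```

For the sheep data the rank is 9.1, so the result lies one tenth of the way from 4 kg to 4.8 kg, about 4.08 kg, instead of the 4 kg obtained by simply taking the 9th value.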
Why do we need the 90th percentile in performance testing?
A percentile is often used as a performance goal. If the given SLA defines a 90th percentile NFR and the test meets it, then 90% of the users have an experience that matches your performance goals. This gives the client additional confidence in the application.
Sometimes the average response time appears extremely high even though most individual data points look normal. Even a couple of peaks in response time skew the average and distort the test result. In such scenarios, the 90th percentile (or another percentile value) excludes the unusual spikes from the result.
In reality, most applications show very few high spikes in the graph; a statistician would say that the curve has a long tail. A long tail does not imply many slow transactions, but a few that are orders of magnitude slower than the norm. In that case, the 90th percentile is helpful because it ignores the 10% of requests that contain the spikes, which can be disregarded.
If the 50th percentile (median) of response time is 5 seconds, it means that 50% of the transactions take 5 seconds or less. If the 90th percentile of the same transaction is 8 seconds, it means that 90% take 8 seconds or less and only 10% are slower. The average, in this case, could be lower than 5 seconds, somewhere in between, or, with a few extreme spikes, even higher than 8 seconds. A percentile gives a much better sense of real-world performance because it shows a slice of the response time curve.
If we take the difference between the 90th percentile value and the average response time and divide it by the average response time, the result gives an idea of the spread of the data points. If the ratio is very small, the average and the 90th percentile are close to each other, which indicates good and consistent performance of the application. If the ratio is large, it shows high deviation in response time and non-uniform performance of the application. This is one of the ways in which the 90th percentile is useful, although I would recommend drawing your conclusion from the standard deviation only.
Percentiles are a great and easy way to understand the real performance characteristics of your application. They also provide a good basis for automatic baselining, application behavioural learning and optimizing your application with a proper focus. Averages, by contrast, are ineffective because they are too simplistic and one-dimensional. In short, percentiles (90th, 95th, 99th) are great in the performance testing world!
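A small Python sketch makes the effect visible. The numbers below are made up for illustration (18 normal responses and 2 spikes), and the percentile is taken by sorting and picking the value at the rounded-up position.

```python
import math
import statistics

# Hypothetical data: 18 normal responses of 1.0 s and 2 spikes of 11.0 s.
response_times = [1.0] * 18 + [11.0] * 2

average = statistics.mean(response_times)

# 90th percentile: sort, then take the value at position ceil(n * 0.9).
ordered = sorted(response_times)
rank = math.ceil(len(ordered) * 90 / 100)
p90 = ordered[rank - 1]

print(average, p90)  # 2.0 1.0
```

The two spikes double the average to 2.0 seconds, while the 90th percentile stays at 1.0 second, the value that 18 of the 20 requests actually saw.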
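The ratio described here can be sketched in Python as follows, using the 25 login response times from the earlier example. The helper name spread_ratio is hypothetical, and the percentile is again taken by sorting and rounding the position up. The sample standard deviation is printed alongside for comparison.

```python
import math
import statistics

def spread_ratio(values, pct=90):
    """Return (percentile - average) / average for the given values.

    Hypothetical helper; the percentile is the value at position
    ceil(count * pct / 100) of the sorted list.
    """
    ordered = sorted(values)
    rank = max(math.ceil(len(ordered) * pct / 100), 1)
    percentile = ordered[rank - 1]
    average = statistics.mean(ordered)
    return (percentile - average) / average

# Login response times in seconds for iterations 1 to 25.
login_times = [
    1.5, 1.6, 1.1, 0.9, 2.1, 1.9, 1.4, 1.0, 0.8, 1.5,
    1.8, 1.1, 1.6, 1.7, 1.3, 0.9, 1.0, 1.5, 2.3, 1.9,
    1.8, 1.2, 1.4, 0.9, 1.5,
]

print(round(spread_ratio(login_times), 2))       # ratio of spread to average
print(round(statistics.stdev(login_times), 2))   # sample standard deviation
```

For the login data the average is 1.428 seconds and the 90th percentile is 1.9 seconds, so the ratio comes out at about 0.33.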