BKP_sreenivas: August 2021

Monday, August 30, 2021

Effort Estimation

Effort Estimation :

Project Analysis & Understanding : 5 Days

Requirement Gathering : 3 Days

Planning : 2 Days

Scripting :

Simple : 2 Days

Normal : 2 Days

Complex : Based on Application Behaviour

Design Scenario : 2 Days

Execution & Analysis : 3 Days
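The figures above can be added up with a small helper. This is only a sketch: it assumes the "2 Days" scripting figures are per script, and it leaves the complex-script effort as an input because the notes say it depends on application behaviour. The function and constant names are made up for illustration.

```python
# Fixed phase durations in days, taken from the list above.
FIXED_PHASES = {
    "Project Analysis & Understanding": 5,
    "Requirement Gathering": 3,
    "Planning": 2,
    "Design Scenario": 2,
    "Execution & Analysis": 3,
}
DAYS_PER_SIMPLE_SCRIPT = 2  # assumption: the figure is per script
DAYS_PER_NORMAL_SCRIPT = 2  # assumption: the figure is per script


def estimate_effort_days(simple_scripts, normal_scripts, complex_days=0):
    """Return the total effort in days.

    complex_days has no fixed value in the notes, so the estimator supplies it.
    """
    scripting = (simple_scripts * DAYS_PER_SIMPLE_SCRIPT
                 + normal_scripts * DAYS_PER_NORMAL_SCRIPT
                 + complex_days)
    return sum(FIXED_PHASES.values()) + scripting


# Three simple scripts and two normal scripts, no complex ones.
print(estimate_effort_days(simple_scripts=3, normal_scripts=2))
```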

Test Planning :-

Document Purpose

Overview

References

Architecture

Scope of Testing

In Scope

Out of Scope

Test Approach

Test Scenario

Test Entry & Exit Criteria

Suspension & Resumption Criteria

Environment Details

Test Data

Tools

Test Schedules

Stub Approach

Metrics

Workload Analysis

Risks, Assumptions, Dependencies, Constraints.

Success Criteria

Open Items, Acronyms.

web_get_int_property(HTTP_INFO_RETURN_CODE);

This function returns the HTTP status code of the last request, so it can be used to check whether the script accessed the requested page successfully.

SSL Certificate :-

web_set_sockets_option("SSL_VERSION", "TLS1.2");

web_set_sockets_option("SSL_CIPHER_LIST", "ECDHE-RSA-AES128-GCM-SHA256");

Error :- If the SSL handshake fails, use web_set_sockets_option to supply a suitable cipher (eg :- a SHA256 cipher suite).

Protocol : TLS 1.3

Key Exchange Group : X25519

Signature Algorithm : ECDSA

CIPHER : AES-128-GCM

Hash Algorithm : SHA-256

Note :-

If the protocol is HTTP :- use web_set_proxy("proxyName:<port>");

If the protocol is HTTPS :- use web_set_secure_proxy("proxyName:<port>");

Thursday, August 26, 2021

HEAP_ Memory Leak

Garbage Collection in Java – What is GC and How it Works in the JVM (freecodecamp.org)

The JVM allocates its memory dynamically, and that memory resides in RAM.

Definition :- Once an object is no longer in use, its memory should be freed. If it is never freed, that is called a memory leak.

-> To avoid an OutOfMemoryError, increase the heap size, for example from

512 MB to 1024 MB, or else from ...... MB to ......MB. If the cause is a memory leak, this only postpones the error.

-> If there is no space left in the heap for creating new objects, Java throws java.lang.OutOfMemoryError.

Q : What will happen if HEAP size is low ?

Ans : The number of GC cycles increases and GC runs very frequently.

Each cycle causes a "stop-the-world" event.

It literally halts the application.

All application threads are suspended until the collection finishes; they are not terminated.

Heap is the space where the Objects & References are stored.

All the Object related data is stored in HEAP Area.

(Object related data means object information, object runtime data and all instance variable information. It can be accessed by multiple threads.)

Each JVM has its own memory segment, called heap memory, which stores the Java objects.

These Java objects can be grouped based on their age, like

Young Generation,
Old Generation and
Permanent Generation (replaced by Metaspace, which lives outside the heap, since Java 8).

Note :- Every JVM has only one heap area.

Heap and GC settings largely decide the performance of the JVM.

GC is a mechanism that provides automatic memory management.

It has its own GC algorithms, which are used to remove unreachable (unreferenced) objects from memory.

Live Objects are Reachable

Dead Objects are Not Reachable

Some Important Terms :

Live Objects
Dead Objects
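Daemon Thread
System.gc();

Thread :- Daemon Threads and Non-Daemon Threads

Daemon Thread : Background service thread, typically created by the JVM (for example the GC thread). It does not keep the JVM alive.

Non-Daemon Thread : Created by the application

Note :- The main thread is always a non-daemon thread.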

Native Thread : 1. Threads in this model are managed by the JVM with the help of underlying OS support.

2. These threads are implemented at the OS level and managed in kernel space.

GC is the process of automatically freeing objects that are no longer referenced by the program.

Each generation has its own memory segment within the heap.

When this segment is full, GC deletes all the unreferenced objects that are marked as garbage to create space.

If the free heap decreases gradually over time, it is clear that there is a problem.

If you are running a load test for a long time and you observe that the free heap keeps decreasing, one possible cause is a memory leak.

Generally an object becomes eligible for garbage collection in Java on following cases:

1) All references to the object are explicitly set to null, e.g. object = null

2) The object is created inside a block and its reference goes out of scope once control exits that block.

3) The parent object is set to null: if an object holds a reference to another object and you set the container object's reference to null, the child (contained) object automatically becomes eligible for garbage collection, provided nothing else references it.

4) If an object is reachable only through weak references, for example as a key in a WeakHashMap, it is eligible for garbage collection.

Note :- Heap memory is different from the physical memory (RAM) of the machine.

We can identify memory leaks by running a load test on the application for a prolonged duration of 10 to 12 hours and monitoring the memory utilization.
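One way to read the memory utilization collected during such a long run is to fit a straight line through the samples and look at its slope. The sketch below assumes the samples are (elapsed hours, used memory in MB) pairs exported from whatever monitoring tool is in use; the 5 MB per hour threshold is an arbitrary illustration, not a recommended value.

```python
def slope_per_hour(samples):
    """Least-squares slope of (hours, used_mb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must cover more than one point in time")
    return num / den


def looks_like_leak(samples, threshold_mb_per_hour=5.0):
    """Flag a steady upward trend; the threshold is a hypothetical default."""
    return slope_per_hour(samples) > threshold_mb_per_hour


growing = [(0, 500), (1, 520), (2, 540), (3, 560)]
steady = [(0, 500), (1, 505), (2, 498), (3, 502)]
print(looks_like_leak(growing), looks_like_leak(steady))
```

A rising slope is only a hint: a cache that is still warming up produces the same picture, so the trend should be confirmed with a heap dump.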

Note : -

If the application has a memory leak, Full GC starts to run repeatedly without reclaiming memory.

When Full GC runs repeatedly, CPU usage spikes and does not come down.

Sol :- To resolve the problem, the memory leak has to be fixed: identify the root cause from a thread dump / heap dump and fix it.

Reasons for CPU spikes :-

1. Repeated Full GC

2. Non-terminating loops

3. Unsynchronized concurrent access to java.util.HashMap.

GC : Collecting unused objects

If the Young Generation fills up and memory cannot be freed to create new objects, it is an application code (developer) issue.

High GC activity can lead to high CPU utilization and slow response times. This is due to the suspensions (GC pauses).

GC Mechanism :

Minor GC :- Reclaims memory from the Young Generation

Full GC :- Reclaims memory from both generations

JMap : to take HEAP Dump.

JHat : To analyze HEAP Dump

Syntax : jmap -dump:[live],format=b,file=<file-path> <pid>

Eg : jmap -dump:format=b,file=/opt/tmp/heapdump.bin 37320

How do you find JVM free Memory ?

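Ans : -Xms and -Xmx set the initial and maximum heap size; inside a running JVM, Runtime.getRuntime().freeMemory() reports the free heap.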

Note : To capture a heap dump automatically when the JVM runs out of memory :

java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<file-or-dir-path>

When heap usage exceeds -Xmx, the JVM throws an OutOfMemoryError.

To check Memory Leaks : System Resource Graph -> Windows Resource Graph.

Threads do not occupy memory space within the Java heap.

They occupy memory space outside the Java heap (the thread stacks).

This is one of the reasons why a Java process typically uses more memory than the -Xmx value.
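The point about threads can be turned into a rough estimate. The sketch below assumes a stack size of 1 MB per thread, which is only an illustrative default, and it ignores every other off-heap area (class metadata, code cache, direct buffers), so the result is a lower bound at best.

```python
def estimated_process_mb(xmx_mb, thread_count, stack_mb_per_thread=1.0):
    """Heap ceiling plus thread stacks, in MB.

    stack_mb_per_thread is an assumed value; other off-heap areas are ignored.
    """
    return xmx_mb + thread_count * stack_mb_per_thread


# A JVM capped at 1024 MB of heap that runs 200 threads.
print(estimated_process_mb(1024, 200))
```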

Memory Leak Symptoms :

Performance symptoms of a memory leak in the application :

-> The heap memory used by the application increases over time.

-> Responses slow down gradually due to memory congestion.

-> Aggressive GC activity.

-> The heap dump shows a lot of retained objects (of the leaking types).

-> A sudden increase in memory paging, as reported by the OS monitoring tools.

-> OutOfMemoryError occurs frequently in the logs, and sometimes an application server restart is required.

Effects of a Memory Leak :

-> Slow system performance.

-> Crashes in running programs.

-> The system may hang.

-> A system reboot releases all resources, including the leaked memory.

Main Causes of OutOfMemoryError :

-> Most commonly a memory leak.

-> GC is not freeing up memory, for example because deadlocked threads still hold references to the objects.

-> Memory Fragmentations.

-> Excess GC Overhead

-> Allocating oversized Temp Objects.

-> Non Terminated Loops

-> Repeated full GC

Note : If the application creates too many threads, it can result in "java.lang.OutOfMemoryError: unable to create new native thread".

NOTE : When you see high response times along with a memory leak, first carry out heap dump analysis, then check the GC frequency and carry out GC analysis to identify the root cause.

NOTE : If the free heap decreases gradually over time, it is clear that there is a problem on the server.

If you are running a load test for a long time and you observe that the free heap keeps decreasing, one possible cause is a memory leak.

NOTE : If total memory usage is increasing but the logical thread count and managed heap (# Bytes in All Heaps) counters are not increasing, there is a leak in unmanaged memory.

HEAP :

The JVM creates heap memory for allocated objects.

A heap dump is a snapshot of that memory; it contains the Java objects and class information.

Q : Can we change the heap settings during load testing ?

Ans : Yes, we can, but it affects the test results.

If you change the heap settings during load testing, you need to restart the JVM.

Q : When do we get an OutOfMemoryError ?

Ans : When heap usage exceeds -Xmx, the JVM throws an OutOfMemoryError.

The minimum (initial) heap size is set with -Xms

The maximum heap size is set with -Xmx

-Xmn :- sets the initial and maximum size of the heap for the Young Generation.

Q : Where do we increase the heap size ?

Ans : In Tomcat, in the Catalina startup script (catalina.sh / catalina.bat)

In WebLogic, in the setDomainEnv.cmd file

set WLS_MEM_ARGS_64BIT=-Xms512m -Xmx1024m

set WLS_MEM_ARGS_32BIT=-Xms512m -Xmx512m
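Heap settings like these are easy to mistype, so it can help to sanity-check them before a test run. The sketch below parses flags written in the standard compact form (for example -Xms512m) and reports obvious inconsistencies. The function names and the two checks are illustrative assumptions, not part of any server tooling.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
# Matches -Xms, -Xmx or -Xmn followed by a number and an optional unit.
_FLAG = re.compile(r"-(Xms|Xmx|Xmn)(\d+)([kKmMgG]?)")


def parse_heap_flags(args):
    """Return a dict of flag name to size in bytes."""
    sizes = {}
    for name, number, unit in _FLAG.findall(args):
        sizes[name] = int(number) * _UNITS[unit.lower()]
    return sizes


def check_heap_flags(args):
    """Return a list of human-readable problems (empty when none are found)."""
    sizes = parse_heap_flags(args)
    problems = []
    if "Xms" in sizes and "Xmx" in sizes and sizes["Xms"] > sizes["Xmx"]:
        problems.append("-Xms is larger than -Xmx")
    if "Xmn" in sizes and "Xmx" in sizes and sizes["Xmn"] >= sizes["Xmx"]:
        problems.append("-Xmn leaves no room for the old generation")
    return problems


print(check_heap_flags("-Xms512m -Xmx1024m"))
print(check_heap_flags("-Xms2g -Xmx1g"))
```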

Note : Serial GC is suitable only for standalone and desktop applications.

If we use it for a distributed application, we will get serious performance issues.

CPU :-

If CPU is a bottleneck :

Add a longer pacing time between iterations

Add wait steps of a second or two

If memory is a bottleneck, run fewer Vusers

High CPU utilization occurs whenever excessive GC cycles are running

High GC activity can lead to high CPU utilization and slow response times; this is due to the suspensions (GC pauses).

Flags :-

To check or change the type of GC

java -XX:+PrintCommandLineFlags -version

java -XX:+UseG1GC -XX:+PrintCommandLineFlags -version

java -XX:+UseSerialGC -XX:+PrintCommandLineFlags -version

Memory usage increases for one of two reasons :

Bad GC / memory settings, or an application problem

The application problem is either a memory leak or one of several other memory problems.

When do you take HEAP Dump ?

whenever available Memory goes down
Some sudden failures of Application

eg : sudden Response Time Increase

High CPU utilization (it occurs whenever excessive GC cycles are running)

In 100% Memory 40% Filled

When do you take Thread Dump ?

Extremely poor Response Time.
Hung Application
Very Slow startup
Unusual amount of Resource Utilization.

Note : Before generating a thread dump we must know the Java process ID.

We can get that ID by using the jps command on a UNIX server.

eg : $ jps

Note - The performance symptoms below usually appear in case of thread blocking.

1. Slow application response.

2. App server logs might show some stuck threads

3. The server's health status becomes critical on monitoring tools.

4. Frequent App Server restarts manually or Automatically.

5. Thread Dump shows a lot of Threads in the Blocked status waiting for different resources.

6. App Profiling shows a lot of Thread blocking.

What is Memory leak:

A memory leak occurs whenever a computer program consumes memory but is unable to release it back to the operating system. A memory leak has symptoms similar to a number of other problems and generally can only be diagnosed by a programmer with access to the program source code; however, many people refer to any unwanted increase in memory usage as a memory leak, though this is not strictly accurate.

Monitor the memory for any consistent increase, and also watch for any degradation in CPU performance. Is it a memory leak?

Note that constantly increasing memory usage is not necessarily evidence of a memory leak.

Some applications will store ever increasing amounts of information in memory (e.g. as a cache). If the cache can grow so large as to cause problems, this may be a programming or design error, but is not a memory leak as the information remains nominally in use.

In other cases, programs may require an unreasonably large amount of memory because the programmer has assumed memory is always sufficient for a particular task; for example, a graphics file processor might start by reading the entire contents of an image file and storing it all into memory, something that is not viable where a very large image exceeds available memory.

To put it another way, a memory leak arises from a particular kind of programming error, and without access to the program code, someone seeing symptoms can only guess that there might be a memory leak. It would be better to use terms such as "constantly increasing memory use" where no such inside knowledge exists.

The term "memory leak" is evocative and non-programmers especially can become so attached to the term as to use it for completely unrelated memory issues such as buffer overrun.

Checking for Leaks: There are a number of telltale signs that an application is leaking memory.

Maybe it's throwing an OutOfMemoryException.
Maybe its responsiveness is growing very sluggish because it started swapping virtual memory to disk.
Maybe memory use is gradually (or not so gradually) increasing in Task Manager.

When a memory leak is suspected, you must first determine what kind of memory is leaking, as that will allow you to focus your debugging efforts in the correct area.

Use PerfMon to examine the following performance counters for the application:

Process/Private Bytes:

The Process/Private Bytes counter reports all memory that is exclusively allocated for a process and can't be shared with other processes on the system.

Test: If Process/Private Bytes is increasing, but # Bytes in All Heaps remains stable, unmanaged memory is leaking.

.NET CLR LocksAndThreads/# of current logical Threads:

The .NET CLR LocksAndThreads/# of current logical Threads counter reports the number of logical threads in an AppDomain.

Test: If an application's logical thread count is increasing unexpectedly, thread stacks are leaking.

Test: If both Process/Private Bytes and # Bytes in All Heaps are increasing, memory in the managed heaps is building up.

.NET CLR Memory/# Bytes in All Heaps:

The .NET CLR Memory/# Bytes in All Heaps counter reports the combined total size of the Gen0, Gen1, Gen2, and large object heaps.

Test:

By default, the stack size on modern desktop and server versions of Windows is 1 MB. So if an application's Process/Private Bytes is periodically jumping in 1 MB increments with a corresponding increase in .NET CLR LocksAndThreads/# of current logical Threads, a thread stack leak is very likely the culprit.

Test:

If total memory use is increasing, but the counters for logical thread count and # Bytes in All Heaps (measuring managed heap memory) are not increasing, there is a leak in unmanaged memory.
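The PerfMon tests above amount to a small decision table. The sketch below encodes it, reading "both counters" as Process/Private Bytes and # Bytes in All Heaps, which is an interpretation of the notes rather than something they state outright. The is_rising helper and its 5% tolerance are hypothetical; real counter logs need a more careful trend test.

```python
def is_rising(samples, tolerance=0.05):
    """Crude trend test: the last sample exceeds the first by more than the tolerance."""
    return samples[-1] > samples[0] * (1 + tolerance)


def classify_leak(private_bytes_rising, heap_bytes_rising, threads_rising):
    """Map the three counter trends to the kind of memory that is probably leaking."""
    if threads_rising:
        return "thread stacks"
    if private_bytes_rising and heap_bytes_rising:
        return "managed heap"
    if private_bytes_rising:
        return "unmanaged memory"
    return "no leak indicated"


private_bytes = [800, 860, 950]   # MB, made-up samples
heap_bytes = [300, 302, 301]      # MB, made-up samples
thread_count = [40, 40, 41]
print(classify_leak(is_rising(private_bytes),
                    is_rising(heap_bytes),
                    is_rising(thread_count)))
```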

Alternative method:

Start by monitoring the response times, throughput, total TPS, etc. If you are not monitoring the runtime environment or system resources in the first instance, this is where you should see the impact. It may or may not be a memory leak.

Look at the memory profile of the server hosting the runtime environment and at the application server logs. If out-of-memory errors are recorded in the logs, it may or may not be a memory leak. Check heap usage and the GC logs. It could be a memory leak if the heap is full and no memory is released after GC. If there is enough heap but the JVM is still kicking off GCs to free memory, the PermGen space might be full, or there could be some other reason.

If it is a memory leak, the JVM will be thrashing and hogging all the CPU. You won't see any load on the downstream systems. Plotting a graph from the GC logs would show an increase in the heap troughs.
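The "heap troughs" are the low points that heap usage falls back to after each collection. A minimal sketch of extracting them from a series of heap-usage samples follows; it assumes the samples have already been pulled out of the GC log as plain numbers in MB, and the function names are made up for illustration.

```python
def heap_troughs(used_mb):
    """Return the local minima of a heap-usage series (the post-GC low points)."""
    return [
        cur
        for prev, cur, nxt in zip(used_mb, used_mb[1:], used_mb[2:])
        if cur < prev and cur <= nxt
    ]


def troughs_rising(used_mb):
    """True when there are at least two troughs and each is higher than the last."""
    troughs = heap_troughs(used_mb)
    return len(troughs) >= 2 and all(b > a for a, b in zip(troughs, troughs[1:]))


leaking = [100, 300, 120, 320, 150, 340, 190, 360]
healthy = [100, 300, 110, 300, 105, 300, 108, 300]
print(heap_troughs(leaking), troughs_rising(leaking))
print(heap_troughs(healthy), troughs_rising(healthy))
```

In the leaking series the floor keeps climbing even though each collection frees some memory, which is the sawtooth-with-a-rising-base shape the paragraph describes.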

Above is just one example and there could be many many variations to this. You can simulate a memory leak yourself, just google it and you will find code to both induce and fix it.

As you might guess, memory leak, if left unattended and not corrected, could prove to be fatal. Memory leaks can be found out by running tests for long duration (say about an hour) and continuously checking memory usage.

Issues caused by memory leaks are essentially based on two variables for a standalone windows application

1) Frequency of usage

2) Size of memory leak.

If either one or both are very high, the computer might come to a point when no memory is available for other applications. This could lead to a computer crash. If it is a network based application then you will also have to consider network traffic. If each network transaction causes a memory leak, then a high volume of network transactions could also prove dangerous.

Endurance Testing

Soak testing, endurance testing or stability testing; the terms are largely interchangeable but refer to a test that is designed largely around running a controlled load, usually a little lower than business-as-usual, over a prolonged period. This tests for resource consumption and non-release that can only be identified over time. Gradual memory leaks are the classic example, where available memory might reduce little by little. In the short term this will have no effect on the system, but over time could be a major resourcing issue.

Log files tend to grow over time, and past a certain point may cause system issues if there is insufficient disk space, garbage clean up processing, etc. This small cumulative growth and non-release can all be issues that are only apparent over time.

So how long should a soak test run? As long as possible. In reality you should aim for anything over 12 hours. Ideally a soak test would run for 72 hours, so the system can demonstrate its ability to hold things together for a period longer than a weekend's-worth of non-stop operation. If the system continues to operate normally for 72 hours, it gets over the period it generally will need to be supported by out-of-hours on-call staff. This is generally the critical length of time a system needs to be able run "unattended," as weekend support is far more costly to an organization than business-hour support.

Note :

It is used to find memory leaks,

It is used to find connection failures between different layers,

It is used to find DB connection failures.

============

if your LG exceeds the 80% CPU and you get that warning, i'd treat the results with suspicion.

the next questions are points you need to consider:

How many LGs did you use for this test

Did you get the error on all your LGs

How many Vusers did you run in this test?

At what point in your ramp up did you start getting the CPU error messages?

Once you answer all these, try the following:

Run with a lower amount of Vusers on your LG and see if the error messages still pop up, or

Run your test with more than 1 LG and distribute your Vusers accordingly between the LGs

Improve the HW on your LG

====================

Memory is a holding place for instructions and data that the microprocessor can reach quickly. While processing, the memory contains the main parts of the operating system and some or all of the application programs and related data that are currently being used.

Below diagram can give you some inference on where the data can be stored and which storage areas can render faster response.

Threads & Process

When looking at memory, two elements come into the picture.

Process
Threads

A process is one instance of a running application, together with all the memory and other resources associated with it. A thread, on the other hand, is one path of execution through the application's code. A process can consist of one or many threads of execution.

So what is a bottleneck?

A component is said to be a bottleneck when its performance limits the optimum performance of other components. It can be caused by any of the following:

Insufficient resources
Resource malfunctioning
Uneven workload sharing
Incorrect configuration of resources
Monopolization of a particular resource

Paging/Swapping

Paging is the movement of pages of data between disk and main memory. When there are more pages in use than the RAM can hold, the system uses 'virtual memory'. When execution demands a page that is not currently in real memory (RAM), the system moves some other pages out to virtual memory on disk and brings the required page into RAM. The state of excessive paging is called thrashing: the CPU does less 'productive' work and more 'swapping'. This can lead to a major memory bottleneck.
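The effect of too little RAM can be illustrated by counting page faults for a sequence of page references. The sketch below uses least-recently-used replacement, which is an assumption made for the example; real operating systems use their own policies. It shows how one frame too few turns a small working set into a fault on every access, which is what thrashing looks like.

```python
from collections import OrderedDict


def count_page_faults(references, frames):
    """Count page faults for a reference string with LRU replacement."""
    if frames < 1:
        raise ValueError("frames must be at least 1")
    resident = OrderedDict()  # pages in RAM, least recently used first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
        else:
            faults += 1                       # page must be brought in from disk
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


references = [1, 2, 3, 1, 2, 3, 1, 2, 3]
print(count_page_faults(references, frames=3))  # working set fits in RAM
print(count_page_faults(references, frames=2))  # one frame short
```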

Performance Counters

Let’s see some of the memory counters now and how these counters can be added in a Windows OS.

Memory Object

The memory object describes the behaviors of physical and virtual memory on the computer

Physical memory is the amount of RAM on the computer. Virtual memory consists of space in physical memory and on disk.

How to add memory counters:

Go to perfmon utility.
click on Add counters
Choose Memory as performance object
Select the counters that you want to monitor.

Memory Cache

This is where the processor stores the data or instructions that it is currently working on or is predicted to need shortly. It allows the processor to get the information quickly from the faster cache memory. The net result is a more efficient and faster running system.

Memory Leak

It occurs when applications allocate memory for use but do not free the allocated memory when finished. It causes temporary memory shortages in application programs that run for a short time. In the worst case it causes the system to allocate all available memory to one process. Eventually, the system hangs until the memory is released.

Tuning Tips for memory resources:

Increase physical memory above the minimum required.
Create multiple paging files while using multiple disks.
Determine the correct size for the paging files
The initial size of the paging file should be between 1 and 1.5 times the amount of RAM available
Check the available space on your disks. Do not use large paging files.
Ensure effective usage of Cache memory.
Monitor the applications and replace those that leak memory or use it inefficiently
Shut down the services that are not required by the application
Replace 16-bit system with 32-bit or 64-bit systems
Run memory-intensive programs on the high end computers or when the system workload is light.

Reasons for memory bottlenecks

Too many page faults - Having too many page faults leads to excessive program execution delays. Ensure that your application doesn’t experience too many hard faults

Disk contention

Competition for memory - When memory is scarce, the memory access pattern of one program can unduly influence other running programs.

Ques : - How to graph the Partial GC and Full GC of an application in Dynatrace ?

Ans : In Dynatrace go to Agent based Measures -> JVM ->(here having)

-> Committed Memory, MAX Memory, Memory pool, Thread Count, Memory Utilization

-> Free Memory, Specific GC Activation, Total GC Activation.

-> GC Collection Old Memory & Young Generation, Un Loaded Classes, Used Memory, etc...

Decreasing Full GC Time

The execution time of Full GC is relatively longer than that of Minor GC. Therefore, if it takes too much time to execute Full GC (1 second or more), timeout may occur in several connected parts.

If you try to decrease the Old area size to decrease Full GC execution time, OutOfMemoryError may occur or the number of Full GCs may increase.

Alternatively, if you try to decrease the number of Full GC by increasing the Old area size, the execution time will be increased.

Wednesday, August 25, 2021

Bottlenecks

Tuning : the activity of finding and removing bottlenecks at different levels

Bottleneck : the point at which the server's performance stops improving and starts to degrade

Memory leaks
Array bound errors
Inefficient buffering
Too many processing cycles
A larger number of HTTP transactions
Too many file transfers between memory and disk
Inefficient session state management
Thread contention due to maximum concurrent users
Poor architecture sizing for peak load
Inefficient SQL statements
Lack of proper indexing on the database tables
An inappropriate configuration of the servers

CPU utilization bottleneck.

There are two possible causes of a CPU utilization bottleneck. The first is the processor running at over 80% capacity. The second is insufficient system memory.

Memory utilization bottleneck.

The most common cause of a memory utilization bottleneck is RAM that is insufficient or not fast enough. A lack of memory causes the computer to start offloading data to a much slower HDD or SSD to keep running.

Software limitation bottleneck.

A software limitation bottleneck may be caused by programs that are built to work with only a single CPU. When they run on multiple CPUs, they still find it hard to handle a number of tasks at once.

Networking utilization bottleneck.

A networking utilization bottleneck occurs when two devices have a problem communicating with each other due to a lack of bandwidth or processing power, typically because the server or the network between the two devices is overloaded.

Disk usage bottleneck.

This type of bottleneck reduces disk read and write speed, for example due to fragmentation.

Bottlenecks which are in DB :-

High SQL Parsing(Hard Parse and Soft Parse);
Establishing New Connections repeatedly
High levels of Contentions for small amount of Data(Application level block Contention)
Different configuration issues

Incorrect sizing of Log Files
Too many Check Points
Sub-Optimal Parameter settings.

Bottlenecks which are caused by SQL statements :-

Inefficient high-load statements
Object Contention - DB Objects are the main source of Contention(Bottleneck).
Tuning SQL Statement may change execution plans
Analyse the impact of SQL Tuning in test DB using SQL Performance Analyser.
DB Time (Wait Time + CPU Time of all the sessions).
Perform AWR Analysis for every 1 Hour.
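The DB Time item in the list above can be written out as a sum over sessions. The sketch below assumes each session is a dict with hypothetical wait_s and cpu_s fields holding seconds. The second function divides DB Time by the elapsed wall-clock time of the snapshot interval to get the average number of active sessions, a ratio commonly read alongside DB Time; all field and function names here are made up for illustration.

```python
def db_time_seconds(sessions):
    """DB Time = wait time + CPU time, summed over all sessions."""
    return sum(s["wait_s"] + s["cpu_s"] for s in sessions)


def average_active_sessions(sessions, elapsed_s):
    """DB Time divided by the elapsed time of the snapshot interval."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return db_time_seconds(sessions) / elapsed_s


# Made-up figures for a one-hour (3600 s) interval.
sessions = [
    {"wait_s": 1200, "cpu_s": 600},
    {"wait_s": 300, "cpu_s": 900},
    {"wait_s": 0, "cpu_s": 600},
]
print(db_time_seconds(sessions))
print(average_active_sessions(sessions, elapsed_s=3600))
```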

What are the performance bottlenecks that you found in the projects you were working? What are the recommendations made to overcome those issues?

Ans: Following are the bottlenecks found in the application during performance testing:

A bottleneck in the JavaScript and CSS requests, which increased the response time of the application's web pages.

Login to the application, which includes database-related transactions, was very slow, so I started my investigation by looking into the tables in the database. I found that the tables had no indexes. Once indexes were added to the tables, reads became very fast and the response time improved.

How to identify performance bottlenecks?

Ans: Firstly analyse the performance testing graph and filter out the time window when the issue occurred. Secondly, correlate the client-side metric graph with server-side metric graph and analyse the server behaviour at that particular time. After that investigate the server logs and DB logs and find out the root cause. The application performance monitoring (management) tools make the root cause identification process easier and quicker.
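The first two steps (isolating the time window and lining up client-side and server-side metrics) can be sketched as follows. The code assumes both data sets are lists of (timestamp, value) pairs on the same clock; the threshold and the sample figures are made up for illustration.

```python
def slow_windows(response_times, threshold_s):
    """Return (start, end) timestamps of consecutive samples above the threshold."""
    windows, start, last = [], None, None
    for ts, rt in response_times:
        if rt > threshold_s:
            if start is None:
                start = ts
            last = ts
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:
        windows.append((start, last))
    return windows


def server_samples_in(window, server_metrics):
    """Return the server-side samples that fall inside a (start, end) window."""
    begin, end = window
    return [(ts, value) for ts, value in server_metrics if begin <= ts <= end]


response_times = [(0, 1.0), (10, 1.2), (20, 4.5), (30, 5.0), (40, 1.1), (50, 6.0)]
cpu_percent = [(0, 35), (10, 40), (20, 92), (30, 95), (40, 45), (50, 97)]

for window in slow_windows(response_times, threshold_s=3.0):
    print(window, server_samples_in(window, cpu_percent))
```

Once a window is known, the same timestamps can be used to narrow down the server and DB logs.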

Requirement Gathering

Performance Test Requirements

According to Tom Gilb’s principle, “Projects without clear goals will not achieve their goals clearly.”

Test requirements define your task set and provide the goals against which to measure the end result of your activity. In the performance testing world, we often face situations where an application is given for testing without test requirements. In such a situation, the Performance Engineer has to get the required information before starting the testing. The Performance Engineer has to divide the requirements gathering activity into two phases: first, what elementary information he/she needs to complete the task successfully, and second, how to collect that information.

Following are some of the questions that come to a Performance Engineer’s mind before starting a performance testing project:

What is the type of application and its complete architecture description?

What are the known current as well as previous performance bottlenecks?

Which application scenarios need to be tested?

What will be the workload model?

What are the performance goals?

The answer to each of the above questions has a significant impact on the results of a performance test. Performance test results always depend on the test environment, so replicating the production environment as exactly as possible is extremely important for successful performance testing. Moreover, you can never be fully confident about your testing strategy and performance optimization recommendations without complete prior knowledge of the application architecture. Information about known performance bottlenecks will help you set up the right type of test (load test, stress test, etc.) to reproduce the production bugs (you need to reproduce them to find their root cause). You can never achieve 100% coverage of the application in performance testing, and identifying the critical business scenarios will help you find the real issues in optimal time. The performance testing workload should be as close to the production workload as possible. Assigning the right user mix to the selected performance scenarios is absolutely important for accurate results. Last but not least, performance goals define the success criteria of your activity, and you can never produce quantitative results without clearly defined performance goals.

Once all the required performance test requirements are listed, next step is how to gather this information. You can contact to different stakeholders like Business Analysts, Marketing Team, Network Team, Development Team and Functional Testing Team to collect all the required information.

The best approach is to prepare a questionnaire and forward it to all stakeholders to get the maximum information from them. But what will you do if you don’t receive any useful information from the contacted sources? Then you have no choice but to use your experience and expertise to derive as many requirements as you can. Of course, you may not be able to list all the required information on your own, but you can at least get enough to complete the activity successfully. For example, if you are going to test an e-commerce web application, an experienced performance engineer can guess its potential scenarios (e.g. browsing catalogs, searching for items and adding items to the cart). One can also figure out that browsing the catalog will be the most accessed scenario and searching the most performance-intensive one. A performance engineer can reasonably assume that the response time of an e-commerce web application should not be more than 3 seconds. So it’s not the end of the world if you don’t get the required performance test requirements; you can still complete the activity successfully by using your experience.

What are the Non-Functional Requirements?

Non-functional requirements are the testing goals created specifically for performance testing, security testing, usability testing, etc. It is a combined requirement for all types of non-functional tests. Since PerfMatrix is a core performance testing site, the term non-functional here is specific to performance testing only. Some simple examples of non-functional requirements are:

The number of users handled by the application
Page Response Time
The number of requests processed by the application per unit time
CPU Utilization
Memory Utilization
Error rate etc.

In non-functional requirement, some of the goals are time-bound like the response time of a page, request per second, resource utilization etc. whereas some of the goals are load bound like real-world user load, throughput etc.

Difference between functional and non-functional requirement:

The main difference in functional and non-functional requirement is that functional requirements are end-user result oriented and stressed over the correct output. For example: If a user uses a banking application and clicks ‘Account Balance’ link then the application must display the correct balance available in his account. On the other hand, the non-functional requirements are oriented to the performance of the system in terms of responsiveness and load-bearing capacity. For example: If 1000 users hit ‘Account Balance’ link at the same time then the application must respond to all the users within a defined time (say 3 seconds).
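A response-time requirement like the one in the example can be checked mechanically once the test results are in. The sketch below uses the 3 second limit from the text and, by default, requires every sample to meet it (the 100th percentile). The pct parameter is an added assumption: many teams state such goals against a percentile such as the 90th instead, and the nearest-rank method used here is only one of several percentile definitions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_response_time_nfr(response_times_s, limit_s=3.0, pct=100):
    """True when the chosen percentile of the response times is within the limit."""
    return percentile(response_times_s, pct) <= limit_s


# Made-up response times in seconds for the 'Account Balance' transaction.
samples = [1.0, 1.5, 2.0, 2.5, 2.8, 2.9, 3.0, 2.2, 1.8, 3.4]
print(meets_response_time_nfr(samples))
print(meets_response_time_nfr(samples, pct=90))
```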

Purpose of NFR Gathering:

The project team or client sets the expectations for performance testing in the form of non-functional requirements. Collecting the requirements, analysing them from a performance testing perspective and finalising the quantitative NFRs all fall under the NFR gathering phase of the PTLC (Performance Test Life Cycle). All the requirements are documented, categorized and concluded in the Non-Functional Requirement Document. The end result of this phase is a set of quantitative NFRs, which help to prepare a correct workload model during performance testing.

Accountability:

The Performance Test Manager or Performance Test Lead is responsible for collecting, discussing and finalising the non-functional requirements.

Approach:

Once the performance testing scope is finalised in the risk assessment phase, the Performance Test Manager or Lead has to schedule a meeting with the project team to understand the client’s expectations. He may receive the requirements in layman's terms, which may need thorough study and deep analysis before testable NFRs can be extracted. It is also quite likely that clear and reasonable requirements will not emerge from a single meeting, so he needs to set up multiple meetings with the project stakeholders. Once all the non-functional requirements are finalised and testable, they should be properly documented in the Non-Functional Requirement Document and approved by the project stakeholders.

How to proceed practically?

“How to get the performance requirements from a non-technical customer?” This is a valuable question that everyone has. In most cases the non-functional requirements are either incompletely defined or conceptual rather than quantitative. A Performance Test Manager/Lead needs to spend time understanding the client’s expectations by asking the right set of questions and transforming the conceptual requirements into quantitative goals. He needs to play the role of a mediator who bridges the gap between the language of a novice user and performance testing terminology.

It is the responsibility of the Performance Test Manager/Lead to draw out the essence of what the customer wants. One needs to understand the customer quickly and start talking in their language. Do not use performance testing jargon that makes customers feel ignorant, which they are not. Do not expect the customer to state the performance goals in one go. Rather, start the discussion with an overview of performance testing and its purpose. The discussion can be started with questions like

What is the expected user load they are looking for?
What is the expected response time for a page?
Which types of performance test should be included? Explain each type of performance test and its purpose.

These simple questions make the client comfortable explaining what he wants.

Some additional Tips:

Also, try to build a good rapport with the customer. During the meeting or call, educate the customer by explaining some core performance testing terms with good examples, so that he can explain exactly what his expectations are. Most importantly, a performance manager/lead should never scare the client by asking lots of questions in one go or by being too technical. Requirements should be requested politely, so as to make a good impression on the customer and win his confidence.

Once the NFRs are collected, draft them in the Non-Functional Requirement Document along with the dates of the meetings and the points discussed in them. The final Non-Functional Requirement Document must be signed off by all the project stakeholders.

Challenges:

In most cases the application is new, so the Performance Test Manager/Lead has no clue about the NFRs. Many times the client picks some random numbers over a call and asks for NFT (non-functional testing) against those numbers. At the end of the test, when such NFRs are not met, the client is forced to modify the NFRs so that the test passes at any cost.

Such performance tests are executed to showcase that performance testing has been done and that the application performed very well in the test environment. But in reality, 82% of such applications fail due to performance issues within 6 months, or even sooner, when user load increases in production. Hence the key points are to analyse the NFRs thoroughly, not to accept any random number as an NFR, and to finalise the quantitative NFRs without any caveat.

Deliverable:

The Non-Functional Requirement Document has to be drafted once the scope is finalized. The outcome of each meeting may lead to multiple changes in the document. The document should cover all aspects of the requirements, along with the major and minor changes made to them. Do not forget to mention the acceptable error percentage in the document, because it is one of the criteria that help to decide the RAG (Red/Amber/Green) status at the end of testing.

Example:

Let’s call our performance tester PerfMate. In the previous phase of the project (Risk Assessment), PerfMate conducted the risk assessment and finalised the scope of performance testing. He got approval from all the stakeholders on the Risk Assessment document. Moving to the next phase, he starts to collect the non-functional requirements (specific to performance testing) and receives the following:

The application should be very fast.
The response time of the application should be quick.
The web server performance should be as high as possible.
The application should support many users.
The servers should not fail when a sudden load comes during sale and offer periods.
The application should run without any failures for a long duration.

From the client's point of view, he has given all the requirements and set his expectations, but from PerfMate's perspective, he has received only conceptual requirements. PerfMate explained to the client that the information provided was partial and could not help him define the performance testing goals. By being polite, he gained the customer’s confidence and convinced him to provide quantitative requirements. He asked the project stakeholders to answer some basic questions. Finally, over the next couple of days, he managed to gather the following requirements:

Question: How many types of users are using this application (GUI)?

Answer: 4 types of users

Admin
Seller
Buyer (end user)
Call center employee

Question: What are the business scenarios of every user?

Answer: The business scenarios for each type of user are:

Admin:

To approve/reject a new seller
To verify a newly added product and approve/reject it

Seller:

To add a new product
To delete the existing product

Buyer:

To buy a product
To cancel the order

Call Center Employee:

To register a complaint

Question: What are the AUT's current and predicted peak user loads for each type of user?

Answer:

Admin

Current: 4
Predicted: 10

Seller

Current: 25
Predicted: 100

Buyer

Current: 438
Predicted: 2500

Call Center

Current: 8
Predicted: 24

Question: What is the average active user count (including all types of users) at any given time?

Answer:

Total: 304
Admin: 3
Seller: 15
Buyer: 278
Call Center: 8

Question: What could be the active user count during peak hour?

Answer:

Total: 500
Admin: 4
Seller: 50
Buyer: 438
Call Center: 8

Question: Is there any time during the day or month when the average user count suddenly increases?

Answer: On the 31st of every month there is a 1-minute sale from 09:00 to 09:01 and again from 10:00 to 10:01. During these periods the active buyer count triples from the average of 278 to 834, and the active seller count rises to 23. The Admin and Call Center user counts do not change.

Total: 868
Admin: 3
Seller: 23
Buyer: 834
Call Center: 8
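The user-load profiles collected so far can be cross-checked with a few lines of code. The sketch below is only illustrative: the dictionary and function names are assumptions, and the numbers are simply copied from the answers above. It recomputes each total and the share of each user type, which is handy later when building the workload model.

```python
# User counts per load profile, copied from the answers above.
LOAD_PROFILES = {
    "average": {"Admin": 3, "Seller": 15, "Buyer": 278, "Call Center": 8},
    "peak": {"Admin": 4, "Seller": 50, "Buyer": 438, "Call Center": 8},
    "sale_spike": {"Admin": 3, "Seller": 23, "Buyer": 834, "Call Center": 8},
    "predicted": {"Admin": 10, "Seller": 100, "Buyer": 2500, "Call Center": 24},
}


def total_users(profile):
    """Return the total active user count for one load profile."""
    return sum(LOAD_PROFILES[profile].values())


def user_mix(profile):
    """Return each user type's share of the profile as a percentage."""
    total = total_users(profile)
    return {
        role: round(100 * count / total, 1)
        for role, count in LOAD_PROFILES[profile].items()
    }


if __name__ == "__main__":
    for name in LOAD_PROFILES:
        print(name, total_users(name), user_mix(name))
```

The recomputed totals should match the figures quoted in the answers (304, 500, 868 and 2634); a mismatch is a sign that a number was noted down incorrectly during the meeting.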

Question: How many requests does the Admin receive?

Answer: 200 requests per hour

Question: What would be the response time NFRs for all the scenarios?

Answer: No page should breach the 3-second average response time NFR. This NFR applies to all types of users, scenarios and pages.

<The rest of the figures are shown in the NFR table below>
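As a rough illustration of how the 3-second average response time NFR could be checked after a test run, here is a minimal sketch. The page names and sample timings are made up for the example, and the function name is an assumption; a real run would read the timings from the load testing tool's results.

```python
from statistics import mean

# NFR from the answer above: average page response time must not exceed 3 seconds.
RESPONSE_TIME_NFR_SECONDS = 3.0


def pages_breaching_nfr(samples_by_page, limit=RESPONSE_TIME_NFR_SECONDS):
    """Return the pages whose average response time exceeds the limit."""
    return sorted(
        page
        for page, samples in samples_by_page.items()
        if samples and mean(samples) > limit
    )


if __name__ == "__main__":
    # Hypothetical response times in seconds, for illustration only.
    sample_results = {
        "Home": [1.2, 1.8, 2.1],
        "Search": [2.9, 3.4, 3.6],
        "Checkout": [2.0, 2.5, 4.0],
    }
    print(pages_breaching_nfr(sample_results))
```

Note that the check uses the average, as the NFR is worded. A page such as the hypothetical Checkout above can have an individual slow sample and still meet the NFR.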

Question: What would be the server-side NFRs?

Answer:

CPU utilization must not be more than 60%, except in the stress test.
The difference between pre-test, steady-state and post-test memory utilization must not be more than 15%, except in the stress test.
Pre-test, steady-state and post-test disk utilization must not be more than 25%, except in the stress test.
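The server-side NFRs above could be evaluated along the following lines. This is a sketch under stated assumptions: the CPU limit is applied to the highest sample, and the memory "difference" is read as the spread in percentage points between the pre-test, steady-state and post-test readings. The answer does not spell out either interpretation, so both should be confirmed with the stakeholders. All names are hypothetical.

```python
CPU_LIMIT_PERCENT = 60.0          # from the CPU utilization NFR above
MEMORY_DIFF_LIMIT_POINTS = 15.0   # from the memory utilization NFR above


def cpu_within_nfr(cpu_samples):
    """Assumption: the NFR is met when no CPU sample exceeds the limit."""
    return max(cpu_samples) <= CPU_LIMIT_PERCENT


def memory_within_nfr(pre, steady, post):
    """Assumption: compare the spread between the three readings,
    measured in percentage points, against the limit."""
    readings = (pre, steady, post)
    return max(readings) - min(readings) <= MEMORY_DIFF_LIMIT_POINTS


if __name__ == "__main__":
    # Hypothetical monitoring readings, for illustration only.
    print(cpu_within_nfr([35.0, 52.0, 58.0]))
    print(memory_within_nfr(pre=40.0, steady=52.0, post=43.0))
```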

Question: What is the size of the performance testing environment with respect to production?

Answer: 100% (live-like environment)

PerfMate got answers to more or less all of his queries, noted them down in the NFR document and shared it with the client for approval. After getting approval on the NFR document, he will start preparing the performance test strategy.

A typical NFR format prepared by PerfMate is:

Note: Do not consider the table below a standard NFR format. It has been kept simple for ease of understanding, especially for beginners.

NFR ID | Category | Description | Impact to

NFR01 | Application | The solution must be able to support 500 active users: Admin 4 (2 for seller approval and 2 for product approval), Seller 50, Buyer 438, Call Center 8. | Admin, Seller, Buyer, Call Center

NFR02 | Application | The solution must be able to support the future volume of active users, i.e. 2634: Admin 10, Seller 100, Buyer 2500, Call Center 24. | Admin, Seller, Buyer, Call Center

NFR03 | Application | The solution must perform well over a longer period of time with the average volume, i.e. 304: Admin 3, Seller 15, Buyer 278, Call Center 8. | Admin, Seller, Buyer, Call Center

NFR04 | Application | The solution must be able to support the spike load of buyers and sellers during the sale period: Admin 3, Seller 23, Buyer 834, Call Center 8. | Admin, Seller, Buyer, Call Center

NFR05 | Application | Admin receives an average of 200 requests per hour. | Admin

NFR06 | Application | The number of orders: Peak Hour Volume 1340, Sale Hour Volume 2830, Future Volume 7500, Average Volume 600. Note: 4% of the users cancel the order in every scenario. | Buyer

NFR07 | Application | Sellers add an average of 180 products per hour and delete 18 existing products per hour. | Seller

NFR08 | Application | Call center employees receive 40 complaints per hour. | Call Center

NFR09 | Application | The response time of any page must not exceed 3 seconds, except in the stress test. | Admin, Seller, Buyer, Call Center

NFR10 | Application | The error rate of transactions must not exceed 1%. | Admin, Seller, Buyer, Call Center

NFR11 | Server | The CPU utilization must not exceed 60%. | Web, App, DB

NFR12 | Server | The memory utilization must not exceed 15% (compare pre-test, post-test and steady-state memory status). | Web, App, DB

NFR13 | Server | The disk utilization must not exceed 25% (compare pre-test, post-test and steady-state disk status). | Web, App, DB

NFR14 | Server | There must not be any memory leakage. | Web, App, DB

NFR15 | Application | Buyers order at the average rate of: Peak Hour Rate 3.06 products per hour, Sale Hour Rate 3.39 products per hour, Future Volume Rate 3 products per hour, Average Volume Rate 2.16 products per hour. | Buyer
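The per-buyer order rates in NFR15 appear to follow from dividing the order volumes in NFR06 by the active buyer counts in NFR01 to NFR04. The sketch below reproduces that arithmetic; the names are hypothetical. It also estimates cancellations by applying the 4% figure to the order volume, which is an assumption, since the note speaks of 4% of users rather than 4% of orders.

```python
# Order volumes (NFR06) and active buyer counts (NFR01 to NFR04).
ORDERS = {"peak": 1340, "sale": 2830, "future": 7500, "average": 600}
BUYERS = {"peak": 438, "sale": 834, "future": 2500, "average": 278}
CANCEL_RATE = 0.04  # "4% of the users cancel the order in every scenario"


def orders_per_buyer(profile):
    """Orders placed per active buyer per hour, rounded to two decimals."""
    return round(ORDERS[profile] / BUYERS[profile], 2)


def expected_cancellations(profile):
    """Assumption: the 4% cancel rate is applied to the order volume."""
    return round(ORDERS[profile] * CANCEL_RATE)


if __name__ == "__main__":
    for profile in ORDERS:
        print(profile, orders_per_buyer(profile), expected_cancellations(profile))
```

With rounding to two decimals the rates come out as 3.06 (peak), 3.39 (sale), 3.0 (future) and 2.16 (average). These per-user rates are what a load testing tool needs in order to pace each virtual buyer.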

=======

General Information

1. What is the current project timeline for beginning and closing testing activities, i.e. the start and completion dates?

2. Is the application functionality stable, and has its functional testing been completed?

3. We have observed that in some cases a firewall affects load test results. Do we need to get firewall clearance before starting load testing?

4. Please provide the access credentials and URL of the application.

5. Any preference on Performance Tools? E.g. LoadRunner, JMeter

6. What type of performance testing should be performed? E.g. Load Test, Stress Test, Soak Test, Spike Test

7. What are the goals of the performance testing activity?

E.g.

Evaluate System against performance criteria

Discover what parts of the system perform poorly and under what conditions

Compare two platforms with the same software to see which performs better

8. What will be the acceptance criteria for each performance test?

E.g.

All user transactions should pass with a response time of not more than 5 seconds

At least 95% of the user transactions should complete successfully

Architectural Information

9. What is the type of application? E.g. Client Server, Web Based, Mobile App

10. On which technology/platform is the system developed?

E.g. J2EE, .Net, PHP, Silverlight, Ruby, SAP, any other

11. Which database is used for this system? E.g. Oracle, MySQL, SQL Server

12. Which application server does the system run on? E.g. Tomcat, IIS, WebSphere

13. What does the target system (hardware) look like (Please specify all servers and network appliances configurations and their interaction mechanism)?

LAN/WAN details

Terminal servers

Bandwidth link

Load Balancing techniques

Batch Transactions

Disaster recovery

14. Do you have a traffic monitoring tool deployed on the web server? E.g. Google Analytics, New Relic

15. Is there any known issue(s) in this application?

E.g.

Memory lock

Higher CPU and Memory utilization

Unexpected growth in daily visitors

High response times that lead to timeout errors

16. What is the protocol between client and server? E.g. HTTP, HTTPS, TCP, FTP

17. For a web application, is the client dependent on a browser version? E.g. the application runs only on IE 8

18. Will you provide us with a separate test environment for the performance test runs?

Note: It is strongly recommended that the test environment be separate from, and identical to, the production environment.

19. Are there specific requirements for the input data?

Data validation (credit cards, zip codes, etc.)

Data uniqueness (you cannot enter the same data more than once).

Is it time sensitive?

Business Information

20. Have you identified the performance scenarios of your system?

E.g. for the Facebook application, performance scenarios could be login, viewing posts, adding posts, commenting, image uploading, sending invitations, chatting, logout etc.

21. Do you have statistics on how many users visit your site in 24 hours?

E.g. Facebook is accessed by more than 175 million users daily

22. How do you see your users accessing the application?

Sporadically throughout the day?

Do they all log in simultaneously?

A certain number at specified intervals?

23. What is the average user session time on your system? E.g. Facebook user average session time is 23 minutes

24. How many transactions does the end user do per day in the application?

25. How important is the location of your end users? Are they all in the same location, or are they distributed across the globe?

26. How do they access the application? E.g. RDP, web or mobile?

27. Have you set an acceptable maximum transaction completion time?

E.g. System response time should not exceed 5 seconds while retrieving user’s order history

28. What is the peak load time on production server?

E.g. Maximum number of US based users log on to facebook.com at 8pm EST

29. At peak load time, roughly how many users are accessing the system?

E.g. Facebook is accessed by up to 10 million users during peak hours

30. What will the workload model be?

E.g. for Facebook, 1 million users can log in concurrently, 4 million can view posts, 1 million can add posts etc.

31. How many simultaneous users is the system intended to support (please specify the number)?

E.g. Currently Facebook supports 10 million simultaneous users, but in future it should support 20 million simultaneous users

32. Are there any time constraints for running the test?

E.g. the server can only be accessed outside business hours, from 7 pm to 8 am

33. Would you like to share any additional information?
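Once question 8 has been answered, the agreed acceptance criteria can be turned into an automated pass/fail check. The sketch below uses the two example criteria from that question (at least 95% of transactions complete successfully, and no passed transaction takes longer than 5 seconds). The `Transaction` class and the function name are hypothetical, and real thresholds should come from the client's answers.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    name: str
    response_time_s: float
    passed: bool


def meets_acceptance_criteria(transactions, max_response_s=5.0, min_success_ratio=0.95):
    """Return True when enough transactions passed and none of the
    passed transactions exceeded the response time limit."""
    if not transactions:
        return False
    passed = [t for t in transactions if t.passed]
    success_ratio = len(passed) / len(transactions)
    too_slow = [t for t in passed if t.response_time_s > max_response_s]
    return success_ratio >= min_success_ratio and not too_slow


if __name__ == "__main__":
    # Hypothetical results: 19 fast successes and 1 failure out of 20.
    results = [Transaction("Login", 1.5, True) for _ in range(19)]
    results.append(Transaction("Login", 0.0, False))
    print(meets_acceptance_criteria(results))
```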

If the client does not know anything about the application, there are three ways to gather the inputs for performance testing:

1. Application is in production ===> Get the details using analytics tools.

2. Application has competitors ===> Get the competitors' details using alexa.com etc.

3. No competitors and application not in production ===> You must understand the CBT of the application.

=================================================

Performance Test Scenarios Selection

Performance testing involves several activities (requirement gathering, scenario selection, scripting, workload modelling, test execution, analysis and reporting), and all of them must be executed well to achieve the desired test results. Because performance testing is a complex and time-consuming activity, you cannot test each and every application scenario in it (unlike functional testing). You select only the most important scenarios, those most likely to expose performance bottlenecks in the AUT. This works much like the Pareto principle, which states that 80% of the issues are due to 20% of the causes.

Selecting an unnecessary scenario or missing out an important one can greatly affect your test effort and results. Proper planning and consensus from all stakeholders are required before you formally start scripting the AUT scenarios.

The following is a list of the scenarios which you should include in your performance test:

Most Frequently Accessed Scenarios: Application scenarios which are accessed most often by end users. As such scenarios affect the largest number of users, they must be included in the load test. For example, browsing the product catalog in an e-commerce web application.

Business Critical Scenarios: Application scenarios which contain the application's core features. For example, purchasing a product is a business critical scenario in an e-commerce application.

Resource Intensive Scenarios: Scenarios which are expected to consume more system resources than others. For example, order placement will be the most resource intensive scenario in an e-commerce web application.

Contractually Obligated Scenarios: Application scenarios for which the company has contracted to provide hassle-free service. These scenarios might not be used very frequently, but they can cause a huge business loss in case of failure. For example, the company claims that its home page loads within 3 seconds.

Stakeholder Concern Scenarios: Scenarios that stakeholders are particularly concerned about, for example the impact of the AUT's new features on its overall performance.

Time Dependent Frequently Accessed Scenarios: Time dependent scenarios which are executed very frequently, but only on certain occasions. For example, viewing the monthly payroll slip in an online payroll application.

Technology Specific Scenarios: Scenarios which are specific to the technology chosen for the AUT. For example, uploading a file through an FTP server.
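One possible way to shortlist the most frequently accessed scenarios is to rank them by usage and keep adding scenarios until they cover a chosen share of the traffic, then add any business critical scenarios that were left out. This is only a sketch of that idea, not a standard method: the 80% coverage target, the hit counts and all names are assumptions made for the example.

```python
def shortlist_scenarios(hits_by_scenario, business_critical=(), coverage=0.80):
    """Pick the busiest scenarios until they cover the target share of
    total hits, then append any business critical scenarios not yet picked."""
    total = sum(hits_by_scenario.values())
    ranked = sorted(hits_by_scenario.items(), key=lambda item: item[1], reverse=True)
    selected = []
    covered = 0
    for name, hits in ranked:
        if covered >= coverage * total:
            break
        selected.append(name)
        covered += hits
    for name in business_critical:
        if name not in selected:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical hourly hit counts for an e-commerce application.
    usage = {
        "Browse catalog": 5200,
        "Search": 2600,
        "Add to cart": 1100,
        "Place order": 700,
        "Update profile": 250,
        "Contact support": 150,
    }
    print(shortlist_scenarios(usage, business_critical=["Place order"]))
```

The other categories in the list above (contractual, time dependent, technology specific) cannot be derived from hit counts alone and still need to be agreed with the stakeholders.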