We had an interesting challenge posed to us by a large UK-based government health organization: to assess whether their large health-related eLearning portal would support 20,000 concurrent users (they have 800K registered users) and deliver good performance. There was also a cost constraint, and hence we decided to use the open source tool JMeter.
The open source toolset has its own idiosyncrasies – a maximum heap size of 1 GB out of the box, support for only a few thousand users per machine, and a nasty habit of generating large logs! To simulate a load of initially 20,000 and later 37,000 concurrent users, we had to use close to 40 load generators and synchronize them.
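For readers who have not run JMeter in distributed mode: a single controller can start a set of remote load generators in one shot using the -R option, which is how loads like this are typically kept in sync. The sketch below, written as a small Python wrapper, shows the general shape of such an invocation; the hostnames, test plan name, heap setting and the users_per_generator property are illustrative assumptions, not our actual configuration.

```python
# A minimal sketch of driving JMeter load generators from one controller box.
# Hostnames, file names and property values are illustrative only.
import os
import subprocess

REMOTE_HOSTS = ["loadgen01", "loadgen02", "loadgen03"]  # scaled up to ~40 hosts in our test
TEST_PLAN = "elearning_portal.jmx"                      # hypothetical test plan name

env = dict(os.environ)
# Raise the JVM heap beyond the out-of-the-box setting on the controller.
env["JVM_ARGS"] = "-Xms1g -Xmx3g"

subprocess.run(
    [
        "jmeter",
        "-n",                            # non-GUI mode
        "-t", TEST_PLAN,                 # test plan to run
        "-R", ",".join(REMOTE_HOSTS),    # remote load generators to start together
        "-Gusers_per_generator=500",     # property pushed to every remote engine
        "-l", "results.jtl",             # aggregated sample log on the controller
    ],
    env=env,
    check=True,
)
```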
We identified usage patterns and then created the load profile scientifically using the STEM Core Concept “Operational Profiling”. We generated the scripts, identified the data requirements, populated the data, and set up synchronized load generators. During this process we also discovered interesting client-side scripting, which we flattened into our scripts. Now we were ready to rock and roll.
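To make the “Operational Profiling” step concrete, here is a simplified sketch of how observed usage shares can be turned into a load model. The scenario names and weights are hypothetical; the real profile was derived from the portal's actual usage data.

```python
# Allocate a target concurrent load across business scenarios by their
# observed usage weight. Scenario names and weights below are hypothetical.
TARGET_CONCURRENT_USERS = 20_000

operational_profile = {          # shares must sum to 1.0
    "browse_catalogue": 0.40,
    "take_course":      0.35,
    "search":           0.15,
    "view_transcript":  0.10,
}

def users_per_scenario(profile, total_users):
    """Split the total concurrent load across scenarios by weight."""
    return {name: round(weight * total_users) for name, weight in profile.items()}

if __name__ == "__main__":
    for scenario, users in users_per_scenario(operational_profile, TARGET_CONCURRENT_USERS).items():
        print(f"{scenario}: {users} concurrent users")
```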
When we turned on the load generators, sparks flew and the system spewed out enormous logs – 3 to 6 million lines, approximately 400-600 MB! We wrote a special utility to rapidly search for the needle in that haystack. We found database deadlocks, fat content and heavy client-side logic. The system monitors were also off the chart, and the bandwidth was choked!
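The utility itself was specific to this engagement, but the idea is simple: stream the log once, line by line, and flag known error signatures instead of loading hundreds of megabytes into memory or an editor. A minimal sketch in that spirit is below; the file name and the signature patterns are assumptions for illustration.

```python
# Stream a very large log file and count/report lines matching known
# error signatures. File name and patterns are illustrative assumptions.
import re
from collections import Counter

LOG_FILE = "application.log"     # hypothetical log file
SIGNATURES = {
    "deadlock":   re.compile(r"deadlock", re.IGNORECASE),
    "timeout":    re.compile(r"timed? ?out", re.IGNORECASE),
    "slow_query": re.compile(r"slow query", re.IGNORECASE),
}

def scan(path, signatures):
    """Read the log once and report every line that matches a signature."""
    hits = Counter()
    with open(path, "r", errors="replace") as log:
        for lineno, line in enumerate(log, start=1):
            for name, pattern in signatures.items():
                if pattern.search(line):
                    hits[name] += 1
                    print(f"{name} @ line {lineno}: {line.strip()[:120]}")
    return hits

if __name__ == "__main__":
    print("Summary:", dict(scan(LOG_FILE, SIGNATURES)))
```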
Working closely with the development team, we helped them identify the bottlenecks. This resulted in query, content and client-side logic optimization. Now the system monitors were under control, and the deployed bandwidth was good enough to support the 20,000 concurrent user load with good performance. To support higher loads in the future, the system was also checked with nearly twice this load, and the additional resources needed to support it were identified.
The FIVE weeks that we spent on this were great! (Hmmm – tough times over at last!)