Friday 26 November 2010

Cross-functional teams - the future shape of team structures?

A typical traditional team structure consisted of a Business Analyst, a Quality Analyst, a Developer, a Database specialist and Operations / IT support. These project roles were very well defined and people stuck strictly to their own tasks. This led to situations where a developer worried very little about the quality of the system, thinking it was the tester's job. There were also tensions between developers and operations around deployments to various environments. Database specialists took pride in owning their scripts and stored procedures, so without them around, developers felt a bit crippled. Similarly, Quality Analysts were brought in at a later stage, missing out on all the crucial initial discussions and business context.

Future team structures look more promising: the vertical barriers between roles are being broken down and people are willing to step on each other's toes a bit.
Quality Analysts are being involved in business analysis along with the developers. Different roles raise different sets of questions, which helps in understanding the system and closing any gaps beforehand.
Quality Analysts are also pairing up with Business Analysts to write the acceptance criteria for stories and then involving the business stakeholders in reviewing them. This way the team can be confident that they are on the right track for that particular story.
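To give a flavour of what such jointly written acceptance criteria look like, they typically take a Given/When/Then form that both the business and the developers can read. The feature and values below are a hypothetical sketch, not from any particular client:

```gherkin
Feature: Returning customer retrieves a saved quote
  Scenario: Saved quote appears after login
    Given a customer with a saved quote
    When the customer logs in
    Then the saved quote is shown on the dashboard
```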

Quite often, from my observation, the developers implement the acceptance criteria but the tests do not verify what they are intended to, or not enough checks are written. To be mutually beneficial, quality analysts can pair up with the developers on implementing the acceptance criteria. Thus we can ensure that the tests do what they are supposed to do and that the code is cleanly refactored.
Pairing the developers with the database experts helps the developers understand the database structures and schema. By sharing the knowledge we avoid the risk of depending on a single individual or team. I have noticed quite a few times that the IT operations team likes to take "ownership" of the environments, and it becomes difficult to approach them for every single deployment to every single environment. By maintaining a good relationship with the IT operations team, the developers can build mutual trust, so that the team as a whole can deploy more often and therefore get a faster feedback cycle.

Thus I feel this approach, which I have been following over the last few projects, has helped the team deliver better quality software at a quicker rate.

Monday 20 September 2010

The wonderful world of Deutschland

My client was situated in a beautiful town called Heidelberg in Germany. Once again I was lucky to have met such wonderful talented people and made some great friends.

The client was a consumer price comparison website for household gas, electricity and telecommunications. It took off as a startup 10 years ago and has grown into a medium-sized family business, becoming one of Germany's most popular price comparison websites.

As part of improving their software delivery, they hired us to help them implement a pilot agile process. We started off with two simultaneous mini projects which would help them increase the number of people signing up for products in their system and also enhance user experience in general.

Apart from coaching them on new technology and better ways to write code, we also had to impart our knowledge on process improvement.
There were mixed emotions in the team in the beginning. Some of the members were very eager to learn and pick up new things, some approached it cautiously and some were reluctant. This is very common at most clients and a very natural human response to change.
Over time, after coming across various hurdles and learning lots of lessons, they eventually started to appreciate the value of the practices. We also conducted a session called "Why we do What we do" to help them reflect on all the practices we threw at them.

One thing I learnt was that if different consultants have different opinions on a certain situation, then instead of confusing and overwhelming the client with all the ideas, the consultants should first reach a common understanding and put that idea before the client, presenting the other ideas merely as suggestions.
The last thing you ever want is internal conflict among the consultants.

This was a client who had no QAs in their software development teams and, not surprisingly, little knowledge about the QA process itself. I introduced the concept of automation testing, and the developers initially suggested using Ruby with Cucumber to write the tests in the BDD style. But after a while they started facing difficulties learning a new language. It was also not good practice to have the code base in .NET and the tests written in Ruby. So I suggested they use WatiN along with YatFram instead. It was good fun pairing with the developers to help them write the tests.

One of the challenges in writing automation tests was that, being a legacy code base, it was not in a very testable state. To write unit tests the developers had to refactor the code first, but at the same time they had to draw a line as to how much to refactor. So we had to rely on high-level automated browser-based user journey tests. The downside is that these take a long time to run on the build. To reduce the build time I recommended splitting the tests into two builds: one running the quick journeys and a second running the detailed tests. This gives the developers a better opportunity to check in frequently, at the cost of slightly slower feedback on the detailed tests.
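The split can be sketched as simple tagging: each test carries a tag, and each build runs only its own slice of the suite. The test names and tags below are hypothetical stand-ins, not the client's actual suite:

```ruby
# Hypothetical suite: each test is tagged as a quick journey or a
# detailed test, and each build runs only its own slice.
TESTS = [
  { name: "sign_up_journey",     tags: [:journey]  },
  { name: "tariff_calculation",  tags: [:detailed] },
  { name: "checkout_journey",    tags: [:journey]  },
  { name: "validation_messages", tags: [:detailed] },
]

# Select the names of the tests belonging to one build.
def suite_for(tag)
  TESTS.select { |t| t[:tags].include?(tag) }.map { |t| t[:name] }
end

quick_build    = suite_for(:journey)   # fast feedback on every check-in
detailed_build = suite_for(:detailed)  # slower, runs in the second build
```

In CI the same effect is usually achieved by passing a tag or category filter to the test runner in each build's configuration.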

As time progressed, when the tests started doing their job of finding defects whenever someone changed some code, the developers began to appreciate the value of the tests and became more proactive in writing them.

I had initially suggested that a round of performance testing be done before the first deployment to production, given the changes we had made to the code base. But it was de-prioritised until, after go-live, they realised that their servers were running out of memory. So we had to quickly diagnose and fix the problem, after which we ran a series of performance tests on the staging environment. I introduced JMeter for this purpose, which worked out well; ANTS was used to monitor server performance.

As a non-German speaker, I did find it a bit difficult to actually test the website, though I managed eventually.

The website was very heavily dependent on the database. Interestingly, they had 28 web servers and just one database server. Moreover, all the validations and static information were stored in the database in the form of stored procedures.

We tried setting up the different environments as part of the build process. Though we did not achieve a great amount of success automating the deployment process by the end of the project, we managed to improve it over time. The developers fixed the gaps and defects I came across in the process as it grew. It was a bit complex, as it involved deploying the code base, database, CSS, third-party tools and a CMS backend all separately.

During my time on the project, I got the opportunity to hire a person internally for a QA role and mentored him over several weeks. The team was very happy that he picked up the role well enough to carry on independently once we left.

We successfully deployed three times over the three-month period and delivered as expected. We also gave the client feedback on what they could improve in the future. Overall it was a good project, a great client, and an amazing country and people.

Saturday 29 May 2010

One of my smallest yet most challenging projects, using Citrix machines

Recently I worked on a project for a car insurance company. It was very exciting in terms of the domain, as I had not worked in insurance before, so there was some learning on that front. It was one of the best clients I have worked with; right from the management down, everybody was very co-operative in helping us out with the infrastructure and with the domain and requirements themselves, along with a good sense of humour.

The core of the application was calculating the insurance quotes which the advisors in the company's branches would use when talking to their customers.
The business drive came about when the management found out that the advisors in their branches were giving the most expensive deal (highest cost to the company) to the customers, thus greatly reducing the company's revenues. They decided to bring in ThoughtWorks to come up with a pilot application, built around a set of business-defined formulae, to bring the revenues back on track. One of the challenges we had was to deliver the application within a month's time. From the initial estimates we figured out that it would take nearly twice that to build a quality application with a clean code base. Further discussions revealed that the business was in a hurry to showcase the application to their teams at a national conference during that month. So we came up with a strategy: build a mocked-up application with stubbed data and all the front-end designs ready for the conference, and then continue developing the application further in due course. The business bought into this idea, and from there our journey began.

The team consisted of just seven of us, the smallest team I have ever been on. Interestingly, it was an exceptionally diverse team, with each one of us from a different country. Being the lone QA I thought I could push my ideas onto the team, but the developers had strong opinions of their own. So I set up a quality expectations meeting along with the business so that we were all on the same page and not stepping on each other's toes.

I joined the project in the initiation phase, right after the inception. I collaborated closely with the BA to analyse requirements and start writing the acceptance criteria for the stories so that the developers could start as soon as they were done setting up the infrastructure.

Liaising with the product owner from the finance department was very helpful in defining the formulae. I then went off and created the test data for the complex array of calculations, which greatly assisted my manual testing as well as the automation tests (data-driven testing). It was all a cakewalk until the product owner came back two weeks before the go-live date asking us to change some calculations that had been implemented right at the beginning. We had to implement this change knowing it was a huge risk and might ripple through to the other calculations. But our automation tests around the calculations were robust enough to catch any regression defects.
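The data-driven approach can be sketched like this; the formula and figures are hypothetical stand-ins, since the client's actual calculations were business-defined and confidential:

```ruby
# A hypothetical premium formula standing in for the real calculations.
def premium(base, risk_factor)
  (base * risk_factor).round(2)
end

# One table of inputs and expected outputs drives many checks, so a
# changed formula immediately shows up as a list of failing cases.
CASES = [
  { base: 100.0, risk_factor: 1.2, expected: 120.0 },
  { base: 250.0, risk_factor: 0.9, expected: 225.0 },
  { base: 300.0, risk_factor: 1.5, expected: 450.0 },
]

failures = CASES.reject { |c| premium(c[:base], c[:risk_factor]) == c[:expected] }
```

When the product owner's late change landed, only the table rows (and the formula) had to be updated; the same harness re-checked everything.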

As part of the automation testing, I also started writing BDD tests in WatiN using C#; the developers later implemented them as part of their story development. This technique proves mutually beneficial: as a QA I get to define the tests along with the test data, and it gives the developers a head start on implementing them. For this project, given that the business was not too specific about the quality requirements and the application itself was small, we mostly stuck to the happy paths when defining the automation tests. There is always a trade-off between developer build times and quality when it comes to automation tests, so there should be a mutual understanding of how much and what to automate. I decided to run the rest of the tests manually. I also find it good practice to keep updating a manual regression test suite as the project evolves, which acts as a reference point at deployment time.

Now comes the most interesting and challenging part of the whole project. The application was going to be used by the branch officials through Citrix machines (dumb terminals / thin clients). These terminals were connected to the central server over a 512 Kbps connection. The icing on the cake was that the phone lines the branch advisors use also shared the same bandwidth. So it was critical for us to keep performance, load and stress testing in mind from the beginning of the project, and to develop the application in such a way that it used minimum bandwidth at all times.
The client had a "Model office" which they used to simulate the production environment. The drawback was that it had just four user terminals, so I wanted to use a testing tool to gain full confidence in terms of load. The client wanted an economical tool for this pilot project, so JMeter was recommended as part of the testing approach. I simulated thousands of users hitting the application server repeatedly, running a typical user journey, thus creating enough initial load on the test server. At the same time I manually used the terminals in the "Model office" to test the performance of the application, observing the refresh rate and response time. I also observed statistics on the server: memory leaks, % CPU utilisation and throughput. This gave us a fair idea, and a quick heads-up, as to whether we were on the right path. But we also needed to consider the difference in configuration between the test server and the production server, so the percentage difference in load had to be accounted for to get a realistic view. We then started deploying the application to production to monitor how it behaved. We gently started hitting it with JMeter and saw that we were maxing out the server CPUs. The developers started optimising the code, and JavaScript tweaks for IE6 improved the performance of the application.
Understanding how critical the go-live was, the client offered us dedicated servers. The client had an infrastructure in place with huge numbers of servers, CPUs and memory, and all they had to do was virtualise whatever our application required without altering their hardware, which I found a very cost-effective approach.
We took advantage and beefed up to 4 servers with 4 CPUs each, which gave us tremendous performance results, but we found that the CPUs were under-utilised. So we ramped down to 4 servers with 2 CPUs each, which still gave us the required performance with optimum CPU utilisation. We got the statistics from the client and found that the maximum throughput they would hit was about 12 requests/second, and up to about 70 requests/second per server our application gave an observed response time of about 2 seconds, which was agreeable to the customer.
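Using the figures above, the capacity headroom works out roughly as follows (a back-of-envelope sketch, not a substitute for the actual load tests):

```ruby
peak_demand = 12.0  # req/s, the client's expected site-wide maximum
per_server  = 70.0  # req/s each server sustained at ~2 s response time
servers     = 4

capacity = per_server * servers    # total sustainable load across servers
headroom = capacity / peak_demand  # multiple of the expected peak
```

With 4 servers that is 280 req/s of capacity against a 12 req/s expected peak, i.e. more than 20x headroom, which is why the 2-CPU configuration was comfortably enough.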
The load balancers were routing requests to a server based on the IP address the request came from. So with JMeter running from a single machine, all the requests were being sent to one server instead of being load balanced. JMeter itself was also unable to generate huge amounts of load and was breaking off at around 200 threads. The solution we came up with was running JMeter over several computers to achieve the required load on the servers. We later found out that JMeter needs a very powerful machine to run hundreds of threads.
Not having too many Listeners recording test results helped a bit, but it is even better to ask JMeter to write the results directly to a file. Even though I had the root URL in the HTTP Request Defaults, after recording a script I had to go back and change all the URLs to accommodate dynamic multi-user record generation. We had user ids in the URLs, so the recorded ids had to be replaced with a parameter fed by the JMeter generator. I used the Gaussian random timer and ramp-up periods to simulate closer-to-realistic conditions.
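The URL rewrite can be sketched in Ruby; the URL shape and the `userId` parameter name are hypothetical, and in JMeter the `${userId}` placeholder would then be filled per thread by the generated variable:

```ruby
# URLs as captured by the JMeter recorder, with the user id hard-coded
# to whichever account happened to be used at record time.
recorded = [
  "http://server/quote?userId=48213&step=1",
  "http://server/quote?userId=48213&step=2",
]

# Replace the recorded id with a JMeter variable reference so that each
# simulated user substitutes its own generated id at run time.
parameterised = recorded.map { |url| url.gsub(/userId=\d+/, 'userId=${userId}') }
```

The same substitution can of course be done inside the JMeter test plan itself; a script like this just saves hand-editing dozens of samplers.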
I also realised the importance of having a dedicated performance environment: sharing environments slowed down our daily work, with builds not being deployed on time and the application itself getting badly hit under heavy load.

The servers rendering the application onto the terminals were running the IE6 browser, so we also had to overcome some design challenges. Our UX expert suggested and came up with loads of changes to the designs, which had initially been developed by a third party.

The final build was deployed on time to production and the client was very pleased by the application.

Thursday 1 April 2010

My experiences on an Open Source Distributed Agile project

This project, called RapidFTR, was interesting because we all worked towards a good cause: supporting children who get separated from their families during disasters. People at ThoughtWorks who were between projects initially volunteered to help build the application as part of the company's Corporate Social Responsibility.

It was also exciting in terms of me being the lone QA for the team. This was a truly open source distributed agile project and we had to adapt to a lot of process changes. Our aim was to get the source code into the cloud so that anybody and everybody could contribute to it.
This was a Ruby-based project and we used all open source tools: GitHub for the source code repository and defect tracking, TeamCity for CI, and CouchDB as the database.

A huge amount of process tailoring was required for the project, because although the development initially happened only in London, it gradually became distributed over the East and West coasts of the US. Thus we could not do pair programming any more.

Stand-ups, instead of happening every day, took place biweekly late in the evening GMT, keeping the time zones in mind.

We did not get the chance to do story huddles either, again due to the differences in time and space.
To overcome the lack of pair programming, the team came up with the idea of peer reviewing the code once it was written.

Similarly, I suggested we could also peer review the acceptance tests written in Mingle before the developers start playing a story. This would enable knowledge sharing amongst the team and also provide a second pair of eyes on the tests.

As a QA I used Cucumber and Webrat. It is a great toolset, similar to FitNesse and Twist if people have prior experience with those tools. One of the advantages is that it comes with some pre-implemented steps which can be used while writing the BDD tests. The tests run quite fast too.
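To give a flavour of those pre-implemented web steps, a feature could be written almost entirely from them. The scenario below is a hypothetical sketch in the RapidFTR domain, not one of the project's actual tests:

```gherkin
Feature: Register a separated child
  Scenario: A field worker records a new child
    Given I am on the new child record page
    When I fill in "Name" with "Jorge"
    And I press "Save"
    Then I should see "Child record successfully created"
```

Steps like "I fill in … with …" and "I press …" map to generic Webrat step definitions, so only domain-specific steps need custom Ruby code.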

The stakeholder, who came from a developer background, started getting too involved in the development and implementation process of the project. This led to a great amount of discussion. I stepped in and suggested that if he could just explain what he wanted in plain simple English and let the developers worry about how it needed to be done, life would be easier for everybody. The stakeholder took this very positively and the situation indeed changed for the better.

Even two months into the project I was testing the application on my localhost, which was quite a pain. The developers kept forking off the trunk to work on their feature stories and did not merge often. As a QA, I found it very difficult to keep track of the current status of the story a developer was working on, as it was a pain to keep shifting between forks. I suggested they merge as often as possible so that I could test the application on trunk, which would also help catch integration issues.

In an open source distributed agile project, one of the challenges for the QA is to review the code-level tests (acceptance tests, RSpec tests etc.) along with the automation tests (scenario level) to make sure that the tests are actually testing what they are supposed to.

There were several people involved, working solo in SFO, NY and London.

We kept communication at a maximum by opening a Google group and responding to emails through that medium. Almost all of us were on Skype and GTalk for a quick chat, and we also used Google Wave for discussions. As explained earlier, the core team met up biweekly on a Skype conference call to catch up on updates.

We made the mistake of not including the stakeholder on site from the beginning of the project. The initial 2-3 weeks of the project went into Skype calls with the stakeholder. Only when he came down to London and we all started talking across the table did we realise how far we had been from understanding each other until then. After some heavy discussions and gaining a better idea, we had to go back and change some of the basic logic and architecture of the code. This cost us a bit of time.

I tried to emphasise the performance of the application from the beginning, as that was one of the important lessons learnt from previous projects. But the developers were least concerned about it and, as ever, wanted to leave the performance aspect for later.

My first code jam

I attended my first code jam as an Agile Quality Analyst last week in our ThoughtWorks London office. It was a great experience and I learnt quite a bit to take forward to future code jams.

The code jam was conducted for a project called RapidFTR which is a child finder application to reunite children who get separated from their families during natural disasters.

The developers spent a lot of time setting up their machines on the day of the code jam, losing a precious few initial hours. Ensuring that people have set up their machines beforehand, or giving yourself extra time before the code jam for setup and for people to settle in, is a good idea. One suggestion would be to prepare a Virtual Machine with the required setup, so that people can conveniently download it and start playing.

Preparing ourselves for people with different backgrounds (BAs, QAs, devs etc.) and skill sets, by collecting that information beforehand, will ensure that everybody is well engaged.

Having a separate code jam environment, or cloning the repository, would be nice, protecting and isolating your master build. It would also help the QAs do rapid regression and smoke testing if the developers keep checking their code into this environment. Needless to say, having Continuous Integration running on the code jam environment would help identify build integrity issues at an earlier stage.

Prioritising enough stories so that there are no dependencies at any point in time is a useful exercise, so that there are no surprises during the code jam.

If the QAs could come up with the acceptance criteria, and also try writing the automation tests (the BDD way) beforehand for all the stories intended to be played in the code jam, it would be very helpful for the developers to start on stories knowing exactly what needs to be done. Developers could then finish implementing the automation tests as part of story completion.

Huddles before the developers picked up stories usually did not go very well, as I was the sole QA and always running around. Maybe having a greater number of QAs with an understanding of the system would help.

We had a pomodoro timer up and running to rotate people and also to give them an occasional break. A good practice is to keep a story owner (someone who knows the technology etc.) on the story and rotate the others.

As a QA, I personally felt that I could not go and poke the developers with analysis questions while a story was being played, as I usually would. I found this very difficult, as the developers were under constant time pressure to deliver some sensible functionality by the end of the day.
But I ensured that all the automation tests written beforehand were properly implemented as part of the development sign-off.

I observed that people do get disappointed about not finishing stories at the end of the day. So we should make sure the stories are short and achievable, yet challenging enough for a day.

Having loads of food, beer and freebies cheered people up!