Monday, October 29, 2007

LoadRunner Scripting Challenge - LiveJournal

My last few posts have unabashedly been about LoadRunner, even though I vowed to diversify my writing into more distinct areas. This one is again about LoadRunner, but it is too interesting to pass up...

Background:

I had an old journal on LiveJournal, dating back to when 'blogging' wasn't a very familiar word or activity and popular blogging sites like Blogger and WordPress (the only free ones that I've liked so far) didn't exist. I came across LiveJournal and was immediately hooked. I wrote prolifically at that time, and I should also mention that my creative writing abilities then were much superior to my current showcase. But over time I've outgrown LiveJournal: the free version doesn't provide the features I need most, and the one that does is ad-supported. To get rid of the ads, you have to pay a nominal amount. That wasn't the only reason, though; I was a paid member for some time to support their efforts. I simply felt I needed a more traditional 'blog' rather than a journal. I also haven't been writing much lately and have no communities that I would like to keep track of on LiveJournal.

So the result was that some of my most cherished writings were stored in LiveJournal's history of posts, and I wanted to import all of them into a WordPress blog. WordPress allows importing an XML file of LiveJournal-formatted entries. The problem, however, is that LiveJournal allows exporting posts only one month at a time. So to export my posts going back to 2002, I would have to export them month by month for over 6 years' worth of posts. That was certainly not very exciting. But it soon became pretty exciting when I decided to use LoadRunner to export all my posts month by month in XML format, and I learned something very interesting in the process - which is of course the whole idea of this post.

Challenge:

When I recorded the script by going to the export page and logging in, I noticed that my password wasn't hard-coded anywhere in the script. I expected to find it in the request that submitted the form containing my username and password. Instead, the request carried a "chal" field and a "response" field, and the password field was blank.

web_submit_data("login.bml",
    "Action=http://www.livejournal.com/login.bml?ret=1",
    "Method=POST",
    "RecContentType=text/html",
    "Referer=http://www.livejournal.com/?returnto=/export.bml",
    "Snapshot=t5.inf",
    "Mode=HTML",
    ITEMDATA,
    "Name=returnto", "Value=/export.bml", ENDITEM,
    "Name=mode", "Value=login", ENDITEM,
    "Name=chal", "Value=c0:1193421600:2223:300:DsqGiIDCk0A4hs9sNsmH:c7f09b46d0ba2d82bb68945ea84532fc", ENDITEM,
    "Name=response", "Value=b84434e3c2d193c4a1c345d7874a89a3", ENDITEM,
    "Name=user", "Value=myusername", ENDITEM,
    "Name=password", "Value=", ENDITEM,
    LAST);

Honestly, that was surprising to me. At that time, I couldn't come up with any explanation of how my user ID could be authenticated and logged in successfully without the password being sent in the form submission. I could smell the hint of another challenge. If you want to try it yourself before reading on, record a script on LiveJournal (http://www.livejournal.com) by logging in, and then try to make it work on replay.

Solution:

After thinking a little more and looking at the script again, I realized that they could be using JavaScript to generate the response, which had to be some kind of hashed form of the password. On viewing the source of the initial page, I found a js file with this code:

var pass = pass_field.value;
var chal = chal_field.value;
var res = MD5(chal + MD5(pass));
resp_field.value = res;
pass_field.value = ""; // dont send clear-text password!

So on further investigation, I figured out what was happening:

1. On initial navigation to any page with the login form, a challenge string is sent to the client. It appears to have an expiration time: if you submit it some time later, even with the correct response, the server replies that the challenge has expired.

2. An event handler is registered with the submit button of the login form.

3. When the submit button is pressed, the event handler reads the field values, including the hidden challenge string. It computes the MD5 digest of the challenge string concatenated with the MD5 digest of the password, and then clears the actual password field. The resulting digest is the value that you see in the "response" form parameter in the recorded request above. The values are then submitted and the server takes care of validating the response value.

This page explains this at a high level: http://blog.paranoidferret.com/index.php/2007/07/22/secure-authentication-without-ssl-using-javascript/

So to make the LR script work, I had to do 2 things:

1. Get the value of the challenge by correlating appropriately.

2. Once I have the challenge string, create the 'response' value by getting the MD5 digest of the challenge string concatenated with the digest of the password string.

The MD5 digest calculation is based on the MD5 Message-Digest Algorithm by Ron Rivest of MIT. There was no way I was going to write an MD5 function myself. A quick search revealed some implementations of this algorithm. The one I used was from http://www.fourmilab.ch/md5/ because it was in C and saved me from including a lot of legalese. All I had to do was include the header and C file and use the MD5 functions. So this is the outline of the final script. I have left out the details, which should serve as an interesting exercise.

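1. First, get the MD5 digest of the password. This will later be appended to the challenge string.

struct MD5Context md5c;       /* hashing state, declared in md5.h */
unsigned char signature[16];  /* receives the 16-byte digest */

MD5Init(&md5c);
MD5Update(&md5c, (unsigned char *)mesg, strlen(mesg));
MD5Final(signature, &md5c);

where mesg is the password string and signature receives the raw 16-byte digest.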

2. Correlate the script to save the challenge string at the appropriate place. This is the string that is sent by the server as a hidden form parameter of the login form.

3. Once you have the challenge string, append the MD5 digest of the password (calculated in Step 1) to it. Remember that the digest has to be appended as a 32-character lower-case hexadecimal string, not as 16 raw bytes. Then get the MD5 digest of this concatenated string.

4. Pass the MD5 digest calculated in Step 3, again as lower-case hexadecimal, as the value of the 'response' parameter in the login step:

"Name=chal", "Value={chal}", ENDITEM,
"Name=response", "Value={response}", ENDITEM,
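Steps 2 and 3 can be sketched in plain C. This is only a sketch under a few assumptions: the helper names (extract_between, digest_to_hex, build_hash_input) are made up for illustration, and the left and right boundaries that surround the challenge in the login page are whatever you find in the page source, not something stated here. In a LoadRunner script the built-in web_reg_save_param function, registered before the request that loads the login page, would normally do the boundary extraction for you; the first helper only shows the idea. The MD5 calls themselves are left to the library from Step 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the text found between the left boundary lb and
 * the right boundary rb of src into out. Returns 0 on success, -1 if a
 * boundary is missing or out is too small. */
static int extract_between(const char *src, const char *lb, const char *rb,
                           char *out, size_t outsz)
{
    const char *start = strstr(src, lb);
    const char *end;
    size_t len;

    if (start == NULL)
        return -1;
    start += strlen(lb);              /* skip past the left boundary */
    end = strstr(start, rb);
    if (end == NULL)
        return -1;
    len = (size_t)(end - start);
    if (len >= outsz)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: write the 16 raw digest bytes as 32 lower-case
 * hexadecimal characters followed by a terminating NUL. */
static void digest_to_hex(const unsigned char digest[16], char hex[33])
{
    static const char digits[] = "0123456789abcdef";
    int i;

    assert(digest != NULL && hex != NULL);
    for (i = 0; i < 16; i++) {
        hex[2 * i]     = digits[digest[i] >> 4];    /* high nibble */
        hex[2 * i + 1] = digits[digest[i] & 0x0f];  /* low nibble  */
    }
    hex[32] = '\0';
}

/* Hypothetical helper: build chal + hex(MD5(password)), the string that is
 * hashed a second time to produce the response. Returns 0 on success, -1 if
 * out is too small. */
static int build_hash_input(const char *chal,
                            const unsigned char pass_digest[16],
                            char *out, size_t outsz)
{
    char hex[33];

    digest_to_hex(pass_digest, hex);
    if (strlen(chal) + 32 >= outsz)
        return -1;
    strcpy(out, chal);
    strcat(out, hex);
    return 0;
}
```

The string produced by build_hash_input would then go through MD5Init, MD5Update and MD5Final exactly as in Step 1, and the resulting digest through digest_to_hex once more to give the value for {response}.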

That should do it. Now I can parameterize the year and month values in the export step and export all my journal entries, month by month, as XML files. It was then a simple task to concatenate all the files into 1 big XML file and import it through WordPress's import feature. I'm a happy WordPress blog user now...we'll just have to see how long that lasts before I start looking for other options.
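For the month-by-month export, the year and month values can be derived from a simple iteration counter. The sketch below is one way to do it, with assumptions: the function names are made up for illustration, and the start month of January 2002 used in the usage note is a guess (only the year is mentioned above).

```c
#include <assert.h>

/* Hypothetical helper: number of months from (y0, m0) through (y1, m1),
 * inclusive. Returns 0 if the range is empty. Months run from 1 to 12. */
static int month_count(int y0, int m0, int y1, int m1)
{
    int n = (y1 - y0) * 12 + (m1 - m0) + 1;
    return n > 0 ? n : 0;
}

/* Hypothetical helper: compute the index-th month (0-based) counting
 * forward from the starting month (y0, m0). */
static void nth_month(int y0, int m0, int index, int *year, int *month)
{
    int total;

    assert(m0 >= 1 && m0 <= 12 && index >= 0);
    total = y0 * 12 + (m0 - 1) + index;  /* months since year 0 */
    *year = total / 12;
    *month = total % 12 + 1;
}
```

Looping index from 0 to month_count(2002, 1, 2007, 10) - 1 and calling nth_month each time walks every month from January 2002 to October 2007. Inside a LoadRunner script, the two integers could then be stored with lr_save_int into parameters (say {year} and {month}, names assumed here) for use in the export request.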

I guess I should also add that the exercise gave me a great opportunity to learn how web site authentication can be handled without using SSL and how I can script that in LoadRunner.

Note 1: Since creating this script, I explored LiveJournal's server protocol documentation and found that they have documented the authentication mechanism pretty well. Great work in encouraging other users to build custom clients.

Note 2: I felt I should mention that the LoadRunner licensing agreement specifically prohibits using the software to run tests on public domain sites. It basically means that you cannot load up your LR Controller with the script you created above and run a load test with any number of virtual users.

3 comments:

  1. hi Gaurav,

    I have a situation where I sent an XML request payload using web_custom_request from LoadRunner.

    Since I get the response as a string, how do I check for a particular text in the response?

    Please let me know your thoughts.

    Thanks,
    aswak

  2. I have a question on how you implemented this. Do you have any contact info?

  3. Nice thoughts, but can you post a sample LR script with directions on where you placed the *.h files?
