Wednesday, October 8, 2008

LoadRunner: Creating manual web_custom_request for HTTP file uploads

There was a recent forum post in the Advanced LoadRunner Yahoo group asking how LoadRunner captures file upload requests and whether there is a way to customize that request. At the time, I replied that it may be possible to build a web_custom_request manually from the underlying raw HTTP request. I hadn't tried it then, but it sounded interesting enough to go on my to-do list. I recently got some time to try it, and it was fun.

Problem:

As mentioned in the forum post, LoadRunner records any form-based file upload as a web_submit_data call, with the file name (and location) included in the list of data items in the request. An example request:

web_submit_data("FileUpload",
        "Action={URL}",
        "Method=POST",
        "EncType=multipart/form-data",
        "TargetFrame=",
        "RecContentType=text/html",
        "Mode=HTML",
        ITEMDATA,
        "Name=File", "Value=C:\\testdata\\readme1.txt", "File=yes", ENDITEM,
        LAST);

There are 2 obvious drawbacks to this:

1. For this kind of script, the files to be uploaded have to be transferred to each of the Load Generator machines used in the scenario. Alternatively, they have to be placed on a shared network location so that both the machine used for creating the script and the load generator machines can reference them.

2. As mentioned in the forum post, the file contents cannot be parameterized. If the scenario requires a uniquely named file to be uploaded in each iteration, that many files have to be created manually.

Solution:

There are 2 solutions to this, and one is more universally accepted (by web servers) than the other. I'll describe both and let you decide which one you want to use. I also highly recommend reading RFC 1867, which provides excellent background on how multipart file uploads are implemented in HTML form-based submissions.

1. web_custom_request POST: To arrive at the solution, I did exactly what I usually do when I'm stuck with some HTML script or want to see the raw HTTP request being sent to any web server. I opened Webscarab to capture the HTTP request that was being sent for file uploads. (Webscarab related post here).

Here's a screenshot of the HTTP post request as captured by Webscarab:

[Screenshot: WebscarabFileUploadCapture]

And here are the relevant portions of the raw request:

Content-Type: multipart/form-data; boundary=---------------------------327572003712859
Content-length: 228

-----------------------------327572003712859
Content-Disposition: form-data; name="File"; filename="readme1.txt"
Content-Type: text/plain

testdata
readme
readme
123456789
-----------------------------327572003712859--

The Content-Type HTTP header is what marks the submission as a multipart data submission. The "boundary" is a string that doesn't occur in the file data and is used to mark the boundaries of the data being sent. Content-length, as the name implies, is the length in bytes of the request body being sent.
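
Since the boundary must not occur in the file data, it can be worth checking a candidate boundary before reusing it with parameterized contents. Here is a minimal sketch in plain C (standalone, not a Vuser script). The function name boundary_is_safe is my own, and it assumes the file data is NUL-terminated text, so it would not work for binary uploads:

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the boundary string does not occur
 * anywhere in the (NUL-terminated, text-only) file data, 0 otherwise.
 * An empty boundary is treated as unsafe. */
int boundary_is_safe(const char *data, const char *boundary)
{
    if (boundary[0] == '\0') {
        return 0;
    }
    return strstr(data, boundary) == NULL;
}
```

If the check fails, the simplest fix is to pick a different boundary string; any value that is absent from the data will do.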

The body of the request starts with the boundary string, prefixed with two extra hyphens, and is followed by 2 headers. Content-Disposition includes the form field name and the name of the file to be uploaded. Content-Type specifies the media type of the file being submitted. Following that within the body is a blank line and then the actual contents of the file that will be uploaded. After the contents comes the boundary once again, this time with two trailing hyphens as well, to signify the end of the submission data.
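
To make that layout concrete, here is a rough sketch in plain C (standalone, not a Vuser script) that assembles a single-file multipart body in the order described above. The function name build_multipart_body and its parameters are my own, and it assumes text file data with no embedded NUL bytes:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a one-part multipart/form-data body into out.
 * boundary is the value declared in the Content-Type header, without the
 * two leading hyphens that each delimiter line adds.
 * Returns the body length, or -1 if out is too small. */
int build_multipart_body(char *out, size_t out_size,
                         const char *boundary,
                         const char *field_name,
                         const char *file_name,
                         const char *content_type,
                         const char *file_data)
{
    int n = snprintf(out, out_size,
        "--%s\r\n"                                   /* opening delimiter */
        "Content-Disposition: form-data; name=\"%s\"; filename=\"%s\"\r\n"
        "Content-Type: %s\r\n"
        "\r\n"                                       /* blank line ends the part headers */
        "%s"                                         /* file contents */
        "\r\n--%s--\r\n",                            /* closing delimiter */
        boundary, field_name, file_name, content_type,
        file_data, boundary);

    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

The returned length is the value a client would send as Content-length for this body.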

So this kinda makes it easier to create a web_custom_request. The Content-Type HTTP header is specified by the "EncType" attribute of the function. The Content-length header can be ignored, since web_custom_request itself generates it by calculating the length of the data at runtime. The rest goes in the "Body" attribute, starting with the boundary. Content-Disposition specifies, among other things, the name of the file to be uploaded, and this can be parameterized. Content-Type is the type of file being uploaded and can be text/plain, image/gif, etc. And at the end is the boundary once again. The final request looks like this:

web_custom_request("{rndString}.txt",
        "URL={testURL}",
        "Method=POST",
        "EncType=multipart/form-data; boundary=---------------------------17773322701763",
        "TargetFrame=",
        "Resource=1",
        "RecContentType=application/octet-stream",
        "Body=-----------------------------17773322701763\r\n"
        "Content-Disposition: form-data; name=\"File\"; filename=\"{rndString}.txt\"\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
        "testdata\r\n"
        "readme\r\n"
        "readme\r\n"
        "{rndString}\r\n"
        "-----------------------------17773322701763--\r\n",
        LAST);

I have parameterized the file name to be a random string so I can run this script iteratively without having to worry about the server rejecting duplicate files. I can also parameterize the contents of the file within the body to whatever I like, so that each file is unique or follows some other logic.
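
How {rndString} gets its value is not shown above; it could be a random-type parameter defined in the parameter list, or it could be generated in code. As one possible approach, here is a sketch in plain C (standalone, not a Vuser script) that fills a buffer with random lowercase letters and digits. The name make_random_name is my own, and rand() is used only for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: fills out with len random characters taken from
 * a-z and 0-9, then NUL-terminates it. out must hold at least len + 1
 * bytes. Call srand() once beforehand to vary the sequence between runs. */
void make_random_name(char *out, size_t len)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz0123456789";
    size_t i;

    for (i = 0; i < len; i++) {
        out[i] = alphabet[rand() % (sizeof alphabet - 1)];
    }
    out[len] = '\0';
}
```

Inside a Vuser script, the generated string could then be stored with lr_save_string(name, "rndString") so that {rndString} resolves to it in the request. Note that a random name only makes duplicates unlikely; it does not guarantee uniqueness.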

2. web_custom_request PUT: The second option is more straightforward but less universally accepted. I've read that not all web servers implement it. It involves using the HTTP PUT method in a web_custom_request. For more on the different HTTP methods, see http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html. It states:

The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request -- the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource.

I've also read that PUT is more efficient, so if your test site implements uploads using POST, it is better to script them with POST as well so that the test reflects what the site actually does. Nevertheless, here is the web_custom_request with PUT:

web_custom_request("{rndString}.txt",
        "URL={URL}/{rndString}.txt",
        "Method=PUT",
        "Resource=1",
        "RecContentType=application/octet-stream",
        "Body=testdata\r\n"
        "readme\r\n"
        "readme\r\n"
        "{rndString}\r\n",
        LAST);

The resulting function is straightforward: the URL attribute is the URI identifying the file to be uploaded, and Body is the exact contents of the file.