21 April 2010

How to send a query string through the form GET method using a specific encoding

In this post I won't talk much about Java itself but about URL encoding in general, although I'll use Java code to illustrate it.

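As you might know, when a form is submitted with the GET method of the HTTP protocol, the form data is put in the query string, which is part of the request URL (sent in the request line) rather than in the request body.

Example: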

<html>
<head>
</head>
<body>
<form method="GET" action="/some/server">
<input type="text" name="name1" value="val1" />
<input type="text" name="name2" value="val2" />
<input type="text" name="name3" value="val3" />
<input type="submit" />
</form>
</body>
</html>


When you submit this form, the query string will look like this:
name1=val1&name2=val2&name3=val3


This appears in the browser's address bar, after the '?' in the URL.

For more info, see:
http://en.wikipedia.org/wiki/Query_string and http://en.wikipedia.org/wiki/Percent-encoding
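
To make the name=value&name=value layout concrete, here is a minimal Java sketch of how such a query string can be assembled by hand. The class QueryStringBuilder and its build method are hypothetical names used only for illustration; a browser does this work internally when the form is submitted.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library: joins form fields into a query string.
public class QueryStringBuilder {

    // Joins the fields as name=value pairs separated by '&', URL-encoding each name and value.
    public static String build(Map<String, String> fields, String charset) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(field.getKey(), charset))
                  .append('=')
                  .append(URLEncoder.encode(field.getValue(), charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order they appear in the form.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("name1", "val1");
        fields.put("name2", "val2");
        fields.put("name3", "val3");
        System.out.println(build(fields, "UTF-8")); // prints name1=val1&name2=val2&name3=val3
    }
}
```

Plain ASCII letters and digits pass through unchanged, which is why the sample form produces such a readable query string.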

As the Wikipedia links make clear, not all characters can be included as-is in the query string. Those characters get "URL encoded": each byte of the character is written in the form %HH, where HH is the byte's hexadecimal value. Which bytes are produced depends on the character encoding in use: UTF-8 encodes 'أ' as '%D8%A3', whereas another encoding such as ISO-8859-6 encodes it as '%C3'.

BTW, the character 'أ' is the first letter of the Arabic alphabet (Arabic is, proudly, my native language).
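
The difference between the two encodings can be checked with a short Java sketch like the one below. EncodingComparison is a hypothetical class name; the only library call involved is java.net.URLEncoder.encode. Since ISO-8859-6 is not among the charsets every JVM is required to provide, the sketch checks for it before using it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;

// Hypothetical demo class: encodes the same character with two different charsets.
public class EncodingComparison {

    // URL-encodes the text using the named charset.
    public static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // U+0623 is the Arabic letter used in this post; it is written as an escape
        // so that the encoding of the source file does not matter.
        String alef = "\u0623";

        System.out.println("UTF-8:      " + encode(alef, "UTF-8")); // prints UTF-8:      %D8%A3

        // ISO-8859-6 is an optional charset, so check that this JVM has it.
        if (Charset.isSupported("ISO-8859-6")) {
            System.out.println("ISO-8859-6: " + encode(alef, "ISO-8859-6"));
        } else {
            System.out.println("ISO-8859-6 is not available in this JVM");
        }
    }
}
```

UTF-8 needs two bytes for this character and ISO-8859-6 needs one, which is why the two encoded forms have different lengths.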


The browser URL-encodes the query string and sends the request to the server, so the server has to decode the query string back to the original characters before using it as needed (inserting it into a database, etc.).
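
Decoding only gives back the original characters when the server uses the same encoding the browser used. The sketch below (DecodingMismatch is a hypothetical class name) decodes the same bytes once with UTF-8 and once with ISO-8859-1 to show what a mismatch does. It prints comparison results rather than the characters themselves, so the output does not depend on the console encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: decodes one encoded value with a matching and a mismatching charset.
public class DecodingMismatch {

    // URL-decodes the text using the named charset.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%D8%A3"; // the Arabic letter U+0623, encoded as UTF-8

        String right = decode(encoded, "UTF-8");      // same charset as the sender
        String wrong = decode(encoded, "ISO-8859-1"); // different charset

        // The matching charset restores the single original character.
        System.out.println(right.equals("\u0623"));       // prints true
        // The mismatching charset turns each byte into its own Latin-1 character.
        System.out.println(wrong.equals("\u00D8\u00A3")); // prints true
        System.out.println(right.length() + " vs " + wrong.length()); // prints 1 vs 2
    }
}
```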

Example:

index.html:

<html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>
</head>
<body>
<form method="GET" action="TestServlet" accept-charset="UTF-8" >
<input type="text" name="name1" /> <!-- put any non-ASCII character here, e.g. أ -->
<input type="submit" />
</form>
</body>
</html>


TestServlet.java

package com.forat;

import java.io.IOException;
import java.net.URLDecoder;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TestServlet extends HttpServlet {
private static final long serialVersionUID = 1L;

/**
* @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
*/
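protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
// getQueryString() returns the raw, still URL-encoded query string, or null if the URL has none
String qString = request.getQueryString();
if (qString == null) {
System.out.println("No query string in the request");
return;
}

System.out.println("Raw Query String: " + qString);
// Decode with the same charset the form was submitted with (accept-charset="UTF-8")
System.out.println("Decoded Query String: " + URLDecoder.decode(qString, "UTF-8"));
}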
}
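
Decoding the whole query string in one call is fine for printing it, but a value that itself contains an encoded '&' or '=' would be split in the wrong place if the string were parsed after decoding. Below is a minimal sketch of the safer order: split first, then decode each name and value. QueryStringParser is a hypothetical helper, and for simplicity it keeps only the last value when a name is repeated. In a real servlet you would normally call request.getParameter instead of parsing by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: parses a raw query string into decoded name/value pairs.
public class QueryStringParser {

    // Splits on '&' and '=' first, then decodes each piece with the given charset.
    public static Map<String, String> parse(String rawQuery, String charset) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        try {
            for (String pair : rawQuery.split("&")) {
                if (pair.isEmpty()) {
                    continue; // skip empty segments such as the one in "a=1&&b=2"
                }
                int eq = pair.indexOf('=');
                String name = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(name, charset), URLDecoder.decode(value, charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("name1=%D8%A3&name2=a%26b", "UTF-8");
        System.out.println(params.size());                        // prints 2
        System.out.println(params.get("name1").equals("\u0623")); // prints true
        System.out.println(params.get("name2"));                  // prints a&b
    }
}
```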


Try changing the encoding in index.html and TestServlet.java and notice the different encoded representations.

1 comment:

Arafat Hossain Piyada said...

I'm a Java Language learner, however don't have time much this day but your article just teach me something interesting. Thanks.