09 January 2009

Enabling Arabic in Java web applications (JSP apps and Struts apps)

Update here:
http://m-hewedy.blogspot.com/2010/05/enable-arabic-non-ascii-characters-for.html


Enabling Arabic (Unicode) in Java web applications (JSP apps and Struts apps)

One of the biggest problems you can face as a Java web developer is how
to make your web application accept Arabic (or any other non-ISO-8859-1)
characters that users type into input text boxes.
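To make the symptom concrete, here is a minimal sketch that uses only the JDK. It assumes the browser sent the text as UTF-8 and the server decoded it as ISO-8859-1, which is the default the Servlet specification prescribes when no request encoding is set. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any framework.
class MojibakeDemo {

    // Decode UTF-8 bytes with the wrong charset, as a container does
    // when no request encoding has been set.
    static String decodeWrongly(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Recover the original bytes (ISO-8859-1 maps bytes to chars one-to-one)
    // and decode them correctly as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0645\u0631\u062D\u0628\u0627"; // the Arabic word "marhaban"
        String garbled = decodeWrongly(original);
        // Each Arabic letter is two bytes in UTF-8, so 5 letters become 10 Latin-1 characters.
        System.out.println(garbled.length());                 // 10
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The steps described in this post avoid the need for any such repair by telling the container the right encoding up front.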

In plain JSP/Servlet applications, a little googling can solve the problem, but
with a framework such as Struts, the matter differs somewhat.

If your web application uses a database (most, if not all, do),
you should first make sure that the problem is not in the database itself being
unable to store Arabic characters. Try entering Arabic words by hand in varchar
and nvarchar fields. If the database accepts them, then the problem is in
the JSP/Servlet or Struts layer, and you will find the solution here.

1- To enable Arabic for request parameters of the "POST" method:

For pure JSP/Servlet applications:

Two steps and everything will be done.

First: in each JSP page, write this directive at the top of the page:

<%@page language="java" contentType="text/html; charset=UTF-8"%>

Second: in each Servlet that works as the controller for your JSPs, write these two statements at the top of your doPost(), before any request parameter is read:

request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");

That's all it takes to make your JSP forms accept Arabic characters.

For Struts applications:

Also two steps:
First: in each JSP page, write this directive at the top of the page:

<%@page language="java" contentType="text/html; charset=UTF-8"%>

Second: write a Filter class that wraps your org.apache.struts.action.ActionServlet class and put these two statements in its doFilter() method:

request.setCharacterEncoding("UTF-8");

response.setCharacterEncoding("UTF-8");

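Example filter:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;

    public class ArabicEncodingFilter implements Filter {

        // Required by the Filter interface; no configuration is needed here.
        public void init(FilterConfig filterConfig) throws ServletException {
        }

        // Set both encodings before any parameter is read or any output is written.
        private void doBeforeProcessing(ServletRequest request, ServletResponse response)
                throws IOException, ServletException {
            request.setCharacterEncoding("UTF-8");
            response.setCharacterEncoding("UTF-8");
        }

        // Nothing to do once the request has been handled.
        private void doAfterProcessing(ServletRequest request, ServletResponse response)
                throws IOException, ServletException {
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            doBeforeProcessing(request, response);
            try {
                // Exceptions are left to propagate so the container can report them
                // instead of being swallowed here.
                chain.doFilter(request, response);
            } finally {
                doAfterProcessing(request, response);
            }
        }

        // Required by the Filter interface; there is nothing to release.
        public void destroy() {
        }
    }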

And then wrap all your Servlets with your Filter by putting this in web.xml:

    <filter>
        <filter-name>EncodingFilter</filter-name>
        <filter-class>ArabicEncodingFilter</filter-class>
    </filter>

    <filter-mapping>
        <filter-name>EncodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

2- To enable Arabic for request parameters of the "GET" method:

It depends on your server (Tomcat, JBoss, etc.). In Tomcat 6, do the following:

Set the URIEncoding attribute of the <Connector> element in /conf/server.xml to UTF-8.
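As a sketch, assuming the element in question is the HTTP <Connector>, the change could look like the fragment below. The port, protocol, and timeout values are the ones a stock Tomcat 6 server.xml ships with and are only placeholders; keep whatever your own file already has and add the URIEncoding attribute.

```xml
<!-- Only URIEncoding is new; the other attributes are placeholders for your existing settings. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
```

Tomcat has to be restarted for the change to take effect.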

Or you can retrieve the request query string as is and URL-decode it yourself using the desired encoding (UTF-8).

You can write an HttpServletRequestWrapper that wraps your HttpServletRequest's getParameter methods.

See:
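The manual decoding step can be sketched with the JDK alone. QueryStringDecoder below is a hypothetical helper, not part of the Servlet API; it assumes the browser percent-encoded the parameters as UTF-8 and that you feed it the raw value of request.getQueryString().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the Servlet API.
class QueryStringDecoder {

    // Percent-decode one component as UTF-8 ('+' is treated as a space).
    static String decode(String component) {
        try {
            return URLDecoder.decode(component, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    // Parse a raw (still percent-encoded) query string.
    // Like getParameter(), only the first value of a repeated name is kept.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (rawQuery == null) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq >= 0 ? pair.substring(0, eq) : pair);
            String value = eq >= 0 ? decode(pair.substring(eq + 1)) : "";
            if (!params.containsKey(name)) {
                params.put(name, value);
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("name=%D9%85%D8%B1%D8%AD%D8%A8%D8%A7&lang=ar");
        System.out.println(params.get("name").equals("\u0645\u0631\u062D\u0628\u0627")); // true
        System.out.println(params.get("lang")); // ar
    }
}
```

A request wrapper could override getParameter(name) to return parse(getQueryString()).get(name) for GET requests, which keeps the decoding in one place instead of in every servlet.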

http://balusc.blogspot.com/2009/05/unicode-how-to-get-characters-right.html
http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
http://www.joelonsoftware.com/articles/Unicode.html
http://java.sun.com/javaee/5/docs/tutorial/doc/bnayb.html



For a more complete solution, see: http://m-hewedy.blogspot.com/2010/05/enable-arabic-non-ascii-characters-for.html

4 comments:

elmyelmak said...

Please look for a solution to the prochat problem:
when typing normally the text appears in Arabic, but it reaches the other side as symbols such as
ط"ظ

mhewedy said...

Could you explain a bit more?

Mahmoud said...

what's the role of the following two lines? Do they affect the http header? Thanks

request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");

mhewedy said...

Welcome Dr Mahmoud,

First please consider using the following updated topic instead:


Enable Arabic (non-ASCII) characters in request parameteres of "GET" method


second:

request.setCharacterEncoding("UTF-8");

It doesn't change the request itself, as the request has already been sent by the client. It actually sets the character encoding used to decode the request body (which contains the parameter data in the case of POST, and is empty in the case of GET).


response.setCharacterEncoding("UTF-8");


Quote from Servlet API

In the case of HTTP, the character encoding is communicated as part of the Content-Type header for text media types