Showing posts with label XML. Show all posts
Showing posts with label XML. Show all posts

Wednesday, March 3, 2010

Calling an ASP.NET web service from a Java application

First, the Java application must contain a class that will encapsulate the functionality to invoke a XML web service via SOAP calls. The code is below:

import java.net.*;
import java.io.*;
import org.w3c.dom.Document;
import org.w3c.dom.*;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class WebServiceInvoker {
private String HTTPContinue = "^.*HTTP/[0-9].[0-9] 1[0-9][0-9].*$";
private String HTTPOk = "^.*HTTP/[0-9].[0-9] 200.*$";
private String HTTPSuccess = "^.*HTTP/[0-9].[0-9] 2[0-9][0-9].*$";
private String HTTPRedirect = "^.*HTTP/[0-9].[0-9] 3[0-9][0-9].*$";
private String HTTPClientError = "^.*HTTP/[0-9].[0-9] 4[0-9][0-9].*$";
private String HTTPServerError = "^.*HTTP/[0-9].[0-9] 5[0-9][0-9].*$";

private String _namespace;
private String _host;
private String _path;
private int _port;

private int _timeout;

/**
* @param host The server hosting the web service
* @param port The port on the server which will accept the call (default 80)
* @param wsPath The relative path of the web service on the server
* @param namespace The namespace defined in the webservice ("http://tempuri.org")
*/
public WebServiceInvoker(String host, int port, String wsPath,
String namespace) {
_host = host;
_port = port;
_path = wsPath;
_namespace = namespace;
_timeout = 60000;
}

public void setTimeout(int timeout) {
_timeout = timeout;
}

public int getTimeout() {
return _timeout;
}

public String invokeRPC(String remoteProcedure, ParameterCollection params)
throws Exception {
String result = "";

try {
String xmlData = ""
+ ""
+ "" + "<" + remoteProcedure + " xmlns=\""
+ _namespace + "\">";
for (int i = 0; i < params.size(); ++i) {
String[] sParam = params.Get(i);
if (sParam.length == 2) {
String paramName = sParam[0];
String paramValue = GeneralUtil.FixSpecialChar_forXML(sParam[1]);

xmlData += "<" + paramName + ">";
xmlData += paramValue;
xmlData += "";
} else {
System.out.println(xmlData);
throw new Exception("Poorly formatted parameter returned from Parameter Collection. WsTester.testWs");
}
}

xmlData += "" + "
"
+ "
";
InetAddress addr = InetAddress.getByName(_host);
Socket sock = new Socket(addr, _port);
sock.setSoTimeout(_timeout);

// Send header
BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream(), "UTF-8"));
wr.write("POST " + _path + " HTTP/1.1\r\n");
wr.write("Host: " + _host + "\r\n");
//wr.write("Content-Type: application/soap+xml; charset=\"utf-8\"; action=\"http://[URL to ASP.NET web service]\" \r\n");
wr.write("Content-Type: application/soap+xml; charset=\"utf-8\"; action=\"\" \r\n");
wr.write("Content-Length: " + xmlData.length() + "\r\n");
wr.write("\r\n");

// Send data
System.out.println("Soap Message:\n" + xmlData);
wr.write(xmlData);
wr.flush();

// Response
BufferedReader rd = new BufferedReader(new InputStreamReader(sock.getInputStream()));
String line = "";
String header = "";
boolean bContinue = true;
int byteCount = 0;
// Read Header
while (bContinue) {
do {
line = rd.readLine();
header += line + " ";
} while (!line.matches(""));

if (header.matches(HTTPContinue)) {
// Connection's established. Wait for the next header
header = "";
bContinue = true;
} else if (header.matches(HTTPOk)) {
// HTTP Ok. Retreive the data
byteCount = parseContentLength(header);
header = "";
bContinue = false;
} else if (header.matches(HTTPSuccess)) {
// Some other non-error success code
// Try again to see what happens, or wait until the socket
// times out
header = "";
bContinue = true;
} else if (header.matches(HTTPRedirect)) {
// Shouldn't encounter this one, but if we do I have no idea
// how to handle it
bContinue = false;
throw new Exception("HTTP Redirect encountered:\r\n"
+ header);
} else if (header.matches(HTTPClientError)) {
// Client error, most likely a Server not Found (404) or
// Forbidden (403, bad credentials).
//header = "";
bContinue = false;
throw new Exception("HTTP Client Error encountered:\r\n"
+ header);
} else if (header.matches(HTTPServerError)) {
// Server errors. Internal Service Error (500) type errors
byteCount = parseContentLength(header);
char c[] = new char[byteCount];
rd.read(c, 0, byteCount);

//header = "";
bContinue = false;

throw new Exception("HTTP Server Error encountered:\r\n" + header + "\r\n" + String.valueOf(c));
} else {
throw new Exception("Unknown HTTP header:\r\n" + header);
}
}

if (byteCount != 0) {
// Read Data
char c[] = new char[byteCount];
rd.read(c, 0, byteCount);

DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();

StringReader reader = new StringReader(String.valueOf(c)+ "\r\n");
InputSource source = new InputSource(reader);

Document doc = docBuilder.parse(source);
doc.getDocumentElement().normalize();

NodeList list = doc.getElementsByTagName(remoteProcedure + "Result");
if (list.getLength() > 0) {
result = list.item(0).getFirstChild().getNodeValue();
}
}
}

catch (Exception ex) {
System.out.println(ex.toString());
}

return result;
}

private int parseContentLength(String header) {
try {
int start = header.indexOf("Content-Length: ");
int end = header.indexOf(" ", start + 16);
String contentLine = header.substring(start, end);
contentLine = contentLine.replaceAll("^Content-Length: ", "");
contentLine = contentLine.trim();

return Integer.valueOf(contentLine).intValue();
} catch (Exception ex) {
return 0;
}
}
}

Also create a class called ParameterCollection which will hold the parameters to the web method invoked by the Java invoker:

public class ParameterCollection {
private String[][] _params;
int _index;
int _size = 10;

public ParameterCollection()
{
_params = new String[_size][2];
_index = 0;
}

public void Add(String param_name, String param_value)
{
if (_index >= _size)
{
IncreaseSize();
}

_params[_index][0] = param_name;
_params[_index][1] = param_value;

++ _index;
}

public String[] Get(int index) throws Exception
{
String[] retVal = new String[2];

if (index >= _index)
{
// Out of bounds
throw new Exception("Parameter Collection index is out of bounds: ParameterCollection.Get(" + index + ")");
}

retVal[0] = _params[index][0];
retVal[1] = _params[index][1];

return retVal;
}

public int size()
{
return _index;
}

private void IncreaseSize()
{
String[][] sOld = _params;
_size *= 2;

_params = new String[_size][2];

for (int i = 0; i < _size/2; ++i)
{
_params[i][0] = sOld[i][0];
_params[i][1] = sOld[i][1];
}
}
}


Next, after using the Visual Studio ASP.NET web service project wizard to create a web service project, add the following attribute to the class declaration of the class representing the web service:

[SoapDocumentService(RoutingStyle=SoapServiceRoutingStyle.RequestElement)]

The class declaration should look something like:
[WebService(Namespace = "http://tempuri.org/")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
[ToolboxItem(false)]
[SoapDocumentService(RoutingStyle=SoapServiceRoutingStyle.RequestElement)]
public class Service1 : System.Web.Services.WebService


In the java class that will invoke the web service, create an instance of the WebServiceInvoker class:

WebServiceInvoker proxy = new WebServiceInvoker(webserviceservername, port, webserviceurl);

Create a list of parameters and populate appropriately:
ParameterCollection() oColl = new ParameterCollection();
oCall.Add(“parameter1”, “value1”);

Then use this as a parameter to the invokeRPC method of the web service proxy class:

Proxy.invokeRPC(“Name of web method to call”, oCall);

In IIS, make sure the ASP.NET web service accepts anonymous connections, and set which user the anonymous connections will use to access resources.

Thursday, December 25, 2008

Performance issues in querying XML in SQL Server

Introduction
One of the challenges in the Canadian broadcasting industry is to prove to the CRTC that what a broadcaster aired on television is actually what they wanted to air, and if a discrepancy exists, have the ability to explain it. To do so, broadcast trafficking systems like S4M and SDS generate playlists and asrun logs, which must be reconciled per CRTC regulations. I was tasked with developing a simple application to reconcile these playlists and asrun logs with each other. Since the playlists were provided as XML files, I stored the playlists in a SQL Server XML column so when provided with an asrun log, I could always go back to the database and find the associated playlist easily with a SQL query. Playlists were linked to asrun log files via an event id, which was an attribute in the XML playlist file but a field on a flat text asrun log. This meant that every time I needed to verify the event record in the asrun log had a matching playlist event, I would have to parse out the event id from a flat text row, then query the XML column for an event XML node with an eventid attribute with a matching value.

That was the business problem I was attempting to address. To translate it into SQL syntax, I had to execute the following query every time I wanted to reconcile an event record:

SELECT Playlist.query('for $A in /dlprog/dldate/dlevent
where $A/@eventid[.="79576616"]
return
$A '
)
FROM [Corporate_Applications].[dbo].[BIP_Playlists]

(Many thanks to Yihong Xu, our trustworthy data architect, in providing this query).

The Performance Issue:
If one examines the above SELECT statement query, it will become obvious that the records are not filtered with a WHERE clause. Instead, all the records are subjected to a XPath query in the SELECT clause. So when I innocently applied this query in my code, my reconciliation of the asrun log against the playlist took approximately 14 minutes.

The Solution:
Performance issues are normally rooted in architecture, so I revisited my application and in particular, my data architecture. Obviously, the first thing I focused on was reducing the time taken for the query. Since all the querying was done in the XPath expression in the SELECT statement, I had to pull that logic out and put it into a WHERE clause. However, the existing data architecture would not allow me.

Therefore, the only choice I had was to create another table to hold the eventid information I would be querying against in its own column, as opposed to buried in the XML data in the XML column, as well as the related values I would need to SELECT from the dlevent node. I then populated this table as I read the playlist files, as at that point I was inserting the entire playlist file as a XML field into the SQL Server table anyway. Next, I adjusted the above SELECT statement to query against the eventid column of my new table, and the resulting performance improvement was staggering. The resulting reconciliations now took roughly 1 minute, meaning a performance improvement of about 14 times.

Lesson learned
XPath queries embedded in SELECT clauses are very inefficient. If there is an attribute in a piece of XML data you will constantly be querying, create a separate table that holds that attribute value as a column, along with foreign keys to link back to the original table with the XML column.

Looking back on the problem, I would have done a deeper analysis of the data architecture to ensure that the table structures were optimized for SELECT queries, as opposed to just aiming for simplicity. I would also have used a different content management system to archive the playlists, such as Documentum, since it more naturally supports management and chunking of XML data, as well as indexing the XML data for fast retrieval.