Showing posts with label d. Show all posts
Showing posts with label d. Show all posts

Saturday, October 9, 2010

Damn Java Socket Exception Messages

When creating an error message you should think about what information would be useful for understanding what went wrong. This should especially be true if you are creating a library that is likely to be used by many other systems. Providing good error messages up front means that even if the programmers using the library do not check and customize the messages, the user will still get a reasonable result. For some use cases, such as scripting, it can also be useful because it may be a quick one-off program where the goal is to quickly automate a repetitive task or perform some analysis. In this case, having useful default error messages can speed up the initial development so you get your answer faster.

In my case, I was checking logs for a system that crawls pages and hence attempts to resolve and connect to thousands of hosts. This system logs the exceptions, but unfortunately did not provide a customized message when doing so. Analyzing the logs showed that the two most common error messages were: 1) a failure to resolve the hostname and 2) failing to connect to an HTTP server on the host. An example error message for the first case is java.net.UnknownHostException: some-host-that-does-not-exist. This message is quite useful as the exception name explains the problem and the message tells me the name of the host that could not be resolved. An example message for the second case is java.net.SocketTimeoutException: connect timed out. This message is explains the problem, but doesn't given me the crucial information of what it was trying to connect to.

Though this can easily be fixed in the application code, it is disappointing that the default message is so bad. I have noticed that my opinion of a programming language or technology seems to go down steadily the longer I am forced to use it at work. Is the grass greener on one of the other sides? How do other languages, or rather the networking libraries they provide, fair for this use case? I looked at 13 options to see how many would give a decent error message for both use cases. The results were not very encouraging. Only one option, Go, had reasonable messages for both. For the host not found case 4 options included the hostname. Only two options provided the host and port in the failed to connect case. The results are summarized in the table below the fold with links to the source code and raw error messages.