Saturday, October 9, 2010

Damn Java Socket Exception Messages

When creating an error message you should think about what information would be useful for understanding what went wrong. This should especially be true if you are creating a library that is likely to be used by many other systems. Providing good error messages up front means that even if the programmers using the library do not check and customize the messages, the user will still get a reasonable result. For some use cases, such as scripting, it can also be useful because it may be a quick one-off program where the goal is to quickly automate a repetitive task or perform some analysis. In this case, having useful default error messages can speed up the initial development so you get your answer faster.

In my case, I was checking logs for a system that crawls pages and hence attempts to resolve and connect to thousands of hosts. This system logs the exceptions, but unfortunately does not provide a customized message when doing so. Analyzing the logs showed that the two most common errors were: 1) a failure to resolve the hostname and 2) a failure to connect to an HTTP server on the host. An example error message for the first case is java.net.UnknownHostException: some-host-that-does-not-exist. This message is quite useful, as the exception name explains the problem and the message tells me the name of the host that could not be resolved. An example message for the second case is java.net.SocketTimeoutException: connect timed out. This message explains the problem, but doesn't give me the crucial information: what it was trying to connect to.

Though this can easily be fixed in the application code, it is disappointing that the default message is so bad. I have noticed that my opinion of a programming language or technology seems to go down steadily the longer I am forced to use it at work. Is the grass greener on one of the other sides? How do other languages, or rather the networking libraries they provide, fare for this use case? I looked at 13 options to see how many would give a decent error message for both cases. The results were not very encouraging. Only one option, Go, had reasonable messages for both. For the host-not-found case, four options included the hostname. Only two options provided the host and port in the failed-to-connect case. The results are summarized in the table below the fold, followed by the source code and raw error messages for each option.

| Language - Library | Host not found | Unable to connect |
| --- | --- | --- |
| Bash - nc | No | Yes |
| C - sys/socket.h | No | No |
| C++ - boost::asio | No | No |
| C# - System.Net.Sockets | No | No |
| D - std.socket | Yes | No |
| Erlang - gen_tcp | No | No |
| Go - net.Dial | Yes | Yes |
| Haskell - Network.Socket | No | No |
| Java - java.net.Socket | Yes | No |
| Perl - IO::Socket::INET | Yes | No |
| Python - socket | No | No |
| Ruby - socket | No | No |
| TCL - socket | No | No |

Bash - NC command

#!/bin/bash

if [ "$1" == "" ] || [ "$2" == "" ]; then
    echo "Usage: $0 <host> <port>"
    exit 1
fi

# Quote the arguments so unusual input is passed through to nc unchanged.
nc -v -w1 "$1" "$2"
Host not found:
$ ./connect.sh some-host-that-does-not-exist 12345
nc: getaddrinfo: nodename nor servname provided, or not known
Unable to connect:
$ ./connect.sh 1.2.3.4 12345
nc: connect to 1.2.3.4 port 12345 (tcp) failed: Operation timed out

C - sys/socket.h

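#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>      /* close */
#include <sys/socket.h>
#include <netinet/in.h>  /* struct sockaddr_in */
#include <arpa/inet.h>   /* htons */
#include <netdb.h>       /* gethostbyname, herror */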

void
die(void (*f)(const char *), const char *label) {
    f(label);
    exit(EXIT_FAILURE);
}

int
main(int argc, char **argv) {
    if (argc < 3) {
        printf("Usage: %s <host> <port>\n", argv[0]);
        return EXIT_FAILURE;
    }

    int s = socket(PF_INET, SOCK_STREAM, 0);
    if (s < 0) {
        die(perror, "socket");
    }

    struct hostent *host = gethostbyname(argv[1]);
    if (host == NULL) {
        die(herror, "gethostbyname");
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(atoi(argv[2]));
    addr.sin_addr = *(struct in_addr *) host->h_addr;

    if (connect(s, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        die(perror, "connect");
    }

    if (close(s) < 0) {
        die(perror, "close");
    }

    return EXIT_SUCCESS;
}
Host not found:
$ ./connect some-host-that-does-not-exist 12345
gethostbyname: Unknown host
Unable to connect:
$ ./connect 1.2.3.4 12345
connect: Connection timed out

C++ - boost::asio

#include <iostream>
#include <cstdlib>
#include <boost/asio.hpp>
#include <boost/system/error_code.hpp>

using namespace boost::asio;
using namespace boost::asio::ip;

int
main(int argc, char **argv) {
    if (argc < 3) {
        std::cout << "Usage: " << argv[0] << " <host> <port>" << std::endl;
        return EXIT_FAILURE;
    }

    io_service service;

    boost::system::error_code error;
    tcp::resolver resolver(service);
    tcp::resolver::query query(argv[1], argv[2]);
    tcp::resolver::iterator iter = resolver.resolve(query, error);
    tcp::resolver::iterator end;
    if (iter == end) {
        throw boost::system::system_error(error);
    }

    tcp::socket socket(service);
    socket.connect(*iter);
    socket.close();

    return EXIT_SUCCESS;
}
Host not found:
$ ./connect some-host-that-does-not-exist 12345
terminate called after throwing an instance of 'boost::system::system_error'
  what():  Host not found (authoritative)
Abort trap
Unable to connect:
$ ./connect 1.2.3.4 12345
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
  what():  Operation timed out
Abort trap

C# - System.Net.Sockets

Note: I tested this using Mono, not the implementation from Microsoft. I don't know whether the error messages would be exactly the same.
using System;
using System.Net.Sockets;

public class Connect {
    static public void Main(String[] args) {
        if (args.Length < 2) {
            Console.WriteLine("Usage: connect.exe <host> <port>");
            Environment.Exit(1);
        }

        TcpClient s = new TcpClient(args[0], int.Parse(args[1]));
        s.Close();
    }
}
Host not found:
$ mono ./connect.exe some-host-that-does-not-exist 12345

Unhandled Exception: System.Net.Sockets.SocketException: No such host is known
  at System.Net.Dns.hostent_to_IPHostEntry (System.String h_name, System.String[] h_aliases, System.String[] h_addrlist) [0x00000] in <filename unknown>:0 
  at System.Net.Dns.GetHostByName (System.String hostName) [0x00000] in <filename unknown>:0 
  at System.Net.Dns.GetHostEntry (System.String hostNameOrAddress) [0x00000] in <filename unknown>:0 
  at System.Net.Dns.GetHostAddresses (System.String hostNameOrAddress) [0x00000] in <filename unknown>:0 
  at System.Net.Sockets.TcpClient.Connect (System.String hostname, Int32 port) [0x00000] in <filename unknown>:0 
  at System.Net.Sockets.TcpClient..ctor (System.String hostname, Int32 port) [0x00000] in <filename unknown>:0 
  at Connect.Main (System.String[] args) [0x00000] in <filename unknown>:0
Unable to connect:
$ mono ./connect.exe 1.2.3.4 12345
Unhandled Exception: System.Net.Sockets.SocketException: Connection timed out
  at System.Net.Sockets.Socket.Connect (System.Net.EndPoint remoteEP) [0x00000] in <filename unknown>:0 
  at System.Net.Sockets.TcpClient.Connect (System.Net.IPEndPoint remote_end_point) [0x00000] in <filename unknown>:0 
  at System.Net.Sockets.TcpClient.Connect (System.Net.IPAddress[] ipAddresses, Int32 port) [0x00000] in <filename unknown>:0

D - std.socket

This is the first, and so far only, program I have written in D. I was pleasantly surprised by how easy it was to get it working. Overall I spent about thirty minutes including the time to download and install the compiler and then learn enough about the language to cobble together this simple example. Definitely a language I will look into more.
import std.conv;
import std.stdio;
import std.socket;

int main(string[] args) {
    if (args.length < 3) {
        writefln("Usage: %s <host> <port>", args[0]);
        return 1;
    }

    Socket s = new Socket(
        AddressFamily.INET, SocketType.STREAM, ProtocolType.TCP);
    s.connect(new InternetAddress(args[1], to!ushort(args[2])));
    s.close();

    return 0;
}
Host not found:
$ ./connect some-host-that-does-not-exist 12345
std.socket.AddressException: Unable to resolve host 'some-host-that-does-not-exist'
----------------
5   connect                             0x00023c5e std std.socket.InternetAddress.__ctor(immutable(char)[], ushort) + 166
6   connect                             0x00001a9e _Dmain + 246
7   connect                             0x00013043 extern (C) int rt.dmain2.main(int, char**) + 23
8   connect                             0x00012f7a extern (C) int rt.dmain2.main(int, char**) + 42
9   connect                             0x0001308b extern (C) int rt.dmain2.main(int, char**) + 59
10  connect                             0x00012f7a extern (C) int rt.dmain2.main(int, char**) + 42
11  connect                             0x00012f07 main + 179
12  connect                             0x0000199d start + 53
13  ???                                 0x00000003 0x0 + 3
Unable to connect:
$ ./connect 1.2.3.4 12345
std.socket.SocketException: Unable to connect socket: Operation timed out
----------------
5   connect                             0x00024301 D3std6socket6Socket7connectMFC3std6socket7AddressZv + 129
6   connect                             0x00001aaa _Dmain + 258
7   connect                             0x00013043 extern (C) int rt.dmain2.main(int, char**) + 23
8   connect                             0x00012f7a extern (C) int rt.dmain2.main(int, char**) + 42
9   connect                             0x0001308b extern (C) int rt.dmain2.main(int, char**) + 59
10  connect                             0x00012f7a extern (C) int rt.dmain2.main(int, char**) + 42
11  connect                             0x00012f07 main + 179
12  connect                             0x0000199d start + 53
13  ???                                 0x00000003 0x0 + 3

Erlang - gen_tcp

-module(connect).
-export([doConnect/2]).

doConnect(Host, Port) ->
    case gen_tcp:connect(Host, Port, [binary, {packet, 0}]) of
        {ok, Socket} ->
            gen_tcp:close(Socket);
        {error, Reason} ->
            io:format("error: ~p~n", [Reason])
    end.
Host not found:
12> connect:doConnect("some-host-that-does-not-exist", 12345).
error: nxdomain
Unable to connect:
13> connect:doConnect("1.2.3.4", 12345).                      
error: etimedout

Go - net.Dial

package main

import (
    "fmt"
    "log"
    "net"
    "os"
)

func main() {
    if len(os.Args) < 3 {
        fmt.Printf("Usage: %s <host> <port>\n", os.Args[0])
        os.Exit(1)
    }

    client, err := net.Dial("tcp", "", os.Args[1] + ":" + os.Args[2])
    if err != nil {
        log.Exit("connect error:", err)
    }
    client.Close()
}
Host not found:
$ ./connect some-host-that-does-not-exist 12345
2010/10/08 12:13:32 connect error: dial tcp some-host-that-does-not-exist:12345: lookup some-host-that-does-not-exist.: no such host
Unable to connect:
$ ./connect 1.2.3.4 12345
2010/10/08 12:15:04 connect error: dial tcp 1.2.3.4:12345: operation timed out

Haskell - Network.Socket

import Data.Maybe
import Network.Socket
import System

doConnect :: [String] -> IO ()

doConnect [host, port] = do
    addrs <- getAddrInfo (Just defaultHints) (Just host) (Just port)
    let addr = head addrs
    s <- socket (addrFamily addr) Stream 0
    connect s (addrAddress addr)
    sClose s

doConnect _ = do
    name <- getProgName
    putStrLn $ "Usage: " ++ name ++ " <host> <port>"
    exitFailure

main = do
    args <- getArgs
    doConnect args
Host not found:
$ ./connect some-host-that-does-not-exist 12345
connect: getAddrInfo: does not exist (nodename nor servname provided, or not known)
Unable to connect:
$ ./connect 1.2.3.4 12345
connect: connect: does not exist (Network is unreachable)

Java - java.net.Socket

import java.net.*;

public class Connect {
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("Usage: java Connect <host> <port>");
            System.exit(1);
        }
        InetSocketAddress addr = new InetSocketAddress(
            args[0], Integer.valueOf(args[1]));
        Socket s = new Socket();
        s.connect(addr, 1000);
        s.close();
    }
}
Host not found:
$ java Connect some-host-that-does-not-exist 12345
Exception in thread "main" java.net.UnknownHostException: some-host-that-does-not-exist
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:177)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at Connect.main(Connect.java:12)
Unable to connect:
$ java Connect 1.2.3.4 12345           
Exception in thread "main" java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at Connect.main(Connect.java:12)
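
As mentioned at the top, the missing target can be added in application code. Here is a minimal sketch of one way to do that, not what the crawler actually does: the class name DescriptiveConnect and the helpers describe and connect are hypothetical names made up for this example. The wrapper catches the IOException from Socket.connect and rethrows it with the host and port in the message, keeping the original exception as the cause.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class DescriptiveConnect {
    // Hypothetical helper: builds a message that names the target as well
    // as the original exception type and message.
    public static String describe(String host, int port, IOException cause) {
        return "failed to connect to " + host + ":" + port + ": "
            + cause.getClass().getName() + ": " + cause.getMessage();
    }

    // Hypothetical wrapper around Socket.connect. On failure it closes the
    // socket and rethrows with the host and port added to the message.
    public static Socket connect(String host, int port, int timeoutMillis)
            throws IOException {
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return s;
        } catch (IOException e) {
            try {
                s.close();
            } catch (IOException ignored) {
                // The connect failure is the error worth reporting.
            }
            throw new IOException(describe(host, port, e), e);
        }
    }

    public static void main(String[] args) {
        // Format a sample exception without touching the network.
        IOException sample = new SocketTimeoutException("connect timed out");
        System.out.println(describe("1.2.3.4", 12345, sample));
    }
}
```

One trade-off of this approach is that callers now see a plain IOException rather than the specific subclass, so code that needs to distinguish timeouts from other failures has to inspect getCause().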

Perl - IO::Socket::INET

#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket::INET;

if (scalar @ARGV < 2) {
    print "Usage: $0 <host> <port>\n";
    exit 1;
}

my $s = IO::Socket::INET->new(PeerAddr => $ARGV[0], PeerPort => $ARGV[1]);
die $@ unless $s;
$s->close;
Host not found:
$ ./connect.pl some-host-that-does-not-exist 12345
IO::Socket::INET: Bad hostname 'some-host-that-does-not-exist' at ./connect.pl line 13.
Unable to connect:
$ ./connect.pl 1.2.3.4 12345
IO::Socket::INET: connect: Connection timed out at ./connect.pl line 13.

Python - socket

#!/usr/bin/python

import socket
import sys

if len(sys.argv) < 3:
    print "Usage: %s <host> <port>" % sys.argv[0]
    sys.exit(1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((sys.argv[1], int(sys.argv[2])))
s.close()
Host not found:
$ ./connect.py some-host-that-does-not-exist 12345
Traceback (most recent call last):
  File "./connect.py", line 11, in ?
    s.connect((sys.argv[1], int(sys.argv[2])))
  File "<string>", line 1, in connect
socket.gaierror: (-2, 'Name or service not known')
Unable to connect:
$ ./connect.py 1.2.3.4 12345
Traceback (most recent call last):
  File "./connect.py", line 11, in ?
    s.connect((sys.argv[1], int(sys.argv[2])))
  File "<string>", line 1, in connect
socket.error: (110, 'Connection timed out')

Ruby - socket

#!/usr/bin/ruby

require 'socket'

if ARGV.length < 2
    print "Usage: #{$0} <host> <port>\n"
    exit(1)
end

s = TCPSocket.open(ARGV[0], ARGV[1])
s.close
Host not found:
$ ./connect.rb some-host-that-does-not-exist 12345
./connect.rb:10:in `initialize': getaddrinfo: Name or service not known (SocketError)
        from ./connect.rb:10:in `open'
        from ./connect.rb:10
Unable to connect:
$ ./connect.rb 1.2.3.4 12345
./connect.rb:10:in `initialize': Connection timed out - connect(2) (Errno::ETIMEDOUT)
        from ./connect.rb:10:in `open'
        from ./connect.rb:10

TCL - socket

#!/usr/bin/tclsh

if {$argc < 2} {
    puts "Usage: $argv0 <host> <port>"
    exit 1
}

set s [socket [lindex $argv 0] [lindex $argv 1]]
close $s
Host not found:
$ ./connect.tcl some-host-that-does-not-exist 12345
couldn't open socket: host is unreachable (nodename nor servname provided, or not known)
    while executing
"socket [lindex $argv 0] [lindex $argv 1]"
    invoked from within
"set s [socket [lindex $argv 0] [lindex $argv 1]]"
    (file "./connect.tcl" line 8)
Unable to connect:
$ ./connect.tcl 1.2.3.4 12345
couldn't open socket: connection timed out
    while executing
"socket [lindex $argv 0] [lindex $argv 1]"
    invoked from within
"set s [socket [lindex $argv 0] [lindex $argv 1]]"
    (file "./connect.tcl" line 8)
