Fetching a web page without changing its character encoding, then detecting and converting the encoding

For encoding detection I use
juniversalchardet : Java port of universalchardet
Download juniversalchardet-1.0.3.jar from there and save it in a suitable directory.

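Once the character encoding has been detected, use it together with the data held in the byte[] to build a String. The bytes are decoded only at this last step. A Java String holds Unicode text (UTF-16 internally), so from there it can be written out as UTF-8 or any other encoding.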

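import java.nio.charset.Charset;
import org.mozilla.universalchardet.UniversalDetector;

byte[] content;      // raw bytes of the fetched page
int contentLength;   // number of valid bytes in content

UniversalDetector detector = new UniversalDetector(null);
detector.handleData(content, 0, contentLength);
detector.dataEnd();
String encoding = detector.getDetectedCharset();  // null if detection failed
detector.reset();
if (encoding == null) {
    encoding = "UTF-8";  // fallback is an assumption; pick what suits your data
}

// String(byte[], int, int, Charset) is supported since Java 6
String outContent = new String(content, 0, contentLength, Charset.forName(encoding));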
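
The snippet above assumes content already holds the page as raw bytes. One way to get there, and to turn the decoded text into UTF-8 bytes afterwards, might look like the sketch below. It uses only the JDK (Java 7 or later). The class name RawPageReader and its method names are made up for illustration, and the encoding argument stands for whatever the detector returned.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of juniversalchardet.
public class RawPageReader {

    // Read every byte from the stream without decoding anything.
    public static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Fetch a page as raw bytes. No Reader is involved, so no charset
    // is applied and the original encoding is left untouched.
    public static byte[] fetch(String pageUrl) throws IOException {
        try (InputStream in = URI.create(pageUrl).toURL().openStream()) {
            return readAllBytes(in);
        }
    }

    // Decode the bytes with the given (detected) charset name,
    // then re-encode the resulting String as UTF-8 bytes.
    public static byte[] toUtf8(byte[] content, String encoding) {
        String text = new String(content, Charset.forName(encoding));
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

With this, content = RawPageReader.fetch(url) and contentLength = content.length would feed the detector, and toUtf8(content, encoding) gives bytes that are ready to be written to a UTF-8 file.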