Socket example - reading a web page using a simple socket

import java.io.*;
import java.net.Socket;

public class Main {

    public static void main(String[] args) throws IOException {//We don't handle Exceptions in this example 
        //Open a socket to stackoverflow.com, port 80
        Socket socket = new Socket("stackoverflow.com",80);

        //Prepare input, output stream before sending request
        OutputStream outStream = socket.getOutputStream();
        InputStream inStream = socket.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(inStream));
        PrintWriter writer = new PrintWriter(new BufferedOutputStream(outStream));

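        //Send a basic HTTP request. HTTP requires CRLF line endings, and
        //"Connection: close" asks the server to close the socket after responding,
        //so that reading until end of stream terminates
        writer.print("GET / HTTP/1.1\r\nHost: stackoverflow.com\r\nConnection: close\r\n\r\n");
        writer.flush();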

        //Read the response
        System.out.println(readFully(reader));

        //Close the socket
        socket.close();
    }
    
    //Read everything from the reader until end of stream and return it as a String
    private static String readFully(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int BUFFER_SIZE = 1024;
        char[] buffer = new char[BUFFER_SIZE]; // or some other size
        int charsRead;
        while ((charsRead = in.read(buffer, 0, BUFFER_SIZE)) != -1) {
            sb.append(buffer, 0, charsRead);
        }
        return sb.toString();
    }
}

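You should get a response that starts with HTTP/1.1 200 OK, which indicates a normal HTTP response (a server may instead answer with another status, such as a redirect), followed by the rest of the HTTP headers and then the raw web page in HTML form.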
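To make the layout of that response concrete, here is a minimal sketch that splits a raw response string into status line, headers and body. The class name RawHttpResponse and its fields are hypothetical and not part of the original example. It assumes the response uses CRLF line endings and that the body is not sent with chunked transfer encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a full HTTP parser.
class RawHttpResponse {
    final String statusLine;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    RawHttpResponse(String raw) {
        // Headers and body are separated by the first empty line (CRLF CRLF).
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        statusLine = lines[0];
        // Every following line is expected to look like "Name: value".
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    // The status code is the second space-separated token of the status line.
    // Assumes a well-formed status line; a malformed one throws.
    int statusCode() {
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>";
        RawHttpResponse r = new RawHttpResponse(raw);
        System.out.println(r.statusCode());
        System.out.println(r.headers.get("Content-Type"));
        System.out.println(r.body);
    }
}
```

In the socket example, the string returned by readFully() could be passed to this constructor in the same way.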

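Note that the readFully() method matters here: it reads raw characters in chunks until the end of the stream, so it does not depend on how lines are terminated. The last line of a web page may lack a trailing line break, which makes line-oriented reading with readLine() awkward, so either read the stream by hand or use a utility method from Apache commons-io IOUtils.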
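The chunked read can be tried without a network connection. The sketch below is a self-contained variant of readFully(), run against a StringReader whose content has no trailing line break. The class name ReadFullyDemo and the deliberately small buffer size are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class; mirrors the readFully() idea from the example.
class ReadFullyDemo {
    static String readFully(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8]; // small on purpose, to force several reads
        int charsRead;
        // read() returns -1 once the end of the stream is reached
        while ((charsRead = in.read(buffer, 0, buffer.length)) != -1) {
            sb.append(buffer, 0, charsRead);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Note: the last line has no trailing line break.
        String page = "<html>\r\n<body>hi</body>\r\n</html>";
        String copy = readFully(new StringReader(page));
        System.out.println(copy.equals(page)); // prints true
    }
}
```

Because the loop copies exactly the characters it receives, line terminators inside the page are preserved as they were sent.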

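This example is a simple demonstration of using a socket to connect to an existing resource; it is not a practical way to access web pages. If you need to access a web page from Java, it is better to use an existing HTTP client library, such as Apache's HTTP client or Google's HTTP client.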
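For comparison, here is a rough sketch of the same request through a higher-level API. It uses the JDK's built-in java.net.HttpURLConnection instead of the Apache or Google libraries named above, so that it runs without extra dependencies. The class name, the method names prepareGet and fetch, the 5-second timeouts and the UTF-8 assumption are all illustrative choices, not part of the original example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; swaps in the JDK's HttpURLConnection for a third-party client.
class HttpUrlConnectionSketch {
    // Prepares a GET request; nothing is sent until the response is read.
    static HttpURLConnection prepareGet(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // milliseconds, assumed value
        conn.setReadTimeout(5000);    // milliseconds, assumed value
        return conn;
    }

    // Sends the request and returns the body as text, assuming UTF-8 content.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = prepareGet(url);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
            return sb.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("https://stackoverflow.com/"));
    }
}
```

Unlike the raw socket version, the connection object handles the request line, headers, HTTPS and transfer encoding, so only the body text is returned.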