Socket example - reading a web page using a simple socket
import java.io.*;
import java.net.Socket;

public class Main {
    public static void main(String[] args) throws IOException { // Exceptions are not handled in this example
        // Open a socket to stackoverflow.com, port 80
        Socket socket = new Socket("stackoverflow.com", 80);
        // Prepare the input and output streams before sending the request
        OutputStream outStream = socket.getOutputStream();
        InputStream inStream = socket.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(inStream));
        PrintWriter writer = new PrintWriter(new BufferedOutputStream(outStream));
        // Send a basic HTTP request. HTTP lines end with CRLF, and "Connection: close"
        // asks the server to close the connection so the read loop reaches end of stream
        writer.print("GET / HTTP/1.1\r\nHost: stackoverflow.com\r\nConnection: close\r\n\r\n");
        writer.flush();
        // Read the response
        System.out.println(readFully(reader));
        // Close the socket
        socket.close();
    }

    // Reads characters until end of stream and returns them as a single string
    private static String readFully(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int BUFFER_SIZE = 1024;
        char[] buffer = new char[BUFFER_SIZE]; // or some other size
        int charsRead = 0;
        while ((charsRead = in.read(buffer, 0, BUFFER_SIZE)) != -1) {
            sb.append(buffer, 0, charsRead);
        }
        return sb.toString();
    }
}
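You should get a response that starts with HTTP/1.1 200 OK, which indicates a normal HTTP response, followed by the rest of the HTTP headers and then the raw web page in HTML form. A server that only serves HTTPS may answer with a redirect status such as 301 instead.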
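To make the layout of that response concrete, here is a minimal sketch that splits a raw response string into its status line, headers, and body. The class name HttpResponseParts and its fields are hypothetical and not part of the example above. The sketch assumes CRLF line endings and does not decode chunked transfer encoding, so treat it as an illustration rather than a complete parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the original example.
class HttpResponseParts {
    final String statusLine;
    final int statusCode;
    final Map<String, String> headers;
    final String body;

    private HttpResponseParts(String statusLine, int statusCode,
                              Map<String, String> headers, String body) {
        this.statusLine = statusLine;
        this.statusCode = statusCode;
        this.headers = headers;
        this.body = body;
    }

    static HttpResponseParts parse(String raw) {
        // The header section ends at the first blank line (CRLF CRLF)
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        String body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        String statusLine = lines[0];
        // Status line layout: HTTP-version SP status-code SP reason-phrase
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        int statusCode = Integer.parseInt(parts[1]);

        // Remaining lines are "Name: value" pairs; names are stored as received,
        // so lookups in this sketch are case-sensitive
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
        return new HttpResponseParts(statusLine, statusCode, headers, body);
    }
}
```

Passing the string returned by readFully() to HttpResponseParts.parse() would let the caller check statusCode before using body.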
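Note that the readFully() method reads the response in fixed-size chunks until end of stream, so it does not depend on line terminators. The last line of the web page may be missing its trailing newline, which makes line-by-line reading with readLine() easy to get wrong, so the example reads the stream manually. A utility method from Apache commons-io IOUtils can be used instead.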
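The following sketch illustrates the point with an in-memory reader instead of a socket: a chunked read loop returns the whole input even when the last line has no trailing newline. The class name ReadFullyDemo and the method readAll are hypothetical, and the buffer is deliberately tiny so the loop runs several times. Unlike readFully() above, this version wraps IOException in UncheckedIOException to keep the demo short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; it mirrors the read loop from the example above.
class ReadFullyDemo {
    static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8]; // small on purpose, to force several iterations
        int charsRead;
        try {
            // read() returns -1 only at end of stream, regardless of line terminators
            while ((charsRead = in.read(buffer, 0, buffer.length)) != -1) {
                sb.append(buffer, 0, charsRead);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The last line has no trailing newline
        String page = "<html>\n<body>hi</body>\n</html>";
        String result = readAll(new StringReader(page));
        System.out.println(result.equals(page)); // prints true
    }
}
```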
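This example is a simple demonstration of using a socket to connect to an existing resource; it is not a practical way to access web pages. If you need to access a web page from Java, it is better to use an existing HTTP client library, such as Apache's HTTP Client or Google's HTTP Client.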
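As a rough illustration of the client-library approach, the sketch below uses the JDK's built-in java.net.http.HttpClient (available since Java 11) rather than the Apache or Google libraries named above, since it needs no extra dependency. The class name PageFetcher, its method names, and the 10-second timeout are assumptions made for this sketch.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper built on the JDK HTTP client (Java 11+).
class PageFetcher {
    // Builds a GET request for the given URL; no network access happens here
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // assumed timeout for this sketch
                .GET()
                .build();
    }

    // Sends the request and returns the response body as a string
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL) // follow redirects, e.g. http to https
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println(fetch("https://stackoverflow.com/"));
    }
}
```

Compared with the raw socket version, the client takes care of request formatting, redirects, TLS, and decoding the response body.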