JAVA Pattern & Matcher

Java's Pattern & Matcher are what you need when you scrape the source of some page and want to pull out only the data you care about lol

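import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compile the pattern once, outside the loop. Whitespace is stripped from each
// line before matching, so <td align="right"> shows up as <tdalign="right">.
Pattern pt = Pattern.compile("<tdalign=\"right\">(.*?)</td>", Pattern.DOTALL);

URL url = new URL("URL of the target page");

// try-with-resources closes the reader when we are done
try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
    String inputLine;
    while ((inputLine = in.readLine()) != null) {
        inputLine = inputLine.replaceAll("\\p{Z}", ""); // strip whitespace lol
        Matcher mc = pt.matcher(inputLine);
        while (mc.find()) {
            String tmp1 = mc.group(); // the whole match, <td> tags included
            System.out.println(tmp1);
        }
        // String txt;
        // while (mc.find()) {
        //     txt = mc.group(1); // pull out only the data inside the <td> tag
        //     aa[i] = txt;
        //     i++;
        // }
    }
}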
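To make the difference between mc.group() and mc.group(1) concrete, here is a minimal sketch that runs the same regex against a hard-coded line instead of a live page. The class name TdExtractor, the method extractCells, and the sample HTML line are all made up for illustration; they are not from the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, only for illustration.
class TdExtractor {
    // Same regex as above: it expects whitespace to have been stripped already.
    private static final Pattern TD =
            Pattern.compile("<tdalign=\"right\">(.*?)</td>", Pattern.DOTALL);

    // Returns the text inside every matching <td>, without the tags.
    static List<String> extractCells(String line) {
        String stripped = line.replaceAll("\\p{Z}", "");
        List<String> cells = new ArrayList<>();
        Matcher mc = TD.matcher(stripped);
        while (mc.find()) {
            // group(1) is only the capture group; group() would be the whole match
            cells.add(mc.group(1));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Made-up sample line standing in for one line of fetched page source.
        String line = "<tr><td align=\"right\">1,234</td><td align=\"right\">56.7</td></tr>";
        System.out.println(extractCells(line));
    }
}
```

With the sample line, the list holds the two cell values 1,234 and 56.7. Collecting into a List sidesteps the undefined aa[] array and i counter from the commented-out loop.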

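When .trim() doesn't work you can fall back on .replaceAll(), but sometimes even that doesn't work. The usual cause is that the "space" isn't an ASCII space: .trim() only strips characters up to U+0020 and " " only matches U+0020, so a non-breaking space (from &nbsp;) or a full-width space slips through.

.replaceAll(" ", ""); <-- sometimes doesn't work

The fix is one of the two below

.replaceAll("\\p{Z}", ""); <-- removes all spaces (every Unicode separator; tabs and newlines are control characters and are not in \p{Z})

.replaceAll("(^\\p{Z}+|\\p{Z}+$)", ""); <-- removes leading and trailing spaces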
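Here is a small sketch of the cases above side by side. The class name WhitespaceDemo, the two helper methods, and the sample string are made up for illustration. The sample starts with a non-breaking space (U+00A0) and ends with an ideographic space (U+3000), which is one plausible way scraped text ends up with "spaces" that refuse to go away.

```java
// Hypothetical demo class, only for illustration.
class WhitespaceDemo {
    // Removes every Unicode separator character (category Z), anywhere in the string.
    static String stripAll(String s) {
        return s.replaceAll("\\p{Z}", "");
    }

    // Removes Unicode separators only at the start and end of the string.
    static String stripEnds(String s) {
        return s.replaceAll("(^\\p{Z}+|\\p{Z}+$)", "");
    }

    public static void main(String[] args) {
        // Non-breaking space, ASCII space, "1 234", ideographic space.
        String s = "\u00A0 1 234\u3000";

        // trim() leaves the string unchanged: both end characters are above U+0020.
        System.out.println("[" + s.trim() + "]");

        // Only the ASCII spaces go away; the other two characters stay.
        System.out.println("[" + s.replaceAll(" ", "") + "]");

        // All separators removed, leaving 1234.
        System.out.println("[" + stripAll(s) + "]");

        // Only the ends are cleaned, leaving "1 234" with its inner space.
        System.out.println("[" + stripEnds(s) + "]");
    }
}
```

The brackets in the output are only there so that leftover invisible characters are easier to spot.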

환's World