java - Extract text between two <hr> tags in CSS-less HTML

- April 15, 2010

Using Jsoup, what would be an optimal approach to extracting text whose pattern is known ([number]%%[number]) but which sits in an HTML page that uses neither CSS nor divs, spans, classes or other identifiers of any type (yup, an old HTML page that I have no control over)?

The only thing that consistently identifies that text segment (and is guaranteed to stay that way) is that the HTML always looks like this (within a larger body of HTML):

  <hr>2%%17<hr>

(The numbers 2 and 17 are examples only. They can be any numbers and, in fact, they are the two variables that I need to extract reliably from that HTML page.)

If the text were enclosed in a uniquely identifying <span> or <div>, I would have no problem extracting it using Jsoup. The problem is that this is not the case, and the only way I can think of right now (which is not elegant at all) is to process the raw HTML through a regex.

Processing the raw HTML through a regex seems wasteful, however, because I already have it parsed into a DOM through Jsoup.

Suggestions?
 
  How about this?  
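    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    Document document = Jsoup.connect(url).get();
    Elements hrs = document.select("hr");
    Pattern pattern = Pattern.compile("(\\d+%%\\d+)");
    for (Element hr : hrs) {
        if (hr.nextSibling() == null) continue; // an <hr> with nothing after it
        String textAfterHr = hr.nextSibling().toString();
        Matcher matcher = pattern.matcher(textAfterHr);
        while (matcher.find()) {
            System.out.println(matcher.group(1)); // <-- there, your data
        }
    }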
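Since the question asks for the two numbers as separate variables, the regex step can also capture each number in its own group rather than the whole token. The following is only a sketch of that step using the standard library; the class name HrPairExtractor and the method extractPairs are hypothetical, and tolerating whitespace around %% is an assumption that the question does not state. The input string stands in for the text obtained from the node following an <hr>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrPairExtractor {
    // One capture group per number, so each value can be read separately.
    // The optional whitespace around %% is an assumption, not from the question.
    private static final Pattern PAIR = Pattern.compile("(\\d+)\\s*%%\\s*(\\d+)");

    /** Returns every {first, second} pair found in the given text. */
    public static List<long[]> extractPairs(String text) {
        List<long[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            // Long.parseLong throws NumberFormatException if a number exceeds the long range.
            pairs.add(new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (long[] pair : extractPairs("2%%17")) {
            System.out.println(pair[0] + " and " + pair[1]); // prints "2 and 17"
        }
    }
}
```

In the loop over the hr elements, the string passed to extractPairs would be the text of each following sibling, so the DOM that Jsoup already built is reused and the regex only ever sees a short text fragment.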

 




  


















