java - Extract text between two tags in CSS-less HTML


Using Jsoup, what would be the best approach to extracting text whose pattern is known ([number]%%[number]) but which sits in an HTML page that uses neither CSS nor divs, spans, classes, or identifiers of any kind (yes, it is an old HTML page that I have no control over)?

The only thing that consistently identifies that text segment (and is guaranteed to stay that way) is that the HTML always looks like this (inside a larger body of HTML):

  <hr>2%%17<hr>

(The numbers 2 and 17 are examples only. They can be any numbers and are, in fact, the two variables that I need to extract reliably from that HTML page.)

If the text were inside an enclosing, uniquely identifiable <span> or <div>, I would have no problem extracting it with Jsoup. That is not the case here, and the only way I can think of right now (which is not elegant at all) is to run the raw HTML through a regex.

Running the raw HTML through a regex seems inefficient, however, because I already have it parsed into a DOM by Jsoup.

Suggestions?

How about this?

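  import java.util.regex.Matcher;
  import java.util.regex.Pattern;
  import org.jsoup.Jsoup;
  import org.jsoup.nodes.Document;
  import org.jsoup.nodes.Element;
  import org.jsoup.nodes.Node;
  import org.jsoup.select.Elements;

  Document document = Jsoup.connect(url).get();
  Elements hrs = document.select("hr");
  Pattern pattern = Pattern.compile("(\\d+%%\\d+)");
  for (Element hr : hrs) {
      // The node right after the <hr>; null when the <hr> is the last child
      Node next = hr.nextSibling();
      if (next == null) {
          continue;
      }
      String textAfterHr = next.toString();
      Matcher matcher = pattern.matcher(textAfterHr);
      while (matcher.find()) {
          System.out.println(matcher.group(1)); // <-- there, your data
      }
  }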
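Since the question asks for the two numbers as separate variables, the matching step could also capture each number in its own group. The following is only a sketch using the standard library: the class name `HrPairExtractor` and the method `extractPairs` are hypothetical, the input string stands in for the text found after an `<hr>`, and it assumes both numbers fit in an `int`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup.
final class HrPairExtractor {

    // One capture group per number; optional whitespace around "%%" is tolerated.
    private static final Pattern PAIR = Pattern.compile("(\\d+)\\s*%%\\s*(\\d+)");

    /** Returns every {first, second} pair found in the given text. */
    static List<int[]> extractPairs(String text) {
        List<int[]> pairs = new ArrayList<>();
        Matcher matcher = PAIR.matcher(text);
        while (matcher.find()) {
            // Assumption: both numbers fit in an int; parseInt throws otherwise.
            pairs.add(new int[] {
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2))
            });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Stand-in for the text of the node that follows an <hr>.
        String textAfterHr = "2%%17";
        for (int[] pair : extractPairs(textAfterHr)) {
            System.out.println(pair[0] + " and " + pair[1]);
        }
    }
}
```

Text that contains no such pair simply yields an empty list, so the helper can be called on every sibling node without extra checks.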
