I am stuck with a MySQL field that holds a long HTML document. I need to strip the HTML tags and keep only the words between them.
Say the field contains:
<!DOCTYPE html> <html> <body> <h1>My first title</h1> <p>My first paragraph.</p> </body> </html>
I want something like this:
"My first title My first paragraph."
I am currently doing this in Java, while exporting a CSV file, with the following function:
public String getStringFromHtml(String html) {
    // Drop anything that looks like a tag, then collapse runs of whitespace.
    String nohtml = html.replaceAll("<[^>]*>", "");
    return nohtml.trim().replaceAll("\\s+", " ");
}
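For reference, here is a minimal runnable sketch of that regex approach applied to the sample document above. The class name HtmlStripper is hypothetical, and replacing each tag with a space (rather than an empty string) is my own assumption so that adjacent elements such as </h1><p> do not fuse their text together. A regex like this is only a rough filter: it does not decode entities or handle a ">" inside an attribute value.

```java
// Hypothetical helper class, not part of the original export code.
class HtmlStripper {

    // Replace every tag-like run with a space, then normalise whitespace.
    static String getStringFromHtml(String html) {
        String noHtml = html.replaceAll("<[^>]*>", " ");
        return noHtml.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String html = "<!DOCTYPE html> <html> <body> <h1>My first title</h1> "
                + "<p>My first paragraph.</p> </body> </html>";
        // Prints: My first title My first paragraph.
        System.out.println(getStringFromHtml(html));
    }
}
```

This gives the output I am after, but it only works once the rows have been pulled out of MySQL.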
But suppose I am only using MySQL Workbench (no server-side script) for some data analysis.
Is there any way to have MySQL itself strip the HTML tags and return the words in between? I searched all of Stack Overflow and Google without luck; every answer recommends PHP, Java, or a stored procedure.
Is there really no way to remove the HTML markup using SQL alone?
You can use the ExtractValue() function, which takes an XPath expression that selects the part you need:
mysql> SELECT html FROM mytable;
+--------------------------------------------------------------------------------------------------+
| html                                                                                             |
+--------------------------------------------------------------------------------------------------+
| <!DOCTYPE html> <html> <body> <h1>My first title</h1> <p>My first paragraph.</p> </body> </html> |
+--------------------------------------------------------------------------------------------------+

mysql> SELECT ExtractValue(html, '//html/body/p[1]') AS value FROM mytable;
+---------------------+
| value               |
+---------------------+
| My first paragraph. |
+---------------------+