To remove the HTML tags from an HTML string and keep only the plain text, you can use the SQL Server function below. It deletes everything between each '<' and the next '>', including the brackets themselves.
SQL Server Function:
create function [dbo].[StripHTML] ( @HTMLText varchar(max) )
returns varchar(max)
as
begin
    declare @Start int
    declare @end int
    declare @Length int

    -- Locate the first '<' and the first '>' that follows it
    set @Start = charindex('<',@HTMLText)
    set @end = charindex('>',@HTMLText,charindex('<',@HTMLText))
    set @Length = (@end - @Start) + 1

    -- Remove one tag per iteration until no complete '<...>' pair remains
    while @Start > 0 and @end > 0 and @Length > 0
    begin
        set @HTMLText = stuff(@HTMLText,@Start,@Length,'')
        set @Start = charindex('<',@HTMLText)
        set @end = charindex('>',@HTMLText,charindex('<',@HTMLText))
        set @Length = (@end - @Start) + 1
    end

    -- Trim leading and trailing spaces from the remaining text
    return ltrim(rtrim(@HTMLText))
end
Sample Input:
select dbo.StripHTML('<!DOCTYPE html><html><body><h1>My First Heading. </h1><p>My first paragraph.</p></body></html>')
Output:
My First Heading. My first paragraph.
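Because the function only looks for '<' and '>', it is worth checking how it behaves on real data before relying on it. The sketch below applies it to a column. The table variable @Articles and its rows are hypothetical and exist only for illustration, and the function dbo.StripHTML is assumed to have been created already.

```sql
-- Hypothetical sample data, not part of the original post.
declare @Articles table (Id int, Body varchar(max));

insert into @Articles (Id, Body) values
    (1, '<p>Hello <b>world</b></p>'),        -- ordinary markup
    (2, 'if a < b and c > d then stop'),     -- '<' and '>' used as operators
    (3, '<p>Tom &amp; Jerry</p>');           -- HTML entity inside the text

-- Strip tags from every row of the column
select Id, dbo.StripHTML(Body) as PlainText
from @Articles;
```

Row 1 should come back as plain text. In row 2 the text between '<' and '>' is treated as a tag and removed, so part of the sentence is lost. In row 3 the entity &amp; is left as is, because the function does not decode entities. If your data can contain such cases, treat this function as a quick clean-up and not as a full HTML parser.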