Skip to main content

SQL query to extract file names from a html content


In this article I am going to explain how to extract the filename from html column using sql function.

Below is the function which i used to do this operation



CREATE FUNCTION dbo.fn_getFilenames(@InputHTML NVARCHAR(MAX))
RETURNS @res TABLE (pdf NVARCHAR(MAX)) AS
BEGIN
-- assumes there are no single quotes or double quotes in the PDF filename
DECLARE @i INT, @j INT, @k INT, @tmp NVARCHAR(MAX);
SET @i = CHARINDEX(N'.pdf', @InputHTML);
WHILE @i > 0
BEGIN
  SELECT @tmp = left(@InputHTML, @i+3);
  SELECT @j = CHARINDEX('/', REVERSE(@tmp)); -- directory delimiter
  SELECT @k = CHARINDEX('"', REVERSE(@tmp)); -- start of href
  IF @j = 0 or (@k > 0 and @k < @j) SET @j = @k;
  SELECT @k = CHARINDEX('''', REVERSE(@tmp)); -- start of href (single-quote*)
  IF @j = 0 or (@k > 0 AND @k < @j) SET @j = @k;
  INSERT @res VALUES (SUBSTRING(@tmp, len(@tmp)-@j+2, len(@TMP)));
  SELECT @InputHTML = STUFF(@InputHTML, 1, @i+4, ''); -- remove up to ".pdf"
  SET @i = CHARINDEX(N'.pdf', @InputHTML);
END
RETURN
END
GO

Below is the query to check the output

-- CREATE TABLE


create table mytable (Html varchar(max));

-- INSERT HTML Content

insert into mytable values('
<p>A deferred tuition payment plan,
or view the <a href="/uploadedFiles/uploadedFiles/uploadedFiles/uploadedFiles/Tuition-Reimbursement-Deferred.pdf"
target="_blank">list</a>.</p>')

insert into mytable values('
<p>A deferred tuition payment plan,
or view the <a href="Two files here-Reimbursement-Deferred.pdf"
target="_blank">list</a>.</p>And I use single quotes
   <a href=''/look/path/The second file.pdf''
target="_blank">list</a>');


--SELECT Statement

select t.*, p.pdf


from mytable t
cross apply dbo.extract_filenames_from_a_tags(html) p;

OUTPUT Will be:

SQL query to extract file names from a html content




Comments

Popular posts from this blog

Code To Convert rupees(numbers) into words using C#.Net

Introduction: In my previous article I have explained about how to validate emailid using javascript . In this article I am going to explain about code used to convert rupees(numbers) into words using C#.Net . Explanation: For explanation purpose I have a page. It has a textbox to input the numbers. And when you click on the convert to words button then it will convert the input numbers into words and shows it in the below label. Below is the C# code used to do this functionality. public static string NumbersToWords( int inputNumber) {     int inputNo = inputNumber;     if (inputNo == 0)         return "Zero" ;     int [] numbers = new int [4];     int first = 0;     int u, h, t;     System.Text. StringBuilder sb = new System.Text. StringBuilder ();     if (inputNo < 0)     { ...

Puzzles for kids - 12 Days Of Christmas

According to the traditional song, on the first day of Christmas (25th December), my true love sent to me: . A partridge in a pair tree On the second day of Christmas (26th December), my true love sent to me THREE presents: . Two turtle doves . A partridge in a pear tree On the third day of Christmas (27th December and so on) my true love sent to me SIX presents: . Three French hens . Two turtle doves . A partridge in a pear tree This carries on until the the twelfth day of Christmas, when my true love sends me: Twelve drummers drumming Eleven pipers piping Ten lords a-leaping Nine ladies dancing Eight maids a-milking Seven swans a-swimming Six geese a-laying Five gold rings Four calling birds Three French hens Two turtle doves A partridge in a pear tree After the twelve days of Christmas are over, how many presents has my true love sent me altogether? Our Solution: 1 + 3 + 6 + 10 + 15 + 21 + 28 + 36 + 45 + 55 + 66 + 78 = 364 presents Which is really interesting when you think ...

Asp.Net DataTable Manipulation - Add, Update,Delete & Sort DataTable in C#.Net & VB.Net

In the asp.net developer's life it is very common to come across the datatable manipulation. So here i have decided to explain the datatable manipulation like adding records to datatable, editing and updating datatable records,deleting records from datatable based on the condition and finally sorting datatable based on columns.