Skip to main content

SQL query to extract file names from a html content


In this article I am going to explain how to extract the filename from html column using sql function.

Below is the function which i used to do this operation



CREATE FUNCTION dbo.fn_getFilenames(@InputHTML NVARCHAR(MAX))
RETURNS @res TABLE (pdf NVARCHAR(MAX)) AS
BEGIN
-- assumes there are no single quotes or double quotes in the PDF filename
DECLARE @i INT, @j INT, @k INT, @tmp NVARCHAR(MAX);
SET @i = CHARINDEX(N'.pdf', @InputHTML);
WHILE @i > 0
BEGIN
  SELECT @tmp = left(@InputHTML, @i+3);
  SELECT @j = CHARINDEX('/', REVERSE(@tmp)); -- directory delimiter
  SELECT @k = CHARINDEX('"', REVERSE(@tmp)); -- start of href
  IF @j = 0 or (@k > 0 and @k < @j) SET @j = @k;
  SELECT @k = CHARINDEX('''', REVERSE(@tmp)); -- start of href (single-quote*)
  IF @j = 0 or (@k > 0 AND @k < @j) SET @j = @k;
  INSERT @res VALUES (SUBSTRING(@tmp, len(@tmp)-@j+2, len(@TMP)));
  SELECT @InputHTML = STUFF(@InputHTML, 1, @i+4, ''); -- remove up to ".pdf"
  SET @i = CHARINDEX(N'.pdf', @InputHTML);
END
RETURN
END
GO

Below is the query to check the output

-- CREATE TABLE


create table mytable (Html varchar(max));

-- INSERT HTML Content

insert into mytable values('
<p>A deferred tuition payment plan,
or view the <a href="/uploadedFiles/uploadedFiles/uploadedFiles/uploadedFiles/Tuition-Reimbursement-Deferred.pdf"
target="_blank">list</a>.</p>')

insert into mytable values('
<p>A deferred tuition payment plan,
or view the <a href="Two files here-Reimbursement-Deferred.pdf"
target="_blank">list</a>.</p>And I use single quotes
   <a href=''/look/path/The second file.pdf''
target="_blank">list</a>');


--SELECT Statement

select t.*, p.pdf


from mytable t
cross apply dbo.extract_filenames_from_a_tags(html) p;

OUTPUT Will be:

SQL query to extract file names from a html content




Comments

Popular posts from this blog

Sort Dictionary Based On Value In Asp.Net And C#.Net | Convert Dictionary into KeyValuePair or KeyValuePair into Dictionary.

In this tutorial i am going to explain about how to sort dictionary object based on value in asp.net and C#.Net or convert unsorted dictionary to sorted dictionary object in C#.Net and VB.Net or Convert Dictionary into KeyValuePair or KeyValuePair into Dictionary.

Code To Convert rupees(numbers) into words using C#.Net

Introduction: In my previous article I have explained about how to validate emailid using javascript . In this article I am going to explain about code used to convert rupees(numbers) into words using C#.Net . Explanation: For explanation purpose I have a page. It has a textbox to input the numbers. And when you click on the convert to words button then it will convert the input numbers into words and shows it in the below label. Below is the C# code used to do this functionality. public static string NumbersToWords( int inputNumber) {     int inputNo = inputNumber;     if (inputNo == 0)         return "Zero" ;     int [] numbers = new int [4];     int first = 0;     int u, h, t;     System.Text. StringBuilder sb = new System.Text. StringBuilder ();     if (inputNo < 0)     { ...

C# code to send mail using smtp from gmail,yahoo mail and live mail

Introduction: In my previous article I have explained about   How to bind/Unbind events in jQuery . In this article I am going to explain about how to send mail from ASP.Net using gmail,yahoomail and live mail credentials. Explanation: First Include the below namespaces in your code behind file. using System; using System.Net; using System.Net.Mail;