I am a Danish programmer living in Bangkok.
Read more about me @ rasmus.rummel.dk.

String Strip Tags

Usage

  • Example
    • I use StripTags in my RichTextBox WebControl to get the plain-text value of the whole HTML input (e.g. for counting characters).
  • Example Code
    • string myHtml = "<table><tr><td>my table</td></tr></table>";
      string myText = Utils.String.StripTags(myHtml);
      

The StripTags function :

// Requires: using System.Text.RegularExpressions;

// Strips every tag from pTaggedText.
public static string StripTags(string pTaggedText)
{
	return StripTags(pTaggedText, new string[] { });
}

// Strips only the tags named in pTagsToStrip (e.g. "script", "iframe");
// an empty array strips all tags. Only the tags themselves are removed,
// the text between an opening and a closing tag is kept.
public static string StripTags(string pTaggedText, string[] pTagsToStrip)
{
	if (pTagsToStrip.Length == 0) //strip all tags
	{
		Regex rx = new Regex("<[^>]+>");
		return rx.Replace(pTaggedText, "");
	}
	else //strip only specified tags
	{
		//build an alternation such as "script|embed|iframe"
		string tagsToStrip = string.Join("|", pTagsToStrip);
		//(?i:...) matches the tag names case-insensitively;
		//\b keeps a short name such as "b" from also matching "<br>"
		Regex rx = new Regex("</?(?i:" + tagsToStrip + ")\\b[^>]*>");
		return rx.Replace(pTaggedText, "");
	}
}
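
To illustrate what stripping only selected tags does in practice, here is a small self-contained sketch. The class name SelectiveStripDemo and the method StripSelected are made up for this illustration and are not part of my Utils library; the pattern follows the same principle as the selective branch, with a \b word boundary after the tag names so that stripping "b" leaves "<br>" alone. It assumes the tag names are plain words without regex metacharacters.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, only meant to illustrate selective stripping.
public static class SelectiveStripDemo
{
    // Removes the opening and closing tags named in tags (case-insensitive).
    // Assumption: tag names are plain words such as "script" or "b".
    public static string StripSelected(string html, params string[] tags)
    {
        string alternation = string.Join("|", tags);
        // \b requires the tag name to end here, so "b" does not match "<br>".
        string pattern = "</?(?i:" + alternation + ")\\b[^>]*>";
        return Regex.Replace(html, pattern, "");
    }
}
```

Note that only the tags are removed, not what is between them: StripSelected("<p>Hi<script>alert(1)</script></p>", "script") gives "<p>Hialert(1)</p>", so the script text itself stays in the output.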

I have not yet worked enough on the function above to be confident that it is reliable, and I would very much appreciate comments on how to improve it. One way is to improve the regular expression used; currently I have these in mind :

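  • <[^>]+> : the one I am using, because non-greedy matching is logically built in ([^>] can never run past the first closing >). Since [^>] also matches newlines, tags spanning multiple lines are matched as well.
  • <(.|\n)*?> : ensures that tags spanning multiple lines are matched (and makes the match non-greedy by using "*?").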
  • </?(?i:script|embed|iframe)([^>])*> : the principle I am using for removing selected tags.
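
To compare the candidate patterns side by side, the following sketch may help. The class PatternComparison and its method names are hypothetical and exist only for this comparison; the greedy pattern <.+> is not one of my candidates and is included only to show what the non-greedy behaviour protects against.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class for comparing the patterns discussed above.
public static class PatternComparison
{
    // Negated character class: stops at the first '>' and also matches newlines.
    public static string StripNegatedClass(string html)
    {
        return Regex.Replace(html, "<[^>]+>", "");
    }

    // Lazy quantifier: (.|\n) covers newlines, "*?" stops at the first '>'.
    public static string StripLazy(string html)
    {
        return Regex.Replace(html, "<(.|\n)*?>", "");
    }

    // Greedy counter-example: runs from the first '<' to the last '>' on a line.
    public static string StripGreedy(string html)
    {
        return Regex.Replace(html, "<.+>", "");
    }
}
```

For the input "<a\nhref='x'>link</a> text" both StripNegatedClass and StripLazy return "link text", so the first pattern already handles tags that span lines. StripGreedy("<b>one</b> and <i>two</i>") returns an empty string because the single match swallows everything. Neither of the first two patterns copes with a ">" inside an attribute value: "<a title=\"a>b\">x</a>" comes out as "b\">x".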
