2017-04-18

Regex expression not removing inline CSS from an HTML string

Context

I currently have a console application that retrieves email from an Office 365 Outlook account. I am using the Outlook API 2.0.

Problem

I access the body of the email through the API, but the body arrives as an HTML string. I use my go-to regex to strip the HTML tags, but Outlook adds CSS classes to its HTML, which makes my regex essentially useless.

Code

// Requires: using System.Text.RegularExpressions;
// Verbatim string (@"...") so the literal can span lines; inner quotes are doubled.
string body = @"<html>
<head>
<meta http-equiv=""Content-Type"" content=""text/html; charset=utf-8"">
<meta content=""text/html; charset=us-ascii"">
<meta name=""Generator"" content=""Microsoft Word 15 (filtered medium)"">
<style>
<!--
@font-face
    {font-family:""Cambria Math""}
@font-face
    {font-family:Calibri}
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0in;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:""Calibri"",sans-serif}
a:link, span.MsoHyperlink
    {color:#0563C1;
    text-decoration:underline}
a:visited, span.MsoHyperlinkFollowed
    {color:#954F72;
    text-decoration:underline}
span.EmailStyle17
    {font-family:""Calibri"",sans-serif;
    color:windowtext}
.MsoChpDefault
    {font-family:""Calibri"",sans-serif}
@page WordSection1
    {margin:1.0in 1.0in 1.0in 1.0in}
div.WordSection1
    {}
-->
</style>
</head>
<body lang=""EN-US"" link=""#0563C1"" vlink=""#954F72"">
<div class=""WordSection1"">
<p class=""MsoNormal"">&nbsp;</p>
</div>
<hr>
<p><b>Confidentiality Notice:</b> This e-mail is intended only for the addressee named above. It contains information that is privileged, confidential or otherwise protected from use and disclosure. If you are not the intended recipient, you are hereby notified
that any review, disclosure, copying, or dissemination of this transmission, or taking of any action in reliance on its contents, or other use is strictly prohibited. If you have received this transmission in error, please reply to the sender listed above
immediately and permanently delete this message from your inbox. Thank you for your cooperation.</p>
</body>
</html>
";
// Non-greedy match from '<' to the next '>'. By default '.' does not match newlines,
// so anything that opens on one line and closes on another is left behind.
string viewString1 = Regex.Replace(body, "<.*?>", string.Empty);
// Drop the non-breaking-space entities.
string viewString12 = viewString1.Replace("&nbsp;", string.Empty);

Result of my regex

<!-- 
@font-face 
    {font-family:"Cambria Math"} 
@font-face 
    {font-family:Calibri} 
p.MsoNormal, li.MsoNormal, div.MsoNormal 
    {margin:0in; 
    margin-bottom:.0001pt; 
    font-size:11.0pt; 
    font-family:"Calibri",sans-serif} 
a:link, span.MsoHyperlink 
    {color:#0563C1; 
    text-decoration:underline} 
a:visited, span.MsoHyperlinkFollowed 
    {color:#954F72; 
    text-decoration:underline} 
span.EmailStyle17 
    {font-family:"Calibri",sans-serif; 
    color:windowtext} 
.MsoChpDefault 
    {font-family:"Calibri",sans-serif} 
@page WordSection1 
    {margin:1.0in 1.0in 1.0in 1.0in} 
div.WordSection1 
    {} 
--> 







Confidentiality Notice: This e-mail is intended only for the addressee named above. It contains information that is privileged, confidential or otherwise protected from use and disclosure. If you are not the intended recipient, you are hereby notified 
that any review, disclosure, copying, or dissemination of this transmission, or taking of any action in reliance on its contents, or other use is strictly prohibited. If you have received this transmission in error, please reply to the sender listed above 
immediately and permanently delete this message from your inbox. Thank you for your cooperation. 

Goal

I should be able to strip the HTML tags from the string and also remove the CSS classes that Outlook places in the body.

By the way, you may want to consider replacing &nbsp; with a (white) space, which is what it represents (not empty). – JuanR
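To illustrate the comment above, here is a minimal sketch of turning &nbsp; into a regular space instead of deleting it. The class and method names (EntityCleanup, DecodeEntities) are made up for this example and are not from the question. It assumes System.Net.WebUtility.HtmlDecode is available, which decodes &nbsp; to the non-breaking space character U+00A0.

```csharp
using System;
using System.Net;

static class EntityCleanup
{
    // Hypothetical helper: decodes HTML entities, then maps the
    // non-breaking space (U+00A0) to an ordinary space.
    public static string DecodeEntities(string text)
    {
        return WebUtility.HtmlDecode(text).Replace('\u00A0', ' ');
    }

    static void Main()
    {
        string text = "Hello&nbsp;world &amp; friends";

        // Simple variant: only handles &nbsp;.
        Console.WriteLine(text.Replace("&nbsp;", " "));

        // Broader variant: handles any entity, including &amp;.
        Console.WriteLine(DecodeEntities(text));
    }
}
```

If you go with the decoding variant, it is probably safer to run it after the tags have been stripped, since decoding &lt; first would produce a literal < that a tag-stripping regex could then eat.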

Answer

You can replace <!--.*?--> with String.Empty using the regex option Singleline (which makes . match newlines):

string viewString1 = Regex.Replace(body, "<.*?>", string.Empty, RegexOptions.Singleline);
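A slightly more defensive variant, as a sketch only: strip the comments explicitly first, then the tags. With Singleline, the single pattern <.*?> happens to swallow the whole comment in the sample from the question, because that CSS contains no > character. A rule using the child combinator (for example div > p) would end the match early and leave CSS behind. The helper name StripHtml and the shortened sample body below (including the div > p rule) are assumptions for illustration, not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class HtmlStripper
{
    // Hypothetical helper name; not part of the original post.
    public static string StripHtml(string html)
    {
        // 1. Remove HTML comments, which is where Outlook/Word wraps its CSS.
        //    Singleline lets '.' match newlines so the comment can span lines.
        string noComments = Regex.Replace(html, "<!--.*?-->", string.Empty, RegexOptions.Singleline);

        // 2. Remove the remaining tags, including any that span several lines.
        string noTags = Regex.Replace(noComments, "<.*?>", string.Empty, RegexOptions.Singleline);

        // 3. Turn the non-breaking-space entity into a plain space and trim the ends.
        return noTags.Replace("&nbsp;", " ").Trim();
    }
}

static class Program
{
    static void Main()
    {
        // Shortened, made-up stand-in for the Outlook body in the question.
        string body =
            "<html><head><style>\n<!--\np.MsoNormal\n    {margin:0in}\ndiv > p\n    {color:red}\n-->\n</style></head>\n" +
            "<body><p class=\"MsoNormal\">&nbsp;</p>\n" +
            "<p><b>Confidentiality Notice:</b> This e-mail is intended only for the addressee.</p>\n" +
            "</body></html>";

        // Prints: Confidentiality Notice: This e-mail is intended only for the addressee.
        Console.WriteLine(HtmlStripper.StripHtml(body));
    }
}
```

This still relies on the CSS being wrapped in <!-- ... --> inside the style element, as it is in the body shown in the question. For arbitrary HTML, a real HTML parser is generally more robust than regex.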