2011-10-18 4 views

Regex to match a path in C#

I'm new to regular expressions. I need to extract the path from the following lines:

XXXX  c:\mypath1\test 
YYYYYYY    c:\this is other path\longer 
ZZ  c:\mypath3\file.txt 

I need to implement a method that returns the path from a given line. The first column is a word of one or more characters and is never empty; the second column is the path. The separator may consist of one or more spaces, one or more tabs, or a mix of both. (This assumes the first column never contains spaces or tabs.)


Is the input a file or individual lines? –


@RoyiNamir does it matter? – username


Yes. Processing a single line differs from processing a whole file, unless you read the text file line by line; otherwise you also have to take care of the newline characters, etc. –



It seems to me that you just want

string[] bits = line.Split(new char[] { '\t', ' ' }, 2, 
                           StringSplitOptions.RemoveEmptyEntries); 
// TODO: Check that bits really has two entries 
string path = bits[1]; 
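One way to fill in the TODO above is sketched below. The class and method names (`SplitPathExtractor`, `ExtractPathBySplit`) are hypothetical and not part of the answer, and the sketch assumes .NET 2.0 or later, since the `StringSplitOptions` overload does not exist in .NET 1.1. The extra `Trim()` is an assumption about the input: it drops any whitespace left around the path, such as trailing spaces at the end of a line.

```csharp
using System;

static class SplitPathExtractor
{
    // Hypothetical helper: returns the second column of a "word <spaces/tabs> path" line.
    public static string ExtractPathBySplit(string line)
    {
        // Split into at most two parts on spaces/tabs, skipping empty entries.
        string[] bits = line.Split(new char[] { '\t', ' ' }, 2,
                                   StringSplitOptions.RemoveEmptyEntries);

        // Reject lines that do not contain both a word and a non-blank path.
        if (bits.Length != 2 || bits[1].Trim().Length == 0)
        {
            throw new ArgumentException("Expected a word followed by a path", "line");
        }

        // Remove whitespace left around the path (for example trailing spaces).
        return bits[1].Trim();
    }
}
```

Because the split stops after two parts, spaces inside the path itself are preserved.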

EDIT: As a regular expression, you can probably do:

Regex regex = new Regex(@"^[^ \t]+[ \t]+(.*)$"); 

Sample code:

using System; 
using System.Text.RegularExpressions; 

class Program 
{ 
    static void Main(string[] args) 
    { 
        string[] lines = 
        { 
            @"XXXX  c:\mypath1\test", 
            @"YYYYYYY    c:\this is other path\longer", 
            @"ZZ  c:\mypath3\file.txt" 
        }; 

        foreach (string line in lines) 
        { 
            Console.WriteLine(ExtractPathFromLine(line)); 
        } 
    } 

    // First column (no spaces/tabs), then the separator, then the path in group 1. 
    static readonly Regex PathRegex = new Regex(@"^[^ \t]+[ \t]+(.*)$"); 

    static string ExtractPathFromLine(string line) 
    { 
        Match match = PathRegex.Match(line); 
        if (!match.Success) 
        { 
            throw new ArgumentException("Invalid line"); 
        } 
        return match.Groups[1].Value; 
    } 
} 
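If the input arrives as one block of text rather than line by line (the point raised in the comments on the question), the same idea can be applied with `RegexOptions.Multiline`. The sketch below is only an illustration under that assumption: `MultiLinePathExtractor` and `ExtractPaths` are hypothetical names, `List<string>` assumes .NET 2.0 or later, and the pattern excludes `\r` and `\n` explicitly so that a Windows line ending does not end up inside the captured path.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MultiLinePathExtractor
{
    // ^ matches at the start of every line because of RegexOptions.Multiline.
    // Group 1 starts at the first non-blank character after the separator
    // and runs to the end of the line, excluding \r and \n.
    static readonly Regex LinePathRegex = new Regex(
        @"^[^ \t\r\n]+[ \t]+([^ \t\r\n][^\r\n]*)",
        RegexOptions.Multiline);

    // Hypothetical helper: returns the path column of every well-formed line.
    public static List<string> ExtractPaths(string text)
    {
        List<string> paths = new List<string>();
        foreach (Match match in LinePathRegex.Matches(text))
        {
            // Drop spaces or tabs left at the end of the line.
            paths.Add(match.Groups[1].Value.TrimEnd(' ', '\t'));
        }
        return paths;
    }
}
```

Lines that do not fit the "word, separator, path" shape (empty lines, lines with leading whitespace) are simply skipped here rather than reported as errors; whether that is the right behaviour depends on the input.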

Paths can contain spaces, so the second one is pretty bad. – xanatos


@Jon: Sorry, I need a regular expression because I'm using .NET 1.1 and don't have access to the overload that takes StringSplitOptions.RemoveEmptyEntries. Thanks anyway! –


@DanielPeñalba: It would have been useful to say so to start with - requiring .NET 1.1 is very rare these days. Will edit. –

StringCollection resultList = new StringCollection(); 
try { 
    Regex regexObj = new Regex(@"(([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)?(\\?(?:[^\\/:*?""<>|\r\n]+\\)+)[^\\/:*?""<>|\r\n]+)"); 
    Match matchResult = regexObj.Match(subjectString); 
    while (matchResult.Success) { 
        // Collect the whole path (capturing group 1) of each match 
        resultList.Add(matchResult.Groups[1].Value); 
        matchResult = matchResult.NextMatch(); 
    } 
} catch (ArgumentException ex) { 
    // Syntax error in the regular expression 
} 


(                           # Match the regular expression below and capture its match into backreference number 1 
    (                       # Match the regular expression below and capture its match into backreference number 2 
                            # Match either the regular expression below (attempting the next alternative only if this one fails) 
        [a-z]               # Match a single character in the range between “a” and “z” 
        :                   # Match the character “:” literally 
      |                     # Or match regular expression number 2 below (the entire group fails if this one fails to match) 
        \\                  # Match the character “\” literally 
        \\                  # Match the character “\” literally 
        [a-z0-9_.$]         # Match a single character present in the list below 
                            #   A character in the range between “a” and “z” 
                            #   A character in the range between “0” and “9” 
                            #   One of the characters “_.$” 
            +               # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
        \\                  # Match the character “\” literally 
        [a-z0-9_.$]         # Match a single character present in the list below 
                            #   A character in the range between “a” and “z” 
                            #   A character in the range between “0” and “9” 
                            #   One of the characters “_.$” 
            +               # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
    )?                      # Between zero and one times, as many times as possible, giving back as needed (greedy) 
    (                       # Match the regular expression below and capture its match into backreference number 3 
        \\                  # Match the character “\” literally 
            ?               # Between zero and one times, as many times as possible, giving back as needed (greedy) 
        (?:                 # Match the regular expression below 
            [^\\/:*?""<>|\r\n]  # Match a single character NOT present in the list below 
                            #   A \ character 
                            #   One of the characters “/:*?"<>|” (the quote is doubled in the C# verbatim string) 
                            #   A carriage return character 
                            #   A line feed character 
                +           # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
            \\              # Match the character “\” literally 
        )+                  # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
    )                       # End of capturing group 3 
    [^\\/:*?""<>|\r\n]      # Match a single character NOT present in the list below 
                            #   A \ character 
                            #   One of the characters “/:*?"<>|” (the quote is doubled in the C# verbatim string) 
                            #   A carriage return character 
                            #   A line feed character 
        +                   # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
)                           # End of capturing group 1 
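The annotated layout above can also live directly in the source: with `RegexOptions.IgnorePatternWhitespace`, .NET ignores unescaped whitespace in the pattern and treats `#` as the start of a comment. The sketch below is one possible way to write it, not the answer's own code. `CommentedPathFinder` and `FindFirstPath` are hypothetical names, and adding `RegexOptions.IgnoreCase` is an assumption so that upper-case drive letters and server names match as well.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommentedPathFinder
{
    // Same pattern as in the answer, spread over several lines with comments.
    static readonly Regex WindowsPathRegex = new Regex(@"
        (                                      # group 1: the whole path
          ( [a-z]:                             # group 2: a drive letter and colon
          | \\\\ [a-z0-9_.$]+ \\ [a-z0-9_.$]+  #          or a UNC server and share
          )?
          ( \\?                                # group 3: the folder part
            (?: [^\\/:*?""<>|\r\n]+ \\ )+      #          one or more folder names, each followed by a backslash
          )
          [^\\/:*?""<>|\r\n]+                  # the final file or folder name
        )",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the first path found in the input, or null if there is none.
    public static string FindFirstPath(string subjectString)
    {
        Match match = WindowsPathRegex.Match(subjectString);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Whitespace inside a character class stays literal in this mode, which is why the classes are written without internal spaces.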

That seems very complicated just to get everything after the first run of spaces/tabs. –


@JonSkeet I agree. It's a more general regex for Windows paths. – FailedDev


@FailedDev this doesn't work, for example, for "k:\test\test". And if I pass a path such as **\\test\t><*st**, it is reported as valid. I found this regex: '^(?:[c-zC-Z]\:|\\)(\\[a-zA-Z_\-\s0-9\.]+)+'. In my opinion it validates the path correctly. Found [here](https://www.codeproject.com/Tips/216238/Regular-Expression-to-Validate-File-Path-and-Exten) – Potato


Regex Tester is a good site for quickly testing a regex.

// Verbatim string, so that \\ reaches the regex engine as an escaped backslash 
Regex.Matches(input, @"([a-zA-Z]*:[\\[a-zA-Z0-9 .]*]*)"); 