2017-10-19 5 views
2

J'ai trouvé ce problème de codage très difficile en ligne que je voudrais essayer.KMP Algorithme de recherche de chaînes de caractères?

L'idée générale est que la chaîne de texte donnée T et le motif P, trouvent les occurrences de ce motif, résument sa valeur correspondante et renvoient max et min. Si vous voulez lire le problème plus en détail, veuillez vous référer au this. Cependant, ci-dessous est le code que j'ai fourni, il fonctionne pour un cas de test simple, mais lors de l'exécution sur plusieurs cas de test complexes est assez lent, et je ne sais pas où mon code doit être optimisé.

Quelqu'un peut-il s'il vous plaît aider où im obtenir la logique fausse.

public class DeterminingDNAHealth { 


    private DeterminingDNAHealth() { 
    /* 
    * Fixme: 
    * Each DNA contains number of genes 
    * - some of them are beneficial and increase DNA's total health 
    * - Each Gene has a health value 
    * ====== 
    * - Total health of DNA = sum of all health values of beneficial genes 
    */ 
    } 

    int checking(int start, int end, String pattern) { 
    String[] genesChar = new String[] { 
     "a", 
     "b", 
     "c", 
     "aa", 
     "d", 
     "b" 
    }; 
    String numbers = "123456"; 

    int total = 0; 

    for (int i = start; i <= end; i++) { 
     total += KMPAlgorithm.initiateAlgorithm(pattern, genesChar[i]) * (i + 1); 
    } 

    return total; 
    } 

    public static void main(String[] args) { 

    String[] genesChar = new String[] { 
     "a", 
     "b", 
     "c", 
     "aa", 
     "d", 
     "b" 
    }; 
    Gene[] genes = new Gene[genesChar.length]; 

    for (int i = 0; i < 6; i++) { 
     genes[i] = new Gene(genesChar[i], i + 1); 
    } 

    String[] checking = "15caaab 04xyz 24bcdybc".split(" "); 


    DeterminingDNAHealth DNA = new DeterminingDNAHealth(); 
    int i, mostHealthiest, mostUnhealthiest; 

    mostHealthiest = Integer.MIN_VALUE; 
    mostUnhealthiest = Integer.MAX_VALUE; 

    for (i = 0; i < checking.length; i++) { 
     int start = Character.getNumericValue(checking[i].charAt(0)); 
     int end = Character.getNumericValue(checking[i].charAt(1)); 
     String pattern = checking[i].substring(2, checking[i].length()); 

     int check = DNA.checking(start, end, pattern); 

     if (check > mostHealthiest) 
     mostHealthiest = check; 
     else 
     if (check < mostUnhealthiest) 
     mostUnhealthiest = check; 
    } 

    System.out.println(mostHealthiest + " " + mostUnhealthiest); 

    // DNA.checking(1,5, "caaab"); 
    } 
} 

KMPAlgorithm

public class KMPAlgorithm { 

    KMPAlgorithm() {} 


    public static int initiateAlgorithm(String text, String pattern) { 

    // let us generate our LPC table from the pattern 
    int[] partialMatchTable = partialMatchTable(pattern); 

    int matchedOccurrences = 0; 

    // initially we don't have anything matched, so 0 
    int partialMatchLength = 0; 

    // we then start to loop through the text, !note, not the pattern. The text that we are testing the pattern on 
    for (int i = 0; i < text.length(); i++) { 

     // if there is a mismatch and there's no previous match, then we've hit the base-case, hence break from while{...} 
     while (partialMatchLength > 0 && text.charAt(i) != pattern.charAt(partialMatchLength)) { 

     /* 
     * otherwise, based on the number of chars matched, we decrement it by 1. 
     * In fact, this is the unique part of this algorithm. It is this part that we plan to skip partialMatchLength 
     * iterations. So if our partialMatchLength was 5, then we are going to skip (5 - 1) iteration. 
     */ 
     partialMatchLength = partialMatchTable[partialMatchLength - 1]; 

     } 


     // if however we have a char that matches the current text[i] 
     if (text.charAt(i) == pattern.charAt(partialMatchLength)) { 

     // then increment position, so hence we check the next char of the pattern against the next char in text 
     partialMatchLength++; 

     // we will know that we're at the end of the pattern matching, if the matched length is same as the pattern length 
     if (partialMatchLength == pattern.length()) { 
      // to get the starting index of the matched pattern in text, apply this formula (i - (partialMatchLength - 1)) 

      // this line increments when a match string occurs multiple times; 
      matchedOccurrences++; 

      // just before when we have a full matched pattern, we want to test for multiple occurrences, so we make 
      // our match length incomplete, and let it run longer. 
      partialMatchLength = partialMatchTable[partialMatchLength - 1]; 

     } 
     } 

    } 

    return matchedOccurrences; 


    } 


    private static int[] partialMatchTable(String pattern) { 
    /* 
    * TODO 
    * Note: 
    * => Proper prefix: All the characters in a string, with one or more cut off the end. 
    * => proper suffix: All the characters in a string, with one or more cut off the beginning. 
    * 
    * 1.) Take the pattern and construct a partial match table 
    * 
    * To construct partial match table { 
    *  1. Loop through the String(pattern) 
    *  2. Create a table of size String(pattern).length 
    *  3. For each character c[i], get The length of the longest proper prefix in the (sub)pattern 
    *   that matches a proper suffix in the same (sub)pattern 
    * } 
    */ 

    // we will need two incremental variables 
    int i, j; 

    // an LSP table also known as “longest suffix-prefix” 
    int[] LSP = new int[pattern.length()]; 


    // our initial case is that the first element is set to 0 
    LSP[0] = 0; 

    // loop through the pattern... 
    for (i = 1; i < pattern.length(); i++) { 

     // set our j as previous elements data (not the index) 
     j = LSP[i - 1]; 


     // we will be comparing previous and current elements data. ei char 
     char current = pattern.charAt(i), previous = pattern.charAt(j); 

     // we will have a case when we're somewhere in loop and two chars will not match, and j is not in base case. 
     while (j > 0 && current != previous) 
     // we decrement our j 
     j = LSP[j - 1]; 

     // simply put, if two characters are same, then we update our LSP to say that at that point, we hold the j's value 
     if (current == previous) 
     // increment our j 
     j++; 

     // update the table 
     LSP[i] = j; 


    } 

    return LSP; 

    } 
} 

crédit code Cource à Github

+1

'Si vous voulez lire le problème plus en détail, veuillez vous y reporter.' Où est le lien? –

+0

Excuses, j'ai mis à jour la question. Le lien devrait être disponible. @PhamTrung – OOP

+0

Avez-vous jeté un coup d'oeil à l'éditorial? la complexité de votre temps est O (n * m) avec n est la quantité de requêtes et m est la quantité de motifs, avec n est 10^5 et m est 10^5, évidemment il ne correspondra pas à la limite de temps. –

Répondre

0

Vous pouvez essayer cette implémentation KMP. C'est O (m + n), comme KMP est censé être. Cela devrait être beaucoup plus rapide:

private static int[] failureFunction(char[] pattern) { 
    int m = pattern.length; 

    int[] f = new int[pattern.length]; 
    f[0] = 0; 

    int i = 1; 
    int j = 0; 

    while (i < m) { 
     if (pattern[i] == pattern[j]) { 
      f[i] = j + 1; 
      i++; 
      j++; 
     } else if (j > 0) { 
      j = f[j - 1]; 
     } else { 
      f[i] = 0; 
      i++; 
     } 
    } 
    return f; 
} 

private static int kmpMatch(char[] text, char[] pattern) { 
    int[] f = failureFunction(pattern); 

    int m = pattern.length; 
    int n = text.length; 

    int i = 0; 
    int j = 0; 

    while (i < n) { 
     if (pattern[j] == text[i]) { 
      if (j == m - 1){ 
       return i - (m - 1); 
      } else { 
       i++; 
       j++; 
      } 
     } else if (j > 0) { 
      j = f[j - 1]; 
     } else { 
      i++; 
     } 
    } 
    return -1; 
}