2017-09-22 1 views
0

I am developing a script that counts the elements of a given sequence. I have already found a way to do this, but I was wondering whether it is possible to use a dictionary when the string contains letters other than the ones being counted, and how to print those letters anyway. How can I use a dictionary for the following string?

For example:

sequence = input('Enter DNA sequence: ')
print('Your sequence contain:', len(sequence), 'bases', 'with the following structure:')
adenine = sequence.count("A") + sequence.count("a")
thymine = sequence.count("T") + sequence.count("t")
cytosine = sequence.count("C") + sequence.count("c")
guanine = sequence.count("G") + sequence.count("g")



print("adenine =", adenine) 
print("thymine=", thymine) 
print("cytosine=", cytosine) 
print("guanine=", guanine) 

I was thinking of a dictionary like this:

DICC = {"adenine": ["A", "a"], "thymine": ["T", "t"], "cytosine": ["C", "c"], "guanine": ["G", "g"]}

But I don't know how to print the letters that are not nucleotides when they appear in the sequence. For example, for the following sequence the result should be something like this:

sequence = AacGTtxponwxs: 
your sequence contain 13 bases with the following structure: 
adenine = 2 
thymine = 2 
cytosine = 1 
guanine = 1
p is not a DNA value 
x is not a DNA value 
o is not a DNA value 
n is not a DNA value 
w is not a DNA value 
s is not a DNA value 

Answers

0

Try this:

sequence = 'AacGTtxponwxs' 
adenine = 0 
thymine = 0 
cytosine = 0 
guanine = 0 
outputstring = [] 
for elem in sequence:
    if elem in ('a', 'A'):
        adenine += 1
    elif elem in ('T', 't'):
        thymine += 1
    elif elem in ('C', 'c'):
        cytosine += 1
    elif elem in ('G', 'g'):
        guanine += 1
    else:
        # anything else is reported as a non-DNA character
        outputstring.append('{} is not a DNA value'.format(elem))
print('your sequence contain {} bases with the following structure:'.format(len(sequence)))
print('adenine =', adenine)
print('thymine =', thymine)
print('cytosine =', cytosine)
print('guanine =', guanine)
print("\n".join(outputstring))

Output:

your sequence contain 13 bases with the following structure:
adenine = 2
thymine = 2
cytosine = 1
guanine = 1
x is not a DNA value 
p is not a DNA value 
o is not a DNA value 
n is not a DNA value 
w is not a DNA value 
x is not a DNA value 
s is not a DNA value 
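
Since the question asked about a dictionary, the same loop can also be driven by one. This is only a minimal sketch: the names BASE_NAMES and count_bases are made up for illustration, and it assumes that only the letters a, t, c and g (in either case) count as DNA.

```python
# Hypothetical names: BASE_NAMES and count_bases are illustrative, not from the original post.
# Assumption: only a, t, c and g (any case) are valid DNA letters.
BASE_NAMES = {"a": "adenine", "t": "thymine", "c": "cytosine", "g": "guanine"}


def count_bases(sequence):
    """Return (counts per base name, non-DNA characters in input order)."""
    counts = {name: 0 for name in BASE_NAMES.values()}
    invalid = []
    for char in sequence:
        # Look the character up case-insensitively; None means "not a base".
        name = BASE_NAMES.get(char.lower())
        if name is None:
            invalid.append(char)
        else:
            counts[name] += 1
    return counts, invalid


sequence = "AacGTtxponwxs"
counts, invalid = count_bases(sequence)
print("your sequence contain {} bases with the following structure:".format(len(sequence)))
for name, total in counts.items():
    print("{} = {}".format(name, total))
for char in invalid:
    print("{} is not a DNA value".format(char))
```

Mapping each letter to its base name, rather than each name to a list of letters as in DICC, keeps the lookup to a single dict.get call per character.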
1

Using collections.Counter (which is a dict-like class), you can be more DRY:

from collections import Counter 

sequence = 'AacGTtxponwxs' 
s = sequence.lower() 
bases = ['adenine', 'thymine', 'cytosine', 'guanine'] 
non_bases = [x for x in s if x not in (b[0] for b in bases)] 
c = Counter(s) 
for base in bases: 
    print('{} = {}'.format(base, c[base[0]])) 
# adenine = 2 
# thymine = 2 
# cytosine = 1 
# guanine = 1 

for n in non_bases:
    print('{} is not a DNA value'.format(n))
# x is not a DNA value
# p is not a DNA value
# o is not a DNA value
# n is not a DNA value
# w is not a DNA value
# x is not a DNA value
# s is not a DNA value
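
If each non-DNA letter should be reported only once, as in the expected output in the question, the Counter itself can be reused for that. A rough sketch, assuming Python 3.7+ (where Counter, like dict, keeps insertion order); the names BASE_LETTERS and invalid_letters are invented for this example.

```python
from collections import Counter

# Assumption: only these four letters count as DNA bases.
BASE_LETTERS = "atcg"


def invalid_letters(sequence):
    """Count the non-DNA letters, keeping their order of first appearance."""
    tally = Counter(sequence.lower())
    for letter in BASE_LETTERS:
        # Drop the valid bases; the default avoids a KeyError if one is absent.
        tally.pop(letter, None)
    return tally


for letter, times in invalid_letters("AacGTtxponwxs").items():
    print("{} is not a DNA value (seen {} time(s))".format(letter, times))
```

A repeated letter such as x then shows up on one line with its count instead of once per occurrence.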
0
#Are you studying bioinformatics at HAN? I remember this as my assignment lol 
#3 years ago 
sequence = input('Enter DNA sequence:')
sequence = sequence.lower()  # str.lower() returns a new string, so keep the result
count_sequence = 0
countA = 0
countT = 0
countG = 0
countC = 0
countNotDNA = 0
for char in sequence:
    count_sequence += 1
    # elif chain: each character lands in exactly one counter
    if char == 'a':
        countA += 1
    elif char == 't':
        countT += 1
    elif char == 'g':
        countG += 1
    elif char == 'c':
        countC += 1
    else:
        # anything that is not a, t, g or c
        countNotDNA += 1


print("sequence is", count_sequence, "characters long containing:","\n", countA, "Adenine","\n", countT, "Thymine","\n", countG, "Guanine","\n", countC, "Cytosine","\n", countNotDNA, "junk bases") 

And there you go :)