Infinite loop when appending a row to a list inside a class in python3

I have a script that contains two classes. (I'm obviously leaving out a lot of things that I don't think are relevant to the error I'm dealing with.) The end goal is to build a decision tree, as I mentioned in this question. Unfortunately, I'm getting an infinite loop, and I'm having trouble working out why. I've identified the line of code that goes wrong, but I would have thought that the iterator and the list I'm appending to would be different objects. Is there some side effect of the list's .append method that I'm not aware of? Or am I making some other obvious mistake?
class Dataset:
    individuals = []  # Becomes a list of dictionaries, in which each dictionary is a row from the CSV with the headers as keys

    def field_set(self):  # Returns a list of the fields in individuals[] that can be used to split the data (i.e. have more than one value amongst the individuals)
        ...
    def classified(self, predicted_value):  # Returns True if all the individuals have the same value for predicted_value
        ...
    def fields_exhausted(self, predicted_value):  # Returns True if all the individuals are identical except for predicted_value
        ...
    def lowest_entropy_value(self, predicted_value):  # Returns the field that will reduce entropy (http://en.wikipedia.org/wiki/Entropy_%28information_theory%29) the most
        ...
    def __init__(self, individuals=[]):
        ...
and
class Node:
    ds = Dataset()  # The data that is associated with this Node
    links = []  # List of Nodes, the offspring Nodes of this node
    level = 0  # Tree depth of this Node
    split_value = ''  # Field used to split out this Node from the parent node
    node_value = ''  # Value used to split out this Node from the parent Node

    def split_dataset(self, split_value):  # Splits the dataset into a series of smaller datasets, each of which has a unique value for split_value. Then creates subnodes to store these datasets.
        fields = []  # List of options for split_value amongst the individuals
        datasets = {}  # Dictionary of Datasets, each one with a value from fields[] as its key
        for field in self.ds.field_set()[split_value]:  # Populates the keys of fields[]
            fields.append(field)
            datasets[field] = Dataset()
        for i in self.ds.individuals:  # Adds individuals to the datasets.dataset that matches their result for split_value
            datasets[i[split_value]].individuals.append(i)  # <---Causes an infinite loop on the second hit
        for field in fields:  # Creates subnodes from each of the datasets.Dataset options
            self.add_subnode(datasets[field], split_value, field)
    def add_subnode(self, dataset, split_value='', node_value=''):
        ...
    def __init__(self, level, dataset=Dataset()):
        ...
My initialization code is currently:
if __name__ == '__main__':
    filename = sys.argv[1]  # Takes in a CSV file
    predicted_value = "# class"  # Identifies the field from the CSV file that should be predicted
    base_dataset = parse_csv(filename)  # Turns the CSV file into a list of lists
    parsed_dataset = individual_list(base_dataset)  # Turns the list of lists into a list of dictionaries
    root = Node(0, Dataset(parsed_dataset))  # Creates a root node, passing it the full dataset
    root.split_dataset(root.ds.lowest_entropy_value(predicted_value))  # Performs the first split, creating multiple subnodes
    n = root.links[0]
    n.split_dataset(n.ds.lowest_entropy_value(predicted_value))  # Attempts to split the first subnode.
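
One possible explanation, sketched under assumptions: the body of `Dataset.__init__` is not shown, so this assumes it leaves `individuals` pointing at the class-level list (or at the shared default `[]`) whenever no argument is passed. In that case every `Dataset()` created inside `split_dataset` shares one list, and on the second split the loop appends to the very list it is iterating over. The names `SharedDataset`, `SafeDataset`, `split` and `limit` below are hypothetical stand-ins, not part of the original script, and `limit` exists only so the demonstration terminates.

```python
class SharedDataset:
    """Hypothetical stand-in for Dataset: the list lives on the class."""
    individuals = []  # one list object, shared by every instance

    def __init__(self, individuals=None):
        # Assumption about the elided __init__: the attribute is only
        # rebound on the instance when a list is actually passed in.
        if individuals is not None:
            self.individuals = individuals


class SafeDataset:
    """Same interface, but every instance owns a fresh list."""

    def __init__(self, individuals=None):
        self.individuals = list(individuals) if individuals is not None else []


def split(dataset, key, factory, limit=20):
    """Group dataset.individuals by row[key], giving up after `limit` appends."""
    groups = {}
    appended = 0
    for row in dataset.individuals:
        group = groups.setdefault(row[key], factory())
        group.individuals.append(row)
        appended += 1
        if appended >= limit:  # guard: without it the shared case never ends
            break
    return groups, appended


rows = [
    {"colour": "red", "size": "S"},
    {"colour": "red", "size": "L"},
    {"colour": "blue", "size": "S"},
]

# Shared list: the first split works, because the root has its own list.
first_shared, n1 = split(SharedDataset(rows), "colour", SharedDataset)
# The second split iterates over the shared list while appending to it.
_, n2 = split(first_shared["red"], "size", SharedDataset)

# Per-instance lists: both splits finish on their own.
first_safe, safe_n1 = split(SafeDataset(rows), "colour", SafeDataset)
_, safe_n2 = split(first_safe["red"], "size", SafeDataset)

print(n1, n2, safe_n1, safe_n2)
```

If that assumption holds, `.append` itself has no hidden side effect: the loop variable and the target are simply the same list object. Creating the list inside `__init__` (with `None` as the default argument) gives each instance its own list, and the same reasoning would apply to `links` and to the `dataset=Dataset()` default on `Node`.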