I have a fairly complicated problem that I cannot solve. How can I count the number of occurrences of a particular value in a row in R?
I have a large dataset (23277 rows, 151 columns). Each column holds values from 0 to 100 (inclusive) representing the probabilities assigned to events in the world.
As part of calculating the score for each individual, I need to count the occurrences of each of the values in the dataset.
I first tried apply, but I need to ignore NAs and to subset, so I tried the following:
apply(ans.samp, 1, sum(ans.samp[ans==0]), na.rm=TRUE)
I received the error message: 'sum(ans.samp[ans == 0])' is not a function, character or symbol
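For readers hitting the same error: the third argument of apply must be a function, whereas the call above passes the result of an already evaluated expression. A minimal sketch of two ways to count zeros per row while ignoring NAs follows. The data frame `toy` and its values are made up for illustration and merely stand in for ans.samp.

```r
# Toy data standing in for ans.samp (hypothetical values, not the real data)
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 100L, NA),
                  c = c(10L, 0L, 0L))

# apply() expects a function as its third argument; an anonymous function
# receives one row at a time and counts the zeros in it, skipping NAs
zeros_apply <- apply(toy, 1, function(row) sum(row == 0, na.rm = TRUE))

# Vectorized alternative: compare every cell to 0, then sum the TRUEs per row
zeros_rowsums <- rowSums(toy == 0, na.rm = TRUE)
```

On a dataset of the size described, the rowSums form is likely to be the faster of the two, since it avoids calling an R function once per row.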
I repeated this process with sapply, vapply, tapply and do.call, to no avail. Giving up on a vectorized solution, I wrote the following for loop. However, once I got it to work, it returned only the total sum of 0s in the sample.
I would be grateful for help with this, as I am under some time pressure, and I would like to be able to solve this kind of problem in R in the future.
Sample data included for reproducibility:
structure(list(X = 1:6, X100 = c(70L, NA, 80L, 0L, 40L, NA),
X10 = c(30L, NA, NA, NA, NA, NA), X1 = c(50L, NA, NA, NA,
NA, NA), X11 = c(50L, NA, NA, NA, NA, NA), X12 = c(30L, NA,
NA, NA, NA, NA), X13 = c(50L, NA, NA, NA, NA, NA), X14 = c(70L,
NA, NA, NA, NA, NA), X15 = c(60L, NA, NA, NA, NA, NA), X158 = c(30L,
NA, NA, NA, NA, NA), X159 = c(50L, NA, NA, NA, NA, NA), X160 = c(80L,
NA, NA, NA, NA, NA), X16 = c(50L, NA, NA, NA, NA, NA), X161 = c(40L,
NA, NA, NA, NA, NA), X162 = c(100L, NA, NA, NA, NA, NA),
X163 = c(50L, NA, NA, NA, NA, NA), X164 = c(0L, NA, NA, NA,
NA, NA), X165 = c(0L, NA, NA, NA, NA, NA), X166 = c(20L,
NA, NA, NA, NA, NA), X167 = c(0L, NA, NA, NA, NA, NA), X168 = c(30L,
NA, NA, NA, NA, NA), X169 = c(100L, NA, NA, NA, NA, NA),
X170 = c(30L, NA, NA, NA, NA, NA), X17 = c(40L, NA, NA, NA,
NA, NA), X171 = c(50L, NA, NA, NA, NA, NA), X172 = c(20L,
NA, NA, NA, NA, NA), X173 = c(30L, NA, NA, NA, NA, NA), X174 = c(20L,
NA, NA, NA, NA, NA), X175 = c(30L, NA, NA, NA, NA, NA), X176 = c(10L,
NA, NA, NA, NA, NA), X177 = c(70L, NA, NA, NA, NA, NA), X178 = c(40L,
NA, NA, NA, NA, NA), X179 = c(70L, NA, NA, NA, NA, NA), X180 = c(0L,
NA, NA, NA, NA, NA), X18 = c(30L, NA, NA, NA, NA, NA), X181 = c(100L,
NA, NA, NA, NA, NA), X182 = c(100L, NA, NA, NA, NA, NA),
X183 = c(20L, NA, NA, NA, NA, NA), X184 = c(80L, NA, NA,
NA, NA, NA), X185 = c(90L, NA, NA, NA, NA, NA), X186 = c(0L,
NA, NA, NA, NA, NA), X187 = c(10L, NA, NA, NA, NA, NA), X188 = c(100L,
NA, NA, NA, NA, NA), X189 = c(100L, NA, NA, NA, NA, NA),
X190 = c(0L, NA, NA, NA, NA, NA), X19 = c(100L, NA, NA, NA,
NA, NA), X191 = c(0L, NA, NA, NA, NA, NA), X192 = c(90L,
NA, NA, NA, NA, NA), X193 = c(50L, NA, NA, NA, NA, NA), X194 = c(100L,
NA, NA, NA, NA, NA), X195 = c(10L, NA, NA, NA, NA, NA), X196 = c(100L,
NA, NA, NA, NA, NA), X197 = c(20L, NA, NA, NA, NA, NA), X198 = c(40L,
NA, NA, NA, NA, NA), X199 = c(20L, NA, NA, NA, NA, NA), X200 = c(0L,
NA, NA, NA, NA, NA), X20 = c(0L, NA, NA, NA, NA, NA), X201 = c(0L,
NA, NA, NA, NA, NA), X202 = c(20L, NA, NA, NA, NA, NA), X203 = c(20L,
NA, NA, NA, NA, NA), X204 = c(80L, NA, NA, NA, NA, NA), X205 = c(0L,
NA, NA, NA, NA, NA), X206 = c(80L, NA, NA, NA, NA, NA), X207 = c(0L,
NA, NA, NA, NA, NA), X2 = c(10L, NA, NA, NA, NA, NA), X21 = c(0L,
NA, NA, NA, NA, NA), X22 = c(100L, NA, NA, NA, NA, NA), X23 = c(50L,
NA, NA, NA, NA, NA), X24 = c(50L, NA, NA, NA, NA, NA), X25 = c(70L,
NA, NA, NA, NA, NA), X26 = c(60L, NA, NA, NA, NA, NA), X27 = c(40L,
NA, NA, NA, NA, NA), X28 = c(20L, NA, NA, NA, NA, NA), X29 = c(0L,
NA, NA, NA, NA, NA), X30 = c(90L, NA, NA, NA, NA, NA), X3 = c(0L,
NA, NA, NA, NA, NA), X31 = c(50L, NA, NA, NA, NA, NA), X32 = c(50L,
NA, NA, NA, NA, NA), X33 = c(0L, NA, NA, NA, NA, NA), X34 = c(50L,
NA, NA, NA, NA, NA), X35 = c(90L, NA, NA, NA, NA, NA), X36 = c(50L,
NA, NA, NA, NA, NA), X37 = c(60L, NA, NA, NA, NA, NA), X38 = c(40L,
NA, NA, NA, NA, NA), X39 = c(50L, NA, NA, NA, NA, NA), X40 = c(0L,
NA, NA, NA, NA, NA), X4 = c(50L, NA, NA, NA, NA, NA), X41 = c(90L,
NA, NA, NA, NA, NA), X42 = c(80L, NA, NA, NA, NA, NA), X43 = c(50L,
NA, NA, NA, NA, NA), X44 = c(80L, NA, NA, NA, NA, NA), X45 = c(80L,
NA, NA, NA, NA, NA), X46 = c(0L, NA, NA, NA, NA, NA), X47 = c(80L,
NA, NA, NA, NA, NA), X48 = c(20L, NA, NA, NA, NA, NA), X49 = c(100L,
NA, NA, NA, NA, NA), X50 = c(0L, NA, NA, NA, NA, NA), X5 = c(0L,
NA, NA, NA, NA, NA), X51 = c(80L, 100L, 70L, 100L, 0L, 60L
), X52 = c(10L, 0L, 0L, 0L, 0L, 20L), X53 = c(40L, 40L, 70L,
20L, 90L, 50L), X54 = c(0L, 10L, 0L, 50L, 50L, 0L), X55 = c(20L,
80L, 90L, 80L, 30L, 0L), X56 = c(100L, 100L, 50L, 100L, 80L,
100L), X57 = c(60L, 0L, 100L, 70L, 100L, 80L), X58 = c(100L,
100L, 100L, 50L, 100L, 100L), X59 = c(80L, 50L, 80L, 0L,
30L, 50L), X60 = c(70L, 50L, 60L, 50L, 100L, 100L), X6 = c(100L,
NA, NA, NA, NA, NA), X61 = c(50L, 50L, 50L, 30L, 70L, 50L
), X62 = c(20L, 50L, 40L, 40L, 50L, 100L), X63 = c(50L, 0L,
100L, 10L, 50L, 100L), X64 = c(60L, 30L, 0L, 50L, 50L, 50L
), X65 = c(50L, 50L, 70L, 80L, 50L, 50L), X66 = c(70L, 40L,
10L, 90L, 60L, 50L), X67 = c(30L, 50L, 50L, 0L, 50L, 60L),
X68 = c(30L, 0L, 0L, 40L, 70L, 80L), X69 = c(30L, NA, 70L,
10L, 0L, 20L), X70 = c(80L, NA, 50L, 50L, 70L, 100L), X7 = c(100L,
NA, NA, NA, NA, NA), X71 = c(70L, NA, 50L, 100L, 100L, 100L
), X72 = c(60L, NA, 70L, 50L, 80L, 50L), X73 = c(80L, NA,
80L, 80L, 80L, NA), X74 = c(50L, NA, 50L, 0L, 50L, NA), X75 = c(30L,
NA, 70L, 10L, 80L, NA), X76 = c(70L, NA, 40L, 80L, 100L,
NA), X77 = c(80L, NA, 50L, 100L, 40L, NA), X78 = c(80L, NA,
0L, 0L, 0L, NA), X79 = c(80L, NA, 50L, 50L, 50L, NA), X80 = c(40L,
NA, 90L, 70L, 60L, NA), X8 = c(50L, NA, NA, NA, NA, NA),
X81 = c(70L, NA, 60L, 40L, 80L, NA), X82 = c(80L, NA, 100L,
60L, 60L, NA), X83 = c(30L, NA, 100L, 30L, 0L, NA), X84 = c(80L,
NA, 0L, 60L, 100L, NA), X85 = c(80L, NA, 50L, 40L, 30L, NA
), X86 = c(50L, NA, 90L, 50L, 50L, NA), X87 = c(80L, NA,
50L, 70L, 20L, NA), X88 = c(40L, NA, 70L, 30L, 90L, NA),
X89 = c(50L, NA, 50L, 80L, 80L, NA), X90 = c(90L, NA, 100L,
60L, 100L, NA), X91 = c(0L, NA, 0L, 0L, 0L, NA), X9 = c(100L,
NA, NA, NA, NA, NA), X92 = c(50L, NA, 70L, 90L, 80L, NA),
X93 = c(40L, NA, 50L, 50L, 50L, NA), X94 = c(40L, NA, 0L,
60L, 40L, NA), X95 = c(90L, NA, 100L, 40L, 50L, NA), X96 = c(50L,
NA, 50L, 50L, 50L, NA), X97 = c(60L, NA, 60L, 100L, 50L,
NA), X98 = c(40L, NA, 40L, 0L, 0L, NA), X99 = c(30L, NA,
0L, 50L, 70L, NA)), .Names = c("X", "X100", "X10", "X1",
"X11", "X12", "X13", "X14", "X15", "X158", "X159", "X160", "X16",
"X161", "X162", "X163", "X164", "X165", "X166", "X167", "X168",
"X169", "X170", "X17", "X171", "X172", "X173", "X174", "X175",
"X176", "X177", "X178", "X179", "X180", "X18", "X181", "X182",
"X183", "X184", "X185", "X186", "X187", "X188", "X189", "X190",
"X19", "X191", "X192", "X193", "X194", "X195", "X196", "X197",
"X198", "X199", "X200", "X20", "X201", "X202", "X203", "X204",
"X205", "X206", "X207", "X2", "X21", "X22", "X23", "X24", "X25",
"X26", "X27", "X28", "X29", "X30", "X3", "X31", "X32", "X33",
"X34", "X35", "X36", "X37", "X38", "X39", "X40", "X4", "X41",
"X42", "X43", "X44", "X45", "X46", "X47", "X48", "X49", "X50",
"X5", "X51", "X52", "X53", "X54", "X55", "X56", "X57", "X58",
"X59", "X60", "X6", "X61", "X62", "X63", "X64", "X65", "X66",
"X67", "X68", "X69", "X70", "X7", "X71", "X72", "X73", "X74",
"X75", "X76", "X77", "X78", "X79", "X80", "X8", "X81", "X82",
"X83", "X84", "X85", "X86", "X87", "X88", "X89", "X90", "X91",
"X9", "X92", "X93", "X94", "X95", "X96", "X97", "X98", "X99"), row.names = c(NA,
6L), class = "data.frame")
Any ideas would be greatly appreciated.
From a few attempts on the small dataset above, it seems that the count is computed for each row, but when I return the res object, it simply gives me the final value. How can I fix this?
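The loop in question is not shown here, so the following is only a guess at the cause: a frequent reason for seeing just the final value is assigning to `res` as a whole on each iteration instead of to `res[i]`. A minimal sketch under that assumption, with a made-up data frame `toy` in place of the real data:

```r
# Toy data (hypothetical values, not the real dataset)
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 100L, NA),
                  c = c(10L, 0L, 0L))

# Preallocate one slot per row so that every iteration keeps its own result
res <- integer(nrow(toy))

for (i in seq_len(nrow(toy))) {
  # Index with [i]; writing `res <- ...` here would overwrite the previous
  # rows and leave only the value for the last row
  res[i] <- sum(toy[i, ] == 0, na.rm = TRUE)
}
```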
+1 for solving the actual coding problem – Vincent
You don't need an anonymous function with apply; you can simply use apply(mat, 1, foo, na.rm = TRUE) – mdsumner
@mdsumner Yes, you're right. It seems I mixed things up at some point. However, since I did not give `foo()` an `na.rm` parameter (with a default value), the code above will not work. I will update my answer to reflect your good point. – chl
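To illustrate the point made in these comments: apply forwards any extra arguments to the function it calls, so `na.rm = TRUE` reaches `foo` only if `foo` declares such a parameter. The definition of `foo` is not shown in this thread, so the version below, along with the `value` parameter and the matrix `mat`, is a hypothetical stand-in.

```r
# Hypothetical foo(): counts how often `value` occurs in x.
# na.rm gets a default so foo() also works when called without it.
foo <- function(x, value = 0, na.rm = FALSE) {
  sum(x == value, na.rm = na.rm)
}

# Small made-up matrix, filled column by column
mat <- matrix(c(0, NA, 50,
                0, 100, NA,
                10, 0, 0), nrow = 3)

# Extra arguments after the function name are passed through to foo()
counts <- apply(mat, 1, foo, na.rm = TRUE)

# Without na.rm = TRUE, any row containing an NA yields NA
counts_with_na <- apply(mat, 1, foo)
```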