2016-11-09

Performance of a descendants search using a Postgres intarray field

I have a Postgres (version 9.5.4) table geo that contains 738,884 records of geographic data for a country, with the following structure:

                              Table "public.geo"
   Column   |          Type          | Modifiers | Storage  | Stats target | Description
------------+------------------------+-----------+----------+--------------+-------------
 id         | integer                | not null  | plain    |              |
 kind       | character varying(255) |           | extended |              |
 name       | character varying(255) |           | extended |              |
 is_owner   | integer                |           | plain    |              |
 path_array | integer[]              |           | extended |              |
Indexes:
    "geo_pkey" PRIMARY KEY, btree (id)
    "kind_index" btree (kind)
    "path_array_idx" gin (path_array gin__int_ops)

The records form a hierarchy by the kind field: country -> province -> area -> locality. This hierarchy is stored in the path_array field as an array of the ancestors' ids followed by the row's own id.

Example:

17239123 locality Moscow 1 {17073865,17073877,17073958,17239123} 

I installed the intarray extension and added an appropriate index on the path_array field.

Now I have a bunch of record ids that can be of any kind (from country to locality), and I need to select all of their descendants with kind locality (that is, the records that have any of these ids in their path_array).

Here is my query:
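To make the matching rule concrete, here is a minimal sketch of how the overlap operator (&&) behaves on the sample Moscow row shown above. It only illustrates the operator's semantics and is not part of the original query; the two id lists are picked from the ids that appear in this question.

```sql
-- && is true when the two arrays share at least one element.
-- The left-hand array is the path_array of the sample row 17239123.

-- 17073958 is one of the row's ancestors, so the arrays overlap: true.
SELECT '{17073865,17073877,17073958,17239123}'::integer[]
       && '{17073958,17073888}'::integer[] AS is_match;

-- 17073888 does not appear anywhere in the path: false.
SELECT '{17073865,17073877,17073958,17239123}'::integer[]
       && '{17073888}'::integer[] AS is_match;
```

Because each path ends with the row's own id, a locality whose id is itself in the list also matches.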

SELECT 
    id 
FROM geo 
WHERE 
    kind = 'locality' 
    AND is_owner = 1 
    AND path_array && '{17073888,17073984,17073885,17073905,17073958,17073927,17073908,17073952,17073948,17073947,17073917,17073944,17073919,17073922,17073914,17073937,17073895,17073904,17073911,17073949,17073938,17073957,17073900,17073915,17073936,17073887,17073933,17073939,17073956,17073884,17073901,17073881,17153202,17073916,17073945,17073883,17073943,17073909,17073950,17073942,17073906,17073886,17073910,17073882,17073941,17073891,17073929,17073928,17073903,17073912,17073930,17073898,17073899,17073954}'::integer[] 

Here is the EXPLAIN ANALYZE output:

Bitmap Heap Scan on geo (cost=1418.04..1532.99 rows=8 width=4) (actual time=685.183..723.330 rows=20984 loops=1) 
Recheck Cond: ((is_owner = 1) AND (path_array && '{17073888,17073984,17073885,17073905,17073958,17073927,17073908,17073952,17073948,17073947,17073917,17073944,17073919,17073922,17073914,17073937,17073895,17073904,17073911,17073949,17073938,17073957,17073900,17073915,17073936,17073887,17073933,17073939,17073956,17073884,17073901,17073881,17153202,17073916,17073945,17073883,17073943,17073909,17073950,17073942,17073906,17073886,17073910,17073882,17073941,17073891,17073929,17073928,17073903,17073912,17073930,17073898,17073899,17073954}'::integer[])) 
Filter: ((kind)::text = 'locality'::text) 
Rows Removed by Filter: 2037 
Heap Blocks: exact=17106 
-> BitmapAnd (cost=1418.04..1418.04 rows=29 width=0) (actual time=681.154..681.154 rows=0 loops=1) 
    -> Bitmap Index Scan on is_owner_index (cost=0.00..544.24 rows=29309 width=0) (actual time=5.493..5.493 rows=29201 loops=1) 
      Index Cond: (is_owner = 1) 
    -> Bitmap Index Scan on path_array_idx (cost=0.00..873.54 rows=739 width=0) (actual time=667.888..667.888 rows=607440 loops=1) 
      Index Cond: (path_array && '{17073888,17073984,17073885,17073905,17073958,17073927,17073908,17073952,17073948,17073947,17073917,17073944,17073919,17073922,17073914,17073937,17073895,17073904,17073911,17073949,17073938,17073957,17073900,17073915,17073936,17073887,17073933,17073939,17073956,17073884,17073901,17073881,17153202,17073916,17073945,17073883,17073943,17073909,17073950,17073942,17073906,17073886,17073910,17073882,17073941,17073891,17073929,17073928,17073903,17073912,17073930,17073898,17073899,17073954}'::integer[]) 
Planning time: 0.212 ms 
Execution time: 727.370 ms 

The query above took about 700 ms, which I believe is very slow. Am I right, or am I asking too much?

According to the explain plan, the query took only 700 ms, not 1.4 seconds –

It was a network latency issue. Fixed, and added a possible solution. – bbrodriges

Looks good to me - afaiac, go ahead and post it as an answer. – shaunc

Answer

I created a partial GIN index on path_array, restricted to the rows where is_owner = 1:

CREATE INDEX path_array_owner_idx ON geo USING gin (path_array gin__int_ops) WHERE is_owner = 1 

------------------------------------------------------------------------------------- 
    Bitmap Heap Scan on geo (cost=436.04..550.99 rows=8 width=4) (actual time=30.292..68.778 rows=20984 loops=1) 
    Recheck Cond: ((path_array && '{17073888,17073984,17073885,17073905,17073958,17073927,17073908,17073952,17073948,17073947,17073917,17073944,17073919,17073922,17073914,17073937,17073895,17073904,17073911,17073949,17073938,17073957,17073900,17073915,17073936,17073887,17073933,17073939,17073956,17073884,17073901,17073881,17153202,17073916,17073945,17073883,17073943,17073909,17073950,17073942,17073906,17073886,17073910,17073882,17073941,17073891,17073929,17073928,17073903,17073912,17073930,17073898,17073899,17073954}'::integer[]) AND (is_owner = 1)) 
    Filter: ((kind)::text = 'locality'::text) 
    Rows Removed by Filter: 2037 
    Heap Blocks: exact=17106 
    -> Bitmap Index Scan on path_array_owner_idx (cost=0.00..436.04 rows=29 width=0) (actual time=25.923..25.923 rows=23021 loops=1) 
     Index Cond: (path_array && '{17073888,17073984,17073885,17073905,17073958,17073927,17073908,17073952,17073948,17073947,17073917,17073944,17073919,17073922,17073914,17073937,17073895,17073904,17073911,17073949,17073938,17073957,17073900,17073915,17073936,17073887,17073933,17073939,17073956,17073884,17073901,17073881,17153202,17073916,17073945,17073883,17073943,17073909,17073950,17073942,17073906,17073886,17073910,17073882,17073941,17073891,17073929,17073928,17073903,17073912,17073930,17073898,17073899,17073954}'::integer[]) 
Planning time: 0.219 ms 
Execution time: 72.956 ms 

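Now the query above takes about 70 ms, which is nice. The index scan on path_array_owner_idx returns 23,021 rows, instead of the 607,440 rows that path_array_idx returned before they were combined with is_owner_index.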
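For anyone who wants to try the idea on a small scale, here is a self-contained sketch of the same partial-index approach. The geo_demo table, its index name and the 'area' row are hypothetical and exist only for illustration; only the locality row comes from the question.

```sql
-- Hypothetical miniature of the geo table, for illustration only.
CREATE EXTENSION IF NOT EXISTS intarray;

CREATE TEMP TABLE geo_demo (
    id         integer PRIMARY KEY,
    kind       varchar(255),
    is_owner   integer,
    path_array integer[]
);

INSERT INTO geo_demo (id, kind, is_owner, path_array) VALUES
    -- Assumed ancestor row (its kind is a guess based on the hierarchy order).
    (17073958, 'area',     1, '{17073865,17073877,17073958}'),
    -- The sample locality row from the question.
    (17239123, 'locality', 1, '{17073865,17073877,17073958,17239123}');

-- Partial GIN index: only rows with is_owner = 1 are indexed.
CREATE INDEX geo_demo_path_owner_idx
    ON geo_demo USING gin (path_array gin__int_ops)
    WHERE is_owner = 1;

-- The WHERE clause repeats is_owner = 1, so the planner may consider the
-- partial index. On a table this small it will likely prefer a sequential scan.
SELECT id
FROM geo_demo
WHERE kind = 'locality'
  AND is_owner = 1
  AND path_array && '{17073958}'::integer[];
```

The planner can use a partial index only when the query's WHERE clause implies the index predicate, so a query that leaves out is_owner = 1 would not be able to use it.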