2017-10-06 1 views
0

J'ai créé une fonction qui crée une adresse aléatoire, mais cela prend beaucoup trop de temps pour chaque appel (environ 10 à 20 secondes). Je dois gérer cela sur plus de 900 000 enregistrements, et selon mes calculs sur le calendrier de cette fonction, cela prendrait 120 jours à donner ou à recevoir. Voici la fonction:Accélérer la fonction table

CREATE function dbo.fn_GetAddress2 (@state NVARCHAR(20)) 
returns @NewAddress TABLE 
( 
    Address1 NVARCHAR(MAX), 
    Address2 NVARCHAR(MAX), 
    City  NVARCHAR(MAX), 
    Postcode NVARCHAR(MAX) 
) 
AS 
BEGIN 
    DECLARE @Address1 NVARCHAR(MAX) 
    DECLARE @Address2 NVARCHAR(MAX) 
    DECLARE @City  NVARCHAR(MAX) 
    DECLARE @Postcode NVARCHAR(MAX) 
    DECLARE @StreetPID NVARCHAR(MAX) 
    DECLARE @newID1  NVARCHAR(36) 

    SELECT @StreetPID = 
     (SELECT TOP 1 g.street_locality_pid AS StreetPID 
      FROM [GNAF].dbo.Street_Locality g 
       INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
      WHERE g.street_name IS NOT NULL AND g.state != @state AND aD.flat_number IS NOT NULL 
      ORDER BY (SELECT new_id FROM getNewID)) 

    SELECT @Address1 = 
     (SELECT TOP 1 CAST(aD.flat_number AS VARCHAR(20)) + ' ' + g.Street_name + ' ' + g.street_type_code AS Address1 
      FROM [GNAF].dbo.Street_Locality g 
       INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
      WHERE g.street_name IS NOT NULL AND g.state != @state AND aD.flat_number IS NOT NULL 
        AND g.street_locality_pid = @StreetPID 
      ORDER BY (SELECT new_id FROM getNewID)) 


    SELECT @postcode = 
     (SELECT TOP 1 aD.postcode AS postcode 
      FROM [GNAF].dbo.Street_Locality g 
       INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
      WHERE g.street_name IS NOT NULL AND g.state != @state AND aD.flat_number IS NOT NULL 
        AND g.street_locality_pid = @StreetPID 
      ORDER BY (SELECT new_id FROM getNewID)) 

    SELECT @City = 
     (SELECT TOP 1 l.locality_name AS city 
      FROM [GNAF].dbo.Street_Locality g 
       INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
       INNER JOIN [GNAF].dbo.Locality l ON aD.locality_pid = l.locality_pid 
      WHERE g.street_name IS NOT NULL AND g.state != @state AND aD.flat_number IS NOT NULL 
        AND g.street_locality_pid = @StreetPID 
      ORDER BY (SELECT new_id FROM getNewID)) 

    IF @Address1 IS NOT NULL 
    BEGIN 
     INSERT @NewAddress 
     SELECT @Address1, @Address2, @city, @postcode; 
    END; 
    Return; 
END 
GO 

Le [GNAF] base de données est une base de données énorme, rempli de chaque adresse unique en Australie. Les fonctions et newid() sont complètement nouvelles pour moi.

Ive a essayé quelques méthodes différentes, y compris CTE:

SET @State = 'NSW' 
;WITH CTE AS (
    SELECT TOP 1 CAST(aD.flat_number AS VARCHAR(20)) + ' ' + g.Street_name + ' ' + g.street_type_code AS Address1 
      , aD.postcode AS postcode 
    FROM [GNAF].dbo.Street_Locality g 
     INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
    WHERE g.street_name IS NOT NULL AND g.state != @state AND aD.flat_number IS NOT NULL 
    ORDER BY (SELECT new_id FROM getNewID) 
) 
SELECT @Address1 = (SELECT Address1 FROM CTE) 
     ,@postcode = (SELECT postcode FROM CTE) 
SELECT @Address1 
     , @postcode 

Cela a été fait plus lent. Toute aide à ce sujet serait grandement appréciée.

+0

Quelques choses: Un meilleur plan d'exécution pourrait être produit si vous "CROSS JOIN" la table 'getNewID' et commandez par la colonne' new_id' au lieu de la façon dont vous le faites actuellement.Et en supposant que je comprenne ce que vous essayez de faire correctement, vous n'avez pas besoin de toutes ces variables ou d'un CTE ou de la variable table - vous pouvez faire tout ce que vous voulez avec une seule instruction select. par exemple. 'CREATE FUNCTION dbo.blah (@state NVARCHAR (20)) RETOURNE TABLE COMME RETOUR (SELECT TOP 1 FROM CROSS JOIN getNewID AS n O WH ORDER BY n.new_id);' – ZLK

+0

Merci, ce que j'ai remarqué est que la requête prend exactement le même temps si elle retourne 1 ou 1000000 enregistrements, ce que j'essaie de faire maintenant est de créer une requête SQL dynamique pour compter la quantité de l'état unique alors il suffit de le joindre en utilisant row_number avec les données réelles. J'ai juste du mal à passer une variable dans la requête CTE comme SQL dynamique –

+0

aaaand je me souviens juste que vous pouvez utiliser TOP (@variable) dans SQL 2005 + –

Répondre

1

Voici quelque chose qui devrait fonctionner pour vous. Veuillez noter: Plutôt que de faire des allers-retours vers une table d'adresses complète, j'ai simplement créé 5 nouvelles tables, une pour chaque section d'adresse et les ai remplies avec des données d'une table d'adresses. J'ai utilisé 2000 pour tous sauf la table d'état. Vous pouvez utiliser plus ou moins juste assurez-vous de changer les valeurs modulo dans la fonction pour correspondre au nombre de lignes que vous mais dans chaque table.

Dans tous les cas, c'est rapide ... Je vais poster les STOTISTIQUES SET STOTISTICS, les nombres de temps basés sur 10.000, 100.000 & 1.000.000 lignes sont générées.

USE tempdb; 
GO 
-- Populate a series of individual tables one for each part of the address... 
CREATE TABLE dbo.a1 (ID INT NOT NULL IDENTITY (1,1) PRIMARY KEY CLUSTERED, Address1 VARCHAR(60)); 
INSERT dbo.a1 (Address1) 
SELECT TOP 2000 b.PhysAddr1 FROM Xyz.dbo.ContactBranch b WHERE b.PhysAddr1 LIKE '[0-Z ][0-Z ][0-Z ][0-Z ][0-Z ]%'; 

CREATE TABLE dbo.a2 (ID INT NOT NULL IDENTITY (1,1) PRIMARY KEY CLUSTERED, Address2 VARCHAR(50)); 
INSERT dbo.a2 (Address2) 
SELECT TOP 2000 ISNULL(b.PhysAddr2, '') FROM Xyz.dbo.ContactBranch b; 

CREATE TABLE dbo.cty (ID INT NOT NULL IDENTITY (1,1) PRIMARY KEY CLUSTERED, City VARCHAR(50)); 
INSERT dbo.cty (City) 
SELECT TOP 2000 b.PhysCity FROM Xyz.dbo.ContactBranch b WHERE b.PhysCity LIKE '[0-Z ][0-Z ][0-Z ][0-Z ][0-Z ]%'; 

CREATE TABLE dbo.st (ID INT NOT NULL IDENTITY (1,1) PRIMARY KEY CLUSTERED, State CHAR(2)); 
INSERT dbo.st (State) 
SELECT s.Description FROM Xyz.dbo.LK_States s WHERE s.Description LIKE '[a-Z][a-Z]'; 

CREATE TABLE dbo.zip (ID INT NOT NULL IDENTITY (1,1) PRIMARY KEY CLUSTERED, Zip VARCHAR(5)); 
INSERT dbo.zip (Zip) 
SELECT TOP 2000 LEFT(b.PhysZip10, 5) FROM Xyz.dbo.ContactBranch b WHERE b.PhysZip10 LIKE '[0-Z ][0-Z ][0-Z ][0-Z ][0-Z ]%'; 

/* DROP TABLE dbo.a1; DROP TABLE dbo.a2; DROP TABLE dbo.cty; DROP TABLE dbo.st; DROP TABLE dbo.zip; */ 
/* 
(2000 rows affected) 
(2000 rows affected) 
(2000 rows affected) 
(52 rows affected) 
(2000 rows affected) 
*/ 

Le code de fonction ...

SET QUOTED_IDENTIFIER ON 
GO 
SET ANSI_NULLS ON 
GO 
CREATE FUNCTION dbo.tfn_AddressGenerator 
/* =================================================================== 
10/06/2017 JL, Created: to randomly generate random addresses. 
    The general premmise is based on the Ben-Gan" or inline Tally table. 
=================================================================== */ 
--===== Define I/O parameters 
(
    @State CHAR(2), 
    @NumToCreate INT 
) 
RETURNS TABLE WITH SCHEMABINDING AS 
RETURN 

    WITH 
     cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), --rows 
     cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),        -- 100 rows 
     cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),        -- 10,000 rows 
     cte_Tally (n) AS (
      SELECT TOP (@NumToCreate) 
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
      FROM 
       cte_n3 a CROSS JOIN cte_n3 b             -- 100,000,000 rows 
      ) 
    SELECT 
     a1.Address1, 
     a2.Address2, 
     c.City, 
     State = IIF(s1.State = @State, s2.State, s1.State), 
     z.Zip 
    FROM 
     cte_Tally t 
     CROSS APPLY (VALUES (
      ABS(CHECKSUM(t.n)) % 2000 + 1, ABS(CHECKSUM(t.n)) % 1528 + 1, 
      ABS(CHECKSUM(t.n)) % 2000 + 1, ABS(CHECKSUM(t.n)) % 52 + 1, 
      ABS(CHECKSUM(t.n)) % 52 + 1, ABS(CHECKSUM(t.n)) % 2000 + 1 
      )) x (Add1, Add2, City, State1, State2, Zip) 
     CROSS APPLY (SELECT TOP 1 dbo.a1.Address1 FROM dbo.a1 WHERE x.Add1 = dbo.a1.ID) a1 
     CROSS APPLY (SELECT TOP 1 dbo.a2.Address2 FROM dbo.a2 WHERE x.Add2 = dbo.a2.ID) a2 
     CROSS APPLY (SELECT TOP 1 dbo.cty.City FROM dbo.cty WHERE x.City = dbo.cty.ID) c 
     CROSS APPLY (SELECT TOP 1 dbo.st.State FROM dbo.st WHERE x.State1 = dbo.st.ID) s1 
     CROSS APPLY (SELECT TOP 1 dbo.st.State FROM dbo.st WHERE x.State2 = dbo.st.ID) s2 
     CROSS APPLY (SELECT TOP 1 dbo.Zip.Zip  FROM dbo.zip WHERE x.Zip = dbo.zip.ID) z; 
GO 

L'exécution réelle de la fonction ...

SELECT ag.Address1, ag.Address2, ag.City,ag.State, ag.Zip 
FROM dbo.tfn_AddressGenerator('FL',10000) ag; 

Exemple de sortie ...

Address1     Address2 City    State Zip 
--------------------------- ----------- ---------------- ----- ----- 
111 CONGRESSIONAL BLVD     ATLANTA   AL 30042 
414 Eagle Rock Ave # 100 STE 400  MARIETTA   AR 70816 
414 Eagle Rock Ave Ste 107 Suite 300 NORCROSS   AZ 72116 
3931 HIGHWAY 78 W STE B200    SAVANNAH   CA 31702 
4728 Joseph Eli Dr   STE 6  STONE MOUNTAIN CO 30338 
29620 IH10 West       DULUTH   CT 63026 
4666 El Camino Real      ATLANTA   DC 60555 
3700 Thomas Rd Ste 215  STE 100  ATLANTA   DE 32241 
3700 Thomas Rd Ste 215  STE B-2190 ALPHARETTA  FL 36117 
2615 East West Connector    ALPHARETTA  GA 35201 

10.000 résultats de rangée ...

SQL Server parse and compile time: 
    CPU time = 0 ms, elapsed time = 0 ms. 

(10000 rows affected) 
Table 'zip'. Scan count 0, logical reads 20000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'st'. Scan count 0, logical reads 40000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'cty'. Scan count 0, logical reads 20000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'a2'. Scan count 0, logical reads 20000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'a1'. Scan count 0, logical reads 20000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times: 
    CPU time = 94 ms, elapsed time = 93 ms. 

100.000 résultats de rangée ...

SQL Server parse and compile time: 
    CPU time = 0 ms, elapsed time = 0 ms. 

(100000 rows affected) 
Table 'zip'. Scan count 0, logical reads 200000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'st'. Scan count 0, logical reads 400000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'cty'. Scan count 0, logical reads 200000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'a2'. Scan count 0, logical reads 200000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'a1'. Scan count 0, logical reads 200000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times: 
    CPU time = 907 ms, elapsed time = 948 ms. 

1.000.000 résultats de rangée ...

SQL Server parse and compile time: 
    CPU time = 0 ms, elapsed time = 1 ms. 
SQL Server parse and compile time: 
    CPU time = 31 ms, elapsed time = 51 ms. 

(1000000 rows affected) 
Table 'a1'. Scan count 0, logical reads 4000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'a2'. Scan count 0, logical reads 3056, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'cty'. Scan count 0, logical reads 4000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'st'. Scan count 0, logical reads 208, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'zip'. Scan count 0, logical reads 4000, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 

SQL Server Execution Times: 
    CPU time = 10921 ms, elapsed time = 15743 ms. 

100K rangs en moins d'une seconde & 1 million de lignes ~ 15 secondes ...

0

Rond était de courir juste la meilleure façon contre chaque Etat avec une variable de quantité, voici le code:

DECLARE @states TABLE (name NVARCHAR(50)); 

INSERT INTO @states (name) 
SELECT DISTINCT 
    State 
FROM anon_AddressChange 


DECLARE @count INT 
DECLARE @i INT 
SET @i = 0 
SET @count = (SELECT COUNT(*) FROM @states) 

while @i < @count 
BEGIN 

    DECLARE @state NVARCHAR(MAX) 
    SET @State = (SELECT top 1 name from @states order by name) 

    DECLARE @amount INT 
    SET @amount = (SELECT count(*) FROM anon_addresschange where state = @state) 





    ;WITH CTE AS (
     SELECT TOP (@amount) CAST(aD.flat_number AS VARCHAR(20)) + ' ' + g.Street_name + ' ' + g.street_type_code AS Address1 
       , aD.postcode AS postcode 
       , l.locality_name AS city 

     FROM [GNAF].dbo.Street_Locality g 
      INNER JOIN [GNAF].dbo.Address_Detail aD ON g.street_locality_pid = aD.street_locality_pid 
      INNER JOIN [GNAF].dbo.Locality l ON aD.locality_pid = l.locality_pid 
     WHERE g.street_name IS NOT NULL AND g.state = @state AND aD.flat_number IS NOT NULL 
      AND g.state NOT IN ('OT', 'NT' ,'TAS' ,'VIC' ,'ACT') 
     ORDER BY (SELECT new_id FROM getNewID) 
    ) 
    UPDATE anon_addresschange SET 
     newStreet1  = UPPER(LEFT(a.Address1,1))+LOWER(SUBSTRING(a.Address1,2,LEN(a.Address1))) 
     ,newCity  = UPPER(LEFT(a.city,1))+LOWER(SUBSTRING(a.city,2,LEN(a.city))) 
     ,newPostcode = a.postcode 
     ,newState  = @state 
     ,newCountry  = 'Australia' 
    FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY CAST(GETDATE() AS TIMESTAMP)) AS RowNumber from cte) a 
    CROSS APPLY (
    SELECT *, ROW_NUMBER() OVER (ORDER BY CAST(GETDATE() AS TIMESTAMP)) AS RowNumber FROM anon_AddressChange 
    WHERE state = @state) b 
    WHERE a.Rownumber = b.Rownumber 
     AND anon_addresschange.personID = b.personID 


    SET @i = @i + 1 
    delete from @states WHERE NAME IN (SELECT TOP 1 name FROM @states order by name) 
END 

Tout ce que je dois vraiment faire est de l'utiliser dans une déclaration mise à jour/insertion.

Cette opération a pris 2 secondes pour s'exécuter sur 1003 enregistrements, soit 33 minutes pour les 1 000 000 enregistrements.