2012-08-14 4 views
4

J'essaie d'extraire un code de carte cadeau d'une commande Magento. Un autre code utilise l'API Magento pour récupérer les informations de commande en tant que XML depuis Magento et insérer le XML dans un enregistrement MS SQL Server. En utilisant T-SQL, je peux utiliser des fonctions XML pour analyser le XML extrait de l'API Magento et obtenir presque tout ce dont j'ai besoin, mais le seul endroit où le code de carte cadeau est stocké est le champ gift_cards qui est un php sérialisé chaîne.Désérialiser les données sérialisées PHP dans T-SQL

Exemples:

a:1:{i:0;a:5:{s:1:"i";s:1:"1";s:1:"c";s:12:"00XCY8S3ZXCU";s:1:"a";d:119;s:2:"ba";d:119;s:10:"authorized";d:119;}} 
a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}} 

Le code de carte cadeau est la valeur du tableau avec la clé de "c" comme: 00XCY8S3ZXCU045EMJJWRCF106DUQ7Z5GVT707A6MRYW511J

Je suis en train d'essayer d'aborder ce en analysant la valeur en utilisant une fonction T-SQL, qui est comme essayer de conduire un clou avec un tournevis. Apparemment cela a été demandé avant here et la seule suggestion était de construire un analyseur à partir de zéro dans T-SQL, mais qu'utiliser PHP pour le désérialiser était la meilleure option. Il serait bien que Magento ne stocke pas les données sérialisées PHP dans sa base de données, et les distribue encore sérialisé dans leurs services Web, mais c'est ce que je dois travailler. J'envisagerais d'utiliser C# pour le convertir et le stocker comme un champ séparé dans la base de données, mais il serait beaucoup plus pratique de pouvoir analyser les données dans T-SQL. Si je devais utiliser C# pour analyser et désérialiser l'objet PHP, je le stockerais probablement en XML dans la base de données parce que c'est un format beaucoup mieux pour échanger les données.

+0

Avez-vous envisagé d'écrire une [fonction SQLCLR] (http://msdn.microsoft.com/fr-fr/library/ms189876.aspx) en C#? De cette façon, vous pouvez utiliser .NET pour l'analyse réelle, mais appelez la fonction comme s'il s'agissait d'une fonction TSQL. Il peut vous donner le meilleur des deux mondes dans ce cas, en supposant que votre version de SQL Server prend en charge CLR et l'a activée. – Pondlife

+0

Je l'ai considéré, mais je n'ai jamais joué avec le SQLCLR, alors j'ai fini par travailler avec quelque chose que je connaissais. Je devrais prendre le temps d'en apprendre plus sur la SQLCLR parce que je joue assez récemment avec C#. – mttjohnson

Répondre

1

Voici ce que j'ai pu trouver avec moi-même. J'ai été encouragé par un post about parsing JSON et j'ai décidé de le comprendre pour les objets php sérialisés. J'ai pris une approche complètement différente cependant.

Les exemples de code mis à jour sont now published here on github.

La chaîne php sérialisé:

a:3:{ 
    i:0; 
    a:5:{ 
    s:1:"i"; 
    s:2:"10"; 

    s:1:"c"; 
    s:12:"045EMJJWRCF1"; 

    s:1:"a"; 
    d:100; 

    s:2:"ba"; 
    d:100; 

    s:10:"authorized"; 
    d:100; 
    } 

    i:1; 
    a:5:{ 
    s:1:"i"; 
    s:2:"11"; 

    s:1:"c"; 
    s:12:"06DUQ7Z5GVT7"; 

    s:1:"a"; 
    d:101; 

    s:2:"ba"; 
    d:101; 

    s:10:"authorized"; 
    d:101; 
    } 

    i:2; 
    a:5:{ 
    s:1:"i"; 
    s:2:"12"; 

    s:1:"c"; 
    s:12:"07A6MRYW511J"; 

    s:1:"a"; 
    d:102; 

    s:2:"ba"; 
    d:102; 

    s:10:"authorized"; 
    d:102; 
    } 
} 

Ma requête pour obtenir les résultats:

select * 
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}') 

Les résultats de la requête:

element_id parent_id var_name           var_type           var_length value_int value_string                                                              value_decimal 
----------- ----------- -------------------------------------------------- -------------------------------------------------- ----------- ----------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------- 
1   0   NULL            a             3   NULL  NULL                                                                NULL 
2   1   0             a             5   NULL  NULL                                                                NULL 
3   1   1             a             5   NULL  NULL                                                                NULL 
4   1   2             a             5   NULL  NULL                                                                NULL 
5   2   i             s             2   NULL  10                                                                NULL 
6   2   c             s             12   NULL  045EMJJWRCF1                                                              NULL 
7   2   a             d             NULL  NULL  NULL                                                                100 
8   2   ba             d             NULL  NULL  NULL                                                                100 
9   2   authorized           d             NULL  NULL  NULL                                                                100 
10   3   i             s             2   NULL  11                                                                NULL 
11   3   c             s             12   NULL  06DUQ7Z5GVT7                                                              NULL 
12   3   a             d             NULL  NULL  NULL                                                                101 
13   3   ba             d             NULL  NULL  NULL                                                                101 
14   3   authorized           d             NULL  NULL  NULL                                                                101 
15   4   i             s             2   NULL  12                                                                NULL 
16   4   c             s             12   NULL  07A6MRYW511J                                                              NULL 
17   4   a             d             NULL  NULL  NULL                                                                102 
18   4   ba             d             NULL  NULL  NULL                                                                102 
19   4   authorized           d             NULL  NULL  NULL                                                                102 

Si je voulais juste le cadeau codes de la carte Je peux écrire une requête comme ceci:

select value_string 
from parsePhpSerializedString('a:3:{i:0;a:5:{s:1:"i";s:2:"10";s:1:"c";s:12:"045EMJJWRCF1";s:1:"a";d:100;s:2:"ba";d:100;s:10:"authorized";d:100;}i:1;a:5:{s:1:"i";s:2:"11";s:1:"c";s:12:"06DUQ7Z5GVT7";s:1:"a";d:101;s:2:"ba";d:101;s:10:"authorized";d:101;}i:2;a:5:{s:1:"i";s:2:"12";s:1:"c";s:12:"07A6MRYW511J";s:1:"a";d:102;s:2:"ba";d:102;s:10:"authorized";d:102;}}') 
where parent_id != 0 and 
     var_name = 'c' 

Résultats:

value_string 
------------- 
045EMJJWRCF1 
06DUQ7Z5GVT7 
07A6MRYW511J 

Voici la fonction T-SQL pour analyser la chaîne PHP sérialisé:

IF OBJECT_ID (N'dbo.parsePhpSerializedString') IS NOT NULL 
    DROP FUNCTION dbo.parsePhpSerializedString 
GO 
CREATE FUNCTION dbo.parsePhpSerializedString(@phpSerialized VARCHAR(MAX)) 
RETURNS @results table 
    (
     element_id int identity(1,1) not null, /* internal surrogate primary key gives the order of parsing and the list order */ 
     parent_id int, /* if the element has a parent then it is in this column. */ 
     var_name varchar(50), /* the name or key of the element in a key/value array list */ 
     var_type varchar(50), 
     var_length int, 
     value_int int, 
     value_string varchar(max), 
     value_decimal numeric 
    ) 
AS 
BEGIN 

    /* 
    Built by Matt Johnson ([email protected]) 2012-08-14 
    */ 

    -- we use this table later for collecting auto generated 
    -- identity values when inserting records into @results 
    declare @insertedIds table (
     element_id int 
    ) 

    -- define variables 
    declare @element_start int 
    declare @var_type_end int 
    declare @var_type varchar(50) 
    declare @element_end int 
    declare @chunk varchar(max) 
    declare @var_length_start int 
    declare @var_length_end int 
    declare @var_length_string varchar(max) 
    declare @var_length int 
    declare @value_start int 
    declare @value_end int 
    declare @value_string varchar(max) 
    declare @value_int int 
    declare @value_decimal numeric 
    declare @array_level int 
    declare @value_string_position int 
    declare @next_open int 
    declare @next_close int 
    declare @parent_id int 
    declare @element_id int 
    declare @key_element_id int 
    declare @inserted_element_id int 
    declare @var_name varchar(50) 

    --initialize variables 
    set @parent_id = 0 


    --loop through the supplied @phpSerialized string until it's empty 
    while 1=1 begin 
     set @element_start = null 
     set @var_type_end = null 
     set @var_type = null 
     set @element_end = null 
     set @chunk = null 
     set @var_length_start = null 
     set @var_length_end = null 
     set @var_length_string = null 
     set @var_length = null 
     set @value_start = null 
     set @value_end = null 
     set @value_string = null 
     set @value_int = null 
     set @value_decimal = null 
     set @array_level = null 
     set @value_string_position = null 
     set @next_open = null 
     set @next_close = null 
     set @var_name = null 

     --confirm that there is an element to parse and define its starting point 
     --patindex will return a value of 1 if the pattern is found and this pattern 
     --will only match if the element starting point is the first character in the 
     --supplied string. If it is encapsulated in quotes or anything else it will not match 
     set @element_start = patindex('[asid]:%[;}]', @phpSerialized) 

     if @element_start <= 0 begin 
      --if the supplied string is now empty check the existing results table 
      --for any nested elements in any array elements 

      --reset the value of @element_id to be safe 
      set @element_id = null 

      --only retrieve the first element found containing sub elements to parse 
      select top 1 
        @phpSerialized = value_string, 
        @element_id = element_id  
      from @results 
      where var_type = 'a' and 
        value_string is not null 

      --set the parent_id to the array's element_id 
      set @parent_id = @element_id 

      --if there were no results found then that means there either 
      --were no arrays to parse, or all arrays have already been parsed 
      --so break the continuous loop because we are completely done now 
      if @element_id is null break 

      --set the @element_start again now that we 
      --have a new string to parse for elements 
      set @element_start = patindex('[asid]:%[;}]', @phpSerialized) 
     end 

     --find the end of the type of the element then extract the variable type from the string 
     set @var_type_end = patindex('%:%', @phpSerialized) 
     set @var_type = substring(@phpSerialized, @element_start, @[email protected]_start) 

     --generate an error if a variable type is supplied that hasn't been coded for. 
     if @var_type not like '[asid]' begin 
      /* 
      print @var_type 
      RAISERROR (N'Error parsing php serialized string. Variable type found that has not been defined to parse for.', -- Message text. 
         16, -- Severity, 
         1 -- State 
         ) 
      */ 

      --apparently errors can't be raised within a function so skip the element 
      break 
     end 

     --array elements contain sub elements so we use different methods for parsing 
     --sub elements than we do for parsing individual elements. 
     if @var_type != 'a' begin 
      --element has no sub elements 

      --determine the end of this individual element and then extract 
      --only this individual element from the string 
      set @element_end = patindex('%;%', @phpSerialized)+1 
      set @chunk = substring(@phpSerialized, @element_start, @[email protected]_start) 

      --strings are serialized differently than numeric elements 
      if @var_type = 's' begin 
       --element has var length 

       --find the starting and ending positions for the var_length and then extract the length 
       set @var_length_start = @var_type_end+1 
       set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1 
       set @var_length_string = substring(@chunk, @var_length_start, @[email protected]_length_start) 
       if @var_length_string not like '[^0-9]' begin 
        --its nice to verify this is actually a number before casting it as such 
        set @var_length = cast(@var_length_string as int) 
       end 

       --find the starting and ending positions for the value and then extract the value 
       set @value_start = @var_length_end+1 
       set @value_end = patindex('%;%', @chunk) 
       --a string value is quoted so remove quotes in start and end of substring for value 
       --we set the substring starting position +1 just past the start of the quote and then 
       --set the length of the extracted string -2 to account for both the starting quote and 
       --ending quote to be removed from the extracted string. 
       set @value_string = substring(@chunk, @value_start+1, @[email protected]_start-2) 

      end else begin 
       --element does not have a var length 

       --find the starting and ending positions for the value and then extract the value as a string 
       set @value_start = @var_type_end+1 
       set @value_end = patindex('%;%', @chunk) 
       set @value_string = substring(@chunk, @value_start, @[email protected]_start) 

       --determine what value type the string should be converted to 
       if @var_type = 'i' begin 
        if @value_string not like '[^0-9.]' begin 
         set @value_int = cast(@value_string as int) 
         --clear the value_string because the element's value has been converted to its appropriate type 
         set @value_string = null 
        end 
       end else if @var_type = 'd' begin 
        if @value_string not like '[^0-9.]' begin 
         set @value_decimal = cast(@value_string as numeric) 
         --clear the value_string because the element's value has been converted to its appropriate type 
         set @value_string = null 
        end 
       end 

      end 


     end else begin 
      --element is array and has sub elements 

      --we are going to chop up the string to try and determine its end so we'll 
      --first set the string to a variable we can destroy in this process 
      set @chunk = @phpSerialized 

      --find the starting and ending positions for the var_length and then extract the length 
      --arrays use this to state how may elements this array contains 
      set @var_length_start = @var_type_end+1 
      set @var_length_end = patindex('%:%', substring(@chunk, @var_length_start, len(@chunk))) + @var_length_start - 1 
      set @var_length_string = substring(@chunk, @var_length_start, @[email protected]_length_start) 
      if @var_length_string not like '[^0-9]' begin 
       set @var_length = cast(@var_length_string as int) 
      end 

      --find the value starting position 
      --later we will find the true end of the value 
      set @value_start = @var_length_end+1 

      -- to determine the ending position we have to dig through the sub elements and track the 
      -- nested level to identify the ending brace for this level 
      set @array_level = 0 
      --we start the string position at 1 for the begining of the serialized string 
      set @value_string_position = 1 

      -- loop through the value chopping up the chunk while trying to find the ending brace for this array 
      while 1=1 begin 

       --find the next open and close braces in the chunk 
       set @next_open = patindex('%{%', @chunk) 
       set @next_close = patindex('%}%', @chunk) 

       --check to see which brace is the next in the chunk 
       if @next_open > 0 and @next_open < @next_close begin 
        --found an opening brace 

        --since this is an opening brace we need to increment the level and strip off 
        --everything from the chunk before the brace so that we can search for additional braces 
        --we also note the position in the string for use in finding the end of the value later 
        --we track the previous position and add to it because we keep chopping off the beginning of 
        --the chunk as we parse through the string, and later we will need to reference the position 
        --relative to the entire serialized string. 
        set @value_string_position = @value_string_position + patindex('%{%', @chunk)-1 
        set @chunk = substring(@chunk, patindex('%{%', @chunk)+1, len(@chunk)) 
        set @array_level = @array_level + 1 

       end else if @next_close > 0 begin 
        --found a closing brace 
        --print 'found close at level: ' + cast(@array_level as varchar(10)) + '(' + cast(patindex('%}%', @chunk) as varchar(10)) + ')' 

        --since this is a closing brace we need to decrement the level and strip off 
        --everything from the chunk before the brace so that we can search for additional braces 
        --we also note the position in the string for use in finding the end of the value later 
        --we track the previous position and add to it because we keep chopping off the beginning of 
        --the chunk as we parse through the string, and later we will need to reference the position 
        --relative to the entire serialized string. 
        set @value_string_position = @value_string_position + patindex('%}%', @chunk)+1 
        set @chunk = substring(@chunk, patindex('%}%', @chunk)+1, len(@chunk)) 
        set @array_level = @array_level - 1 

       end else break 

       --once we get back to level 0 we know we've found the end of this array element 
       --so break the continuous loop now that we have the ending position 
       if @array_level <= 0 break 
      end 

      --set the ending position of the element and the value since the value is the last part of the element 
      set @element_end = @value_string_position 
      set @value_end = @element_end 
      --an array value is surrounded by braces so remove the braces in start and end of the substring value 
      --we set the substring starting position +1 just past the start of the opening brace and then 
      --set the length of the extracted string -2 to account for both the opening brace and 
      --closing brace to be removed from the extracted string. 
      set @value_string = substring(@phpSerialized, @value_start+1, @[email protected]_start-2) 
      set @chunk = substring(@phpSerialized, @element_start, @[email protected]_start) 

      -- if the array is empty just set it to null so that 
      -- we don't try and parse the contents of the array value later. 
      if @value_string = '' set @value_string = null 
     end 

     --we populate the results table differently depending on the element that is being parsed. 
     --Any element contained in an array has a key element and a value element. Though we parse 
     --all key elements in an array all we do with them in the results table is set their value 
     --as the var_name (key) for the element, and store the key_element_id for setting the value 
     --in the next pass since key/value pairs are listed sequentially in a serialized string. 
     if @parent_id > 0 and @key_element_id > 0 begin 
      --parent_id > 0 indicates this is a sub element inside an array 
      --and the array contents currently being parsed contain key elements and value 
      --elements sequentially in the serialized string. Because key_element_id is not 0 
      --we know this must be the value part of the element contained in the array as the 
      --key was just previously defined. 

      --update the existing keyed element with it's type, length and value 
      update @results 
      set var_type = @var_type, 
       var_length = @var_length, 
       value_string = @value_string, 
       value_int = @value_int, 
       value_decimal = @value_decimal 
      where element_id = @key_element_id  

      set @key_element_id = null 
     end else if @parent_id > 0 begin 
      --this element is a part of the array which contains key/value paris and since 
      --the @key_element_id is 0 or not defined we can asume this is the key of the pair 

      --determine what the key element type is and cast it as a string to the var_name 
      if @var_type = 'i' set @var_name = cast(@value_int as varchar(50)) 
      if @var_type = 'd' set @var_name = cast(@value_decimal as varchar(50)) 
      if @var_type = 's' set @var_name = @value_string 

      --here we are assuming that all keys should be of type s, d, or i... so we if get anything 
      --other than that throw an error since this code is not setup to handle that. 
      if @var_type not like '[sid]' begin 
       /* 
       print @var_type 
       RAISERROR (N'Error parsing php serialized string. Variable type found that should be akey but is not of type s, d, or i.', -- Message text. 
          16, -- Severity, 
          1 -- State 
          ) 
       */ 

       --apparently errors can't be raised within a function so skip the element 
       break 
      end 

      --insert a new record into the results table defining the parent_id and var_name 
      insert @results 
      (
       parent_id, 
       var_name 
      ) 
      output inserted.element_id into @insertedIds 
      values (
       @parent_id, 
       @var_name 
      ) 

      --since we stored the identity value in the output set that value to @inserted_element_id 
      select top 1 @inserted_element_id = element_id from @insertedIds 
      delete from @insertedIds 

      --set the key_element_id so that the next pass catches the value and assigns it to this result record 
      set @key_element_id = @inserted_element_id 

     end else begin 
      --this will be executed for parsed strings that are not part of an array 
      --in which case the elements do not contain value_name keys 

      --insert the entire element details into the results table 
      insert into @results 
      (
       parent_id, 
       var_type, 
       var_length, 
       value_string, 
       value_int, 
       value_decimal 
      ) 
      output inserted.element_id into @insertedIds 
      values (
       @parent_id, 
       @var_type, 
       @var_length, 
       @value_string, 
       @value_int, 
       @value_decimal 
      ) 

      --here we capture the identiy value for the inserted record 
      select top 1 @inserted_element_id = element_id from @insertedIds 
      delete from @insertedIds 
     end 

     -- if the current php serilized string was an array then it would have 
     -- been parsed and elements added to the results, so we should remove 
     -- the value_string from the array element in the results table so that 
     -- it doesn't get parsed again in the loop. 
     if @element_id is not null begin 
      --the only strings that get parsed where the @element_id has 
      --a value is from an array element 
      update @results 
      set value_string = null 
      where element_id = @element_id 
     end 

     --since we have parsed this element from the serialized string chop off this element 
     --from the string and run the rest of it through the loop again to ensure all 
     --elements have been parsed from the supplied serialized string. 
     set @phpSerialized = substring(@phpSerialized, @element_end, len(@phpSerialized)) 

    end 

    return 
end 

Et si elle ne tient pas compte de tous les types sérialisés et tels, c'était suffisant pour mon but ici, et les commentaires inclus devraient aider à guider n'importe qui qui a besoin de l'adapter pour leur propre usage, ou étendre sa fonctionnalité.

+0

J'ai essayé d'éditer ma réponse et de la mettre à jour avec quelques corrections de bugs à la fonction que j'ai posté, mais j'ai continué à recevoir des erreurs en essayant de poster le nouveau code. J'ai publié la version mise à jour sur un article de blog ici (maintenant lien vers la version d'archive Internet): https://web.archive.org/web/20130920034044/http://www.evdat.com/2013/03/15/ unserialize-php-sérialisé-data-in-t-sql/ – mttjohnson

+0

Malheureusement, le lien vers votre script zip mis à jour n'est pas valide. Pourriez-vous le rendre disponible à nouveau? – needfulthing

+1

J'ai trouvé une sauvegarde du fichier et j'ai décidé de le publier sur GitHub: https://github.com/mjohnsonperl/tsqlphpunserialize/tree/master – mttjohnson

0

Voici une autre fonction MSSQL que vous pouvez utiliser pour extraire la valeur de l'instance première d'un jeton d'une chaîne sérialisée PHP, qui se trouve généralement dans la base de données de Magento. La fonction accepte le nom d'attribut pour la valeur que vous souhaitez extraire, et la valeur sérialisé que vous voulez l'analyser à partir de:

CREATE FUNCTION extractValueFromSerializedPhpString(
    @attribute_name nvarchar(100), @serialized_value nvarchar(max) 
) RETURNS nvarchar(1024) 
BEGIN 
    DECLARE @attribute_value_length int, @attribute_value_length_string nvarchar(5) 
    SELECT @attribute_value_length_string = REPLACE(REPLACE(SUBSTRING(@serialized_value,CHARINDEX(@attribute_name,@serialized_value)+LEN(@attribute_name)+4,3),':',''),'"','') 
    IF ISNUMERIC(@attribute_value_length_string) = 1 BEGIN 
     SELECT @attribute_value_length = CAST(@attribute_value_length_string AS int) 
     RETURN SUBSTRING(@serialized_value,CHARINDEX(@attribute_name,@serialized_value)+LEN(@attribute_name)+CASE WHEN @attribute_value_length > 99 THEN 9 WHEN @attribute_value_length > 9 THEN 8 ELSE 7 END,@attribute_value_length) 
    END 
    RETURN NULL 
END 

Notez que cela ne fonctionne que pour les valeurs de chaîne avec des longueurs de 1 à 999 et a probablement toutes sortes d'autres cas ennuyeux de bord aussi, mais cela a fonctionné pour mes buts (simples).

Questions connexes