首页 文章

USQL调用不带密钥读取json数组

提问于
浏览
2

我有一个嵌套的json,我试图在usql中展平 . 我无法共享数据,但结构与此类似 .

{
    "userlist": [user1, user1],
    "objects": {
        "largeobjects": [object1, object2, object3]
        "smallobjects": [s_object1, s_object2]
    },
    "applications": [{
            "application": sdq3ds5dsa
        }, {
            "application": dksah122j4
        }, {
            "application": sadsw2dq2s
        }, {
            "application": pro3dfdsn3
        }
    ],
    "date" : 12344232,
    "timezone" : "Asia",
    "id" : "sad2ddssa2",
    "admin": {
        "lang": "eng",
        "country": "us",
    }
}

我使用自定义jsonoutputter(https://github.com/Azure/usql/tree/master/Examples/DataFormats/Microsoft.Analytics.Samples.Formats)从json文件和jsontuple函数中提取以提取值 . 我的问题是该函数使用sql map生成键值对 . 这适用于我有一个键的情况,但是当我尝试使用该函数从无键数组中获取值时会抛出错误 .

任何有关如何解决这个问题的建议将不胜感激 .

EDIT 这是我正在关注的输出:

sad2ddssa2,object1,12344232,"Asia","eng","us",
sad2ddssa2,object2,12344232,"Asia","eng","us"

2 回答

  • 1

    First option

    尝试在你的u-sql中使用PROSE . 使用PROSE的c#nuget处理数据并进行复杂的提取 . 这是一个非常强大的AI包 . 在此处查看视频和示例:https://microsoft.github.io/prose

    Second option

    创建一个c#函数来处理你的json . 像这样的东西,使用c#json api将此示例调整为您的自定义提取请求:

    /* Formats the array of values into a named json array. */
    
    DECLARE @JsonArray Func<SqlArray<string>, string, string> = (data, name) => 
        {
            StringBuilder buffer = new StringBuilder();
    
            buffer.Append("{\r\n\t\"" + name + "\": [\r\n");
    
            for (int i = 0; i < data.Count(); i++)
            {
                if (i > 0)
                {
                    buffer.Append(",\r\n");
                }
    
                buffer.Append("\t\"" + data[i] + "\"");
            }
    
            buffer.Append("\r\n\t]\r\n}");
    
            return buffer.ToString();
        };
    
    /* Format the array containing groups of comma separated values into a named json array */
    
    @Query = 
        SELECT
            @JsonArray(SubscriptionArray, "subscriptionList") AS JsonArray
        FROM @subscriptionsQuery1;
    

    Third option

    在根据您的需求调整后,尝试这种方法:

    /* For each json line create a json map (SqlMap) */
    
    @subscriptionsQuery1 = 
        SELECT 
               JsonFunctions.JsonTuple(JsonLine) AS JsonMap
        FROM @SubscriptionsExtractor AS t;
    
    /* For each json map get the required property value */
    
    @subscriptionsQuery1 = 
        SELECT DISTINCT
               JsonMap["alias"] AS Subscription
        FROM @subscriptionsQuery1 AS t;
    
    /* Join the value of all rows into a single row containing an array of all values */
    
    @subscriptionsQuery1 = 
        SELECT
            ARRAY_AGG<string>(Subscription) AS SubscriptionArray
        FROM @subscriptionsQuery1 AS t;
    
  • 1

    我能够使用NewtonSoft MultiLevelJsonExtractor提取器和this fixed-up JSON file来使用它:

    REFERENCE ASSEMBLY [Newtonsoft.Json];
    REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 
    
    USING Microsoft.Analytics.Samples.Formats.Json;
    
    DECLARE @inputFile string = @"\input\yourInputJSON.json";
    DECLARE @outputFile string = @"\output\output.csv";
    
    
    @input =
        EXTRACT id string,
                largeobjects string,
                date string,
                timezone string,
                lang string,
                country string
        FROM @inputFile
        USING new MultiLevelJsonExtractor("objects", false,
              "id",
              "largeobjects",
              "date",
              "timezone",
              "admin.lang",
              "admin.country"
              );
    
    // Convert the JSON column to SQL MAP to multiple rows
    @working =
        SELECT id,
               JsonFunctions.JsonTuple(largeobjects).Values AS largeobject,
               date,
               timezone,
               lang,
               country
        FROM @input;
    
    
    // Explode the JSON SQL MAP
    @output =
        SELECT id,
               x.y AS largeobject,
               date,
               timezone,
               lang,
               country
        FROM @working
             CROSS APPLY
                 EXPLODE(largeobject) AS x(y);
    
    
    OUTPUT @output
    TO @outputFile
    USING Outputters.Csv(quoting : false);
    

    我的结果:

    My results

    我想说这可能比使用自己动手的方法更安全一些,因为NewtonSoft库专门用于操作JSON并且经过了尝试和测试 .

相关问题