朋友们,

该场景与wifi控制器日志文件有关,其中用户活动被跟踪,就像用户“加入”,“进入”,断开“和”离开wifi网络一样 . 我已经清理并在表格中上传结构化数据 . 我想在用户级别为他/她的活动创建会话ID . 确定一个会话可以按事件“加入”和“离开”之间的所有事件进行分组 . 我附上了示例数据以便更好地理解 .

一些有助于创建会话ID的指针:

  • 可以通过"joins"事件识别用户的新会话,并且_11627_事件表示结束 .

  • "disconnects from"和"leave"具有相同的时间戳,因此如果按时间戳排序,则顺序可以是随机的 .
    每个"disconnects from"

  • 都有"leave"的条目,但反之亦然 . 因此,我们可以在没有"disconnects from"记录的情况下使用"leave" .

  • "in"表示漫游,因此它应该是同一会话的一部分 . 它将在"joins"和"leave"事件之间 .

  • 您可能只看到记录只有"in"或说"disconnects from"因为完整的日志文件不可用,仍然希望将这些记录标记为不同的会话ID .

请帮我创建会话ID,考虑给定的指针 . 以下是样本数据,几乎没有可能的场景,以便更好地理解我在寻找什么 . 已经为“必需数据”分配了虚拟会话ID,这应该是它的外观 . 发布我的任何进一步澄清 . 我想用SQL实现这一点 .

场景1

原始数据:

usermac datetime event

xx:xx:xx:xx:0e:72   2018-04-19 09:30:58.000 joins
xx:xx:xx:xx:0e:72   2018-04-19 09:51:54.000 in
xx:xx:xx:xx:0e:72   2018-04-19 09:51:54.000 in
xx:xx:xx:xx:0e:72   2018-04-19 12:01:39.000 disconnects from
xx:xx:xx:xx:0e:72   2018-04-19 12:01:39.000 leave
xx:xx:xx:xx:0e:72   2018-04-19 12:46:05.000 joins
xx:xx:xx:xx:0e:72   2018-04-19 15:27:28.000 disconnects from
xx:xx:xx:xx:0e:72   2018-04-19 15:27:28.000 leave

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:0e:72   2018-04-19 09:30:58.000 joins               1
xx:xx:xx:xx:0e:72   2018-04-19 09:51:54.000 in                  1
xx:xx:xx:xx:0e:72   2018-04-19 09:51:54.000 in                  1
xx:xx:xx:xx:0e:72   2018-04-19 12:01:39.000 disconnects from    1
xx:xx:xx:xx:0e:72   2018-04-19 12:01:39.000 leave               1
xx:xx:xx:xx:0e:72   2018-04-19 12:46:05.000 joins               2
xx:xx:xx:xx:0e:72   2018-04-19 15:27:28.000 disconnects from    2
xx:xx:xx:xx:0e:72   2018-04-19 15:27:28.000 leave               2

场景2

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:80   2018-04-12 21:35:48.000 joins
xx:xx:xx:xx:xx:80   2018-04-12 21:40:01.000 leave
xx:xx:xx:xx:xx:80   2018-04-12 21:40:01.000 disconnects from
xx:xx:xx:xx:xx:80   2018-04-12 21:40:08.000 joins
xx:xx:xx:xx:xx:80   2018-04-12 21:43:45.000 leave
xx:xx:xx:xx:xx:80   2018-04-12 21:43:45.000 disconnects from
xx:xx:xx:xx:xx:80   2018-04-12 21:44:00.000 joins
xx:xx:xx:xx:xx:80   2018-04-12 22:01:29.000 leave

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:80   2018-04-12 21:35:48.000 joins               3
xx:xx:xx:xx:xx:80   2018-04-12 21:40:01.000 leave               3
xx:xx:xx:xx:xx:80   2018-04-12 21:40:01.000 disconnects from    3
xx:xx:xx:xx:xx:80   2018-04-12 21:40:08.000 joins               4
xx:xx:xx:xx:xx:80   2018-04-12 21:43:45.000 leave               4
xx:xx:xx:xx:xx:80   2018-04-12 21:43:45.000 disconnects from    4
xx:xx:xx:xx:xx:80   2018-04-12 21:44:00.000 joins               5
xx:xx:xx:xx:xx:80   2018-04-12 22:01:29.000 leave               5

场景3

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:92   2018-04-23 09:15:18.000 joins               
xx:xx:xx:xx:xx:92   2018-04-23 11:02:30.000 in                  
xx:xx:xx:xx:xx:92   2018-04-23 11:02:30.000 in                  
xx:xx:xx:xx:xx:92   2018-04-23 11:06:09.000 disconnects from    
xx:xx:xx:xx:xx:92   2018-04-23 11:06:09.000 leave               
xx:xx:xx:xx:xx:92   2018-04-23 11:07:03.000 joins               
xx:xx:xx:xx:xx:92   2018-04-23 13:35:51.000 leave

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:92   2018-04-23 09:15:18.000 joins               6
xx:xx:xx:xx:xx:92   2018-04-23 11:02:30.000 in                  6
xx:xx:xx:xx:xx:92   2018-04-23 11:02:30.000 in                  6
xx:xx:xx:xx:xx:92   2018-04-23 11:06:09.000 disconnects from    6
xx:xx:xx:xx:xx:92   2018-04-23 11:06:09.000 leave               6
xx:xx:xx:xx:xx:92   2018-04-23 11:07:03.000 joins               7
xx:xx:xx:xx:xx:92   2018-04-23 13:35:51.000 leave               7

场景4

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:5f   2018-06-23 07:03:24.000 disconnects from
xx:xx:xx:xx:xx:5f   2018-06-23 07:03:24.000 leave
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:45.000 joins
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:48.000 leave
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:48.000 disconnects from
xx:xx:xx:xx:xx:5f   2018-06-23 07:31:59.000 joins
xx:xx:xx:xx:xx:5f   2018-06-23 07:32:02.000 leave
xx:xx:xx:xx:xx:5f   2018-06-23 07:32:02.000 disconnects from

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:5f   2018-06-23 07:03:24.000 disconnects from    8
xx:xx:xx:xx:xx:5f   2018-06-23 07:03:24.000 leave               8
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:45.000 joins               9
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:48.000 leave               9
xx:xx:xx:xx:xx:5f   2018-06-23 07:05:48.000 disconnects from    9
xx:xx:xx:xx:xx:5f   2018-06-23 07:31:59.000 joins               10
xx:xx:xx:xx:xx:5f   2018-06-23 07:32:02.000 leave               10
xx:xx:xx:xx:xx:5f   2018-06-23 07:32:02.000 disconnects from    10

场景5

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:b2   2018-06-12 07:22:14.000 in          
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:14.000 in
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:40.000 in
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:40.000 in
xx:xx:xx:xx:xx:b2   2018-06-12 07:23:39.000 leave

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:b2   2018-06-12 07:22:14.000 in               11
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:14.000 in               11
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:40.000 in               11
xx:xx:xx:xx:xx:b2   2018-06-12 07:22:40.000 in               11
xx:xx:xx:xx:xx:b2   2018-06-12 07:23:39.000 leave            11

场景6

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:16   2018-06-23 07:55:26.000 in
xx:xx:xx:xx:xx:16   2018-06-23 07:55:26.000 in

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:16   2018-06-23 07:55:26.000 in               12
xx:xx:xx:xx:xx:16   2018-06-23 07:55:26.000 in               12

场景7

原始数据:

usermac datetime event

xx:xx:xx:xx:xx:m6   2018-06-24 06:56:30.000 rejoins
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:51.000 rejoins
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:59.000 leave
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:59.000 disconnects from
xx:xx:xx:xx:xx:m6   2018-06-24 06:57:03.000 joins

所需格式:

usermac datetime event sessionid

xx:xx:xx:xx:xx:m6   2018-06-24 06:56:30.000 rejoins           13
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:51.000 rejoins           13
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:59.000 leave             13
xx:xx:xx:xx:xx:m6   2018-06-24 06:56:59.000 disconnects from  13
xx:xx:xx:xx:xx:m6   2018-06-24 06:57:03.000 joins             14

事情到现在为止:

with cte1 as ( SELECT row_number() over(partition by usermac ORDER BY datetime) indexno, row_number() over(partition by event ORDER BY usermac, datetime) sessionid, datetime, usermac, event from t8_wfc_log_apr_jun_syslog_cast where usermac='xx:xx:xx:xx:16' order by usermac, datetime ) SELECT case when event = 'joins' then sessionid else (select max(sessionid) from cte1 innie where innie.usermac = oute.usermac and innie.indexno < oute.indexno and innie.event = 'joins') end sessionid, datetime, usermac, event FROM cte1 oute order by usermac, datetime;