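I'm starting from sensor data that shows a timestamp and a status (0 or 1), as in the example below. I can compute the time difference between consecutive rows that share the same status, but I want the total duration spent in each status (0 and 1).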
import pandas as pd

df = pd.DataFrame(data=[['2018/02/16 15:00:05', 0],
                        ['2018/02/16 15:00:08', 0],
                        ['2018/02/16 15:00:09', 0],
                        ['2018/02/16 15:00:14', 1],
                        ['2018/02/16 15:00:26', 0],
                        ['2018/02/16 15:00:28', 0],
                        ['2018/02/16 15:00:29', 0],
                        ['2018/02/16 15:00:31', 1],
                        ['2018/02/16 15:00:33', 1],
                        ['2018/02/16 15:00:34', 1],
                        ['2018/02/16 15:00:37', 1],
                        ['2018/02/16 15:00:39', 1],
                        ['2018/02/16 15:00:40', 1],
                        ['2018/02/16 15:00:41', 1],
                        ['2018/02/16 15:00:43', 1]],
                  columns=['Datetime', 'Status'])
# convert to datetime object
df.Datetime = pd.to_datetime(df['Datetime'])
# find when the state changes
run_change = df['Status'].diff()
# get the step lengths
step_length = df['Datetime'].diff()
# loop and get the change since last state change
since_change = []
current_delta = 0
for is_change, delta in zip(run_change, step_length):
    current_delta = 0 if is_change != 0 else \
        current_delta + delta.total_seconds()
    since_change.append(current_delta)
# add this data to the data frame
df['Run_Change'] = run_change
df['Step_Length'] = step_length
df['Time_Since_Change(sec)'] = pd.Series(since_change).values
The result looks like this:
               Datetime  Status  Run_Change  Step_Length  Time_Since_Change(sec)
0   2018-02-16 15:00:05       0         NaN          NaT                     0.0
1   2018-02-16 15:00:08       0         0.0     00:00:03                     3.0
2   2018-02-16 15:00:09       0         0.0     00:00:01                     4.0
3   2018-02-16 15:00:14       1         1.0     00:00:05                     0.0
4   2018-02-16 15:00:26       0        -1.0     00:00:12                     0.0
5   2018-02-16 15:00:28       0         0.0     00:00:02                     2.0
6   2018-02-16 15:00:29       0         0.0     00:00:01                     3.0
7   2018-02-16 15:00:31       1         1.0     00:00:02                     0.0
8   2018-02-16 15:00:33       1         0.0     00:00:02                     2.0
9   2018-02-16 15:00:34       1         0.0     00:00:01                     3.0
10  2018-02-16 15:00:37       1         0.0     00:00:03                     6.0
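I need the total duration (in seconds) over the whole data set. For example, for status 0 the total is 7 seconds: status 0 lasts from 00:05 to 00:09 (4 seconds), then again from 00:26 to 00:29 (3 seconds).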
1 Answer
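You can use groupby('Status') together with groupby(df2.index - np.arange(df2.shape[0])). The second groupby splits the rows into separate runs based on consecutive index values. You can then use groupby.last() and groupby.first() to compute the time difference for each run. Note the step that replaces 0-second intervals (a run that consists of a single row) with 1 second; otherwise you would get 12 seconds for status 1, when it should be 13.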
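The answer's own code did not survive here, so the following is only a minimal sketch of the approach it describes, not the original snippet. It assumes df2 is the sub-frame for one status with the original integer index kept, and the names run_key, durations and totals are made up for illustration. Counting a single-row run as 1 second follows the answer's convention and is an assumption about the sensor, not something the data proves.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['2018/02/16 15:00:05', 0], ['2018/02/16 15:00:08', 0],
          ['2018/02/16 15:00:09', 0], ['2018/02/16 15:00:14', 1],
          ['2018/02/16 15:00:26', 0], ['2018/02/16 15:00:28', 0],
          ['2018/02/16 15:00:29', 0], ['2018/02/16 15:00:31', 1],
          ['2018/02/16 15:00:33', 1], ['2018/02/16 15:00:34', 1],
          ['2018/02/16 15:00:37', 1], ['2018/02/16 15:00:39', 1],
          ['2018/02/16 15:00:40', 1], ['2018/02/16 15:00:41', 1],
          ['2018/02/16 15:00:43', 1]],
    columns=['Datetime', 'Status'])
df['Datetime'] = pd.to_datetime(df['Datetime'])

totals = {}  # hypothetical name: status -> total seconds
for status, df2 in df.groupby('Status'):
    # Within one status, rows that were adjacent in df have consecutive
    # index labels, so (label - position) is constant inside each run.
    run_key = np.asarray(df2.index) - np.arange(df2.shape[0])
    runs = df2.groupby(run_key)['Datetime']
    # Duration of each run: last timestamp minus first timestamp.
    durations = runs.last() - runs.first()
    # A single-row run has zero duration; count it as 1 second
    # (assumption taken from the answer's description).
    durations = durations.mask(durations == pd.Timedelta(0),
                               pd.Timedelta(seconds=1))
    totals[int(status)] = durations.sum().total_seconds()

print(totals)
```

With the sample data this should give 7 seconds for status 0 and 13 seconds for status 1, matching the figures quoted above. If your frame does not have a plain 0..n-1 index, call df.reset_index(drop=True) before grouping, since the run detection relies on consecutive integer labels.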