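Querying users with N consecutive login days in SQL
Analyzing how many days in a row users log in is a common business analysis task, and it maps to a classic SQL interview question: finding users who logged in on N consecutive days.
Suppose we have a user login table user_login_info with two fields: the user id (uid) and the login time (login_time). The data in the table is shown below:
The task is to find the users who logged in on N consecutive days.
(1) First, deduplicate the login table so that several logins by the same user on the same day do not distort the result. This uses the distinct keyword, which must come right after select and applies to the whole (uid, login date) pair.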
select distinct
    uid,
    date(login_time) as login_time
from user_login_info
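To make the effect of the deduplication step concrete, here is a minimal Python sketch that runs the same query against an in-memory SQLite database. The sample rows are invented for illustration, and SQLite is only a stand-in for MySQL here, so treat it as a sketch rather than the article's environment.

```python
import sqlite3

# Hypothetical sample data: user 1 logs in twice on 2024-01-02.
rows = [
    (1, "2024-01-01 08:00:00"),
    (1, "2024-01-02 09:00:00"),
    (1, "2024-01-02 22:30:00"),
    (1, "2024-01-03 07:45:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_login_info (uid INTEGER, login_time TEXT)")
conn.executemany("INSERT INTO user_login_info VALUES (?, ?)", rows)

# DISTINCT covers the whole (uid, date) pair, so each user keeps one row per day.
deduped = conn.execute(
    """
    SELECT DISTINCT uid, date(login_time) AS login_time
    FROM user_login_info
    ORDER BY 1, 2  -- order by uid, then login date
    """
).fetchall()

print(len(rows), "raw rows ->", len(deduped), "distinct login days")
for row in deduped:
    print(row)
```

With these four rows the query keeps three, one per calendar day, which is what the later ranking step relies on.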
(2) Next, use the window ranking function row_number to number each user's distinct login dates in ascending order, producing a new column rk.
select
    uid,
    login_time,
    row_number() over(partition by uid order by login_time) as rk
from
(
    select distinct
        uid,
        date(login_time) as login_time
    from user_login_info
) t1
The query result is shown in the figure:
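(3) Then use the date_sub function to subtract rk days from the login_time column, producing a new column sub_date. Within a run of consecutive days, login_time and rk both increase by one from row to row, so their difference stays constant. Rows of the same user that share the same sub_date therefore belong to the same run of consecutive login days.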
select *, DATE_SUB(login_time, INTERVAL rk DAY) as sub_date
from
(
    select
        uid,
        login_time,
        row_number() over(partition by uid order by login_time) as rk
    from
    (
        select distinct
            uid,
            date(login_time) as login_time
        from user_login_info
    ) t1
) t2
The query result is shown in the figure:
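The date-minus-rank idea is easier to follow on actual rows. The sketch below is a hedged illustration in Python with an in-memory SQLite database. The sample logins are made up, and because SQLite has no DATE_SUB, the expression date(login_time, '-' || rk || ' days') is used as a substitute. Window functions require SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data: user 1 logs in on Jan 1-3 and Jan 5; user 2 on Jan 1 and Jan 3.
rows = [
    (1, "2024-01-01 08:00:00"),
    (1, "2024-01-01 21:30:00"),  # second login on the same day
    (1, "2024-01-02 09:15:00"),
    (1, "2024-01-03 10:00:00"),
    (1, "2024-01-05 11:00:00"),
    (2, "2024-01-01 12:00:00"),
    (2, "2024-01-03 12:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_login_info (uid INTEGER, login_time TEXT)")
conn.executemany("INSERT INTO user_login_info VALUES (?, ?)", rows)

query = """
SELECT uid, login_time, rk,
       date(login_time, '-' || rk || ' days') AS sub_date  -- SQLite stand-in for DATE_SUB
FROM (
    SELECT uid, login_time,
           row_number() OVER (PARTITION BY uid ORDER BY login_time) AS rk
    FROM (
        SELECT DISTINCT uid, date(login_time) AS login_time
        FROM user_login_info
    ) AS t1
) AS t2
ORDER BY uid, login_time
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For user 1, the three rows for Jan 1 to Jan 3 all get sub_date 2023-12-31, while the Jan 5 row gets 2024-01-01. The gap on Jan 4 shifts the difference by one day, which is what separates the two runs.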
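(4) Next, group this result by user id (uid) and sub_date and count the rows in each group. Each count is the length, in days, of one run of consecutive logins for that user.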
select uid, count(*) as consecutive_days
from
(
    select *, DATE_SUB(login_time, INTERVAL rk DAY) as sub_date
    from
    (
        select
            uid,
            login_time,
            row_number() over(partition by uid order by login_time) as rk
        from
        (
            select distinct
                uid,
                date(login_time) as login_time
            from user_login_info
        ) t1
    ) t2
) t3
group by uid, sub_date
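As a worked example of the grouping step, the following Python sketch runs an equivalent query on an in-memory SQLite database and also reports the first and last day of each run. The helper name login_streaks, the extra first_day and last_day columns, and the sample data are all assumptions added for illustration. The SQLite date(..., '-k days') modifier again stands in for DATE_SUB.

```python
import sqlite3


def login_streaks(conn):
    """Return (uid, first_day, last_day, consecutive_days) for every run of consecutive login days."""
    query = """
    SELECT uid,
           min(login_time) AS first_day,
           max(login_time) AS last_day,
           count(*)        AS consecutive_days
    FROM (
        SELECT uid, login_time,
               date(login_time, '-' || rk || ' days') AS sub_date
        FROM (
            SELECT uid, login_time,
                   row_number() OVER (PARTITION BY uid ORDER BY login_time) AS rk
            FROM (
                SELECT DISTINCT uid, date(login_time) AS login_time
                FROM user_login_info
            ) AS t1
        ) AS t2
    ) AS t3
    GROUP BY uid, sub_date
    ORDER BY uid, first_day
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data: user 1 has a 3-day run and a 2-day run, user 2 a single day.
rows = [
    (1, "2024-01-01 08:00:00"),
    (1, "2024-01-02 08:00:00"),
    (1, "2024-01-03 08:00:00"),
    (1, "2024-01-05 08:00:00"),
    (1, "2024-01-06 08:00:00"),
    (2, "2024-01-10 08:00:00"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_login_info (uid INTEGER, login_time TEXT)")
conn.executemany("INSERT INTO user_login_info VALUES (?, ?)", rows)

streaks = login_streaks(conn)
for streak in streaks:
    print(streak)
```

Note that one user can produce several rows, one per run, so a user with two separate runs appears twice in the grouped result.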
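(5) Finally, add a having condition on top of this result to keep only the user ids whose run length matches the N we are interested in.
The complete code is as follows: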
select uid, count(*) as consecutive_days
from
(
    select *, DATE_SUB(login_time, INTERVAL rk DAY) as sub_date
    from
    (
        select
            uid,
            login_time,
            row_number() over(partition by uid order by login_time) as rk
        from
        (
            select distinct
                uid,
                date(login_time) as login_time
            from user_login_info
        ) t1
    ) t2
) t3
group by uid, sub_date
-- replace N with the required number of days; use >= N to also match longer runs
having consecutive_days = N
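The whole pipeline can also be wrapped in a small function that takes N as a parameter. The sketch below is one possible formulation, not the article's exact query: it uses CTEs instead of nested subqueries for readability, runs on SQLite rather than MySQL, and the function name users_with_streak, the exact flag, and the sample data are hypothetical.

```python
import sqlite3


def users_with_streak(conn, n, exact=False):
    """Return uids with a run of n consecutive login days (at least n unless exact=True)."""
    comparison = "=" if exact else ">="  # chosen from a fixed set, never from user input
    query = f"""
    WITH days AS (
        SELECT DISTINCT uid, date(login_time) AS login_day
        FROM user_login_info
    ),
    ranked AS (
        SELECT uid, login_day,
               row_number() OVER (PARTITION BY uid ORDER BY login_day) AS rk
        FROM days
    )
    SELECT DISTINCT uid
    FROM (
        SELECT uid
        FROM ranked
        GROUP BY uid, date(login_day, '-' || rk || ' days')
        HAVING count(*) {comparison} ?
    )
    ORDER BY uid
    """
    return [uid for (uid,) in conn.execute(query, (n,))]


# Hypothetical sample data: user 1 has runs of 3 and 2 days, user 2 never has two days
# in a row, and user 3 has one run of 4 days.
rows = [
    (1, "2024-01-01 08:00:00"), (1, "2024-01-02 08:00:00"), (1, "2024-01-03 08:00:00"),
    (1, "2024-01-05 08:00:00"), (1, "2024-01-06 08:00:00"),
    (2, "2024-01-01 08:00:00"), (2, "2024-01-03 08:00:00"),
    (3, "2024-01-01 08:00:00"), (3, "2024-01-02 08:00:00"),
    (3, "2024-01-03 08:00:00"), (3, "2024-01-04 08:00:00"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_login_info (uid INTEGER, login_time TEXT)")
conn.executemany("INSERT INTO user_login_info VALUES (?, ?)", rows)

print(users_with_streak(conn, 3))              # runs of at least 3 days
print(users_with_streak(conn, 3, exact=True))  # runs of exactly 3 days
```

On this data the first call returns users 1 and 3, and the second returns only user 1, because user 3's run is four days long. Whether "N consecutive days" should mean exactly N or at least N depends on the business question, so the comparison is left as a choice.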