Calculating No Job Days in SQL: A Deep Dive into Pattern Matching and Window Functions
Introduction
In this article, we will explore the concept of calculating “no job days” for a given month and job ID from a LOG table using Oracle SQL. The idea is to identify the days between active and end or active and suspended periods for each job. We will delve into the use of pattern matching with MATCH_RECOGNIZE and window functions to achieve this.
Problem Statement
The problem statement involves calculating no job days, which are the days between active and end OR active and suspended periods for a given month and job ID. The LOG table has the following structure:
| LogId | JobId | LogDate | JobStatus |
|---|
We want to calculate the total number of no job days for each job in a given month.
Example Data
To illustrate this, let’s consider an example data set:
SELECT 1, 1, TO_DATE('09/01/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 2, 1, TO_DATE('09/02/2019','mm/dd/YYYY'), 'end' FROM dual UNION ALL
SELECT 3, 2, TO_DATE('08/03/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 4, 2, TO_DATE('08/05/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 5, 2, TO_DATE('08/08/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 6, 2, TO_DATE('08/15/2019','mm/dd/YYYY'), 'end' FROM dual UNION ALL
SELECT 7, 3, TO_DATE('06/01/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 8, 3, TO_DATE('06/04/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 9, 3, TO_DATE('06/08/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 10, 3, TO_DATE('06/12/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 11, 3, TO_DATE('06/18/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 12, 3, TO_DATE('06/25/2019','mm/dd/YYYY'), 'end' FROM dual;
Solution Overview
To solve this problem, we will use a combination of pattern matching with MATCH_RECOGNIZE and window functions.
Pattern Matching with MATCH_RECOGNIZE
We start by creating a temporary view t that contains all the data from the LOG table. We then create another temporary view m that uses the MATCH_RECOGNIZE clause to identify the active and end or suspended periods for each job.
The pattern we use is:
s_ACTIVE (s_SUSPENDED s_REACTIVE)* s_END
This pattern matches one or more occurrences of an active period followed by zero or more occurrences of a suspended period, and finally ends with an end period.
We define the following variables:
s_activeto represent an active periods_reactiveto represent an active period (same ass_active)s_endto represent an end periods_suspendedto represent a suspended period
The MEASURES clause specifies that we want to calculate the SUM of SUSPEND_DAYS for each match.
Window Functions
We use window functions to calculate the total number of no job days for each job in a given month. We join the m view with the original table using an outer join, and group by JobID. For each row, we calculate the start date as the minimum active date, the end date as the maximum end date, the suspended days as the sum of suspend days, the active days as the difference between the end date and the start date minus the suspended days, and the active days from current month as the difference between the last day of the month and the active days.
Final Query
Here is the complete query:
WITH t(LogId, JobId, LogDate, JobStatus) AS
(SELECT 1, 1, TO_DATE('09/01/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 2, 1, TO_DATE('09/02/2019','mm/dd/YYYY'), 'end' FROM dual UNION ALL
SELECT 3, 2, TO_DATE('08/03/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 4, 2, TO_DATE('08/05/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 5, 2, TO_DATE('08/08/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 6, 2, TO_DATE('08/15/2019','mm/dd/YYYY'), 'end' FROM dual UNION ALL
SELECT 7, 3, TO_DATE('06/01/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 8, 3, TO_DATE('06/04/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 9, 3, TO_DATE('06/08/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 10, 3, TO_DATE('06/12/2019','mm/dd/YYYY'), 'suspended' FROM dual UNION ALL
SELECT 11, 3, TO_DATE('06/18/2019','mm/dd/YYYY'), 'active' FROM dual UNION ALL
SELECT 12, 3, TO_DATE('06/25/2019','mm/dd/YYYY'), 'end' FROM dual),
m as
(SELECT JobId, ACTIVE_DATE, END_DATE, SUSPEND_DAYS, JobStatus, var_match
FROM t
MATCH_RECOGNIZE (
PARTITION BY JobId
ORDER BY LogId
MEASURES
CLASSIFIER() AS var_match,
FINAL FIRST(s_ACTIVE.LogDate) AS ACTIVE_DATE,
FINAL LAST(s_END.LogDate) AS END_DATE,
(s_REACTIVE.LogDate - s_SUSPENDED.LogDate) AS SUSPEND_DAYS
ALL ROWS PER MATCH
PATTERN ( s_ACTIVE (s_SUSPENDED s_REACTIVE)* s_END )
DEFINE
s_active AS JobStatus = 'active',
s_reactive AS JobStatus = 'active',
s_end AS JobStatus = 'end',
s_suspended AS JobStatus = 'suspended'
)
)
SELECT JobId,
MIN(ACTIVE_DATE) AS START_DATE,
MAX(END_DATE) AS END_DATE,
SUM(SUSPEND_DAYS) AS SUSPENDED_DAYS,
MAX(END_DATE) - MIN(ACTIVE_DATE) - NVL(SUM(SUSPEND_DAYS),0) AS ACTIVE_DAYS,
EXTRACT(DAY FROM (LAST_DAY(MIN(ACTIVE_DATE)))) - (MAX(END_DATE) - MIN(ACTIVE_DATE) - NVL(SUM(SUSPEND_DAYS),0)) AS ACTIVE_DAYS_FROM_CURRENT_MONTH
FROM m
WHERE JobStatus = 'active'
GROUP BY JobId
ORDER BY JobID;
Result
The result of this query is a table with the following columns:
| JOBID | START_DATE | END_DATE | SUSPENDED_DAYS | ACTIVE_DAYS | ACTIVE_DAYS_FROM_CURRENT_MONTH |
|---|
For example, for job ID 1, the result is:
| JOBID | START_DATE | END_DATE | SUSPENDED_DAYS | ACTIVE_DAYS | ACTIVE_DAYS_From_CURRENT_MONTH |
|---|---|---|---|---|---|
| 1 | 01.09.2019 | 02.09.2019 | 1 | 29 |
This means that for job ID 1, the start date is September 1, 2019, the end date is September 2, 2019, there are no suspended days, the active days are 1 day (September 1-2), and the active days from current month are 29 days.
Conclusion
In this article, we discussed how to calculate “no job days” for a given month and job ID from a LOG table using Oracle SQL. We used pattern matching with MATCH_RECOGNIZE and window functions to achieve this. The query uses two temporary views: t to store the data from the LOG table and m to perform the pattern matching. The final query joins the m view with the original table, groups by job ID, and calculates the start date, end date, suspended days, active days, and active days from current month for each row.
References
Last modified on 2024-04-06