Best Approach to write the query
-
Wednesday, September 12, 2012 1:01 PM
In a given month each member is considered a certain lifecycle type. This will change on monthly basis based on their previous & current month's activity, i.e.
New = New P2P in the calendar month
Existing i.e. P2P in a prior calendar month can be either:
Retained = Active in the prior calendar month and the current calendar month
Unretained = Active in the prior calendar month, but not active in the current calendar month
Reactivated = No activity in the prior calendar month, but active in the current calendar month
Lapsed = No activity in the prior calendar month or the current calendar month
P2P stands for “Pay to Play”, i.e. the Member has placed at least one real money wager since joining.
Active means the member has made a minimum of one real money wager in the month.
Source Tables
The “REVENUE_ANALYSIS” table provides a summary of each member’s activity on a given date, listing the total amounts wagered and won for each game played that day. If a member does not play on a given date there will be no entries in the table. You can assume this is a very large table ~100M rows.
There is also a CALENDAR table that provides a base “date” dimension, one row per day from 2000 to 2020.
REVENUE_ANALYSIS
ACTIVITY_DATE DATE NOT NULL, Date wager was made
MEMBER_ID INTEGER NOT NULL, Unique Player identifier
GAME_ID SMALLINT NOT NULL, Unique Game identifier
WAGER_AMOUNT REAL NOT NULL, Total amount wagered on the game
NUMBER_OF_WAGERS INTEGER NOT NULL, Number of wagers on the game
WIN_AMOUNT REAL NOT NULL, Total amount won on the game
ACTIVITY_YEAR_MONTH INTEGER NOT NULL, YYYYMM
BANK_TYPE_ID SMALLINT DEFAULT 0 NOT NULL 0=Real money, 1=Bonus money
CALENDAR
CALENDAR_DATE DATE NOT NULL, Base date (YYYY-MM-DD)
CALENDAR_YEAR INTEGER NOT NULL, 2010, 2011 etc
CALENDAR_MONTH_NUMBER INTEGER NOT NULL, 1-12
CALENDAR_MONTH_NAME VARCHAR(100), January, February etc
CALENDAR_DAY_OF_MONTH INTEGER NOT NULL, 1-31
Friday, 25 February 2011 In Confidence Page 2 of 2
CALENDAR_DAY_OF_WEEK INTEGER NOT NULL, 1-7
CALENDAR_DAY_NAME VARCHAR(100), Monday, Tuesday etc
CALENDAR_YEAR_MONTH INTEGER NOT NULL, 201011, 201012, 201101 etc
The required solution is a view with the following columns:
MEMBER_ID,
CALENDAR_YEAR_MONTH,
MEMBER_LIFECYCLE_STATUS,
LAPSED_MONTHS
Listing one row per Member per month, starting from the month in which they first place a real money wager, giving their lifecycle status for that month, also if the member has lapsed we need a rolling count of the number of months since they were last active.
I think the case statement should be fine but would like to know the thoughts from other people.
Mustafa
- Changed Type MustafaH Tuesday, September 25, 2012 8:57 AM
- Changed Type Naomi NMicrosoft Community Contributor, Moderator Tuesday, September 25, 2012 6:14 PM Question rather than discussion
All Replies
-
Wednesday, September 12, 2012 2:26 PMThe best way would be to learn SQL and at least attempt to write the query yourself.
- Edited by Johnny Bell Jnr Wednesday, September 12, 2012 2:26 PM
-
Wednesday, September 12, 2012 2:29 PMI know the query. just wanted to know what is the best approach?
MH
-
Wednesday, September 12, 2012 2:30 PMModeratorThen please give us at least what you have so far. Also, you might want to provide some sample data and desired output.
-
Wednesday, September 12, 2012 2:37 PM
Hi Kent,
Thanks for your reply. I am working on it and put my query shortly.
MH
-
Thursday, September 13, 2012 9:17 PM
Hi All,
Desired output
Member_ID, Calendar_year_month, Member_Life_Cycle_Status, Lapsed_Months
Business Rules For Member_life_cycle_status is
New = New P2P in the calendar month
Retained = Active in the prior calendar month and the current calendar month
Unretained = Active in the prior calendar month, but not active in the current calendar month
Reactivated = No activity in the prior calendar month, but active in the current calendar month
Lapsed = No activity in the prior calendar month or the current calendar month
<o:p></o:p>
I have this Query 1
Select DISTINCT A.activity_date as current_activity ,a.member_id ,a.activity_year_month as current_month ,B.PREVIOUS_MONTH INTO #FINAL_SAMPLE From dbo.Revenue_Analysis AS A LEFT JOIN (SELECT DISTINCT MEMBER_ID ,ACTIVITY_DATE AS PREVIOUS_ACTIVITY ,ACTIVITY_YEAR_MONTH AS PREVIOUS_MONTH FROM dbo.Revenue_Analysis) AS B ON A.MEMBER_ID = B.MEMBER_ID AND ((YEAR(A.ACTIVITY_DATE) - YEAR(GETDATE())) * 12) + MONTH(A.ACTIVITY_DATE) - MONTH(GETDATE()) = ((YEAR(B.PREVIOUS_ACTIVITY) - YEAR(GETDATE())) * 12) + (MONTH(B.PREVIOUS_ACTIVITY) - MONTH(GETDATE())) + 1 where A.activity_date >= '2012-01-01'
Query 2
SELECT MEMBER_ID ,current_month as ACTIVITY_YEAR_MONTH ,CASE WHEN PREVIOUS_MONTH IS NULL THEN 'New' WHEN PREVIOUS_MONTH IS NOT NULL or current_month is not null THEN 'Retained' else 'Other'end as MEMBER_LIFECYCLE_STATUS ,cast(null as int) as LAPSED_MONTHS FROM #FINAL_SAMPLE
I am struggling to get conditions for otherlife_cycle_status(Unretained,Reactivated,Lapsed) and lapsed_months Any suggestions would be much appreciated??
Many Thanks.
I have date dimension and revenue analysis sample data in excel sheet for year 2012. For sample data please click here Sample Data
MH
- Edited by MustafaH Thursday, September 13, 2012 9:38 PM
-
Thursday, September 13, 2012 9:21 PM
Script For creating calendar table
CREATE TABLE #CALENDAR ( CALENDAR_DATE DATEtime NOT NULL, --Base date (YYYY-MM-DD) CALENDAR_YEAR INT NOT NULL, --2010, 2011 etc CALENDAR_MONTH_NUMBER INT NOT NULL, --1-12 CALENDAR_MONTH_NAME VARCHAR(100), --January, February etc CALENDAR_DAY_OF_MONTH INT NOT NULL, --1-31 CALENDAR_DAY_OF_WEEK INT NOT NULL, --1-7 CALENDAR_DAY_NAME VARCHAR(100), --Monday, Tuesday etc CALENDAR_YEAR_MONTH VARCHAR(20) NOT NULL, --201011, 201012, 201101 etc )
Revenus_Analysis table script
CREATE TABLE #REVENUE_ANALYSIS ( ACTIVITY_DATE DATETIME NOT NULL, --Date wager was made MEMBER_ID INT NOT NULL, --Unique Player identifier GAME_ID INT NOT NULL, --Unique Game identifier WAGER_AMOUNT FLOAT NULL, --Total amount wagered on the game NUMBER_OF_WAGERS INT NULL, --Number of wagers on the game WIN_AMOUNT FLOAT NULL, --Total amount won on the game ACTIVITY_YEAR_MONTH VARCHAR(20) NOT NULL, --YYYYMM BANK_TYPE_ID SMALLINT DEFAULT 0 NOT NULL --0=Real money, 1=Bonus money )
MH
-
Tuesday, September 25, 2012 8:19 AM
Hi All,
I have posted it the way i think. is there any best way to do it?
Thanks in advance.
MH
-
Wednesday, September 26, 2012 2:57 AM
Your schema is wrong. No keys, improper data types and no constraints. Did you know that using FLOAT for money is illegal and you have to use a fixed number of decimal places by law? You might want to learn the ISO-11179 rules for data element names. All of those data parts can be done with a computed column. How do I wager NULL dollars? How do I win NULL dollars? There is no such thing as a “type_id” in RDBMS; there is a “<something>_id” or a “<something>_type” but not a hybrid.
CREATE TABLE Revenue_Analysis
(activity_date DATE NOT NULL,
member_id INTEGER NOT NULL
REFERENCES Players(player_id),
game_id INTEGER NOT NULL
REFERENCES Games(game_id),PRIMARY KEY (activity_date, member_id , game_id),
wager_amt DECIMAL(18,5) NOT NULL,
wager_cnt INTEGER NOT NULL
CHECK (wager_cnt > 0),
win_amt DECIMAL(18,5) NOT NULL
CHECK win_amt >= 0.00),
bank_type SMALLINT DEFAULT 0 NOT NULL
CHECK (bank_type IN (0,1));
A useful idiom is a report period calendar. It gives a name to a range of dates.
CREATE TABLE Report_Periods
(report_name VARCHAR(30) NOT NULL PRIMARY KEY,
report_start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
report_end_date DATE NOT NULL,
CONSTRAINT Date_Ordering
CHECK (report_start_date <= report_end_date),
etc);
These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL.Then you need to read http://www.simple-talk.com/sql/t-sql-programming/state-transition-constraints/ to get valid data about the behavior of the players.
Since you are not paying me, I will you do the rest of this :) Think look up tables and not temporal math. Much faster and safer.
--CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
- Marked As Answer by Kalman TothMicrosoft Community Contributor, Moderator Monday, October 01, 2012 11:14 PM
-
Wednesday, September 26, 2012 9:09 AM
Thank you so much CELKO. I will try this again the approach you have mentioned above and update the results here. Can I ask question if i find any difficulty?
Many thanks for the help. Did you look sample data? Is that ok to practice this exercise?
MH

