Friday, December 28, 2012 5:42 PM
I'm very new to SQL and I'm sorry if this sounds like a dumb question. I have a problem where I don't know how to even start the code.
I have a table that shows every transaction a person makes at the store for the past year. I'd like to know who our VIP customers are for the last 6 months. They would have to have purchased a "shirt" every month to be considered and if they haven't I'd like to filter them out.
The columns I have are..
Any help would be appreciated!!
Friday, December 28, 2012 6:17 PMThe first step is to identify which purchases are "shirts". I assume that there are multiple products that qualify as "shirts" - there must be some way to associate Product_ID with this particular type - most likely a table (or set of tables).
Friday, December 28, 2012 6:19 PMModerator
For this type of question it is very helpful to post table schema, sample data in the form of "insert" statements, and expected result. This way we do not have to guess column names, data types, etc. Help us to be able to help you.
Said that, I wonder if you are trying to identify VIP customers based on the criteria you provided (purchased a shirt every month)?
If so, then this problem is known as Relational Division, and you can read more about it in this great article.
Divided We Stand: The SQL of Relational Division
transaction_date >= dateadd(month, -5, dateadd(day, datediff(day, '19000101', getdate()), '19000101')) -- 20120701
and product_id = 4 -- you can use a subquery to bring the product_id for "Shirt"
count(distinct month(transaction_date)) = 6;
That idea is to count the distinct months for all the transactions in the las six months for each member. If this value is equal 6, then the criteria is matched.
Friday, December 28, 2012 11:29 PM
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. If you know how, follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.
This is minimal polite behavior on SQL forums.
CREATE TABLE Sales
(sale_date DATE NOT NULL,
member_id CHAR(10) NOT NULL,
PRIMARY KEY (sale_date, member_id)
upc CHAR(13) NOT NULL,
sale_qty INTEGER NOT NULL
CHECK (sale_qty > 0));
Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math messes up optimization. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML. The report period table gives a name to a range of dates that is common to the entire enterprise.
CREATE TABLE Sales_Report_Periods
(sales_report_name CHAR(10) NOT NULL PRIMARY KEY
CHECK (sales_report_name LIKE '[0-9][0-9][0-9]-[0-9]-00'),
sales_report_start_date DATE NOT NULL,
sales_report_end_date DATE NOT NULL,
CHECK (sales_report_start_date <= sales_report_end_date),
These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL.
FROM Sales AS S, Sales_Report_Periods AS R
WHERE S.sale_date BETWEEN R.sales_report_start_date AND R.sales_report_end_date
AND upc IN (<<list of shirt codes>>)
AND R.sales_report_name BETWEEN '2012-07-00' AND '2012-12-00'
GROUP BY S.member_id
HAVING COUNT (DISTINCT R.sales_report_name) = 6;
This is at least one shirt per month for the last half of the year.
--CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
- Marked As Answer by DTTVHP Monday, December 31, 2012 4:57 PM