Note: Forums will be making significant UX changes to address key usability improvements surrounding search, discoverability and navigation. To learn more about these changes please visit the announcement which can be found HERE.

Answered How would I even start...

  • Friday, December 28, 2012 5:42 PM
     
     

    Good Morning,

    I'm very new to SQL and I'm sorry if this sounds like a dumb question. I have a problem where I don't know how to even start the code.

    I have a table that shows every transaction a person makes at the store for the past year. I'd like to know who our VIP customers are for the last 6 months. They would have to have purchased a "shirt" every month to be considered and if they haven't I'd like to filter them out.

    The columns I have are..

    Transaction_Date

    Member_ID

    Product_ID

    QTY

    Any help would be appreciated!!

All Replies

  • Friday, December 28, 2012 6:17 PM
     
     
    The first step is to identify which purchases are "shirts".  I assume that there are multiple products that qualify as "shirts" - there must be some way to associate Product_ID with this particular type - most likely a table (or set of tables). 
  • Friday, December 28, 2012 6:19 PM
    Moderator
     
     Answered

    For this type of question it is very helpful to post table schema, sample data in the form of "insert" statements, and expected result. This way we do not have to guess column names, data types, etc. Help us to be able to help you.

    Said that, I wonder if you are trying to identify VIP customers based on the criteria you provided (purchased a shirt every month)?

    If so, then this problem is known as Relational Division, and you can read more about it in this great article.

    Divided We Stand: The SQL of Relational Division
    http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division/

    select
        member_id
    from
        T
    where
        transaction_date >= dateadd(month, -5, dateadd(day, datediff(day, '19000101', getdate()), '19000101')) -- 20120701
        and product_id = 4 -- you can use a subquery to bring the product_id for "Shirt"
    group by
        member_id
    having
        count(distinct month(transaction_date)) = 6;

    That idea is to count the distinct months for all the transactions in the las six months for each member. If this value is equal 6, then the criteria is matched.


    AMB

    Some guidelines for posting questions...

  • Friday, December 28, 2012 11:29 PM
     
     Answered

    Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. If you know how, follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

    This is minimal polite behavior on SQL forums. 

    CREATE TABLE Sales
    (sale_date DATE NOT NULL,
     member_id CHAR(10) NOT NULL,
     PRIMARY KEY (sale_date, member_id)
     upc CHAR(13) NOT NULL,
     sale_qty INTEGER NOT NULL
      CHECK (sale_qty > 0));

    Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math messes up optimization. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML. The report period table gives a name to a range of dates that is common to the entire enterprise. 

    CREATE TABLE Sales_Report_Periods
    (sales_report_name CHAR(10) NOT NULL PRIMARY KEY
       CHECK (sales_report_name LIKE '[12][0-9][0-9][0-9]-[01][0-9]-00'),
     sales_report_start_date DATE NOT NULL,
     sales_report_end_date DATE NOT NULL,
      CONSTRAINT date_ordering
        CHECK (sales_report_start_date <= sales_report_end_date),
    etc);

    These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. 

    SELECT S.member_id 
      FROM Sales AS S, Sales_Report_Periods AS R
     WHERE S.sale_date BETWEEN R.sales_report_start_date AND R.sales_report_end_date
       AND upc IN (<<list of shirt codes>>)
       AND R.sales_report_name BETWEEN '2012-07-00' AND '2012-12-00'
     GROUP BY S.member_id
     HAVING COUNT (DISTINCT R.sales_report_name) = 6;

    This is at least one shirt per month for the last half of the year. 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    • Marked As Answer by DTTVHP Monday, December 31, 2012 4:57 PM
    •