locked
Huge size or record count of table access problem. RRS feed

  • Question

  • Hi,
    Here is the a table called MyTable.

    _T("IF OBJECT_ID ('MyTable','U') IS NULL
    CREATE TABLE MyTable
    (
    _id    BINARY(4)  NOT NULL,
    _time    SMALLDATETIME  NOT NULL,
    _type   TINYINT   NOT NULL DEFAULT 1,
    _ip    BINARY(4)    NOT NULL,
    _protocol   TINYINT   NOT NULL,
    _port    BINARY(2)   NOT NULL,
    _in    FLOAT    NOT NULL DEFAULT 0.0,
    _out    FLOAT    NOT NULL DEFAULT 0.0,
    PRIMARY KEY(_id,_time,_type,_ip,_protocol,_port)
    );

    It has 216,000,000 records.

    I want to retrieve top 10 (_protocol - _port) pair order by sum of _in from Mytable.

    SELECT TOP 10 _protocol,_port
    FROM MyTable
    WHERE _id=0xC0A801D5
    AND (_time >= '2009-06-22 00:00:00' AND _time <= '2009-06-23 23:59:00') 
    AND _ip=0xC0A801C1
    GROUP BY _protocol,_port
    ORDER BY SUM(_in) DESC;

    Above query takes very long time.
    Are there anyone to help me to speed up this query?


    sql server 2005,C++,C#
    Tuesday, June 23, 2009 8:32 AM

Answers

  • You have 3 fields in your WHERE clause.  Let's assume that _id and _ip have a low cardinality (there are lots of rows with the same ip and id).  Then your best index will be a clustered index on:

    _time, _id, _ip, INCLUDE (_protocol, _port, _in).

    If _id or _ip are very specific (have a high cardinality), then you'd want one or both ahead of the _time column in your index.


    BTW, your time range query is better coded as exclusive on the top end:

    AND _time >= '2009-06-22' AND _time < '2009-06-24'


    Michael Asher
    Tuesday, June 23, 2009 11:35 AM

All replies

  • You have 3 fields in your WHERE clause.  Let's assume that _id and _ip have a low cardinality (there are lots of rows with the same ip and id).  Then your best index will be a clustered index on:

    _time, _id, _ip, INCLUDE (_protocol, _port, _in).

    If _id or _ip are very specific (have a high cardinality), then you'd want one or both ahead of the _time column in your index.


    BTW, your time range query is better coded as exclusive on the top end:

    AND _time >= '2009-06-22' AND _time < '2009-06-24'


    Michael Asher
    Tuesday, June 23, 2009 11:35 AM
  • Consider rearranging the order of the primary key columns in the clustered index (or creating a separate non-clustered covering index) in order to speed up this particular query.  The _type column is before the _ip column in the index key and prevents _ip from being part of the seek predicate because _type is not in the WHERE clause.

    --
    Hope this helps.
     
    Dan Guzman
    SQL Server MVP
    http://weblogs.sqlteam.com/dang /
    "remedios_" wrote in message news:1b78379a-c574-4d1 3-a39d-556f354eb884...

    Hi,
    Here is the a table called MyTable.

    _T("IF OBJECT_ID ('MyTable','U') IS NULL
    CREATE TABLE MyTable
    (
    _id    BINARY(4)  NOT NULL,
    _time    SMALLDATETIME  NOT NULL,
    _type   TINYINT   NOT NULL DEFAULT 1,
    _ip    BINARY(4)    NOT NULL,
    _protocol   TINYINT   NOT NULL,
    _port    BINARY(2)   NOT NULL,
    _in    FLOAT    NOT NULL DEFAULT 0.0,
    _out    FLOAT    NOT NULL DEFAULT 0.0,
    PRIMARY KEY(_id,_time,_type,_ip,_protocol,_port)
    );

    It has 216,000,000 records.

    I want to retrieve top 10 (_protocol - _port) pair order by sum of _in from Mytable.

    SELECT TOP 10 _protocol,_port
    FROM MyTable
    WHERE _id=0xC0A801D5
    AND (_time >= '2009-06-22 00:00:00' AND _time <= '2009-06-23 23:59:00') 
    AND _ip=0xC0A801C1
    GROUP BY _protocol,_port
    ORDER BY SUM(_in) DESC;

    Above query takes very long time.
    Are there anyone to help me to speed up this query?


    sql server 2005,C++,C#
    Tuesday, June 23, 2009 11:42 AM