Answered by:
Better way to select records?

Question
-
I have a table named "tblSelectedCustomers" that contains 250 customers (out of a total of 1,000 customers that reside in another table.) I also have a second table named "tblSalesOrders" containing 10,000 sales orders (across all 1,000 customers.)
I need to select only those orders from tblSalesOrders that are associated with my tblSelectedCustomers group.
Currently, I open tblSelectedCustomers and loop through it looking for associated orders in tblSalesOrders. I do this 250 times. It works, but I'm wondering if there's a quicker way? Thanks.
Tuesday, March 20, 2012 3:00 AM
Answers
-
Using a Join statement in a SQL stateement which will allow you to filter the two tables and get the results you are looking for. You can search the webpage for SQL and Join to get plenty of examples.
jdweng
Tuesday, March 20, 2012 3:53 AM -
IF you want to place all data together in the same combo you can use union statement
"SELECT Doc_Date AS UnionDate from Table1 where wc1 Union ALL SELECT Dateoftran as UnionDate
from Table2 where wc2"
- Marked as answer by Kenrav Wednesday, March 21, 2012 10:07 PM
Tuesday, March 20, 2012 10:21 AM
All replies
-
Using a Join statement in a SQL stateement which will allow you to filter the two tables and get the results you are looking for. You can search the webpage for SQL and Join to get plenty of examples.
jdweng
Tuesday, March 20, 2012 3:53 AM -
IF you want to place all data together in the same combo you can use union statement
"SELECT Doc_Date AS UnionDate from Table1 where wc1 Union ALL SELECT Dateoftran as UnionDate
from Table2 where wc2"
- Marked as answer by Kenrav Wednesday, March 21, 2012 10:07 PM
Tuesday, March 20, 2012 10:21 AM -
250x??!!! No, no, no. Take a look at this:
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
http://blog.sqlauthority.com/2009/04/13/sql-server-introduction-to-joins-basic-of-joins/
Those tutorials are aimed at the SLQ Server audience, but the purpose in Access is exactly the same (for your specific needs). There are certainly differences b/w the two types of SQL (I don't think you'll come across the differences for a while).
Tuesday, March 20, 2012 7:54 PM -
>> I need to select only those orders from tblSalesOrders that are associated with my tblSelectedCustomers group. <<
You can easily get all the records you want by opening a recordset with a SQL statement that looks like this:
SELECT tblSalesOrders.* FROM tblSalesOrders WHERE someFKField IN (SELECT somePKField IN tblSelectedCustomers)
In code something like this would work ...
Dim strSQL As String strSQL = <a SQL statement that resembles what is above> With CurrentDb.OpenRecordset(strSQL) Do Until .EOF Debug.Print .Fields(0) 'Do your thing for each record here. .MoveNext Loop End With
There are other variations of SQL statements you can use, one being a JOIN, like Joel indicated ... which would look something like this:
SELECT tblSalesOrders.* FROM tblSalesOrders INNER JOIN tblSelectedCustomers ON tblSalesOrders.someFKfield = tblSelectedCustomers.somePKfield
Hope that helps!Brent Spaulding | Access MVP
- Proposed as answer by -suzyQ Tuesday, March 20, 2012 9:49 PM
Tuesday, March 20, 2012 9:06 PM -
SELECT tblSelectedCustomers.* FROM tblSalesOrders INNER JOIN
tblSelectedCustomers ON (tblSalesOrders.customerID = tblSelectedCustomers.customerID);This will work in any dialect of SQL.
p.s. Looks like Brent Spaulding raced me!
Matthew Slyman M.A. (Camb.)
- Edited by Matthew Slyman Wednesday, March 21, 2012 7:35 PM
Wednesday, March 21, 2012 7:33 PM -
Brent -
Thanks. Trying to stay away from looping though. I like the INNER JOIN approach better.
Ken
Wednesday, March 21, 2012 10:10 PM -
Thanks. I'll check them out.
Ken
Wednesday, March 21, 2012 10:11 PM -
Ken ...
>> Thanks. Trying to stay away from looping though. I like the INNER JOIN approach better. <<
I did not propose looping to be the solution, I recommended the INNER JOIN (or the IN clause) as the solution to get your data. The looping was just an example as to how to work with the resultant set of data returned by a recordset generated from the INNER JOIN (or IN clause) SQL statement should per record operations need to be done. Since we don't know what operation you were doing on each resultant record, I went ahead and included how you could work with the resultant set.
By the way, the looping I have shown is not like the looping you indicated you were doing --- but do remember that sometimes you just gotta loop! LOL!, but seriously please examine my response in greater detail, if my suggestion was unclear, I apologize.
Brent Spaulding | Access MVP
- Edited by datAdrenalineMVP Wednesday, March 21, 2012 10:37 PM
Wednesday, March 21, 2012 10:36 PM -
I recommended the INNER JOIN (or the IN clause) as the solution to get your data.
A JOIN will almost certainly be the most efficient solution here. An alternative is:
SELECT *
FROM tblSalesOrders
WHERE EXISTS
(SELECT *
FROM tblSelectedCustomers
WHERE tblSelectedCustomers.CustomerID = tblSalesOrders.CustomerID);
In most cases the use of the EXISTS predicate is faster than the IN operator as it allows the optimizer full rein, though in this case, with the large discrepancy in the cardinality of the two tables, the latter might have the edge.
BTW I think you meant 'FROM tblSelectedCustomers' in the subquery, not 'IN tblSelectedCustomers'.
Ken Sheridan, Stafford, England
Wednesday, March 21, 2012 11:30 PM -
Ken,
What's the difference? Why is the optimizer restricted in any way by my use of "INNER JOIN ... ON" rather than "WHERE EXISTS"? If I use appropriate indexes, and use "...ON...", why am I ever at any disadvantage?
More generally, where can I get good books on SQL? Not speaking here of any particular dialect - but more generally of SQL, indexing, optimization and fundamental database technology? What would you recommend?
Matthew Slyman M.A. (Camb.)
- Edited by Matthew Slyman Thursday, March 22, 2012 3:05 PM meant "ON" rather than "IN"
Thursday, March 22, 2012 6:22 AM -
Indexing is of course the crucial factor. The general preference for the EXISTS predicate over the IN operator is one of those axioms which you see time and time again in the literature, e.g. Celko:
"Conversion [of IN] to an EXISTS predicate is often a good way to improve performance, but it will not be as easy to read as the as the original IN predicate"
I don't actually agree with him about the readability; it's generally appeared equally if not more readable to me. Interestingly Celko refers to the IN predicate. I've always thought of it more as an operator, i.e. the operation is more one of the referential algebra rather than the referential calculus. One important difference between the two is the treatment of NULLs. This is particularly important with the NOT IN operator, which won't work if any row in the subquery returns a NULL. Doing a little simple algebra soon shows why. So the NOT EXISTS predicate should always be used if NULLs are a possibility.
Also from Celko:
"EXISTS (SELECT * FROM . . . )
In general, the SELECT * option should perform better than an actual column. It lets the query optimizer decide which column to use. If a column has an index on it, then simply seeing a pointer in the index is enough to determine that something exists. We do not need to know what the actual value was."
This brings me to the point I made about the difference in the cardinality of the two tables. If the table in the subquery has a low cardinality then it's not going to take long to scan it, so in this situation the IN operator might well be a better bet.
As regards books, the above quotes are both taken from Joe Celko's SQL for Smarties: Advanced SQL Programming. This is a good buy. It's not only authoritative, but very entertaining, and I've found many of the methodologies it describes to be usable in a variety of real world situations. One thing I don't share is his enthusiasm for his Nested Set Model for trees (in the mathematical rather than botanical sense). It's an ingenious, elegant and extremely efficient model, but I've found that true trees, in which there is one path only between each node, are relatively rare in the real world, or at least in that part with which I've collided in the course of my own field of work. Convergent graphs, in which there can be multiple paths between nodes have tended t be more common, for which the classical adjacency list model has always seemed to me to be a better, albeit less efficient, solution. As far as I recall Celko suggests using aliases to convert a convergent graph to a tree, but I think that could soon become very cumbersome.Another book of Celko's worth a read is his Data & Databases: Concepts in Practice. It's not about SQL as such, but it does make you think again about some of the basic principles underlying the relational model. On theory, as a general primer I still don't think there's anything to beat Chris Date's An Introduction to Database Systems. He's a Cambridge man like you. It's quite pricey these days though. I got my copy 12 years ago when it was affordable!
Ken Sheridan, Stafford, England
Thursday, March 22, 2012 1:42 PM -
Ken S.
One important disctinction ... The Jet/ACE engine does things differently than SQL Server.
I did an emperical study with respect to EXISTS, IN, NOT IT, INNER JOIN once -- it was some time ago and I can't find the thread at the moment (it was on UtterAccess). But if my memory serves me correctly ...
With indexes (and 'simple' queries):
IN vs INNER JOIN: virtually identical. I would not be surprized if the query plans were identical.
EXISTS: Was indeed fast, but I don't recall it being faster.
NOT IN: Slow, plus, it was slow ... did I mention that it was slow :) <indexes aren't used with a NOT IN anyway>
With OUT indexes (and 'simple' queries):
IN vs INNER JOIN: INNER JOIN was the best.
EXISTS: If I recall it was faster than IN as long as the WHERE clause was 'simple'
NOT IN: I don't think I bothered.
Brent Spaulding | Access MVP
- Edited by datAdrenalineMVP Thursday, March 22, 2012 5:11 PM
Thursday, March 22, 2012 5:11 PM