locked
Mysterious rows RRS feed

  • Question

  • May I know if there is a way to tell how a given row has been populated into a table? What I mean is; I have a table which is supposedly populated from two separate sources. One source uses a stored procedure while the other uses a simple straight forward query. Both are run by an SSIS package. The problem is that there are mysterious rows which are created in the process. Is there any way that I can tell how and where these rows come about other than just looking at the above named queries. I have exhausted all possible ways within the limit of my knowledge.

      

    Thanks,


    Mpumelelo
    Wednesday, January 25, 2012 4:31 PM

Answers

  • I think I have managed to get to the root of my problem. There is a code which is buried in a gigantic SSIS package in one of the Execute SQL tasks. This code is the culprit. However, Naomi and Dharampal’s comments are very useful to be used in problems related to this. As such, I’m marking both as answers.

    Many thanks,


    Mpumelelo
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:48 PM
    Tuesday, January 31, 2012 4:48 PM
  • You can create a trace, but if these rows are created as part of the big insert (not a single row insert) you may be out of luck identifying the source. Of course, you can also have a trigger on the table that will save user's name and inserted info in the audit table (or do auditing by other means).
    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    • Edited by Naomi N Wednesday, January 25, 2012 4:34 PM
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:49 PM
    Wednesday, January 25, 2012 4:34 PM
  • You have two sources, one is SP and another is direct qeuries.

    Suppose you have 10 columns in Target table. Add one more column as "Source" in target table.

    While entering records from SP, pass 10 columns value with the value "source is sp" in "Source" column of target table. If records are inserting by another place or direct query then with 10 columns pass "source is direct sql" in "Source" column of target table.


    dharampal sikhwal
    • Proposed as answer by vinaypugalia Monday, January 30, 2012 12:49 PM
    • Unproposed as answer by Mpumelelo S Monday, January 30, 2012 2:44 PM
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:49 PM
    Thursday, January 26, 2012 12:34 PM

All replies

  • You can create a trace, but if these rows are created as part of the big insert (not a single row insert) you may be out of luck identifying the source. Of course, you can also have a trigger on the table that will save user's name and inserted info in the audit table (or do auditing by other means).
    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    • Edited by Naomi N Wednesday, January 25, 2012 4:34 PM
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:49 PM
    Wednesday, January 25, 2012 4:34 PM
  • Thank you Naomi for that prompt reply. The table in question has a nightly refresh (trash and load) of approx 149000 rows. Given that, what could be the best approach?


    Mpumelelo
    Wednesday, January 25, 2012 4:46 PM
  • Hi,

    You have to perform looping for this kind of operation.


    Cheers,
    IT-Singh
    Wednesday, January 25, 2012 4:47 PM
  • Or maybe there is another object loading those mysterious rows which I’m not able to pinpoint. But one thing for sure, the rows in question are wrong and don’t represent any real life data. The trouble is; the system is full of so many stored procedures and SSIS packages which load various tables in the database. As far as I can investigate, there are only two of them responsible for populating that table. Besides, those two objects pull data from the only two sources that we use, unless if there is a third non-obvious object responsible for the mysterious rows. Is there any way that I can use to pinpoint that culprit (query/stored procedure).


    Mpumelelo
    Wednesday, January 25, 2012 4:55 PM
  • The only way I see is to set up a trace and monitor the process for couple of days. 
    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Wednesday, January 25, 2012 4:59 PM
  • Thank you Naomi. I will do as per your suggestion. I will come back with the results some time next week as my working week ends here for this week. Any other suggestions from other techies will be gladly welcome but the results will be next week.


    Mpumelelo

    • Edited by Mpumelelo S Wednesday, January 25, 2012 5:10 PM
    Wednesday, January 25, 2012 5:08 PM
  • If you are interested only to know the source of rows and if the source is so important for you, you can add one more column "source" in the table and while inserting rows you can pass appropriate values in source field.
    dharampal sikhwal
    Wednesday, January 25, 2012 7:31 PM
  • Thank you Dharampal. By appropriate values, what do you mean? How will I determine or baseline the appointment of those values in relation to the source? Hope my question is clear.


    Mpumelelo
    Thursday, January 26, 2012 9:07 AM
  • You have two sources, one is SP and another is direct qeuries.

    Suppose you have 10 columns in Target table. Add one more column as "Source" in target table.

    While entering records from SP, pass 10 columns value with the value "source is sp" in "Source" column of target table. If records are inserting by another place or direct query then with 10 columns pass "source is direct sql" in "Source" column of target table.


    dharampal sikhwal
    • Proposed as answer by vinaypugalia Monday, January 30, 2012 12:49 PM
    • Unproposed as answer by Mpumelelo S Monday, January 30, 2012 2:44 PM
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:49 PM
    Thursday, January 26, 2012 12:34 PM
  • Thanks Dharampal. Your suggestion has been well understood. I will try that when I get to work next week and will give the feedback once I get an amicable solution.


    Mpumelelo
    Thursday, January 26, 2012 2:31 PM
  • This problem is not yet solved. Actually I have just noticed that the current setup already has a “Source” column which Dharampal suggested that I should include. I found this setup already in place when I joined this organisation. Unfortunately values in that “Source” column also seem to be victims of the mystery surrounding my predicament as they are giving the impression that the fictitious rows are coming from the source queried by a straight SQL statement.  

    Maybe let me add some more details to what I have already explained in my earlier posts about the setup of my systems. 

    1.      The source which is queried by a straight SQL statement is a legacy database which is no longer functional on a daily basis. It is however queried every night to populate a data warehouse which is populated on a ‘trash and build’ method. I’ll call this source OLD.

    2.      The source queried by a stored procedure is the currently functional transactional database. This too is queried every night to populate a data warehouse on a ‘trash and build’ method. I’ll call this source NEW.

    Records from OLD and NEW are populated into the same table, thus effectively combining them.

    Client records from OLD and NEW

    1.      May be new on NEW database.

    2.      May be closed and just resident on the legacy OLD database.

    3.      May have been migrated from the legacy OLD database to NEW database as they were still open and live at the time of the introduction of NEW database. 

    It is the records on number 3 that are giving problems that I’m experiencing. When a client record closes, i.e. a record which originally spanned two databases (OLD and NEW) at the time when NEW was launched, a spurious record gets created pretending to be originating from OLD. That record persists until a new genuine and authentic record for that client in question is created, then at that point the fictitious record closes. This seems to be happening on all the different client records as long as they spanned OLD and NEW databases when NEW was launched. 

    I hope these further details that I have given make my dilemma more understandable and attackable. I will appreciate additional suggestions further to those already given so far. In the meantime I will explore the trace option which Naomi suggested. It sounds a little daunting though to someone like me who is more of a developer than a DBA, and for that matter still growing in experience. Will be good learning curve for me I guess. 

    Many thanks,

     


    Mpumelelo




    • Edited by Mpumelelo S Monday, January 30, 2012 2:45 PM
    Monday, January 30, 2012 11:47 AM
  • I think I have managed to get to the root of my problem. There is a code which is buried in a gigantic SSIS package in one of the Execute SQL tasks. This code is the culprit. However, Naomi and Dharampal’s comments are very useful to be used in problems related to this. As such, I’m marking both as answers.

    Many thanks,


    Mpumelelo
    • Marked as answer by Mpumelelo S Tuesday, January 31, 2012 4:48 PM
    Tuesday, January 31, 2012 4:48 PM