Big data and .NET

    Question

  • Good day to all!
    I recently ran into a task that looked trivial but turned out to be quite a challenge. It concerns choosing an architecture for handling big data with .NET.

    Situation.
    The company has a system full of bugs and errors, with little room for customization. Business requirements change every week. As a result, the system has turned into a mega-monster. The volume of data in it is impressive: a single table can hold more than 40 million records.

    Requirements.
    I need to create a universal interface for fast data access (MS SQL Server 2008). The user (>.<) wants to work with all the data at once (display, filter, and sort large amounts of data with complex queries).
    We plan to add editing in the future, so this must also be taken into account.

    Problem.
    We can't hold that much data in memory, and the user doesn't want to download 40 gigs. Therefore the sorting, filtering, and retrieval logic has to move to the server.
    My solution is to give the user only a portion of the data at a time, so-called lazy loading. But how to combine it with ADO.NET remains a mystery to me.
    The user's computer simply hangs when I build the DataSet in RAM.

    QUESTIONS

    1. Are there .NET mechanisms for manipulating data virtually, without loading it all? I get a timeout exception while waiting for the response.

    2. How do I enable caching TO THE HARD DISK in ADO.NET (http://msdn.microsoft.com/ru-ru/library/bb384436.aspx)?

    3. How can I deliver the data not en masse, with a long delay and a 40-gig result at the end, but dynamically, row by row as it is retrieved from SQL Server, the way SQL Server Management Studio does?

    4. Can I create a local on-disk query cache? Something like a "cached view" =)

    5. So, the million-dollar question: which is better, a two-tier or a three-tier architecture (which copes better with selecting hundreds of millions of values from a table: a disk cache of the data, or an indexed view in SQL Server)?

    I would be glad if someone would share their experience of developing applications that work with large volumes of data using .NET technologies.
    • Moved by Vicky Song (Microsoft employee), Friday, May 04, 2012 7:17 AM (From: Visual Studio Database Development Tools (Formerly "Database Edition Forum"))
    Thursday, May 03, 2012 1:28 PM

Answers

  • Hi,

    The issue seems to lie in the "wants to work immediately with all the data to filter them..." part: it seems you are trying to load all the data and then let the user filter it. IMO it should be the other way round. The user should start by specifying which data they want through a filtering interface, so that the filtering is done server-side and only the requested data is fetched from the server.
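
    A minimal sketch of that idea, not a definitive implementation: the filter and the paging are both expressed in the query, so the client only ever receives one small page. It is written in Python with the standard-library sqlite3 module standing in for SQL Server so that it can be run as-is. The `orders` table, its columns, and the helpers `build_demo_db` and `fetch_page` are hypothetical names invented for the illustration. On SQL Server 2008 the same shape would presumably use `SELECT TOP (@pageSize) ...` in place of `LIMIT`, with the values passed as command parameters.

```python
import sqlite3


def build_demo_db(row_count=1000):
    """Create an in-memory table that stands in for a large server-side table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)",
        [(i, f"customer-{i % 10}", i % 100) for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def fetch_page(conn, min_amount, last_id=0, page_size=50):
    """Return up to page_size rows that match the filter, starting after last_id.

    The WHERE clause does the filtering on the database side, and the
    "id > last_id ORDER BY id" pair (keyset paging) lets the next page
    start where the previous one ended without rescanning earlier rows.
    """
    sql = (
        "SELECT id, customer, amount FROM orders "
        "WHERE amount >= ? AND id > ? "
        "ORDER BY id LIMIT ?"
    )
    return conn.execute(sql, (min_amount, last_id, page_size)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    first = fetch_page(conn, min_amount=90, page_size=5)
    # The id of the last row on a page is the starting point for the next page.
    second = fetch_page(conn, min_amount=90, last_id=first[-1][0], page_size=5)
    print(first)
    print(second)
```

    Paging on an indexed key rather than on a row offset keeps each request cheap even deep into a table with tens of millions of rows, because the database can seek straight to `last_id`.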

    The idea is that a DataSet is not intended to hold all the database data and then filter it. It is intended to be used as a "read/write cache" that holds only the data you currently need.
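
    To illustrate the "holds only the data you currently need" part, here is a hedged sketch of a bounded client-side cache that keeps a few recently viewed pages and drops the oldest one when a new page arrives. It is not how DataSet works internally. `PageCache` and `fake_loader` are hypothetical names, and `fake_loader` merely stands in for a real server query. Only the read side is shown; writing edits back is out of scope here.

```python
from collections import OrderedDict


class PageCache:
    """Keep at most max_pages pages in memory, evicting the least recently used."""

    def __init__(self, load_page, max_pages=3):
        self._load_page = load_page   # callable: page_number -> list of rows
        self._max_pages = max_pages
        self._pages = OrderedDict()   # page_number -> rows, oldest first

    def get(self, page_number):
        """Return the rows of a page, querying the server only on a cache miss."""
        if page_number in self._pages:
            self._pages.move_to_end(page_number)  # mark as most recently used
            return self._pages[page_number]
        rows = self._load_page(page_number)
        self._pages[page_number] = rows
        if len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)       # drop the least recently used page
        return rows

    def cached_pages(self):
        """Page numbers currently held in memory, least recently used first."""
        return list(self._pages)


def fake_loader(page_number, page_size=4):
    """Hypothetical stand-in for a server query that returns one page of row ids."""
    start = page_number * page_size
    return list(range(start, start + page_size))


if __name__ == "__main__":
    cache = PageCache(fake_loader, max_pages=2)
    for page in (0, 1, 0, 2):
        cache.get(page)
    print(cache.cached_pages())
```

    Memory use stays proportional to `max_pages` times the page size, regardless of how many rows the table holds.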

    If you have decided to use DataSets, a group such as http://social.msdn.microsoft.com/Forums/en-US/adodotnetdataset/threads could be a better place to ask.

    FYI, Entity Framework lets you handle database data as object instances (http://msdn.microsoft.com/en-us/data/aa937723).


    Please always mark whatever response solved your issue so that the thread is properly marked as "Answered".

    Friday, May 04, 2012 11:36 AM

All replies

  • Hello hex.stvle,

    I am moving your case to the ADO.NET Entity Framework forum so that you can get better support there.

    Thanks.


    Vicky Song [MSFT]
    MSDN Community Support | Feedback to us

    Friday, May 04, 2012 7:17 AM