Explicit vs. Implicit Behaviour

    General discussion

  • A few words first about the focus of this post: I am looking at the features described below with only the maintainability of the resulting code in mind. The assumed situation is an enterprise project with > 20 developers, internal and external, who build the application, and a completely different team of developers who maintain the system later on. The assumption is also that none of the developers who originally created the system will be available to answer questions from the maintenance staff. The members of both the development and the maintenance team will be average developers, maybe 'just' programmers, who are not familiar with (and maybe also not interested in) implementation details hidden by higher-level statements, such as DB connection lifetime.

    So from my point of view it is very important to have, besides accurate documentation, explicit behaviour that is obvious to any reader of the code, even if he does not have any deep knowledge of what is going on behind the scenes. In my experience with enterprise projects, self-documenting code is essential for maintainability.

    The first implicit issue I saw in a PDC slide is the declaration of a variable as var. As I understand it, the compiler will select the 'real' type of the variable depending on the initial value. Now take this line of code:

    var n = 5;

    What is the correct type? From the type system, it can be any of the integral types (sbyte, byte, char, short, ushort, int, uint, long, ulong). Only those programmers who read the language specification in detail will know what type will be chosen by the compiler. This is an implicit, hidden behaviour, which decreases the maintainability. Having int n = 5; instead is explicit. Anyone can see the type of n directly.
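    A minimal sketch of the inference in question, assuming the integer-literal rules of the C# language specification (an unsuffixed literal takes the first of int, uint, long, ulong that can represent its value). The helper StaticTypeOf is a hypothetical name introduced only for this illustration:

```csharp
using System;

static class VarInferenceDemo
{
    // Hypothetical helper: reports the compile-time type the compiler inferred.
    public static string StaticTypeOf<T>(T value)
    {
        return typeof(T).Name;
    }

    static void Main()
    {
        var n = 5;            // unsuffixed literal that fits in int  -> int
        var u = 5u;           // 'u' suffix                           -> uint
        var l = 5L;           // 'L' suffix                           -> long
        var big = 3000000000; // too large for int, fits in uint      -> uint

        Console.WriteLine(StaticTypeOf(n));
        Console.WriteLine(StaticTypeOf(u));
        Console.WriteLine(StaticTypeOf(l));
        Console.WriteLine(StaticTypeOf(big));
    }
}
```

    The rule is deterministic, but the reader has to know it; writing int n = 5; states the same result in the code itself.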

    Another maintainability issue, from my point of view, is the possibility of declaring anonymous types. I have seen production code with methods of > 1000 lines. This is not really good, but it is real life. Now imagine that someone who writes methods with > 1000 lines of code is using anonymous types, and someone else has to maintain that code. I doubt that it will be easy to identify the types used in the code, even with just two screen pages of code. Besides this, when you discuss the code with a colleague, what will you call that anonymous type? A1, A2, A3, 'that type over there' and 'the other type over here'?
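    To make the comparison concrete, here is a small sketch that sets an anonymous type next to a named one. CustomerSummary and its properties are hypothetical, chosen only for illustration:

```csharp
using System;

// Hypothetical named type, introduced only for comparison.
class CustomerSummary
{
    public string Name { get; set; }
    public string City { get; set; }
}

static class AnonymousTypeDemo
{
    static void Main()
    {
        // Anonymous type: the compiler generates a class whose name
        // cannot be written in source code, so 'var' is mandatory here.
        var anon = new { Name = "Ada", City = "London" };

        // Named type: the reader (and a colleague in a review) has a name to refer to.
        CustomerSummary named = new CustomerSummary { Name = "Ada", City = "London" };

        Console.WriteLine(anon.Name + ", " + anon.City);
        Console.WriteLine(named.Name + ", " + named.City);

        // The named type reports its declared name; the anonymous type
        // reports a compiler-generated one.
        Console.WriteLine(named.GetType().Name);
        Console.WriteLine(anon.GetType().Name);
    }
}
```

    Both objects carry the same data; only the named variant gives the maintainer something to search for and talk about.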
     
    Another issue, for both maintenance and resource usage, seems to me to be the lifetime of e.g. a DB connection within DLinq. It is not obvious how long the database connection is open (in use) when using DLinq. Do I have disconnected access, which might lead to heavy memory usage on the client given that the result set has many rows and/or large field content, or is the connection open until GC destroys the underlying objects? The latter would increase the resource usage on the DB server because of a potentially high number of concurrent connections. I'm sure this will be described in the documentation of DLinq, but, as said above, the average programmer might not read this.

    Of course, one might prevent the usage of these features by having strict coding conventions and quality reviews, but this is an additional effort, which increases the cost of a project.

    Maybe there are ways to use (these) implicit features in an explicit way (without writing additional comments ;-) )?

    Sunday, October 02, 2005 2:56 PM

All replies

  • What is the correct type? ... Only those programmers who read the language specification in detail will know what type will be chosen by the compiler.

    Frankly, I think this is, at best, a really atypical example of how LINQ will actually be used. I'm not saying that such a construct isn't (and shouldn't be) valid, though. Realistically, nobody is going to write a static singleton sequence for use as a query unless they have a specific and well-reasoned need to do it, since it's such an atypical use. And yes, that should be documented.

    This is an implicit, hidden behaviour, which decreases the maintainability. Having int n = 5; instead is explicit. Anyone can see the type of n directly.

    Anybody can look at the bound type of the generic iterator that provides instance values to see what the actual type is. That's kind of the point of generics. It could be any of a number of things that are in a family of values, not just one type. So, LINQ in and of itself doesn't introduce more complexity into that space. We get to start coping with it for real in 39 days or maybe less.

    It is not obvious how long the database connection is open (in use) when using DLinq.

    Explicit? No. Obvious? Well, it was to me... just long enough to populate the sequences of bound objects. If you run SQL Profiler against such an app now, you see an explicit open on first use; the command to execute flies over, gets executed, and returns. Then there is an explicit close before the first iterated value is returned. At least, that appears to be what I'm seeing today.

    Do I have disconnected access, which might lead to heavy memory usage on the client given that the result set has many rows and/or large field content,

    Yes, it's disconnected post-access, but the footprint is in the sequenced instances rather than in a method-rich object like a Data(Set|Table). Regardless of the technology chosen, it's always possible to wind up with Bowling Balls in your paths, though. There's no magic in LINQ for that. :-)

    or is the connection open until GC destroys the underlying objects?

    Not based on what I'm seeing now. That'd be a horrible design from a scalability standpoint, as you go on to discuss, though.

    I'm sure this will be described in the documentation of DLinq, but, as said above, the average programmer might not read this.

    Proactive preparation and planning prevents piss poor program(mer) performance. It'll be a long time until there's a technology fix for that behavior. It's why we have managers and training budgets, right....? :-)

    Monday, October 03, 2005 3:04 AM
  • Kent,

    Looking at your answers, I have the impression that I was not able to express myself correctly. Sorry for that!

    The description of the particular implementation wasn't my focus. My focus in this posting is that I have the impression that some of LINQ's features hide behaviour, which makes it harder to maintain a system. Isn't your suggestion to use SQL Profiler to see the behaviour, for example, proof of this?

    Of course I can spend more time on preparation and planning to try to avoid an unwanted programming style. This also requires additional time to review whether all rules were met. Since time is money, and money is a very limited resource in all enterprise projects I know, I'm trying to save as much as I can. (I do agree with you that a technology to 'fix' programmers' behaviour is far, far away in the future. :-) )

    In this thread I want to take the chance to discuss a more 'abstract' aspect of this new technique (let me say, from a manager's point of view), since the impact on maintainability is a very important one to me. So maybe this discussion will help to find a way that serves both sides: having a 'decoupled' way to access database data that does not have a negative impact on maintainability.

    Please let me know if I got you wrong.
    Sunday, October 09, 2005 5:01 PM