Data Points - Dapper, Entity Framework and Hybrid Apps

All replies

  • Signing in so I can get notifications if anyone comments! :)

    Julie Lerman, Author of Programming Entity Framework, MVP

    Wednesday, May 11, 2016 7:59 PM
  • Hey Julie, 

    Been a follower of your column for a long time; great work. I do, unfortunately, disagree with your evaluation in this article; I don't see the same numbers you see. I actually have an outstanding pull request on the Dapper repo where I got almost identical performance from EF on their tests (PR to modified Dapper tests: https://github.com/StackExchange/dapper-dot-net/pull/510), and even better performance from L2SQL, which honestly blew me away.

    Anyway, I think there are a few things that can be tweaked in the test code in the article that are not quite keeping things apples-to-apples:

    1. For true feature parity, we should turn off lazy loading, proxy generation, validation on save and, as you did, change tracking.
    2. You are creating a new instance of the context each time through your loop, while the Dapper code is an extension method, and therefore static. Granted, you aren't necessarily ever going to use your context statically (since you can't change the connection string at runtime), so your version is a "real-life" emulation, but it isn't a proper benchmark of the mapping code itself. I know it's nitpicky, but move the context construction outside the for loop and you'll be blown away by how much of the time it accounts for.
    3. I don't know what your tables look like, but I've found that small tables with a few rows and data types are actually faster in Dapper, though not so much once the table gets wide, with nullables and varied data types. That might be something to add.
    4. I couldn't get Dapper to map more than one child object (i.e., two joins) without doing my own processing after the fact, with a LINQ statement to stitch things together.
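    For reference, points 1 and 2 might be sketched in EF6 roughly like this. The four configuration flags are the real EF6 `DbContextConfiguration` properties; the context, entity, and iteration count are hypothetical names for illustration only:

    ```csharp
    using System.Data.Entity;
    using System.Linq;

    // Hypothetical entity for illustration.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // Hypothetical context for illustration.
    public class ShopContext : DbContext
    {
        public DbSet<Customer> Customers { get; set; }

        public ShopContext()
        {
            // Point 1: disable the convenience features Dapper never pays for.
            Configuration.LazyLoadingEnabled = false;
            Configuration.ProxyCreationEnabled = false;
            Configuration.ValidateOnSaveEnabled = false;
            Configuration.AutoDetectChangesEnabled = false;
        }
    }

    public static class Benchmark
    {
        // Point 2: construct the context once, outside the timed loop,
        // so the loop measures mapping rather than context construction.
        public static void Run(int iterations)
        {
            using (var context = new ShopContext())
            {
                for (var i = 0; i < iterations; i++)
                {
                    var customers = context.Customers.AsNoTracking().ToList();
                }
            }
        }
    }
    ```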

    Please take a look at my tests here: EF6 vs. Dapper (https://github.com/ewassef/ef6-vs-dapper) and let me know what you find. I was doing my own comparison and was blown away by what I found, even more so after reading your article about EF Utilities, bulk insert, etc., so I started researching to write my own blog post about it.

    Thanks, and keep writing, I look forward to it every month


    Wednesday, May 18, 2016 12:52 AM
  • Hey thanks Eddie.

    Great points.

    re #1. FWIW, I don't have anything in my domain classes that would trigger lazy loading or proxy generation; however, these are good points for a more generic sample where someone's domain classes MAY be designed that way.

    re #2. I'm creating a new instance of the context intentionally. I have to look more closely at Dapper for an apples-to-apples way of comparing them.

    re #3. ahhh cool. Good to have that point here for others to read

    re #4. hmmm...

    Will check out your repo. My sample is also on GH if you want to do a PR: https://github.com/julielerman/ef6_dapper_experiments

    Thanks a bunch!

    Julie Lerman, Author of Programming Entity Framework, MVP

    Thursday, May 19, 2016 1:56 PM
    It is definitely true that the more recent builds of EF have aggressively targeted some of the scenarios where micro-ORMs used to embarrass it - I know that the EF team have actually been using dapper comparisons as part of their goal, so I'm very pleased for them if they've achieved it in some scenarios. Perhaps part of the difference here is that dapper *defaults* to this lightweight scenario. They do target different uses, but either *can* be used in the place of the other. As I keep telling people: an ORM or micro-ORM is just a tool. You are allowed to use different tools for different jobs in the same application. The back end of a screwdriver can be used as a hammer, but it is easier to pick up a hammer.

    On your points 1/2: yes, a like for like comparison should minimize any misleading feature / design differences.

    3: sounds odd, but can't comment concretely

    4: yep, complex trees aren't a key targeted usage, although I'd like to make this better; the hardest bit is probably figuring out what the API to describe it should look like! (edit: there *is* a multi-map that comes close to this, but it needs some manual love to make it build a tree rather than an object set per row)
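    The multi-map mentioned here is Dapper's `Query<TFirst, TSecond, TReturn>` overload; the "manual love" is grouping its one-object-set-per-row output into a tree yourself. A sketch, with hypothetical Order/OrderLine types and table names:

    ```csharp
    using Dapper;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;

    public class Order
    {
        public int Id { get; set; }
        public List<OrderLine> Lines { get; set; }
    }

    public class OrderLine
    {
        public int Id { get; set; }
        public int OrderId { get; set; }
    }

    public static class OrderQueries
    {
        public static List<Order> LoadOrderTree(IDbConnection connection)
        {
            var lookup = new Dictionary<int, Order>();
            connection.Query<Order, OrderLine, Order>(
                @"SELECT o.*, l.*
                  FROM Orders o
                  JOIN OrderLines l ON l.OrderId = o.Id",
                (order, line) =>
                {
                    // Dapper hands back a fresh parent per row;
                    // keep the first one and attach children to it.
                    Order existing;
                    if (!lookup.TryGetValue(order.Id, out existing))
                    {
                        existing = order;
                        existing.Lines = new List<OrderLine>();
                        lookup.Add(order.Id, existing);
                    }
                    existing.Lines.Add(line);
                    return existing;
                },
                splitOn: "Id"); // column where the second object begins
            return lookup.Values.ToList();
        }
    }
    ```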

    Marc [C# MVP]

    Thursday, May 19, 2016 2:01 PM
    @MarcGravell I hope you take a look at Eddie's code in his EF6 vs. Dapper repo. I work with him, and we've been discussing the merits of doing things in Dapper versus our existing ORM (NHibernate) versus moving to EF6. I took some time this weekend to follow up on his code, and I've made some incremental changes, adding a few new tests to show the incremental improvements. In the last case, I've sped things up by a factor of about four, but he hasn't had a chance to review my code and give me feedback as of this writing, so he may find something I've done wrong.

    Regarding #3, it's important that we optimize for retrieval. If we are returning 10k child records at 10 fields each and 1k parent records at 20 fields each, we should be returning approximately 100k child record-fields and 20k parent record-fields, for a total retrieval of 120k record-fields. If we inadvertently do a cartesian join in the naive case, we end up returning 10k records at 30 fields apiece, or 300k fields, obviously far more than 120k. That is significant bloat, and it takes up a lot of over-the-wire time, which significantly degrades our speed. There are ways to improve that time by writing optimized retrieval code and verifying it with testing.
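    One way to get the ~120k-field shape rather than the ~300k-field cartesian result is to return two result sets in one round trip and stitch them in memory. A sketch using Dapper's `QueryMultiple` (the Parent/Child types and table names are hypothetical):

    ```csharp
    using Dapper;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;

    public class Parent
    {
        public int Id { get; set; }
        public List<Child> Children { get; } = new List<Child>();
    }

    public class Child
    {
        public int Id { get; set; }
        public int ParentId { get; set; }
    }

    public static class ParentQueries
    {
        public static List<Parent> LoadParentsWithChildren(IDbConnection connection)
        {
            // Two result sets: 1k parents x 20 fields plus 10k children
            // x 10 fields, instead of one 10k-row x 30-field cartesian join.
            using (var multi = connection.QueryMultiple(
                @"SELECT * FROM Parents;
                  SELECT * FROM Children;"))
            {
                var parents = multi.Read<Parent>().ToDictionary(p => p.Id);
                foreach (var child in multi.Read<Child>())
                {
                    parents[child.ParentId].Children.Add(child);
                }
                return parents.Values.ToList();
            }
        }
    }
    ```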

    Regarding #4, that is a direct extension of #3, so if we do #3 incorrectly, we may have issues with #4. On top of that, as Marc mentioned, multi-tier record retrieval is not Dapper's target use case. We should use the "default" multi-mapper (the way Eddie wrote his sample case) for split-table entities, or where the parent will have no more than one or two child records in the response. With a large number of responses, or deep nesting, we should set aside the naive case and write the complex retrieval code properly.

    Monday, May 30, 2016 5:51 AM
    I've been interested in Dapper for a while, since I've worked in tons of environments that were stuck writing 2004-era data code. Two things, though, have always stuck out to me as bugbears:

    1) Writing SQL inline in code. I get that SQL isn't evil or dirty; I like SQL. But there's something off-putting about hardcoding a query. Maybe it's just that so many places I've worked consider this a cardinal sin and require everything to be done via a stored procedure or a function.

    2) It seems extremely clunky to do anything beyond basic queries, and everything seems to use "SELECT *", which is another thing I commonly see treated as a cardinal sin (not that we use Dapper at my current place of employment, but it's in our coding standards to never, ever use SELECT *). I suppose the easy way around this is to use views and functions to do the joining of data so you don't have to write an ugly SQL statement in a method call.

    I don't know. I like the concept of Dapper and similar micro-ORMs (I've also used PetaPoco and found it pretty good); maybe my viewpoint is just clouded by working at places that always want to use procs for data access.

    Wednesday, June 1, 2016 1:36 PM
  • Hi Wayne,

    I'm with you on writing SQL, etc. (gosh, I've forgotten much of it after using EF for so long), but I think the real point (and value) of a micro-ORM like Dapper is when performance outranks ease of coding. You have to pick your battles, right? So you just may find times when it's worth the extra effort to get the performance gain. If not, then there's no real need. WRT raw SQL, you can call views and stored procedures from Dapper as well. HTH
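    As a concrete note on that last point, calling a stored procedure from Dapper is just the standard `Query` call with `CommandType.StoredProcedure`; no inline SQL required. The proc name, parameter, and entity here are hypothetical:

    ```csharp
    using Dapper;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;
    using System.Linq;

    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class CustomerRepository
    {
        public static List<Customer> ByRegion(string connectionString, int regionId)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                // Dapper executes the proc and maps rows to Customer;
                // the anonymous object supplies the proc's parameters.
                return connection.Query<Customer>(
                    "dbo.GetCustomersByRegion",
                    new { RegionId = regionId },
                    commandType: CommandType.StoredProcedure).ToList();
            }
        }
    }
    ```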

    Julie Lerman, Author of Programming Entity Framework, MVP

    Wednesday, June 1, 2016 2:02 PM
    The speed comparisons are useful, but honestly, when dealing with complicated queries, adding another layer such as EF or NHibernate only complicates matters. You then must learn the proper way to query data from EF so it can generate the proper SQL to return your data. That's backwards. How many layers of abstraction do we need? I'd rather just craft the SQL in the first place. We ran into this often with NHibernate; it got rather ridiculous. We were chasing an Ivory Tower ideal when really just writing the SQL was much easier. There is no need to force RDBMS data into a pure OOP model. Even if Dapper were the same speed as EF, or even slightly slower, it would still be my preferred way of querying a database.

    Saturday, December 17, 2016 5:24 AM
    Hi Julie,

    I am working on an open source library which translates EF lambda expressions into SQL and executes them using Dapper. I am trying to help people get the full OO experience of EF while also getting the performance benefit of Dapper. It seems like you have great knowledge of EF; maybe you could tell me if I am doing something useful, or if there is anything else I should consider. Here is the link to the library: https://github.com/ethanli83/EFSqlTranslator.

    Thanks in advance

    Sunday, January 29, 2017 7:48 AM