Ask a questionAsk a question
 

AnswerMega Cluster logic

  • Thursday, July 20, 2006 7:34 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Dr Neil first mentioned this on the via virtual earth site recently and now I need to use them. Just though I would put my ideas out here and see what you all think and can add.

    To be clear this is a concept and I’m looking for ideas / help before I build it.

    Basically the issue is you have X number of points to put on the map where X is very large.

    1. If you go and add them all it is going to slow down to a crawl.

    2. If you have points at a close proximity at certain zoom levels they are going to sit on top of each other and be unhoverable.

    The solution is to have a mega cluster that provides information about the points it represents.

    Firstly we need to look at how we calculate what points need to be merged to become a mega cluster, secondly we need to look at what a mega cluster shows on hover to be useful.

    AJAX

    So straightaway with this many points we need to make an asynchronous call to a web service to get out points in XML. This can then be bound to the VE in JavaScript (or Atlas) in various ways. What we want is every time the map is panned or zoomed to get a new set of data. The parameters we will need to pass are: top left Lat Lon, bottom right Lat Long (so we know the area that is visible) and the zoom level.

    Calculations

    1. So I assume you have a collection of all the points in some way. We need to get only the ones that are visible. If the points are stored in the database we could query to get only these. If we had a more common set of points perhaps these are cached, if so we need to get a subset of just the visible ones.

    2. Now we need to identify which points are so close to each other they should be made into a mega cluster. We know the zoom level, we should have a constant value that indicates how big our icons are in pixels so what we need is a value in degrees for our current zoom level that indicates the radius of our icon. Using this we can group our points but there are some complications.

    If you have a set of evenly spaced points all within the range of each other to be clustered which ones get clustered with which points?

    My algorithm:

    I’m sure someone out there is a maths genius (I hope so) so I wonder if there is an algorithm that already exists to do this in the most optimal format.

    For now I propose we simply have a sorted list of our visible points, sorted by longitude.

    We go through the list in order, for each point if we:

    look backwards in the list for any points within the range that are not already grouped, as the points are in order we exit as soon as it exceeds the range.  

    WE then look forwards in the list for any points within the range, again we short out.

    For each value we find that longitude is in the range we check to see if the latitude is in the range also, if it is we group this. Perhaps we add a value to indicate a new point group id, or remove it from the list and build a new list of points, I’m undecided and up for ideas.

    Importantly we end up with a list of points, some of them grouped, all of them visible in the current view.

    Looks

    So we could just pass this information out to VE but it doesn’t know what to do with our new grouping. We could potentially still have a large number of points all grouped together.

    Now points have a balloon that appears when you hover. It could contain anything but most commonly it has a title, description, an image.

    Option 1 – “VE Windows” (Popup over the top of VE – I assume these can be made?? I have never made one).

    I’m thinking for a mega cluster we provide a brief list of links to the points’ balloons with a more link. This way if there is say less then 5 you can see the titles and go to the one you are after immediately else you need to see the whole list.

    For the whole list we could open up a new “VE window” and show the complete list of titles as links.

    When you click on a link either from the initial 5 or from the list it would open another “VE window” (replacing the one that is up, or minimising it) showing the balloon information.

    Option 2 – Zoom

    Same concept with the lists of titles but instead on being individual links there is a single “zoom to expand” link, perhaps also invoked by clicking on the point. It simply centres on the point and zooms in a specified amount.

    This would force a new request and the points would effectively be separated into smaller groups or individual points. Icons would help here.

    The issue here is what happens when you have Y number of points at exactly the same location. These can never be separated. How could this happen you ask? Say you took 10 photos from the same spot and are plotting photos of the world…

    Hey it was long, you made it through it, share you thoughts.

    John.

Answers

All Replies

  • Thursday, July 20, 2006 8:10 AMhafi23 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Seams to bee a taff thing you are planning to do...
    have a look at this PPT http://dimacs.rutgers.edu/Workshops/MiningTutorial/pindyk-slides.ppt,
    maybe this "Algorithms for Nearest Neighbor Search" can be a starting point for a datastructure or anything else related to your algorithm... found it yesterday looking for s/t similar!

    hope it can help you ;-)
    cheers
  • Thursday, July 20, 2006 8:12 AMhafi23 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Oops... ugly font size after dam copy&past :-o
  • Thursday, July 20, 2006 2:11 PMnside Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    John,

    You should use the excellent Cluster 3.0 clustering engine which is perfect for this kind of task. It provides k-means clustering (that's the hard math part of the problem) capabilities that you can improve by using a simple divide and conquer algorithm. This way you get very natural and precise clustering (feeding all the points to the k-means algorithm gives strange results and is very slow). It works best with static data but I could imagine someone using it for dynamic datasets.

    Regards,
  • Thursday, July 20, 2006 10:12 PMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Thanks to both of you for your ideas and links.
    It seems I was a little too simple in my algorithm for clustering. It appears you need to factor in the curve of the earth and use something like the "Haversine formula" to calulate the distance between two points:

    dlon = lon2 - lon1;

    dlat = lat2 - lat1;

    a = (sin(dlat/2)) ^2 + cos(lat1) * cos(lat2) * (sin(dlon/2)) ^2;

    c = 2 * atan2(sqrt(a), sqrt(1-a));

    d = R * c;

    (Where R is the Radius of the earth)

    Then we create a matrix with the distance between every point.
    I can see this being very slow, the first thing we can do is eliminate points are are clearly two far away based on my simple evaluation of dlon and dlat. The code we write should be mulitheaded utilising a divide and conquer algorithm as suggested by nside.

    I will run up a few tests over the next week.
    John.
  • Friday, July 21, 2006 10:13 PMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    After reading some of the articles on clustering I came across a simply idea. Instead of calculating distances between points etc why not just lay down a grid of points that do not overlap for the size of the icon and current zoom level and then simply snap the points to that grid.

    The calculation for the grid will be a little complex to factor in the curvature of the earth but then only a single pass over the visible points will be required to group them.

    My first steps will be to create the grid and confirm it will work. I can already see that this could be optimised based on a sorted set of points to only calculate the grid points where necessary over two passes of the data, one for latitude, one for longitude.

    Any thoughts?

  • Friday, July 21, 2006 11:56 PMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I have put up a simple example of a 10x10 degree grid over VE:
    http://soulsolutions.com.au/MegaClusterEx1.htm

    (IE only sorry)
    Note the curvature of the earth.
    John.
  • Sunday, July 23, 2006 7:47 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I returned to Dr Neil's example on viavirtualearth and can see there is going to be a need for Mega Clusters to be calulated on the client side for some applications. I can also see that at some stage MS will release custering or at least overlapping pin avoidance. Apart from the fact I can't wait, I don't see how this is going to be efficently handled on the client side for a large number of points. What would be nice is some client side code that relocates overlapping pins so they are usable. The combination of what i hope to do here and that client side feature could solve many issues.

    My next steps are:
    1) create a grid that compensates for the curve of the earth.
    2) create a function that snaps points to that grid server side (basic within bounds of grid square).
    3) calulate the required grid for each zoomlevel and specified icon size.
    4) display grouped points as mega clusters
    5) optimise the algorithm

    I'll try to put each step up on my site for those that are interested.
    John.
  • Thursday, July 27, 2006 7:48 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Ok its been a big week so not a lot of progress on this.
    I tried using some funky algorithims to draw a 500Km grid:
    http://soulsolutions.com.au/MegaClusterEx2.htm
    (ie only, Fair bit of CPU work here so be patient)
    Immediatly you can see that this was silly as it follows the same issue as using degrees.

    I then found the very nice map.PixelToLatLong(x,y) function and ploted a 100 pixel grid
    http://soulsolutions.com.au/MegaClusterEx3.htm
    (ie only again - need to look at polylines in FF problem)

    So I have the peices of the puzzle but now to put it together on the server side...
    John.
  • Friday, July 28, 2006 8:17 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Cracked it.
    With a bit of help from the logic in the WorldWind Hack on Viavirtualearth (thanks Casey) i got clustering working server side.
    Checkout the screen shot of 100,000 points clustered over the middle east.
    http://soulsolutions.com.au/megacluster.jpg

    Lots of tweaking to go buts it is incredibly fast server side already. Client side is the killer, the javascript rendering time for the above screen shot was > 30sec. Turning off the labels helped but i think VE just won't work with too many points.

    So let me know if your interested and i'll turn this into an article for Viavirtualearth. Otherwise I'll assume no-one else wants to plot 100,000 points in VE ;)
    John.
  • Friday, August 04, 2006 7:33 PMIonGuy Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Did you ever post the article?  I am very interested in this approach.  I would like to modify it to work with a c. 30,000 point polyline to display a high-resolution track (at high zoom levels) without overwhelming the client or the server (at low zoom levels).

    Regards,

    - John
    IonGuy

  • Saturday, August 05, 2006 6:25 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    You're the first to ask.

    I'll put something together and send it to Dr Neil over at viavirtualearth.com within the next few days.

    I'm using it to plot map replays of thousands of point so adding some polylines shouldn't be in issue.

    Its all vb.net based on the AJAX example from channel 9. Maybe a good place to start is to watch their articles on VE while I put this together.

    John.

     

  • Thursday, August 10, 2006 3:34 PMCudknees Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I am also very interested in seeing your megaclustering solution.
  • Friday, August 11, 2006 10:50 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer
    Indeed. It has just been posted at via virtual earth
    http://viavirtualearth.com/VVE/Articles/Clustering.ashx

    All the code is there.
    John.

    Soul Solutions - Virtual Earth and Dotnetnuke

  • Friday, February 09, 2007 9:31 PMSKMcc Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    How do I debug eval(result) errors. I can run your code perfectly in the VB.NET development environment but it breaks when I deploy the code to an IIS server.
  • Saturday, February 10, 2007 2:20 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    I highly recommend Firebug for firefox. Use the "NET" tab and look at "XHR" to see all the headers and responce for every ajax request.

    http://www.getfirebug.com/

    Send me an email if you want the source code from that article, it is a little old and an improved version (100x better) in c# and ASP.NET AJAX is on its way (its drafted just need some spare time to finish it!)

    John.

  • Monday, July 30, 2007 8:05 AMBunty120 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

     

    Hello John,

                          I am doing a project where we are converting .shp file to .xml and displaying it on the VE map (Version 5). Can you tell me briefly or send me some coding for how to implement clustering of Georss feeds i.e.the XML file to display it on the map.Any quick help will  be greatly appreciated. I am developing the application using ASP .Net C#. Plz help

     

    Regards,

    Vishal

  • Monday, July 30, 2007 8:07 AMBunty120 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Hello John,

                          I am doing a project where we are converting .shp file to .xml and displaying it on the VE map (Version 5). Can you tell me briefly or send me some coding for how to implement clustering of Georss feeds i.e.the XML file to display it on the map.Any quick help will  be greatly appreciated. I am developing the application using ASP .Net C#. Plz help

     

    Regards,

    Vishal

  • Tuesday, October 23, 2007 5:00 PMmsdv Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

     

    Hi,

     

    I implemented the Clustering solution from your articles. It works just great. Thanks for the post.

     

    I've another question for you if you don't mind.

     

    I want the map to be zoomed to a certain zoom level when its first plotted, so i can see the map at Street level.

    But this messes the Clustering logic because Clustering is determined based on zoom level and I'm forcing the map to zoom to a particular zoom level.

     

    Is there a solution for this? If so, can you please post it here?

     

    Thanks.

     

     

  • Wednesday, October 24, 2007 1:02 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

     

    If its just the initial view then in the js code behind file - default.aspx.js when you set the initial values:

     

    Code Block
    function Page_Load() { 
        var mapArgs = new SoulSolutions.Demo.MapArgs("myMap", new VELatLong(-27.5, 137), 4, VEMapStyle.Hybrid, false, VEMapMode.Mode2D, VEDistanceUnit.Kilometers);
        map = new SoulSolutions.Demo.Map(SoulSolutions.ClusterArticle.MapService, mapArgs);
    }

     

     

     

    The 3rd parameter (4) is zoomlevel - change it to anything between 1-17.

     

    Also if you ever need to refresh the data manually you can call:

     

    map._GetPinData();

     

    Although if you needed to do this I would drop the underscore to make note it as a public method or create another. In most of the work I do other components typically raise change events that cause the map to refresh, like filters on dates or categories for the data shown.

     

    John.

  • Wednesday, October 31, 2007 5:39 PMmsdv Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

     

    Hi,

     

    Thanks for the reply. I'm sorry as I'm replying back too late.

     

    The issue I'm running into is:

    For the initial view, I have to pass default bounds and default zoom level to get the visible pins in the Cluster.

    Now If I hard code the zoom level (=16), I even have to hard code the bounds (by calculation). This would not work with different maps. Ex. default bounds for a NewYork map at zoom level (=16) would be different than for LosAngeles map at the same zoom level.

     

     

    Please help.

     

    Thanks.

  • Wednesday, October 31, 2007 10:34 PMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    The Map._GetMapData() function in the javascript calulates the bounds for you, these is done based on if your in birdseye or not and it calls the web service for you.

    I don't think i understand the issue, do you change the values supplied in the mapargs? If you want to set the inital view programically from the server side you need to add some varibles in javascript that are populated by asp.net, the syntax in your aspx file is something like:

     

    Code Block
        <script type="text/javascript">
        var cLat = '<% =Latitude %>';
        var cLong = '<% =Longitude %>';
        </script>

     

     

     

    Then you can pass these varibles into the mapargs instead. I hope I'm on the right track for your question.

    John.

     

  • Thursday, November 01, 2007 11:34 PMmsdv Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

     

    Hi, thanks for the reply. Let me explain you again.

     

    To apply Clustering, I need the the lat/lon points collection, zoomlevel and bounds(NW-SE).

    Zoomlevel (=4) displays all US states, but at zoomlevel(=16), I'm looking at either LA or NY or any other city and hence the bounds value here could be different based on the city I'm looking at.

     

    In this case I'm not able to send the bounds to Clustering logic along with the zoomlevel, since zoom(=16) could be any city and could have different bounds.

     

    Please help.

     

    Thanks.

     

     

     

  • Friday, November 02, 2007 6:26 AMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    The demo shows a setup where everytime you move the map it retrieve just the data needed for that screen view.

    So if the map was positioned over NY it requested from the clustering logic the data within those bounds, if you panned over to LA it would request new data based on the new bounds.

    It could be benifical to your application to retrieve all data for a given zoomlevel at once, I could see this where you only had a several hundred points. To enable this you would only request new data on zoom level change, remove the bounds logic completely and remove the whole pin differential logic on the client side as it would not be needed. You could also cache the results for each zoom level to speed it up.

    John.

  • Monday, November 05, 2007 6:30 PMmsdv Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Is there a way to get zoomlevel based on bounds? Plz help.
  • Monday, November 05, 2007 6:51 PMmsdv Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Sorry, Rephrasing again.

     

    Is there a way to get zoom level based on bounds server side.

     

    I got my custom bounds, I just need some method which could take the bounds object and return back zoom level.

  • Monday, November 05, 2007 11:56 PMSoulSolutionsMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    The bounds are effected by screen resolution, there is no way to calculate the zoomlevel from this unless you knew the screen resolution. I can only assume you are doing something different with the code then the example, Although it maybe clear to you I have no idea what your up to. My guess is your tring to prerender all the clusters in advance? This is a great idea but wouldn't you just loop through all zoom levels - 1-19 to acheive this?

    John.