How to find out what was labeled and what changed since a label?
Hi,
We have a case where people sparse-label files under $/Project/Trunk/.
So under there v1.0 might label file1.cpp, v1.1 might label file2.cpp and file3.cpp, etc. A label signifies ready for QA, and perhaps release if testing goes well.
The question that comes up often is: I've changed File5.cpp, or a bunch of files under Trunk/Folder1; when is the last time they got labeled? That tells us whether it is safe to label certain changes for a release. For example, if I changed one line of code I would be comfortable labeling my change, but if that file has not been labeled for a release in months then obviously we need to schedule a lot more testing.
For a specific set of files or a directory, how can we produce a list of changesets that have been checked in since the most recent (specific) label for an item/directory? (First we need to find the last label that we want for a file or set of files under a directory, then find all the changes since that label.)
In VSS the history dialog would easily show this. In TFS this seems nearly impossible, or at least a lot harder than it should be.
I know the first suggestion will be to lose sparse labeling, but even if we switched to using a branch and merged changes, this information really doesn't get any easier to find. Then you need to know what got merged or not, then find out from the merges what is the last thing that got labeled, and then find out what changed (so it gets even harder).
We are using TFS2008 SP1.
Is there another way to solve this, or a workflow change that would help?
Thanks.
Answers
- I think we're on the same page. The power is there to achieve everything you want & more, but the system does very little to guide you into the "intended" usage where it shines. To me (and other folks on the original TFS v1.0 team) it feels natural because branch-based SCM was already ingrained in the culture -- certainly by comparison, since the tool they used previously had no UI at all!
There are really 2 parts to usability:
1) Making sure the mainstream workflow is discoverable, intuitive, clean, etc.
2) Keeping you away from frustrating situations.
Merging in 2005/2008 does a mediocre job on #1 and #2. The workflow is not too difficult, but if you're coming from VSS you won't figure it out without reading things like the Patterns & Practices guidelines (not included on-disc last I checked). Merges are committed atomically like everything else, preventing the most egregious user errors, but there are lots of quirky edge cases + outright bugs in the first couple releases.
Labels in 2005/2008 are pretty good on part #1 but downright awful on #2. The system lets you do anything & everything, making it nearly impossible to keep a handle on your labels unless you're very disciplined.
2010 should greatly improve merging on both counts. There's a ton of investment in UI, the P&P docs are much improved, and the remaining edge cases should degrade gracefully. Meanwhile, it makes labels somewhat prettier but does absolutely nothing for #2. The core concept that "any label can contain any combination of item-version pairs, at any scope, and can be changed anytime with no record of what happened" is staying the same as far as I know.
I think this SQL does what you want:
use TfsVersionControl
go
declare @c as int
set @c = 3712
select @c as Changeset,
l.labelname as LabelName,
n.fullpath as LabelScope,
l.comment as LabelComment,
v.fullpath as ItemPath,
le.recursive as FolderChildrenAlsoInLabel
from tbl_labelentry as le
inner join tbl_version as v on le.itemid = v.itemid and le.versionfrom = v.versionfrom
inner join tbl_label as l on le.labelid = l.labelid
inner join tbl_namespace as n on l.itemid = n.itemid and n.islatest = 1
where le.versionfrom = @c
No promises on efficiency, though it works pretty quick for me. If you want the recursive folders expanded, you'll need to add more code...either a UNION ALL with the results of func_GetVersionedItems(v.fullpath, 1), or maybe a CASE WHEN + subquery in place of the join to tbl_version, or maybe an outer select that wraps this whole statement. Not my area of expertise :)
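As an alternative to expanding recursive folders in SQL, the expansion could be done on the client after the query returns. The following is only a sketch, not part of the original answer: it assumes the rows have already been fetched into Python, that ItemPath and FolderChildrenAlsoInLabel mean what their column aliases suggest, and that a recursive entry covers every path beneath the folder. The LabelEntry type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelEntry:
    """One row of the query result (hypothetical client-side shape)."""
    label: str        # LabelName column
    item_path: str    # ItemPath column
    recursive: bool   # FolderChildrenAlsoInLabel column


def covers(entry: LabelEntry, path: str) -> bool:
    """True if the label entry applies to `path`.

    Paths are compared case-insensitively. A recursive entry is assumed
    to cover everything below its folder; a non-recursive entry covers
    only the exact item.
    """
    entry_path = entry.item_path.rstrip("/").lower()
    candidate = path.rstrip("/").lower()
    if candidate == entry_path:
        return True
    # The trailing slash prevents Folder1 from matching Folder10.
    return entry.recursive and candidate.startswith(entry_path + "/")


def labels_for(entries, path):
    """Sorted names of all labels with an entry that covers `path`."""
    return sorted({e.label for e in entries if covers(e, path)})


entries = [
    LabelEntry("v1.0", "$/Project/Trunk/Folder1", True),
    LabelEntry("v1.1", "$/Project/Trunk/file2.cpp", False),
    LabelEntry("v1.2", "$/Project/Trunk/Folder", False),
]
print(labels_for(entries, "$/Project/Trunk/Folder1/File5.cpp"))
```

Doing this on the client keeps the SQL simple, at the cost of not checking whether a file actually existed under the folder at the labeled changeset.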
- Marked As Answer by WXS123 Thursday, November 05, 2009 6:44 PM
All Replies
There is no direct way to do this within Visual Studio. However, from the command line you can do this provided you know the name of the last label. Let's suppose the last label name is LastLabel.
From a Visual Studio command prompt (Start, All Programs, Microsoft Visual Studio 2008, Visual Studio Tools, select Visual Studio 2008 Command Prompt)
cd to a folder in your workspace that you want to see changes since LastLabel
tf history .\ /r /version:L"LastLabel"~T
The history command above will show you the changes from LastLabel to the tip (or latest). You can then look at those changesets to see what has changed. You may also want to try the following alternative command:
tf history .\ /r /format:detailed /i /version:L"LastLabel"~T
The above command will display the changeset details to the console.
Ed
http://blogs.msdn.com/edhintz
- Marked As Answer by Ed Hintz MSFT, Moderator Tuesday, November 03, 2009 2:40 PM
- Unmarked As Answer by WXS123 Tuesday, November 03, 2009 3:02 PM
- Couple issues with this.
1. We don't know the last label; we need to find that by file or folder.
2. Those history commands are inclusive, so they don't return just the new changesets; they also return the ones included in the label.
So this does not work. It also seems inaccurate for some reason: the answer should have been only 3 changesets (we believe), but it displayed 11. So I'm not sure what it is picking up.
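The inclusive-range problem in point 2 can be worked around by post-filtering once the per-item labeled versions are known. The sketch below is illustrative only: it assumes the changeset numbers for an item's history and the changeset each label captured for that item have already been collected by some other means (API or SQL), and it treats "most recent label" as the one holding the highest changeset, which need not match the order in which labels were created. Both function names are hypothetical.

```python
def latest_label(label_versions):
    """Pick the label that captured the highest changeset of an item.

    label_versions: dict mapping label name -> changeset number of the
    item version that label contains. Returns (name, changeset) or None.
    """
    if not label_versions:
        return None
    return max(label_versions.items(), key=lambda kv: kv[1])


def changes_since_label(history, labeled_at):
    """Changesets strictly newer than the labeled version, newest first.

    The strict comparison is what drops the changeset that the label
    itself already contains.
    """
    return sorted((c for c in history if c > labeled_at), reverse=True)


# Hypothetical data for a single file.
history = [3700, 3705, 3712, 3720, 3731]
labels = {"v1.0": 3705, "v1.1": 3712}

name, labeled_at = latest_label(labels)
print(name, changes_since_label(history, labeled_at))
```

For a folder, the same two steps would run per file, since under sparse labeling each file can have a different most recent label.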
This is a HUGE issue; we need to be able to answer these types of questions. Other source control products like Clearcase, Perforce and MKS, and even VSS, can answer these.
VSS used to be able to do version history with labels on files or directories that would have answered these questions in a reasonable time (seconds on a file and about a minute and a half on folders).
Other ideas?
Thanks
- This request is going to be super tedious from the command line; borderline impractical. (As you've seen, the "obvious" route won't cut it.)
You can do it from the API. Should be fairly straightforward to code, but I don't think it makes the job of maintaining all these sparse labels any easier. No matter how friendly and bug-free the tool you write is, your users are bound to make lots of mistakes. How do you define relationships between the labels? How do you delegate responsibility for a given label's quality? When looking at a sparse label, how do you distinguish cases when absence of a file meant "I intentionally don't want this file" from cases where the file simply didn't exist yet?
Here's a question: how are you doing builds? Getting Team Build to understand and work in harmony with your labeling scheme sounds like a nightmare. I'm not super familiar with NAnt or CC.Net but I can't imagine they'd be much better. What do you use currently? That will help lead to a recommendation that makes sense for your workflow.
In the meantime, some pure opinion: use branches. TFS's merge history tracking is very robust. In 2008 you can only browse it from the command line / API, but you still get the huge benefit of letting the server's logic decide when it is safe to automatically pend certain kinds of changes -- and when it isn't, intelligently provide the closest common ancestor for client tools to work with. Questions like "which changes haven't been merged yet" and "which changes were merged & when" can be answered with 1 simple command line, or if you prefer, 3rd party UIs like TFS Sidekicks.
As soon as you upgrade to TFS 2010, all of that historical information in the database "lights up" in Visual Studio's new WPF UI as well. 2010 also brings new APIs that allow you to answer more complicated questions like "show me all of the branches that this change has/hasn't been pushed to, when, & how" -- along with corresponding new visualizations.
Let's assume we did branches. That really doesn't make the situation any better, as we still need to use labels on branches. (One of our products does not use sparse labels and does use branching with full labels on the branch.)
We do about 600+ production labels a year (that doesn't even count the ones that never make it to production) and our code base is fairly large, which makes a branch per label impractical because of creation time and storage.
So if we branched we would use a single branch, and we would have to label the branch when we had a QA version ready so that we could pull out earlier versions. If we did this, the merges command (command line only; it shows all merges and is a pain to parse) tells us what merged, but not what made it into a particular label on the branch.
So how, on the branch, would I see what file already got labeled at a particular point and see if it was safe to merge and label an item from the trunk? (Still the same problem, if not worse.)
The TFS2010 UI and even the command line still don't seem to be able to answer these questions either. The UI shows labels but not recursively, so if you had to find out everything under a directory you're stuck. Also it doesn't even show what version of a file/directory got labeled, just that that singular item did get labeled.
I've also tried using the API on TFS2008 to pull this off. It's too slow, as recursive labels are not implemented in the API, and it's a pain because we need to run the query twice (once for labels at root scope and once for labels at team project scope) and then combine the results, adding even more time.
Other tools seem to provide this feature: Clearcase, Perforce, MKS and even VSS...
Any other ideas to help this situation (API, workflow, etc.)? We're stuck...
Thanks.
- I think "use branches" means something very different to you than it does to me. My perspective:
* Marking code as "ready for QA" is represented in the system by merging it from a less stable branch to a more stable branch. SCM best practice is for merges in this direction to (1) be direct copies, with the dirty work of integration already pieced together & tested in the more unstable branch (2) bring all of the new development as an atomic unit, so that the configuration in the more stable branch matches the tests you've already done as closely as possible. However, TFS doesn't force either behavior on you. If the merge encounters nontrivial conflicts it will handle them (far better than trying to piece together labels, certainly), and it supports cherry-picking individual changesets. While I do agree with the established wisdom on this point, you could certainly continue to cherry-pick changes the way you do now. If anything, representing code promotion as merges should improve your QA by simplifying the build/deploy/test cycle. I don't know how you manage it today, like I said before, but I do know that configuring Team Build to automatically build/deploy/test your code as soon as it's checked into the QA branch is very straightforward -- much more so than trying to make it understand labels.
* Designating which units of code constitute a release is represented by merging them into further more stable branches. There are many strategies here with various pros & cons. Sometimes these "release branches" are permanently fixed at the time of deployment; sometimes they're reused. After a release is done you can make it the new mainline and repeat the process in so-called staircase fashion, or you can merge everything back to the original mainline indefinitely. And so on. Frankly, even the most ardent advocate for branching is not going to recommend creating 600 new ones every year. Branch-per-release has its place in some circumstances, but there are many more combinations of these ideas that would suit you better.
Once information about QA cycles and releases is captured by your branch structure, you shouldn't need labels at all. All your questions can be queried directly from the history metadata. Which changes were promoted to QA? tf merges Dev QA /r. Which changes were not? tf merge Dev QA /r /candidate. Which changes went into the most recent release? tf history Release /r /stopafter:1. And so on.
Of course, I'm just scratching the surface, and I'm too tired to draw the necessary diagrams in ASCII art. Try these pretty pictures: http://www.accurev.com/whitepaper/vendor_code.htm Ignoring the marketing pitch, my key takeaway is don't allow your workflow to bog you down in micromanagement. Bad as they try to make the situation in Figure 3 sound, I can't even fathom trying to do it with labels! Figures 4-7 clearly illustrate the important principles of letting static code inherit from stable branches (left-hand side) to unstable ones (right) while changes flow in the other direction.
The best overall explanation of how these topics fit together comes from a Google Tech Talk: http://video.google.com/videoplay?docid=-577744660535947210&ei=BAu0SqXsD5bWrQLz7Zj_AQ&q=laura+wingerd&hl=en#
Note that while Laura Wingerd is an evangelist for Perforce, which like you say supports labels, the word "label" doesn't appear a single time in the slide deck :)
- Proposed As Answer by Ed Hintz MSFT, Moderator Wednesday, November 04, 2009 2:29 PM
Hi,
Thanks for taking the time to think about our problem in depth.
I read through the links and watched the video. The videos are good for concepts. What is typically left out, as is the case in those, is tracking actual deliverable packages (whether to QA/Beta/Release, etc.) in the promotion model, which gets back to our issue.
Say we used a Dev, QA and Release promotion model: merge from Dev (Trunk) to QA, and then when that is good, merge to Release. Without labels how would you rebuild or possibly branch out a previous version if you did need to revert to an old release to base changes on that? You need something (a label) to capture the points in time that denote builds/packages, so you can go back to them even in the promotion model.
So how does that model actually answer the question: did this file in Dev get into this particular label on QA or Release?
- I could tell when a particular changeset was merged to QA, but I wouldn't know what package off QA it was built in without a label. It is also fairly painful since merge history is only available from the command line (unlike other products, again Clearcase, Perforce and MKS, all with UIs for this that show merging AND label information).
- Same issue for release: what labeled package did it make it into? I might know something was merged or not, but not what build or package it made it into without a label of some sort. Plus it's harder for a release branch, as now I have to run two tf merges and read through the output to find the changes, and I still don't know what package it made it into for testing in QA or release.
I think for one product we follow a modified promotion mechanism.
First some background:
- Our shop comes from VSS where branching was painful and most developers are very branch averse
- VSS was able to support the current model with full or sparse labels with history
- Merging is generally avoided because of the added danger it brings, given the risk-averse nature of our products
- Developers have never needed to and do not want to always have to merge to make code changes or release code, it is viewed as significant added overhead and potential for merge dangers/issues.
- Developers want a model that reduces merging and branching
- We haven't been able to use Team Build, as TFS2005 didn't support queuing, the VS2005 TFS client doesn't support queuing, and we use VS2005 (and VS2008). Queuing is essential as our builds require specific drive mappings, and obviously a drive letter can only be mapped/subst'ed to one location at a time. So until we can guarantee queuing all the time we can't use it.
We presently have what you could consider a Dev, QA/Release promotion model.
$/Project1/Trunk we consider Dev and
$/Project1/QARelease we would consider a QA/Release branch.
So once a change is determined to be ready in trunk, they will merge it into the QARelease branch. Either intraday, at end of day, or maybe in a day or two, a label will be applied and it will be built from the label. This way we can always go back to an old version, or branch from old versions if needed. But it still is very hard to determine what label/package a particular code change made it into. The label is tested in QA until it is ready to release. If it's not ready, they don't release it; if it is, we know that label is good and we have nothing to do but deploy it. It does mean that branch is a cross between QA and production, but since we release so often, rather than stabilizing branches we push just minimal changes to that branch, test them, then release.
(I know there are lots of recommendations to procedures different than this but I think the basic premise in all of them still holds)
So unless we use a branch exactly like a label (which really is impractical for cherry-picked changes due to size, number and scope), I think we are back to needing labels to track deliverable packages, even with the promotion branch model, as I still want to know what specific build/package got a change or not.
How can we track what change makes it into a specific package without labels, or without using a branch like a label (which, as I mentioned, is so impractical it might as well be impossible)?
Thanks
- Edited by WXS123 Wednesday, November 04, 2009 5:23 PM
- I've given up on the available APIs to get this information as they are way too slow.
I'm trying to find a direct SQL query that could look up labels by changeset and return the label name, the changeset, the scope of the label and the full path of each item labeled at that changeset.
This doesn't look right yet: the full path does not seem unique and I can't find scope information in the tables.
select tbl_Changeset.ChangeSetId, tbl_Label.LabelName, tbl_Version.FullPath
from tbl_Changeset, tbl_LabelEntry, tbl_Label, tbl_Version
where ChangeSetId = dbo.tbl_LabelEntry.VersionFrom
and tbl_LabelEntry.LabelId = tbl_Label.LabelId
and tbl_LabelEntry.ItemId = tbl_Version.ItemId
and tbl_Changeset.ChangeSetId = 17
Then in theory if I use the API to pull changeset history I could do the lookup for each changeset and display the related labels and the items labeled at that changeset.
Ideas on how to fix this query?
Thanks
- >> Without labels how would you rebuild or possibly branch out a previous version if you did need to revert to an old release to base changes on that? You need something (a label), to capture the points in time to denote builds/packages to allow going back to them even in the promotion model.
You're thinking in the right direction. Once you use branching & merging to represent your configurations*, that's 90% of the battle. If you like, you can still use labels as a way to give a friendly name to a changeset that represents an important milestone. In this scheme a label always contains a complete and consistent snapshot of a branch at some point in time. Given those invariants labels become much, much easier to work with. You lose some power, but that's not a bad thing in my opinion -- after all, the simplicity of VSS's labels is why you adopted them in the first place.
Again, though, this isn't anything you can't capture with merges alone. Say you need to track "builds sent to QA" and "released packages." Each could be represented by a special branch where there's a 1:1 relationship between checkins (i.e. incoming merges) and the milestones being recorded. Changeset comments take the place of label names, ordinary history lookups take the place of QueryLabels, and merge history tracks the relationship between each configuration.
In the label-free case, the baseline for our Dev branches is probably called Integration or Main instead -- the special QA branch is a child of it. As before, new features are copied up to Main. Any last minute integration issues that aren't part of the final build sent to QA are fixed here, as are direct changes requested by QA that can't wait for the next push from Dev. Only the final build is merged into the QA branch, and nothing is ever changed in there directly. Similar rules would apply to the special Release branch.
Let's answer your query in both models:
>> did this file in Dev get into this particular label on QA or Release?
A) Compare the changeset # of the Dev version in question to the changeset # of the file in the desired label. If the latter number is higher, it was included. If it's lower, or the label doesn't contain the file, it wasn't included. (To be completely accurate, you'd want to call QueryMergesWithDetails aka 'tf merges' anyway. The client can't reliably convert the filename in Dev to the filename in QA/Release on its own, since renames in one branch might not have propagated to the other.)
B) Call QueryMergesWithDetails twice: find when it was merged to Main, then find when that changeset was merged to QA/Release. Compare the latter to the changeset that represents the desired milestone.
In 2008 I'd say they are about the same -- both have kludgy bits but get the job done. In 2010, B becomes the clear winner since 1 command gets you "where have I been merged" info for all branches at once, there are rich Branch Visualizations to track the flow of change, and the newly 2-dimensional History viewer is perfectly suited to the merges-as-milestones concept. Label UI is somewhat better but not nearly to the same degree, and you still have to rely on merge metadata for robustness.
Couple implementation notes:
- TFS doesn't have a way to stop users from creating sparse and/or inconsistently-dated labels. It also doesn't have separate permissions for creating labels vs updating existing ones. You're kind of on your own when it comes to enforcement.
- While not necessary for your particular queries, it would sometimes be helpful to know exactly what changeset (and therefore timestamp) a label corresponds to. Luckily you don't need to enumerate all of its items and choose the highest version #, which would be annoying and slow. Instead, call QueryHistory with a label versionspec and StopAfter=1.
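Approach B above amounts to following a changeset across two hops of merge history and comparing the result to the milestone changeset. The sketch below shows only that comparison logic. It assumes the merge history for each branch pair has already been retrieved (for example from 'tf merges' or QueryMergesWithDetails) and reduced to (source changeset, target changeset) pairs; that shape and both function names are hypothetical, not the API's actual return types.

```python
def merged_as(merges, source_changeset):
    """Earliest target changeset that carried `source_changeset`.

    merges: iterable of (source_changeset, target_changeset) pairs for a
    single source/target branch pair. Returns None if never merged.
    """
    targets = [target for source, target in merges if source == source_changeset]
    return min(targets) if targets else None


def reached_milestone(dev_changeset, dev_to_main, main_to_qa, milestone_changeset):
    """True if a Dev changeset arrived in QA at or before the milestone."""
    main_changeset = merged_as(dev_to_main, dev_changeset)
    if main_changeset is None:
        return False  # never left Dev
    qa_changeset = merged_as(main_to_qa, main_changeset)
    return qa_changeset is not None and qa_changeset <= milestone_changeset


# Hypothetical merge history: Dev 101 and 102 went to Main in 110, Dev 105 in 120.
dev_to_main = [(101, 110), (102, 110), (105, 120)]
main_to_qa = [(110, 115), (120, 130)]

print(reached_milestone(101, dev_to_main, main_to_qa, milestone_changeset=115))
print(reached_milestone(105, dev_to_main, main_to_qa, milestone_changeset=115))
```

The milestone changeset itself could come from the QueryHistory call with StopAfter=1 described in the note above.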
*in the SCM sense of the word. AccuRev uses the term directly. In Wingerd's presentation it's the state (active vs inactive vs private) of the modules in a codeline; don't remember if she names it.
"Using a branch like a label" should not add any overhead. It sounds like you only have 1 Dev branch, and the labeled configurations therein are never merged together; you just keep committing new changes in-place. Representing those same configurations using branch-and-merge doesn't change anything! So long as code flow remains unidirectional, every "merge" is simply a copy. Neither TFS nor the developer is piecing together files; you're just copying things around. Of course, the capability is there to do much more with your branches. Once things get more complex, you may indeed want to merge, and should follow guidelines such as Wingerd's "rules of the road" when you do. But since we're talking about a direct replacement for labels, the only thing that should change is which dialog you click-click-click through.
Again thanks for a well thought out response. It seems we are having to do a lot of process change to work around functionality that is available in other competitive enterprise products.
"Again, though, this isn't anything you can't capture with merges alone. Say you need to track "builds sent to QA" and "released packages." Each could be represented by a special branch where there's a 1:1 relationship between checkins (i.e. incoming merges) and the milestones being recorded. Changeset comments take the place of label names, ordinary history lookups take the place of QueryLabels, and merge history tracks the relationship between each configuration."
For the above comment, we've actually measured the overhead. We are not sure if it's all temp tables or actual metadata, but one of our branches amounts to about 0.5GB, I think, which would bring a year's branches to about 300-400GB. Using labels there is no such overhead, or very minimal, but we have the tracking issues, and other products (again Clearcase, Perforce and MKS) all seem to support the label lookups.
On these items:
>> did this file in Dev get into this particular label on QA or Release?
A) Compare the changeset # of the Dev version in question to the changeset # of the file in the desired label. If the latter number is higher, it was included. If it's lower, or the label doesn't contain the file, it wasn't included. (To be completely accurate, you'd want to call QueryMergesWithDetails aka 'tf merges' anyway. The client can't reliably convert the filename in Dev to the filename in QA/Release on its own, since renames in one branch might not have propagated to the other.)
B) Call QueryMergesWithDetails twice: find when it was merged to Main, then find when that changeset was merged to QA/Release. Compare the latter to the changeset that represents the desired milestone.
For A: I don't know the changeset number of the desired label, as I don't even know which label I want. I want to find which labels included which changesets for particular files/directories, so I don't have anything to compare lower or higher to. I want to know which files/directories in this directory, at which changeset levels, made it into which particular labels on the QA/Release branch.
For B: Yes, we could find out what got merged to the QA branch, but that only gives me a changeset number. I still need to find out which changesets got labeled on a directory/file.
On this item I found interesting:
"While not necessary for your particular queries, it would sometimes be helpful to know exactly what changeset (and therefore timestamp) a label corresponds to. Luckily you don't need to enumerate all of its items and choose the highest version #, which would be annoying and slow. Instead, call QueryHistory with a label versionspec and StopAfter=1."
This is interesting, though since I don't necessarily know the label up front, I need to be able to find the labels on the directory first before I can use this. It is an interesting idea for finding the latest changeset, though; we might be able to use that in other ways...
So to answer the original question we first need to figure out what got labeled across everything, which I think precludes solving the queries we are presently looking at with the above. TFS2010 does at least show labels, but not which version got labeled and not recursively, so even the new client is not very helpful for resolving those questions.
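For anyone wanting to try the quoted QueryHistory idea without writing API code, a rough sketch using the tf history command line might look like the following. It only builds the argument lists and does not run anything. It assumes that a label versionspec (L followed by the label name) combined with /stopafter:1 behaves like the QueryHistory call described in the quote, and that a Llabel~T range lists changes from the label to the tip (most likely including the labeled changeset itself). The helper names and the sample path and label are made up for illustration.

```python
def tf_history_args(item_path, version_spec, stop_after=None):
    """Build (but do not run) an argument list for `tf history`."""
    args = ["tf", "history", item_path,
            "/version:" + version_spec,
            "/recursive", "/noprompt", "/format:brief"]
    if stop_after is not None:
        args.append("/stopafter:%d" % stop_after)
    return args


def label_changeset_args(item_path, label):
    # Label versionspec plus stop-after-1: newest changeset the label
    # corresponds to (assumed equivalent to the QueryHistory call quoted above).
    return tf_history_args(item_path, "L" + label, stop_after=1)


def changes_since_label_args(item_path, label):
    # Range versionspec from the label to the tip (T = latest version).
    return tf_history_args(item_path, "L%s~T" % label)


if __name__ == "__main__":
    # Hypothetical path and label, for illustration only.
    print(" ".join(label_changeset_args("$/Project/Trunk", "v1.0")))
    print(" ".join(changes_since_label_args("$/Project/Trunk", "v1.0")))
```

The lists can be handed to subprocess.run on a machine with tf.exe on the path. With sparse labels the result only reflects items that are actually in the label, so the usual caveats from this thread still apply.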
Thanks again for the very thoughtful responses.
I'm hoping we can figure out some ideas to solve this usage issue; we do seem to be having to work awfully hard due to features that are missing from TFS but available in competing products.
- >> For the above comment, actually we've measured the overhead and we are not sure if its all temp tables or actual meta data but one of our branches I think amounts to about 0.5GB which would bring a years branches to about 300-400GB.
Wrong on a couple counts.
1) From a technical perspective, the cost of a branch is roughly (files per branch * workspaces that map the new branch)/2 KB. If a branch contains 60k files and will be used by 100 people, that's only 3MB. In reality only 10 of those people might be release engineers who need to maintain the special release branches, making the cost an order of magnitude lower still. That calculation doesn't include any new revisions you commit to the branch, true, but it's no different from committing them to the old branch. There's some additional overhead for merge history, and for files whose contents continue to diverge over time, but they'll be minimal if you keep your existing workflow (1 dev branch, all merges are copies).
2) I still don't think you understand my proposal. I'm NOT advocating branch-per-release. I'm talking about merge-per-release. And since you never integrate >1 codeline (so far), it's actually even simpler than that: copy/overwrite-per-release. (Substitute "send build to QA" for "release" where appropriate.) All told, you should only need ~4 branches total to represent the various states that code passes through in its lifetime. Once they are in place, creating a new branch should be a rare and deliberate event.
>> I don't know the changeset number of the desired label as I don't even know which label I want to find which labels included which changesets for particular files/directories. So I don't have anything to compare lower or higher to. 
I want to know what files/directories in this directory at which changeset levels made it into which particular labels, on the QA/Release branch.
I need to be clearer. Let's back up. In Scenario A, the QA branch receives various bits of code that are copied up from Dev (and nothing else -- in SCM terms the QA codeline is "complete" but entirely composed of "inactive modules"). It's true that looking at its changeset history doesn't give you as much info as you'd like; most of these checkins don't carry any special meaning. Then occasionally (~300 times a year) a checkin brings the branch into a state where it's ready for QA. In order to record this milestone, you affix a descriptive label that spans every file & folder in the QA branch @ that point in time.
To review the list of builds you've sent to QA, you query for all labels that have been applied to the QA branch. (For efficiency, you can simply query on the root folder.) Let's say you scan this list and find that build label "X" is pertinent to some issue that QA is investigating. You want to know if change #1234 to file1.cpp back in the Dev branch was part of that build. QueryMergesWithDetails will tell you the path & version number in QA that corresponds to $/.../Dev/file1.cpp;C1234. QueryLabels with includeInfo=true will tell you the version number of QA\file1.cpp that got recorded in the label. Now you compare those two numbers.
Scenario B is similar except you use merges for both code flow & recording the final package, instead of switching to labels for the latter. The release process consists of a 2nd merge with a descriptive changeset comment instead of a descriptive label. Browsing past releases is done in the History toolwindow instead of the Find Label dialog. Lookups consist of a 2nd call to tf merges instead of 1 tf merges + 1 tf labels.
>> For B - Yes we could find out what got merged to the QA branch but so that gives me a changeset number. 
I still need to find out which changesets got labeled on a directory/file.
In Scenario B there are no labels anywhere. Changesets inside a Release branch correspond directly to a release, and vice versa.
>> This is interesting, though since I don't necessarily know the label up front I need to be able to find the labels on the directory first before I can use this.
When all labels are complete (not sparse) then this is easy. You should be able to query on any file inside the branch and get the same list every time, unless the file was created after the label. (As mentioned it's easiest to query on the root of the branch, which sidesteps this final snag.)
That's assuming we're taking Suggestion A, of course. I'm putting it out there as a middle ground, but I really think B is best suited to how TFS works now & in the future.
>> we do seem to be having to work awfully hard due to features that are missing from tfs but available in other competitive products.
TFS doesn't let you pin files or make partial checkins either :) If you look harder at today's best version control systems I think you'll find they too have left VSS style codeline management in the past where it belongs...
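The Scenario A comparison described above comes down to comparing two version numbers once they have been looked up. A minimal sketch of just that final step follows. It assumes the two numbers were already obtained from QueryMergesWithDetails and QueryLabels; the function name is hypothetical, and treating equal versions as "included" is an assumption, since the thread only says "higher".

```python
from typing import Optional


def change_in_label(merged_qa_version: Optional[int],
                    labeled_qa_version: Optional[int]) -> bool:
    """Decide whether a Dev change is part of a labeled QA build.

    merged_qa_version:  QA changeset that received the Dev change, or None
                        if the change was never merged to QA.
    labeled_qa_version: version of the QA file recorded in the label, or
                        None if the label does not contain the file.
    """
    if merged_qa_version is None or labeled_qa_version is None:
        return False
    # Included when the labeled version is at or past the merge target.
    # (Equality counted as included: an assumption, see lead-in.)
    return labeled_qa_version >= merged_qa_version


if __name__ == "__main__":
    # Hypothetical numbers: Dev change C1234 landed in QA as changeset 1300.
    print(change_in_label(1300, 1350))  # label taken after the merge
    print(change_in_label(1300, 1290))  # label taken before the merge
    print(change_in_label(1300, None))  # file is not in the label
```

Doing the lookup through the merge history first matters because, as noted earlier in the thread, a rename in one branch may not have propagated to the other, so the Dev path cannot simply be rewritten into a QA path.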
- I think I finally understand what you are getting at. I do think it is what I would call an interesting workaround to the tool missing features.
I say workaround because checkins clearly have no significance in themselves (as you said too) in terms of promotion, which other enterprise tools would track as exactly that. (I don't consider labels a bad way to manage code either, if they were implemented well. I do think most tools' implementation of them is not what it needs to be to make them truly reasonable, and that includes VSS and TFS plus many others, I'm sure.)
Obviously, tagging a checkin comment with important significance really is a workaround (the tool needs better support for this type of scenario so that it actually understands the significance of the promotion).
Thanks for the information on the branch cost... interesting. Maybe it was just the database that made the db size increase by 0.5GB when we did branches (could have been temp tables or something).
We'll have to think about this idea of a checkin instead of a label/promotion. Some initial thoughts/concerns, though:
- Our team doesn't work with a single integrator/gatekeeper for merging/promotion, to avoid bottlenecks; instead everyone merges their own changes. So there would be many checkins, and if the person merging forgets to comment or tag the checkin then that would be missed.
- This technique then relies on people remembering to add comments, or on coming up with some sort of automation which can automerge, which sounds like it can occasionally have multi-merge checkin issues.
- What if a merge requires multiple checkins due to deletes and adds? You would actually have two checkins to indicate when that happens.
One of our teams is attempting this by creating a merge file they would check in; all of the files listed in that merge file would then be merged and the auto script would tag them, again using the comment as the informational field... not ideal, but it would provide the information. They were going to avoid using merge, due to the problem of the command line potentially failing a merge, and make it just a copy and have the tool put the comment on the checkin, which is far from ideal as well.
Does Team build have a way to do promotion like this in some way?
Thanks very much for the interesting idea. Not sure if we could process wise or politically implement this as workaround. Have to think hard about it as it might let us survive with tfs for a while to see if we should make a decision to stick with it or consider some other tool options.
Microsoft though I think really needs to improve the tool to have a promotion option that does promote the code and record historically in history as a promotion (aka like a label/tag, etc showing up in history). Again other enterprise tools allow doing a single promotion aka Auto Merge and recording it as such. I don't see anything like that built into TFS yet, that could provide that. Using comments in tfs to manually record promotion so we can see it in history is a cool workaround but clearly not a promotion feature.
Do you have any ideas to get my SQL query to work I posted to pull out the information mentioned? I think that could work if I could find the info in the tables.
Thanks again, appreciate you sticking with it to get me to understand the idea at least.
- Edited by WXS123 Thursday, November 05, 2009 12:47 PM
- Checking in a merge absolutely has significance. When I explained how merges can represent code promotion, I wasn't trying to sound clever; it's standard practice. You've already seen how this works in Perforce and AccuRev. TFS is intended to function similarly, as seen in documents like http://tfsbranchingguideii.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=20785. Likewise Seapine: http://downloads.seapine.com/pub/papers/SCMbranchingmodels.pdf. And pretty much every vendor-independent study of the topic, e.g.: http://www.cmcrossroads.com/bradapp/acme/branching/branch-structs.html
Even when mixing merges with labels in Scenario A, you must have some criteria in mind for moving code from Dev to QA, otherwise why do it? Compared to ideal "best practices" your criteria might be a little fuzzy and the mechanics a little more dangerous than necessary*, but the principles are the same. While not every checkin to QA gets the full packaging treatment, those checkins are what signaled your intent to release the code at all. Tracking precisely what ends up getting released is a separate function -- similar in many ways, which is why it too can be modeled with merges -- but that doesn't take anything away from the overall promotion model.
As important as code promotion is to enterprise development, I do agree it's sad TFS makes it so hard to discover. If branching & merging metadata had been seamlessly integrated into the UI back in 2005 -- or if "Rosario" had shipped between VS2008 and 2010 as hoped -- then we almost certainly wouldn't be having this discussion. Once you're comfortable with branches, the idea of promoting via label sounds silly. However, lack of UI + bad experiences with VSS + the fact great docs like the ones I've linked are not included in the box = lots of people avoided branches. 
If you try to get by on labels alone it's very easy to conclude TFS isn't a serious SCM tool, when it's actually quite robust under the hood and has been from the very beginning. With luck 2010 can change that perception.
>> Thanks for the information on the branch cost... interesting maybe just database as to why db size increased by 0.5GB when we did branches (could have been temp tables or something).
May have been one-time growth -- SQL likes to keep a certain % of free space in its table files. Or maybe you did a lot of parallel development. Every time you commit something to the new branch that's *not* a direct copy-merge from the source branch then you have to add its [compressed] size to the total, of course. I left that out of the calculation since the proposal was strictly unidirectional code flow, but who knows what the situation was when you experimented before.
>> Our team doesn't work with a single integrator/gatekeeper for merging/promotion to avoid bottlenecks and instead everyone merges their own changes. So there would be many checkins and if the person merging forgets to comment or tag the checkin then that would be missed. This technique then relies on people remembering to do comments or coming up with some sort of automation which can automerge which sounds like there can occasionally be multi-merge checkin issues.
It shouldn't. Checkin comments are there to help you find things after the fact, just like label names. Imagine if you created a release label with a blank name: it would be hard to determine your intent, but the metadata would still show exactly which files you added & when. Same thing if someone commits a merge with an empty checkin comment. That code has now been promoted to the target branch, period, end of story. 
It's actually a stronger statement than any label (named or not) can be: every changeset becomes an irrevocable piece of history that's atomic across the whole server, while you never know if a label might be incomplete or internally inconsistent.
The only thing you're missing is whatever notes you'd append about the specific purpose of this promotion / release. Let's say you forgot to write "merging features #2 and #5 so Bob's team can QA them." It's now harder to search for all the releases that went to Bob...though if this is a common thing to search for, maybe Bob should have his own release branch. It's somewhat harder to see at a glance when feature #5 was merged, but you can always go to your work item tracking system and follow the links from there. Etc. The core info -- what versions of what files were promoted, to which state, by whom, & when -- is fixed forever as soon as you hit Checkin.
>> What if a merge requires multiple checkins due to deletes and add's you would actually have two checkins to indicate when that happens.
If you're in a situation like Scenario A it shouldn't matter -- individual checkins aren't intended to represent concrete releases, merely the promotion from less stable code to more stable code.
In Scenario B, I recommended sticking an intermediate "Integration" branch between the Dev side of the hierarchy and the Release branches where individual checkins are taken super seriously. Doing so is an absolute must when there is more than one Dev branch -- you need some place for them to meet in the middle and hash out their differences. It's also a recognition that nobody is perfect: sometimes regressions are copied up to a stable baseline and need to be fixed there directly, yet you don't want to pollute the history of Release itself. 
Consider the TFS "delete+add" quirk (which goes away in 2010, fwiw) yet another reason why it's a good idea to have a staging area as suggested.
Scenario C, btw, is to use one of the variations on branch-by-release. While I've criticized this approach in previous posts, it does have some advantages. One advantage is it's really easy to find the final build of a given release without relying on metadata (such as A's labels) or strict rules (such as B's "1 checkin per milestone" representation). You simply go to the branch associated with the release and get the latest version. You do end up with a *lot* of branches, but each branch has far fewer changes than in less branch-heavy strategies, so the net effect on DB size wouldn't be too bad. And since branches are treated like folders in the TFS version tree, you might actually find it easier to organize & find them than you would with label or changeset history.
Anyway, this rare quirk of TFS namespace operations probably doesn't deserve 4 paragraphs of elaboration. Just thought I'd demonstrate how flexible the overall concept of SCM-via-branches is. There are many possible strategies, each of them capable of accommodating common development snafus, if not always in exactly the same way.
>> One of our teams is attempting this by creating a merge file they would checkin and then all of the files from that merge file would be merged
I honestly don't understand this. Sounds like a way to do more work and get less value from it.
>> Does Team build have a way to do promotion like this in some way?
Not really. You could script something but it wouldn't be any easier than a standalone script.
The reason Team Build favors a branch-based promotion model is that its build definitions are based on paths, just like branches themselves are. 
It's straightforward to create a build definition that says "whenever someone commits code somewhere under the $/.../QA folder, compile it,** deploy it to this staging server, and run these tests against it." Making Team Build do something similar whenever someone added or updated a label with QA in its name would be very difficult.
>> Do you have any ideas to get my SQL query to work I posted to pull out the information mentioned? I think that could work if I could find the info in the tables.
Been quite awhile since I've peeked inside a TFS database. I'll have a look tomorrow.
*it's safer to use multiple Dev branches that can each be copied down as an atomic unit, rather than cherry-pick changes from one shared branch. I understand that would require bigger changes, politically and otherwise, which is why I didn't suggest it before.
**to be clear, "it" means "the entire QA branch" not "the new code." You might turn on incremental compilation for speed, but conceptually you're always looking at a complete, consistent snapshot of the branch itself. Labels can be complete & consistent, if you're careful, but TFS leaves it entirely to the user. Changesets force you to treat codelines as proper codelines.
I think many of those other tools support auto-merge promotion, making developers' lives a bit easier (plus they record those as promotions). So those documents are correct, though implementing and tracking this with TFS is a bit more thorny.
On TFS, my concern is with multiple people merging and a single person's merge being treated as a promotion point. We don't want to consider that the promotion point; ours is typically daily or every other day or so. We definitely have developers who don't add comments, or who add comments that no one would understand, which would make this method tricky. Also, as you said, TFS makes this information difficult to track as promotions. I don't think we really want to use a single merge as a promotion in our environment. As you said, introducing another branch as indirection might help (our developers are already merge and branch averse, so adding even more merging probably won't go over well, but we'll have the discussion; maybe if that were automated somehow).
Having to do a view history at the higher level, to find the latest changeset with the appropriate notes, just to figure out whether your file in a subdirectory is "promoted" also seems like something the tool should be doing for us. (I believe all of the other enterprise-class tools do.)
We are also not using TFS work items as they are not flexible enough to easily support the many audit requirements and special fields we need presently. That makes tracking changes also more difficult.
I think part of the aversion to branches in TFS compared to other tools is presentation. Most people can't deal with a lot of clutter, so the branch-by-release model in our scenario, with 600+ releases a year, would produce a lot of clutter. (This could be offset by moving things to folders, etc., but TFS even makes this hard: since folder version history is kept, just the act of moving items to new folders can make it difficult to retrieve history from before or after the move.)
Other products seem to narrow the information people need to look at by creating personal streams (like branches); people see only their own stream most of the time and promote back into another stream. The individual streams never show up in a standard view for everyone, so there's no clutter there, and they typically sort of disappear and fold in when promoted.
Other tools also offer automatic promotion (basically just copying and tracking the files, so essentially an auto merge) rather than treating it as a manual branch and merge, reducing the workload on developers/integrators doing promotions.
That, in combination with the TFS merge UIs being not so good, shall we say (and the underlying tracking of changed files, etc.), I think adds to the aversion to many of the branch-for-release and promotion-merge scenarios that would be more easily done in other tools.
Thanks again for all the pointers. We'll think on this a bit. TFS as a tool still seems a bit lacking in tracking the information we would want in these scenarios in a more enterprise SCM way, and the presentation makes using it a lot more cumbersome than I suspect it could be.
I'm going to have another look at the sql today to see if there is anything to do there as it looked pretty close, but not quite.
- I think we're on the same page. The power is there to achieve everything you want & more, but the system does very little to guide you into the "intended" usage where it shines. To me (and other folks on the original TFS v1.0 team) it feels natural because branch-based SCM was already ingrained in the culture -- certainly by comparison, since the tool they used previously had no UI at all!
There are really 2 parts to usability:
1) Making sure the mainstream workflow is discoverable, intuitive, clean, etc.
2) Keeping you away from frustrating situations.
Merging in 2005/2008 does a mediocre job on #1 and #2. The workflow is not too difficult, but if you're coming from VSS you won't figure it out without reading things like the Patterns & Practices guidelines (not included on-disc last I checked). Merges are committed atomically like everything else, preventing the most egregious user errors, but there are lots of quirky edge cases + outright bugs in the first couple releases. Labels in 2005/2008 are pretty good on part #1 but downright awful on #2. The system lets you do anything & everything, making it nearly impossible to keep a handle on your labels unless you're very disciplined.
2010 should greatly improve merging on both counts. There's a ton of investment in UI, the P&P docs are much improved, and the remaining edge cases should degrade gracefully. Meanwhile, it makes labels somewhat prettier but does absolutely nothing for #2. 
The core concept that "any label can contain any combination of item-version pairs, at any scope, and can be changed anytime with no record of what happened" is staying the same as far as I know.
I think this SQL does what you want:
use TfsVersionControl
go
declare @c as int
set @c = 3712
select @c as Changeset,
    l.labelname as LabelName,
    n.fullpath as LabelScope,
    l.comment as LabelComment,
    v.fullpath as ItemPath,
    le.recursive as FolderChildrenAlsoInLabel
from tbl_labelentry as le
inner join tbl_version as v
    on le.itemid = v.itemid and le.versionfrom = v.versionfrom
inner join tbl_label as l
    on le.labelid = l.labelid
inner join tbl_namespace as n
    on l.itemid = n.itemid and n.islatest = 1
where le.versionfrom = @c
No promises on efficiency, though it works pretty quick for me. If you want the recursive folders expanded, you'll need to add more code...either a UNION ALL with the results of func_GetVersionedItems(v.fullpath, 1), or maybe a CASE WHEN + subquery in place of the join to tbl_version, or maybe an outer select that wraps this whole statement. Not my area of expertise :)
- Marked As Answer by WXS123 Thursday, November 05, 2009 6:44 PM
Thanks for that SQL! I'm testing it now but at first glance it looks... well, awesome! I think it actually pulls the information we wanted much faster than the API. I just have to write some filters on the path the user queried, either in SQL or manually, and this might be an interesting way to get the info when we need it. Wish they had made an API for this query :)
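As a rough illustration of how a per-changeset label lookup like the SQL above could feed the original question ("what changed since this file was last labeled?"), here is a small sketch. It assumes the caller has already collected the changesets in which the file changed and, for each one, the label names the query returned; the function name and the sample data are hypothetical apart from changeset 3712, which is borrowed from the SQL example.

```python
def changes_since_last_label(file_changesets, labels_by_changeset):
    """Split a file's history at its most recently labeled version.

    file_changesets:     changeset numbers in which the file changed.
    labels_by_changeset: dict mapping changeset -> list of label names
                         found for that version of the file.
    Returns (last_labeled_changeset, its_labels, unlabeled_changesets_after).
    """
    ordered = sorted(file_changesets)
    labeled = [c for c in ordered if labels_by_changeset.get(c)]
    if not labeled:
        # Never labeled: every change is still waiting for a label.
        return None, [], ordered
    last = labeled[-1]
    return last, labels_by_changeset[last], [c for c in ordered if c > last]


if __name__ == "__main__":
    history = [3700, 3712, 3720, 3731]   # hypothetical file history
    labels = {3712: ["v1.1"]}            # hypothetical lookup result
    print(changes_since_last_label(history, labels))
```

A long list of unlabeled changesets after the last label would be the signal, in the terms of the original question, that the file needs more testing before it is labeled for release.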
I'm testing now the performance on top level directories to see if this will work for large folders.
I suspect this might hit the server too hard if a lot of people were doing it (which is probably why there isn't an API). But it does at least provide a way we can get to the information if we need it.
I've got a simple prototype that looks pretty good; if I refine it, this looks pretty nice. Obviously permissions would be an issue (we'd probably have to do a stored proc or something and grant users permission on just that). Performance remains to be seen, but initial tests are better than the APIs.
Definitely appreciate the time and effort you put into this. I was getting ready to have some discussions with the dev team but might wait to get a base implementation of a tool that uses this sql first.
Thanks for all the great info and help!
Updated: My tool using the sql on high level directories runs past the 5 minute mark and still running, so probably not good for an on demand initiated request.. we'll see if the total time comes in under the hours it takes for the API to return.
Updated: Tool with your sql took about 20 minutes for a fairly top level directory to output labels and changesets labeled. Significantly better than the API which I think would take hours to return that and only if I chose one scope and I really need two with the API so possibly days.
Very interesting... going to have to think on this tool/sql, could be useful as a last resort.
- >> I suspect this might hit the server too hard for a lot of people doing it (which is probably why there isn't an API).
While I didn't design the API, I'm pretty sure their reasons were simple. Nobody imagined that customers would want to slice up label metadata the way you do.
Anyway, I looked at the query plan for my script -- not bad at all considering the # of tables involved -- certainly far less expensive than the procs which handle merging! The only full table scan comes during the WHERE clause. If you create a nonclustered index on tbl_LabelEntry.VersionFrom, you can eliminate that too. Creating large, sparse labels will be a little slower (and take up a fair bit more space) but lookups by changeset will be super fast. Technically you could do the same thing on tbl_Version.VersionFrom to speed up the corresponding JOIN, but that would have a major ripple effect on the whole system. Almost every operation reads or writes to that table; I'm nowhere near qualified to estimate the overhead of such a change.
[insert standard warning about mucking in the TFS databases...always backup!]
- I know they tried to move away from VSS, as TFS wasn't supposed to be a direct upgrade, but notice they are now marketing more towards VSS customers. I think they will need to change to support both the older paradigms (but better) and the newer ones at the same time.
I also checked with our local ClearCase expert, who said that at a promotion point it just applies a label internally to keep track of the promotion points.
Thanks for taking a look at the query plan :) That was the next thing on my list to look at. I'll try with the indexes on our test TFS server and see the impact of operations.
Yea not sure I'm game enough to muck with our actual tfs production database.
Thanks again, I'm really intrigued with the sql method as there might be a way I can call it more efficiently.
Also will have the promotion branch discussion with our teams and see how that goes.
I'd almost like to automate the promotion from an Integration branch to a QA branch and QA to Release.
Update: 3 minutes on a fairly high level directory with the index! Nice. I'm going to try from the root of our Trunk and see what happens.
- It's running well enough with that index that we might want to think about getting this reviewed with MS, to see that it won't have a big impact on performance in other areas. I suspect it's ok, but if we decide to go with this as an option we'll get it reviewed. I'm going to run this on our production system in off hours and see how it does, since there is a lot more data there. (Have to do it without the index.) I'll gauge the relative times and see if it would likely be useful there. If it performs as well there, this might be a winner. CPU usage was also significantly down with the index.
- Going to have to run it some more.. unfortunately it seems a lot slower on the production database... will have to play with it a bit. I think the production system is even a better machine. I got a SQL client "timeout expired" error with my test .NET app because some queries took so long. That's a bit surprising. It might just be that there are a lot more label items there.


