[This post assumes the reader is at least somewhat acquainted with git and the git-tfs bridge]
UPDATE: the bug described here is already fixed in git-tfs, so if you run the latest version built from source, it no longer affects you. v0.11 still has the bug. The next release probably won't.
TFS doesn't distinguish checkins by the author's email. In fact, it doesn't even know the author's email. That leads to problems when working with the git-tfs bridge, because for git the author's email is a crucial part of every commit and influences the commit's SHA-1 hash.
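To see how much the email matters, here is a minimal sketch that needs only plain git (no TFS involved). It builds commits over the same empty tree with the same name, message and pinned dates, so the email is the only thing that varies. The addresses and the helper name make_commit are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The empty tree: every commit below has identical content.
tree=$(git mktree </dev/null)

# Hypothetical helper: create a commit object whose only variable part
# is the e-mail passed as $1. Dates are pinned so they cannot differ.
make_commit() {
    GIT_AUTHOR_NAME="tfs-dev-1" GIT_AUTHOR_EMAIL="$1" \
    GIT_AUTHOR_DATE="1325376000 +0000" \
    GIT_COMMITTER_NAME="tfs-dev-1" GIT_COMMITTER_EMAIL="$1" \
    GIT_COMMITTER_DATE="1325376000 +0000" \
    git commit-tree "$tree" -m "changeset from TFS"
}

hash_a=$(make_commit "dev-1@example.com")
hash_b=$(make_commit "dev-2@example.com")
hash_a_again=$(make_commit "dev-1@example.com")

echo "dev-1 email:       $hash_a"
echo "dev-2 email:       $hash_b"
echo "dev-1 email again: $hash_a_again"
```

The first two hashes differ although nothing but the email changed, while the third one repeats the first. Note that names and dates feed into the hash in the same way, which is why they are pinned here.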
Suppose you have a remote TFS server behind a slow connection (or, more likely today, an unreliable one that fails often) and you want to minimize network activity. The TFS server has some developers of its own behind it, of course. Naturally, with a DVCS like git this leads to the following desired scheme:
Dev1 \                                                     / TFS Dev1
      \        git         [   slow   ]                   /
Dev2 ------- Central ----- [ network  ] ---- TFS Server ---- TFS Dev2
...   /     repository     [connection]                   \  ...
DevN /                                                     \ TFS DevN
So when a pull from TFS is needed, any developer on the left runs git tfs pull (or fetch) and pushes the tfs/default branch to Central, so that every other developer on the left can get it without going to TFS.
That is the goal. In an ideal world it would work this way from the beginning. Oh, wait! In an ideal world there wouldn't have been TFS in the scheme in the first place 🙂
So let's return to reality. To test what happens when not everything goes as expected, I set up a git repo called test-central:
test-central$ git tfs clone tfs_url test-central
and made it bare, like this:
test-central $ cd test-central
test-central $ mv .git .. # save .git somewhere (not important where really)
test-central $ rm -fr * # remove everything in the folder
test-central $ mv ../.git . # take .git back
test-central $ mv .git/* . # get everything from .git to the repo's root
test-central $ rmdir .git # delete .git
test-central $ git config --bool core.bare true # tell git it is bare repo
Then I created the test-dev-1 repository:
test-dev-1 $ git clone test-central test-dev-1
test-dev-1 $ cd test-dev-1
test-dev-1 $ git tfs bootstrap
test-dev-1 $ git config user.name dev-1
test-dev-1 $ git config user.email email@example.com
And the test-dev-2 repository:
test-dev-2 $ git clone test-central test-dev-2
test-dev-2 $ cd test-dev-2
test-dev-2 $ git tfs bootstrap
test-dev-2 $ git config user.name dev-2
test-dev-2 $ git config user.email firstname.lastname@example.org
Both devs have the same TFS history cloned from test-central. Now tfs-dev-1 makes some changes and checks them in to TFS. test-dev-1 spots the new changeset in TFS and decides to pull it:
test-dev-1 $ git tfs pull # suppose fast-forward merge for simplicity
Now this changeset is stored in his local repository with the author's name tfs-dev-1 and (as TFS doesn't have emails) his own configured email, email@example.com, as the author's email. So he pushes this commit to test-central to share it with the other developers:
test-dev-1 $ git push
At this point test-dev-2 also spots the new changeset. He doesn't know that dev-1 has already fetched it (or just forgot to check), so he also decides to pull it from TFS:
test-dev-2 $ git tfs pull
His commit also has the author's name tfs-dev-1, but the author's email is firstname.lastname@example.org this time! So from git's point of view his commit is entirely different from dev-1's commit. And so…
test-dev-2 $ git push
…results in a conflict: the push is rejected as a non-fast-forward, because the two histories have diverged.
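The failure can be reproduced without a TFS server at all. The sketch below is only a simulation under stated assumptions: the function fake_tfs_pull is a made-up stand-in for git tfs pull that records the same changeset (same name, message and dates) with whatever email it is given, and the addresses are invented.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Bare central repository, seeded with one commit of shared history.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master
tree=$(git --git-dir=central.git mktree </dev/null)
base=$(GIT_AUTHOR_NAME="seed" GIT_AUTHOR_EMAIL="seed@example.com" \
       GIT_COMMITTER_NAME="seed" GIT_COMMITTER_EMAIL="seed@example.com" \
       git --git-dir=central.git commit-tree "$tree" -m "shared TFS history")
git --git-dir=central.git update-ref refs/heads/master "$base"

git clone -q central.git test-dev-1
git clone -q central.git test-dev-2

# Hypothetical stand-in for `git tfs pull`: $1 = clone, $2 = e-mail.
# Everything except the e-mail is identical in both clones.
fake_tfs_pull() {
    (
        cd "$1"
        GIT_AUTHOR_NAME="tfs-dev-1" GIT_AUTHOR_EMAIL="$2" \
        GIT_AUTHOR_DATE="1325376000 +0000" \
        GIT_COMMITTER_NAME="tfs-dev-1" GIT_COMMITTER_EMAIL="$2" \
        GIT_COMMITTER_DATE="1325376000 +0000" \
        git commit -q --allow-empty -m "TFS changeset"
    )
}

fake_tfs_pull test-dev-1 "dev-1@example.com"
fake_tfs_pull test-dev-2 "dev-2@example.com"

# The first push is a fast-forward; the second one is not.
git -C test-dev-1 push -q origin master
if git -C test-dev-2 push -q origin master 2>/dev/null; then
    second_push=accepted
else
    second_push=rejected
fi
echo "second push: $second_push"
```

Passing the same email to both fake_tfs_pull calls makes the two commits identical, and the second push then has nothing new to send.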
That seems pretty bad. So to make commits that originate in TFS 'shareable', they should all have the same email, right? So the git-tfs bridge should probably set the email to some predefined value for every commit that originates from a TFS changeset.
This way test-dev-1's and test-dev-2's commits will both carry the same fake value, like TFS@email.com, the SHA-1 hashes will be equal, and everything will be great. Right?
Apparently it is not that easy (we're already back in the real world, remember?)
Let me explain with an example of a problem I faced an hour ago. The simplest scenario: a single dev, a single local git repository, just one new commit. As basic as possible.
At the start the git repo looks like this (tfs is short for tfs/default):
A <---- B
Then I make some changes and commit them to git:
A <---- B <------ C
Commit C is a normal git commit, so it has author='dev' and email='email@example.com'.
After that I want to check my commit in to TFS, so I run 'git tfs checkin'. Nothing changes within my git repo. Then 'git tfs fetch' gets my commit back from TFS. And weird things start to happen…
The commit that came from TFS when we ran 'git tfs fetch' (let's call it D for clarity) has author='dev-1-tfs-account-name' and email='TFS@email.com' (as we agreed above). You already know what the graph will look like, yeah? 🙂
A <---- B <------ C
        ^
        `-------- D [tfs]
That doesn't look like the fast-forward we wanted for the [tfs] branch… For the same reason as before, commit D differs from C. But we want them to be equal! What do we need to make that happen, then?
Yeah, an even more restrictive rule:
$ git config user.name dev-1-tfs-account-name
$ git config user.email TFS@email.com
Well, TFS@email.com was chosen completely arbitrarily, so you can set it to any fake value you like.
To work more or less comfortably with TFS, every developer should set git's user.name to their TFS account name, and all developers should share a single email.
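Since the rule is easy to forget, a small check before checking in can help. This is just a sketch of one possible check, not something git-tfs provides: it compares the author recorded in the last commit with the agreed identity. The two expected values are the ones from the example above and are assumptions to be replaced with your own. The temporary repository exists only to make the sketch self-contained.

```shell
#!/bin/sh
set -eu

# Assumed values, taken from the example above; substitute your own.
expected_name="dev-1-tfs-account-name"
expected_email="TFS@email.com"

# Throwaway repository with the agreed identity and one commit.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "$expected_name"
git config user.email "$expected_email"
git commit -q --allow-empty -m "work to be checked in"

# Compare the identity stored in the last commit with the agreed one.
actual=$(git log -1 --format='%an <%ae>')
if [ "$actual" = "$expected_name <$expected_email>" ]; then
    identity_ok=yes
else
    identity_ok=no
fi
echo "last commit author: $actual (matches: $identity_ok)"
```

In a real working copy only the last block is needed. It verifies the identity and nothing else, so it cannot guarantee equal hashes on its own, because the other commit fields feed into the SHA-1 as well.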
P.S. In the last example you could merge C with D and get some commit E (without any conflicts, actually, as the B->C and B->D diffs are exactly the same)… but then even the simplest graph will look like a DNA molecule. That's not what I would call comfortable work.