A place for spare thoughts

21/06/2011

Git resources

Filed under: git — Ivan Danilov @ 17:21

Recently I have been actively searching the web for info about git. This post is a summary of good (IMO) resources about this DVCS.

The official main page contains useful documentation, a FAQ and a large list of available tools.

The most comprehensive single resource is probably the Pro Git book.

There’s also a site with many how-tos, tips and tricks and the like: Git Ready

Several articles too:
Why staging area matters?

In praise of Git’s index

A successful git branching model

My Git Workflow by Oliver Steele

What rebase is and how one could use it:
The Case for Git Rebase
A Rebase Workflow for Git

What is NOT a good idea to do in git:
Avoiding Git Disasters: A Gory Story

And my own posts related to git.

Currently my git tool-set consists of:

  • msysgit – actually git’s implementation on Windows.
  • GitExtensions – GUI of choice. Mostly for viewing trees and adding files to staging area.
  • And last but not least is the tool which made me interested in git in the first place: the git-tfs bridge.

There’s also GitSharp, an implementation of Git for .NET. Actually, GitExtensions and git-tfs use this library internally to manipulate git repositories.

Other suggestions are welcome 🙂

GitTfs rebasing workflow. Is it possible?

Filed under: git, git-tfs — Ivan Danilov @ 06:55

Here is the issue on GitTfs about some complexities in usage. The point is that GitTfs allows you to take a feature branch and put it into TFS in a single checkin. On the git side this action produces a merge commit with two parents: one is the previous commit fetched from TFS and the other comes from your feature branch. If you want to have fine-grained history on the TFS side, you have a problem.

So what could we do with it?

I’ve just finished a patch that makes a fine-grained workflow more or less painless. The idea is a new option for the checkin command, namely --rebase-workflow or just -r.

For simplicity, let’s call the commit being checked into TFS ‘source’ and the commit fetched afterwards ‘result’. Source belongs to the feature branch that we want to check into TFS commit by commit.

So what is the -r key supposed to do? First and most important, it suppresses marking result as a merge commit: it simply skips assigning source as a parent. So after checkin we’ll have two separate branches, and result will contain all the changes from source, but this relation won’t be shown in the graph.
See the diagrams below:

  A
[tfs]
     \
      \<-- B <--- C
               [branch]

becomes

  A <----- B'
    \    [tfs]
     \
      \<-- B <--- C
               [branch]

Here B’ has all changes from B.

Just to compare with default behavior:

              [tfs]
  A <---------- M
    \         /
     \       /
      \<-- B <----- C
                [branch]

M is the merge commit and B is the original. Most of the time they are exactly the same change relative to A, and that makes the history very confusing.

So how are we going to turn two diverging branches into a linear history? With a rebase, of course.
We could take the remainder of the local branch (here it is just C) and rebase it onto B’. It will go smoothly, as we are essentially just applying a clean patch. Thus B becomes an orphan commit (most likely, if no third branch was spawned from the feature branch) and will go away. That is OK, as we have all its changes in B’. And this is exactly what the -r key tries to do for you.
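Done by hand, the rebase step could look roughly like the sketch below. It is not the git-tfs implementation: rebase_rest_onto_tfs is a hypothetical helper name, and it assumes git tfs fetch has already run so that tfs/default points at B’.

```shell
#!/bin/sh
# Hypothetical helper, not part of git-tfs.
# Replays everything after <source> on top of the freshly fetched TFS commit.
# Assumes tfs/default already points at B' (the 'result' commit).
rebase_rest_onto_tfs() {
    source_commit=$1   # B: the commit that was just checked in
    branch=$2          # the feature branch whose tip is C
    git rebase --onto tfs/default "$source_commit" "$branch"
}

# Example, run inside the repository after a checkin and fetch:
#   rebase_rest_onto_tfs HEAD^ local
```

The --onto form is used so that only the commits after source (C in the diagram) are replayed; B itself is left behind because B’ already carries its changes.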

I did some testing on my local TFS server (the simplest cases, plus some with conflicts or interference from the native TFS client). It seems to work well and produces a much more understandable history. But I’m not a git expert, so I could have missed some cases. Currently I only check with rev-list that source doesn’t have parents which are not parents of HEAD, so that we can apply source..HEAD to result smoothly.
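The rev-list check could be sketched like this. It is a guess at the shape of the check, not the actual git-tfs code, and source_is_ancestor_of_head is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper, not the actual git-tfs code.
# Succeeds when every commit reachable from <source> is also reachable from HEAD,
# so source..HEAD is a plain continuation of source.
source_is_ancestor_of_head() {
    source_commit=$1
    # Commits reachable from source but not from HEAD; must be empty.
    missing=$(git rev-list "HEAD..$source_commit") || return 1
    [ -z "$missing" ]
}

# Example, run inside the repository:
#   source_is_ancestor_of_head HEAD^ && echo "safe to replay source..HEAD"
```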

The intended workflow is like this:

git checkout -b local
# make changes
git commit -m 'blah'
# make changes
git commit -m 'blah2'
git tfs fetch
git rebase tfs/default

git tfs checkin -r -m 'blah goes to tfs with rebase' HEAD^
# thus we are sending just first commit leaving history clean

git tfs checkin -r -m 'blah2 goes to tfs also'
# here source is HEAD so we can omit it

git checkout master
git merge tfs/default
git branch -D local

As a result you should have just a linear history in git consisting of TFS commits.

Comments, objections and suggestions are welcome.

The branch with the corresponding changes is here

UPDATE: the rebase-workflow branch is now integrated into the mainline of the git-tfs project in the form of the rcheckin command. So the branch mentioned above no longer exists.

17/06/2011

GitExtensions command line usability

Filed under: git — Ivan Danilov @ 17:07

I prefer to use the git bash command line for manipulating the repository (push, pull, checkout, rebase, merge, git-tfs commands etc.) and the GUI for things like staging, merging files, viewing history, blaming etc. Thus I always have a bash console open at my repository root.
Unfortunately, when I tried to start the commit process via the GUI I got this:

$ gitex
sh.exe": gitex: command not found

From the cmd.exe console everything just works. cmd.exe in fact calls gitex.cmd from the GitExtensions root folder, which just sets up paths and starts GitExtensions.exe with the given command line arguments.

Bash just can’t run a *.cmd file. Well, it is not a big problem: I just have to write my own script to do the same. It is strange that GitExtensions didn’t include such a script from the start. Or maybe I missed something?

Anyway, you just need a file named “gitex” (without extension) with these lines:

#!/bin/sh

GitExtensions.exe "$@" &

"$@" passes to GitExtensions.exe all command line arguments that were passed to the script, and & means we want it to be started in the background, i.e. the mingw console should return immediately and not wait until the GitExtensions.exe process has finished.
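The quotes around $@ matter as soon as an argument contains a space. A small sketch with made-up helper names (show_args, forward_quoted and forward_unquoted are purely illustrative) shows the difference:

```shell
#!/bin/sh
# Hypothetical helper: prints the argument count, then each argument in brackets.
show_args() {
    echo "$#"
    for arg in "$@"; do
        echo "[$arg]"
    done
}

# Hypothetical wrappers that forward their arguments in the two possible ways.
forward_quoted()   { show_args "$@"; }   # keeps every argument intact
forward_unquoted() { show_args $@; }     # re-splits arguments on whitespace

forward_quoted   commit "my repo"   # 2 arguments: [commit] [my repo]
forward_unquoted commit "my repo"   # 3 arguments: [commit] [my] [repo]
```

With the quoted form, a path such as a repository folder with spaces in its name reaches GitExtensions.exe as a single argument.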

P.S. As a Windows user I also fell into the trap of omitting “#!/bin/sh” at the beginning and got the same message. And ‘chmod a+x gitex’ to make it executable didn’t help either. It seems mingw determines whether a file is executable just by that header.

16/06/2011

First git-tfs usage problems

Filed under: git, git-tfs — Ivan Danilov @ 21:45

[This post assumes the reader is at least somewhat acquainted with git and the git-tfs bridge]

UPDATE: the bug described here is already fixed in git-tfs, so if you have the latest version built from sources, it is not relevant for you. v0.11 still has the bug. The next version probably won’t.

TFS doesn’t distinguish checkins by the author’s email. In fact it doesn’t even know the author’s email. That leads to some problems when working with the git-tfs bridge, because for git the author’s email is a crucial part of every commit and influences the commit’s SHA-1 hash.

Goals

Suppose you have a remote TFS server behind a slow connection (or, more likely today, a not very reliable connection that could fail often) and you want to minimize network activity. And the TFS server has some developers behind it, of course. Naturally, with a DVCS like git this leads to the following desired scheme:

Dev1 \                                                     / TFS Dev1
      \        git         [   slow   ]                   /
Dev2 ------- Central ----- [  network ] ---- TFS Server ---- TFS Dev2
...   /     repository     [connection]                   \   ...
DevN /                                                     \ TFS DevN

So when pulling from TFS is required, any developer on the left executes git tfs pull (or fetch) and pushes the tfs/default branch to Central, so that every other developer on the left can get it without going to TFS.

That is the goal. In an ideal world it would work this way from the beginning. Oh, wait! In an ideal world there wouldn’t have been TFS in the scheme in the first place 🙂

First attempt

So let’s return to reality. Just to test what happens when not everything goes as expected, I set up a git repo called test-central:

test-central $ git tfs clone tfs_url test-central
# and make it bare, like this:
test-central $ cd test-central
test-central $ mv .git ..    # save .git somewhere (not important where really)
test-central $ rm -fr *      # remove everything in the folder
test-central $ mv ../.git .  # take .git back
test-central $ mv .git/* .   # get everything from .git to the repo's root
test-central $ rmdir .git    # delete .git
test-central $ git config --bool core.bare true  # tell git it is bare repo

Then I created test-dev-1 repository:

test-dev-1 $ git clone test-central test-dev-1
test-dev-1 $ cd test-dev-1
test-dev-1 $ git tfs bootstrap
test-dev-1 $ git config user.name dev-1
test-dev-1 $ git config user.email dev-1@email.com

And test-dev-2 repository:

test-dev-2 $ git clone test-central test-dev-2
test-dev-2 $ cd test-dev-2
test-dev-2 $ git tfs bootstrap
test-dev-2 $ git config user.name dev-2
test-dev-2 $ git config user.email dev-2@email.com

Both devs have the same TFS history cloned from test-central. Now tfs-dev-1 (a developer working directly with TFS) makes some changes and sends them to TFS. test-dev-1 spots the new changeset in TFS and decides to pull it:

test-dev-1 $ git tfs pull  # suppose fast-forward merge for simplicity

Now this changeset is stored in his local repository with author’s name tfs-dev-1 and (as TFS doesn’t have emails) author’s email dev-1@email.com. So he pushes this commit to test-central to share it with the other developers:

test-dev-1 $ git push

At this time test-dev-2 also spots the new changeset. He doesn’t know that dev-1 has already got it (or just forgot to check), so he also decides to pull it from TFS:

test-dev-2 $ git tfs pull

His commit also has author’s name tfs-dev-1, but the author’s email is dev-2@email.com this time! So from git’s point of view his commit is entirely different from dev-1’s commit. And so…

test-dev-2 $ git push

…results in a conflict.

That seems pretty bad. So to make commits originating in TFS ‘shareable’ they should have the same email, right? So the git-tfs bridge should probably set the email to some predefined value for every commit that originates from a TFS changeset.
This way test-dev-1’s and test-dev-2’s commits will both have some identical fake value like TFS@email.com, the SHA-1 hashes will be equal and everything will be great. Right?
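To see why the email alone is enough to change the hash: git computes a commit id as the SHA-1 of the commit object’s text, and the author and committer lines (emails included) are part of that text. Below is a minimal sketch, assuming sha1sum is available. The tree id is git’s well-known empty tree; the names, timestamps and message are made up, and commit_id is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: prints the id git would give a commit object
# that differs only in the author/committer email passed as $1.
# All other values (tree, names, timestamps, message) are illustration data.
commit_id() {
    email=$1
    tmp=$(mktemp)
    printf 'tree %s\nauthor tfs-dev-1 <%s> 1308000000 +0000\ncommitter tfs-dev-1 <%s> 1308000000 +0000\n\nchangeset from TFS\n' \
        4b825dc642cb6eb9a060e54bf8d69288fbee4904 "$email" "$email" > "$tmp"
    size=$(wc -c < "$tmp" | tr -d ' ')
    # git hashes "commit <size>" + NUL byte + the object text
    { printf 'commit %s\000' "$size"; cat "$tmp"; } | sha1sum | cut -d ' ' -f 1
    rm -f "$tmp"
}

commit_id dev-1@email.com   # id as pulled by test-dev-1
commit_id dev-2@email.com   # a different id for the very same changeset
commit_id TFS@email.com     # one shared fake email gives one shared id
```

Everything except the email is identical between the first two calls, yet the ids differ, which is exactly the situation the two developers ended up in.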

Second attempt

Apparently it is not so easy (we’re already back in the real world, remember?)

Let me explain with an example of a problem I faced an hour ago. The simplest scenario: a single dev, a single local git repository, just one new commit. As basic as possible.

At the start the git repo looks like this (tfs is just shorthand for tfs/default):

   A <---- B
        [master]
         [tfs]

Then I make some changes and commit them to git:

   A <---- B <------ C
         [tfs]    [master]

Commit C is a normal git commit so it has author='dev' and email='dev@email.com'.

After that I want to check in my commit to TFS, so I execute 'git tfs checkin'. Nothing changes within my git repo. 'git tfs fetch' gets my commit back from TFS. And weird things start to happen…

The commit that came from TFS when we did 'git tfs fetch' (let’s call it D for clarity) has author='dev-1-tfs-account-name' and email='TFS@email.com' (as we agreed above). You already know what the graph will look like, yeah? 🙂

   A <---- B <------ C
           \      [master]
            \
             \<----- D
                   [tfs]

That doesn’t look like the fast-forward we wanted for the [tfs] branch… For the same reason as before, commit D differs from C. But we want them to be equal! What do we need to make that happen, then?
Yeah, an even more restrictive rule:

$ git config user.name dev-1-tfs-account-name
$ git config user.email TFS@email.com

Well, TFS@email.com was chosen absolutely arbitrarily, so you could set it to any fake value you like.

Conclusion

To work more or less comfortably with TFS, every developer should have git’s user.name equal to their TFS account name, and all developers should share a single email.

P.S. In the last example you could merge C with D and get some commit E (without any conflicts, as the B->C and B->D diffs are exactly the same)… but then even the simplest graph will look like a DNA molecule. It’s not what I would call comfortable work.

Troubleshooting: Git Home directory

Filed under: git — Ivan Danilov @ 08:10

After I installed GitExtensions it started to show me an error each time, complaining about a lack of permissions to get .gitconfig from u://.gitconfig. Wow! I don’t even have a U drive. As it is not very likely that everybody who uses GitExtensions (GE for short from now on) has a U drive, this path must be set somewhere.

When you open GE -> Settings you can see on the ‘Git’ tab some text about the home directory where GE searches for .gitconfig. Actually it says that if the %HOME% environment variable is present, it will look for .gitconfig there.

So I went to Computer -> Properties -> Advanced system settings -> Environment variables… -> System variables and set the variable HOME there to some local path.

But after starting GE again, it still couldn’t find .gitconfig.

Well, it cost me getting the sources and some debugging to figure out that the HOME variable is looked up only in _user_ variables, not system ones. It makes sense actually, because HOME is the home directory of the current user. But I was so accustomed to setting system vars that I did it almost automatically.

UPDATE: There’s another issue. Make sure %HOME% is not equal to Git’s base directory. The thing is that msysgit treats C:\Program Files (x86)\Git (or wherever you have it installed) as the root / and literally adds /bin to it, similarly to the command PATH=$HOME/bin:$PATH. Thus if your %HOME% is C:\Program Files (x86)\Git you’ll have $PATH starting with //bin: which is obviously incorrect.
This can lead to an inability to find extensions like git-tfs.exe and others.
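The effect is easy to reproduce with plain string concatenation. In this sketch prepend_home_bin is a hypothetical helper that only mimics the PATH=$HOME/bin:$PATH step; it is not part of msysgit, and the example paths are made up.

```shell
#!/bin/sh
# Hypothetical helper, not part of msysgit: mimics PATH=$HOME/bin:$PATH
# for a given home directory and an existing PATH value.
prepend_home_bin() {
    home=$1
    old_path=$2
    echo "$home/bin:$old_path"
}

prepend_home_bin /c/Users/dev /usr/bin   # prints /c/Users/dev/bin:/usr/bin
prepend_home_bin / /usr/bin              # prints //bin:/usr/bin
```

The second call corresponds to %HOME% pointing at Git’s base directory, which msysgit sees as /.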

CLR internals: implementation of .NET timers

Filed under: Uncategorized — Ivan Danilov @ 03:24

Recently I had a conversation with Hans Passant on Stack Overflow about the internal CLR implementation of timers. So I decided to do some night-time source digging and refresh my dusty C++ knowledge…

First of all I want to recommend an excellent book on almost every low-level multithreading topic in Windows: Concurrent Programming on Windows by Joe Duffy. Most of the info in this post can be found there.

The first steps are fairly simple with some reflector’ing, so I will skip them. Managed code ends with a call to the internally implemented AddNativeTimer function. And thus the story begins…

From the ROTOR sources I can see that the AddNativeTimer function is implemented in TimerNative::CorCreateTimer [1], which calls ThreadpoolMgr::CreateTimerQueueTimer [2]:

BOOL ThreadpoolMgr::CreateTimerQueueTimer(...)
{
...
    if (NULL == TimerThread)
    {
...
            HANDLE TimerThreadHandle
                        = CreateThread(NULL,                // security descriptor
                                       0,                   // default stack size
                                       TimerThreadStart,        //
                                       &params,
                                       0,
                                       &threadId);
...
            TimerThread = TimerThreadHandle;
   }
...
   QueueUserAPC((PAPCFUNC)InsertNewTimer,TimerThread,(size_t)timerInfo);
...
}

Here we see the creation of a new timer thread (on the first execution only) and then the queuing of the ThreadpoolMgr::InsertNewTimer [3] APC via the QueueUserAPC function. This APC just adds the new timerInfo to a linked list (I skipped several lines that are not so important for understanding):

// Executed as an APC in timer thread
void ThreadpoolMgr::InsertNewTimer(TimerInfo* pArg)
{
    TimerInfo * timerInfo = pArg;

    if (timerInfo->state & TIMER_DELETE)
    {   // timer was deleted before it could be registered
        DeleteTimer(timerInfo);
        return;
    }

    // set the firing time = current time + due time (note initially firing time = due time)
    DWORD currentTime = GetTickCount();
    if (timerInfo->FiringTime == (ULONG) -1)
    {
        timerInfo->state = TIMER_REGISTERED;
        timerInfo->refCount = 1;
    }
    else
    {
        timerInfo->FiringTime += currentTime;

        timerInfo->state = (TIMER_REGISTERED | TIMER_ACTIVE);
        timerInfo->refCount = 1;

        // insert the timer in the queue
        InsertTailList(&TimerQueue,(&timerInfo->link));
    }
}

The new thread starts in ThreadpoolMgr::TimerThreadStart [4], sets itself up as a timer thread-pool thread (i.e. it is a thread supporting ThreadQueue and it belongs to ThreadPool) and starts an infinite loop of handling pending timers. The APCs described earlier are posted to this same thread, so they get executed when the thread enters an alertable wait (here, SleepEx with TRUE as its second argument). Most probably it is done this way to eliminate synchronization altogether, as the queue with timer infos is accessed only from this thread:

DWORD __stdcall ThreadpoolMgr::TimerThreadStart(LPVOID p)
{
    CreateTimerThreadParams* params = (CreateTimerThreadParams*)p;
    Thread* pThread = SetupThreadPoolThreadNoThrow(TimerMgrThread);
...
    for (;;)
    {
         // moved to its own function since EX_TRY consumes stack
        TimerThreadFire();
    }
}

void ThreadpoolMgr::TimerThreadFire()
{
    EX_TRY {
        DWORD timeout = FireTimers();
        SleepEx(timeout, TRUE);

        // the thread could wake up either because an APC completed or the sleep timeout
        // in both case, we need to sweep the timer queue, firing timers, and readjusting
        // the next firing time
    }
    EX_CATCH {
        if (SwallowUnhandledExceptions())
        {
            // Do nothing to swallow the exception
        }
        else
        {
            EX_RETHROW;
        }
    }
    EX_END_CATCH(SwallowAllExceptions);
}

Actual handling is performed by ThreadpoolMgr::FireTimers [5]:

// executed by the Timer thread
// sweeps through the list of timers, readjusting the firing times, queueing APCs for
// those that have expired, and returns the next firing time interval
DWORD ThreadpoolMgr::FireTimers()
{
    for (LIST_ENTRY* node = (LIST_ENTRY*) TimerQueue.Flink;
         node != &TimerQueue;
        )
    {
        TimerInfo* timerInfo = (TimerInfo*) node;
        node = (LIST_ENTRY*) node->Flink;

        if (TimeExpired(LastTickCount, currentTime, timerInfo->FiringTime))
        {
            if (timerInfo->Period == 0 || timerInfo->Period == (ULONG) -1)
            {
                DeactivateTimer(timerInfo);
            }

            InterlockedIncrement(&timerInfo->refCount);

            QueueUserWorkItem(AsyncTimerCallbackCompletion,
                              timerInfo,
                              QUEUE_ONLY /* TimerInfo take care of deleting*/);

            timerInfo->FiringTime = currentTime+timerInfo->Period;

            if ((timerInfo->Period != 0) && (timerInfo->Period != (ULONG) -1) && (nextFiringInterval > timerInfo->Period))
                nextFiringInterval = timerInfo->Period;
        }
        else
        {
            DWORD firingInterval = TimeInterval(timerInfo->FiringTime,currentTime);
            if (firingInterval < nextFiringInterval)
                nextFiringInterval = firingInterval;
        }
    }

    LastTickCount = currentTime;

    return nextFiringInterval;
}

Well, this code does just what the comment before the function says: it goes through the timer queue looking for expired timers. And finally, when a timer expires, ThreadpoolMgr::AsyncTimerCallbackCompletion [6] is executed in the thread pool via QueueUserWorkItem. It essentially just calls the callback provided at timer creation, so the comment seems somewhat misleading where it mentions APCs (or maybe it means an asynchronous procedure call in the broad sense, who knows?).

P.S. Locations of functions mentioned:
[1] TimerNative::CorCreateTimer – \clr\src\vm\comthreadpool.cpp line 956
[2] ThreadpoolMgr::CreateTimerQueueTimer – \clr\src\vm\win32threadpool.cpp line 3628
[3] ThreadpoolMgr::InsertNewTimer – \clr\src\vm\win32threadpool.cpp line 3816
[4] ThreadpoolMgr::TimerThreadStart – \clr\src\vm\win32threadpool.cpp line 3729
[5] ThreadpoolMgr::FireTimers – \clr\src\vm\win32threadpool.cpp line 3864
[6] ThreadpoolMgr::AsyncTimerCallbackCompletion – \clr\src\vm\win32threadpool.cpp line 3917

What we saw above is just an implementation of the legacy Win32 thread pool, and timers on top of it, in the CLR. So my guess is that the CLR just tries to be as compatible as possible and doesn’t rely on the newer native thread pool features introduced in Windows Vista. I think it is very likely that a newer CLR tries to switch to the newer API when it runs on Vista or higher. And that might be the reason I didn’t find any timer threads myself when I tested timers on .NET 4.0.

So as a short summary:

  • As for SSCLI20, Hans was totally right. There really is a separate thread for handling APCs and queued timers. Thanks for your insistence, btw. It gave me a chance to dig into something interesting 🙂
  • On newer systems it could still be implemented without additional threads. I was just mistakenly assuming that API was already there when CLR 2.0 was written. For details see CreateThreadpoolTimer, SetThreadpoolTimer and CloseThreadpoolTimer. Or better, just read the book mentioned at the beginning of this post.

13/06/2011

Troubleshooting: Git Bash icon replaced standard cmd.exe’s one

Filed under: git — Ivan Danilov @ 01:48

Shortly after installing msysgit on my box I noticed that my console windows were shown with Git’s icon on the taskbar. Some checking revealed that only the 32-bit cmd.exe located in the SysWOW64 folder had its icon replaced. No matter how you run cmd.exe from a 32-bit process (I most often run it from Total Commander), you see Git’s icon.

After an hour of trial and error I found that it is due to the mere existence of a shortcut file in the start menu. Wow!
The shortcut was to Git Bash and it had this line as its Target:
C:\Windows\SysWOW64\cmd.exe /c ""C:\Program Files (x86)\Git\bin\sh.exe" --login -i"
Its icon was changed to Git’s one (actually that was how I discovered this file). When I renamed it to something like “*.lnk_” or changed its target to any other executable, cmd.exe got its rightful icon back. It’s worth mentioning that other places on my hard drive still had links to cmd.exe with Git’s icon, but they didn’t influence cmd.exe. Only the shortcut in the start menu did.

So I decided to check what would happen if I had two shortcuts with different icons there 🙂 And so I created a new shortcut to cmd.exe in the same place. Now cmd.exe has its own icon and Git Bash has its icon, everybody is happy.

With some additional experiments I found that which shortcut is selected as the source of icon properties depends first of all on the parameter list. And if the parameter lists of both shortcuts are equal, the selection is made by lexical order of the shortcuts’ filenames: e.g. if one is named a.lnk and the other b.lnk, cmd.exe will have the icon set in a.lnk.

I don’t understand this behavior of Windows (I have Win7 x64 Ultimate with SP1 installed, by the way), but it comes “as is”, just as the EULA explicitly mentions.

09/06/2011

Troubleshooting: SQL Server error 18456 state 11

Filed under: sqlserver — Ivan Danilov @ 14:47

After my team installed SQL Server 2008 R2 on our project server for testing I ran into an annoying problem: I couldn’t log in to it until my domain account was explicitly added to SQL Server. And that was despite the fact that I was a member of the local Administrators group.

It was very strange, so I decided to investigate. To check that the domain was not related to the issue I created a local admin account and logged in to the system with it. But the result was the same: error 18456. The error message is intentionally uninformative to prevent disclosure of sensitive information. So I took a look at the logs, where I found the additional ‘state 11’ piece of info. As Google helpfully told me, state 11 means ‘Valid login but server access failure’ (you can see other states here or here).

It turned out the reason was extremely simple: User Account Control. When I ran SQL Server Management Studio without elevated rights, UAC effectively ‘removed’ the Administrators membership from my account, and thus SQL Server couldn’t grant me access, because that membership was my only ticket in.

So, the moral of this story: either always run SQL Server Management Studio with elevated rights or turn off the UAC 🙂
