Hamer et al. (2009) present a method for tracking a hand while it manipulates an object. Judging from the picture above, the tracking seems quite robust.
The full paper, titled ‘Tracking a Hand Manipulating an Object’, can be found here.
As described by the authors, the method uses individual local trackers, one for each segment of the hand, to extract the hand pose. To get the result shown in the picture on the left, a 3-dimensional model is required so that the tracker can cope with parts of the hand being covered by the object or by other parts of the hand. The authors do not seem to rely solely on color image segmentation, which is commonly adopted by computer vision researchers; they also integrate feature estimation and 2.5-dimensional maps, that is, images that store a depth value per pixel. This probably means that an extra dimension of input (depth) is required. In previous posts we successfully located the contours of hands, but shape recognition is still under investigation. This new method may give us some further ideas for hand tracking and recognition.
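To make the idea of a 2.5-dimensional map a little more concrete, here is a minimal sketch of how a depth map could be used to tell visible hand pixels from covered ones. This is our own illustration, not the authors' implementation: the function name `classify_pixels`, the tolerance `tol`, and the toy depth values are all assumptions, and we assume depth is given in metres, with smaller values closer to the camera.

```python
import numpy as np

def classify_pixels(observed_depth, predicted_depth, tol=0.02):
    """Split the predicted footprint of a hand segment into visible and occluded pixels.

    observed_depth : 2-D array, depth measured by the sensor (the 2.5-D map).
    predicted_depth: 2-D array of the same shape, depth at which the tracker
                     expects the hand segment; NaN where the segment is absent.
    tol            : hypothetical tolerance (metres) for calling two depths equal.
    """
    # Pixels where the tracker expects the segment to be.
    footprint = np.isfinite(predicted_depth)
    # Depth difference inside the footprint, 0 elsewhere (avoids NaN comparisons).
    diff = np.where(footprint, observed_depth - predicted_depth, 0.0)
    # Observed surface matches the prediction: the segment is seen directly.
    visible = footprint & (np.abs(diff) <= tol)
    # Observed surface is clearly in front of the prediction: something covers it.
    occluded = footprint & (diff < -tol)
    return visible, occluded

# Toy 2x3 example (made-up values, not data from the paper).
observed = np.array([[0.50, 0.50, 0.30],
                     [0.50, 0.80, 0.80]])
predicted = np.array([[0.50, 0.51, 0.50],
                      [np.nan, np.nan, 0.50]])

visible, occluded = classify_pixels(observed, predicted)
print(visible)
print(occluded)
```

In this toy example the top-right pixel is marked as occluded because the sensor sees something 20 cm in front of where the hand segment is expected. The bottom-right pixel is neither visible nor occluded: the observed surface lies behind the prediction, which suggests the prediction itself is wrong there. A color-only segmentation cannot make this distinction, which is why the extra depth dimension looks useful for our own hand tracking.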