As gcalmettes has already noted, the Hough transform can detect lines. You are looking for line segments, though, which adds a slight wrinkle to the standard algorithm, since the standard transform yields infinite lines without endpoints.
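To make the "wrinkle" concrete, here is a rough pure-Python sketch, not a production implementation. It assumes the foreground pixels are already available as a list of (x, y) tuples. The function names `hough_peak` and `segments_on_line`, and the `tol` and `max_gap` parameters, are my own inventions for illustration. The first function does the usual (rho, theta) voting; the second recovers segments by walking along the winning line and splitting wherever the gap between consecutive pixels is too large.

```python
import math

def hough_peak(points, n_theta=180):
    """Vote in (rho, theta) space; return (rho, theta_radians, votes) of the top bin."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    (rho, t), count = max(votes.items(), key=lambda kv: kv[1])
    return rho, math.pi * t / n_theta, count

def segments_on_line(points, rho, theta, tol=1.0, max_gap=2.0):
    """Split the pixels lying near one Hough line into segments (endpoint pairs)."""
    c, s = math.cos(theta), math.sin(theta)
    # Keep pixels within tol of the line, sorted by position along the line.
    on_line = sorted(
        (-x * s + y * c, (x, y))
        for x, y in points
        if abs(x * c + y * s - rho) <= tol
    )
    segments, start, prev = [], None, None
    for pos, pt in on_line:
        if start is None:
            start = prev = (pos, pt)
        elif pos - prev[0] > max_gap:
            # Gap too large: close the current segment and start a new one.
            segments.append((start[1], prev[1]))
            start = prev = (pos, pt)
        else:
            prev = (pos, pt)
    if start is not None:
        segments.append((start[1], prev[1]))
    return segments

# Two collinear horizontal runs with a gap between them.
points = [(x, 5) for x in range(40)] + [(x, 5) for x in range(60, 100)]
rho, theta, votes = hough_peak(points)
print(segments_on_line(points, rho, theta))
```

For a real image you would repeat this for every strong peak in the accumulator, not just the top one.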
Another option is the RANSAC algorithm, which is worth studying because it takes a quite different approach.
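To give a feel for how RANSAC differs from voting, here is a minimal sketch for a single line, assuming the edge pixels are given as a list of distinct (x, y) tuples. The name `ransac_line` and the parameters `n_iter`, `tol`, and `seed` are hypothetical choices of mine. The idea: repeatedly pick two random points, hypothesize the line through them, and count how many points lie within `tol` of it. The hypothesis with the most inliers wins.

```python
import math
import random

def ransac_line(points, n_iter=200, tol=1.0, seed=0):
    """Return (inliers, (p1, p2)) for the best line found through two sampled points."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_inliers, best_pair = [], None
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # degenerate sample (duplicate points)
        # Perpendicular distance of each point to the line through the sample.
        inliers = [
            (x, y) for x, y in points
            if abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / length <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers, best_pair = inliers, ((x1, y1), (x2, y2))
    return best_inliers, best_pair

# 20 points on y = 2x plus a few stray pixels.
line = [(x, 2 * x) for x in range(20)]
noise = [(3, 40), (15, 2), (7, 30), (18, 5), (1, 25)]
inliers, pair = ransac_line(line + noise)
print(len(inliers), pair)
```

To find several lines, one common approach is to remove the inliers of the best line and run the search again on what remains.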
The Hough technique can also be used to find circles. For curve fitting you can use spline fitting techniques or sometimes even polynomial fits, although the latter tend to be less useful for real-world image analysis.
You'll have to be careful with skeletonization / thinning (or a medial axis transform), because thinning will tend to yield undesirable curves around 3-way intersections such as the two T-intersections in the character "H". Thinning algorithms can be fast, though. Check out the implementations of the Stentiford and Zhang-Suen thinning algorithms in WinTopo, an app for raster-to-vector conversion (i.e. pixels to curves).
In your sample image there are several examples of 2-way and 3-way intersections of known, fixed angle. If you study corner detection a bit, you should see how to extend it to detecting intersections for fixed angles.
If the size of the characters and figures in the images is constant (for example, if the "H" is always 40 pixels tall), then template matching could be used to find just about any pattern. (Search for the term "normalized cross-correlation", though the Wikipedia entry may not be all that helpful.)
Briefly put, if the features in the image will be a constant size, you can create a template (something like a mask or a stencil) that matches the light/dark pattern of the object to be identified. Template matching is relatively easy to implement and debug, and is a bit more intuitive than Hough.
In simple terms, imagine that you assign every black pixel a value of 0 (an absence of reflection) and every white pixel a value of 1 (total reflection). To find a 3 x 3 unfilled black square (a black outline with a white center, surrounded by a white border), you would create a 5 x 5 template like this:
    1 1 1 1 1
    1 0 0 0 1
    1 0 1 0 1
    1 0 0 0 1
    1 1 1 1 1
You then check this template for matches in the original image. You have a match if all white pixels in the template match white pixels in the image, and all black pixels in the template match black pixels in the image.
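As a rough sketch of that exact-match check, assuming the image is a list of rows holding 0 (black) and 1 (white), something like the following would do. The names `SQUARE` and `find_exact_matches` are just mine for illustration.

```python
# 5 x 5 template: a 3 x 3 black outline with a white center and white border.
SQUARE = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

def find_exact_matches(image, template):
    """Return (row, col) of every position where the template equals the image patch."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    hits = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # A match requires every template pixel to equal the image pixel under it.
            if all(image[r + i][c + j] == template[i][j]
                   for i in range(th) for j in range(tw)):
                hits.append((r, c))
    return hits
```

Each hit is the top-left corner of a matching patch. This brute-force scan costs one comparison per template pixel per image position, which is usually fine for small templates.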
To match a small hexagon for your benzene ring (or whatever it is), you could have a template like this:
    1 1 0 1 1
    1 0 1 0 1
    0 1 1 1 0
    0 1 1 1 0
    0 1 1 1 0
    1 0 1 0 1
    1 1 0 1 1
That's not quite the right shape, since the sides of a real hexagon don't meet at the 45-degree angles that this coarse grid produces.
You won't get a perfect match for a template in some cases because other figures such as line segments, circles, and characters intrude in the rectangle circumscribing any particular feature you want to find. Rather than a perfect match, you can require that 90% or more of all pixels in the template match. The "normalized" part of "normalized cross-correlation" makes the template more robust under changes in contrast: for example, your image might be dark gray figures on a light gray background rather than black figures on a white background.
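To illustrate why the normalization helps, here is a small sketch of the zero-mean variant of normalized cross-correlation for one patch and one template of the same size. The function name `ncc` is my own, and the patch is assumed to be a list of rows of gray values. Subtracting the means and dividing by the magnitudes makes the score independent of brightness and contrast, so a dark-gray-on-light-gray copy of the pattern still scores close to 1.

```python
import math

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D lists, in [-1, 1]."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_t) ** 2 for b in t))
    # A perfectly flat patch or template has no pattern to correlate with.
    return num / den if den else 0.0

template = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
# Same shape, but dark gray (0.3) on light gray (0.8) instead of 0 on 1.
gray = [[0.3 + 0.5 * v for v in row] for row in template]
print(ncc(gray, template))
```

In practice you would slide the template over the image, compute this score at each position, and keep positions whose score exceeds a threshold you choose by experiment.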
After examining the various combinations of different figures, you can identify certain elements in the 2D template as "don't care", meaning the element can be either black or white. Since you can have line segments representing bonds at the corners of the hexagon, you have some good candidates for "don't care" pixels.
The "don't care" pixels below are marked with X. Identifying which elements in the template can be "don't care" pixels takes some tinkering and tweaking.
    X 1 0 1 X
    1 0 1 0 1
    0 1 1 1 0
    0 1 X 1 0
    0 1 1 1 0
    1 0 1 0 1
    X 1 0 1 X
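Here is a hedged sketch of how the "don't care" pixels and the 90% rule could be combined, again assuming a binary image stored as a list of rows of 0s and 1s. The template is written as strings so that "X" can mark the ignored cells. The names `HEXAGON`, `match_fraction`, and `find_matches` are hypothetical.

```python
HEXAGON = [
    "X101X",
    "10101",
    "01110",
    "01X10",
    "01110",
    "10101",
    "X101X",
]

def match_fraction(patch, template):
    """Fraction of non-"X" template cells that agree with the binary patch."""
    checked = agree = 0
    for patch_row, tmpl_row in zip(patch, template):
        for pixel, cell in zip(patch_row, tmpl_row):
            if cell == "X":
                continue  # "don't care": either black or white is acceptable
            checked += 1
            agree += (pixel == int(cell))
    return agree / checked

def find_matches(image, template, min_fraction=0.9):
    """Return (row, col) positions whose patch agrees with at least min_fraction of the template."""
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            if match_fraction(patch, template) >= min_fraction:
                hits.append((r, c))
    return hits
```

With a tolerant threshold, neighboring positions can also pass, so you may want to keep only the best-scoring position in each small neighborhood.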
There are certainly more complicated ways to find various figures, but I recommend the simplest approaches first in case they work. Hough and RANSAC techniques can be generalized to find just about anything, but the implementation can be tricky.