The Hough transform and the Radon transform are indeed very similar, and their relation can be loosely described as the former being a discretized form of the latter.
The Radon transform is a mathematical integral transform, defined for continuous functions on $\mathbb{R}^n$ by integrating them over hyperplanes in $\mathbb{R}^n$. The Hough transform, on the other hand, is inherently a discrete algorithm that detects lines (extendable to other shapes) in an image by polling and binning (or voting).
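To make the "polling and binning" concrete, here is a minimal sketch of Hough voting in plain Python (not Mathematica, and not any library's implementation). It assumes the usual normal-form line parametrization rho = x cos(theta) + y sin(theta); the function name `hough_accumulate`, the grid sizes and the test points are all made up for illustration.

```python
import math

# Hypothetical accumulator grid: theta in [0, pi), rho in [-RHO_MAX, RHO_MAX].
N_THETA, N_RHO, RHO_MAX = 180, 101, 50.0

def hough_accumulate(points):
    """Let every (x, y) point vote for each (rho, theta) cell it could lie on."""
    acc = [[0] * N_THETA for _ in range(N_RHO)]
    for x, y in points:
        for j in range(N_THETA):
            theta = math.pi * j / N_THETA
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Binning step: snap the continuous rho to the nearest row.
            i = round((rho + RHO_MAX) / (2 * RHO_MAX) * (N_RHO - 1))
            if 0 <= i < N_RHO:
                acc[i][j] += 1
    return acc

# 17 points on the horizontal line y = 10 (rho = 10, theta = pi/2).
points = [(float(x), 10.0) for x in range(-40, 41, 5)]
acc = hough_accumulate(points)

# The cell with the most votes is the detected line.
votes, i_peak, j_peak = max(
    (acc[i][j], i, j) for i in range(N_RHO) for j in range(N_THETA)
)
rho_peak = -RHO_MAX + i_peak * 2 * RHO_MAX / (N_RHO - 1)
theta_peak = math.pi * j_peak / N_THETA
```

All 17 points fall into the same cell only at the true line parameters, so the accumulator peak recovers rho = 10 and theta = pi/2. The rounding of rho to a row is where the discretization, and with it the potential for artifacts, enters.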
I think a reasonable analogy for the difference between the two is the difference between
- calculating the characteristic function of a random variable as the Fourier transform of its probability density function (PDF) and
- generating a random sequence, calculating its empirical PDF by histogram binning and then transforming it appropriately.
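The two routes in this analogy can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: a standard normal variable (whose characteristic function is exp(-t^2/2)), a single evaluation point `t`, and an arbitrary sample count and bin width.

```python
import cmath
import math
import random

random.seed(0)  # fixed seed so the run is repeatable
t = 1.0         # point at which the characteristic function is evaluated

# Route 1 (continuous): transform the known PDF analytically.
# For a standard normal variable this gives exp(-t^2 / 2).
cf_exact = math.exp(-t * t / 2)

# Route 2 (discrete): draw samples, bin them, then transform the histogram.
n_samples, bin_width = 20000, 0.1
counts = {}
for _ in range(n_samples):
    k = math.floor(random.gauss(0.0, 1.0) / bin_width)  # bin index
    counts[k] = counts.get(k, 0) + 1

# Treat each bin as a point mass at its center.
cf_binned = sum(
    (c / n_samples) * cmath.exp(1j * t * (k + 0.5) * bin_width)
    for k, c in counts.items()
)
error = abs(cf_binned - cf_exact)
```

The binned estimate lands close to the exact value but not on it: sampling noise and the finite bin width both leave a small residual, which is the counterpart of the artifacts discussed next.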
In practice, the Hough transform is a quick algorithm that can be prone to certain artifacts. The Radon transform, being more mathematically sound, is more accurate but slower. You can in fact see the artifacts in your Hough transform example as vertical striations. Here's another quick example in Mathematica:
img = Import["http://i.stack.imgur.com/mODZj.gif"];
radon = Radon[img, Method -> "Radon"];
hough = Radon[img, Method -> "Hough"];
GraphicsRow[{#1, #2, ColorNegate@ImageDifference[#1, #2]} & @@ {radon,hough}]
The last image is really faint, even though I negated it to show the striations in dark color, but it is there. Tilting the monitor will help.
Part of the reason why the similarity between the two is not very well known is that different fields of science and engineering have historically used only one of the two for their needs. For example, in tomography (medical, seismic, etc.) and microscopy, the Radon transform is used almost exclusively. I think the reason for this is that keeping artifacts to a minimum is of utmost importance (an artifact could be a misdiagnosed tumor). On the other hand, in image processing and computer vision, it is the Hough transform that is used, because speed is the primary concern.
You might find this article quite interesting and topical:
M. van Ginkel, C. L. Luengo Hendriks and L. J. van Vliet, A short introduction to the Radon and Hough transforms and how they relate to each other, Quantitative Imaging Group, Imaging Science & Technology Department, TU Delft
The authors argue that although the two are very closely related (in their original definitions) and equivalent if you write the Hough transform as a continuous transform, the Radon transform has the advantage of being more intuitive and having a solid mathematical basis.
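For straight lines in 2-D, the equivalence can be written down directly. The notation below is the standard normal-form parametrization and is my own choice of symbols ($f$ for the image, $\rho$ and $\theta$ for the line parameters), not necessarily the one used in the article:

$$
\mathcal{R}f(\rho,\theta) = \iint f(x,y)\,\delta(\rho - x\cos\theta - y\sin\theta)\,dx\,dy
$$

Read for a fixed $(\rho,\theta)$, this integrates $f$ along one line, which is the Radon point of view. Read for a single image point, $f(x,y) = \delta(x-x_0)\,\delta(y-y_0)$, it reduces to $\delta(\rho - x_0\cos\theta - y_0\sin\theta)$: the point spreads its contribution along the sinusoid $\rho = x_0\cos\theta + y_0\sin\theta$ in parameter space, which is the Hough voting step without the binning.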
There is also a generalized Radon transform, analogous to the generalized Hough transform, which works with parametrized curves instead of lines. Here is a reference that deals with it:
Toft, P. A., "Using the generalized Radon transform for detection of curves in noisy images", IEEE ICASSP-96, Vol. 4, 2219-2222 (1996)