There is a lot of subtlety in the difference between convolution and correlation. Both belong to the broader idea of inner products and projections in linear algebra, i.e. projecting one vector onto another to determine how "strong" it is in the direction of the latter.
This idea extends into the field of neural networks, where we project a data sample onto each row of a weight matrix to determine how well it "fits" that row. Each row represents a certain class of objects; for example, each row could correspond to a letter of the alphabet in handwriting recognition. It's common to refer to each row as a neuron, but it could also be called a matched filter.
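As a minimal sketch of that idea (hypothetical shapes and random data, using NumPy), classifying a sample this way is just a matrix-vector product: one inner product per row/neuron:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((26, 64))  # hypothetical: 26 classes (letters), 64-pixel samples
x = rng.standard_normal(64)        # one flattened handwriting sample

scores = W @ x                     # inner product of x with every row at once
print(np.argmax(scores))           # the row (class / "neuron") the sample fits best
```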
In essence, we're measuring how similar two things are, or trying to find a specific feature in something, e.g. a signal or image. For example, when you convolve a signal with a bandpass filter, you're finding out how much content it has in that band. When you correlate a signal with a sinusoid, as the DFT does, you're measuring the strength of that sinusoid's frequency in the signal. Note that in the latter case the correlation doesn't slide, but you're still "correlating" two things: you're using an inner product to project the signal onto the sinusoid.
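Here is a sketch of that "non-sliding" correlation: a single DFT bin computed as a plain inner product of the signal with a complex sinusoid (NumPy, with a made-up test signal):

```python
import numpy as np

N = 256
n = np.arange(N)
x = np.cos(2 * np.pi * 5 * n / N)        # test signal: 5 cycles across the window

k = 5
basis = np.exp(-2j * np.pi * k * n / N)  # the sinusoid we project onto
strength = np.dot(x, basis)              # one inner product, no sliding
print(abs(strength), abs(np.fft.fft(x)[k]))  # same magnitude as the FFT's bin k
```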
So then, what's the difference? Well, consider that with convolution the signal is reversed with respect to the filter. With a time-varying signal, this has the effect that the data is correlated in the order it enters the filter. For a moment, let's define correlation simply as a dot product, i.e. projecting one thing onto another. At the start, then, we're correlating the first part of the signal with the first part of the filter. As the signal moves through the filter, the correlation becomes more complete. Note that each element of the signal is only multiplied by the element of the filter it's "touching" at that point in time.
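In code, that reversal is the whole difference. A quick sketch with NumPy's 1-D routines (arbitrary example values):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.5, 0.25])          # an asymmetric filter

print(np.convolve(x, h))                # convolution: h is applied time-reversed
print(np.correlate(x, h, mode="full"))  # correlation: no reversal
print(np.convolve(x, h[::-1]))          # reversing h by hand makes convolution match correlation
```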
So with convolution we're correlating, in a sense, but we're also trying to preserve the order in time in which events occur as the signal interacts with the system. If the filter is symmetrical, however, as it often is, this doesn't actually matter: convolution and correlation yield the same results.
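A quick check of that claim: with a symmetric (palindromic) kernel, the two outputs are identical:

```python
import numpy as np

x = np.random.default_rng(1).standard_normal(32)
h = np.array([0.25, 0.5, 1.0, 0.5, 0.25])  # symmetric: h[::-1] == h

print(np.allclose(np.convolve(x, h),
                  np.correlate(x, h, mode="full")))  # True
```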
With correlation, we're just comparing two signals, not trying to preserve an order of events. To compare them, we want them facing in the same direction, i.e. lined up. We slide one signal over the other so we can test their similarity in each time window, in case they're out of phase with each other or we're looking for a smaller signal inside a larger one.
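As a sketch of that sliding search (made-up signal and offset): take one inner product per lag and pick the peak.

```python
import numpy as np

rng = np.random.default_rng(2)
template = np.sin(2 * np.pi * np.arange(20) / 10)      # the smaller signal
signal = 0.1 * rng.standard_normal(200)                # a longer, noisy signal
signal[65:85] += template                              # hide the template at offset 65

scores = np.correlate(signal, template, mode="valid")  # one inner product per time window
print(np.argmax(scores))                               # ~65: the lag where they line up
```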
In image processing, things are a little different: we don't care about time. Convolution still has some useful mathematical properties, though. However, if you're trying to match a smaller image against parts of a larger one (i.e. matched filtering), you won't want to flip the template, because then its features won't line up with the image's. Unless, of course, the template is symmetrical. In image processing, correlation and convolution are sometimes used interchangeably, particularly with neural nets. Of course, time is still relevant if the image is an abstract representation of 2-dimensional data where one dimension is time, e.g. a spectrogram.
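Here's a sketch of that in 2-D (this one assumes SciPy is available; the image, template, and position are made up). Correlation finds the pasted template because the features line up, while convolution flips the template along both axes first, so its peak ends up somewhere in the noise:

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d  # assumes SciPy is installed

rng = np.random.default_rng(3)
image = 0.1 * rng.standard_normal((64, 64))
template = np.array([[ 1.0,  1.0,  1.0],
                     [ 0.0,  0.0,  0.0],
                     [-1.0, -1.0, -1.0]])  # bright-over-dark edge; not symmetric under a 180° flip
image[30:33, 40:43] += template            # paste the feature at row 30, column 40

corr = correlate2d(image, template, mode="valid")
conv = convolve2d(image, template, mode="valid")

print(np.unravel_index(np.argmax(corr), corr.shape))  # (30, 40): the features line up
print(np.unravel_index(np.argmax(conv), conv.shape))  # almost certainly not (30, 40): the flipped
                                                      # template anti-matches the pasted feature
```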
So in summary, both correlation and convolution are sliding inner products, used to project one thing onto another as they vary over space or time. Convolution is used when the order of events matters, and is typically used to transform the data. Correlation is typically used to find a smaller thing inside a larger thing, i.e. to match. If the "thing" being flipped (typically the filter) is symmetrical, then it doesn't matter which you use.