I think that the first thing to remember is that the whole concept of using an 'average' is that you are throwing data away in order to better see the forest for the trees.
To describe an AC waveform perfectly, you would need an infinitely long list of numbers, describing the voltage at every instant in time. Such a perfect description might well be meaningless if all you wanted to know was how well a lamp would light up.
So we come up with a set of calculations which reduce this huge pile of data to a few numbers that better describe the whole picture...but to get to this we need to throw away the supposedly irrelevant details.
There happen to be different 'measures of central tendency' which we calculate using different equations...and each has its place where it makes sense.
For example, there is the common 'average' (the arithmetic mean), where you sum up your set of values and divide by the number of values. You can extend this concept to continuous functions, and you will find that the _average_ value of a perfect sine wave, taken over a full cycle, is _zero_. Not particularly useful if you want to know how well the bulb will light, but actually a very useful measurement if you are concerned about transformer saturation.
When you apply a voltage to a resistive load, the power delivered to the load is proportional to the square of the voltage. Because of this, the 'average' most used for voltage measurements is the 'RMS' average. You start with your voltage values, square them, take the common average of all these squared values, and then take the square root of this common average.
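Here is a rough sketch of both calculations in Python, just to make the recipe concrete. The 170 V peak and the 1000 samples per cycle are values I picked for illustration, and the function names are my own.

```python
import math

def sample_sine(peak, n=1000):
    """Return n evenly spaced samples of one full cycle of a sine wave."""
    return [peak * math.sin(2 * math.pi * k / n) for k in range(n)]

def mean(values):
    """Common average: sum the values and divide by how many there are."""
    return sum(values) / len(values)

def rms(values):
    """Square each value, take the common average, then the square root."""
    return math.sqrt(mean([v * v for v in values]))

samples = sample_sine(peak=170.0)  # assumed example peak voltage
print(mean(samples))  # very close to 0
print(rms(samples))   # close to 170 / sqrt(2), about 120.2
```

Same data, two different 'averages': the mean says nothing about how bright the lamp is, while the RMS value does.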
When the power delivered to a load is not continuous, the common average is generally used to describe the power delivered over time.
So we measure RMS current and RMS voltage, but mean (common average) power. Assuming a resistive load, RMS current * RMS voltage gives average power.
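You can check that last claim numerically. This is only a sketch under assumed values: a 170 V peak sine wave across a 10 ohm purely resistive load, sampled 1000 times per cycle.

```python
import math

N = 1000          # samples per cycle (arbitrary choice)
V_PEAK = 170.0    # volts, assumed example value
R = 10.0          # ohms, assumed purely resistive load

def mean(values):
    return sum(values) / len(values)

def rms(values):
    return math.sqrt(mean([x * x for x in values]))

voltage = [V_PEAK * math.sin(2 * math.pi * k / N) for k in range(N)]
current = [v / R for v in voltage]                 # Ohm's law, in phase with the voltage
power = [v * i for v, i in zip(voltage, current)]  # instantaneous power

average_power = mean(power)                  # common average of the power
rms_product = rms(voltage) * rms(current)    # RMS voltage * RMS current
print(average_power, rms_product)  # both about 1445 W
```

The two numbers agree only because the current is an exact scaled copy of the voltage, which is what 'resistive load' means here.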
If the load is not resistive, or the waveforms are not perfect sinusoids, then the information thrown away to generate the various averages might actually be relevant, and you have a choice: either you walk away from the averages, or you add terms to describe the missing information.
For example, you can use RMS voltage and RMS current plus 'power factor' to deal with reactive loads. Or you can use RMS voltage plus 'crest factor' to deal with non-sinusoidal waveforms.
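As a rough illustration of those two extra terms, here is a sketch using the usual definitions: power factor as average power divided by (RMS voltage * RMS current), and crest factor as peak value divided by RMS value. The waveforms are made up for the example: a current lagging the voltage by 60 degrees, and an ideal square wave.

```python
import math

N = 1000  # samples per cycle (arbitrary choice)

def mean(values):
    return sum(values) / len(values)

def rms(values):
    return math.sqrt(mean([x * x for x in values]))

def sine(peak, phase_deg=0.0):
    """One full cycle of a sine wave, delayed by phase_deg degrees."""
    phase = math.radians(phase_deg)
    return [peak * math.sin(2 * math.pi * k / N - phase) for k in range(N)]

def power_factor(voltage, current):
    """Average power divided by the product of the RMS values."""
    real_power = mean([v * i for v, i in zip(voltage, current)])
    return real_power / (rms(voltage) * rms(current))

def crest_factor(values):
    """Peak magnitude divided by the RMS value."""
    return max(abs(x) for x in values) / rms(values)

voltage = sine(170.0)                        # assumed example values
lagging_current = sine(5.0, phase_deg=60.0)  # hypothetical reactive load
square = [1.0 if k < N // 2 else -1.0 for k in range(N)]

print(power_factor(voltage, lagging_current))  # about 0.5, i.e. cos(60 degrees)
print(crest_factor(voltage))                   # about 1.414 for a sine wave
print(crest_factor(square))                    # 1.0 for an ideal square wave
```

With the 60 degree lag, RMS voltage * RMS current overstates the average power by a factor of two, and the power factor is the number that puts that information back.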
-Jon