March 28th 07, 08:44 AM, posted to rec.radio.shortwave
D Peter Maus
Subject: Eduardo - don't let the door hit you in the ass on the way out!

David Eduardo wrote:
"D Peter Maus" wrote in message
...
David Eduardo wrote:
"D Peter Maus" wrote in message
...
Any effort I have seen (some done on purpose) to disprove music test
results when the test itself follows standard techniques has failed.
That's because the axiomatic assumptions are the same in each case.
The statistical science is the same. Of course the results are going to
be the same.
Did you know most of the Census data was produced by the Census long
form, given to only about 12% of all households and/or persons?

The data is considered reliable enough to use for a huge variety of
government programs.


Yeah, I've read that.

And most of the debate on either side.


Most of the debate has to do with politics and the status quo. Politicians,
who would have to initiate a change, do not want one, as they would be
concerned about redistricting and changes in Federal funds. The debate has
very little to do with accuracy and a lot to do with ensuring reelection.




On BOTH sides.


In fact, the Census Bureau fairly conclusively showed that the Census
could be done more accurately by a sample than a census... the problem is
the Constitution requires, specifically, a census.


And there is a reason for that.


Yeah, the ability to poll did not really exist when that part of the
Constitution was written... and a census was simpler with a population that
had limited mobility and lower population densities.



Not exactly the point I was trying to make, no.

You can't do a head count more accurately by statistical sampling
than you can by counting heads. One has a margin of error, one does not.
And that's the point. Whether or not the ability to manipulate
numbers was advanced enough at the time of the Constitution is not the
point. The point is, you can't get more accurate than a direct count.
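
To put rough numbers on that point (a back-of-the-envelope sketch, not
from anyone's actual census or ratings methodology): for a simple yes/no
proportion, a random sample carries a margin of error that shrinks with the
square root of the sample size, while a true head count of the whole
universe has no sampling error at all.

    import math

    def margin_of_error(n, p=0.5, z=1.96):
        # Approximate 95% margin of error for a proportion p
        # estimated from a simple random sample of size n.
        return z * math.sqrt(p * (1 - p) / n)

    for n in (100, 400, 1000, 10000):
        print(f"n={n:>6}: +/- {margin_of_error(n) * 100:.1f} points")

    # n=   100: +/- 9.8 points
    # n=   400: +/- 4.9 points
    # n=  1000: +/- 3.1 points
    # n= 10000: +/- 1.0 points
    # A full count has no sampling error by definition, though it can
    # still suffer coverage and nonresponse error.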

Now, whether or not the count is actually taking place...that would
be a good discussion left for a time when the beer flows freely and
neither of us is sober enough to do any damage.



In the sense that the test is implemented by the station program
management, you can say that your point is accurate. But as to the
selection of what songs to play, that is entirely done by the listeners.

By a sample of the listeners, measured against the executively defined
'tiers'. That's the difference between a list of music titles and music
programming.


The programming is the mixing of the songs. The frequency of play is in
proportion to popularity. There really is no other way. The music itself is
picked by the listeners. The way it is blended together is the programming
function.
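
Purely as an illustration of "frequency of play in proportion to
popularity" (a minimal sketch with made-up song names and scores, not any
station's actual scheduling system):

    import random

    # Hypothetical test scores (higher = more wanted); invented data.
    scores = {"Song A": 92, "Song B": 80, "Song C": 65, "Song D": 40}

    def pick_next_song(scores):
        # Choose the next song with probability proportional to its score,
        # so the best-testing titles simply come around more often.
        songs = list(scores)
        weights = [scores[s] for s in songs]
        return random.choices(songs, weights=weights, k=1)[0]

    # Over a long stretch, Song A airs roughly 92/277 of the time and
    # Song D roughly 40/277. The listeners supplied the scores; the only
    # "programming" left is how the plays are spaced and blended.
    playlist = [pick_next_song(scores) for _ in range(20)]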



Yes, I believe I just said that. Or am I in a different room?


In the sense that listeners are involved, yes, your point is valid.
But the statement is incomplete.


I don't think so. As long as play is in proportion to popularity (which is
the entire purpose of a test... to tell how much each song is wanted), it is
totally responsive to the listeners' picks. The programmer decides how the
songs should flow together...



Exactly my point.



In some cases, the adherence to the test is so total that a minimum average
score is put on each hour that matches the average of the testing songs.
If you pick the sample to faithfully represent the group under study and
every subset of interest, you don't need 100%. When you can repeat the
test, with the same sample specifications again (and again and again) and
get the same results, you know the sample does faithfully represent the
universe under study.


Yes, David. We agree on the science, and how it's done. My point is
directed at the statement. And that it is incomplete. Regardless of how
you scientifically measure, gather, and interpret the data, it's still a
matter of data implemented based on decisions of PD's and Consultants,
whether at the local level, or not.


I still do not follow this. If songs are played in total proportionality to
scores (in reality, it might be by quintiles or something similar), then the
test is not being interpreted. It is just a ranker of song popularity, with
the best being programmed the most.

That's not the point. There is no use in doing a census when a sample
gives the same results, reliably, over and over. The other factor, of
course, is that a census of all the listeners of a major LA station might
cost over $100 million, while the highest billing station in the
market grosses $60 million, making a census totally impossible.


Regardless of the reasons, a sample does not produce the same result
as a census. Similar, yes.


No two Censuses (Censi?) give equal results... the last one was as much as
+/- 4% off. One reason is that it takes so long to do that many
characteristics of the universe have changed, due to migration, moving,
births, deaths, etc. I would say that the difference between two censuses is
about the same or more than the difference between a poll and a census.

And if the assumptions are correct and the sample is not contaminated by
individuals that fall outside of the norm, results can be quite close to a
census. But the two are not the same. Regardless of replicability. Close,
no matter how close, is not identity.


In any poll, some persons are rejected as not falling in the recruit
specifications, either at the start or based on differences between
collected data and the recruit specifications. For any imaginable application
in broadcasting, the cost of increasing the sample up to and including a
census is not, then, justifiable.
And it's the subtlest of differences at the input that can make the
most dramatic differences at the output.


That is where recruit verifications are important. If properly conducted,
multiple tiers of recruit verification are done. This includes verification
of a percentage of recruits by a different person, reverification at the
time of data collection, further verification via "trick questions" in
the collection process, and data cleansing after the process is done to
remove people who lied or were improperly recruited.
In this case, I have to say that if a music test with 100 people (the
average size) did not work, ratings-wise, nobody would do them. Marketing
implies hype and puffery in your statement. The reality is that the
results are the same for 200, 400, 1000 listeners. So there is no hype or
distortion and there is very accurate data that reflects the total
listener base of a station (and, sometimes, its direct competitors).
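
As an aside, and only as an illustration of why doubling or quadrupling
the sample barely moves the results (simulated scores, not real test data):
draw repeated samples of different sizes from the same listener universe and
compare the rankings that come out.

    import random

    random.seed(7)

    # Hypothetical "true" appeal of 30 songs across the whole listener base.
    true_appeal = {f"Song {i:02d}": random.uniform(1, 5) for i in range(30)}

    def run_test(n):
        # Simulate a music test: n listeners rate each song around its
        # true appeal (toy model); songs are ranked by average score.
        averages = {}
        for song, appeal in true_appeal.items():
            ratings = [random.gauss(appeal, 1.0) for _ in range(n)]
            averages[song] = sum(ratings) / n
        return sorted(true_appeal, key=lambda s: -averages[s])

    # The same handful of songs lands in the top tier at every sample size,
    # give or take an adjacent swap.
    for n in (100, 200, 400, 1000):
        print(n, run_test(n)[:5])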

Not hype and puffery, so much as misdirection, and illusion. The
'show' in show business.


The show is in how a station is put together... the imaging, etc. The
underpinnings of rotations, songs and such are pure math.
The results are the same because the axiomatic assumptions are the
same. Statistics is an elegantly deceptive science. Because the
assumptions are accepted as axiomatic, the results are believed beyond
reproach. Neither is the case.


Generally, in a music test, there are no assumptions. You sample a portion
of your audience (generally the users who give most of the listening time,
P1's or, in English, the ones who listen to you more than any other station)
based on the simple fact that they listen. Then you get the right balance
within the sample for age and sex, and you have no assumptions... just a
reflection of the real audience you serve.
I"m not saying the model doesn't work, because the business and many
of it's successes are based on it. But the process is in fact a shorthand
to the cut and try of carefully crafted creative programming. Which we
both know to be far more unreliable and expensive than corporate bean
counters and lawyers are willing to tolerate.


My first experience with music testing was in a format I had programmed by
ear and gut for two decades... always resulting in #1 stations. I guessed
the scores (tiers, really) of the first 100 songs. I was off by over 20% on
more than 75% of the songs. After implementing, on a station that was
already successful, the ratings increased dramatically. I knew how to
program the format, but did not know enough about how listeners felt about
the music. The combination of both was magic.

Similarly, in a case much more recently, a 20 share researched station in a
really large market got a competitor programmed by a guy who was the
recognized expert in the music we played. I mean, he knew every nuance of
every song and group... and he got a 1.8 to our ongoing 20. No research,
lots of gut feel for the music. It was amusing, because the artists figured
this out, and we were the one with daily live unplugged sets in the studio,
the station with the unreleased long versions of songs, and all kinds of
twists on the music... but we only played hits, no matter what version.
Still, there are wild successes that defy the research. The iPod was one
of them. The research showed zero interest in portable hard-drive-based
digital music players. Until someone introduced one. How the research of
one company can be so wrong, while the research of another can be so
right, has to do with the methodology.


I can't engage in a discussion of a market I do not know. But it is awfully
rare to find any format today that does not use some form of listener
consultation or feedback to establish parameters.
Much of what passes for research, today, is corporate investigation
into how to reproduce a previous success. Quite literally, asking the
questions which will produce the desired answers.


I have never seen that, but I don't work outside one company in the US. And
since I supervise our research, and am paid for results, doing anything but
what the listener wants would not be in my benefit or that of the company.
Much research works. But it's flawed. Because it dismisses both
undesired and unsought results.


Much of the research I do is based on finding problems before they become
disasters. Like a health check up, we are looking for the bugs under the
rocks, and we spend a lot on the shovels we use to turn over the rocks.
Jim Collins in "Good to Great" said that most research is a waste of
time, because it's bad research. It proves nothing, offers nothing but
what is expected. That the hallmark of good research is that it produces a
result you didn't expect. And that the hallmark of great research is that
it produces a result you don't like.


The purpose of a music test is to find out the bad news about bad songs...
and get rid of them... as well as the good news on what to play a lot. In
perceptual research, the idea is to identify competitive challenges,
weaknesses, etc. and fix them, plus reinforcing the good things. Nobody
wants more to find out bad news than we do, if there is any.
Very little research does either. Instead, it seeks to prescreen and
select a very carefully crafted sample and ask highly directed questions, to
produce results that don't really break any ground. Because the
methodology is circular. And radio stations are the masters at this. Select
a group of P1 listeners, and refine them to a focus group. Take the
results of the focus group and shape the survey questionnaire. Apply the
survey questionnaire to more listeners.


I have never done a focus group. They don't work. And focus groups are not
used for music tests, either. Perceptuals are best done with face to face
personal interviews by a very skilled interviewer.

The reason we test heavy listeners is that the 30% of total listeners who
are heavy or P1 give over 70% of the listening time. You will always get P2
or lighter listeners if you do a very good job on the core. It's the bell
curve, plain and simple.
Well, ****...what the hell results come from that?

I've been involved in nearly as many focus groups, perceptual
surveys, and music tests as you. And damnation if every one of them didn't
work.


I have done 50 music tests since January. I have done over 400 in the last
36 months. I don't do focus groups. But we do loads of personal
perceptuals, and have a staff of nearly 50 to help do this.
Until they didn't. And we got our asses handed to us by a station that
did everything that the 'listeners' said they didn't want. Everything the
focus groups said were wrong. Everything the survey results said wouldn't
work.


There are bad cars, bad pizzas and bad research companies. And there is bad
implementation. That does not impinge on the reliability of good research,
implemented by good radio people. All it proves is, returning to the
bell curve, that half the population has an IQ under 100.

All of which raises question about the quality of research, and the
real effectiveness of a sample.

100, 200, 400, or 1000....it doesn't matter. Replicable results are
no surprise if the axiomatic assumptions are the same. And the results may
work. But they're not the same as a 100% sample. If they were, no radio
station, no business, doing research, would ever fail for lack of
audience.


Much more of a station is the execution of the format. I can show you how
the same good research used by a bad PD produced half the ratings of the
same research used by a good PD at the same station in the same market.

I can give a magnificent palette and a wonderful set of brushes to a monkey,
and he will still not be Van Gogh.

Airchecking and training talent, doing compelling promotions, getting
involved with the artists, making every hour flow beautifully, refreshing
promos, making sure the audio is right for the station listener group,
holding a tolerable commercial load, etc., etc. are what makes a good music
list work... it is the whole station, not the research... the research is
just one tool in a kit. Necessary, but so are all the others.
So, getting back to my point...there are still executive decisions
to be made in programming music. Those executive decisions are made by the
station, or its parent. And only a sample of the listeners actually have
input. That sample may pick the songs they pick, but the other part of
the process is the executive decisions about category, rotation, and
execution. The executive decisions are what separates music data from
music programming.


The rotations are a product of scores, nothing more and nothing less.
Categories are simply collections of like-testing songs that move like
gears, every song a tooth. There are really no decisions other than, "here
is the list... play them in order of appeal." In fact, a music test should
instruct the listeners to indicate "how much do you want to hear this song
on the radio today."
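
To make the "gears" image concrete (a toy sketch with invented category
names, song labels, and clock positions, not any actual scheduling
software): each category is a loop of like-testing songs, and the hour's
clock simply turns each gear one tooth per slot, so the smaller, hotter
categories come back around sooner.

    from itertools import cycle

    # Hypothetical categories built from test tiers; invented song labels.
    categories = {
        "Power":     cycle(["P1", "P2", "P3", "P4"]),           # top-testing songs
        "Secondary": cycle(["S1", "S2", "S3", "S4", "S5", "S6"]),
        "Gold":      cycle([f"G{i}" for i in range(1, 21)]),    # deep, slow gear
    }

    # A simple hour "clock": which category supplies each slot, in order.
    hour_clock = ["Power", "Secondary", "Power", "Gold",
                  "Power", "Secondary", "Power", "Gold"]

    def schedule_hour():
        # Fill one hour by turning each category gear one tooth per slot.
        return [next(categories[cat]) for cat in hour_clock]

    for hour in range(3):
        print(f"Hour {hour + 1}:", schedule_hour())

The test scores decide which tooth sits on which gear; the clock just keeps
the gears turning, which is about all the "decision" that remains.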
So, I stand by my assertion that your statement may be accurate,
but, at best, incomplete.


The only possible area of "incompleteness" would be sample size. But
testing has shown that doubling or tripling has no effect on the results.
Going any further would be beyond the economics of radio, so it is not
really incomplete but, rather, impossible.



Wow. You're amazing. You've debated every point that wasn't at issue,
here. Are you SURE you're not Michael Bryant?

To review....the point I was trying to make, which apparently got
lost in a lot more things than I had intended to say....


Your original statement was that you don't program the music, the
listeners program the music.

My rebuttal, which need not be repeated here in its detail for the
fifth time, is that your listeners DON'T program the music; rather,
a sample of your listeners has influence on the songs you play.
But the Programming of the Music is still based on decisions of PD's
and Consultants.


Or for those in Rio Linda....a group of your listeners pick songs,
YOU PROGRAM the music based on them.


Damn, David... I love you like a brother, but ****....sometimes,
you're such a Consultant.