
(statistamatics question)

Here's a challenge: put it in simple English, if you are capable :cool:

The wiki article posted is in simple English. The various explanations about circularity in using the same data to both generate and test a hypothesis are in simple English. However, technical jargon is more precise and simpler in the right circumstances, e.g. the OP.
 
Quoad, can I give you a stupid example of why you can't do what you want to do? It's not very rigorous, but it has the advantage of being pithy.

Suppose I have two cats. One of them never eats chicken cat food and the other never eats lamb cat food. So every night I put out one bowl of each.

One night, the lamb-eater goes to bed with me, because I am a big softy. He has not eaten his cat food, but that's OK -- he'll have it in the morning. The other hangs around downstairs. I have no cat flap and there are definitively no other animals in the house.

The next morning, myself and lamb-eater open the bedroom door and come downstairs. The lamb cat food has gone!

Now -- what is the probability that the chicken-eater also ate the lamb cat food?

How does this probability vary from the situation in which neither cat slept in my bedroom?

Therein is your problem. You are trying to do something analogous to the first case whereas you actually need to do something more like the second. By reusing your data, you aren't assessing the independent probability that the data was generated from the hypothesised scenario -- you are assessing the probability that the data was generated given that you know that the data already happened. It's a totally different assessment.
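
If it helps, here is the cat example with numbers attached. This is only a sketch: the two probabilities are made up for illustration (nothing above gives real figures), and it assumes the cats act independently and that the bowl ends up empty if at least one cat with access eats from it. The function name is hypothetical too.

```python
from fractions import Fraction

def prob_chicken_eater_did_it(p_chicken, p_lamb):
    """P(chicken-eater ate the lamb food | the lamb food is gone).

    p_chicken: chance the chicken-eater eats lamb food on a given night.
    p_lamb:    chance the lamb-eater eats it (0 if he was shut in the bedroom).
    Assumes the cats act independently and the bowl is empty
    if at least one of them ate from it.
    """
    p_gone = 1 - (1 - p_chicken) * (1 - p_lamb)
    return p_chicken / p_gone

# Made-up numbers: the chicken-eater "never" eats lamb, the lamb-eater usually does.
p_chicken = Fraction(1, 100)
p_lamb = Fraction(95, 100)

# Lamb-eater shut in the bedroom: he had no access, so his chance is 0.
shut_in = prob_chicken_eater_did_it(p_chicken, Fraction(0))

# Neither cat shut in: both had access to the bowl.
both_free = prob_chicken_eater_did_it(p_chicken, p_lamb)

print(shut_in)    # 1
print(both_free)  # 20/1901, roughly 0.01
```

Same empty bowl, same cats, but the answer goes from about 1% to certainty once you condition on what you already know happened.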

Right.

Sorry to return to this after quite some time...

I'm reading a paper in which the authors are trying to compare inductively and deductively derived clusters.

Long story short, it's hypothesised that there are three types of domestic violence perpetrator. The authors seek to compare a deductive and inductive methodology, to see whether or not they produce the same clusters.

So.

i) Administer battery of tests to all perpetrators (the sample's only 49, heh).
iia) Run test results through an (inductive) mixture analysis (essentially an exploratory cluster analysis)
iib) Run test results through a (deductive) clustering process, using decision rules to allocate perpetrators to clusters.
iii) Compare iia) and iib).
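
The steps above can be sketched roughly as follows. This is not the paper's method: the scores are synthetic, the cut-offs in the decision rule are invented, a plain one-dimensional k-means stands in for the mixture analysis, and the Rand index is just one possible way of comparing the two allocations.

```python
import random
from itertools import combinations

def deductive_labels(scores, cutoffs=(-1.0, 1.0)):
    """Allocate cases using decision rules fixed in advance (invented cut-offs)."""
    low, high = cutoffs
    return [0 if s < low else 1 if s < high else 2 for s in scores]

def inductive_labels(scores, k=3, n_iter=50):
    """Stand-in for a mixture analysis: 1-D k-means fitted to the data itself."""
    ordered = sorted(scores)
    # Start the centres at evenly spaced quantiles of the observed scores.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: abs(s - centres[c])) for s in scores]
        for c in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return [min(range(k), key=lambda c: abs(s - centres[c])) for s in scores]

def rand_index(a, b):
    """Share of case pairs on which the two allocations agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# i) Synthetic "test battery" scores for 49 cases drawn from three groups.
rng = random.Random(1)
scores = ([rng.gauss(-2, 0.5) for _ in range(16)]
          + [rng.gauss(0, 0.5) for _ in range(17)]
          + [rng.gauss(2, 0.5) for _ in range(16)])

# iia) and iib) on the same data, then iii) compare.
agreement = rand_index(deductive_labels(scores), inductive_labels(scores))
print(f"Rand index between the two allocations: {agreement:.2f}")
```

The point to notice is that the decision rule never looks at the k-means output, and vice versa. Both just get fed the same 49 cases.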

I think that's ok because they're not seeking to confirm their exploratory with a confirmatory. They're testing one against the other.

But I'm slightly confused, as they're using the same dataset to run an inductive and deductive analysis on. Which is pretty much the same as an exploratory and confirmatory analysis, right?

:hmm:
 

No. They're comparing two different methods using the same data. Using the same data is the best way to compare two different methods because they're trying to say something about the performance of the methods.

You want to use one technique to generate a hypothesis and another technique to prove that the hypothesis is true. You're trying to say something about the underlying relationships in the data and so you need to be using two different data sets. Using the same data is circular, because whatever chance structure/spurious relationships are in the data you generated the hypothesis from will still be there in the data you test the hypothesis on.
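
A quick simulation shows the circularity. It is a sketch under made-up settings (20 candidate predictors, 49 cases, pure noise throughout, hypothetical function names): the "hypothesis" is whichever predictor correlates best with the outcome, and it is then checked once on the same data and once on a fresh draw.

```python
import random
import statistics

def correlation(xs, ys):
    """Pearson correlation, written out so it runs on older Python versions."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def make_noise(rng, n_rows, n_features):
    """Predictors and outcome that are unrelated by construction."""
    features = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)]
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    return features, outcome

def simulate(n_trials=200, n_rows=49, n_features=20, seed=0):
    rng = random.Random(seed)
    same, fresh = [], []
    for _ in range(n_trials):
        features, outcome = make_noise(rng, n_rows, n_features)
        # Generate the hypothesis: pick the predictor that looks strongest.
        cors = [correlation(f, outcome) for f in features]
        best = max(range(n_features), key=lambda j: abs(cors[j]))
        # "Test" it on the very data that suggested it.
        same.append(abs(cors[best]))
        # Test it on an independent data set instead.
        new_features, new_outcome = make_noise(rng, n_rows, n_features)
        fresh.append(abs(correlation(new_features[best], new_outcome)))
    return statistics.fmean(same), statistics.fmean(fresh)

same_data, fresh_data = simulate()
print(f"mean |r| when tested on the same data: {same_data:.2f}")
print(f"mean |r| when tested on fresh data:    {fresh_data:.2f}")
```

There is no real relationship anywhere in this data, yet the same-data check reports a noticeably larger correlation than the fresh-data check, because the chance structure that picked the predictor is still sitting there when you test it.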
 
But they decided what they were going to look for before they got their data. They didn't have a sift through first and then decide what to look for. That's the difference.
 
OK, sorry, to rephrase...

Are they comparing an exploratory and confirmatory analysis?

I'm aware that I'm seeing 'exploratory' as a direct stand-in for 'inductive' and 'confirmatory' as a direct stand-in for 'deductive'.

Which may be an abuse of their technical meanings :hmm:

But I get the rest of the stuff :) TY.
 
Well, no. For what they are doing to be comparable to what you want to do, they would have to derive the decision rules for the deductive method from the clusters derived by the inductive method. They're not doing that - they're using decision rules which came from elsewhere, which were derived from different data. They may or may not have been testing the hypothesis that those decision rules accurately reflected the structure of the data, i.e. performed well.
 
I don't understand what all the fuss is about - it really is quite simple. Bits of the sea and coastal areas have names such as Humber, Dogger, Bank and the weather people check out what weather conditions there will be at sea and around the coast - for example showers or gales and wind direction. Sailors need to know if the sea is going to be rough or calm or in between. The shipping forecast gives them that information. People like the tune that is played before it comes on because it reminds them of ye olden days when it was just radio and Ovaltine and things were simpler.

So now I have provided High Voltage with an explanation for that which he/she finds inexplicable, on to Quoad's problem: I don't know, but as Morpheus said to Neo - the answer is out there.
 
Yes, I understand that :) And wasn't asking about my OP. Which I scrapped as rubbish long ago.

I was asking if confirmatory and exploratory have broadly similar meanings to deductive and inductive :)
 
Oh. Not really, no. Deductive/inductive approaches use different techniques to answer the same question, whereas exploratory/confirmatory analyses use the same technique to ask different questions.
 