Re: Autocorrelation (Peter Cariani)

Subject: Re: Autocorrelation
From:    Peter Cariani  <peter(at)>
Date:    Wed, 19 Jul 2000 16:56:12 -0400

Christian Kaernbach wrote:

> > If the proponents of a theory believe in ellipses, one does not make a
> > model with circles, falsify it, and expect that they will agree that
> > their model has been falsified.
>
> I did not mean to attack you personally. None of my comments was meant
> to degrade the importance of your work. I am sorry I could not convince
> you that the results of K&D should be considered seriously by anyone who
> is integrating an autocorrelation stage in his/her model. I will
> nevertheless uphold this point.

I haven't taken any of this as a personal attack, and I certainly
haven't intended any of my criticisms as a personal attack directed at
you. I also don't want to understate the implications of your
psychophysical demonstration, because it has advanced my thinking on
these matters.

> You are right: there is not ONE autocorrelation theory, there are plenty
> of them. It would not be very meaningful to falsify only one of them,
> and a tremendous work to falsify all of them. Please note that
> "Psychophysical evidence against ..." is not "Falsifying ...". It is a
> weaker formulation, and I think it can be sustained in this form,
> because any of those realistic autocorrelation models will nonetheless
> inherently treat first- and higher-order intervals alike (at least to my
> intuition). If this is not so, please demonstrate it.

Part of what bothers me isn't necessarily your fault, but what I hear
from others who have read (or know of) the paper and come away thinking
that all pitch models based on all-order interval representations have
been invalidated. That may not have been your intention.

> > While autocorrelation alone does not account for the masking
> > (you are right, how could it?), I think cochlear filtering + neural
> > processing + central all-order interval analysis does.
>
> I would be happy to see the proof of this statement.

If we see a pattern of neural responses in the auditory nerve, I assume
that it is explicable in terms of these general factors. The problem is
not in an all-order interval analysis per se, but in our models of
auditory nerve responses, which very often don't replicate temporal
discharge patterns very well. In order to get at the kinds of spike
precedence effects that one sees in connection with these kinds of
stimuli, I think one would need to model individual spike initiation
and recoveries. Maybe simple recovery models would suffice to explain
this.

If fibers fire with high reliability at the first click and are put
into refraction for several milliseconds, then there will be few
responses to a second click. If you randomize the interspersed clicks,
and you have this sort of process, then the intervals produced won't
have much of a peak at 1/F0.

> > The specific adjustments that we need to make in our assumptions
> > involve taking into account the kinds of temporal precedence effects
> > that seem to be operant in high-CF fibers when one has unresolved
> > harmonics (higher frequency components & higher harmonic numbers).
>
> I can easily imagine that with adaptation processes etc. one could tune
> a model such that it would find KXX sequences (K=5 ms, X random from
> [0,10]ms, the triple being repeated over and over again) but not find
> ABX (A random [0,10], A+B = 10, X random [0,10]). This would be so
> because for KXX the model would have to look for K=5ms, and for ABX it
> would have to look for A+B=10ms, i.e. at a different temporal "region".
> Or one could tune a model such that it would detect KXX (K=5, X random
> [0,10]) but fail to discover ABX (A+B=5, X random [0,5]) because of the
> higher overall click density. Both approaches would, however have
> difficulty to explain why it is possible to detect KXX for K=2.5, 5, 10,
> and 20 ms, and why one fails to detect ABX for A+B=5, 10, and 15 ms.

I just made some isochronous click trains from harmonics of 100 Hz,
both 3000-10000 Hz and 5000-10000 Hz, and added the click trains to
themselves, offset with a delay that ranged from 1-5 msec. (For example
A = 1 msec, B = 9 msec.)

101000000101000000101000000

The 1 msec offset hardly masks the pitch at all, while there is a
little masking by the time one gets to 3 and 4 msec, but not enough to
obliterate the low pitch. None of these offsets changes the pitch. Now,
if one assumed 1 spike per click, and a first-order interval
representation, then the pitch should change and the 100 Hz pitch
should be masked out (but it doesn't and isn't). What would be your
interpretation of this?

I think the increase in masking going from 1 msec to 4 msec offset is
perhaps due to the greater number of fibers responding to the second
click, that then reduces the relative fraction of 10 msec intervals in
the all-order distribution (and salience of the pitch). In any case,
the strong masking effect seems to be linked with random interspersing
intervals (AB) rather than regular ones (K), the former being the more
disruptive in the presence of spike precedence effects.

-- Peter Cariani
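
The counting argument about the doubled click train can be sketched numerically. The Python below is only an illustration under the idealized assumption named in the message (exactly one spike per click, with no jitter, refractoriness, or cochlear filtering); the function names, the 50-period train length, and the 15 msec histogram limit are hypothetical choices made for this sketch and are not taken from the experiment described. It counts first-order intervals (between successive spikes) and all-order intervals (between every pair of spikes) for a 10 msec train added to a copy of itself delayed by 1 msec.

```python
from collections import Counter


def click_times(period_ms, offset_ms, n_periods):
    """Click times (ms) of an isochronous train plus a delayed copy of itself."""
    times = []
    for k in range(n_periods):
        times.append(k * period_ms)              # original click
        times.append(k * period_ms + offset_ms)  # delayed copy
    return sorted(times)


def first_order_intervals(times):
    """Histogram of intervals between successive events only."""
    return Counter(later - earlier for earlier, later in zip(times, times[1:]))


def all_order_intervals(times, max_interval_ms):
    """Histogram of intervals between all event pairs, up to max_interval_ms."""
    counts = Counter()
    for i, start in enumerate(times):
        for end in times[i + 1:]:
            interval = end - start
            if interval > max_interval_ms:
                break  # times are sorted, so later events are farther away
            counts[interval] += 1
    return counts


# Assumed stimulus: 100 Hz train (10 ms period), copy delayed by 1 ms (A = 1, B = 9).
times = click_times(period_ms=10, offset_ms=1, n_periods=50)
first = first_order_intervals(times)
all_order = all_order_intervals(times, max_interval_ms=15)

print("first-order:", sorted(first.items()))
print("all-order:  ", sorted(all_order.items()))
```

Under these assumptions the first-order histogram contains only the 1 and 9 msec intervals, while the all-order histogram has its largest count at 10 msec, flanked by 9 and 11 msec. That is the sense in which a strict first-order, one-spike-per-click reading predicts a changed or masked pitch, whereas an all-order reading still carries the 100 Hz period. The sketch says nothing about how real fibers respond to the second click.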

This message came from the mail archive
maintained by:
DAn Ellis <>
Electrical Engineering Dept., Columbia University