The Quiz - analysis of the reasons given

The following Excel table shows the reasons the audience gave when attributing a composition to a virtual or to an authentic recording. The reasons were divided into 12 categories, although not all of the categories are equally relevant.

The number of submitted questionnaires was not large enough to provide real statistical evidence. Moreover, not all respondents gave their reasons, or they did not give reasons for all the compositions, so the numbers are too low to yield statistically significant results. Nevertheless, the overview may still be useful, given that a) the people who took the test were supposedly experienced listeners of organ recordings, and b) the results can show some basic facts about which reasons contribute to finding the right answer and which are irrelevant or even misleading.

As I lack the necessary statistical background, I rely fully on the analysis provided by Christian Datzko, to whom I owe much for the interpretation of the quiz results. This was the procedure he outlined when interpreting the results.
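As a rough illustration of what a correlation coefficient can express in this context, the sketch below computes a Pearson coefficient between two yes/no series: whether a listener cited a given reason, and whether the guess was correct. This reading is an assumption made for illustration only; it is not necessarily the calculation Christian Datzko applied to the Excel sheet. The function name `pearson` and both lists are hypothetical, and the data are invented, not taken from the quiz.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Invented illustration data, NOT the quiz results.
# One entry per guess: 1 = the listener cited the reason, 0 = did not.
cited_reason = [1, 1, 1, 0, 0, 0, 1, 0]
# One entry per guess: 1 = the guess was correct, 0 = it was wrong.
guess_correct = [1, 1, 0, 0, 1, 0, 1, 0]

r = pearson(cited_reason, guess_correct)
print(f"r = {r:.2f}")  # prints: r = 0.50
```

Under this reading, a value well above zero would mean that guesses citing the reason tended to be correct, a value near zero that the reason did not help, and a negative value that it tended to mislead. With only a handful of answers, any such figure remains unstable.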
Correlation coefficients

action: 0.41

When interpreting these figures, we must always remember that the amount of data is most probably insufficient to give statistically relevant results. With that caveat, only listening to the action noise appears to help in distinguishing the authentic from the virtual recording. On the other hand, listening to the tuning/mistuning of the organ proved to be misleading (see below for the explanation). The remaining arguments give no help towards a good guess.

Analytical table

Download the Excel sheet to see the analytical table.

The categories of reasons:

1) Action noise (Tracker): respondents often argued that the tracker response should vary greatly in an authentic recording, while it would be more or less constant in a virtual recording, where only a limited number of key noise samples are available. The results show that this might be a good clue to the right answer. A personal note: indeed, when I listen to the recordings myself, this is the only difference which I, as the creator of the sample set, find noticeable.

A comment: the St. Carlo sample set offers variable keyboard noise samples. There are about 3-4 different noise samples per key, chosen at random for playback, so pressing the same key several times can give different sounds. Moreover, the key noise volume responds to the MIDI velocity: the faster you play, the louder the tracker sound, and vice versa. Of course, you need a velocity-sensitive MIDI keyboard for this. The recordings were made with a velocity-insensitive keyboard, except for no. 6, where a velocity-sensitive MIDI keyboard was used. Still, the tracker behaves differently in reality than it is modeled in Hauptwerk. The clue lies in the enormous variability and unpredictability of the tracker noise, which could be modeled only with an enormous number of different noise samples.
This would increase the RAM demands considerably (compare this to a good drum simulator, which by itself usually takes tens of GB of data), and I think it is perhaps even better not to model this in great detail, as it is really a chance phenomenon. (See the notes on the "degree of realism" for further discussion of the limits of an organ model.)

2) Ambience: this category deals with the overall sound impression and the spatial representation of the recording. Although many people tried to interpret the ambience of the sound, the results are almost negligible. Moreover, it is hard to say whether all respondents understood the term in the same way, so the relevance of this column is really questionable.

3) Attack: the third column represents arguments connected with the attack phase of the pipe speech (attack, pipe coupling and similar). Next to the action noise discussed above, this criterion gave some results, although it is no guarantee.

4) Beating: this category applies to Composition no. 5 only, where a beating Fiffaro was used. Unfortunately, there were not enough answers to give statistically relevant details.

5) Brilliance: the brilliance of the sound was often used as a criterion. The results show that listening to the brilliance of the sound led nowhere. Since a high-quality denoising technique was applied during the preparation of the sample set, there is no loss of brilliance in the Hauptwerk version. This argument can hardly provide a valid criterion for telling the recordings apart.

Note: However, I have the impression that in reality the coupling of the pipes can change the brilliance of stops when they are combined. It seems to me that adding an Ottava to a Principale excites higher partials, producing a slightly brighter sound in reality than when the sound of the two stops is only mechanically summed in Hauptwerk's internal "mixing desk".
On the other hand, when performing with a plenum, I had the impression that Hauptwerk sounds brighter than the real instrument. But this might be only subjective, so I cannot guarantee it.

6) Loops: very few answers, nothing statistically important (users reported "wrong" loops even in authentic recordings).

7) Parasitic noise: users reported hearing parasitic noise in the recording (accidental knocks, noises from the street and similar). Of course, these can occur in an authentic recording, but I tried to use only recordings in which none were present. So this criterion could not be used to tell the authentic and virtual compositions apart. It seems to me that some users were also misled by the blower noise, which is present in the virtual as well as in the authentic recordings.

8) Performance: some users thought that errors and glitches in the performance could be a criterion. This cannot be a really important criterion, as I make many errors every time I play the organ...

9) Reverb: users often reported differences in the reverb tails of the compositions (especially the reverb at the very end). However, the correlation coefficient shows that the results of guesses based on this criterion are almost negligible.

10) Staccato effect: this refers to the known short-note handling issue in Hauptwerk, also called the "bell" or "harp" effect. Since these recordings were made with the preliminary version of the sample set, which used only one reverberation tail, I expected this criterion to be of great help to the listeners. However, it was not a good criterion, because it did not help at all in distinguishing the authentic and virtual recordings.

A note: the final version of St. Carlo is distributed as a multi-release sample set. There are up to 4 different release tails, depending on how long the key is held. This feature greatly improves the short-note handling, and the overall spatial representation is much better as well.
Therefore, there might be an inner connection between this criterion and the ambience criterion mentioned above (point 2).

11) Tuning: some respondents thought that they could tell the authentic from the virtual recording by the overall tuning errors of the pipes. As every real organ is always a bit out of tune, the authentic recordings should be those with the greater tuning errors. However, Hauptwerk is capable of modeling random tuning errors, so this cannot be a good criterion at all. Indeed, I intentionally set the random tuning error to a higher level (sometimes even 400%). Listening to the tuning errors therefore proved to be very misleading. One could recommend not listening for tuning errors at all.

12) Wind modelling and instability: this criterion was used only in rare cases, and it gave no relevant results.

Special thanks to Christian Datzko, who helped interpret the results of the St. Carlo quiz!