The Sonogram: A Tool for the Documentation of Musical Structure

 

 

 

Tamas Ungvary and Simon Waters

 

 

                    The purpose of this paper is threefold: to introduce a technique for representing musical information, to suggest potential uses for the tool described, and to give some brief practical examples of the sonogram's uses as an analytical tool.

 

                    The paper includes a rudimentary tutorial in reading and interpreting this new type of "score", a brief historical background, and a basic description of the two computer programs involved in the production of a sonogram.  This will demonstrate the extent to which the sonogram's selection of information and flexibility of representation will permit its optimization for many different situations.  In suggesting potential uses of the sonogram, emphasis is placed on its role as an educational tool with implications for both the music analyst and the composer. 

 

                    The sonogram's particular value lies in its provision of a visual reference for musics which cannot be notated conventionally (e.g. ethnic or electro-acoustic music), but it is suggested that the nature of the insight into many areas of musical structure which it allows will encourage applications even with notated musics.

 

 

Basic Principles

 

                    The fundamental difference between a sonogram and a conventional musical score is that the latter is prescriptive, which is to say that its primary purpose is to convey a series of instructions to the performer.  These instructions consist of several types of symbolic code representing high-level musical concepts such as pitch, rhythm and dynamics, allowing the performer to reconstruct the music, but leaving open some degree of interpretive freedom concerning the exact details of performance.  Conventional notation thus offers relatively little visual information about the music for those who are not trained to read or decode it.  It has evolved as the most elegant possible solution to the problem of providing instructions from which trained experts can recreate a certain type of pitch and rhythm based music. 

 

                    In situations where conventional notation becomes too complex for easy recognition of its constituent shapes and structure, or when the main musical concerns are not pitch and rhythm but perhaps timbre, the sonogram, as a descriptive notation which enables representation of sound in relatively unambiguous form, becomes useful. In the sonogram the vertical axis represents frequency and the horizontal axis time.  Amplitude is represented by printing density (an increase in amplitude for a given frequency being represented by a corresponding darkening of the printing matrix).  Speech researchers have long used the same basic format, in analogue form, to make `voice prints' using an apparatus known as a sonograph.  See figure 1a and b for a comparison of analogue and digital sonograms.

 

 

 

 

           Figure 1a: Analogue sonograms (1973) from working process of Tamas Ungvary's "Anons"[1]

 

 

 

 

 

 

Figure 1b: Digital sonogram (1988) from the finished work. 

                    The sonogram makes it possible to relate sounds intuitively to visual images.  It allows the recognition of individual sound objects, descending and ascending contours, silences, sound density in both the frequency and time domains, registration, sound profiles, characters and gestures.  It represents the spectrum change over the course of a given musical work or performance, the resulting frequency and amplitude contours apparently relating closely to the experience of many listeners.

 

 

Historical Background

 

                    The sonogram has many historical antecedents.  The invention of the first permanent method for recording the visual world in 1816 (the camera obscura) was closely followed by attempts to make visible records of the sound world.  The first documented successes in this field, in 1857, were those of Léon Scott in France, using his "phonautograph"[2].  By 1890 the Finnish phonetician, Pipping, was able to derive spectral information relating to the amplitude and phase of vowel sounds[3].  F. Winckel gives detailed descriptions of the construction of the `sonograph', and its significance in acoustics research[4].  Gunnar Fant[5] gives a history, theory and practice of the use of `visible speech', also referred to as a `time spectrogram' and `voiceprint'[6].  All of these researchers used the same basic format (time/frequency, sometimes with an amplitude plot), using analogue techniques.  An extensive bibliography by Wolfgang Thies[7] is a testament to the increasing use of such techniques and Thies himself suggests the use of the `spectrogram', oscillogram, and perspectival spectral surfaces as a physical-acoustic score. Ethnomusicologists were among the first to recognize the potentials (and pitfalls) of what was often referred to as `automatic transcription'[8], Charles Seeger being a notable pioneer in this field[9].  Increasingly, researchers in all fields are using digital techniques to analyze and display the secrets of both individual sounds, and of musical and natural soundscapes.  The primary differences between the sonogram system we describe here and existing commercial `equivalents' lie in scale, definition and flexibility of format.  It becomes a practical proposition to analyze and print a complete musical work, with remarkable clarity, and with the sonogram optimized for its required purpose in terms of layout.

 

 

Perceptual Considerations

 

                    Pitch, rhythm and dynamics were referred to above as high-level musical concepts, to distinguish them from their `objective' scientific equivalents, frequency, time and amplitude.  The nature of the distinction can be illustrated by pitch, for example, this being a term for a group of frequencies in which one particular frequency is perceived as dominating, this being often (but not always) the fundamental (lowest) frequency. Frequency can therefore be seen as an `objective' component of the higher-level perceptual construct, pitch.  The extent to which pitch is context dependent (subject to considerations of relative amplitude, register, cultural conditioning and environmental situation) is well documented. In interpreting the sonogram, which documents frequency rather than pitch, it is important to recognize such disparities between perception and representational practicality.

 

Technical Description: Analysis

 

                    The music is sampled (digitally encoded) at a rate of 50kHz (50,000 times a second), a process analogous to the discrete `snapshots' which provide the illusion of continuity in film (where the `sampling rate' is 24 Hz or 24 times a second).  Once the sound is stored in digitized form it can be analyzed with signal processing software.  The program used to obtain a spectral analysis implements the Fast Fourier Transform (FFT), which works on the principle that any signal which represents a complex sound can be broken down into component sinusoids of different frequencies.  Using a very simple, steady state sound as an input example, the result of the FFT might typically be a strong fundamental frequency with a series of harmonic frequencies of lower amplitude.  In almost all sounds however, the spectrum is not steady state but constantly changing over the course of time, even within the duration of a single note.  An analysis of the spectrum at a single point in time does not therefore provide sufficient information to constitute a useful description of sound.  By repeatedly analyzing the sound's spectral content from moment to moment, and displaying it in appropriate form, the character of the sound becomes evident.  This repeated analysis necessitates the successive segmentation of the sound into blocks, the information within each of which is changed by the FFT from the time-amplitude domain into the frequency-amplitude domain.  Each block of values is called a window, and the FFT may be utilized with windows of different size, the selection of the window size having ramifications for both frequency and time resolution.  In general, a larger window size will increase frequency resolution, allowing analysis of lower bass frequencies, and of a larger number of frequencies, while a smaller window size will improve time resolution (and therefore provide enhanced rhythmic detail).  Clearly there is a tradeoff between good frequency resolution and good time resolution, and the balance between these two should be determined according to the intended purpose of the sonogram.  The tradeoff can, however, be mitigated relatively simply by using overlapping, rather than discrete sequential, windows. This results in very high resolution at the expense of increased computing time. See figures 2 and 3 for a comparison of sonograms with different resolutions.
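
                    The effect of window size and overlap can be made concrete with a short numerical sketch.  It is not the authors' analysis program: it assumes Python with NumPy, a Hann taper (the text names no window shape), and that the `window overlap increment' given in the figure captions is the number of samples by which successive windows advance.  The function names are illustrative only.

```python
import numpy as np

SAMPLE_RATE = 50_000  # Hz, the sampling rate stated in the text


def resolution(window_size, hop, sample_rate=SAMPLE_RATE):
    """Return (spacing between analysed frequencies in Hz, time step in seconds)."""
    return sample_rate / window_size, hop / sample_rate


def sonogram_data(signal, window_size, hop):
    """Magnitude spectra of successive, possibly overlapping, windows.

    Returns an array of shape (n_windows, window_size // 2 + 1).
    """
    taper = np.hanning(window_size)  # assumed taper, not taken from the paper
    starts = range(0, len(signal) - window_size + 1, hop)
    frames = np.array([signal[s:s + window_size] * taper for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))


if __name__ == "__main__":
    # Window size and increment as listed in the captions of figures 2 and 3.
    for label, n, hop in [("figure 2", 4096, 1024), ("figure 3", 16384, 512)]:
        df, dt = resolution(n, hop)
        print(f"{label}: {df:.2f} Hz between bins, {dt * 1000:.2f} ms between windows")

    # One second of a steady 1 kHz sine tone as a simple test input.
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    spectra = sonogram_data(np.sin(2 * np.pi * 1000 * t), 4096, 1024)
    print("strongest component near", spectra[0].argmax() * SAMPLE_RATE / 4096, "Hz")
```

                    Under these assumptions the spacing between analysed frequencies depends only on the window size, while the time step depends only on the increment, which is why overlapping windows give more closely spaced analyses at the cost of more computation.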

 

  

 

 

Figure 2: Charles Mingus/Eric Dolphy "Better Git it in your Soul" (1959). 4096/1024/130/16640/-25

 

 

 

  

Figure 3: Charles Mingus/Eric Dolphy "Better Git it in your Soul" (1959). 16384/512/65/2080/-25

 

 

 

Technical Description: Graphics

 

                    Following the analysis, the data amassed must be interpreted and transformed into graphical form by a second programme.  The graphics quality is dependent on the characteristics of the available printer.  For the current examples, a printer/plotter capable of printing 792 dots per line on the vertical (frequency) axis was used.  Since each frequency and its associated amplitude is represented by a 3*3 dot matrix, the program can plot 264 (792/3) individual frequencies only.  Despite this theoretically unsatisfactory frequency resolution, it appears to provide sufficient visual information to satisfy the requirements of many applications.  The potentially problematic printing resolution, and the likelihood of differing requirements of presentation, indicate that the graphics program should be as flexible as possible, with user-defined interpretation of the data produced by the FFT, and of the format or layout of the printed result.
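
                    The row arithmetic above, and one plausible way of assigning frequencies to rows, can be sketched as follows.  The equal division of rows among octaves and the function name are assumptions made for illustration; the text does not specify how the graphics program allots rows.

```python
import math

DOTS_PER_LINE = 792            # printer dots on the vertical (frequency) axis, from the text
CELL = 3                       # each plotted frequency occupies a 3*3 dot matrix
ROWS = DOTS_PER_LINE // CELL   # number of plottable frequency rows


def row_for_frequency(freq_hz, lowest_hz, octaves, rows=ROWS):
    """Return the row index (0 = lowest) for a frequency, or None if out of range.

    Assumes every octave is given the same share of the available rows.
    """
    if freq_hz < lowest_hz:
        return None
    position = math.log2(freq_hz / lowest_hz)  # distance above lowest_hz, in octaves
    if position >= octaves:
        return None
    return int(position * rows / octaves)


if __name__ == "__main__":
    # Seven octaves upward from 130 Hz, the range quoted for several of the figures.
    for f in (130, 260, 1000, 16000):
        print(f, "Hz -> row", row_for_frequency(f, 130, 7))
```

                    With a logarithmic vertical axis of this kind each octave occupies the same height on the page, which matches the octave lines drawn on the sonograms.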

 

                    Among the user-definable parameters, the following are the most significant.  The printing format may be B4, A4 or smaller, continuous or with separate pages, and with optional music staves above the sonogram.  The nature of the information represented can be specified, including the lowest frequency to plot (in Hz), the number of octaves to be plotted (max 14), whether continuous octave lines should be printed (as is the case in most of our examples), and the threshold in dB for each octave of the analysis.  This last facility allows the removal of spurious information or the stressing of significant spectral areas.  The plotting of a summed amplitude curve is a further option, with the possibility of quantization of the graphic output;  and various divisions of the horizontal scale are offered, the most obviously useful being the indication of seconds and of windows.  The start and end times on any input FFT datafile can be specified to enable extraction of specific portions of an analysis for printing, which is also useful for running tests to determine the most appropriate user values as input to the plotting program.  The spacing allotted for each octave can be determined to allow enlargement of significant spectral areas within a given sonogram, and the positioning of such options as amplitude plots and time markings is also definable.  One of the three different matrix sequences can be chosen but the program will always scale the 10 plotting densities from white to black (which result from the 9 fonts of the 3*3 plotting matrix), to 9 levels of amplitude between the chosen threshold and the maximum amplitude of the whole input sound file (a pseudo grey scale).
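
                    The scaling of amplitude to printing density might be sketched as follows.  This is an illustration under stated assumptions rather than the program's actual rule: the threshold is taken as a negative dB value relative to the maximum amplitude of the file, the nine amplitude levels are spaced evenly in dB, density 0 (white) is used for anything below the threshold, and a single threshold is applied rather than one per octave.

```python
import math


def density_level(amplitude, max_amplitude, threshold_db, levels=9):
    """Map a linear amplitude to a print density from 0 (white) to `levels` (black)."""
    if threshold_db >= 0:
        raise ValueError("threshold_db is assumed to be negative (dB below maximum)")
    if amplitude <= 0:
        return 0
    level_db = 20 * math.log10(amplitude / max_amplitude)  # 0 dB = loudest value in the file
    if level_db < threshold_db:
        return 0  # below threshold: the cell is left white
    fraction = (level_db - threshold_db) / -threshold_db   # 0.0 at threshold, 1.0 at maximum
    return min(levels, 1 + int(fraction * levels))


if __name__ == "__main__":
    # Amplitudes at 0, -10, -20 and -40 dB against a threshold of -25 dB.
    for db in (0, -10, -20, -40):
        print(db, "dB -> density", density_level(10 ** (db / 20), 1.0, -25))
```

                    Raising the threshold towards 0 dB whitens more of the page and spreads the nine darker densities over a narrower amplitude range, which is one way in which the facility described above could stress significant spectral areas.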

 

 

 

Potential Applications

 

                    We would like now to consider some of the potential applications of the sonogram, concentrating on those aspects of its use which differ from other more widely available notational/representational systems.  The difference between this particular programme and others of similar conception which are presently available for popular computers is primarily a matter of scale and flexibility.

 

                    The possibility of viewing a complete musical formal structure or architecture at a single scan is one of the most significant advantages offered by this approach.  This also facilitates the critical comparison of different works within a given genre (see figure 4a, b, c and d) and of different characteristic musical forms (figure 5a and b).  It becomes possible to distinguish characteristic spectral tendencies within the output of a given composer; Robert Cogan's observations[10] about the spectral clarity of Stravinsky's writing being a case in point.  The sonogram's use as a critical or analytic tool is not however restricted to the global structural level.  The possibility of relating different levels of the structural hierarchy, of seeing similarities of gesture or contour at different levels, of recognizing instantly such devices as transposition, repetition and elongation, is also a significant feature. See figure 6a, b, and c for sonographic representations of different compositional devices.  The internal micro-structure of individual sounds is also beautifully displayed (figure 7). As an educational tool within the context of aural analysis, formal criticism, compositional pedagogy, or in comparative study of notations/graphic score systems such features are of immediate practical use.

 

 

 

Figure 4a: Tamas Ungvary. "Anonce".  4096/2048/130/8320/-35

 

 

 

   

Figure 4b: Varèse: "Poème électronique"  4096/2048/98/12544/-35

 

 

 

Figure 4c: Ligeti: "Articulation".  4096/2048/130/16640/-30

 

  

 

 

 

Figure 4d: Mathias Fuchs: Digitally generated vocal transformation using the `Chant' programme.

 

 

 

 

 

Figure 5a: Verdi: Don Carlos, Act IV, "Ah, je ne verrai plus la Reine" (Eboli, Lucia Valentini Terrani, DG 415981-2). 4096/2048/130/16640/-35.

 

 

 

 

Figure 5b: Charles Mingus/Eric Dolphy:  "Better Git it in your Soul" (1959) 4096/1024/130/16640/-25.

  

                    A valuable example of such practical use, in the field of critical evaluation of performances, may be cited in the work of Anne Chatoney Shreffler[11].  Shreffler uses sonograms to support her thesis that the spectral characteristics of the (conical-bored) baroque flute, which are almost the inverse of those of its (cylindrical) modern equivalent, are essential for the clear structural articulation of the music which was written for it.  In the case of the example cited, the Bach Partita in A minor for unaccompanied flute, her thesis is totally convincing.

 

                    Of particular interest within the educational field and to the non-specialist public is the sonogram's intuitive `legibility'.  The majority of the features so far described are evident to the non-musically literate after the briefest of explanations of the principles of representation, not least because the possibility exists to draw analogies with other disciplines at various levels from the obvious (landscape) to the bizarre (weaving notation).  It is admittedly sometimes more difficult to separate individual sound objects within the sonogram because of the tendency for sounds to be spectrally spread or split over several frequency areas, or equally because of the tendency for composers to use sounds with spectral overlap - a sound object is not always represented by an equally obvious visual equivalent - but this seems to be one of the few aspects of the sonogram which has to be `learned' in the same way that certain fundamental aspects of the interpretation of conventional musical notation must be learned.  In most other respects the representational system of the sonogram is logically more consistent than conventional musical notation, which relies on the simultaneous interpretation of various different levels and types of information encoding.  This is not to imply any general superiority for the sonogram as such; merely that it represents the information it contains in a clear and consistent manner, just as conventional notation provides the most elegant solution we have as a prescriptive tool for lattice-based (pitch/rhythmic) music.

 

 

 

   

Figure 6a: Transposition, repetition and elongation in Tamas Ungvary's "Anons" 4096/2048/130/8320/-35.

 

 

 

 

 

Figure 6b: Reflection and expansion in Messiaen's "Abîme des oiseaux" (Quatuor pour la fin du Temps)

 

   

 

 

  

Figure 6c: Superimposition of trills in Tamas Ungvary's "Anonce" 4096/2048/130/8320/-25.

 

 

 

Figure 7: Internal micro-structure of sound.  Spectral decay in Messiaen's "Abîme des oiseaux" (Quatuor pour la fin du Temps).

 

                    Within the more specialized field of electroacoustic music there have been suggestions that the sonogram is more difficult to read than `conventional' mixed-parameter GRM[12] type study/diffusion scores (figure 8), where the rationale for representation of a sound object/event is based on the most elegant representation solution for each perceptually significant situation - rather than on the more scientifically consistent sonogram principle.  The problem of establishing a representational system which is both perceptually and practically useful, and methodologically consistent, is a real one which necessitates the addressing of aesthetic or philosophical problems as well as acoustical ones.  (Ethnomusicologists were among the first to recognize the problematic nature of `transcription'[13]).

  

 

  

Figure 8: Denis Smalley "Vortex" Extracts from the composer's diffusion score. 

 

 

 

Objectivity and Subjectivity

 

                    There is a tendency to feel that the sonogram presents an objective representation of the matter of the piece, which is not the case, as many choices about the manner of representation are made at the programming level.  The user-definable parameters described above further complicate this `objectivity'.  Certain constants nevertheless exist, and it is possible to say that the sonogram can provide a time-objective working tool for artists and technicians working in fields where music and other media interrelate - choreographers and dancers, performers working in conjunction with electroacoustic music on tape, film and video makers.  The accurately and proportionately  placed events on the sonogram form a fixed temporal reference for anyone concerned with the synchronisation of such media.

 

 

 

Composition

 

                    To recognize this provision of a fixed temporal reference is a step towards acknowledging the compositional usefulness of the sonogram.  As artists in different fields increasingly search for a common basis for their work, the digital encoding of such attributes as contour, gesture and transformation allows for the translation of constructional devices between media, with a consequent enriching of the conscious formal dialogue.  A practical example of such dialogue can be cited in the Nuntius project[14] in which composer and choreographer use a notational system with a common logical base, and can specify contours or gestures in terms of functions which map into parameters within each medium.  Obviously the success of such an experiment is largely dependent upon the skill (lack of arbitrariness) of such a mapping.  A further example of the sonogram's use as a reference point for collaborative artists working in different media is the GAZ project, a combination of dance, music, lighting and multiple slide projection commissioned by Fylkingen and Stockholm[15].  During the rehearsal period the sonogram was repeatedly referred to by all the participants, but in particular by the dancers who found no apparent difficulty in its use.

 

 

                    Within non mixed-media contexts the sonogram's compositional usefulness is undiminished.  In 1973 one of the authors (TU) developed speech synthesis software at EMS[16] in order to realize a composition containing a phrase in Swedish, "Sökes idémän." The software/hardware available at the time was only adequate for trial and error procedures[17], but the availability of analogue sonograms (figure 1a) allowed regular comparison between the "visible speech" of the reference phrase and the newly generated material.

 

                    Another practical advantage for studio-based composers is of a more prosaic nature.  Most composers will admit to experiencing a gradual `numbing' of the critical faculties through prolonged exposure to the same sound material[18].  This loss of aesthetic critical ability can be counteracted in various ways, the presence of a second person in the studio being a widely recognized antidote, and the sonogram can provide a similar service, as a reference to the sounds themselves, reducing the necessity for repeated playing, and as a control against which the vagaries of aural critical ability can be measured.  Tamas Ungvary's "Präludium" for organ and tape (1987) made use of the sonogram in this manner, the visual documentation of the architecture of the work freeing the composer from the limitations of simple audience comparison.

 

                    Other compositional possibilities center around the use of the sonogram or the data from which it is derived as the basis of both re-synthesis and transformed re-synthesis methods, ultimately with user-defined graphic input.  Since the sonogram reveals details of musical structure, the data involved in its production can also be used as input for other synthesis or notation systems.  A method utilizing this procedure has recently been established at EMS by guest composer Mathias Fuchs (Vienna) and one of the authors (TU).

 

                    Within tape-based electroacoustic music the sonogram provides useful visual support for aural decisions involving the matching or differentiation of spectral areas.  Such spectral information also allows the identification of specific frequencies, thus facilitating an accurate tailoring of the relationship between, for example, a tape or computer part and instrumental pitches/formant regions, or even the reworking of material from tape/computer sources in instrumental terms, perhaps using `spectra as chords' as suggested for example by James Dashow[19].  Dashow suggests the possibility of discovering hierarchic structural functions in non-harmonic frequency relationships, an approach that electroacoustic composers have been investigating intuitively for a long time, and perhaps by analyzing sonograms of such intuitively produced works we may come towards a better understanding of such structural functions.  The relationship between certain gestural archetypes (in both physiological and music/utterance senses) and their affective response is also a fruitful research area and one where sonograms have already proved useful. Here the work of Trevor Wishart[20] is particularly notable.

 

                    The sonogram has further application as a diffusion aid because of its accurate distinction between spectral areas.  This enables clear articulation and spatial separation of materials with different spectral characteristics through the use of loudspeakers with appropriately tailored frequency responses.  The most obvious application here is in multi-loudspeaker arrays at concert presentations of electroacoustic music, but this information could also be used for the optimization of crossover points in large multi-speaker systems for other purposes.

 

                    The elegance of the sonogram as an object in itself, the fact that it forms a visual reference for the musical public, making musical structure comprehensible in an intuitive manner, is at least as important a motivation for the making of the programme as the more specialized information it provides for the professional.  In solving the `problem' of the lack of a publishable score in the electroacoustic composer's output, the sonogram provides an explanation for the public of a music which has hitherto been regarded as `difficult' partly because of that lack.  The public is as dependent as the music-publishing/promoting world upon documentation and documentability as explanation and `proof'.

 

Analysis

 

                    In introducing some short examples of sonogram-aided analysis, reference must be made to its relationship in the field of music analysis to work by Schaeffer[21], Chion[22], Smalley[23], Cogan and Escot[24] and Thoresen[25]. All of these authors start from the principle of the primacy of aural experience in analysis, and the sonogram can be seen as a tool which supports this emphasis because of its provision of a description which is not dependent on words.

 

                    As an indication of the way in which a sonogram can illustrate some of the concepts used by Denis Smalley[26] in his spectro-morphological approach to the problem of musical description, we would like to present several brief examples.  Smalley's analytical method investigates musical structure at various levels.  Figure 9 illustrates, at the level of spectral typology, a transformation through the pitch to noise continuum, where the uppermost element, a steady sine tone, fluctuates a little, returns to stability, then gradually transforms into a widely distributed noise spectrum. Figure 10 indicates a similar transformation from a single pitch, through harmonic into inharmonic spectra, while simultaneously illustrating the nature of the movement - a plane followed by a sweeping descent with a rapidly increasing downward momentum.

 

                    Of the different morphological archetypes proposed by Smalley, the most universally accepted is the attack with decay, which is distinguishable on the sonogram because of the tendency for an attack to exhibit not only the characteristic amplitude envelope, but also a similar spectral envelope.  The presence of transients in the initial impulse stage of an attack typically results in a greater degree of inharmonicity, a wider frequency spread, and an increased high frequency content than in the continuant (decay) stage of a given sound. For this reason the attack points can be clearly determined in figure 11 despite the relatively sustained nature of the material.

 

                    Smalley's motion typology is complex, and it would be inappropriate to elaborate on it in detail here.  Suffice it to say that concepts such as contraction, divergence, accumulation, dissipation, undulation and convolution (to name but a few) are as well displayed by the sonogram as the plane and descent in figure 10, as is the style of motion (continuous/discontinuous, periodic/aperiodic, streamed/flocked).

 

  

 

 

 

Figure 9: Simon Waters: "Dangerous Liasons"

 

 

 

 

Figure 10: Simon Waters: "Suspended Animation 4"

 

 

                    More problematic is the identification of structural functions.  This is rarely possible from the sonogram alone; in combination with aural analysis, however, it becomes possible to distinguish such functions.  In figure 12 the relationship between the constant pitched pulsing of the central block of material and the preceding and following material is clarified by the sonogram.  The high-frequency trace `A' betrays the fact that the onset of the central block of material is a gradual emergence overlapping the preceding low frequency tam-tam sound `B', and that, after its establishment at `C' as the principal statement (coincident with the abrupt termination of the low tam-tam), the chiming material then gradually dissolves into the following section.

 

 

 

  

Figure 11: Simon Waters: "Suspended Animation 4"

 

 

Figure 12: Simon Waters: "Dangerous Liasons"

                    The sonogram's usefulness in the `phonological' tone-color analysis method devised by Cogan[27] is clear.  Cogan's own analyses use spectrum photos, so the substitution of the digital sonogram for these seems entirely appropriate.  In order to define the sonic properties and spectral development of any given music, Cogan divides the work into sections and subsections relating to its evident architecture, which are then assessed according to a scheme of `oppositions' (pairs of qualitative opposites), with a minimum of thirteen such oppositions being regarded as a statistically useful sample.  The adjectives defining these oppositions are assessed as positive or negative (+ or -), and intermediate states as neutral or mixed (0 or +/-).  The context-dependent nature of the adjectives chosen is recognized, and their application is related to the work as a whole.  For definitions of the oppositions the reader is referred to Cogan's New Images of Musical Sound (Harvard U.P., 1984) and "Stravinsky's Sound: A Phonological View," Sonus, (Spring 1982) pp. 15-20.

 

                    Figure 13 (opposite) shows the sonogram score of the first section of T. Ungvary's  `Präludium' for organ and tape, and Figure 14 (below) indicates the result of an analysis of this section of the tape part in tabular form.  These results are then translated into graph form, and figure 15 shows the resulting graphs for the whole work, with separate assessment of the organ and tape parts. A third graph indicates the `rate of change' of sonic characteristics, with separate traces for organ and tape.  (These graphs represent the complete work.  Only the first part of them relates to the preceding figure and sonogram.)
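
                    The step from a table of oppositions to graphs of totals and of rate of change might be sketched as follows.  The numeric values given to the marks, the treatment of a mixed mark as zero, the definition of rate of change as the number of oppositions whose mark differs between consecutive sections, and the marks themselves are all assumptions made for illustration; they are not taken from Cogan or from the analysis of "Präludium".

```python
# Each section is marked on every opposition as '+', '-', '0' (neutral) or '+/-' (mixed).
SCORES = {"+": 1, "-": -1, "0": 0, "+/-": 0}  # assumed numeric values


def section_total(marks):
    """Sum the scores of one section's opposition marks."""
    return sum(SCORES[m] for m in marks)


def rate_of_change(previous, current):
    """Count the oppositions whose mark differs between two consecutive sections."""
    return sum(1 for a, b in zip(previous, current) if a != b)


# Hypothetical marks for thirteen unnamed oppositions in two consecutive sections.
section_a = ["-", "-", "0", "+", "-", "0", "-", "+/-", "-", "0", "-", "-", "0"]
section_b = ["-", "0", "0", "+", "+", "0", "-", "+", "-", "0", "+", "-", "0"]

if __name__ == "__main__":
    print("totals:", section_total(section_a), section_total(section_b))
    print("changed oppositions:", rate_of_change(section_a, section_b))
```

                    Plotting one total per section against time, separately for each part, would give curves of the general kind shown in the graphs, although the exact scoring used there may differ.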

 

 

  

 

Figure 13: Tamas Ungvary: "Präludium". The first part of the sonogram/score.

 

 

                                  

 

Figure 14: "Table of Oppositions" from analysis of the first part of Tamas Ungvary's "Präludium".

 

 

 

 

 

 

Figure 15a: Comparison of the graphs of the TOTALS for organ and tape parts.

 

  

 

 

 

Figure 15b: Graphs indicating rate of change of sonic characteristics for organ and tape parts.

 

                    Cogan's qualitative analyses proceed according to four implied considerations: space, language, time and color. The term space is used in the sense of spectral distribution, and the section of "Präludium" under analysis can be seen to exhibit broad spectral tendencies rather than changes of detail.  This tends to invite an `acousmatic'[28] approach to listening as there is no significant referential surface detail.  Although the section shows an overall tendency to increasing density, three separate `field-spanning' gestures are identifiable: a band of frequencies leading from neutral to grave, a band leading from grave to neutral, but expanding in conjunction with the organ to fill the whole spectral register.

 

                    In language terms there is a clear opposition between the constant pitch reference and gradual scalar accumulation of the organ part, and the tape part which has its own harmonicity (its component frequencies maintain a fixed spacing with respect to one another) but which moves with respect to the organ's constant reference pitch.  The changing nature of this interrelationship, always inharmonic yet always sliding towards resolution, is the main characteristic of the opening of the piece.

 

                    The time element seems to counter the tri-partite spectral structure.  There are perceptually two subsections within the continuum of the opening, the break point being the introduction of the organ trill.  Color is not the main concern of the work at this point.  The characteristic spectral shape, which incidentally results from frequency modulation of a relatively strict harmonic series, results in a perceptually relatively constant color.  The changes which occur (e.g. the rapid changing of stops during the first sustained organ note, and the introduction of the trill) give a sense of instability to the otherwise stable block of color.

 

 

Acknowledgments

 

The above research and programming were conducted at EMS (Institute for Electro Acoustic Music) Stockholm, Sweden. The writing of this paper was supported by the Bank of Sweden Tercentenary Foundation and by the National College of Dance, Stockholm.  The authors would like to acknowledge the essential contribution made by Paul Pignon, Per-Olof Strömberg and Mattias Fuchs, without whose help this work could not have been realized.



     [1] Where a sequence of five numbers is included with a sonogram figure, their interpretation is as follows: window size (in samples) / window overlap increment (in samples) / minimum frequency (in cps.) / maximum frequency / threshold (in dB).  Horizontal dividing lines on all sonograms designate octaves.

     [2] Danninger, Helmut: "Daten zur Mechanic, Electronic, Synastesie, Environment und Performance", in Für Augen und Ohren, Akademie-Katalog 127, (Akademie der Künste, Berlin, 1980), pp.224-294.

     [3] Pipping, H.: "Om klangfärgen hos sjungna vokaler" (Helsinki, 1890) in Fant, op.cit. (4).

     [4] Winckel, Fritz: Music, Sound and Sensation (Dover Publ. Inc., 1967) p.160.

     [5] Fant, Gunnar: "Analysis and synthesis of speech processes" in Malmberg (ed.): Manual of Phonetics (North Holland Publ., Amsterdam, 1968)

     [6] Sundberg, Johan: The Science of the Singing Voice (Northern Illinois Univ., 1987) p.90.

     [7] Thies, Wolfgang: "Vorschläge für eine physikalisch-akustische Notation Elektronischer Musik" in Interface, Vol. 16 (1987) pp.247-267.

     [8] Jairazbhoy, Nazir A.: "The `objective' and subjective view in music transcription" in Ethnomusicology Vol.21, no.2, (May 1977) pp.263-273.

     [9] Seeger, Charles: "Prescriptive and descriptive music writing" in Musical Quarterly Vol.44, (1958).

     [10] Cogan, Robert: "New Images of Musical Sound" (Harvard University Press, 1984) and Cogan, Robert: "Stravinsky's Sound: A phonological View" in Sonus vol.2, (Spring 1982) pp.15-20.

[11] Shreffler, Anne Chatoney: "Baroque Flutes and Modern: Sound Spectra and Performance Results", in Galpin Society Journal 36, (Mar. 1983), pp.88-96.

[12] Groupe de Recherches Musicales, ORTF, Paris.

[13] Jairazbhoy, op.cit.

[14] Ungvary, Tamas and Rajka, Peter: "Nuntius" in International Conference on Coordination Method, Dance, Notation and Application. Digest and Papers (Nanjung Institute of Technology, China. 1988) and Ungvary, Tamas, Waters, Simon and Rajka, Peter: "Nuntius: A computer system for the interactive composition and analysis of music and dance," forthcoming in Leonardo (Pergamon, Oxford).

[15] GAZ project performed at Fylkingen, Stockholm, 29 May 1988.

[16] EMS Institute for Electro-Acoustic Music, Söder Mälarstrand 61, S-11725, Stockholm, Sweden.

[17] Wiggen, Knut: "The Electronic Music Studio at Stockholm, its Development and Construction" in Interface, Vol.1, (1972) pp.127-165.

[18] Keane, David: "Some Practical Aesthetic Problems of Electronic Music Composition" in Interface vol.8, (1979) p.203.

[19] Dashow, James: "Spectra as Chords," Computer Music Journal vol.4 no.1, (Spring 1980) p.43.

[20] Wishart, Trevor: On Sonic Art (Imagineering Press, York, 1985).

[21] Schaeffer, Pierre: Traité des objets musicaux (Edn. Seuil, Paris, 1966).

[22] Chion, Michel: Guide des objets sonores (Buchet/Chastel/INA GRM, Paris, 1983).

[23] Smalley, Denis: "Spectro-morphology and Structuring Processes" in Emmerson (ed.) The Language of Electroacoustic Music (Macmillan, London, 1986) pp.61-93.

[24] Cogan, Robert and Escot, Pozzi: Sonic Design:  The Nature of Sound and Music (Prentice Hall, 1976) and Cogan, Robert: "Imaging Sonic Structure" in Proceedings of International Computer Music Conference (Computer Music Association, 1986).

[25] Thoresen, Lasse: "Un Modèle d'Analyse Auditive" in Analyse Musicale, Nov. 1985, p.44.

[26] Smalley, op.cit.

[27] Cogan, op.cit.

[28] Schaeffer, op.cit.