Some Comments on the Visual/Spatial Analogy in Studies of the Perception of Music Texture
John MacKay
The term "texture" has a unique and somewhat ambiguous status in the vocabulary of contemporary music theory and analysis. Unlike such terms as "harmony", "melody", "counterpoint", "rhythm", etc., it still carries a strong metaphorical significance. The metaphor of texture in music, however, is not universally interpreted in the same manner. Some authors draw the analogy with fabric, much as if complex musical sound structures, being composed of many simultaneous contrapuntal lines, were similar in structure to woven textiles.1 Others present descriptions of texture which imply analogies with physical surfaces such as "rough", "jagged", "smooth", "transparent", "opaque", etc.2 The essential common element in most meanings of the term "texture" is the perception of a stable, globally oriented quality of sound structure which usually, but not necessarily, arises in ensemble contexts. The current variance in the interpretation of the term is perhaps not surprising in view of the relatively recent emergence of texture as a dominant musical element in itself. Compositions from the mid-1950s onward, which have created uniquely interesting "massed sonorities" of often up to sixty different simultaneous parts, have clearly removed the phenomenon of musical "texture" from its traditional status as an incidental by-product of harmonic/rhythmic activity and contrapuntal complexity, and have explored it as an element of primary musical importance.
Whatever its specific interpretation in individual works, the visual/spatial analogy has remained a strong influence throughout the relatively recent musical cultivation of massed sonorities. Ligeti's often cited comment on the nature
The function of shaping the form, which was once restricted to individual melodic lines, motifs or chordal shapes, has been handed over in serial music to more complex categories such as Groups, Structures or Textures, and, because of the way these are woven now takes over a very eminent role in the compositional design. It is possible to distinguish various 'aggregate conditions' of the material. One can see most clearly how such conditions articulate the form in compositions where the diverse types of 'weave' are accompanied by considerable differences in timbre and density, and are thus even more clearly differentiated. In Stockhausen's Gruppen for example, the backbone of the form is given by contrasted type - hacked, pulverized, melted, highly condensed and their gradual transformations and mixtures one with the other. ... The two extreme types enjoy exceptionally good mutual permeability: a dense, gelatinous, soft and sensitive material can be penetrated ad libitum by sharp hacked splinters. ... A compositional method that concentrates mainly on conditions of the material inevitably brings with it associations with visual and tactile sensations. Here we have an unambiguous case of that 'pseudomorphosis to painting' described by Adorno in connection with the music of Debussy and Stravinsky.3
In Xenakis too the visual metaphor arises incessantly in his comments on the "ruled surfaces" of Metastasis suggested by the glissandi in the strings and the notion of "clouds of sounds" controlled by stochastic processes.4 Certain passages in the opening chapter of Xenakis' Formalized Music reveal, however, that the underlying metaphor in his complex textures is not always strictly a visual one. The musical images suggested by hail falling on a tin roof, the cicadas in the summer field or the many sounds of voices rising gradually in a massed political demonstration (pg. 9 of Formalized Music) are examples of "textures" which have their origin in purely sonic phenomena.
The difference between the static, visual metaphor with painting and the more temporal (almost cosmic) "multiplicity of events" metaphor can be grasped quite clearly in the differences between many passages of Xenakis and Ligeti. Ligeti is known for highly fused mass sonorities which embody subtle micro-fluctuations and harmonic nuances. In Xenakis, while certain of his denser sonorities take on a static, visual character, many passages of lesser density suggest a heterogeneous multitude of separate events. This particular issue of the fusion of various densities, and of the suggestion of a static, visual field as opposed to a complexly active temporal condition, will be reviewed in detail in the body of the discussion.
To draw up a working perspective for our present purpose, musical texture can be loosely regarded as any integration of distinct sound sources and materials in a musical sound structure. All non-monodic music, therefore, will possess texture in some form or another. The instances of texture which will be of most interest in this discussion are those particularly complex rhythmic, contrapuntal and harmonic passages in which global qualities of sound structure such as "density" and "stratification" are of primary musical interest (as opposed to isolated contrapuntal, rhythmic or harmonic effects). The visual/spatial analogy will be explored in the course of this discussion as a theoretical means for describing and analyzing textural effects in music, many of which bear more than a superficial resemblance to global textural effects in vision. The spatialization of musical images has traditionally involved a translation of the vertical/horizontal space into musical pitch and time. This remains a useful theoretical point in the analogy since perceptual association of musical entities via proximity in pitch and time is similar to the visual association of objects in two-dimensional space. To this can be added the implication of distance in a sonic texture via various dynamic and timbral relations, and also the parallel phenomena of "density" in both visual and sonic fields. These are probably not the only possibilities of analogy for visual and musical phenomena, but they form a useful starting point which is particularly adaptable to a study of complex musical textures.
The most interesting and useful aspect of this study for a musician, however, is the characterization of various well-known global "texture" effects in terms of the micro-structural determinants from which they are composed. In an attempt to lend some insight into this issue, the points of analogy between visual and musical/auditory texture will be discussed in terms of analogous perceptual phenomena which have been observed in cognitive psychology. Given the strong visual orientation discussed so far in the composition of massed sonorities since the 1950s, it seems highly likely that analogies with visual perception can be quite useful in theoretical accounts of the perception of such musical textures. The following discussion will explore three principal aspects of the perception of musical texture, based largely on abstractions from studies in traditional Gestalt psychology and cognitive/perceptual psychology. Each of these areas (figure/ground relations, grouping strategies and density discrimination) will be reviewed separately with respect to its possible significance in relating the perception of visual/spatial textures to the perception of textures which can be created in the auditory/temporal dimension of music. This will be followed by a type of synthesis and outline of implications for further musical/theoretical exploration.
Figure/Ground Relations
The figure/ground relationship has many readily apparent counterparts in traditionally conceived musical texture, such as melody/accompaniment or the prominence of the upper line in chordal succession. Figure/ground relations are, however, highly complex and momentary in the modern massed sonorities. Howard Gardner, in an article entitled "On Figure and Texture in Esthetic Perception", presents a valuable explanation of the application of the figure/ground principle which can easily be adapted to the contemporary musical context.
This discussion of figure has emphasized the characteristics of that portion of the display which is perceived as dominant and has relegated the ground to secondary status. In many instances, however, there is no portion of a display which is dominant in a figural sense and yet perceptual experience does take place. Commonplace examples would include the perception of wall paper, a rug or a field in which there are no dominating figural aspects and yet a rich and differentiated mesh of cues which can be discerned. Much perceptual experience which is not dominated by a figural aspect has this textural property, the notable exception being the Ganzfeld of the laboratory in which the subject is placed into an unpatterned white environment and has difficulty perceiving anything at all. Thick fogs, mists, heavy clouds or muddy waters probably come closest to the Ganzfeld in the lives of those outside the experimental laboratory.5
For Gardner, textural perception involves a lack of clear figure/ground distinction and as such constitutes a less focused perception than that which is involved in figure/ground distinctions. The textural percepts associated with rugs, wallpaper, etc., are quite compatible with what is generally sensed in the massed sonorities of Ligeti or Xenakis, since in each case one's attention is spread over a single dense mass or sound structure which is composed of a regular or irregular complexity of sub-particles. One possible instance of the Ganzfeld condition in audition can be seen in the phenomenon of white noise, in which there is no dynamic differentiation over the completely saturated frequency range. The effect is analogous to that of very thick fogs in which one loses all perception of auditory depth and object differentiation.
Given any stable spectral condition, a momentary inconsistency in the prevailing sound structure may constitute an event which stands out much as figure against ground. Figure/ground distinctions can be imagined statically in terms of different concentrations of energy over the full spectrum, but as will be discussed shortly, it is the "event" effect rather than a strictly spectral differentiation which gives rise to a musical figure/ground relation. The following diagram is a conceptual representation of temporal and imaginable atemporal figure/ground relations.

Static, atemporal figure/ground relationships in music are a rarity if not a complete impossibility. Any unchanging spectral structure will tend to fuse into a single percept (such as a chord or timbral quality) even with extreme spectral and pitch differentiations between possible figure and ground components. Taking the example of the spectral figure/ground relation above, it can be imagined that the higher frequency concentration of energy could be made to take on a genuinely figural relation to the lower frequency energy if it were made to pulsate, to oscillate slightly in frequency, or in some other way to "behave" differently over time from its lower frequency counterpart.
The creation of interesting textures is therefore a matter of both interesting spectral differentiations and interesting temporal construals of these differentiations. Robert Erickson summarizes this perspective very clearly and in the following comment points out the very prevalent spatial analogue (i.e., implied real acoustic space) for many dynamic spectral differentiations.
In any texture some elements stand out and others recede. Completely featureless textures are rare. Louder, higher, contrasting elements attract attention and we hear them in the foreground; the spatial terminology is quite appropriate - music without foreground and background elements is difficult to find - but there may be more than one kind of space involved. Foreground elements have the character of figure in relation to ground; they are more distinct, more formed, and 'closer' than ground which is less distinct, less formed, and 'farther away'. For the terms 'closer' and 'farther away' one could substitute words such as 'louder/softer' or 'duller/brighter', but the experience is undeniably spatial, and not entirely different from the visual space of everyday life.6
The figural events in massed sonorities can have a conceivably wide variety of durations from subtle, instantaneous penetration of the prevailing texture to extended gestures, as long as the single event percept is maintained. In this way, many varieties of textural hierarchy and superposition can be imagined. Generally speaking, however, figural events in the "classical" massed sonorities (i.e., of Ligeti and Xenakis) have been relatively minimal in duration - short protrusions directly from the micro-structure or inconsistencies in the prevalent granularity of the texture. In this way they have the effect of being integrated into the fundamental textural granularity rather than being longer, more conspicuously superimposed figural events.
The issue of perceived granularity7 and hierarchy will be a recurrent one in this discussion. It can be imagined, for example, that if certain textures with sufficiently articulated micro-structure (such as the openings of Xenakis' Pithoprakta or Eonta) were gradually slowed down, the perception of a unified texture would gradually give way to that of a succession of separate events and short patches of notes. Some insight into possible limits of textural perception may be gained from certain observations which Paul Fraisse records in The Psychology of Time. Based on an extremely wide variety of experiments which in various ways involved immediate reaction to an auditory stimulus, Fraisse posits a 150-200 ms interval for listeners to become aware of a sound, and then another 400-500 ms to identify the sound (appropriately for the various experimental response contexts), after which attention sharply declines.8 The short figural events of massed sonorities involve, for the most part, a simple awareness of the presence of a protrusive element rather than a substantial identification. Even in the sparsest imaginable passages, a sense of "granularity" and textural cohesion is maintained if individual elements are short and completely self contained.
In both rarefied textures and massed, fluctuating sonorities, much of the static, objective textural impression is related to the listener's inability or lack of desire to tangibly anticipate the succession of events, because they are either too sparse and rhythmically unpredictable or too densely packed one next to the other. Rather than focus on the linear aspect of succession, the listener's attention will shift to the global static impression, viewing a musical texture as the result of simultaneously active though not necessarily coincident micro-structural elements. Other factors which give rise to various aspects of the textural percept will be reviewed in the following section, but it is worth noting a valuable characterization of the typically complex hierarchical relationships in a massed sonority which Bruce Reiprich has presented in his analysis of Ligeti's Lontano.
The perceptual effect of Lontano is dependent upon elements that, at times, contribute more to foreground than to background features of the mass. Crescendi are attached to individual pitches and groups of pitches within canonic lines, resulting in a profusion of non-coincident dynamic activity. And combined with the range and timbral characteristics of each orchestral participant, these phenomena cause particular instruments to pierce through the overall texture and in so doing project pitches that can overshadow emphases such as those illustrated by pitch-count predominance. Ultimately, mass complexity in Lontano does not entirely consume the contrapuntal movement nor does the mass ever become totally homogeneous. Instead, interaction of canonic activity perceived at a number of levels produces a contextual hierarchy within the mass. The structural integrity of the whole is perceivable in terms of the activity of continuously shifting concentration between background and foreground: that is between a complex totality and locally projected pitches.9
The important element which Reiprich brings out is the shifting complexity of foreground or figural events which prevents the music from being heard as a succession of totally fused textures. Attention is therefore divided between the granular details and the global presence of the sound structure. Much of the spatial "lontano" impression of the work is projected by the extremely fine canonic granularity and subdued timbral consistency, which create the effect of a dense complexity perceived at a distance.
Phenomena of Grouping and Continuity
The four traditional grouping strategies of Gestalt psychology which have been most easily adapted to audition and to the study of musical texture are those of proximity, similarity, continuity and "common fate".10 Proximity groupings are conceived to apply both in successive temporal associations and in pitch distances between individual events, as in the following conceptual illustration.

Similarity grouping has been applied most commonly in terms of the accentuation of individual pitches, but can also be taken to apply to other dimensions of the complex musical stimulus such as articulation, timbral characteristics and also even in static pitch relations where certain intervals (such as the octave and fifth) may create strong harmonic grouping relationships in close succession. The following can be taken as a conceptual illustration of similarity grouping:

"Common fate" and continuity groupings in musical contexts are somewhat more complex than those of similarity and proximity in that they involve simultaneously both temporal and acoustic dimensions. Elements grouped by continuity relations will all lie on a single trajectory or melodic contour. "Common fate" groupings are made on the basis that disparate elements which seem to be undergoing the same processes are probably part of a larger organization. One possible illustration of the principle of common fate which has already been touched upon is that if two materials are simultaneously undergoing the condition of total stability, they will tend to merge into a single complex percept. Other examples of common fate grouping can be imagined in cases where complex heterogeneous sound masses are subjected in their totality to large scale dynamic movements or to shifts in register, thus creating the impression of a more coherent and perhaps even more homogeneously perceived sound mass.11
The activation or non-activation of the above grouping strategies of proximity, similarity, common fate and continuity in complex musical passages has a direct relevance to the limitations of textural perception, since textural complexity is a reflection of the listener's inability to integrate a sound structure into a straightforward linear parsing of its constituent elements. Perhaps the most influential experimental paradigm to have arisen in relation to these perceptual abilities or tendencies is that of auditory stream segregation, developed principally by Albert Bregman with several different colleagues over the past ten years. Essentially this work is an expansion of the "trill threshold" experimentation of Miller and Heise (1950), in which subjects were asked to judge whether they perceived two alternating pitches as a trill (a "single stream" in Bregman's terminology) or as separate alternating entities. Melodic interval and rate of alternation were found to be inversely related in the trill or melodic line percept. Greater melodic intervals between tones required a slower rate of alternation in order to preserve the single line percept, and conversely faster rates of alternation required smaller melodic intervals to maintain the sense of a trill or single line.
Other dimensions of similarity have been tested within the streaming paradigm, such as timbral differences and different dynamic levels between successive pitches, with findings too numerous even to summarize here.12 The most accessible and reliably interpretable finding of this literature, however, is the postulation of trajectory margins for perceived melodic continuity. The following graphic representation of temporal/pitch interval limitation for melodic (i.e., "trill" in Miller and Heise's terminology) trajectories is excerpted from Bregman and McAdams' article "Hearing Musical Streams" but is based essentially on material compiled in Leon van Noorden's dissertation (van Noorden, 1975).

(reproduced with permission from author and publisher)
The graph illustrates that in contexts of alternating tones such as the following:

tones A and B can stay within the melodic trill percept as long as their pitch and time difference falls within the ambiguous region.13 Pitch/time intervals which fall within the "always segregated" region will presumably never form a trilled percept. This implies, for example, that tones at an interval of five semitones (a perfect fourth) and at a rate of 10 per second will be "stream segregated", or perceived independently without any melodic connection. Pitches at the same intervallic distance but at rates of less than 5 tones per second may form a type of melodic stream. The melodic continuity within the ambiguous region is not a guaranteed effect, since other factors of similarity between pitches such as timbre, dynamic level or articulation may serve to link or segregate the pitches in the pattern.
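The inverse relation between melodic interval and rate of alternation can be made concrete with a small sketch. The function below is a toy only: it assumes that the "always segregated" boundary can be approximated by a constant product of interval and rate, and the constant (25 semitone-tones per second) is chosen solely so that the two cases mentioned above come out as described. Neither the function name, the product rule nor the constant comes from van Noorden's measurements, and the actual boundary in the graph need not have this shape.

```python
def alternation_percept(interval_semitones, tones_per_second, threshold=25.0):
    """Classify an alternating two-tone pattern (toy model, not measured data).

    Assumption: the pattern is 'always segregated' once the product of
    interval and rate reaches `threshold`; otherwise it lies in the
    'ambiguous' region, where a single melodic stream remains possible.
    """
    if interval_semitones < 0 or tones_per_second <= 0:
        raise ValueError("interval must be >= 0 and rate must be > 0")
    if interval_semitones * tones_per_second >= threshold:
        return "always segregated"
    return "ambiguous"


# The two cases discussed in the text: a perfect fourth at 10 tones per
# second, and the same interval at a rate below 5 tones per second.
fast_fourth = alternation_percept(5, 10)
slow_fourth = alternation_percept(5, 4)
```

Under this assumption, widening the interval lowers the rate at which segregation sets in, which is the qualitative behaviour reported for the trill threshold. Factors such as timbre, dynamic level and articulation are deliberately left out.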
The implications of these findings for the study of musical texture are quite far-reaching. Firstly, the stream segregation research quantifies the emergence of the basic textural percept (i.e., the presence of disparate entities in a single sound structure) from the melodic percept. Complex textures can be implied in single instrumental lines if played fast enough and with enough factors of differentiation between the pitches. Moving in the other direction, that is, starting with a saturated field of isolated pitch events instead of an individual line, it can be foreseen that events within the field can form separate streams or channels if they fall consistently within the pitch/time margins and the various similarity criteria for continuity. In relation to the streaming phenomena, therefore, the textural complexity of a complex field of events falls within a continuum between two poles: either stratified, involving streaming relations, or, at the other extreme, totally fused, in which case there is an unresolvable complexity of overlapping streaming relations. The two types and the relevant nature of the streaming relations can be represented in the following way:

It still remains an open and very challenging question as to how segregated streams are perceived in relation to each other and how many streams can be apprehended simultaneously. In the most fundamental condition of two alternating stream segregated tones, the listener may hear one as a foreground element and the other as background. Another possibility, although it takes a certain degree of effort, is to stretch one's attention and hear the two streams simultaneously.14 The same holds for three alternating stream segregated pitches with the upper and/or lower pitches often forming a foreground or figural

With four or more pitches, listeners have difficulty remembering the individual elements of the pitch collection and so have difficulty in stretching their attention to hear four or five simultaneous streams. In addition, if there are too many pitches, the sequence will occupy too long a slice of time to be perceived as a continuous textural phenomenon (i.e., there will be too much time between recurrent pitches in the same stream for it to be perceived as a continuous textural stratum). With more than three pitches in the sequence, the foreground protrusion of the lowest and highest pitches becomes more pronounced, causing a lack of distinction in the interior pitches of the sequence.
One possible means of exploring the nature of perceptual attention in this context might be to see how soon very subtle changes of intonation will be noticed on different pitches of the complex. It can be imagined that changes in the texturally less prominent pitches will be less readily detected or have to be more substantial in order to be noticed. This would be quite analogous to the string quartet situation where one can get away with slightly less accurate intonation in an inner voice than in the outer voices, and especially when there are more than three voices playing at once.
In closing this section of the discussion, one further grouping phenomenon can be mentioned briefly as a possibility which could be elaborated upon in another study. It can be observed that in actual musical contexts, in the presence of a very strong figural element, various background elements which by themselves would have been stream-segregated tend to merge into a single percept relative to the foreground element. Take the example of a typical complex massed sonority in which there is a fine granularity from which certain micro-structural events are made to protrude. If, as can be imagined from the opening Ligeti quotation (see pg. 40), loud incisive events were imposed upon this texture, it is conceivable that the original, subtly differentiated texture would become more uniformly perceived as a background to the new figural elements. In this way, many groupings may be possible which would be extremely unlikely under the basic principles of proximity and similarity. This, however, would appear to be a significantly different aspect of attention from the primary grouping strategies. Where temporal/acoustic principles illustrate fundamental tendencies of grouping in fairly uniform and abstracted contexts, the interplay of these factors with imposed figure/ground relations brings into play the more volitional and esthetic elements of perception which ultimately make up both real world and musical experience.
Density Discrimination
Perceptual phenomena of density discrimination would appear to act inversely to those of grouping and stream segregation within a texture. Where the grouping phenomena will constitute perceptual analyses of complex fields into a number of different lines, the qualities of density discussed here will assume the presence of essentially fused massed sonorities and their purely global percepts.
Bela Julesz has made numerous contributions in a particular area of visual perception dealing with the ability to discriminate two-dimensional visual "textures" or fields of dots. The framework within which Julesz has worked characterizes visual textures in terms of what he calls first and second order statistics. Basically, textures with the same first order statistics are those which have the same number of dots over the same area, and so have the same average brightness or darkness, as in the following of Julesz's textural pairs.

(Example figure reproduced with permission from author and publisher.)
Note that this textural pair is easily discriminable because of the difference in what Julesz appropriately calls "granularity" which in this case is the isolated concentration of dots which often form larger dots in the left textural field. Granularity relates essentially to second order statistics involving the consistency of spacing between dots. Randi Martin and James Pomerantz (1978) in a review of Julesz (1975) have presented a very clear explanation of the implication of first and second order similarities which is worth reproducing here.
First-order statistics (OS) which correspond phenomenally to the average brightness of a texture, are measured by the probability that a point thrown randomly on a texture would land on a black dot. ... Second OS (order statistics) which correspond phenomenally to granularity or uniformity of a texture, are measured by determining the probability that pins (or dipoles) of all possible lengths and orientations thrown on a texture would land with both ends on black dots. Consider two textures that have the same density of black dots. If on the first texture, no two dots are allowed to fall closer than some predetermined distance while, on the second, there is no such constraint, the two textures would differ in their second but not their first OS. The average brightness of the two textures would be the same but the unconstrained texture would be more 'clumpy'. Third OS are not easy to describe, but are measured in a manner similar to second OS: they refer to the probability that randomly thrown triangles of a fixed size, shape, and orientation would land with all three vertices on black. Two textures of the same second OS may differ in third OS.15
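The distinction drawn in this passage can be illustrated with a minimal sketch. It builds two dot textures on a grid with the same number of dots, one unconstrained and one in which no two dots may fall closer than a fixed distance, and then measures first-order and short-range second-order statistics by exhaustive counting rather than by randomly thrown points and pins. The grid size, dot count, distance measure and all function names are assumptions made for illustration; they are not taken from Julesz or from Martin and Pomerantz.

```python
import random


def make_texture(size, n_dots, min_dist, rng):
    """Place n_dots black cells on a size x size grid.

    Assumption: distance is measured as the larger of the horizontal and
    vertical offsets; min_dist=0 means no spacing constraint at all.
    """
    dots = set()
    attempts = 0
    while len(dots) < n_dots:
        attempts += 1
        if attempts > 100_000:
            raise RuntimeError("could not place all dots; lower n_dots or min_dist")
        x, y = rng.randrange(size), rng.randrange(size)
        if (x, y) in dots:
            continue
        if all(max(abs(x - a), abs(y - b)) >= min_dist for a, b in dots):
            dots.add((x, y))
    return dots


def first_order(dots, size):
    """Probability that a randomly chosen cell is black (average darkness)."""
    return len(dots) / (size * size)


def dipole_probability(dots, size, dx, dy):
    """Probability that a pin with offset (dx, dy) has both ends on black."""
    placements = (size - abs(dx)) * (size - abs(dy))
    hits = sum(1 for (x, y) in dots if (x + dx, y + dy) in dots)
    return hits / placements


def short_range_second_order(dots, size):
    """Average dipole probability over the four shortest pin orientations."""
    offsets = [(1, 0), (0, 1), (1, 1), (1, -1)]
    total = sum(dipole_probability(dots, size, dx, dy) for dx, dy in offsets)
    return total / len(offsets)


rng = random.Random(0)
SIZE, N_DOTS = 40, 160
unconstrained = make_texture(SIZE, N_DOTS, 0, rng)  # free to clump
spaced = make_texture(SIZE, N_DOTS, 2, rng)         # no adjacent dots
```

Both textures share the same first-order statistic by construction, while the shortest dipoles can never land on two dots of the spaced texture. Only the unconstrained texture can therefore show the "clumpy" quality described in the quotation. A fuller treatment would examine pins of all lengths and orientations, as the quotation specifies.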
Julesz contends that simple dot textures of similar dipole (second order) statistics are indiscriminable. Certain qualifications of this assertion have been suggested by Pomerantz and Martin and by Julesz himself, to the effect that textures made up of micro-patterns (tiny arrangements of dots) are sometimes discriminable depending on the orientation or the shape of the micro-patterns. This would be the case in the following textural pairs, in which a patch of one micro-pattern is embedded within a field of another.

(Example figure reproduced with permission from author and publisher.)
The issue of micro-pattern orientation as an agent of textural density discrimination has led Julesz to posit two modes of textural perception in vision, one dealing with purely statistical, density factors and another which recognizes perceptual "quarks" or fundamentally simple forms for which there may be some innate detection mechanism in the visual system.16 This appears to be an entirely logical resolution of the problem as brought up by Martin and Pomerantz (1978) since the perception of global features such as density would necessarily assume a uniformity of the micro-structure of the display.
Much of the difficulty of the whole line of reasoning which Julesz and Pomerantz17 have followed (i.e., in bringing micro-structural characteristics into the issue of textural discrimination) is that the very existence of the textural phenomenon in vision seems often to be taken for granted. There are no established boundary conditions as to which arrays of dots are too sparse, or conversely too dense, to yield the visual textural percept, or as to how far away one should be from a display or how unfocused one's glance should be before the perception of a field of constituent gestalt elements will give rise to the textural percept. James Pomerantz offers some insight into the problem in the following example, excerpted from a recent article entitled "Perceptual Organization and Information Processing". The smaller micro-structure, larger number of micro-structural constituents, and denser positioning of constituents allow relatively easy isolation of the odd factor of each square in panel A. The larger micro-structure (if it can be called such), wider spacing, and smaller number of constituents create unique gestalt outlines for the arrays of panel B and make the task of choosing the odd factor in each square more difficult. Panel A decisions clearly involve effects of textural regularity, whereas in panel B it is the individual elements and their outline which influence the decision.

The interaction of Gestalt and textural perception is an equally critical issue in the perception of musical texture, but it can be seen that certain of the figure/ground and grouping phenomena discussed so far can be used to sketch boundary conditions for the perception of musical texture perhaps more clearly than what has been done for visual textures. The minimal temporal granularity condition for massed sonority effects (see pg. 44) is an interesting reflection of the necessary limitation on the size of the micro-structure implied in Pomerantz's visual example, and similarly, the auditory stream segregation findings give a fairly clear idea of the necessary temporal density and linear discontinuity between micro-structure constituents for the complex textural percept to arise.
It is not difficult to imagine possible auditory/musical analogues to Julesz's micro-pattern textures. If the isolated staccato is intuitively analogous to the simple dot element, such figures as [musical example], [musical example] or [musical example] (simultaneous attacks) seem sonically equivalent to Julesz's [figure] or [figure] micro-patterns. These types of figures, in both their brevity and simplicity of shape, may also be capable of creating micro-structural "gestalt" relations without completely detracting from a global, statistical awareness of texture, as in the Julesz examples cited earlier. The extent to which this effect arises will depend largely upon the density of the individual texture, which determines whether the micro-structural elements can be perceived individually within it.
A notable parallel to Julesz's work in visual texture can be seen in the composition and theory of Xenakis. Like Julesz's simple dot textures, the textures which Xenakis has composed at the beginning of Pithoprakta or Eonta consist of densities of staccato pitches. Working with such minute micro-structural elements, or "grains" as he refers to them, Xenakis creates complex textures using stochastic Markov chain processes very similar to those of Julesz. In generating many of his arrays, Julesz divides the available space of the texture into small cells (like the minute cells on a video screen) and determines the content of each cell (either black, white or in some cases intermediate shades of gray) successively, in a linear process, according to the probability values which will give rise to the desired overall darkness and granularity. Xenakis' adaptation of the same approach is to divide musical time into slices or "screens" (a given passage would be considered a "book" of temporal slices) which contain granular values for both pitch and dynamic level, as in the following excerpted example from Formalized Music.
The overall density arises from the probability determinations in each temporal slice of which register and at what dynamic level the individual grains will occur.
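The two generating procedures described above can be illustrated with a minimal sketch. It is not a reconstruction of either Julesz's or Xenakis' actual procedures: the function names, the two-state chain governed by a hypothetical `persistence` parameter, and the independent drawing of each screen are all assumptions made purely for illustration.

```python
import random


def markov_texture(rows, cols, p_black, persistence, seed=0):
    """Fill a rows x cols grid of cells one after another, in reading order.

    Hypothetical two-state Markov chain: each cell repeats the previous
    cell with probability `persistence`; otherwise it is drawn afresh and
    is black (1) with probability `p_black`, else white (0).
    """
    rng = random.Random(seed)
    state = 1 if rng.random() < p_black else 0
    cells = []
    for _ in range(rows * cols):
        if rng.random() >= persistence:
            state = 1 if rng.random() < p_black else 0
        cells.append(state)
    return [cells[r * cols:(r + 1) * cols] for r in range(rows)]


def grain_book(n_screens, grains_per_screen, register_weights,
               dynamic_weights, seed=0):
    """Return a "book" of screens (temporal slices).

    Each screen is a grid of grain counts indexed by [register][dynamic].
    Screens are drawn independently here, a simplifying assumption.
    """
    rng = random.Random(seed)
    registers = range(len(register_weights))
    dynamics = range(len(dynamic_weights))
    book = []
    for _ in range(n_screens):
        screen = [[0] * len(dynamic_weights) for _ in registers]
        # Decide, grain by grain, which register and dynamic level it occupies.
        regs = rng.choices(registers, weights=register_weights,
                           k=grains_per_screen)
        dyns = rng.choices(dynamics, weights=dynamic_weights,
                           k=grains_per_screen)
        for r, d in zip(regs, dyns):
            screen[r][d] += 1
        book.append(screen)
    return book


texture = markov_texture(rows=8, cols=16, p_black=0.3, persistence=0.5)
book = grain_book(n_screens=4, grains_per_screen=12,
                  register_weights=[1, 3, 1], dynamic_weights=[2, 1])
```

In this sketch `p_black` governs the overall darkness of the array and `persistence` its granularity (the typical length of runs of like cells), while the weights passed to `grain_book` play the role of the probability determinations for register and dynamic level within each temporal slice.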

This comparison of Xenakis' and Julesz's statistical generating processes underlines the fundamental issues in the analogy of visual texture with musical texture. The horizontal/vertical density of visual textures can be translated into the density of musical textures in the dimensions of pitch and time.
Among the many interesting questions which arise in Formalized Music is Xenakis' conjecture on possible ratio thresholds for the perceptual discrimination of textural density in his music. This notion was developed in the context of his experimental work Diamorphoses20 in which a logarithmic density scale to a base between two and three is proposed for equal steps in perceived density. This would mean that perceptually equal steps in density correspond to density ratios of between 2:1 and 3:1. In this instance it is necessary to specify, beyond the information Xenakis has given in Formalized Music, what is involved in a quantitative value of a density. Numerous interpretations would seem possible and of equally promising theoretical interest. For example, quantitative density could be viewed in terms of a frequency/time slice (e.g., 30 Hz - 10,000 Hz over 0.5 second) and the number of micro-structural grains within the slice. Density could also involve a third dimension, with dynamic differentiation of the individual grains. In this way, a passage with a wider range in the dynamic level of its micro-structure would be texturally less dense than a passage in which all micro-structural elements were at the same dynamic level. Both this three-dimensional interpretation and the preceding two-dimensional interpretation of musical/auditory density involve the fundamental intuition of imagining a given amount of material spread out over a given space, either in pitch and time or possibly in pitch, time and dynamic range. This is very similar to the apprehension of density in the physical, visual world.
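The arithmetic of such a logarithmic scale can be made concrete with a short sketch. It adopts the two-dimensional interpretation above, and several choices in it are assumptions rather than anything specified in Formalized Music: the function names, the base of 2.5 (an arbitrary value between two and three), and the measurement of the pitch extent of a slice in octaves.

```python
import math


def slice_density(n_grains, f_low_hz, f_high_hz, duration_s):
    """Grains per octave per second within one frequency/time slice.

    Measuring the pitch extent in octaves is an assumption of this sketch.
    """
    octaves = math.log2(f_high_hz / f_low_hz)
    return n_grains / (octaves * duration_s)


def density_level(density, base=2.5, reference=1.0):
    """Number of equal perceptual steps between `reference` and `density`
    on a logarithmic scale with the given base."""
    return math.log(density / reference) / math.log(base)


def equal_step_densities(start, base, steps):
    """Densities that lie one perceptual step apart, beginning at `start`."""
    return [start * base ** k for k in range(steps + 1)]


# 40 grains spread over 30 Hz - 10,000 Hz during 0.5 second.
d = slice_density(n_grains=40, f_low_hz=30.0, f_high_hz=10_000.0,
                  duration_s=0.5)
ladder = equal_step_densities(d, base=2.5, steps=3)
levels = [density_level(x, base=2.5, reference=d) for x in ladder]
```

Each successive entry in `ladder` is 2.5 times denser than the one before it, and the corresponding entries in `levels` advance by one step each. On this reading, a texture would have to be made two to three times denser to register as one step denser.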
Very generally, the idea of characterizing readily noticeable two-dimensional changes (i.e., with respect to the pitch and time dimensions) in auditory density can be rephrased in Julesz's paradigm in terms of how large a difference in second-order statistics will lead to discriminable textural pairs. This constitutes an interesting and still relatively unexplored empirical question for research in both visual and auditory/musical texture.
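As a rough illustration of what such a comparison might involve, the sketch below estimates one simple second-order ("dipole") statistic for a binary grid: the proportion of cell pairs at a fixed displacement which are both occupied. Reading the grid as pitch registers (rows) against time slices (columns) is an assumption of the sketch, as are the function names and the use of the largest difference over a few displacements as a measure of dissimilarity.

```python
def dipole_statistic(grid, dr, dc):
    """Fraction of in-bounds cell pairs separated by (dr, dc) in which
    both cells are occupied (value 1)."""
    rows, cols = len(grid), len(grid[0])
    hits = pairs = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs += 1
                hits += grid[r][c] * grid[r2][c2]
    return hits / pairs if pairs else 0.0


def second_order_difference(a, b, max_offset=2):
    """Largest absolute difference in dipole statistics between two grids
    over all displacements up to `max_offset` cells."""
    worst = 0.0
    for dr in range(0, max_offset + 1):
        for dc in range(-max_offset, max_offset + 1):
            if dr == 0 and dc <= 0:
                continue  # skip the null displacement and mirror images
            diff = abs(dipole_statistic(a, dr, dc) - dipole_statistic(b, dr, dc))
            worst = max(worst, diff)
    return worst


# Two 4 x 4 arrays with the same overall density (half the cells occupied)
# but different arrangements of the occupied cells.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
stripes = [[c % 2 for c in range(4)] for r in range(4)]
difference = second_order_difference(checker, stripes)
```

The two arrays have identical first-order density yet differ markedly in their dipole statistics. The empirical question raised above is how small such a difference can become before a pair of textures, visual or auditory, ceases to be discriminable.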
Summary and Conclusion
Rather than presenting detailed accounts of analogous visual and auditory textural phenomena, the present discussion has explored a possible framework for the conception and description of auditory textural effects in visual terms. For the most part this paper has served as a musical/theoretical orientation of existing knowledge. While the visual/spatial metaphor of musical texture remains a very rich possibility for the characterization of various musical effects, it can be mentioned here that in order for such a perspective to make substantial predictions, it must be supplemented with an understanding of the acoustic auditory space in which music exists. Such phenomena as the spectral clarity of the higher register as opposed to the lower register, the role of pitch onset in differentiating materials, the influence of global dynamic level and the resonance of the performing space are all crucial factors in determining qualities of density and textural fusion. Even an introductory discussion of these factors, however, would require another paper at least equal in length to this one.
Very generally, the three issues of figure/ground relations, grouping effects and density discrimination were all seen to be at the root of many of the perceptual effects involved in massed sonorities, which have been the principal musical object in this discussion. Figure/ground relations are relevant to the present purposes in characterizing the acoustic and temporal conditions which will cause events to stand out and create fluctuant hierarchical relations within a complex musical texture. The grouping strategies, particularly those involving the phenomenon of auditory stream segregation with respect to pitch and tempo, were reviewed here as a possible means of describing when stratified hierarchical percepts arise within sequential single melodic line conditions or within densely saturated fields of events. The issue of density discrimination involved the relatively uncharted area of global auditory awareness of the vertical and horizontal (temporal) dimensions of a massed sonority, an issue on which there has been considerable research in visual perception. Some speculation was offered regarding possible approaches to the quantification of textural density in music. In analogy with the statistically controlled paradigms involved in the discrimination of visual texture, this would involve clearly articulated micro-structural design and a variable acoustic/temporal space for the diffusion of the micro-patterns, similar to that involved in the classical massed sonorities of Xenakis and Ligeti.
Perhaps the most important factor reviewed here is the nature of the temporal construction of complex musical textures. Clearly any further advances in the cognitive scientific understanding of the interrelation of rates of succession and tendencies of perceptual organization will be of considerable value in any future musical/theoretical conjecture on the perception of texture. The following points are presented here as possible directions for continued empirical study within the perspective developed in this discussion.
Boundary conditions on micro-structural elements in a texture: Given the necessarily short duration for micro-structural elements in "classical" massed sonorities, the question arises as to what length of micro-structural element or duration is too long to form the percept of a massed sonority. Conceivably, factors such as the dynamic level and articulation of the micro-structure or the reverberation in the acoustic space are influential in borderline cases between sequential Gestalt and textural perception.
The relationship between "streams" within a texture: The question of how many disparate elements can be sorted and simultaneously apprehended in a musical texture is worthy of further exploration for its interest in predicting the perceived complexity of a stratified textural sound structure. This is a particularly challenging question since it will involve the invention of means for testing selective attention in a complex texture.
Density discrimination in musical texture: In pursuing the perspectives on textural density initiated by Julesz and Xenakis in their respective fields, further research can be done to elucidate conditions for density discrimination in complex auditory fields, i.e., possible density ratio minimums for discrimination of textures. Similar work remains to be done in defining the boundary conditions under which micro-structural features become a factor in discriminating complex textures.
Bibliography
Erickson, R. E. The Sound Structure of Music. Berkeley: UC Press, 1975.
Fraisse, Paul. The Psychology of Time. Trans. Jenifer Leith. London: Eyre and Spottiswoode, 1964.
Gardner, Howard. "On Figure and Texture in Aesthetic Perception," British Journal of Aesthetics, 12/1 (1972), 40-59.
Julesz, B. "Figure and Ground Perception in Briefly Presented Isodipole Textures," in Perceptual Organization. Kubovy and Pomerantz ed. New York: Lawrence Erlbaum Associates, Inc., 1981, 27-54.
Julesz, B. "Experiments in the Visual Perception of Texture," Scientific American, April 1975, 34-43.
Ligeti, Gyorgy. "Metamorphoses of Musical Form," die Reihe VII (1965), 5-19.
Martin, R. C. and Pomerantz, J. R. "Visual Discrimination of Texture," Perception and Psychophysics, 24/5 (1978), 420-428.
McAdams, Stephen. "Spectral Fusion and the Creation of Auditory Images," unpublished paper, Stanford University, 1981.
McAdams, Stephen and Bregman, Albert. "Hearing Musical Streams," Computer Music Journal, 3/4 (1979), 26-41.
Miller, G. A. and Heise, G. A. "The Trill Threshold," Journal of the Acoustical Society of America, 22/5 (1950), 637-638.
van Noorden, L.P.A.S. Temporal Coherence in the Perception of Tone Sequences. Ph.D. dissertation, Technische Hogeschool Eindhoven, The Netherlands. Published by the Institute for Perception Research, Eindhoven, The Netherlands.
Pomerantz, James R. "Perceptual Organization and Information Processing," in Perceptual Organization. Kubovy and Pomerantz ed. New York: Lawrence Erlbaum Associates, Inc., 1981, 141-180.
Reiprich, Bruce. "Transformation and Color in Gyorgy Ligeti's Lontano," Perspectives of New Music, 16/2 (1977), 167-180.
Xenakis, I. Formalized Music. Bloomington: Indiana University Press, 1975.
1 See, for example, Malcolm Goldstein's characterization in the Vinton Dictionary entry on "Texture" as "the characteristic disposition or connection of threads in a woven fabric."
2 See Erickson (1975) pg. 149 for an interesting vocabulary of visual textural characterization of musical texture.
3 Ligeti, 1965, pg. 14 and 15.
4 Xenakis, 1975, pg. 12.
5 Gardner, 1972, pg 43.
6 Erickson, 1975, 139.
7 Granularity is a term which arises constantly in any discussion of musical or visual texture. A good definition of what is generally understood by "granularity" in both circles would be the consistency both in size and frequency of occurrence of the micro-structure constituents.
8 See Fraisse, 1964, pg. 121-124.
9 Reiprich, 1977, pg. 177 and 179.
10 For a good discussion of these principles in the auditory dimension, see Albert Bregman
11 For an example of common fate groupings of spectrally disparate tones, see McAdams, 1981.
12 For an extremely valuable review of the entire stream segregation literature, see Stephen McAdams and Albert Bregman, 1979.
13 The "fission boundary" in van Noorden's work refers to that melodic and temporal interval of alternation at which two pitches can not possibly be heard as belonging to separate streams. The "ambiguous region" implies that subjects could hear the alternating tone sequences as either trills or separate streams depending on any predisposition they might have, and any melodic/temporal interval in the "always segregated" region can not be heard as a melodic line.
14 The following points are based on some informal experimentation done by the author and others in the context of a seminar given by Robert Erickson at UCSD. The essentials can easily be replicated in any electronic music studio using a standard sequencer.
15 Martin and Pomerantz, 1978, pg. 20.
16 See Julesz, 1981, pg. 38-42.
17 See Pomerantz, 1981.
18 See Pomerantz (1981) pg. 178, figure 6.17.
19 Generally speaking, our purpose here has been not so much to probe the limitations of the visual/spatial analogy as to examine how such an analogy may be asserted in terms of perceptual phenomena. The relatively simple equation made here of the horizontal/vertical density of two-dimensional visual textures with that of the musical dimensions of pitch and time is only a rudimentary first step in the characterization of musical texture. As will be discussed in the summary and conclusion, many other acoustic factors must be considered to fully describe the perceptual phenomena in this area.
20 See Xenakis, 1975, pg. 51.