Thursday, 4 June 2026

Visual Grammar and the View from Above 3. System or Catalogue?

Once explanation is oriented from below, a further consequence tends to follow almost automatically: the proliferation of categories.

As visual analysis proceeds from observed features—colour contrasts, compositional balance, salience, framing, gaze, spatial arrangement—it becomes necessary to organise these observations into describable groupings. Over time, these groupings are refined, subdivided, and extended. What begins as a small set of analytic distinctions gradually expands into a more elaborate taxonomy of visual resources.

At a certain point, however, a question becomes unavoidable: is this a system, or a catalogue?

This question is not about empirical adequacy. It is about theoretical architecture.

In a systemic-functional framework, a system is not simply a collection of terms for describing phenomena. A system is a network of meaningful contrasts: a structured set of mutually defining options in which each choice gains value through its relation to other possible choices.

A catalogue, by contrast, is an inventory. It lists features, types, or recurrent patterns without necessarily specifying the structured relations of opposition that would make those features part of a system.

The distinction is subtle but decisive.

A system explains structure. A catalogue describes structure.

A system is paradigmatic in orientation: it is organised by difference. A catalogue is classificatory: it is organised by resemblance.

This is where much work on “visual grammar” becomes methodologically revealing.

When one examines the analytic vocabulary of visual grammar traditions, one often finds a rich set of descriptive distinctions:

  • vectors and narrative processes
  • information value zones (left/right, top/bottom, centre/margin)
  • salience hierarchies
  • framing and separation devices
  • gaze and interactional orientation
  • modality markers (colour saturation, brightness, detail)

Taken individually, these distinctions can be extremely insightful. Collectively, they can produce a highly sensitive descriptive apparatus for analysing images.

The question, however, is whether they constitute a system in the systemic-functional sense.

Do these categories form a tightly organised network of oppositions in which each term is defined through its relation to the others?

Or do they function more as a flexible inventory of analytical lenses that can be applied as needed to particular images?

This is the point at which the difference between system and catalogue becomes critical.

A systemic-functional system is not defined by the number of categories it contains, but by the internal necessity of their relations. In a system, categories are not simply co-present; they are mutually constraining. The value of one term depends on the structured set of alternatives from which it is selected.
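The contrast can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not a proposed description of visual meaning: the feature names (`figure`, `toward-viewer`, and so on) and the helper names (`System`, `is_valid_selection`) are hypothetical placeholders invented for the example. It shows a catalogue as a flat list of labels, and a system as a set of mutually exclusive options with an entry condition, so that selecting one term constrains what else may be selected.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple

# A catalogue: a flat inventory. No entry relates to any other entry.
CATALOGUE = ["salience", "framing", "gaze", "vectors"]


@dataclass(frozen=True)
class System:
    """One system in a network: an entry condition plus contrasting options."""
    name: str
    entry_condition: Optional[str]  # feature that must be selected first; None = always open
    options: Tuple[str, ...]        # mutually exclusive terms

    def value_of(self, option: str) -> Tuple[str, ...]:
        """Return the alternatives an option is defined against."""
        if option not in self.options:
            raise ValueError(f"{option!r} is not a term in system {self.name}")
        return tuple(o for o in self.options if o != option)


# Hypothetical two-system network (placeholder features, assumed for illustration).
NETWORK = [
    System("PRESENCE", None, ("figure", "no-figure")),
    System("ORIENTATION", "figure", ("toward-viewer", "away-from-viewer")),
]


def is_valid_selection(network: Iterable[System], selection: Set[str]) -> bool:
    """Check that a set of features is a coherent path through the network."""
    for system in network:
        chosen = [o for o in system.options if o in selection]
        entered = (system.entry_condition is None
                   or system.entry_condition in selection)
        if entered and len(chosen) != 1:
            return False  # an open system requires exactly one choice
        if not entered and chosen:
            return False  # no choice is available in a system that was not entered
    return True
```

Any subset of `CATALOGUE` is as acceptable as any other, because nothing in a list rules a combination out. In the network, by contrast, `{"no-figure", "toward-viewer"}` is rejected: the second term has no value outside the system it belongs to.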

Without this relational architecture, what remains is classification rather than system.

And classification, however detailed, does not yet provide explanation in the systemic-functional sense.

It describes what is there, but not why those features are necessary within a network of meaning potential.

This is why the question of direction of explanation matters.

When analysis proceeds from below, categories tend to accumulate in response to observed variation. The more images are examined, the more distinctions are introduced to account for differences. Over time, the analytical framework expands horizontally: more types, more labels, more descriptive precision.

What is often missing is vertical integration: the explanation of why these categories exist as a system of differences rather than as a list of observations.

Under the view from above, the direction is reversed. One begins with a hypothesis about the organisation of visual meaning potential and asks how particular structures realise positions within that system. Categories are not derived from accumulation; they are specified by constraint.

This shift has a significant consequence for the status of “visual grammar”.

If visual grammar is understood as a system in the Hallidayan sense, then its categories must be internally motivated by structured oppositions. If, however, it functions primarily as a catalogue of useful distinctions, then it remains descriptively valuable but theoretically underdetermined.

Neither outcome invalidates the work. But they are not equivalent.

The former claims explanatory power. The latter offers descriptive coverage.

The issue, therefore, is not whether visual analysis should classify phenomena. It inevitably must. The issue is whether classification is mistaken for system.

Once this distinction is made explicit, a further question emerges: what would it mean to reconstruct visual analysis so that its categories are not simply accumulated, but derived from a principled organisation of visual meaning potential?

That question marks the transition to the next stage of the series.

Visual Grammar and the View from Above 2. The Hidden Default — Explanation from Below

One of the most characteristic features of work described as “visual grammar” is not always visible at the level of terminology. It is visible instead in the direction of explanation.

Despite its systemic-functional inspiration, much of this work proceeds in a consistent way: it begins with the observable image and works upward toward meaning.

This orientation is so familiar that it can appear methodologically neutral. An image is presented. Its compositional features are identified. Colour contrasts, salience patterns, spatial arrangements, framing devices, and vectors of attention are described. From this descriptive base, interpretive claims are then constructed: this element signifies stability, that element conveys dynamism, this arrangement produces authority or intimacy.

The movement is from structure to meaning.

This is what we can call explanation from below.

It is important to be precise here. There is nothing inherently illegitimate about attending to structure. The issue is not descriptive detail. The issue is explanatory direction.

In an explanation-from-below model, structure is treated as primary. Meaning is treated as derivative. The analyst begins with the visual artefact as a self-contained object and asks what it might signify. Meaning is inferred from configuration.

Even when systemic-functional terminology is used, this orientation often remains intact. Terms such as “realisation”, “resource”, “choice”, or even “metafunction” may be deployed, but the analytical movement still proceeds from observed form toward hypothesised meaning.

In this respect, the methodological logic differs subtly but decisively from Halliday’s principle of the view from above.

From a systemic-functional perspective, explanation does not begin with structure. It begins with system.

Structures are not starting points. They are outcomes.

They are not interpreted first and then assigned meaning. They are explained as the realisation of meaning potential.

This inversion is not a stylistic preference. It is a theoretical commitment.

To see the difference clearly, consider what each orientation takes as primary:

  • Explanation from below:
    Visible form → inferred meaning → tentative systematisation
  • Explanation from above:
    Systemic potential → functional organisation → structural realisation

The two are not different emphases on the same process. They are different models of what explanation is doing.

In the first, the analyst reconstructs meaning from observable features. In the second, the analyst explains observable features through meaning.

This distinction is crucial because it determines what counts as explanation.

In the from-below model, explanation is successful when it produces a plausible interpretation of what is seen. In the from-above model, explanation is successful when it shows how what is seen is a realisation of a system of meaning.

The difference is subtle but consequential.

In practice, explanation from below tends to encourage a particular kind of analytical proliferation. Once meaning is inferred from structure, the system must be rebuilt inductively from repeated observations. Categories accumulate: types of salience, types of framing, types of gaze, types of composition. The result is often a rich descriptive inventory, but one whose systemic status remains uncertain.

This is precisely where the term “grammar” becomes methodologically significant. If “grammar” is understood as a system of meaning potential, then it must precede and explain such inventories. If it is understood as a label for recurring patterns, then it emerges only after those patterns have been observed.

The ambiguity of “visual grammar” therefore reflects a deeper ambiguity in explanatory direction.

Systemic Functional Linguistics, in its Hallidayan formulation, resolves this ambiguity through its methodological commitment to the view from above. Systems are not abstractions from structures. They are explanatory conditions for structures. Meaning is not inferred from form. Form is interpreted as the realisation of meaning.

Once this orientation is adopted consistently, explanation from below appears not as a competing theory but as a reversal of explanatory priority.

This is the first point at which the stakes of “visual grammar” become visible.

The question is not whether visual analysis should attend to structure. It must. The question is whether structure is taken as the starting point of explanation, or as something to be explained.

The remainder of this series will explore what changes when that distinction is taken seriously.

Visual Grammar and the View from Above 1. Why “Visual Grammar” Is Not a Neutral Term

The phrase “visual grammar” has become one of the central organising terms in multimodal research. It appears to offer a natural extension of linguistic description into the visual domain: if language has a grammar, then images must have one too. The term seems, at first glance, methodologically innocent—simply a convenient way of describing structured regularities in visual meaning.

But terms are never neutral when they import theoretical architecture.

“Grammar” is not a general synonym for structure. In Systemic Functional Linguistics, grammar refers to a specific stratum within a stratified semiotic system: lexicogrammar is the stratum of wording, where semantic options are realised in syntagmatic form. It is not merely “patterning”; it is a theoretically defined level in a carefully differentiated architecture.

Once this is recognised, the phrase “visual grammar” becomes ambiguous in a very precise sense. It may mean one of two things:

  1. A claim that images possess a stratified content plane analogous to language, including a grammatical stratum; or
  2. A metaphorical extension in which “grammar” simply means structured organisation of visual resources.

These are not minor variations in terminology. They are fundamentally different theoretical commitments.

The first position imports the full burden of Halliday’s stratification model into visual semiosis. It implies content organised through an intermediate stratum of realisation, and thus a relation between system and structure analogous to that found in language.

The second position does something quite different. It abandons stratification while retaining the vocabulary of grammar. “Grammar” becomes shorthand for regularities in composition, salience, framing, colour distribution, or spatial arrangement. In this usage, the term no longer names a stratum. It names a descriptive convenience.

The problem is that both uses frequently operate simultaneously, often without explicit acknowledgement of the shift. As a result, “visual grammar” appears to carry theoretical weight derived from Systemic Functional Linguistics while functioning descriptively in a much weaker sense.

This ambiguity matters because it masks the explanatory direction of the analysis.

If “grammar” is taken in its systemic-functional sense, then analysis must proceed from above: from systems of visual meaning potential towards their structural realisations. If, however, “grammar” is used as a label for observable patterns, then analysis proceeds from below: from structures upwards to inferred meanings.

The same term thus encodes two incompatible methodological orientations.

This is why “visual grammar” is not a neutral term. It is a point at which two different theories of explanation silently diverge while appearing to coincide.

The issue is not whether images are structured. They clearly are. The issue is how that structure is to be explained, and in which direction explanation is allowed to move.

A systemic-functional approach begins from a simple methodological commitment: explanation proceeds from the view from above. Systems are not derived from structures; structures are explained through systems. Function is not inferred from form; form is explained through function. Meaning is not reconstructed from observable features; observable features are interpreted as realisations of meaning.

Once this orientation is adopted, the ambiguity in “visual grammar” becomes visible as a theoretical fault line rather than a terminological preference.

The task of the following discussion is therefore not to reject the term outright, but to ask what kind of explanation is being performed when it is used—and whether that explanation is genuinely systemic-functional, or only borrowing its vocabulary.

Towards a Systemic-Functional Theory of Images: The View from Above and the Study of Visual Semiosis

Systemic Functional Linguistics is often characterised through its analytical concepts: system, metafunction, register, stratification, grammar, semantics, and many others. Yet Halliday repeatedly characterised the theory in a different way. He described it as giving priority to the view from above.

This principle is so familiar within Systemic Functional Linguistics that its significance is often overlooked. Yet it may be the defining methodological commitment of the systemic-functional enterprise.

The present essay does not propose a new interpretation of Halliday's work. On the contrary, it returns to a principle that Halliday himself repeatedly foregrounded and explores its implications for the study of visual semiosis.

The argument is straightforward. A theory is systemic-functional to the extent that it gives explanatory priority to the view from above. This orientation is not confined to any one part of the theory. It recurs throughout Halliday's architecture. When applied to visual semiosis, it suggests a distinctive approach to images—one that begins not with visible forms but with the systems of meaning those forms realise.

The View from Above

Halliday's notion of the view from above is methodological rather than merely descriptive. It concerns the direction in which explanation proceeds.

In traditional approaches to grammar, explanation often begins with structures. The analyst identifies formal configurations and then attempts to determine their significance. Halliday reversed this orientation. Grammar was approached as a network of meaningful choices. Structures were explained through the systems they realised.

This commitment extends far beyond grammar.

Across the architecture of Systemic Functional Linguistics, explanation repeatedly proceeds from the more abstract pole of a relation towards the more concrete.

Structures are explained through systems.

Forms are explained through functions.

Language is explained through context.

Instances are explained through potential.

The precise nature of these relations differs across dimensions of the theory. The relation between system and structure is not identical to that between context and text, nor is either identical to the relation between potential and instance. Yet the explanatory orientation remains remarkably consistent. Semiotic phenomena are explained by relating them to the broader potentials, functions, contexts, and systems in which they participate.

This is the significance of the view from above.

It does not deny the importance of structures, forms, texts, or instances. It simply refuses to treat them as self-explanatory.

Beyond Language

The importance of this methodological principle becomes particularly apparent when Systemic Functional Linguistics is extended beyond language.

Discussions of visual semiosis often begin with visible forms. Analysts examine colours, shapes, spatial arrangements, framing relations, and other observable features. The central question becomes: what do these forms mean?

From a systemic-functional perspective, however, this is not the natural starting point.

The question is not what a visible form means. The question is what system of meaning the form realises.

The distinction is fundamental.

The first approach begins from the observable artefact and works towards explanation.

The second begins from meaning potential and seeks to explain the artefact.

The difference is methodological rather than terminological.

Indeed, an analysis may employ systemic-functional vocabulary while proceeding from below, just as an analysis may remain deeply Hallidayan while using very little specialised terminology. What matters is not the presence of particular labels but the direction of explanation.

The view from above therefore provides a criterion for evaluating what it means for an analysis to be genuinely systemic-functional.

Images and the Architecture of Semiosis

Once the view from above is adopted, a number of questions concerning visual semiosis appear in a different light.

The first concerns the status of images themselves.

Images are semiotic. They possess content and expression. They participate in contextual meanings. They may exhibit metafunctional organisation. They may vary according to register. None of this is controversial.

What remains less clear is whether visual semiosis possesses the same internal architecture as language.

Language is distinctive in that its content plane is stratified. Semantics is realised by lexicogrammar, which is in turn realised by expression. This architecture has often encouraged the assumption that grammar is a necessary component of semiosis.

Yet no such conclusion follows.

A semiotic system may possess content and expression without possessing a lexicogrammatical stratum. Indeed, Halliday's account of semiosis strongly suggests that language is exceptional rather than typical in this respect.

The implication is significant.

Images need not be treated as languages in order to be treated as semiotic.

Nor does the absence of grammar imply the absence of semiotic organisation.

The task is therefore not to identify visual equivalents of clauses, phrases, or grammatical structures. The task is to investigate the systems of meaning that organise visual semiosis on its own terms.

Content and Expression

The importance of the view from above becomes particularly clear when content and expression are distinguished.

Visual analysis frequently moves directly from visible features to meanings. Colour provides a familiar example. Blue may be associated with tranquillity, red with danger, green with nature, and so on.

Such observations may be insightful, but they often blur the distinction between content and expression.

Colour belongs to expression.

Meaning belongs to content.

The significance of colour lies not in the fact that colour is meaning, but in the fact that colour may participate in the realisation of meaning.

The distinction is crucial because it preserves the architecture of semiosis. Content and expression are related through realisation; they are not identical.

A systemic-functional theory of images must therefore resist the temptation to collapse visible forms into meanings. The task is not to assign meanings to colours, shapes, or spatial arrangements. It is to understand how such expressive resources participate in systems of visual meaning.

Reclaiming the View from Above

At this point it becomes possible to state the central claim of this essay.

The challenge facing the study of visual semiosis is not whether linguistic categories should be extended into visual domains. Nor is it whether images possess a grammar analogous to that of language.

The more fundamental question is methodological.

What would it mean to approach images from above?

Such an approach would begin not with visible forms but with visual meaning potential.

It would seek to identify systems of visual meaning before attempting to describe their structural or expressive realisations.

It would treat visual structures as requiring explanation rather than as providing explanation.

And it would investigate visual semiosis through the same methodological commitment that Halliday placed at the centre of systemic-functional theory.

In this sense, the goal is not to transform images into language.

It is to take semiosis seriously.

Towards a Research Agenda

A systemic-functional theory of images remains largely undeveloped.

The purpose of this essay has not been to provide such a theory but to clarify the methodological principle from which one might emerge.

Once the view from above is adopted, a range of questions come into focus.

What systems organise visual meaning?

How is visual content differentiated?

How are visual content and visual expression related?

How are visual registers organised?

How are ideational, interpersonal, and textual meanings actualised in visual semiosis?

These questions cannot be answered in advance. They require sustained theoretical and empirical investigation.

What can be said, however, is that the answers are unlikely to be found by beginning with visible forms alone.

Halliday's enduring contribution was not merely a collection of analytical categories. It was a distinctive mode of explanation. He insisted that semiotic phenomena are best understood from above rather than below, through the broader systems, functions, contexts, and potentials that make them possible.

If a systemic-functional theory of images is to emerge, it will emerge from that same commitment.

The view from above is more than Halliday's methodological principle. It remains the indispensable starting point for any genuinely systemic-functional account of visual semiosis.