Thursday, 4 June 2026

Visual Grammar and the View from Above

1. Why “Visual Grammar” Is Not a Neutral Term

The phrase “visual grammar” has become one of the central organising terms in multimodal research. It appears to offer a natural extension of linguistic description into the visual domain: if language has a grammar, then images must have one too. The term seems, at first glance, methodologically innocent—simply a convenient way of describing structured regularities in visual meaning.

But terms are never neutral when they import theoretical architecture.

“Grammar” is not a general synonym for structure. In Systemic Functional Linguistics, grammar refers to a specific stratum within a stratified semiotic system: lexicogrammar is the interface where semantic options are realised in syntagmatic form. It is not merely “patterning”; it is a theoretically defined level in a carefully differentiated architecture.

Once this is recognised, the phrase “visual grammar” becomes ambiguous in a very precise sense. It may mean one of two things:

  1. A claim that images possess a stratified content plane analogous to that of language, including a grammatical stratum; or
  2. A metaphorical extension in which “grammar” simply means structured organisation of visual resources.

These are not minor variations in terminology. They are fundamentally different theoretical commitments.

The first position imports the full burden of Halliday’s stratification model into visual semiosis. It implies content organised through an intermediate stratum of realisation, and thus a relation between system and structure analogous to that found in language.

The second position does something quite different. It abandons stratification while retaining the vocabulary of grammar. “Grammar” becomes shorthand for regularities in composition, salience, framing, colour distribution, or spatial arrangement. In this usage, the term no longer names a stratum. It names a descriptive convenience.

The problem is that both uses frequently operate simultaneously, often without explicit acknowledgement of the shift. As a result, “visual grammar” appears to carry theoretical weight derived from Systemic Functional Linguistics while functioning descriptively in a much weaker sense.

This ambiguity matters because it masks the explanatory direction of the analysis.

If “grammar” is taken in its systemic-functional sense, then analysis must proceed from above: from systems of visual meaning potential towards their structural realisations. If, however, “grammar” is used as a label for observable patterns, then analysis proceeds from below: from structures upwards to inferred meanings.

The same term thus encodes two incompatible methodological orientations.

This is why “visual grammar” is not a neutral term. It is a point at which two different theories of explanation silently diverge while appearing to coincide.

The issue is not whether images are structured. They clearly are. The issue is how that structure is to be explained, and in which direction explanation is allowed to move.

A systemic-functional approach begins from a simple methodological commitment: explanation proceeds from the view from above. Systems are not derived from structures; structures are explained through systems. Function is not inferred from form; form is explained through function. Meaning is not reconstructed from observable features; observable features are interpreted as realisations of meaning.

Once this orientation is adopted, the ambiguity in “visual grammar” becomes visible as a theoretical fault line rather than a terminological preference.

The task of the following discussion is therefore not to reject the term outright, but to ask what kind of explanation is being performed when it is used—and whether that explanation is genuinely systemic-functional, or only borrowing its vocabulary.
