Film, in general, is a narrative medium, or, at least, a medium of many narrative capacities. Nearly every film, except specific types of experimental films and documentaries, includes at least a few basic narrative structures. This applies especially, but not only, to feature films. If we take the representation of a change of state as a basic necessary condition for narrativity—and thus follow a broad definition of narrativity—moving pictures have at least two basic possibilities of narrative representation: a) to represent motions (and therefore changes) within one shot; b) to confront two (or more) comparable states through the combination of shots into sequences (i.e. the process of editing or montage in terms of classical film theory). Both modes of narrative representation have a visual and an auditive dimension, as virtually every sound film has a visual and an auditory channel addressing the spectator’s sense of vision and sense of hearing.
The general proposition of a narrow definition of narrativity that there is no narrative without a narrator (Margolin → Narrator) poses particular problems when applied to narration in feature films. Though almost all feature films abound in storytelling capacities and thus belong to a predominantly narrative medium, their specific mode of plurimedial presentation and their peculiar blending of temporal and spatial elements set them apart from forms of narrative that are principally language-based. The narratological inventory, when applied to cinema, is bound to incorporate and combine a large number of “co-creative” techniques “constructing the storyworld for specific effects” (Bordwell 1985: 12) and creating an overall meaning only in their totality. Instead of a single, language-based narrator, the concept of a more complex “visual” or “audiovisual narrative instance” was introduced (Deleyto 1996: 219; Kuhn 2009, 2011: 87ff.), mediating the paradigms of overtly cinematographic devices (elements relating to camera, editing, sound) and the mise en scène (arranging and composing the scene in front of the camera).
On the other hand, the most solid narrative link between verbal and visual representation is sequentiality, since literary and filmic signs are apprehended consecutively through time, mostly (though not always) following a successive and causal order. It is this consecutiveness that “gives rise to an unfolding structure, the diegetic whole” (Cohen 1979: 92). Both media, narrative literature and film, have a “double chronology” or “double temporal logic,” i.e. an external movement (“the duration of the presentation of the novel, film […]”), and an internal movement (“the duration of the sequence of events that constitute the plot”) through time (Chatman 1990: 9). The main features of narrative strategies in literature can also be found in film, although the characteristics of these strategies differ significantly. In many cases, it seems to be appropriate to speak of “analogies” between literary and filmic storytelling. These analogies are far more complex than is suggested by any mere “translation” or “adaptation” from one medium into another.
Broadly speaking, there are two different outlooks on cinema that divide the main camps of narratological research. If the medium itself and its unique laws of formal representation serve as a starting-point, many of its parameters either transcend or obscure the categories that have been gained in tracking narrative strategies of literary texts. Thus Metz states that film is not a “language” but another kind of semiotic system with “articulations” of its own (Chatman 1990: 124). Though some of the analogies between literary and filmic narrative may be quite convincing (the establishing shot of a panoramic view can be approximately equated with what Genette (1980) calls zero focalization), many other parallels must necessarily abstract from a number of diverse principles of aesthetic organization before stating similarities in the perception of literature and film. Despite the fact that adapting literary texts into movies has long since become a conventional practice, the variability of cinematographic modes of narrative expression calls for such a number of subcategories that the principle of generalization (inherent in any valid theory) becomes jeopardized.
If, however, narratological principles sensu stricto move to the fore of analysis, the question of medial specificity seems to be less important. Narratologists of a strongly persistent stance regret that connotations of visuality are dominant even in terms like point of view (Niederhoff → Perspective – Point of View) and focalization (Niederhoff → Focalization), and they maintain that the greatest divide between verbal and visual strategies is in literature, not in film (Brütsch 2011). They further hold that narratological categories in film and literary studies differ much less than most scholars would suggest. Since Genette’s (1980) model presents a primarily narratological, transliterary concept (albeit close to novel studies), mediality is seen as affecting “narrative in a number of important ways, but on a level of specific representations only. In general, narrativity can be constituted in equal measure in all textual and visual media” (Fludernik 1996: 353).
The two approaches depend on which scholarly perspective is preferred: either how far narrative principles can be limited to questions of narrativity alone, or whether the affordances of the medium have conclusive consequences for its narrative capacities. It is our view that the position most suitable for a narrative theory of film lies in between these approaches. Approaches that put their main focus on media-unbound narrative strategies should be confronted with questions of mediality. Furthermore, approaches that concentrate overwhelmingly on questions of mediality should match their results with general narrative theories. If, for example, we take established narratological concepts such as focalization, order or diegetic level as a point of departure to develop a systematic model for narratological film analysis, we have to discuss the potentials and limitations of each category in terms of mediality and modify these concepts accordingly (Kuhn 2011: 7ff.). Consequently, due to the hybrid and multimodal nature of film, an approach that examines narrative in film is per se more complex than a theory of literary narration (9).
As a largely syncretistic, hybrid and multimodal form of aesthetic communication, film bears a number of generic characteristics which are tied to the history and capacities of its narrative constituents.
The conventional separation of “showing” and “telling” and (on a different level) of “seeing” and “reading” does not do justice to the plurimedial organization of cinema. Earlier attempts at defining film exclusively along the lines of visualization were meant to legitimize it as an art form largely independent of the established arts. However much meaning can be attributed to the visual track of the film, it would be wrong to state that it is “narrated visually” and little else. Such approaches ignore the plurimedial nature of cinema which draws on multiple sources of temporal and spatial information and its reliance on the visual and auditive senses. This peculiarity makes it difficult to sort out the various categories that are operative in its narration. Like drama, it seems to provide “direct perceptual access to space and characters” (Grodal 2005: 168); it is “performed” within a similar frame of time and experienced from a fixed position. Unlike drama, however, a film is not produced in quasi-lifelike corporal circumstances; rather, its sequences are bound together in a technically unique process (“post-production”) to conform to a very specific perceptual and cognitive comprehension of the world (Grodal 2005: 169). Similar to literary narration, it can influence the viewing positions of the recipient and dispose freely of location and temporal sequences as long as it contains generic signals of shifts in time and space.
Films are generally made by a large group of people, aside from the very few exceptions where the team is reduced to an extremely small group (thus in Fassbinder’s In a Year of Thirteen Moons, 1978, the director is producer, camera operator, sound expert and actor all at the same time). Film, in short, is the result of collective authorship (Gaut 1997; Sellors 2007; Kuhn 2011: 115ff.). It derives its impact from a number of technical, performative and aesthetic strategies that combine in a syncretizing, largely hybrid medium, establishing interlocking conventions of storytelling. As an industrial product, it also reflects the historical state of technology in its narrative structure, whether it is a silent film with intertitles or a film using high-resolution digital multi-track sound, whether a static camera is turned on the scene or a modern editing technique lends the images an overpowering kinetic energy, etc. Not only the mode of production but also the reception of highly varied formats in film history has altered narrative paradigms that had formerly seemed unchangeable. It was thus long a rule that the speed and the sequentiality of a film’s projection are mechanically fixed so that the viewer has no possibility of interrupting the “reading” to “leaf” back and forth through the scenes or of studying the composition of a single shot for longer than the actual running time. In the auditorium-space, the spectator lacks any manifest control over the screen-space. It was only with the introduction of video and DVD that the viewer could control speed variations, play the film backwards, view it frame by frame and freeze it and (as in DVD and Blu-ray) use the digitalized space of navigation to interact, select menus and “construct” a new film with deleted scenes, an unused score and alternative endings (cf. Distelmeyer 2012).
Silent movies from 1895 onward lacked not only verbal expression but also narrative structures beyond the stringing together of stage effects, arranged tableaux and sensationalist trick scenes. What was then perceived as the only striking narrative device consisted in showing these scenes within a framed space and against the common laws of temporal continuity. But on the whole, these movies were still very much indebted to the 19th-century apparatus in which the process of seeing as a perceptual and motoric element was closely connected with pre-cinematic “spatial and bodily experiences” (Elsaesser 1990: 3).
This early “cinema of attractions” (Gunning 1986) gradually made way for “narrativization” (233) from 1907 to about 1913, when films began to move from funfair and vaudeville to the first nickelodeons and Ladenkinos (Paech 1988: 25ff.) through the process of structural organization of cinematic signifiers and the “creation of a self-enclosed diegetic universe” (Gunning 1986: 233). The result, initiated by David Wark Griffith in particular, was an “institutional mode of representation,” also known as “classical narration” (Schweinitz 1999: 74), “continuity editing” or “découpage classique.” The filmic discourse was to create a coherence of vision without any jerks in time or space or other dissonant and disruptive elements in the process of viewing. The basic trajectory of the classical Hollywood ideal (also taken over by UFA and other national film industries) involves establishing a cause-and-effect logic, a clear subject-object relation, and a cohesive effect of visual and auditive perception aimed at providing the story with an “organic” meaning, however different the shots that are spliced together might be. A “seamless” and consecutive style serves to hide “all marks of artifice” (Chatman 1990: 154) and to give the narrative the appearance of a natural observing position. The “real” of the cinema is founded at least as much on the real-image quality of its photography as it is on the system of representation that shows analogies to the viewer’s capacity to combine visual impressions with a “story.” The reason for the latter is that by watching films the spectator becomes more and more used to conventions of classical narration and genre-stereotypes.
Modernist cinema and non-canonical art films, especially after 1945, repudiate the hegemonic story regime of classical Hollywood cinema by laying open the conditions of mediality and artificiality or by employing literary strategies not as an empathetic but as an alienating or decidedly modern factor of storytelling. They disrupt the narrative continuum and convert the principle of succession into one of simultaneity by means of iteration, frequency (Kurosawa’s Rashômon, 1950, repeating the same event from different angles) and dislocation of the traditional modes of temporal and spatial representation (Resnais’ L’année dernière à Marienbad, 1961). In each of these films, there is an ever-widening gap between story and discourse. Modern cinema also made possible the flash-forward as the cinematographic equivalent of the prolepsis (Losey’s The Go-Between, 1970); it used jump cuts (Godard’s À bout de souffle, 1960) and non-linear collage, blurred the borders between “objective” diegetic reality and subjective perception (Polanski’s Le locataire, 1976) or reality and dream (Dead of Night, 1945, various directors), broke with the narrative convention of character continuity, as when a central protagonist disappears in the course of events (Antonioni’s L’Avventura, 1960), or used ironic forms of interplay of verbal and audiovisual narration (Truffaut’s Jules et Jim, 1962). All of these assaults on traditional narration nevertheless “depend upon narrativity” (or our assumptions about it) and “could not function without it” (Scholes 1985: 396).
Even within the context of Hollywood cinema one can find more complex forms of narration, partly, but not exclusively, due to influences and directors from Europe, as in the classical period of film noir (Siodmak’s The Killers, 1946; Dassin’s The Naked City, 1948), in the work of Orson Welles (Citizen Kane, 1941; Touch of Evil, 1958), or in films that are ascribed to New Hollywood in a broader sense (Nichols’ The Graduate, 1967; Scorsese’s Taxi Driver, 1976).
Postclassical cinema, responding to growing globalization in its world-wide distribution and reception, enhances the aesthetics of visual and auditory effects by means of digitalization, computerized cutting techniques, and a strategy of immediacy that signals a shift from linear discourse to a renewed interest in spectacular incidents (see § 3.5).
Editing is one of the decisive cinematographic processes for the narrative organization of a film: it connects montage (e.g. the splitting, combining and reassembling of visual segments) with the mix of sound elements and the choice of strategic points in space (angle, perspective). The most prominent examples in the early history of filmic narrativization are as follows: (a) the simple cut from one scene to another, thus eliminating dead time by splitting the actual footage (ellipsis); (b) cross-cutting, which alternates between shots of two spaces, as in pursuit scenes; (c) parallel montage to accentuate similarity and opposition; (d) the shot-reverse-shot between two persons talking to each other; (e) the “cut-in,” which magnifies a significant detail or grotesquely distorts certain objects of everyday life.
Continuity editing aims primarily at facilitating orientation during transitions in time and space. One basic rule consists in never letting the camera cross the line of action (180-degree rule), thus respecting geometrical orientation within a given space. Whereas continuity editing presupposes a holistic unity in a world which is temporarily in conflict but finally homogenized, Ėjzenštejn’s collision editing accentuates stark formal and perceptual contrasts to create new meanings or unusual metaphorical links (Grodal 2005: 171). For other directors (e.g. Pudovkin), narration in film concentrates not on events being strung together in chronological sequence but on the construction of powerful situations and significant details presented in an antithetical manner of association. “Internal editing,” as advocated by André Bazin, avoids visible cuts and creates deep focus (depth of field), making foreground, middle ground and background equally sharp and thus establishing continuity in the very same take, as is the case in the work of Orson Welles (e.g. Citizen Kane, 1941).
To evoke a sense of the “real,” film creates a temporal and spatial continuum whose components can be separated only for heuristic purposes. “[I]n their succession and fusion they [images] permit the appearance of temporally extended events in their total concrete development” (Ingarden 1973: 324). The temporally organized combination of visual and acoustic signs corresponds to the unmediated rendering of space, albeit on a two-dimensional screen. The realization of a positioned space lies in movement, which imposes a temporal vector upon the spatial dimension (Lothe 2000: 62). Panofsky describes the result as “a speeding up of space” and a “spatialization of time” (1993: 22). This also explains the inherent dialectic of film as the medium that appears closest to our perception of the real world and yet deviates from real-life experience by its manifold means of mediating and establishing a “second world” of fantasy, dream and wish fulfillment. Time can be either stretched out in slow motion or compressed in fast motion; different spaces may be fused by double exposure or by a permanent tension between external and internal time sequences. Thus narration in cinema has to deal both with the representational realism of its images and its technical devices in order to integrate or dissociate time and space, image and sound, depending on the artistic and emotional effect that is to be achieved.
Fulton emphasizes the role of sound in film: “[It] is one of the most versatile signifiers, since it contributes to field, tenor and mode as a powerful creator of meaning, mood and textuality” (Fulton 2005: 108). It amplifies the diegetic space (thus Bordwell [1985: 119] speaks of “sound perspective”) and emphasizes modulation of the visual impact through creating a sonic décor or sonic space. Language, noises, electronic sounds and music, whether diegetic or (like most musical compositions) non-diegetic, help not only to define the tonality, volume, tempo and texture of successive situations but also to orchestrate and manipulate emotions and heighten the suggestive expressivity of the story. Sound can range from descriptive passages to climactic underlining and counterpointing what is seen. Again, what was once considered as a complete break with narrative rules has become a convention, so that when off-camera sounds are used before the scene they are related to, they serve as a “springboard” between sequences.
As Elsaesser and Hagener point out, there is a potential dissociation between body and voice as well as between viewing and hearing which can be used for comic purposes, but which also stands “in the service of narration” (2007: 172–73). A voice may have a specific source in the diegetic space, although separate from the images we see (“voice-off”), or it can be heard beyond the diegetic limits (“voice-over”) (Kuhn 2011: 187ff.). Irritating effects can be achieved when the interplay of voice and vision is used in an unconventional way, as when in a long narrative passage in mainstream cinema the words of an (extra- or intradiegetic) voice are not supported by images at all. Thus Chion, for example, speaks of a “specifically cinematic” event “when the screen doesn’t show what the words evoke, and instead the camera remains exclusively with the talking face of the storyteller and the reactions of onscreen listeners” (2009: 399–400, emphasis in the original). New technologies such as multi-track sound with high digital resolution (e.g. Dolby Surround) negate the directional coherence of screen and sound source, thus leading to tension between the aural and the visual. While the image can be fixed, sound comes into existence from the moment it is perceived.
One of the most controversial issues in film narratology concerns the role of the narrator as an instrument of narrative mediation. This reflects the difficulty of specifying the narrative process in general and, more than any other question, reveals the limits of literary narrativity when applied to film studies.
With the exception of the character narrator and the cinematic device of the voice-over, the traces of a narrating agency are virtually invisible, so that the term “film narrator” is employed as hardly more than a metaphor. Disagreements over terminology sprang up from the beginnings of film theory. Thus the term “film language,” if not used for a system of signs as was done by the formalists, bore the implication that there must also be a “speaker” of such a language. Modeling cinema after literature in this way, however, runs counter to cinema as an independent art form. For this reason, Ėjxenbaum transferred the structuring of cinematographic meaning to “new conditions of perceptions”: it is the viewer who moves “to the construction of internal speech” (1973: 123).
The first systematic interest in narratology came from the semiotic turn of film theory starting in the 1960s, notably with Metz’s construct of the grande syntagmatique (1966). In order to overcome the restriction to small semiotic units (e.g. the single shot in cinema), the concept of “code” was used to encompass more extensive syntagmata in film such as sequences and the whole of the narration. In Metz’s phenomenology of narrative, film is “a complex system of successive, encoded signs” (Lothe 2000: 12). Metz’s position was criticized by Heath (1986), who saw in it a neglect of the central role of the viewer in making meaning (Schweinitz 1999: 79). By excluding the subject position of the spectator, a predominantly formalistic approach overlooks the potentially decisive impact of affectivity and subconscious processes. For this reason, psychoanalytic theories concentrated on the similarities that exist between film and dream, hallucination and desire, as important undercurrents of the realist surface. Feminist theories dealt with the gendered gaze that is applied not only in the film itself, but also cast on the film by the viewer, thus creating a conflict between voyeurism and subjugation to the power of images. Studies of popular culture, finally, examined the functioning of cinematic discourse within a wider cultural communicative process which is conveyed by a host of visual signs.
Whether one follows the notion of film narrator or not, and whether or not one emphasizes the role of the spectator in the process of making meaning, the act of audiovisual narration is to be described as an interplay of different visual, auditive and language-based sign systems or codes. Not only the moving picture within one shot (i.e. the process of selection, perspective and accentuation by the camera, or cinematography), but also the combination of shots into sequences (through the process of editing) is of crucial importance for the act of audiovisual narration. When cinematic narration is realized through showing, there is no categorical separation between what the camera shows within a shot and what the editing reveals through the combination of various shots. Quite often the difference from one shot to another is the only indication of a change of state. However, aspects of the mise en scène are also part of the act of narration. Camera parameters as well as parameters of the montage mediate the narrative events and the mise en scène. Thus shot composition, lighting and set design can contribute significantly to audiovisual narration. The same holds true for all elements of sound (see § 3.1.7).
The same change of state (e.g. a collapsing building) can be represented within one shot (hence mediated through the parameters of the camera) or through a combination of two (or more) edited shots (hence mediated through the process of montage). This extends to more complex chains of events. The normal case is a combination of camera and montage supported by other auditive and visual elements of the mise en scène (Lohmeier 1996: 37; Kuhn 2011: 72ff.). Coherent actions and events are often, but not always, separated into different shots, as in shot-reverse-shot sequences to represent a conversation or in cross-cutting sequences to represent a car chase (see § 3.1.5), although there is no necessity to do so. Many events, such as movements of characters within space or even highly eventful incidents like a murder, can be represented within one shot. Complex camera movements can show many connected or episodic actions within one single shot, as in long-lasting sequence shots like the famous opening of Welles’ Touch of Evil (1958), or in forms of “internal montage” (see § 3.1.5). Extreme sequence shots can be found in movies that consist of only one or very few shots, like Hitchcock’s Rope (1948) or Sokurov’s Russkij kovcheg (2002). In contrast, a conventional feature film usually has more than 300 shots. This explains why any approach that takes the camera as narrator (as in the so-called invisible-observer models) is as one-sided as the opposite position that overestimates the role of montage or editing in the act of audiovisual narration.
In the 1980s, the more systematic narrative discourse of the Wisconsin School resorted to a cognitive and constructivist approach, defining the narrative scheme as an optional “redescription of data under epistemological restraint” (Branigan 1992: 112). Its main interest lies in a strictly rational and logical explication of narrative and in mental processes that render perceptual data intelligible. Whereas Chatman’s concept of narration is still anchored in literary theory (Booth, Todorov), seeing the visual concreteness of cinema as its basic mark of distinction from literature, Branigan and Bordwell abandon straightaway the idea of a cinematic narrator or a narrative voice. They hold that the construct of the narrator is wrapped up in the “activity of narration” itself, which is performed on various levels: “To give every film a narrator or implied author is to indulge in an anthropomorphic fiction” (Bordwell 1985: 62). The author as an “essential subject” who is in possession of psychological properties or of a human voice is replaced by the notion of narration understood as a process or an activity in comparison to narrative and which is defined as “the organization of a set of cues for the construction of a story” (62) presupposing an active perceiver of a message but no sender. According to Bordwell and Branigan, cinematographic narratives cannot be understood within a general semiotic system of narrative but only in terms of historically variant narrative structures that are perceived in the act of viewing. 
It follows from this that certain prerequisites of filmic narration are not “natural” or taken from literary models, but have been conventionalized: such is the case when a character’s walk from A to B is shortened to the points of departure and arrival with a sharp cut in between, or when a flashback bridges vast leaps of time, or when non-diegetic music forms no part of the story proper even though it may reflect the inner state of a character or establish a certain mood. The same holds true for the almost imperceptibly varying amount of information that is shared by characters and audience alike.
The effacement of the narrator and the idea that film seems to “narrate itself” stand in contrast to the impression that all visual and auditive modes impart an authorial presence or an “enunciator,” however impersonal. Many different terms and theoretical constructs have been introduced to overcome the logical impasse of having a narration without a narrator in the narrow sense (cf. Griem & Voigts-Virchow 2002: 162; Steinke 2007: 64): “camera,” “camera eye,” “invisible observer” (cf. Bordwell 1985: 9ff.); “intrinsic narrator” (Black 1986); “ultimate narratorial agency” or “supra-narrator” (Tomasulo 1986: 46); “cinematic narrator” (Chatman 1990: 124ff.); “‘camera’” in a metaphoric sense (Schlickers 1997); “film narrator” (Lothe 2000: 27ff.); “mega-narrator” (Gaudreault 2009: 81ff.); “audiovisual/visual narrative instance” (Kuhn 2011: 83ff.), etc. Kuhn (ibid.) suggests, as a heuristic step in the process of analyzing the narrative structure of feature films, differentiating between “(audio)visual narrative instances” and “verbal narrative instances,” preceding a description of their interplay in the process of audiovisual narration.
What is common to most definitions is the existence of some overall control of visual and sonic registers where the camera functions as an intermediator of visual and acoustic information. The invisible observer theory even maintains that it is the camera that narrates (the French director Alexandre Astruc coined the famous phrase “caméra stylo”). This view, however, ignores the contribution of editing, non-diegetic sound and aspects of the mise en scène to the act of audiovisual narration (cf. § 3.2.2). The few experimental films that construct events “through the eyes” of the main character (e.g. Montgomery’s The Lady in the Lake, 1947), thus creating an unmediated presence by means of internal ocularization (cf. § 3.3.1), make the viewer painfully aware of the impersonal and subjectless apparatus of the camera which alienates them from the character rather than drawing them into his ways of seeing and feeling. In recent years there have been more convincing examples of “point-of-view-camera films” that ground the limitations of the apparatus in a specific thematic constellation, as when the subjective camera is to represent the subjective perception of a locked-in syndrome patient (Schnabel’s Le scaphandre et le papillon, 2007) or the perception of a disembodied consciousness (Sokurov’s Russkij kovcheg, 2002) (see Kuhn 2011: 177ff.).
Though there are filmic devices to give a scene the appearance of unreliability or deception, the “visual narrator” in film cannot tell a downright lie that is visualized at the very same moment unless the veracity of the photographic image is put into question (cf. the fabricated, hence “untrue” flashback in Stage Fright, 1950, which director Alfred Hitchcock considered a failure). However, there can be various types of fictional contracts with the audience that transcend the postulate of narrative verisimilitude, allowing even a dead person to tell his story as a “character narrator” (Wilder’s Sunset Boulevard, 1950; Mendes’ American Beauty, 1999), or when a film is built around a puzzle, putting into question any form of reliable narration (a summary of “unreliable situations” in cinema is given in Liptay & Wolf eds. 2005, passim; Helbig ed. 2006, passim; Laass 2008, passim; Shen → Unreliability). Recent cinema has seen a variety of forms that can be subsumed under the term of unreliability in a broad sense, e.g. films that make use of the tension between verbal and visual narration, between what Genette calls internal and zero focalization or between different diegetic levels in order to achieve different effects of unreliability. Very often such films get along without misreporting in terms of “lying pictures” (i.e. pictures that provide erroneous information about the storyworld) by using forms of irritating, ambivalent or misleading editing or different types of underreporting. However, nowadays one can also find forms of unreliable narration that contain “lying pictures” such as those used by Hitchcock in Stage Fright but that are embedded in more complex narrative structures, such as the multi-level flashback structure of The Usual Suspects that creates a tension between what Kuhn (2011) calls intradiegetic, homodiegetic verbal and extradiegetic, heterodiegetic visual narration.
Point of view (POV) clearly becomes the prime starting point for narratology when applied to film. Although it has been defined as “a concrete perceptual fact linked to the camera position” (Grodal 2005: 168), its actual functions in narrative can be far more flexible and multifarious than this definition suggests. As Branigan states, point of view can best be understood as organizing meaning through a combination of various levels of narration which are defined by a “dialectical site of seeing and seen” or, more specifically, the “mediator and the object of our gaze” (1984: 47). Branigan offers a model of seven “levels of narration” which allows for constant oscillation between these levels, from extra-/heterodiegetic and omniscient narration to adopting the highly subjective perception of a character. Fulton speaks of a “multiple focalisation” that is “realized by different camera angles that position us to see the action from a number of different viewpoints” (2005: 114). Yet there are many more focusing strategies which select and control our perception as well as our emotional involvement, such as deep-focus, the length and scale of a shot, specific lighting, etc. The prerequisite for any POV analysis, however, is the recognition that everything in cinema consists of “looks”: the viewer looks at characters who look at each other; or s/he looks at them, adopting their perspective of the diegetic world, while the camera frames a special field of seeing; or the viewer is privileged to look at something out of the line of vision of any of the characters. Thus the very question “Who sees?” involves a categorization of different forms of POV that organize and orient the narrative from a visual and spatial standpoint and that also include cognitive processes based on a number of presuppositions about a proper perspective, not to speak of auditory information.
Therefore, in almost every narratological model of focalization and narrative perspective, the camera perspective (in a technical sense) is not understood as the only factor for determining focalization and/or narrative perspective (focalization/narrative perspective ≠ camera perspective). To analyze focalization, one has at least to take into account the complex interplay between camera parameters, montage and auditive elements. The question of focalization in film becomes even more sophisticated in the case of voice-over narration, as there is the possibility of different forms of interaction and/or tension between verbal and audiovisual narration.
POV has been understood as an optical paradigm or, quite literally, as visual point (or “eyepoint”): it is “ocularization” that is believed to determine both the position of the camera and the “look” of a character. Schlickers speaks in this respect of a “double perspectivation” (2009). In many cases it seems almost impossible to come to a clear conclusion whether the camera imitates the eyepoint of a character (i.e. the literal viewpoint as realized in “eye-line matches”) or whether it observes “from outside” in the sense of narrative mediation. So we may see something “with the eyes” of a character whose back is visibly turned to us (“over-shoulder shot”) or of a character who tries to grasp a tangible object that dissolves in the air like a hallucination, as is the case in Lang’s Die Nibelungen (1924) when the Nibelung treasure appears to Siegfried on a rock. Jost suggests distinguishing between internal focalization and zero focalization (1989: 157) whereas Bal differentiates between focalization on “perceptible” objects and focalization on “imperceptible” objects (1997: 153). Both alternatives, however, neglect the possibility of the blurring of the two types of focalization. Moreover, it makes a difference whether we are to gain an impression of what a character feels and thinks or whether the film seeks to present “objective” correlatives of the mental and emotional dispositions of a protagonist. The possible mingling of “real” and mental aspects makes it difficult to differentiate. Focalization can shift all around its diegetic world (Fulton 2005: 111) without any noticeable breaks in the narration or any unconventional narrative techniques. Though narratology possesses tools for analyzing these shifts, the categories used for film analysis seem to be far more complicated than those employed for literary narration. Kuhn (2011) developed a model for fine-grained analysis of focalization, ocularization and auricularization on the macro- and micro-levels.
He understands focalization in terms of knowledge, i.e. the relation of knowledge between (audiovisual and verbal) narrative instance and character, and separates it from questions regarding perception in a narrower sense. In the context of the visual aspects of perception (seeing), he uses the term ocularization, and for the auditory aspects (hearing), the term auricularization. Based on the models by Jost (1989) and Schlickers (1997), but with more differentiated categories, Kuhn (2011: 122ff.) defines internal, external and zero variants of focalization, ocularization and auricularization, describes the main types that can be found in feature films and relates different forms of internal ocularization to Branigan’s model of point of view structures (Branigan 1984: 103ff.; Kuhn 2011: 140ff.). To reveal the capacities to represent subjectivity and mental processes in film, i.e. the possibility of character introspection in film, Kuhn identifies several forms of “mindscreen” and proposes categories such as mental metadiegesis, mental projection, mental overlay and mental metalepsis as heuristic tools (149ff.).
Films and audiovisual artifacts such as Fassbinder’s epilogue to Berlin Alexanderplatz (1980) are characterized by a complex interplay of different audiovisual and verbal narratives or, in terms of a communication model, by an interplay of different narrative instances or agents. Next to visual narration, various verbal narratives are employed on the extradiegetic level (in the form of various voice-overs, intertitles, and text captions). Every extradiegetic verbal narrative instance can be either heterodiegetic or homodiegetic in its relation to the diegetic world. Each of them can focalize differently and be in opposition to the audiovisual focalization.
There is, in general, no categorical relation of dominance between visual and verbal narration in film, no primacy of the image. The verbal narrative is not automatically superior to the visual narrative or vice versa. A wide range of relations is possible: the reliable extra-heterodiegetic visual narrative instance can, for example, uncover the unreliable extra-homodiegetic verbal narrative instances (Mankiewicz’s All about Eve, 1950). However, the visual narrative instance might also be unreliable (Fincher’s Fight Club, 1999), or its reliability can be called into question with the help of verbal narrative instances (Kurosawa’s Rashômon, 1950). An extradiegetic verbal narrative instance may dominate the visual narrative instance and reduce it to an illustrating function (the opening of Anderson’s Magnolia, 1999); however, it can also just serve to structure what the visual narrative instance shows, order it in time and space or summarize the back story (expository voice-overs, intertitles indicating the action’s setting in silent movies). The relation can be alternating and ironical, as in Truffaut’s Jules et Jim (1962), or ambivalent, as in Resnais’ L’année dernière à Marienbad (1961). In silent movies this interplay is also encapsulated in a complex way because of different methods of speech representation, such as reports by a narrator or quoted direct speech in intertitles.
To illustrate the interplay of verbal narration and visual images in film, Kozloff (1988: 103) suggests “a continuous graph” comprising three areas: “disparate,” “complementary,” “overlapping.” She introduces neither binary nor clearly delimited categories but speaks rather of the “degree of correspondence between narration and images,” a reasonable proposal because distinct boundaries cannot be drawn. Kuhn (2009: 265–66; 2011: 98ff.) has suggested some new and useful modifications to Kozloff’s categories so as to develop a model for describing the dynamic relations between visual and verbal narrative instances as contradictory, disparate, complementary, meshing, polarizing, illustrating or paraphrasing.
Since the mid-1990s an increasing number of popular mainstream films have made use of several special devices of audiovisual narration in order to achieve dense and complex narratives and/or create suspense through narrative discourse rather than through their storylines: the conventions of classical filmic narration are subverted and/or become the subject of a self- and media-reflexive game through the use of multiple narrative levels (Amenábar’s Abre los ojos, 1997; Jonze’s Adaptation, 2002), different forms of narrative unreliability (Singer’s The Usual Suspects, 1995), sudden final twists (Shyamalan’s The Sixth Sense, 1999), creative use of genre conventions (Tarantino’s Pulp Fiction, 1994), and/or intertwined film-in-film and narrative-in-narrative structures (Almodóvar’s La mala educación, 2004), etc. Encapsulated and fast-changing processes of focalization are used to build puzzle and mystery structures (Marcks’ 11:14, 2003) or to deceive the recipient (Colombani’s À la folie … pas du tout, 2002). A “real” diegetic character turns out to be a mental metalepsis at the end of the film (Fincher’s Fight Club, 1999; Howard’s A Beautiful Mind, 2001); two diegetic levels (reality vs. dream) are reappraised during the film (Amenábar’s Los otros, 2001); the circumstances of production are simulated within the film in a self-reflexive manner (Kraume’s Keine Lieder über Liebe, 2005).
When discussing these forms of narration in feature films of the 1990s and 2000s, one should not forget that movies with self-reflexive, paradoxical and ambivalent narrative structures are not entirely new (cf. § 3.1.3). However, the frequency with which many of these narrative experiments are found in popular feature films nowadays—and also increasingly in popular TV series (Lost, Breaking Bad)—cannot be denied (Helbig 2005: 144).
What was pointed out in the previous section also holds true for many narrative phenomena that can be regarded as trends in recent cinema and TV. For instance, we can find the phenomenon of metalepsis in films like McTiernan’s Last Action Hero (1993), where a character of the diegetic storyworld happens to get into a metadiegetic action film and returns to diegetic reality accompanied by the action hero of this film-within-a-film, or in Gary Ross’s Pleasantville (1998), where characters of a contemporary diegetic world get lost in a metadiegetic black-and-white TV series of the 1950s. These kinds of structures have forerunners in film history: as early as 1924, in Buster Keaton’s Sherlock Jr., the main character, a film projectionist, “dreams himself into” the movie he projects. In Allen’s classic The Purple Rose of Cairo (1985), a metadiegetic character jumps out of the screen to live within the diegetic world (Pier → Metalepsis). The same applies to phenomena of mental representations (“mindscreen,” mental projections, mental metadiegeses, etc.). Creative forms of representations of subjectivity that nowadays appear in the micro-structure of movies like Jeunet’s Le fabuleux destin d’Amélie Poulain (2001) or in the macro-structure of movies like Nolan’s Inception (2010) can be compared with examples throughout film history: in Murnau’s classic Der letzte Mann (1924) one can trace specific forms of representing dreams and hallucinations due to heavy use of alcohol; memory and dream sequences are as typical of Bergman’s Smultronstället (1957) as hallucinatory sequences of Liebeneiner’s Liebe 47 (1949) or ambivalent delusions of Polanski’s Le locataire (1976).
Given these and (many) other examples, hypotheses on narrative “trends” in recent cinema and TV should be modified with regard to historical development. A historical film narratology will seek to identify these narrative forms and devices throughout the history of film on the basis of existing systematizations and classifications and describe their geneses. The international influence of classical Hollywood cinema (Bordwell et al. 1985) was one of the main reasons that, for quite a long period of film history, narrative experiments that are regarded as innovative even today could hardly be found in US-American and European mainstream cinema. On the one hand, many prototypes of experimental and complex narration, as used in recent feature films, also appear in earlier periods of film history beyond the Hollywood cinema (even quite early in the history of the feature film). On the other hand, however, there are numerous new possibilities for achieving narrative effects with the help of film and computer technology, notably the creation of visual effects using digital devices. Digital effects are more than just a surprising “gimmick” when they are functionalized for different aspects of narration (cf. Kuhn 2012a). This is not the only reason why more innovative narrative forms have come to be regarded as verisimilar; another reason is the increasing speed and flexibility of recent filmic narration, which is currently a major trend. Due to developments in media convergence, transmedia storytelling, digital cinema and so-called quality or complex TV, the narrative capacities of film and audiovisual media are by no means exhausted.
(a) Film portrays a story unfolding in time according to the possibilities and constraints of the medium. Various levels of structuring, perception and cognition, many of them rooted in convention, are related to a logic of combination which determines the basic qualities of filmic narration. This paves the way for two approaches which should be tried in fruitful competition. Either the complexity of paradigms can be reduced to a model of abstraction, which makes it possible to compare narrative processes in literature, film, and other media; or there must be an attempt to analyze the multiple forms of interplay that stem from the mediality of filmic narration, the double vantage points of seeing and being seen, sight and sound, spatial and temporal elements, moving images and movement within the images.
(b) If narrative is a fundamental issue in filmic signification, its logic must be re-examined with new ways of storytelling in cinema that play games or lead the viewer into a maze of ontological uncertainties. Narrativity, spectator engagement and inventive techniques of presentation combine to produce a “filmic discourse” which a synchronic formal analysis of narrative strategies can grasp only up to a certain point. A diachronic approach should discuss current forms of filmic narrative against the background of the historical developments of film narration, inseparably interwoven with the achievements and capacities of the medium (cf. § 3.6).
(c) Film has not been bound to cinema, at least since TV became popular enough to reach a mass audience. Nowadays one finds audiovisual forms of narration in many different kinds of distribution (videotape, DVD, online stream, Blu-ray; cf. § 3.1.2) embedded into different media environments (homepages, YouTube and other video platforms, Facebook, etc.). New, genuinely online-based forms of audiovisual narration are being developed, such as specific YouTube genres or web series (see Kuhn 2012b). Accompanying the proliferation of user-generated content, numerous creative audiovisual micro-narratives have been published (e.g. mash-up clips on video platforms that narrate in a dense and highly intermedial way). Computer games increasingly make use of audiovisual sequences (so-called cutscenes, as in Heavy Rain). Not least, filmic forms are essential elements of huge transmedial storyworlds in which the central storylines are not developed within one but across multiple media (this is, for example, the case of the web series Lost: Missing Pieces that complements the transmedial storyworld of the TV series Lost, surrounded by a vast storytelling universe encompassing different media).