Demonstration Part Three: Assessing the Importance of Television Character Popularity
Computer vision is changing how on-screen representation in the media industry can be measured. It allows large-scale, objective, and granular quantification of representation patterns that would be impractical to compile manually.
In a forthcoming issue of ViewFinder, Learning on Screen's specialist online magazine dedicated to the moving image and education, the relationship between AI and audiovisual media will be explored. The issue will demonstrate two examples of measuring character prominence in broadcast TV using computer vision.
The first example uses an episode of Mock the Week, a popular British comedy panel show, to show how screen time can be used to measure prominence. Prominence is defined as time spent on screen with a clear and sufficiently large face, with longer duration indicating higher relative prominence. A face detector, a type of machine learning model, is used to identify faces on screen. The video is broken down into smaller components called 'scenes'. Each detected face is marked with a rectangular 'bounding box', and a sequence of detected faces is called a 'face track'.
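The article does not say how detections are grouped into face tracks. One common approach, shown here only as a minimal sketch under that assumption, is to link bounding boxes in consecutive frames when they overlap strongly (intersection over union, IoU). The names `iou`, `link_tracks`, and `FaceTrack`, and the 0.5 threshold, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

# A bounding box as (x1, y1, x2, y2) in pixels; this layout is an assumption.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class FaceTrack:
    """Hypothetical container: frame indices and the box seen in each."""
    frames: list[int] = field(default_factory=list)
    boxes: list[Box] = field(default_factory=list)


def link_tracks(detections: dict[int, list[Box]],
                iou_threshold: float = 0.5) -> list[FaceTrack]:
    """Greedily link per-frame detections into tracks.

    A box extends a track only if the track ended in the immediately
    preceding frame and the overlap meets the threshold.
    """
    tracks: list[FaceTrack] = []
    for frame in sorted(detections):
        unmatched = list(detections[frame])
        for track in tracks:
            if track.frames[-1] != frame - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda b: iou(track.boxes[-1], b))
            if iou(track.boxes[-1], best) >= iou_threshold:
                track.frames.append(frame)
                track.boxes.append(best)
                unmatched.remove(best)
        # Any box left over starts a new track.
        for box in unmatched:
            tracks.append(FaceTrack([frame], [box]))
    return tracks
```

In practice a track would normally also be cut at scene boundaries, so that a face in one shot is not linked to a different face in the next. This sketch leaves that step out.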
The second example uses an episode from the American sitcom Black-ish to test the feasibility of generating character prominence metrics when there is a greater variety of camera angles and face sizes. Despite the challenges, the demonstration successfully illustrates how computer vision can be used to measure relative prominence on screen.
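Relative prominence, as used in both examples, could be computed along the following lines. This is a sketch rather than the authors' implementation. It assumes each face track already carries an identity label, a frame count, and an average face height, and it treats a face as "big enough" when its height is at least a chosen fraction of the frame height. The names `LabelledTrack` and `screen_time_shares`, and the default threshold of 0.1, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LabelledTrack:
    """Hypothetical summary of one face track."""
    label: str               # who the face belongs to (assumed to be known)
    n_frames: int            # number of frames in which the face was detected
    mean_face_height: float  # average bounding-box height in pixels


def screen_time_shares(tracks: list[LabelledTrack],
                       fps: float,
                       frame_height: int,
                       min_height_ratio: float = 0.1) -> dict[str, float]:
    """Return each label's share of total qualifying screen time.

    Tracks whose faces are smaller than min_height_ratio of the frame
    height are ignored, which approximates the "big enough face" condition.
    """
    seconds: dict[str, float] = {}
    for t in tracks:
        if t.mean_face_height / frame_height < min_height_ratio:
            continue
        seconds[t.label] = seconds.get(t.label, 0.0) + t.n_frames / fps
    total = sum(seconds.values())
    if total == 0:
        return {}
    return {label: s / total for label, s in seconds.items()}
```

Expressing the size threshold as a ratio of frame height, rather than a fixed pixel count, is one way to keep the measure comparable across footage with different resolutions and shot sizes.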
The methods can be extended to widen the evidence base around on-screen representation for diversity leads and monitors, content producers, editors and commissioners, and researchers. The result is a more comprehensive, objective, and scalable way of understanding on-screen representation across media production and research.
Each of these groups stands to benefit in a different way. Diversity leads can use computer vision to monitor and report on representation metrics across media output, identifying biases or gaps in on-screen diversity efficiently and continuously, and so support data-driven inclusion strategies. Content producers can gain insights into casting and character representation patterns, allowing creative decisions to rest on quantifiable evidence rather than anecdote. Editors and commissioners can use automated tools to audit visual content against diversity or editorial goals before release, improving accountability and transparency. Researchers can conduct large-scale studies of media representation trends over time and across platforms, combining visual analysis with natural language processing for context.
Technically, these applications could draw on advanced vision-language models (VLMs) and multimodal AI systems that can interpret scenes, identify individuals visually, and associate them with textual metadata or captions. Models like FastVLM enable efficient processing of high-resolution images and videos, which is crucial for detailed analysis of faces, characters, and UI elements. Multimodal models can handle diverse input types, including imagery and text, allowing richer extraction of representation data.
In addition, tools using computer vision techniques such as Hough transforms and other image processing methods can detect and segment visual elements systematically. The combination of these methods with generative AI can support creating synthetic data to improve model training for specific domains.
Raphael Leung, a Data Science Fellow at Nesta, and Bartolomeo Meletti, the Creative Director for CREATe at the University of Glasgow, have been instrumental in this work.
It is important to note that representation is just one part of inclusion, and inclusion is far more than a numbers game. The ultimate goal of the measurements discussed is to ensure that no group of people persistently feels mis-/under-represented by our mass media, as inclusion "fuels creativity, engages new audiences, and makes good business sense."
A new BAFTA diversity steering group was established in 2020 and several broadcasters renewed their inclusion and diversity commitments.
In conclusion, computer vision is a powerful tool that can significantly widen the evidence base of on-screen representation. By automating the analysis of visual content from videos, images, and user interfaces, we can detect and classify appearances related to diversity, providing large-scale, objective, and granular quantification of representation patterns. This technology has the potential to benefit multiple stakeholders across the media production and research ecosystem, from diversity leads to researchers, and from content producers to editors and commissioners.
- The analysis of character prominence in broadcast TV using computer vision will be demonstrated in an upcoming issue of ViewFinder, revealing its applicability in measuring representation patterns.
- The methods used, such as face detection and tracking, can be extended to create a broader evidence base for on-screen representation, benefiting diversity leads, content producers, editors, commissioners, and researchers.
- Advanced vision-language models and multimodal AI systems can support detailed analysis of faces, characters, and UI elements, enabling data-driven inclusion strategies and large-scale research on media representation trends.
- Raphael Leung and Bartolomeo Meletti have been instrumental in this work, demonstrating the potential of computer vision to measure relative prominence on screen.
- Representation is one aspect of inclusion, and these measurements aim to ensure that no group persistently feels mis-/under-represented by our mass media, as inclusion fuels creativity, engages new audiences, and makes good business sense.