An image of a butterfly perched on a branch, framed in a square, surrounded by seemingly energised earthy elements which are seen in a pixellated form.

Nidia Dias & Google DeepMind / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/

Across the UK, cultural organisations produce an extraordinary range of work, from West End shows to grassroots community dance. Yet without a standardised way of classifying performances, it's hard to compare programming, track trends, or measure cultural impact at scale.

A UK-wide event and artform classification system makes it far easier to answer big questions about how attendance and audiences differ across the UK by:

  • Creating a common language for comparing data across venues, producers and regions.
  • Supporting sector-wide benchmarking to understand audience behaviours and programming gaps.
  • Enabling evidence-led funding cases by linking programming choices to social and economic impact.
  • Making it easier to identify and support under-represented genres. 

With thousands of performances collected into the Audience Answers Data Trust every day, The Audience Agency (TAA) has worked with the Alan Turing Institute to develop Event & Artform Categorisation, a scalable, ethical machine learning approach that classifies every performance in the Data Trust into a consistent two-tier structure.

A Decade of Trusted Data & Our Unique Architecture

Since 2015, TAA has stewarded the world’s largest anonymised cultural dataset, held in our dedicated Data Trust and powered by a purpose-built data architecture. This infrastructure enables us to securely ingest, anonymise, store and process vast ticketing, survey and other datasets, without exposing personal or commercially sensitive information. 

By protecting data in this way, we can deliver powerful organisational and sector-wide insights, guided by our Data4Good principles, for the benefit not only of those who contribute but also of the wider sector.

Historically, performance records from across the UK were tagged manually, either by organisations themselves or by the TAA team. To scale this work responsibly, we partnered with the Alan Turing Institute to develop Event & Artform Categorisation, a machine-learning pipeline that automates two-tier classification of performances. The system is backed by human review and integrated into our unique, secure data infrastructure.

An Ethical Foundation for Machine Learning

Event & Artform Categorisation isn't just a technical solution; it's underpinned by safeguards that ensure the process remains transparent, fair, and trusted across the cultural sector. Our four guiding principles are:

  1. Ethical AI: Every stage of the categorisation process is assessed through our AI Impact Assessment, which considers privacy, ethical, and commercial risks. This ensures that automating performance classification remains proportionate, keeps a human in the loop, and protects the rights and security of contributing organisations.
  2. Responsible AI: Our categorisation model is designed to be explainable, so venues and partners can understand why a performance has been classified in a particular way. This transparency, combined with fairness checks, gives organisations confidence and control over how AI supports their programming analysis.
  3. Safety in AI: Any tags generated with less than 80% confidence are flagged for weekly review by our curators. Their corrections are fed back into the model, improving accuracy over time while preventing low-confidence classifications from influencing sector data.
  4. Governance in AI: We have clear oversight structures for Event & Artform Categorisation, from how models are selected and trained to how they are monitored and updated. This governance ensures decisions are not made in isolation and that the categorisation process continues to serve both ethical commitments and strategic sector goals. 
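The safety principle above can be illustrated with a short sketch of confidence-threshold routing. The function names, event titles, and classifier outputs below are hypothetical; only the 80% threshold and the human-review queue come from the text.

```python
# Illustrative sketch: accept high-confidence tags, queue the rest for curators.
CONFIDENCE_THRESHOLD = 0.80  # tags below this are withheld pending human review

def route_prediction(event_title, label, confidence, review_queue):
    """Return the tag if confidence is high enough, else queue for review."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return label  # tag flows straight into sector data
    # Low-confidence result: held back so it cannot influence sector data
    review_queue.append({"title": event_title,
                         "suggested": label,
                         "confidence": confidence})
    return None

review_queue = []
tag = route_prediction("Swan Lake", "Dance", 0.93, review_queue)        # accepted
held = route_prediction("Untitled Evening", "Theatre", 0.41, review_queue)  # queued
```

In a production setting, curator decisions on the queued items would then be fed back as labelled training data, which is how the text describes accuracy improving over time.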

Guided by the Alan Turing Institute

Our collaboration with the Alan Turing Institute, through the Innovate UK BridgeAI programme, brought invaluable independent expertise to the development of Event & Artform Categorisation. The Alan Turing Institute is the UK's national institute for data science and artificial intelligence, advancing research and innovation to solve real-world challenges. Working with our assigned Independent Scientific Adviser (ISA), we were able to:

  • Compare multiple models, costs, efficiencies, and ethical implications to ensure the most suitable approach for the cultural sector.
  • Design a robust two-stage categorisation pipeline to maximise accuracy and transparency.
  • Integrate bias-detection processes to surface and address under-represented genres.
  • Conduct rigorous quality assurance and testing to validate outputs before operational roll-out. 
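To make the two-stage pipeline concrete, here is a minimal sketch of two-tier classification. The taxonomy labels and keyword rules are hypothetical stand-ins for the trained models described above; the point is the structure, in which the second stage only chooses among sub-categories of the artform picked by the first.

```python
# Hypothetical two-tier taxonomy: top-level artforms map to sub-categories.
TIER_TWO = {
    "Music": ["Classical", "Contemporary"],
    "Theatre": ["Musical", "Drama"],
}

def classify_tier_one(description):
    # Stage 1: assign a top-level artform (stand-in for a trained model).
    return "Music" if "orchestra" in description.lower() else "Theatre"

def classify_tier_two(description, artform):
    # Stage 2: choose a sub-category *within* the tier-one artform,
    # so every tag stays consistent with the two-tier taxonomy.
    options = TIER_TWO[artform]
    return options[0] if "symphony" in description.lower() else options[-1]

def categorise(description):
    artform = classify_tier_one(description)
    return artform, classify_tier_two(description, artform)
```

Constraining stage two to the sub-categories of the stage-one result is one way a two-stage design supports both accuracy and transparency: each tag can be traced back through two separate, inspectable decisions.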

As featured in the BridgeAI Year Two in Review report, Stephen Miller, CTO of The Audience Agency, reflected on the value of this partnership: 

Our assigned ISA has been instrumental in guiding us through a complex AI challenge, offering both confidence and strategic direction. His expert technical support has empowered our Data Product Analyst in leading ML discovery, while also upskilling the wider team in this process. His ongoing insights and critical signposting are shaping our AI approach—not just for our organisation, but for the broader arts and culture sector.

Stephen Miller, CTO of The Audience Agency

You can read more about our work in the BridgeAI Year Two in Review 2024–2025 report, which highlights how organisations across the UK are operationalising responsible AI with the support of the programme. 

From Pilot to Full Roll-Out 

Following development of the model, we operationalised the process to support categorisation of the full dataset:

  1. Collecting: Daily data flows into our Data Trust and undergoes automated anonymisation, quality checks, and standardisation.
  2. Categorisation: Once the new data is stored, our overnight categorisation service uses the new models to tag the performances against the Event & Artform taxonomy.
  3. Reviewing: Analysts monitor the output and quality-assure any low-confidence categorisations or performances highlighted for human review.
  4. Insights: Within the Audience Answers datasets, we deliberately separate tags applied by humans from those applied by our ML models. This protects data integrity and preserves a clear source of truth.
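The nightly flow above can be sketched as a small batch job: categorise new records, flag low-confidence results for review, and record provenance so human and model tags stay separable. All names here are illustrative, and the stand-in predictor takes the place of the real models.

```python
from dataclasses import dataclass

@dataclass
class TaggedPerformance:
    title: str
    tag: str
    source: str          # "human" or "model", kept separate downstream
    confidence: float
    needs_review: bool = False

def nightly_categorise(titles, predict, threshold=0.80):
    """Tag a batch of performances, flagging low-confidence results."""
    tagged = []
    for title in titles:
        label, conf = predict(title)
        tagged.append(TaggedPerformance(
            title=title, tag=label, source="model",
            confidence=conf, needs_review=conf < threshold))
    return tagged

# Usage with a stand-in predictor returning (label, confidence):
batch = nightly_categorise(
    ["Hamlet", "Open Mic Night"],
    lambda t: ("Theatre", 0.95) if t == "Hamlet" else ("Comedy", 0.55))
```

Recording `source` on every tag is what makes the "source of truth" separation in step 4 possible: downstream reporting can filter or weight human-verified tags differently from model-generated ones.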

With Event & Artform Categorisation validated, we’re rolling it out across all five million performances and 323 million tickets in our Data Trust and are actively looking at how this new information can be delivered through Audience Answers and our consultancy work. Look out for our TEA Break in September where we'll share our findings about five-year artform trends.

Event & Artform Categorisation is not just about automation; it's about empowering the cultural sector with clarity, consistency and the confidence to make data-driven decisions that amplify cultural impact.