Prototyping YouTube: Comparing Memorialisation Videos

Team Members

David Moats, Alex Harrison, Aliya Mirza, Julia Rone, Sasha Scott, Vic Williams


This group set out to "prototype" a tool for studying YouTube, as all-purpose programmes like Netvizz (for Facebook) and TCAT (for Twitter) are not currently available for it. The YouTube API has also changed recently, making existing tools such as Lexi-Web, which could be used to scrape comments, difficult to use. We were also particularly interested in YouTube for the opportunity to analyse video content as well as text, either in terms of the visual content or the metadata attached to it.

The Case Study

Sasha's work relates to the phenomenon of memorialisation videos. In particular he drew our attention to videos of Neda Salehi, whose killing (20.06.2009) during a riot was captured on camera and endlessly circulated on YouTube. How do events like these become media rituals (Couldry 2005)? Is there an element of ritual repetition? If so, what practices get repeated?

Event: *please note the video contains graphic and upsetting content.

The first question we were faced with was: what are the boundaries of this analytic object? Which videos or external websites define the case and which are not relevant to it? In this endeavour we found ourselves highly dependent on YouTube itself. The YouTube "related videos" sidebar, for example, pointed us to a few relevant news stories and responses to the original video, but it was a very ineffective way to explore the case because it is not only a black-boxed 'relevance' algorithm but also based on personal viewing patterns. We realised we could hardly explore this case until we first understood the various properties of YouTube videos through which an object might be discovered or delineated. Is a particular memorialised object defined through the grouping of users? Types of commenting practices? Properties of videos?

While the controversy was quite large and nebulous, there were a number of duplicate videos which we could confidently assume were part of the object, so we decided to compare them to see what connected (or divided) them. By looking at the "same" video, the properties of YouTube could be brought into focus.

Objects of YouTube

We proceeded to investigate this case from two directions: 1) qualitatively investigating the videos through the YouTube platform and 2) investigating the various digital objects which might be quantified or mapped in various ways. As we proceeded we made a non-exhaustive list of elements in the user interface, some of which could help profile videos and eventually delineate cases.
  • related videos list (tailored to user)
  • likes of the video
  • 'about' video text
  • video category
  • duplications of the video (success of videos)
  • content warning attached to a video
  • sharing of videos
  • recontextualising
  • response videos
  • reporting videos
  • comments
  • comments voting
  • front page top comments

From this long list we then examined what was potentially significant for this particular case.

Video Metadata

First, we noticed the very different qualities of the nearly identical videos: their size, pixelation, etc. Are certain types of videos seen as more authentic? Were cleaner, crisper videos instead more popular? Could different communities of users be profiled and identified by the way they chose to upload or customise the videos, by different file formats, etc.?

Although this is hard to measure, some information about the videos could be gleaned from the API, which included the following metadata properties. To get at the technical data, the command line tool MediaInfo provides unstructured text data.
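As a minimal sketch of how that unstructured text could be made comparable across videos, the snippet below parses a MediaInfo-style report into nested dictionaries. It assumes the default text layout, with section names on their own line followed by "Key : Value" rows; the function name and the sample report are hypothetical and not taken from our data.

```python
from typing import Dict, Optional

def parse_mediainfo_text(report: str) -> Dict[str, Dict[str, str]]:
    """Parse a 'Key : Value' text report into {section: {key: value}}."""
    sections: Dict[str, Dict[str, str]] = {}
    current: Optional[str] = None
    for raw in report.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if " : " in line:
            if current is None:
                continue  # ignore rows that appear before any section header
            key, value = line.split(" : ", 1)
            sections[current][key.strip()] = value.strip()
        else:
            # A line without ' : ' is treated as a section header (assumption)
            current = line
            sections.setdefault(current, {})
    return sections

# Hypothetical report, for illustration only
SAMPLE_REPORT = """General
Format                                   : MPEG-4
File size                                : 1.20 MiB

Video
Width                                    : 320 pixels
Height                                   : 240 pixels
"""

parsed = parse_mediainfo_text(SAMPLE_REPORT)
```

Each duplicate video would then yield one such dictionary, so properties like width, height or format could be lined up side by side.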

neda videos plots.png

There were not enough videos for meaningful trends or relationships to be determined but this represented one area for future work, especially when compared to other variables like the nationality of posters.

Alex Wilkie suggested that we could look not only at the metadata but also at the video itself. This could be done using a webcam and Max/MSP to do pattern recognition on the videos, then comparing the results to other variables.

Character of Discussions

Next, we qualitatively investigated the variable character of discussions. Some discussions were about memorialisation and expressions of grief (e.g. "RIP") while others were fiery debates about Islam and the war on terror. "A classification scheme for content analyses of YouTube video comments" (2012) has already been published by Amy Madden, Ian Ruthven and David McMenemy. Their paper outlines a general template into which all YouTube comments can be placed. But this scheme is a formal analysis which deals with YouTube generally, not the extreme case of memorialisation.

By studying the reply comments posted in response to the videos recommended by YouTube, we wrote a new classification system tailored specifically to the comments we read responding to videos related to the death of Neda Agha-Soltan. This classification system can potentially be used in conjunction with the geographic data of users (see below) in order to identify trends and differences between the types of comments and the accompanying geographic regions specified by users.

For the sake of keeping things simple, and so as to leave open the option of conjoining the comment typology with our data on users' stated geographic location, we used the following classification scheme applied to two videos as an example:

  • Comment on commenters
  • Claims Neda was traitor
  • Sexism / slander
  • Ambiguous / Difficult to categorise

1 (directly, some could be inferred in more comments)

This coding scheme of course had overlaps and could be developed further, but next we were curious whether aspects of these debates, for example their controversiality or anger, could be detected quantitatively as well.
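The idea of conjoining the comment typology with users' stated location could be sketched as a simple cross-tabulation. This is only an illustration: the record layout (a "country" and a "category" field per coded comment) and the sample rows are assumptions, not our coded data.

```python
from collections import Counter, defaultdict

def crosstab(coded_comments):
    """Count comment categories per stated country.

    coded_comments: iterable of dicts with hypothetical keys
    'country' and 'category'.
    """
    table = defaultdict(Counter)
    for comment in coded_comments:
        table[comment["country"]][comment["category"]] += 1
    return table

# Hypothetical coded comments, for illustration only
sample = [
    {"country": "A", "category": "Comment on commenters"},
    {"country": "A", "category": "Sexism / slander"},
    {"country": "B", "category": "Comment on commenters"},
    {"country": "A", "category": "Comment on commenters"},
]

table = crosstab(sample)
```

A table like this would show at a glance whether a category is concentrated among users from one stated location or spread across many.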

Commenting Patterns

The qualitative coders noticed a few properties of comments which could identify some of these categories. The length of a comment, for example, could indicate a diatribe, often a response to a shorter, pithy comment which signals the start of a long flame war. Obviously punctuation or capitalisation could also indicate different emotions. But we were mainly interested in patterns of interaction because they involve fewer assumptions about the properties of language.
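The surface properties mentioned above (length, punctuation, capitalisation) could be extracted with something like the following sketch. The feature names and the choice of features are our own assumptions; no thresholds for what counts as a "diatribe" are implied.

```python
def comment_features(text: str) -> dict:
    """Return simple surface features of a single comment."""
    letters = [ch for ch in text if ch.isalpha()]
    upper = sum(1 for ch in letters if ch.isupper())
    return {
        "length": len(text),                # characters, a rough proxy for diatribes
        "words": len(text.split()),
        # share of letters that are upper case (0.0 if there are no letters)
        "caps_ratio": upper / len(letters) if letters else 0.0,
        "exclamations": text.count("!"),
        "questions": text.count("?"),
    }

features = comment_features("RIP")
```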

Given our limited access to data (we had been using Lexi-Web's comment scraper), we first attempted to graph the frequency and rhythms of posts. This involved converting the timestamps of comments into machine-readable Unix time using the Unix epoch batch converter and importing the result into Mondrian.

Screen Shot 2014-05-16 at 14.29.29.png

Unique users on each row, comments by timestamp on x axis.

The above graph did not place the comments into real time, unfortunately, but it would in theory help identify users that posted frequently (along the same horizontal line) perhaps in patterns which indicated a dispute of some kind.
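The timestamp conversion and the one-row-per-user layout described above could be sketched as follows. The timestamp format (ISO 8601 in UTC, e.g. "2009-06-20T00:00:00Z") is an assumption about the scraped data, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

def to_unix(ts: str) -> int:
    """Convert an ISO 8601 UTC timestamp (assumed format) to Unix time."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())

def timeline_by_user(comments):
    """Group (user, timestamp) pairs into one sorted row of Unix times per user."""
    rows = defaultdict(list)
    for user, ts in comments:
        rows[user].append(to_unix(ts))
    return {user: sorted(times) for user, times in rows.items()}

# Hypothetical comments, for illustration only
rows = timeline_by_user([
    ("user_a", "2009-06-20T00:01:00Z"),
    ("user_b", "2009-06-20T00:00:30Z"),
    ("user_a", "2009-06-20T00:00:00Z"),
])
```

Plotting each user's row against real Unix time, rather than comment order, would place the comments on a shared time axis.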

Another graph we would have liked to visualise was a reply network, to see if users would cluster together around certain topics. Unfortunately the current API seems to give only the name of the user replied to, not the unique ID of the comment. Below is a visualisation created with different data from the old API, showing comments in time.

Screen Shot 2014-05-16 at 15.41.41.png

YouTube Comment Reply Network (different data). Time is the vertical axis; the horizontal axis is modularity cluster (discussions are grouped in vertical lines). Lines link each comment to the comment it replies to.

This visualisation reveals the time lag of certain discussions: usually comments are replied to quickly but then discussions are revived later (long links). We speculate that this is because of the top comments function which brings most commented or liked comments to the top of the page, therefore drawing user attention to historical debates.
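The "long links" could also be picked out numerically. The sketch below assumes data of the kind the old API gave, where each comment carries its own ID, the ID of the comment it replies to, and a Unix timestamp; the field names and the one-week threshold are hypothetical choices.

```python
def reply_lags(comments):
    """Return (comment_id, parent_id, lag_seconds) for every reply.

    comments: list of dicts with hypothetical keys 'id', 'time'
    (Unix seconds) and 'reply_to' (parent id, or None).
    """
    posted = {c["id"]: c["time"] for c in comments}
    lags = []
    for c in comments:
        parent = c.get("reply_to")
        if parent in posted:  # skips top-level comments and unknown parents
            lags.append((c["id"], parent, c["time"] - posted[parent]))
    return lags

def revived(lags, threshold_seconds=7 * 24 * 3600):
    """Keep replies that arrive at least threshold_seconds after their parent."""
    return [edge for edge in lags if edge[2] >= threshold_seconds]

# Hypothetical thread, for illustration only
thread = [
    {"id": "c1", "time": 0, "reply_to": None},
    {"id": "c2", "time": 120, "reply_to": "c1"},
    {"id": "c3", "time": 30 * 24 * 3600, "reply_to": "c1"},
]
lags = reply_lags(thread)
late = revived(lags)
```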

Nationality of Commenters, and Sharing

Finally, we considered that these videos may have very different lives outside of YouTube and suggested that they could be profiled based on where they are shared. We used websites like SharedCount to check social media shares of the video, and manual searching with Google (url:____) to collect the types of websites hosting the video.
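Once such pages have been collected by hand, they could be tallied by host to profile where a video travels. This is a minimal sketch; the example URLs are placeholders, not the pages we found.

```python
from collections import Counter
from urllib.parse import urlparse

def hosts_sharing(urls):
    """Count collected page URLs by host name, ignoring a leading 'www.'."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[len("www."):]
        if host:
            counts[host] += 1
    return counts

# Placeholder URLs, for illustration only
counts = hosts_sharing([
    "http://www.example.org/forum/thread-1",
    "http://example.org/forum/thread-2",
    "https://news.example.com/video",
])
```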

These are some examples of sites where the video was shared:

George Bush’s legacy should endure forum:


Iran Free Republic

Free Republic

Orange County Register Video of Neda Shooting in Iran

So the video was being used not only by Islamic militants and freedom fighters but also by conservative Americans as an anti-Islamic video! This caused us to re-examine the comments and the nationality of users, which we mapped using a Dorling diagram:

The spread of nationalities was quite wide for each version of the video, which suggests that the videos all have an international audience rather than specific audiences.

Further Work

Although many of these avenues of investigation led to provisional results and non-working visualisations, we did learn how valuable it was to pursue the qualitative and the quantitative simultaneously, though in our case they barely met in the middle. In future we would follow Alex Wilkie's suggestion to spend more time with the user interface (the front end) and its properties, rather than letting our investigation be overdetermined by the API (the back end).
Topic revision: r7 - 26 May 2014 - 18:45:48 - DavidMoats