SIL 801, Fall 2016: Media analysis

One of the most rapidly changing industries today is media. Blogging enabled every user to engage in content production, which had otherwise been a monopoly of large media houses. Social networking platforms like Twitter and Facebook further accelerated the production and flow of information by making it easier for users to share information with each other. Companies like Buzzfeed have taken it even further by developing viral optimization techniques which help route relevant information to users who need it. The lines have indeed blurred between news producers, curators, and commentators. In this course, we will take a multi-layered approach to make sense of the changing media landscape and its implications. Some topics we will study include:
- Political bias in mass media vs. social media. Is social media helping make the debate more complete?
- Interaction between mass media and social media. Is social media a distribution channel for mass media? A curation and commentary channel?
- Modeling of information flow on different media. What kind of information flows where and when in the network? Can information flows be artificially regulated?

We will start by studying the basics of social network analysis, graph models, timeline analysis, tools for text analysis of content, and entity resolution, and will read multi-disciplinary papers from recent conferences and journals in computer science, political science, sociology, and media.

In the latter part of the course, we will apply some of the tools and techniques we have learned to analyze political and economic networks. We will see how concepts like competition and political favouritism can be modeled in network terms, and how malpractices can be spotted through an analysis of pricing data and news reports.

This will be a project-based course: students will work in groups and either implement large-scale data collection and analysis systems or write term papers on specific topics.


Structure of mass media in today's society: link
  • Examining Bias and Distortion in Mass Media in America
  • The Times of India and Consumerism in the Indian News Media – a Harmful Trend?
  • The Illusion of Choice
  • Who Owns your Media?
  • The Structure and Dynamics of Global Multi-Media Business Networks
  • Procter & Gamble, mass media, and the making of American life
  • How structure shapes content, or why the ‘Hindi turn’ of Star Plus became the ‘Hindu turn’
  • Fair and Balanced? Quantifying Media Bias through Crowdsourced Content Analysis

Influence of the Internet in opinion formation: link
  • The Political Blogosphere and the 2004 U.S. Election: Divided They Blog
  • Power Laws, Weblogs, and Inequality
  • “Googlearchy”: How a Few Heavily-Linked Sites Dominate Politics on the Web
  • Media landscape in Twitter: A world of new conventions and political diversity
  • Managing Political Differences in Social Media
  • Political Polarization on Twitter
  • How Online Gatekeepers Guard Our View – News Portals’ Inclusion And Ranking Of Media And Events

Interaction between mass media and social media: link
  • More Voices Than Ever? Quantifying Media Bias in Networks
  • Exposure to ideologically diverse news and opinion on Facebook
  • Hybrid spaces of politics: the 2013 general elections in Italy, between talk shows and Twitter
  • The interaction between mass media and the internet in non-democratic states: The case of China
  • Consumers and Suppliers: Attention asymmetries. A Case Study of Aljazeera’s News Coverage and Comments
  • A 61-million-person experiment in social influence and political mobilization
  • Protests by the young and digitally restless: the means, motives, and opportunities of anti-government demonstrations
  • Social media use and participation: a meta-analysis of current research
  • The Party is Over Here: Structure and Content in the 2010 Election

Modeling of information flow on social networks: link
  • Information Contagion: an Empirical Study of the Spread of News on Digg and Twitter Social Networks
  • The Pulse of News in Social Media: Forecasting Popularity
  • Spatial Influence vs. Community Influence: Modeling the Global Spread of Social Media
  • The Role of Social Networks in Information Diffusion
  • The Lifecyle of a Youtube Video: Phases, Content and Popularity
  • Information Diffusion in Online Social Networks: A Survey
  • Inferring Networks of Diffusion and Influence
  • Information Evolution in Social Networks
  • Campaign Optimization through Behavioral Modeling and Mobile Network Analysis
  • Audience Analysis for Competing Memes in Social Media


Applications on media and social media analysis: link
  • Encouraging Reading of Diverse Political Viewpoints with a Browser Widget
  • Putting news in context, automatically
  • The Future of Journalism: Networked Journalism

Useful tools and techniques: link
  • Creating coding schemas for content analysis
  • Machine learning, what kinds of classifiers to use when
  • Information retrieval basics
  • Social network analysis techniques
  • Entity extraction and resolution


Project ideas

How do media outlets differ from each other in the coverage they give to different issues and personalities? When analyzed longitudinally, can we spot any trends such as left-leaning, right-leaning, pro-govt, or anti-govt coverage?

Build a live media monitor which gets the latest articles from mainstream Indian news sources, runs them through an entity extractor, and shows how much coverage different media outlets are giving to different entities. Entities of the same type can be shown together to allow comparison, such as the coverage given to various political parties or to various politicians. An enhancement can be developed later to show sentiment as well. Topics can also be mined from prior corpora to indicate the amount of coverage given to crime, environment, politics, etc.
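As a starting point for the coverage comparison described above, the following is a minimal sketch and not a prescribed design. It assumes that articles have already been crawled and passed through some entity extractor, so each article arrives as an (outlet, entities) pair. The function name coverage_share, the outlet names, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def coverage_share(articles):
    """Return, per outlet, the share of entity mentions going to each entity.

    `articles` is an iterable of (outlet, entities) pairs, where `entities`
    is the list of entity names an extractor found in one article
    (the extractor itself is assumed and not shown here).
    """
    counts = defaultdict(Counter)
    for outlet, entities in articles:
        # Count an entity at most once per article, so that one long
        # article does not dominate the outlet's totals.
        for entity in set(entities):
            counts[outlet][entity] += 1

    shares = {}
    for outlet, counter in counts.items():
        total = sum(counter.values())
        shares[outlet] = {entity: n / total for entity, n in counter.items()}
    return shares


# Hypothetical, hand-made sample data for illustration only.
sample = [
    ("Outlet A", ["Party X", "Party Y"]),
    ("Outlet A", ["Party X"]),
    ("Outlet B", ["Party Y"]),
]
shares = coverage_share(sample)
for outlet, by_entity in sorted(shares.items()):
    print(outlet, by_entity)
```

Normalizing within each outlet makes outlets of very different sizes comparable. Raw counts may be the better choice when the question is about absolute volume of coverage.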


Development of an open-source tool

Build a crawler for regional media websites and extend entity extraction engines to work with regional languages. Expose an API so that others can use your base framework to run further analysis.


Are there echo chambers in Indian social media? When analyzed longitudinally, can we spot any trends in which events catch the fancy of social media users, and which events lead to more or less polarization?

Find community structures in Indian and India-related social media. Start by assembling a seed set of popular blogs and Twitter celebrities, then crawl outwards. Analyze the link structure using various social network analysis techniques. Analyze the content using entity extractors, topic mining tools, and sentiment analysis to see what kind of relationships you can find between the link structures and the topics being discussed.
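One very simple indicator of an echo chamber is the share of links that stay within the same group. The sketch below is only an illustration under strong assumptions: the crawl has already produced a list of (source, target) links, and each account has already been assigned a group label, for example by a community detection step or by manual coding. The names within_group_share, edges, and group_of, and the sample data, are hypothetical.

```python
def within_group_share(edges, group_of):
    """Fraction of links whose two endpoints carry the same group label.

    `edges` is a list of (source, target) pairs from the crawl, and
    `group_of` maps an account to its (assumed) group label.
    Links with an unlabelled endpoint are ignored.
    Returns None when no labelled link is available.
    """
    labelled = [(u, v) for u, v in edges if u in group_of and v in group_of]
    if not labelled:
        return None
    same = sum(1 for u, v in labelled if group_of[u] == group_of[v])
    return same / len(labelled)


# Hypothetical sample: four accounts, two groups, four links.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
group_of = {"a": "left", "b": "left", "c": "right", "d": "right"}
share = within_group_share(edges, group_of)
print(share)
```

A value close to 1 means that links rarely cross group boundaries. The number should be compared against what the group sizes alone would produce, since a very large group yields a high within-group share even when linking is random.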


To what extent do people actually engage on social media to debate issues? Is there consistency in the media sources they comment on? Through discussions, do people help cross-connect different media sources?

Analyze the Twitter and other social media feeds of mass media outlets by crawling the follower network outwards. Quantify actual evidence of people re-tweeting or forwarding messages in their network, and of people commenting on articles from multiple mass media sources on the same topic. Other ways of assembling a Twitter dataset can also be used, such as crawling the follower network of celebrities and then analyzing the tweets to identify those which talk about specific articles published by the mass media.


To what extent do ownership networks of media companies influence the slant and bias in their coverage?

Pick a few hot topics of debate, especially commercially relevant ones such as net neutrality, social spending on welfare schemes, GST, etc., or political topics like the debate on intolerance. Manually content-analyze the articles published in different media sources and code them on their slant and on different types of bias for or against various perspectives. In parallel, find details from the Ministry of Corporate Affairs and other websites on the ownership structure of different media sources and their links with politicians, bureaucrats, and companies. See if you can spot any trends in the coverage of issues and the relationship of media companies with other stakeholders.


In what ways do politicians engage on social media? What topics do they engage on?

Build a list of political personalities with Twitter or Facebook accounts, and categorize their messaging on social media in terms of whether they use it for broadcasting their ideas and schemes, for deliberation by seeking inputs from the people, or for relationship building such as birthday wishes and season's greetings. In the same way, analyze the topics on which different politicians choose to comment or not comment. Also explore the scope of using machine learning to do this classification.


Understand the challenges in building interesting news analysis tools by carefully examining the performance and pitfalls

Build a tool which, given a search term or a set of entities, fetches articles related to it and arranges them in various ways, such as clustered on a timeline or around specific entities or topics. This need not be done in an online manner: it can be conceived as a batch system which comes back with its analysis after a while, or as a database-driven system which assembles a large corpus on which it runs queries.
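The timeline arrangement mentioned above can be sketched, in its simplest form, as bucketing the fetched articles by week of publication. This is a minimal illustration under the assumption that each fetched article is available as a (publication date, title) pair. The function name group_by_week and the sample articles are hypothetical, and a real tool would likely want finer or adaptive time windows.

```python
from collections import defaultdict
from datetime import date


def group_by_week(articles):
    """Bucket articles by ISO (year, week) of their publication date.

    `articles` is an iterable of (published, title) pairs, where
    `published` is a datetime.date. Returns a dict ordered by week,
    mapping (iso_year, iso_week) to the list of titles in that week.
    """
    buckets = defaultdict(list)
    for published, title in articles:
        iso = published.isocalendar()
        # iso[0] is the ISO year and iso[1] the ISO week number.
        buckets[(iso[0], iso[1])].append(title)
    return dict(sorted(buckets.items()))


# Hypothetical sample articles for illustration only.
articles = [
    (date(2016, 8, 1), "Article one"),
    (date(2016, 8, 3), "Article two"),
    (date(2016, 8, 9), "Article three"),
]
timeline = group_by_week(articles)
for week, titles in timeline.items():
    print(week, titles)
```

The same bucketing can be repeated per entity or per topic to compare how coverage of different subjects rises and falls over time.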


Is the slant taken by popular columnists consistent over time? Is it related to their personal social networks with other personalities or companies?

Build a list of popular columnists and examine their articles over a period of time. Analyze the articles across topics, entities mentioned, political slant, etc., and label them according to the affiliations they display. Explore the scope of automated and semi-automated tools in conducting the analysis.