Socmint: The monitoring of social media for community safety purposes within a big data framework in South Africa with specific reference to Orania



The information explosion that has taken place since the 1990s has changed almost every aspect of society, including conflict and the intelligence environment. Social media platforms, websites and blogs are increasingly used to convey movements' messages and to communicate, which means that information about the nature and activities of these movements has become more accessible to those who are able to gather and analyse it. Twitter, for example, is used by the Islamic State of Iraq and Syria (ISIS, also known as the Islamic State of Iraq and Al-Sham and currently known as Daesh) and by Al-Qaeda's affiliate, Al-Shabaab. By 1999, almost every known terrorist group had a presence on the internet. During the 2011 Egyptian Revolution, 32 000 new groups and 14 000 new pages were created on Facebook from within Egypt. Significant mass demonstrations in which Twitter played an important role include the civil unrest in Moldova in 2009, the Iranian election protests of 2009–2010, the Tunisian Revolution of 2010–2011, the Egyptian Revolution of 2011 and the Occupy Wall Street (OWS) protests, which took place in the autumn of 2011 in cities around the world. Locally, many of the conversations around recent movements such as #RhodesMustFall and #FeesMustFall also took place on social media, especially on Twitter.

Because information has become more accessible, Open Source Intelligence (Osint) has become increasingly important. For example, the CIA Bin Laden unit claimed that 90% of what it needed was open source intelligence, while W.M. Nolte, former deputy assistant director of the CIA, argued in 2005 that 95–98% of all information provided by US intelligence services is open source intelligence.

The discipline of Osint has recently been extended to include Social Media Intelligence (Socmint). Socmint is used by overseas intelligence services, for example by the United Kingdom's Ministry of Defence (UK MOD) and the US Federal Bureau of Investigation (FBI). Since Socmint involves the collection and analysis of information that exists in the public domain, it is usually seen as an extension of Osint, although it can be argued that Socmint requires additional skills and can be seen as a separate but closely related discipline. Osint usually involves the targeting of a particular entity – either a person or an organisation – after which information about that entity is obtained from open sources. In contrast, Socmint investigations involve large data sets, statistical analyses, machine learning, artificial intelligence and the like, all of which require more specialised skills and equipment.

Socmint is located not only within the field of Osint and the information explosion, but also within the big data paradigm. Big data has impacted businesses, governments and security globally, with applications as diverse as election campaigns, marketing campaigns and anti-terrorism operations. Big data is usually defined in terms of v's: volume, variety, velocity, value and veracity, where volume refers to the large size of datasets, variety refers to the diverse nature of datasets (structured, semi-structured and unstructured), velocity to the speed at which data is generated and analysed, value to its use and veracity to the trustworthiness of the data.

The current article discusses Socmint against the background of big data, with specific reference to how it can be applied to enhance community safety in Orania. It discusses how large amounts of unstructured data are collected from Twitter and analysed in real time, inter alia regarding the number of tweets per day and per person, the extraction of themes and of organisations mentioned, the identification of language in order to filter out irrelevant tweets, the identification of sentiment and magnitude, and the extraction of hashtags and user names. The analysis makes use of Natural Language Processing (NLP), regular expressions, cloud computing and data visualisation, including geolocated data. The identification of important users and hashtags by means of network theory is also discussed, with specific reference to centrality measures such as PageRank, eigenvector centrality, in- and out-degree centrality, betweenness centrality and the like, and to the use of network theory to extract the relevant component of Twitter conversations.
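
As a rough illustration of the counting and extraction steps described above, the following Python sketch uses regular expressions to pull hashtags and user names out of tweet text, counts tweets per day and per person, and derives in- and out-degree from mentions. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the sample tweets, the tuple layout, the function name `summarise` and the choice to treat each mention as a directed edge are all invented here for illustration.

```python
import re
from collections import Counter

# Hypothetical sample data (invented for illustration): (ISO date, author, text).
TWEETS = [
    ("2018-01-10", "user_a", "Debate on #Orania and #Afrikaans with @user_b"),
    ("2018-01-10", "user_b", "@user_a thanks, see #Orania"),
    ("2018-01-11", "user_c", "RT @user_a: Debate on #Orania"),
]

# A hashtag or user name is taken to be '#' or '@' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def summarise(tweets):
    """Count tweets per day and per user, hashtags, and mention degrees."""
    per_day = Counter()
    per_user = Counter()
    hashtags = Counter()
    in_degree = Counter()   # how often a user is mentioned by others
    out_degree = Counter()  # how many other users a user mentions

    for date, author, text in tweets:
        author = author.lower()
        per_day[date] += 1
        per_user[author] += 1
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(text))

        # Assumption: each distinct mention in a tweet is one directed edge
        # from the author to the mentioned user; self-mentions are ignored.
        mentioned = {m.lower() for m in MENTION_RE.findall(text)} - {author}
        for user in mentioned:
            in_degree[user] += 1
        out_degree[author] += len(mentioned)

    return {
        "per_day": per_day,
        "per_user": per_user,
        "hashtags": hashtags,
        "in_degree": in_degree,
        "out_degree": out_degree,
    }


if __name__ == "__main__":
    result = summarise(TWEETS)
    print(result["per_day"].most_common())
    print(result["hashtags"].most_common(2))
    print(result["in_degree"].most_common(1))
```

Degree counts are the simplest of the centrality measures named above. Measures such as PageRank or betweenness centrality would normally be computed with a graph library on the same mention edges.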

In addition, important tweets and days are highlighted, and the article shows that Orania is mentioned more often when matters that affect the Afrikaner are discussed: farm attacks (e.g. Black Monday), Afrikaans as a language of instruction (e.g. the court ruling on the University of the Free State’s language policy and Overvaal High School), the election of a new president and the talks around the banning of the old national flag, against the background of land expropriation without compensation. The article also discusses how methods are combined to determine, for example, which hashtags carry the most negative sentiment, which themes are the most negative, and whether there is a rise in negative sentiment or in tweets around specific themes. For example, tweets on language are predominantly expressed in the strongest terms, followed by tweets that combine racism and language and tweets that combine racism and land. When tweets deal only with education, the message is not phrased in strong emotional terms, but when education and racism are referred to in the same tweet (for example, in the discourse around Overvaal High School), more emotionally charged messages are posted. Tweets in which both language and education occur are phrased in positive terms, while the most negative tweets refer to racism and land and the second most negative tweets to racism and education. In this, one recognises the discourse surrounding Overvaal High School and the discourse surrounding land reform, which has been widely discussed since the ANC's 54th National Congress in December 2017.
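
The combination of methods described above, where sentiment is grouped by the themes that occur in a tweet, can be sketched as follows. This is an illustrative sketch only: the records, the theme labels, the numeric values and the function names are hypothetical. It also assumes that each tweet already carries a sentiment score (negative to positive, here taken to lie between -1 and 1) and a magnitude (strength of emotion, zero or more) from an earlier NLP step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pre-scored tweets (all values invented for illustration).
SCORED = [
    {"themes": {"language"}, "score": -0.2, "magnitude": 0.9},
    {"themes": {"racism", "land"}, "score": -0.8, "magnitude": 0.7},
    {"themes": {"racism", "land"}, "score": -0.6, "magnitude": 0.5},
    {"themes": {"language", "education"}, "score": 0.4, "magnitude": 0.3},
]


def aggregate_by_themes(records):
    """Group tweets by their exact theme combination and average the scores."""
    groups = defaultdict(list)
    for record in records:
        # Sort the themes so that {"racism", "land"} and {"land", "racism"}
        # end up under the same key.
        key = tuple(sorted(record["themes"]))
        groups[key].append(record)

    return {
        key: {
            "n": len(rows),
            "mean_score": mean(r["score"] for r in rows),
            "mean_magnitude": mean(r["magnitude"] for r in rows),
        }
        for key, rows in groups.items()
    }


def most_negative(summary):
    """Return the theme combination with the lowest mean sentiment score."""
    return min(summary, key=lambda key: summary[key]["mean_score"])


def strongest(summary):
    """Return the theme combination with the highest mean magnitude."""
    return max(summary, key=lambda key: summary[key]["mean_magnitude"])


if __name__ == "__main__":
    summary = aggregate_by_themes(SCORED)
    print("most negative:", most_negative(summary))
    print("strongest terms:", strongest(summary))
```

The same grouping could be keyed on hashtags or on dates instead of themes, which would show which hashtags are the most negative or whether negative sentiment rises over time.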

In general, it is found that the sentiment towards the Afrikaner has become increasingly negative and that Orania has become a symbol of the Afrikaner, which necessitates the monitoring of social media for safety purposes.

The key point of the current study is that Socmint can be used to identify misconceptions by gaining a general idea of how a subject is discussed and by identifying key role players. Once misconceptions and relevant role players have been identified, constructive conversations with those role players can be initiated and misconceptions can be corrected, which can prevent conflict. Orania has already employed this strategy fruitfully in the past.

Keywords: Afrikaans; Afrikaner; Big Data; intelligence; Orania; Open Source Intelligence; Osint; political action; protest; Social Media; Social Media Intelligence; social network analysis; Socmint; Twitter


