Language at the centre of the fourth industrial revolution

Having recently moved to South Africa from Europe, I soon noticed that certain terms and concepts that were not well known to me personally receive much attention here. In particular, the concepts of the “fourth industrial revolution” (4IR) and “decolonisation” pop up regularly in discussions on a wide range of topics. To be able to participate better in these discussions, it is important to understand what these concepts entail.

Let us start with the concept of the fourth industrial revolution. Searching for information on the meaning of this topic reveals descriptions such as: “The Fourth Industrial Revolution represents a fundamental change in the way we live, work and relate to one another. It is a new chapter in human development, enabled by extraordinary technology advances commensurate with those of the first, second and third industrial revolutions.” Unfortunately, this does not provide much concrete information on what the concept really encompasses.

Fortunately, the fact that this is the fourth revolution means that there have been three others before. Taking a look at the first three industrial revolutions may provide information on the fourth one. As it is relatively easy to find information on the earlier industrial revolutions, I will provide only a short summary of these revolutions. This will identify properties that may provide insight into what the fourth industrial revolution really means.

We will see that for all industrial revolutions, we can identify a source of power, some enablers (which are essential to be able to use the new source of power), and positive and negative effects.

The first industrial revolution is often simply called the “industrial revolution”. In Europe, this revolution started around 1760. The main driver of this revolution was the availability of power (in particular, water and steam power), which led to mechanised factories. The power, combined with enablers (ways of transporting the power and machines in factories), led to new (and more efficient) manufacturing processes. The positive effect of this was mass production of goods, although this also resulted in pollution (negative effect). This revolution ended around 1830 (followed by a recession).

The second industrial revolution started around 1870. The enablers for this revolution were all technology-oriented (such as the inventions of the telegraph, the telephone, railroads and the related availability of electrical power) and led to technological advancement. Hence, this revolution is typically called the “technological revolution”. Many of the new inventions allowed long-distance communication or transport, which resulted in globalisation (positive effect), but also led to large-scale unemployment (negative effect). This revolution ended around 1914, which was the start of the First World War.

The third industrial revolution is called the “digital revolution”. It started around 1950 and (potentially) is still ongoing. This revolution revolves around digitisation and results in the availability of power for processing information. The enablers were the development of transistors and integrated circuits (ICs), which are the building blocks of computers. Related to this is the development of the internet. In particular, the internet led to global interconnectedness, but also resulted in information overload. The overall result of this revolution is what we call the information age.

Note that if we are now in the fourth industrial revolution, the third industrial revolution should probably have finished. As the earlier revolutions ended with negative events (recession and a world war), we may consider the third industrial revolution to have ended around 2007 or 2008, at which time a financial recession (also called the Great Recession) happened.

In summary, each of the industrial revolutions relied on a source of power (steam/water, electricity and processing) and enablers (machines and pipes, turbines and wires, and transistors and the internet). Also, they had positive effects (mass production, globalisation and interconnectedness), but also negative effects (pollution, unemployment and information overload). Finally, it seems that each industrial revolution ended with a negative event. Based on this insight, can we identify similar aspects that provide information on the fourth industrial revolution?

In order to find the essential parameters (power, enablers, and positive and negative effects) of the fourth industrial revolution, we need to understand the current context, as that provides these parameters. As the concept of decolonisation often co-occurs with the topic of the fourth industrial revolution, they may be related. So, let us concentrate on decolonisation for now.

Looking up the meaning of the term decolonisation, we find, according to the online Merriam-Webster, that decolonising is a transitive verb that means “to free from colonial status”. Now, as South Africa is no longer a colony (it became the Republic of South Africa in 1961), decolonisation must mean something else in this context. Looking further, the concept of intellectual decolonisation seems to fit better. The “decolonization” entry on Wikipedia states: “Decolonization has been used to refer to the intellectual decolonization from the colonizers’ ideas that made the colonized feel inferior.”

There may be many areas in South Africa that require intellectual decolonisation. However, one of the most visible areas is language. South Africa has 11 official languages (and several other languages that are not officially recognised). Of these, English and Afrikaans can be considered colonial languages, whereas the others are indigenous. The idea of recognising all of these languages is to enhance inclusion: making it possible for anyone to access important information in all official languages.

The current status, unfortunately, is that not all of the official languages receive the same level of support. For example, there is an imbalance in the availability of education in the official languages, the possibility of accessing the verbatim reports of the parliamentary proceedings, etc. In general, there is limited availability of digital resources for most of the official languages.

This is the point where the South African Centre for Digital Language Resources (SADiLaR) comes into play. SADiLaR is a research infrastructural centre funded by the Department of Science and Innovation. The organisation consists of a hub (located at North-West University), as well as five nodes, each with its own focus of expertise. SADiLaR runs two programmes: digitisation and digital humanities.

The aim of the digitisation programme is to create linguistic data collections that contain examples of language use for all of the official languages. The data collections typically consist of electronic texts or spoken language, which are annotated with linguistic information.

Looking at the linguistic data collections currently available for each of the official languages, we see that for English and Afrikaans we can find decent amounts of resources, but for the other official South African languages, the resources are more limited. This corresponds exactly with the notion that the indigenous South African languages are under-resourced. In other words, intellectual decolonisation will need to take place in the context of the digitisation of linguistic resources.

The aim of SADiLaR’s digital humanities programme is to enable and enhance research in the fields of humanities and social sciences using digital techniques. Digital humanities is a relatively new research area and encompasses a range of topics – such as digital archives, new media and novel publication methods (for example, for academic publishing) and cultural analytics – but also human language technology (HLT). The field of HLT develops tools that allow computers to deal with language, such as spell checkers, text-to-speech systems, speech recognition systems and machine translation.

To be able to perform research in the field of digital humanities in the South African context, HLT tools for the South African languages are essential. However, the availability of HLT tools for the official languages shows limitations similar to those of the linguistic data collections.

There is actually a close relationship between the availability of linguistic data collections and HLT tools. Many HLT tools are built using machine learning techniques. Machine learning techniques require training data (example material) to learn what the correct behaviour is. For HLT tools, this means that linguistic data collections are required. If only limited data collections are available, the resulting tools will be of limited quality.
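As a rough illustration of this dependency, consider a toy word-list "spell checker" that only recognises words it has seen in its training data. This is a minimal sketch under simplifying assumptions, not how real HLT tools are built: the function names and the tiny corpora are hypothetical, and English is used only for readability.

```python
from collections import Counter


def train_vocabulary(corpus):
    """Count word frequencies in a list of sentences (the training data)."""
    counts = Counter()
    for sentence in corpus:
        counts.update(sentence.lower().split())
    return counts


def coverage(vocabulary, text):
    """Return the fraction of words in `text` that the vocabulary recognises."""
    words = text.lower().split()
    known = sum(1 for word in words if word in vocabulary)
    return known / len(words)


# Hypothetical toy corpora: the larger one simply contains more example text.
small_corpus = ["the cat sat"]
large_corpus = small_corpus + ["the dog ran home", "a cat ran to the dog"]

test_text = "the dog sat at home"

small_vocab = train_vocabulary(small_corpus)
large_vocab = train_vocabulary(large_corpus)

print(coverage(small_vocab, test_text))  # 0.4 (only "the" and "sat" are known)
print(coverage(large_vocab, test_text))  # 0.8 (only "at" is unknown)
```

With less example material, the same tool flags more correct words as unknown, which is the sense in which limited data leads to limited quality.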

This means that in order to make sure all official South African languages are equally supported, large amounts of linguistic data are needed, as this is essential for the development of HLT tools for these languages. However, many of the South African languages – and in particular, the indigenous languages – are currently under-resourced, and, due to the limited availability of linguistic data collections for these languages, there is also only limited availability of HLT tools. The limited availability of these tools is a problem in the context of intellectual decolonisation.

To resolve the issue of limited availability of linguistic data collections (and hence HLT tools), additional linguistic data will need to be collected. Unfortunately, especially for the indigenous languages, this is difficult, as not many texts are available in electronic form.

One way of collecting language samples is to consider situations in which people use the South African languages, preferably already in electronic form. This occurs, for instance, on social media, where people share experiences in their own languages. If we can tap into this source of information, we can collect linguistic data, which provides information on online culture (i.e. how people behave) as well as actual language use (i.e. how people communicate).

In summary, if we want to achieve intellectual decolonisation, we need (at least) equal support for all South African languages. At the moment, there is only limited support for many of the languages. What is needed is the development of additional HLT tools, or the improvement of existing ones. For this, however, large amounts of linguistic data are required, which means that new and additional linguistic data collections are essential. Social media may be an excellent source for digital linguistic data, as that is where much of the language practice occurs.

If we can manage to collect linguistic data, improve the availability of HLT tools, and, as a result, provide more equal support for all of the official South African languages, we can tackle intellectual decolonisation. The effect of this is that it will “[represent] a fundamental change in the way we live, work and relate to one another. It is a new chapter in human development, enabled by extraordinary technology advances commensurate with those of the first, second and third industrial revolutions” – which is exactly the definition of the fourth industrial revolution that we identified earlier.

It turns out that the ideas we found when investigating intellectual decolonisation in the area of the official South African languages have a direct relationship to the fourth industrial revolution. If we make sure that intellectual decolonisation happens for these languages (for example, by making high-quality HLT tools available), we will reach a situation that is exactly described by the explanation of what the fourth industrial revolution is. In other words, language is at the heart of the fourth industrial revolution.

Earlier, we identified properties of the industrial revolutions: power, enablers, and positive and negative effects. With the idea of having language at the heart of the fourth industrial revolution, we can also call this industrial revolution the “social revolution”, as it is about social and cultural relationships and interaction. The power of this revolution, then, is shared (linguistic) information, and the enablers are human information (for example, in the form of language) and social media, which allows access to this human information. The positive effect that we can expect is (electronically) accessible information on human nature. In a practical context, it provides linguistic information that allows us to use machine learning to improve linguistic tools for all South African languages. Having access to these tools will improve social and cultural cohesion and will result in inclusiveness. The potential negative effect, however, could be exclusion (if we do not do this properly).


This contribution is part of the seminar "Die Vierde Nywerheidsrevolusie". Read all the contributions here:

Miniseminaar: Die Vierde Nywerheidsrevolusie
