The WHO Early AI-supported Response with Social Listening Platform shows real-time information about how people are talking about COVID-19 online, so that we can better manage the infodemic and pandemic as they evolve.
More information about the initiative and an exploration of the data can be found at WHOinfodemic.citibeats.com.
More information about the methodology, data and definitions can be found at WHOinfodemic.citibeats.com/methodology.
The platform is powered by Citibeats, a text analytics platform specialized in social understanding. More information can be found on the methodology page or at www.citibeats.com.
Listening to people's questions and concerns is an important way for health authorities to learn about what matters to communities in response to COVID-19. This social listening platform aims to show real-time information about how people are talking about COVID-19 online, so that we can better manage the COVID-19 infodemic and pandemic as they evolve.
While the Early AI-supported Response with Social Listening Platform website offers ready-made visualizations for easy exploration of the data, the aggregated and anonymized data is also available to anyone wishing to integrate it into their own research or existing workflows.
This is available via the public API as well as this GitHub repository.
Any use of the data should use the citation 'World Health Organization, Early AI-supported Response with Social Listening'.
We welcome hearing how you are using the data, as well as feedback on how to improve the platform for your needs.
The data has been obtained from public posts related to COVID-19, using the Twitter API and data aggregators of public sources such as forums, message boards, blogs and comments in news. Please keep in mind that this is only a sample of all COVID-19 conversations.
The data is updated each day with new posts.
NOTE: all data is subject to quality, technical, and ethical requirements before being added to the system.
More information about the data can be read here.
Since the civic situation is constantly evolving and social needs are wide-ranging, there is a need for real-time data to serve the decisions of key actors. At the same time, this data must be treated carefully, as the data query and categories are subject to change along with the conversation. This repository will be kept up-to-date with revised data, and changes noted here. Note that, as new sources are added to the system, records from earlier dates could change.
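Because earlier records can be revised, it may help to compare a fresh download against a copy saved earlier. The following is a minimal sketch, assuming the CSV layout described in this README (one row per country and day, keyed by `id` and `date`). The two inline snapshots, their values, the `docs-1` column and the function names are hypothetical and only illustrate the idea.

```python
import csv
import io

# Two hypothetical snapshots of the same file, saved on different days.
# All values are made up for illustration.
OLD_SNAPSHOT = """id,date,name,docs-1
ESP,2021-03-01,Spain,200
ESP,2021-03-02,Spain,250
"""
NEW_SNAPSHOT = """id,date,name,docs-1
ESP,2021-03-01,Spain,210
ESP,2021-03-02,Spain,250
ESP,2021-03-03,Spain,240
"""


def index_rows(text):
    """Map (country code, date) to the full row of a CSV snapshot."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["id"], row["date"]): row for row in reader}


def diff_snapshots(old_text, new_text):
    """Return (revised, added) keys between two snapshots.

    revised: keys present in both snapshots whose values differ.
    added:   keys present only in the newer snapshot.
    """
    old, new = index_rows(old_text), index_rows(new_text)
    revised = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    added = sorted(new.keys() - old.keys())
    return revised, added


revised, added = diff_snapshots(OLD_SNAPSHOT, NEW_SNAPSHOT)
print("revised:", revised)
print("added:", added)
```

Any analysis that depends on a revised row would need to be re-run against the newer snapshot.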
Every analysis based on Internet data has to deal with representativeness. Not everyone is connected to the Internet, and not everyone who is connected shares their opinion.
Further information on the methodology, data and definitions can be read here.
CSV columns:
- `id`: code of the country as defined by ISO 3166-1 alpha-3
- `date`: day on which the raw data was published, in YYYY-MM-DD format (UTC timezone)
- `name`: name of the country in English
- `docs-N`: number of documents for category N on a given day
- `docs-delta-percent-N`: variation of documents (%) for category N with respect to the previous day
- `docs-percent-N`: percentage of documents in category N with respect to all the other categories on that day
- `docs-female-N`: number of documents for category N on a given day that are estimated to be from females
- `docs-male-N`: number of documents for category N on a given day that are estimated to be from males
- `docs-questions-N`: number of documents for category N on a given day that were questions
- `docs-complaints-N`: number of documents for category N on a given day that were complaints
NOTE: each row in the CSV corresponds to one day of data for a specific country.
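As a rough illustration of how these columns fit together, here is a minimal Python sketch that reads rows in the documented layout and derives two simple figures. The inline sample data, the category number `1` and the helper names are hypothetical. The day-over-day formula is an assumption about how `docs-delta-percent-N` is computed, not a statement of the platform's actual method.

```python
import csv
import io

# Hypothetical sample in the documented layout. The values are made up and
# category 1 is only a placeholder for a real category number.
SAMPLE_CSV = """id,date,name,docs-1,docs-female-1,docs-male-1,docs-questions-1,docs-complaints-1
ESP,2021-03-01,Spain,200,80,100,30,50
ESP,2021-03-02,Spain,250,90,130,40,60
FRA,2021-03-01,France,100,40,50,10,20
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per (country, day) row."""
    return list(csv.DictReader(io.StringIO(text)))


def share_of_category(row, category, suffix):
    """Fraction of a category's documents counted under `suffix`
    ("female", "male", "questions" or "complaints").
    Returns None when the category has no documents that day."""
    total = int(row[f"docs-{category}"])
    if total == 0:
        return None
    return int(row[f"docs-{suffix}-{category}"]) / total


def delta_percent(previous, current):
    """Day-over-day variation in percent (assumed formula).
    Returns None when the previous day has no documents."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row["id"], row["date"], share_of_category(row, 1, "questions"))

# Variation for Spain between the two sample days.
spain = [r for r in rows if r["id"] == "ESP"]
print(delta_percent(int(spain[0]["docs-1"]), int(spain[1]["docs-1"])))
```

The female and male counts are estimates and need not add up to the category total, so shares computed this way are best read as approximate.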
| Level 1 | Level 2 | Definition |
|---|---|---|
| The cause: How did the virus emerge and how is it spreading? | The cause of the virus | Narratives about the origin of SARS-CoV-2. |
| | Stigma about the spread | Stigma towards people who are thought to be spreading the virus: racist expressions, attribution to poor people or immigrants. |
| | Stigma about or by infected people | Stigma expressed about or by people who are or have been infected. |
| The illness: What are the symptoms and how is it transmitted? | Confirmed symptoms | Confirmed symptoms as defined by WHO, excluding longer-term symptoms. |
| | Other discussed symptoms | Other discussed symptoms that have not yet been confirmed by WHO. |
| | Prolonged symptoms | Reports on long COVID that may or may not be confirmed by WHO. |
| | Modes of transmission | Modes of transmission confirmed and unconfirmed by WHO. This includes discussion of asymptomatic and pre-symptomatic transmission as well as possible ways the virus can be transmitted (for example, aerosols and fomites). |
| | Transmission settings | Narratives about settings where transmission can be amplified: closed and semi-closed settings. |
| | Immunity | General conversations on re-infection, confusion over immunity after infection or the possibility of being infected more than once. |
| | COVID-19 Variants | Narratives and concerns about the development, spread and impact of new COVID-19 variants. |
| | Demographic vulnerability & risks | Vulnerable and risk groups. |
| | Impact on mental health | Anxiety, depression and other conditions arising from the pandemic situation. |
| The treatment: How can it be treated or cured? | Current treatment | Medical treatment as per WHO treatment recommendations. |
| | COVID-19 vaccine | Narratives about the vaccine itself: efficacy, side effects, safety, etc. |
| | Health care workers (HCW) and vaccine | Narratives by and about health care workers and the vaccine. |
| | General vaccine discussion | Narratives about vaccines in general, including discussion about others or communities that have different opinions about vaccines; can include any vaccine concerns, not just COVID-19. |
| | Science and R&D | Comments on new treatments and vaccines from research and development, and on evidence and scientific processes. |
| | Non proven treatments | Discussion about treatments that are not proven to be effective (examples: sunlight, nutrition, herbal remedies, etc.). |
| | Myths | Specific myths that WHO and partners have reacted to or taken steps to debunk. |
| The interventions: What is being done by government and health authorities and societal institutions? | Testing | Any discussion about tests – everything from reliability, to access to tests, types of tests, requirement to have tests, etc. |
| | Contact tracing | Any discussion about the process, requirements and steps involved in contact tracing, and the use of technology. |
| | Supportive care | Care given to patients in hospitals by medical personnel. |
| | Vaccine distribution and policies on access | Narratives about distribution of, equity in and access to the COVID-19 vaccine. |
| | Personal measures | Individual protection measures recommended by governments/WHO such as wearing masks, handwashing, social distancing, isolation when ill... |
| | Measures in public settings | Measures implemented by governments in public settings: schools, workplaces, public transport... |
| | Travel measures | Measures implemented or suggested by governments/WHO/population/private companies on travel: immunity passports, negative PCR or negative rapid test to enter a country, mandatory quarantine. |
| | Immunity pass | Vaccine certificates, immunity / health passports, digital and hard copy, including implications for access to businesses, schools, and other services. |
| | Reduction of movement | Measures implemented by governments related to movement reduction: lock-down at home, territory lock-down, etc. |
| | Protection: medical equipment | Equipment for health workers: PPE advances and accessibility for the public. |
| | Health Technology | Health technology used to treat patients: medicines, medical devices, vaccines, procedures and systems. |
| | Digital health technology | Discussions about digital technology used to respond to the pandemic: electronic data exchange, electronic notices of passenger lists to health authorities, biometric data coming from wearables, proximity apps (App Covid). Includes people’s attitudes to data privacy, or for modelling and predictive analytics. |
| | Pandemic Fatigue | Fatigue from interventions (lock-down, movement restrictions, masks…). |
| | Faith | Narratives about faith and religion and COVID-19 (these narratives are recurring, usually around the time of religious holidays and outbreaks in faith-based settings). |
| | Industry | Narratives about industry, unions and COVID-19. |
| | Environment | Narratives about the environment and COVID-19 – some examples: shading in environment, waste water, air pollution as a secondary byproduct of lockdowns. |
| | Inequalities & Human Rights | Narratives about social inequalities and their relation to COVID-19. |
| | Civil Unrest | Narratives about civil unrest and COVID-19. |
| | Youth | Narratives about youth, effects of the pandemic on them, or actions youth are taking. |
| Type of information: What types of information are most engaging? | Statistics & data | Conversations about facts, official statistics and data. |
| | Misinformation | Conversations about misinformation. |
| | Mis- and Disinformation | Conversations about mis- and disinformation. |
| | Sources & influencers | Conversations about where people look for information. |
- New category “Variants” added
- Added keywords in boolean query for Twitter and web comments with variant and vaccine-related keywords.
- Added new categories: Prolonged symptoms, Immunity, Immunity Pass.
- Merged the following categories:
  - Modes of Transmission, Asymptomatic transmission and pre-symptomatic transmission.
- Expanded the scope of the following category:
  - Inequalities to include Human Rights