What can internet search engines reveal about STD trends and risk?

July 1, 2014

In an invited talk at the 2014 STD Prevention Conference, the AIDS Foundation of Chicago’s Director of Research, Evaluation and Data Services, Amy Johnson, along with UIC Associate Professor Supriya Mehta, discussed the application of search engine data to sexually transmitted disease (STD) surveillance. Highlighting their current study published in the Journal of Sexually Transmitted Diseases in January 2014, they discussed the challenges and feasibility of this novel approach.

Internet-based surveillance of sexually transmitted infections (STIs) has the potential to increase the timeliness of detection and response to trends in infection as well as enhance sensitivity and predictive capacity of the surveillance system.

In this study, Google Trends was used to examine the relationship between STI-related search trends and CDC-reported STI rates by U.S. state. Google Trends analyzes internet searches to report how often the entered terms are searched relative to overall search volume. Data from Google Trends have been used to accurately predict regional outbreaks of influenza 7 to 10 days before conventional surveillance. The approach has also been applied to other infectious diseases such as West Nile virus, rotavirus, and more recently, HIV.

In the current study, the frequency of STI search terms was greatest in states where STI rates are the highest. The search term “gonorrhea” was positively associated with STI rates in 2011; however, there was no association for “chlamydia” as a search term. The lack of association between “chlamydia” and state disease rates may be due to the short period of data analyzed. Another possible explanation is that screening for chlamydia is much more common than screening for gonorrhea: if most chlamydia cases are detected through screening while asymptomatic, infected individuals may have little reason to search for the term, which would weaken any correlation between search frequency and reported rates.
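To make the idea of an "association" between search frequency and reported rates concrete, here is a minimal sketch of a state-level rank correlation. It is not the study's method, which this article does not describe; Spearman's rank correlation is simply one common choice for this kind of comparison. The state labels and all numbers below are made up for illustration and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only (NOT data from the study).
# search_index: relative search interest for one term, by state.
# reported_rate: reported cases per 100,000 population, by state.
search_index = {"State A": 62, "State B": 88, "State C": 45,
                "State D": 100, "State E": 71}
reported_rate = {"State A": 95.0, "State B": 140.2, "State C": 60.3,
                 "State D": 155.8, "State E": 90.0}

states = sorted(search_index)
rho = spearman([search_index[s] for s in states],
               [reported_rate[s] for s in states])
print(f"Spearman rank correlation across {len(states)} states: {rho:.2f}")
```

A value near 1 would indicate that states with more searches also tend to report higher rates; a value near 0 would indicate no monotonic relationship. A real analysis would also need a significance test and, as noted in the limitations, some adjustment for differences in internet access across regions.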

The current study has limitations. First, it cannot be assumed that STI-related searches are generated only by people who are infected with an STI or at risk of one. There is also uncertainty about what causes trends in search activity, and the analysis does not control for regional differences in internet access.

Next steps include partnering with Google to enhance the user interface and develop disease-specific tools, determining the potential for integrating this new method into surveillance settings, and characterizing search engine users.

It’s a brave new world of health-related big data, and learning to leverage this data and integrate it into public health systems is an innovative and important task.

