Introduction to the Mobile Reading Data Exchange

Posted November 27, 2017
By Jason Young

We are excited to kick off the Mobile Reading Data Exchange research site. The site is a collaboration between Worldreader and two groups at the University of Washington Information School: the Technology & Social Change Group (TASCHA) and the DataLab. We will be using this site to share the ongoing work of a joint research project that has been generously funded by the Tableau Foundation. In this inaugural post we’ll share some background on the project, describe our research questions, and explain what you can expect while following this site.

At its core, this project is about using big data analysis and visualization to unlock stories about mobile reading patterns across the globe. The mobile reading data for the project comes from Worldreader, a US-, Europe-, and Africa-based non-profit organization dedicated to creating a world where everyone can be a reader through increased access to digital books. Approximately 6 million people, primarily in Sub-Saharan Africa and India, have used Worldreader's mobile application to read from a selection of more than 40,000 titles available in 43 languages. Each time a user interacts with the application, Worldreader collects information about that interaction, such as how many pages the user has read, how many times they have visited a book, or what genres they tend to read. As a result, Worldreader collects over 1,000,000 data entries a day from readers, most of whom live in the Global South. The resulting dataset is already huge and continues to grow rapidly!

This dataset is particularly interesting to TASCHA, a research group that specializes in understanding how people use new technological systems – such as mobile devices – to gain access to information that can change their lives. Worldreader has invited TASCHA to help analyze this dataset to better understand the reading patterns of its users. This analysis can help us understand digital reading patterns in the Global South more broadly, and thereby build upon the findings of a UNESCO/Worldreader report on “Reading in the Mobile Era” from three years ago. While this research is valuable on its own, we’ll also be using the results to (1) improve Worldreader's data infrastructure and (2) develop computational methods and visualization strategies to expose information from big data sets to the broader public. Our partnership with the Tableau Foundation will be instrumental in achieving these goals, given their expertise in data management, analysis, and visualization.

Worldreader and TASCHA kicked this project off in Seattle over a series of meetings in late September, where we began exploring the data and developing research questions and methods. Based on that initial work, we have a more detailed post on our dataset [here] and a full list of research questions [here]. As a quick summary, though, our dataset includes records of each interaction that a user has had with the Worldreader application. Depending on the type of interaction, we may have data about what book is being read, how much of a particular book a user has finished, what type of query the user has made of the system, the IP address of the user, and more. If the user has registered and added demographic data, we may also have access to information such as age and gender.

Based on these data we came up with several clusters of questions, revolving around topical areas such as user demographics, behavior and engagement, and geographic patterns. For example, within the user demographic cluster, one question is whether we can identify different user groups or profiles based upon demographic or behavioral characteristics. In answering this question we will be interested in determining whether clusters, or groups, of users can be identified based on characteristics such as time spent reading, long-term engagement with the application, book preference, location, language, and more.

Answering this question would have both applied and basic research significance. For Worldreader, an answer might help determine what types of users are most actively engaged with the application, so that services can be better tailored to those users. It might also help determine which users are least engaged, so that strategies can be developed to better engage those readers. From a basic research perspective, answers to these questions might be generalized to develop better understandings of the types of populations that are impacted by mobile reading, of the demographics and geographies of access to mobile services, and more.

As we begin exploring these data and questions, we will be using this site to make our research as public and open as possible. In the future you can expect additional posts like this one that discuss our research questions, the data, initial analysis, relevant concepts from research literature, and more. These shorter pieces may be messy and incomplete – they are meant to give you a direct view into the research process as it evolves. Later in the life of the project we will also be sharing more polished pieces, including journal manuscripts and conference presentations based on our analysis. We will also be building tools into the site to let others join in on the analysis – one goal of the project is to produce interactive dashboards that facilitate broader public engagement with Worldreader’s data. We hope that you will continue to follow along!