The Data Story: Part II
Last week, we introduced you to Lars, Kristian, Anders and Birk, who told us about the importance of data to a citizen science project and what kind of data we are receiving from our players (read it again here). However, we still have a few unanswered questions. In this second blog post, we will ask our developers about the process of data harvesting and the challenges that arise along the way.
Let’s start with a short overview of the whole data harvesting process. Who better to give it than Lars, our Development Manager? Lars, could you walk us through this beautiful diagram?
Lars: Sure! As you already mentioned, it all starts with the Unity game client (1) on the player’s computer or mobile phone. When a player logs in, we can track when they play, what device they play on and, of course, their high score. We also track the mouse movement and the physics of the game 30-40 times per second! At the end of the level, all this data is uploaded (2) (about 13 kB for each individual play) to a service called Parse (3). The Parse service uses a database technology called MongoDB (4), which is a so-called "NoSQL" document database, well suited for storing data "blocks" or objects, such as the data we store about Quantum Moves plays. However, this database has a limited size, as it is running in high-performance mode. Therefore, we periodically pull the data from the Nodechef MongoDB to a much larger database on the Aarhus University cloud (5). The Aarhus University Database (6) stores not only ALL the data points we have ever collected, but also all the optimized hits!
From this point, the data can be accessed by our researchers through the Databrowser (7), an application we made to let them filter data based on all kinds of cross-correlations: e.g. to look at data from players that have completed level 10, but not level 11 of Quantum Moves, or to compare Android and iOS players. When our researchers have used the Databrowser to pull the datasets they need, they can export the data to the Matlab data analysis tool (8). In Matlab, data can be analyzed even further and used to produce content for scientific articles (9).
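To give a feel for the kind of filter mentioned above, here is a minimal Python sketch of "completed level 10, but not level 11". It is not the Databrowser's actual code: the record fields (`player`, `level`, `completed`, `platform`) and the function names are made up for illustration.

```python
# Hypothetical play records; the field names are illustrative only
# and do not reflect the real Quantum Moves schema.
plays = [
    {"player": "a", "level": 10, "completed": True, "platform": "Android"},
    {"player": "a", "level": 11, "completed": False, "platform": "Android"},
    {"player": "b", "level": 10, "completed": True, "platform": "iOS"},
    {"player": "b", "level": 11, "completed": True, "platform": "iOS"},
    {"player": "c", "level": 9, "completed": True, "platform": "Android"},
    {"player": "d", "level": 10, "completed": True, "platform": "iOS"},
]


def completed_levels(records):
    """Map each player to the set of levels they have completed."""
    done = {}
    for record in records:
        if record["completed"]:
            done.setdefault(record["player"], set()).add(record["level"])
    return done


def stuck_between(records, passed, not_passed):
    """Players who completed level `passed` but not level `not_passed`."""
    done = completed_levels(records)
    return sorted(
        player
        for player, levels in done.items()
        if passed in levels and not_passed not in levels
    )


stuck = stuck_between(plays, 10, 11)
```

With the sample records above, `stuck` contains players `a` and `d`: both finished level 10, and neither has a completed play on level 11.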
But let’s go a step back. The data points we have in our database represent player-created solutions for a specific game level. For instance, maybe the player tried to move in a smooth direct line, or maybe the player tried to move fast to the target area, then suddenly did something else. We use each solution as a starting point for AI optimization (10) techniques, ending up with potentially as many as 200 optimized hits for each player hit! This optimization process takes about two minutes of CPU time, and given that we have had 8 million plays recorded so far, we have in excess of 27 years of computation time ahead of us... We are looking into ways to crowdsource this data optimization.
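The back-of-the-envelope arithmetic behind that estimate can be sketched as follows. This assumes the "about two minutes" applies once per recorded play and rounds it to exactly two; the number of volunteer machines is purely hypothetical.

```python
PLAYS = 8_000_000            # plays recorded so far (from the text)
CPU_MINUTES_PER_PLAY = 2     # "about two minutes", rounded (assumption)
MINUTES_PER_YEAR = 60 * 24 * 365


def cpu_years(plays, minutes_per_play):
    """Total single-CPU time, in years, needed to optimize every play."""
    return plays * minutes_per_play / MINUTES_PER_YEAR


def wall_clock_days(plays, minutes_per_play, workers):
    """Days needed if the work is split evenly across `workers` machines."""
    return plays * minutes_per_play / workers / (60 * 24)


years = cpu_years(PLAYS, CPU_MINUTES_PER_PLAY)
days_with_help = wall_clock_days(PLAYS, CPU_MINUTES_PER_PLAY, workers=1000)
print(f"{years:.1f} CPU-years on one machine")
print(f"{days_with_help:.1f} days on 1000 machines")
```

At exactly two minutes per play this comes to roughly 30 CPU-years, in line with "in excess of 27 years", which is why spreading the work over many machines is so attractive.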
Finally, the optimized hits will be sent back to the lab (11), helping the experimental researchers optimize their experiment and eventually build a better quantum computer.
Needless to say, this is a simplified version of the data architecture that runs behind the scenes of our project.
So, what are the main challenges we have faced so far?
Kristian: To start with, one of our main challenges is to keep the game experience running smoothly while we package and send our data. We are using a Parse server as our backend, and that allows us to use some simple code to actually start transmitting our data. However, the data objects we bundle up are larger than what Parse was built for, and along with the quantum physics being calculated, our games sometimes stall for a split second. This is highly noticeable while you are playing a game, and trying to keep these hiccups to a minimum is a constant struggle.
Another great challenge we have been facing recently is high scores.
Collecting high scores is a much more straightforward process than the scientific data collection, but at the same time, ranks and high-score lists require an element of sorting, which is not as simple as it sounds.
The scientific data is simply stored in a huge database (hundreds of gigabytes, with many millions of packages). The problem arises because a database optimized for collecting large datasets (such as scientific data) does not mix well with one optimized for quickly ordering and returning dynamic samples (such as high scores). When Quantum Moves launched, we experienced a breakdown which resulted in a hectic scramble to try to overcome these obstacles. However, we solved it by setting up an algorithm for ordering the high scores and taking that load off the database.
It worked for the most part, but writing code in a hurry often results in a hard-to-maintain structure (and we're just getting started on our high score problems!).
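The post does not say what that ordering algorithm looks like, so the following is only one possible way to take sorting off the database: keep each player's best score in memory and pick the top entries on request. The class and method names are hypothetical.

```python
import heapq


class HighScores:
    """In-memory high-score list (illustrative sketch, not the real backend)."""

    def __init__(self):
        self._best = {}  # player name -> best score seen so far

    def submit(self, player, score):
        """Record a score, keeping only each player's personal best."""
        if score > self._best.get(player, float("-inf")):
            self._best[player] = score

    def top(self, n=10):
        """Return the n highest (player, score) pairs, best first."""
        return heapq.nlargest(n, self._best.items(), key=lambda item: item[1])


board = HighScores()
board.submit("lars", 120)
board.submit("kristian", 340)
board.submit("anders", 210)
board.submit("lars", 400)   # improves on Lars's earlier score
board.submit("anders", 50)  # ignored: lower than Anders's best
top_two = board.top(2)
```

Here the database only has to store scores as they arrive, while the ranking itself is computed outside it. A real service would also need persistence and tie-breaking rules, which this sketch leaves out.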
Our problems continued when we started moving away from the Parse service to Nodechef. We were already knee-deep in code using the Parse framework, so the move called for some adjustments to our cloud code, the part of the server that receives high scores and data. However, we quickly realized that there was no way of knowing which adjustments worked until we had completed the transition of our clients and backend to Nodechef. One of the solutions was to run unit tests and try to catch as many errors as possible.
Finally, despite a few more pitfalls along the way, we updated our clients and were ready to release! The new version of Quantum Moves is already in the AppStore and PlayStore, and our users can once again start generating data!
What is more, we have also fixed and improved the scoring in Quantum Moves. Time has been added as a factor to your score, and your fidelity score is now much more lenient, resulting in scores that better reflect the scientific value of the solutions.
We are not quite there yet, as we have to fix a few red flags—but we're close! Down the line, we'll start looking into expanding the data structure further, as we have learned a lot since this data adventure began. But right now, we can almost show you a functional top 10 again!
Need more proof of the hard work of our game developers? Download Quantum Moves Version 1.2 and see for yourself!