Note
The original article was granted open access licensed under creative commons (CC-BY) where you are free to copy, reuse, modify, distribute, commercial, etc, but required to give full citation:
- Authors are Fajar Purnama (me) and Tsuyoshi Usagawa.
- Title of the article is Using real-time online preprocessed mouse tracking for lower storage and transmission costs.
- Published in Journal of Big Data, Volume 7, Number 27, Page 1-22, 10 April 2020, DOI is 10.1186/s40537-020-00304-x.
- Available at https://journalofbigdata.springeropen.com/articles/10.1186/s40537-020-00304-x.
Althought it is already fully available online, I would like to rewrite it in .html and share to as many platform as possible. Who knows if the original site goes down? Also, I have a circumstance to read it all over again and why not rewrite it at the same time to make it more fun. Eventhough the original is licensed CC-BY but the rewritten version I will license as customized CC-BY-SA where you are also allowed to sell my contents but with a condition that you must mention that the free and open version is available here. In summary, the mention must contain the keyword "free" and "open" and the location such as the link to this content.
Abstract
Pageview is the most popular webpage analytic metric in all sectors including blogs, business, e-commerce, education, entertainment, research, social media, and technology. To perform deeper analysis, additional methods are required such as mouse tracking, which can help researchers understand online user behavior on a single webpage. However, the geometrical data generated by mouse tracking are extremely large, and qualify as big data. A single swipe on a webpage from left to right can generate a megabyte (MB) of data. Fortunately, the geometrical data of each x and y point of the mouse trail are not always needed. Sometimes, analysts only need the heat map of a certain area or perhaps just a summary of the number of activities that occurred on a webpage. Therefore, recording all geometrical data is sometimes unnecessary. This work introduces preprocessing during real-time and online mouse tracking sessions. The preprocessing that is introduced converts the geometrical data from each x and y point to a region-of-interest concentration, in other words only heat map areas that the analyzer is interested in. Ultimately, the approach used here is able to greatly reduce the storage and transmission cost of real-time online mouse tracking.
Introduction
We are living in the age of information where information and communication technologies are becoming mainstream [1]. Computers and the Internet have become inseparable from our daily lives. It is now possible to do many things without the need to travel and wait; examples include, long-distance communication through online text, voice and video calls, social interaction via social media, online learning, ecommerce or online shopping, and many forms of entertainments such as online music, online videos, and online games. Most importantly, the Internet has become the primary source of information around the world.
We are living in the age of information where information and communication technologies are becoming mainstream [1]. Computers and the Internet have become inseparable from our daily lives. It is now possible to do many things without the need to travel and wait; examples include, long-distance communication through online text, voice and video calls, social interaction via social media, online learning, ecommerce or online shopping, and many forms of entertainments such as online music, online videos, and online games. Most importantly, the Internet has become the primary source of information around the world.
We are living in the age of information where information and communication technologies are becoming mainstream [1]. Computers and the Internet have become inseparable from our daily lives. It is now possible to do many things without the need to travel and wait; examples include, long-distance communication through online text, voice and video calls, social interaction via social media, online learning, ecommerce or online shopping, and many forms of entertainments such as online music, online videos, and online games. Most importantly, the Internet has become the primary source of information around the world.
Although eye tracking remains a tool for the laboratory, an alternative method has been invented, mouse tracking [7]. Instead of eye movements, mouse tracking tracks mouse movements and other helpful events. The fundamental strategy of mouse tracking is the recording of mouse clicks, mouse movements, and scrolls. Eye and mouse tracking have been implemented in the fields of education [8, 9], reading patterns [10], search engines [11], visual navigation [12], web evaluation and usability [13, 14]. Mouse tracking can be treated either independently [15] or as a correlation to eye tracking [16] in other words, as replacement.
The biggest problem with default mouse tracking (as well as eye tracking) is the huge volume of data generated, which can be categorized as big data [17, 18]. This high volume is due to the use of geometrical data where each event that occurs on each point of the webpage is recorded. If the distance between left and right is 1000 pixels, then a swipe from left to right will generate 1000 rows of tables. However, analyst may not need all of the mouse tracking data that is generated. Therefore, this paper proposes preprocessing the data based on the analyst’s needs. The preprocessing in this case determines the region of interest in other words; which area the tracking should capture rather than capturing each point of interest. Furthermore, the preprocessing is conducted not only online, but also in real-time mouse tracking session.
Related Work
Implementations of eye and mouse tracking
The use of eye [19] and mouse [20] tracking began in the early 20th century, and since then, there have already been many laboratory experiments conducted using these technologies. Today, there are many attempted implementations of eye and mouse tracking, but it is unclear how widespread and long-running they are. For eye tracking, there is no chance of implementation outside laboratory unless one of two requirements is met: (1) affordable and mainstream hardware [6] or (2) optimal usage of web cameras [21, 22] on laptops and/or cameras on smartphones. By contrast, widespread implementation of mouse tracking is already possible because the required hardware is available by default in all computers and smartphones, but the problem is the generation of big data (the same is true of eye tracking as well).
The following are selected attempts at implementation eye tracking:
- Adaptive E-Learning via the Eye Tracking (AdELE) frame-work, adaptive, integrated, and real-time eyetracking during e-learning processes [8, 23].
- Eye tracking based emphatic software agent (ESA), an eye tracking software that captures the state of awareness of the learners and responds accordingly [24].
- Enhanced exploitation of eyes for effective e-learning (e5Learning) [25].
- Eye tracking based adaptive and personalized e-Learning Systems (AeLS) [26].
- Eye tracking based Eye tracking based programming tutoring system (Protus) [27].
The following are selected attempts at implementation mouse tracking:
- A mouse tracking web application developed by Zushi et al. [9] for their own specific learning management system (LMS).
- Moodle LMS mouse tracking plugin [28,29,30].
- Mouse tracking web browser plugin and client side programming script [31, 32].
Some commercial and open source software programs are as follows:
- Open Gaze and Mouse Analyzer (OGAMA), an open-source software designed to analyze eye and mouse movements in slideshow study designs [33].
- Mousetrap, an integrated, open-source mouse-tracking package on OpenSesame for laboratory experiments [34].
- Known commercial applications: Lucky Orange, Hotjar, Crazy Egg, Fullstory, Ptengine, Heatmap.com, Smartlook, ContentSquare, SessionCam, Seevolution, Capturly, Inspectlet, MouseFlow, Clicktale, and Tamboo [35].
Mouse tracking in web development
The core of mouse tracking in web development is document object model (DOM) which is an application programming interface (API) for Hyper Text Markup Language (HTML) and Extensible Markup Language (XML). It defines the logical structure of documents and the way a document is accessed and manipulated. Supposed a simple HTML page with the codes on Table 1, the DOM structure can be represented on Fig. 1. With the Document Object Model, programmers can build documents, navigate their structure, and add, modify, or delete elements and content. Anything found in an HTML or XML document can be accessed, changed, deleted, or added using the Document Object Model, with a few exceptions. DOM is designed to be used with any programming language. Currently, it provides language bindings for Java and ECMAScript (an industry-standard scripting language based on JavaScript and JScript) [36].
Table 1 A web page code in simple HTML that contains html, head, title, body, p, and footer tags
<html>
<head>
<title>Simple Webpage</title>
</head>
<body>
<p>Hello World!</p>
</body>
<footer>
<p>CC</p>
</footer>
</html>
Fig. 1 DOM representation of Table 1. The html tag is the parent with head, body, and footer tag as the children. Head has a child tag title, body has a child tag p, and footer has a child tag p
The implementation of mouse tracking is based on DOM events, specifically mouse, touch, and user interface (UI) events which are actions that occurs as a result of the user’s mouse actions or as result of state change of the user interface or elements of a DOM tree [37]. Our previous work [31] uses jQuery to access the DOM API and receives information that are related to mouse, touch, and UI events. They can be stored into default dynamic variables or in an ArrayBuffer for enhanced performance. The list of events are as following:
- Mousedown: when either one of the mouse buttons are pressed (usually left, middle, or right button)
- Mouseup: when either pressed mouse buttons are released
- Mousemove: when the mouse cursor moves
- Mouseleave: when the mouse leaves an element (we only indicate when temporary leaving a webpage)
- Mouseenter: when the mouse enters an element (we only indicate when temporary entering a webpage)
- Beforeunload: when the webpage almost closes
- Scroll: when the webpage scrolls
- Touchstart: when a computer device screen is touching
- Touchend: when a touch from touchstart is removed
- Touchmove: when a touch is moving
- Touchcancel: when a touch is interrupted
- Resize: when the webpage is zoomed in or out
The information is then processed by adding important labels such as the date of the received information and duration by calculating the difference between the current and previous received events. Finally the information is either stored locally or sent to a server using hyper text transfer protocol (HTTP) post method. Traditionally, the information is transmitted all at once at the end of the session, but in our study [31], we found that it is better to transmit them in real-time without delay. The difference between offline, regular online, and real-time online mouse tracking is shown in Fig. 2.
Fig. 2 Flow chart of traditional and current mouse tracking method [31]. The left flowchart is offline mouse tracking, the middle flowchart is regular online mouse tracking, and the right flowchart is real-time online mouse tracking
Default eye and mouse tracking generates big data
Although eye and mouse tracking are not yet mainstream, rumors spread that due to large amounts of data generated, they could not be widely implemented other than at big corporations such as Google and Microsoft, which have gigantic data centers. University network and server administrators are hesitant to implement tracking technologies because they not only generate massive amounts of data, but also eye and mouse tracking are not replacements for existing systems but rather additions. Huang et al. [11] performed a mouse tracking experiment on Bing’s search engine and immediately reduced the sampling rate because the data were too large.
Leiva and Huang [38] believed that a swipe could generate a megabyte of data and the authors further investigated and proved that rumor to be true. While a half-year of Moodle log data with approximately 40 students is only approximately 300 kilobytes (kB) [39], mouse tracking data and other event data generated by approximately 22 students reaches approximately 100 megabytes (MB) in only 2 h [31], and that figure will double if eye tracking is included. Imagine how much data would be produced by a university with a large number of students if mouse tracking were running on its website for years.
According to an article by Adekitan et al. [40], Nigerian University Internet traffic can reach terabytes (TB) in a week and is regarded as big data. The authors’ previous mouse tracking session [31] also reaches the same level of Internet traffic if over 100 students are present. Other than volume, mouse tracking met the other 5Vs criteria of big data [17]: velocity, the amount of data generated especially in real-time which is explained in further sections, veracity; meaning that data loss may often occur due to limited connectivity, which can lead to inconsistent data; variety, which is discussed in further sections and previous work [31]; and value, which is discussed in the next paragraph.
It would be wise to start investing in eye and mouse tracking just as big companies today are investing in big data [41], as the data generated by eye and mouse tracking are valuable. By analyzing big data, interesting information can be derived that gives us the knowledge needed to make optimal decisions [42]. Just as companies study customers’ data to find opportunities to increase their revenues [43], traders analyze historical trading data and current sentiments to find optimal positions [44], researchers study optimal prevention, diagnosis, and treatments in Medicare [45], and planners monitor smart cities [46], researchers can use eye and mouse tracking to identify online viewers’ attention, behavior, their evaluation of web contents, etc.
Reducing eye and mouse tracking data
During high-intensity activities, a user may generate an average of 70 or 70 events per second [47], meaning that 70 rows per second will be generated on a table. The traditional way of reducing the size of these tracking data is by reducing the sampling rate [11]. Furthermore, the sampling rate should be adaptive and not static. In other words, snapshots should only be taken when an event occurs such as when the mouse cursor moves or a click occurs, and snapshots should not be taken during idle sessions. Performing transmission in real-time helps distribute the transmission burden across time, avoiding bottlenecks. In other words, the tracking data are immediately transmitted to the server at each event occurrence rather than transmitting the mouse tracking data all at once at the end of the session. Compression methods can also be utilized as demonstrated in Leiva and Huang’s work [38], but their transmission method is still likely not real-time and is suspected to transmit the compressed tracking data all at once in the end of each session.
On the other hand, the preprocessing technique presented in this paper is designed to work in real-time. Not only does it reduce the data cost but also distributes the transmission burden across the time domain. The cause of the enormous data generation is the geometrical data or tracking of each location where the events occur, in other words the x and y coordinates. Tracking these coordinates provides rich data but sometimes all of that data is not needed. For example, Rodrigues et al. [28] only analyzed the amount of key up, key down, mouse down, mouse up, mouse wheel, and mouse movement to measure students’ stress, and Li et al. [48] only needed the time spent on each page. In such cases, the geometrical data can be omitted.
At other times, geometrical data are needed; however, it is not each precise x and y point that is needed but rather each area of the page (multiple points) [49]. Preprocessing is common in any data analysis to derive useful data prior to transmission and storage of the collected data. However, the preprocessing presented in this work is performed on the client before transmission and storage to reduce the server’s burden. Unlike typical preprocessing which is performed to filter redundant data, the preprocessing in this work is specifically based on the demands of the administrators or analyzers; in this case, preprocessing omits the geometrical data or groups them to represent certain areas. This study is a complete work of one the author’s previous works [50].
Method and simulation
System overview
Fig. 3 System overview of mouse tracking data transaction [31]. The framework is divided into two sides: one side is the client and the other side is the server. The client and the server are connected via the Internet. The webpage is in the server consisting of HTML, CSS, and JavaScript. The mouse tracking codes, which are event handling and capturing the command to post its data to the server, are inserted in the JavaScript section. When the client accesses the webpage, it will view the contents that consist of HTML and CSS. The mouse tracking codes within JavaScript are run in the background. The mouse tracking data are sent to the server and processed using server programming language, in this case PHP. Finally, the data are sent to the database; in this case, in SQL
The overall system is the same as in the authors’ previous work [31] with the concept discussed in "Mouse tracking in web development" section. In this section, the implementation of the concept to system is discussed. As shown on Fig. 3, mouse and other event tracking are performed on the client. The tracking codes can be injected internally, for example, as a browser plugin, or externally, for example, where the codes are retrieved alongside the webpage content [HTML and cascading style sheets (CSS)]. Then, the client sends the tracking the data to the server to be stored. The code itself for this work is written in jQuery, which is a simpler coding format of JavaScript for DOM manipulation. The external code can be written as a plugin if desired; for example, the authors wrote a Moodle plugin. The server side can be in any programming language, but in this work, PHP was used, and the database used MySQL. The codes are available on GitHub [32].
Each web framework may developed their own bindings to access the DOM API. However, the most fundamental implementation is still injecting the mouse tracking code into the script section no matter which web framework is used, which is the default option if the web framework did not developed their own bindings. Below is a list of few web frameworks:
- NodeJS is browser JavaScript made into server side programming language. Also, NodeJS is written based on the criticism in 2009 about how Apache HTTP server handled huge concurrent user, sequential programming, and blocking functions [51] while NodeJS is asynchronous and is designed to build scalable network application. Additionally, its runtime is built on Chrome’s V8 Engine which implements C++ features such as hidden classes and inline caching to make JavaScript runs much faster [52]. The popular web framework for NodeJS is Express which is a fast, unopinionated, minimalist web framework for Node.js [53]. The choice for implementing mouse tracking code can either be using developed module available on Node Package Manager (NPM), use TypeScript, or the default option. TypeScript is a typed superset of JavaScript developed by Microsoft that compiles to plain JavaScript. The advantage for developers are defining interface between software components, and interactive static checking and code refactoring during development [54]. The default option is to call the scripts in the webpage layout which is usually written in Jade or Pug.
- Django is a web framework written in Python that uses model-view template (MVT) [55]. Like Python almost every module is available, Django prides itself as a batteries-included framework, meaning that it comes with many modules unlike other frameworks, it is not necessary for a developer to write a module from a scratch. Although it is powerful for building huge web applications, the difficulty in building huge application doesn’t change when building small applications. For mouse tracking, there is a choice to use Python modules but it is not yet known whether it can interact with the DOM elements in the webpage. Most documentation suggests to use vanilla JavaScript in Django.
- Django is a web framework written in Python that uses model-view template (MVT) [55]. Like Python almost every module is available, Django prides itself as a batteries-included framework, meaning that it comes with many modules unlike other frameworks, it is not necessary for a developer to write a module from a scratch. Although it is powerful for building huge web applications, the difficulty in building huge application doesn’t change when building small applications. For mouse tracking, there is a choice to use Python modules but it is not yet known whether it can interact with the DOM elements in the webpage. Most documentation suggests to use vanilla JavaScript in Django.
- Django is a web framework written in Python that uses model-view template (MVT) [55]. Like Python almost every module is available, Django prides itself as a batteries-included framework, meaning that it comes with many modules unlike other frameworks, it is not necessary for a developer to write a module from a scratch. Although it is powerful for building huge web applications, the difficulty in building huge application doesn’t change when building small applications. For mouse tracking, there is a choice to use Python modules but it is not yet known whether it can interact with the DOM elements in the webpage. Most documentation suggests to use vanilla JavaScript in Django.
- Django is a web framework written in Python that uses model-view template (MVT) [55]. Like Python almost every module is available, Django prides itself as a batteries-included framework, meaning that it comes with many modules unlike other frameworks, it is not necessary for a developer to write a module from a scratch. Although it is powerful for building huge web applications, the difficulty in building huge application doesn’t change when building small applications. For mouse tracking, there is a choice to use Python modules but it is not yet known whether it can interact with the DOM elements in the webpage. Most documentation suggests to use vanilla JavaScript in Django.
- ReactJS is a JavaScript library for building UI which are maintained by Facebook and community. Unlike the previous back-end web framework, ReactJS is a front-end web framework. ReactJS have its own mouse event library which is to be injected on each UI [58].
- Angular is a complete rewrite to TypeScript based from the same team that built AngularJS. It is a web framework mainly maintained by Google and by a community of individuals and corporations to address many of the challenges encountered in developing single-page applications. It is one of the most popular framework to build web applications on mobile. The mouse events can be added on the components or templates [59].
Three techniques of mouse tracking
Table 2 Event monitoring difference between default mouse tracking, whole page tracking, and ROI tracking. Second to fourth row contains the three different method of mouse tracking, the first column are the events, and the rest marks whether the events are available on the mouse tracking method or not
Events | Mouse tracking | Page tracking | ROI tracking |
---|---|---|---|
Duration | ✓ | ✓ | ✓ |
Click Left | ✓ | ✓ | ✓ |
Click Right | ✓ | ✓ | ✓ |
Click Middle | ✓ | ✓ | ✓ |
Mouse X | ✓ | ✗ | Partial |
Mouse Y | ✓ | ✗ | Partial |
Touch X | ✓ | ✗ | Partial |
Touch Y | ✓ | ✗ | Partial |
Keyboard Type | ✓ | ✓ | ✓ |
Scroll X | ✓ | ✗ | Partial |
Scroll Y | ✓ | ✗ | Partial |
Other Events | ✓ | ✓ | ✓ |
Events Amount | ✗ | ✓ | ✓ |
For convenience, the techniques of mouse tracking are divided into three types, as shown in Table 2. They are called default mouse tracking, whole page tracking, and ROI tracking. The default mouse tracking precisely records the geometrical data of the event occurrence such as the horizontal x and vertical y of left clicks, right clicks, middle clicks, mouse movements, scrolls, zooms, and if desired keyboard types. The duration between each event is also measured. Whole page tracking omits the geometrical data and summarizes the number of left clicks, right clicks, middle clicks, mouse movements, scrolls, zooms, and if desired keyboard types that occurred on the webpage. In other words, the amount of activity is measured but not where or when it occurs, and only the total amount of time that the user spends on a webpage is recorded. The most complicated task is ROI tracking, which is a gray area between default mouse tracking and whole page tracking. ROI tracking defines the areas of a webpage to be tracked, for example how many left clicks, right clicks, middle clicks, mouse movements, scrolls, zooms, and keyboard types occurred and how long they occurred on a header, menu, content, footer, etc. This method is ideal because it meets the analyst’s requirements and reduces unnecessary resource costs, but the drawback is the heavy labor required to manually define the areas of each webpage. Automatic area definition is possible to certain degree. One way is by attaching “mouseenter” DOM event listener to every element and using “offset” DOM HTML to return the position of the element. Offset DOM HTML returns the left and top element distance from the outermost of the webpage, and by using “width” and “height” DOM HTML to calculate the element’s size, it is possible to find the bottom and right as well. However, the limitation is that it cannot perform smart labelling where it can only extract attributes, texts, and values of the element. An illustration comparing the three types of mouse tracking is shown in Fig. 4.
Fig. 4 Whole page vs region of interest vs default mouse tracking illustration. The left scroll illustrates whole page tracking that summarizes the number of events occurring on the whole page; the middle scroll illustrates ROI tracking that summarizes the number of events occurring in defined areas, and the right scroll illustrates default mouse tracking that records every event and the precise point where it occurs, forming a trajectory
Whole page vs region of interest vs default mouse tracking illustration. The left scroll illustrates whole page tracking that summarizes the number of events occurring on the whole page; the middle scroll illustrates ROI tracking that summarizes the number of events occurring in defined areas, and the right scroll illustrates default mouse tracking that records every event and the precise point where it occurs, forming a trajectory
Fig. 5 Three Types of Mouse Tracking Flowchart. The left flowchart is default mouse tracking, the middle flowchart is whole page summarized mouse tracking, and the right flowchart is region of interest mouse tracking
The flowchart for each implementation of real-time and online mouse tracking is shown in Fig. 5. For default mouse tracking, the information on the event is transmitted to the database each time an event occurrs. For example, when a click occurs, the client immediately transmits the information on where and when it occurs. For whole page tracking, the information is summarized as the number of events that occurred, and they are transmitted to the server when the client closes the webpage. The size of the transmitted data is only slightly larger than that of transmitting single click data when using default mouse tracking. Last, for ROI tracking, the information on the webpage area is summarized and transmitted after the mouse cursor leaves the area, and the process repeats on each movement between areas until the webpage is closed.
Simulation
The three mouse tracking method are tested on the client and server. Since the author lacks subjects to perform an implementation, a simulation based on previous mouse tracking data was conducted on the server. The mouse tracking data contain mouse tracking records from two quiz sessions in Moodle. They were conducted on the 3rd of January 2019 between approximately 12:00 and 14:30 Japan standard time. There are 2 sessions, with each session lasting approximately an hour and including 22 students (44 total students participating) from the School of Engineering and Applied Sciences, National University of Mongolia accessing the Moodle server at the Human Interface and Cyber Communication Laboratory, Kumamoto University. The data were preprocessed to exclude nonstudents and webpages other than the quiz page data. In other words, the simulation is purely a mouse tracking data transmission, which excludes most of the process, such as accessing the server and navigating the whole Moodle page. This approach shows lower resource consumption than the previous work [31].
The setup can be seen in Fig. 6 where a laptop functioning as a client is peer to peer connected to a personal computer functioning as a server. The mouse tracking data are converted into page tracking and ROI tracking data based on Table 2. Three sessions were conducted: the first session was the sending of mouse tracking data to the server, the second session was the sending of page tracking data to the server, and the third session was the sending of ROI tracking data to the server. Since the mouse tracking data contain time interval information between the sending of each event, it is possible to capture the scenario almost exactly.
During these sessions, the data rate is observed, and the central processing unit (CPU) and random access memory (RAM) usages are measured on the server. Figure 6 shows that one laptop serves as a client to send the data to the server which is a personal computer. The client is an MSI Laptop with i7-7820 HK 2.9 gigahertz (GHz) x8 32 gigabyte (GB) RAM while the server is an i7-6850 HK 3.6 GHz x12 32 GB RAM personal computer and the peer to peer connection is a 10 megabyte per second (MBps) network.
For the client testing, the author performs the quiz session recorded by mouse recording software GhostMouse in order to replay the exact mouse events for the three mouse tracking types and for different browsers. The testing time are short around a minute due to the limited profiling time of the browsers. The performance which is only the JavaScript total running time of four different browsers are measured:
- Chrome version 80.0.3987.132 that uses V8 Engine.
- Firefox version 74.069 that uses SpiderMonkey Engine.
- Microsoft Edge version 80.0.361.62 that uses Chakra.
To Be Continued