My Master's Thesis: Building an Academic Performance Analysis Tool in 2010
In 2010, I submitted my master’s thesis at Dublin City University: Academic Performance Analysis using the Google Visualization API. I’d been awarded a scholarship for my undergraduate results and completed the MEng remotely while working full-time as an Operations Engineer at The Now Factory.
The thesis is available as a PDF if you want the full 170 pages. Here are the highlights.
The Problem
Universities collect vast amounts of student data — grades, module results, registration figures, progression rates — but in 2010, most of this sat in databases without any interactive way to explore it. If a lecturer wanted to understand how a cohort was performing, or spot students at risk of dropping out, the process was manual: pull data into spreadsheets, build static charts, repeat next semester.
The goal of the thesis was to build a web application that could take this raw academic data and present it through interactive visualisations, making patterns visible that would otherwise stay buried in tables.
The Research
The thesis had three research strands that fed into the final application.
Statistical techniques for academic data. I surveyed methods for analysing and predicting student performance: mean, standard deviation, confidence intervals, correlation coefficients, linear regression, discriminant analysis, and decision trees. Linear regression stood out as the most practical tool — given a student’s prior results, you could plot a regression line to predict future performance and flag students who were falling below their expected trajectory.
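As a minimal sketch of that regression idea — not the thesis code; the class name, data, and tolerance are all illustrative — an ordinary least-squares fit over a student's per-semester GPA, plus a below-trajectory check, looks like this in Java:

```java
import java.util.Arrays;

public class GpaTrend {
    // Fit y = slope * x + intercept by ordinary least squares.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double intercept = (sy - slope * sx) / n;
        return new double[] { slope, intercept };
    }

    // Flag a student whose latest result falls below the line fitted
    // to their earlier semesters by more than a chosen tolerance.
    static boolean belowTrajectory(double[] semesters, double[] gpas, double tolerance) {
        int n = semesters.length - 1;
        double[] line = fit(Arrays.copyOf(semesters, n), Arrays.copyOf(gpas, n));
        double predicted = line[0] * semesters[n] + line[1];
        return gpas[n] < predicted - tolerance;
    }

    public static void main(String[] args) {
        double[] sems = {1, 2, 3, 4};
        double[] gpas = {3.2, 3.3, 3.4, 2.6}; // sharp drop in semester 4
        System.out.println(belowTrajectory(sems, gpas, 0.2)); // prints: true
    }
}
```

The same fit drives both uses described above: the line itself predicts the next result, and the residual (actual minus predicted) flags the at-risk students.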
Visualisation technologies. I evaluated seven charting libraries: Google Charts, Visifire (Silverlight-based), HighCharts, AM Charts, Axiis (Flex-based), RGraph (HTML5 Canvas), and raw HTML5 Canvas. HighCharts won on every practical criterion — it worked offline, required no browser plugins, was cross-browser compatible, and rendered fast. Google Visualisations were kept for specific use cases where they excelled, like the motion chart and annotated timeline.
Student performance literature. I reviewed dozens of studies on academic performance analysis: the relationship between entry qualifications and degree outcomes, the impact of residence vs commuter students, gender-based performance patterns, online vs on-campus comparisons, dropout prediction, and the effect of remedial programmes. One finding that stuck with me was how a structured remedial programme for first-year engineering students could measurably reduce failure rates — something DCU could have implemented directly.
What I Built
The application was a three-tier J2EE web application: a MySQL database at the data tier, an Apache Tomcat server running Java Servlets at the application tier, and a browser-based frontend at the client tier.
It had three analysis modules:
Student Analysis — Search for any student by ID and get a full performance dashboard. This included a Google Motion Chart showing grade progression over time (viewable as bubbles, bars, or lines), HighCharts scatter plots with regression lines for predicting future performance, grade distribution charts, a parallel coordinates chart for comparing multiple metrics simultaneously, and heat-colour tables that visually encoded performance by grade.
Module Analysis — View any module’s current semester results alongside historical trends. A Drastic tree map showed the distribution of grades across the cohort at a glance, with the area of each cell proportional to the number of students at that grade level.
Course Analysis — University-wide views of registration trends, enrolment patterns, and demographic breakdowns. A Google geographic map plotted where students came from, and a motion chart animated enrolment data over time.
The frontend was built with jQuery and jQuery UI — accordion menus, tabbed interfaces, and modal dialogs that made the eight-page application feel like a much richer tool. Each visualisation lived in its own JSP file and received data as JSON strings generated by Java Servlets, which themselves called MySQL stored procedures.
The Technical Decisions
HighCharts over Google Charts. Google’s visualisations required an internet connection (they loaded JavaScript from google.com/jsapi at runtime) and some relied on Flash. HighCharts rendered natively in the browser with no plugins, worked offline, and had better interactivity — legend toggling, zoom-by-drag, tooltips, and multi-axis support. For a university tool that might run on campus networks with restricted internet, offline capability was essential.
J2EE over PHP. I could have built it faster in PHP, but chose Java specifically for extensibility and security. Java Servlets kept business logic on the server, stored procedures hid the database structure, and the WAR file packaging meant the application could slot into existing university infrastructure. Looking back, I’d say this was the right instinct — choosing the harder path because the use case demanded it.
Stored procedures for all database queries. Every query went through a stored procedure rather than inline SQL. This wasn’t just for performance — it was a deliberate security measure. Stored procedures validate input and hide the database schema from the application tier, reducing the attack surface for SQL injection. The security awareness was academic at the time, but it became a real concern when I later worked in fintech and crypto.
jQuery UI for interaction. In 2010, jQuery UI was the most advanced option for building interactive interfaces without a full framework. The accordion/tabs/dialog pattern gave users a way to navigate through dense analytical views without page reloads. It’s the same pattern that component composition gives you in React today — just with a lot more manual wiring.
What Worked
Linear regression for performance prediction. The scatter plots with regression lines were the most immediately useful output. You could see at a glance which students were performing above or below their predicted trajectory. A student whose GPA was consistently below the regression line was a candidate for early intervention — something the university could act on rather than discovering at end-of-year exam boards.
The motion chart. Google’s motion chart was genuinely ahead of its time. Animating student grade data over semesters — watching a cohort spread out as some students improved and others fell behind — made patterns visceral in a way that static charts couldn’t match. It was inspired by Hans Rosling’s Gapminder project, which used the same technology to visualise global health and development data.
The heat-colour tables. A jQuery plugin that colour-coded table cells by value — green for high grades, red for low ones. Applied to a module’s grade distribution, you could immediately see whether a cohort was clustered around the pass mark or spread across the full range. It was a four-line jQuery integration that added more analytical value than some of the more complex visualisations.
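The colour encoding itself is nothing more than a linear interpolation from red to green across the grade range. A minimal sketch of the idea — a hypothetical helper, not the plugin's actual code — in the application's own language:

```java
public class HeatColour {
    // Interpolate from red (low) to green (high) for a 0-100 grade,
    // mirroring what the heat-colour table plugin did per cell.
    static String colourFor(int grade) {
        int clamped = Math.max(0, Math.min(100, grade));
        int red = 255 * (100 - clamped) / 100;
        int green = 255 * clamped / 100;
        return String.format("#%02x%02x00", red, green);
    }

    public static void main(String[] args) {
        System.out.println(colourFor(0));   // prints: #ff0000 (fail: red)
        System.out.println(colourFor(100)); // prints: #00ff00 (top grade: green)
        System.out.println(colourFor(50));  // prints: #7f7f00 (mid-range: olive)
    }
}
```

Applied cell by cell to a results table, a function like this is all it takes to turn a wall of numbers into something you can read at a glance — which is why such a small integration punched above its weight.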
What I’d Do Differently
Everything about the architecture, if I’m honest. The frontend would be a React SPA with an API backend. The charting would use D3 or Recharts. The data layer would be a proper REST or GraphQL API rather than Servlets generating JSON strings. The jQuery UI dialogs and accordions would be React components with proper state management.
But the core data model — centralised database, stored procedures for data access, JSON as the interchange format, charts as a presentation layer over structured data — that pattern is unchanged. It’s the same approach I used at The Now Factory, at Kraken, and at Trust Machines. The tools evolve. The architecture endures.
The biggest gap was testing. The thesis includes a short chapter on “Testing and Strengthening” that covers cross-browser testing and a brief mention of security concerns, but there were no automated tests. No unit tests, no integration tests, no E2E coverage. My subsequent career obsession with testing — Cypress at Kraken, Cucumber at BAML, Playwright at Trust Machines — exists partly because I know what it feels like to ship something without a safety net.
The Through-Line
I wrote this thesis at 25, working full-time and studying remotely. The technologies are all obsolete now — jQuery UI, Java Servlets, Google’s Flash-dependent motion charts, Silverlight. But the problems I was trying to solve are the same ones I work on today:
- Take complex data and make it visually accessible
- Build tools that replace manual, spreadsheet-driven processes
- Choose the technology that fits the use case, not the one that’s most fashionable
- Structure applications so they can be maintained and extended by others
Fifteen years later, I’m building Bitcoin wallets and DeFi portfolio views instead of academic dashboards. The frameworks changed. The data changed. The instinct to build interactive tools over raw data hasn’t changed at all.