January 10, 2025

Big Data in Education: Illuminating Knowledge to Support Students

Author: amyschoenrock

As the technology supporting online education develops, so does the ability of educational leaders to collect data and use it to shape institutional policies. While large data sets can provide us with a wealth of knowledge, they also come with a unique set of challenges. Because learning is influenced by many factors acting in tandem, educational data are rife with confounding variables that must be teased apart to yield reliable information. All research studies should take five main steps: (1) asking clarifying questions that align with the data collected; (2) having the research reviewed by experts and stakeholders at every stage of the project; (3) evaluating all underlying assumptions, how they should be addressed, and their potential impacts; (4) selecting the appropriate analyses without introducing new biases; and (5) ensuring that the highest ethical standards are maintained throughout the project.

In a companion blog, What Can End-of-Course Surveys Do For You?, we share results from our study, which evaluated almost 60,000 online courses to determine whether there was a relationship between end-of-course survey (EoCS) scores and student success measures (grades, course pass rates, and next course progression) as they relate to faculty type (full-time versus part-time). As you can imagine, we faced a number of challenges, including doubts about the validity of survey results due to potential student grade bias, and multiple factors acting at the same time.

Below are some of the key challenges identified by faculty reviewers and how we addressed them.

“What if” Problems and Challenges Addressed

Data Treatment & Analyses

What if the data are non-normal, non-linear, or have outliers that skew the results?

Data were tested against the assumptions of linearity and variance required by the statistical tools used. Several outliers were identified but remained in the analyses after they were verified as authentic. Removing the outliers did not meaningfully change the outcome of our analyses.
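
The outlier check described above can be sketched in a few lines. This is only an illustration, not the procedure used in the study: the 1.5 × IQR flagging rule, the function names, and the scores are all assumptions made for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    The 1.5 * IQR rule is an assumed convention, not the study's stated method.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def mean_with_and_without_outliers(values, k=1.5):
    """Return (mean of all values, mean without flagged values, flagged values)."""
    flagged = iqr_outliers(values, k)
    kept = [v for v in values if v not in flagged]
    return statistics.mean(values), statistics.mean(kept), flagged


# Hypothetical survey averages on a 1-5 scale; the last one is unusually low.
scores = [4.1, 4.3, 3.9, 4.0, 4.2, 4.4, 3.8, 4.1, 1.2]
full_mean, trimmed_mean, flagged = mean_with_and_without_outliers(scores)
print("flagged:", flagged)
print("mean with outliers:", round(full_mean, 2))
print("mean without outliers:", round(trimmed_mean, 2))
```

Comparing the two means (or, in a full study, the two sets of test results) shows how much the flagged points drive the outcome. Flagged points that prove to be authentic can then stay in the analysis.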

What if having a “bad class” skews the data?

We averaged data by faculty member to ensure fairness to both faculty and students. This also helped meet statistical assumptions of distribution and normality.
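
As a rough sketch of that averaging step, course-level scores can be grouped by instructor and reduced to one mean per faculty member. The record layout, the faculty IDs, and the scores below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_faculty(course_records):
    """Collapse course-level rows into one mean score per faculty member."""
    scores_by_faculty = defaultdict(list)
    for faculty_id, score in course_records:
        scores_by_faculty[faculty_id].append(score)
    return {fid: mean(vals) for fid, vals in scores_by_faculty.items()}


# Hypothetical (faculty_id, survey score) pairs; "F2" has one unusually low course.
records = [("F1", 4.2), ("F1", 4.4), ("F2", 4.5), ("F2", 2.1), ("F2", 4.6)]
faculty_means = average_by_faculty(records)
for faculty_id, avg in sorted(faculty_means.items()):
    print(faculty_id, round(avg, 2))
```

With one value per faculty member, a single difficult course is tempered by that instructor's other courses instead of entering the analysis as its own data point.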

What if:

(1) the data set is uneven;

(2) the data have multiple dependent measures taken on the same set of subjects;

(3) there is a strong possibility of correlation among measures; and

(4) we need to identify the dependent variable(s) most involved in the differences among groups?

We used multivariate analysis of variance (MANOVA), a robust tool that tolerates uneven data sets, handles multiple dependent measures taken on the same subjects as well as correlated variables, and identifies the factors that contribute most to differences among groups.

How do we know that the results we see are related to what was measured instead of outside factors?

(1) We reported the multivariate statistic, Wilks’ lambda, which tests whether the groups differ significantly across the combined dependent measures. This gives us confidence that the effect we see reflects a real difference among groups.

(2) We also reported the individual univariate F-statistic because it provides a second level of assurance that a significant finding reflects the variables collected rather than an underlying variate or chance variability.
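
For readers unfamiliar with Wilks’ lambda, the sketch below computes it as the ratio det(W) / det(T), where W is the within-group and T the total sums-of-squares-and-cross-products matrix. It is a simplified illustration, not the study's analysis. It handles exactly two dependent measures, it returns only the statistic (not the significance test built on it), and the groups and values are hypothetical.

```python
def centroid(rows):
    """Mean of each of the two measures across rows."""
    n = len(rows)
    return (sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n)


def sscp(rows, center):
    """Entries (sxx, sxy, syy) of the 2x2 SSCP matrix of rows about center."""
    sxx = sum((x - center[0]) ** 2 for x, _ in rows)
    syy = sum((y - center[1]) ** 2 for _, y in rows)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in rows)
    return sxx, sxy, syy


def wilks_lambda(groups):
    """Wilks' lambda = det(W) / det(T) for two dependent measures."""
    all_rows = [row for rows in groups.values() for row in rows]
    txx, txy, tyy = sscp(all_rows, centroid(all_rows))
    wxx = wxy = wyy = 0.0
    for rows in groups.values():
        gxx, gxy, gyy = sscp(rows, centroid(rows))
        wxx, wxy, wyy = wxx + gxx, wxy + gxy, wyy + gyy
    return (wxx * wyy - wxy ** 2) / (txx * tyy - txy ** 2)


# Hypothetical (survey score, pass rate) pairs, one per faculty member.
groups = {
    "full_time": [(4.5, 0.90), (4.3, 0.92), (4.6, 0.85), (4.4, 0.88)],
    "part_time": [(3.9, 0.80), (4.0, 0.75), (3.8, 0.78), (4.1, 0.82)],
}
lam = wilks_lambda(groups)
print("Wilks' lambda:", round(lam, 3))
```

Values near 1 mean the groups overlap almost completely on the combined measures; values near 0 mean most of the variation lies between groups. A statistics package would convert lambda to an approximate F-statistic to judge significance.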

What if there is a student grade bias?

(1) We used Pearson’s correlation coefficient to determine the strength of correlations between variables, especially the impact of student success measures on student survey results.

(2) Then, we statistically adjusted for any covariates (including grade bias) by using a multivariate analysis of covariance (MANCOVA) in order to reveal factors that influence the student learning experience.
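
The first of these steps can be illustrated with a small, self-contained Pearson calculation. The function name and the grade and survey values are hypothetical. The MANCOVA adjustment in the second step is not shown, since it calls for a statistics package.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-faculty averages: mean course grade (4-point scale)
# and mean end-of-course survey score (5-point scale).
mean_grades = [3.6, 3.1, 2.8, 3.9, 3.3]
survey_scores = [4.4, 4.0, 3.9, 4.6, 4.1]
r = pearson_r(mean_grades, survey_scores)
print("r =", round(r, 2))
```

A coefficient close to +1 or -1 between grades and survey scores would signal that grades need to be treated as a covariate before survey results are compared across groups.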

There are many different ways in which large data sets can be analyzed to identify and predict patterns, including regression modeling, cluster analyses, time series analyses, and simulation algorithms. Analytics aside, what matters most in interpreting results and directing educational policy is to have a student-centered focus. By always striving to place our students’ needs first, we can be secure in the knowledge that our students have a fighting chance for success.


Drs. Zorn-Arnold and Selhorst are presenting their findings from the study described in this post at Accelerate 2019.  Their proposal was designated as the Best-in-Track session for the Research track, and you are invited to attend their session, Mining For Achievement Using Student Performance and End-of-Course Data: A Multi-covariate Analysis of 60,000 Online Courses, on Wednesday, November 20, 2019, 3:45-4:30pm in Southern Hemisphere 1.

The post Big Data in Education: Illuminating Knowledge to Support Students appeared first on OLC.
