Data, AI, and Privacy: Making sense of protecting it all


 

    As a public school teacher in Texas, I sometimes feel surrounded by data. Data is compiled on students: their testing, their assignments, the programs we use, the certifications they take, basically every little detail of their lives. That data is then analyzed to death, in my opinion, in hopes that we can give students a more personalized education and better understand their learning. 

    My own son, who just finished second grade, came home with his own binder (photo below) with his data at the end of school this May. Yes, even 8-year-olds have school data! It was filled with TPRI and RTI and MAPS testing showing what kind of student he is in reading and math. One slip told me that he will probably score "Masters" or "Meets" on his STAAR test next year. I'm glad to see that my child is doing well. I'm glad to see that his reading is off the charts for his age (takes after his momma), and I'm glad that he is doing so well in math (not his momma). However, after learning about data, AI, and data privacy this week, it's a little unnerving that my 8-year-old has entered the world of Big Data. 


Big Data

    Big Data refers to large volumes of information that arrive quickly from many different sources (Let's Summarize Ed, 2021). The Southern Regional Education Board (SREB) has stated that education leaders need accurate, timely data to make decisions on how to improve student achievement (2015). 

    According to the U.S. Department of Education’s Common Education Data Standards (CEDS), “As the efforts of such a diverse group of data users move forward, the ability to communicate via a common language becomes vital, allowing education stakeholders, including early childhood educators through postsecondary administrators, parents, students, legislators and researchers, to more efficiently work together toward ensuring student success, using consistent and comparable data throughout all education levels and sectors” (SREB, 2015). 

    Big Data comes in three levels:

    1. Microlevel - Interaction of individuals and their learning environment. There are six components to Microlevel Data: Knowledge, Metacognition, Identifying Affective States, Evaluating Student Knowledge, Using Data for Actionable Knowledge, and Clustering Student Profiles. 

    2. Mesolevel - Text data that looks at learners' writing in a digital setting. There are four components: Cognitive, Social, Behavioral, and Affective. 

    3. Macrolevel - Institutional data, such as a student's demographics, admission data, class schedule, enrollment, grades, etc. 

(Let's Summarize Ed, 2021). 

    SREB ranks linking data systems between education and other sectors, from early childhood through college and the workforce, and adopting common data definitions across K-20 state education data systems as the number one issue in educational technology (2018). However, the amount of data that education institutions at all levels collect can be overwhelming if not harnessed and analyzed correctly. 

    Some of the challenges of Big Data include access, analysis, and decision-making. Educational data exist in a wide array of formats across an even wider variety of platforms. In almost all cases, these platforms were developed for other purposes, such as instruction or educational administration, rather than for research (Fischer et al., 2020). Data is also complicated by privacy issues. Parents, educators, and others are rightly concerned about companies’ ability to mine large amounts of sensitive student data and act in ways that are not necessarily focused on bettering individual students’ futures. Fears have been raised that student data that are inappropriately shared or sold could be used to stereotype or profile children, contribute to tailored marketing campaigns, or lead to identity theft (Fischer et al., 2020). 

    Data also requires certain skills to analyze. Few education researchers know key programming languages used for data science, such as Python. Education research graduate programs seldom offer instruction in the data-clustering, -modeling, and prediction techniques used to analyze big data. Even for researchers with such skills, error rates and noise pose additional challenges (Fischer et al., 2020). 

    Finally, even if we successfully access and analyze big data, additional issues arise related to how such data are used. As education researchers increasingly turn to data mining, they will have to confront the tension between explanation and prediction. Increased focus on prediction using data mining and machine learning techniques can ultimately lead to a greater understanding of behavior (Fischer et al., 2020). In using big data, it is critically important to examine and address potential issues of bias, particularly when algorithms associated with big data lead to predictions and/or policy. Recommended safeguards include maximum transparency in the development of algorithms, conducting fundamental rights impact assessments to identify potential biases and abuses in the application of and output from algorithms, checking the quality of data collected and used, and ensuring that the development and operation of the algorithm can be meaningfully explained (Fischer et al., 2020). 

    Some states have made good progress with data systems, but they can only realize the full benefits of evidence-based decision making by:

    • Linking statewide data systems between early childhood, K-12, postsecondary and workforce

    • Adopting common data definitions across K-20 education systems and state agencies

    • Collecting common data elements for better consistency and comparison of data

    • Validating data so that it is consistent across various sources

(SREB, 2018)
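
    To make the bullets on common data definitions and validation a little more concrete, here is a minimal Python sketch of what that work might look like. This is my own illustration, not anything SREB publishes: the field names, the grade codes, the function names, and the sample records are all made up.

```python
# Hypothetical example: two systems describe the same students differently.
# All field names, codes, and records below are invented for illustration.

# Common definition: grade level is an integer, with kindergarten stored as 0.
GRADE_CODES = {"K": 0, "KG": 0, "1st": 1, "2nd": 2, "3rd": 3}


def normalize_grade(value):
    """Convert a system-specific grade label into the common integer form."""
    if isinstance(value, int):
        return value
    text = str(value).strip()
    if text in GRADE_CODES:
        return GRADE_CODES[text]
    return int(text)  # raises ValueError for labels we do not recognize


def find_mismatches(source_a, source_b):
    """Return IDs of students whose normalized grade differs between sources."""
    mismatches = []
    for student_id, grade_a in source_a.items():
        if student_id not in source_b:
            continue  # only compare students present in both systems
        if normalize_grade(grade_a) != normalize_grade(source_b[student_id]):
            mismatches.append(student_id)
    return sorted(mismatches)


# Made-up records keyed by student ID.
testing_system = {"S001": "2nd", "S002": "K", "S003": 3}
enrollment_system = {"S001": 2, "S002": "KG", "S003": "2nd"}

print(find_mismatches(testing_system, enrollment_system))
```

    In this toy data, only student S003 is flagged, because the two systems disagree about that student's grade even after both are converted to the common definition. Real state systems are far more complicated, but the idea is the same: agree on one definition first, then check the sources against each other.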

    Training in big data and interdisciplinary training between education and data science are needed if educational institutions are going to use Big Data properly and to the fullest (Let's Summarize Ed, 2021). 


(YouTube: Microsoft Education)

Artificial Intelligence and Data

    According to Microsoft Education, by 2027 education systems worldwide will be spending nearly 35 billion U.S. dollars, with a growth rate of almost 30% (2022). That's a lot of money, but that is how fast AI technology is moving. 

    Data analytics and AI open the door to powerful and timely insights for education leaders to improve education outcomes while also reducing costs (Microsoft Education, 2022). Teachers can use AI programs like MagicSchool AI, ChatGPT, or Claude.AI to make lesson plans, worksheets, quizzes, and tests, saving time. Students can use Khan Academy to talk to a historical figure or a character in a literary work to get a better understanding. 

    Programs like Perplexity, an AI program that includes source material pulled from the web, can help students with research. 

    With AI, educational institutions can improve equity and inclusion, create more personalized learning environments, and improve teacher effectiveness (Microsoft Education, 2022). Strategies can be implemented to improve learning outcomes, student coursework can be based on potential, interests, performance, and preferred learning modes, and teachers can have professional development that improves learning outcomes and teacher satisfaction (Microsoft Education, 2022). 

    However, AI is still considered questionable and unreliable. Human and systemic biases in generative AI algorithms and large language models' (LLMs) data impact the output of AI tools and consequently can perpetuate inequities when these biases are not removed or addressed (Cornell University, 2024). These models can provide inaccurate, misleading, and unethical information. LLMs can impersonate people and organizations, share intellectual property without attribution, and influence users based on the way information is presented (Cornell University, 2024). 

    The Cornell University website also states that "LLMs store your conversations and can use them as training data," which means any input and any materials uploaded to LLM processors can become part of the model’s training set, and can then be shared in the future without attribution. Resources and intellectual property may thus be used in unexpected ways. As a result, you may wish to only share open information or data that does not need to remain private (2024). 
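
    One way to act on that advice is to strip obvious identifiers out of text before pasting it into an AI tool. The Python sketch below is a rough illustration under my own assumptions: the patterns only catch U.S.-style Social Security numbers, email addresses, and phone numbers written with dashes or dots, and the function name `redact` and the sample note are made up. It will not catch names or other identifying details, so it is no substitute for reading what you share.

```python
import re

# Assumed patterns for illustration only; real data needs a more careful review.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text):
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up note containing fake contact details.
note = "Contact parent at jane.doe@example.com or 555-123-4567. SSN 123-45-6789."
print(redact(note))
```

    The student's name would still be sitting in a real note, which is exactly why a script like this is only a first pass and not a privacy guarantee.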


Data Privacy

    The largest educators’ association in Texas and the U.S. was hit with a data breach that compromised the information of more than 414,000 people in June 2024 (Cardona, 2024). 

    The Association of Texas Professional Educators (ATPE) notified the state of the breach in a data security breach report filed with the Texas Attorney General's Office. The compromised data includes some employees' and members' names, addresses, social security numbers, dates of birth and driver's license numbers (Cardona, 2024). 

    Just a few days later, Texas Speaker of the House Dade Phelan and Representative Giovanni Capriglione touted the Texas Data Privacy and Security Act taking effect as a historic consumer data privacy and digital transparency measure. The Act provides Texans greater control over their personal data online and requires businesses to implement heightened standards on data security and the collection of personal data (Focus Daily News, 2024). 

    Created by House Bill 4 during the 88th Regular Session, the Texas Data Privacy and Security Act grants Texas consumers a definitive set of rights to control their data online, including: 
  • The Right to Know: Consumers have the right to know what personal data is being collected about them, the purposes for which it is being used, and whether it is being sold or shared with third parties
  • The Right to Access: Consumers can request access to the personal data a business has collected about them and obtain a copy of the data
  • The Right to Correct & Delete: Consumers have the right to demand corrections to their personal data and demand the deletion of their personal data held by a business
  • The Right to Opt-in: Businesses may not collect sensitive data without a consumer’s consent or “opt-in.” Sensitive data is any data revealing a protected class, health diagnosis, genetic or biometric data, children’s data or geolocation
  • The Right to Opt-out: Consumers can opt-out of the sale of their personal data, targeted advertising and data profiling when the profile is used to produce legal or similarly significant effects concerning them
  • The Right to Non-Discrimination: Businesses cannot deny goods or services, charge different prices or provide a different quality of service based on a consumer exercising their data privacy rights

    The Act also requires companies to maintain rigorous standards regarding data security and the collection of personal data, as well as requiring a disclosure on a company’s website when it plans to sell sensitive or biometric data.

(Focus Daily News, 2024)

    The SREB puts Data Privacy as number two on their Top 10 Educational Technology issues report. Data security and privacy, though related, are different issues. Data security is about protecting technology systems against unauthorized access and maintaining the integrity of the data within those systems. Data privacy is about the confidentiality rights of the individuals involved, the types of data collected, and how it is used and shared (SREB, 2018). 

    According to a recent study, cyberattacks increased 57% in 2022. And the worst part? Cybercriminals targeted the education sector more than any other industry (Sander, 2023). 

   Unfortunately, the Lone Star State also has a storied history of data security and privacy incidents over the past few years. Here’s a look at some of the most pertinent cases of compromised student data:

  • Dallas ISD: In August 2021, Dallas Independent School District (ISD) suffered a data breach that exposed the personal data of over 800,000 individuals, including staff, students, and parents. Turns out, this case was fortunately just a mistake made by two students who unwittingly accessed their school’s sensitive information. This is just one of many, many examples across the country of why insider DLP risks are as important to manage as the more sensationalized cybersecurity incidents, like ransomware.
  • Mansfield ISD: An August 2022 ransomware attack took Mansfield ISD’s most critical information systems offline. Once the attack happened, the school notified law enforcement and the appropriate authorities, but by that time, the damage had already been done. Luckily, it didn’t appear that any of Mansfield’s 35,000 students lost personal information during the breach.
  • Judson ISD: Other districts weren’t so lucky. After a November 2021 attack, Judson ISD ultimately had to pay ransomware hackers over $547,000 — the largest known ransomware payment made by a school district in the US at the time. The hackers locked administrators out of computer systems for weeks and threatened to expose student data.
  • Lancaster ISD: A June 2021 data breach leaked thousands of pages of sensitive personal information stolen from Lancaster ISD’s record systems. Hackers posted over 500 staff members’ personal data on the dark web for anyone to freely access.
(Sander, 2023)

    With data privacy, educational institutions must deal with federal regulations, such as FERPA, COPPA, and the Protection of Pupil Rights Amendment (PPRA). SREB states that state and local education data governance policies should address five broad areas: transparency, privacy, collection, use and sharing (2018). 

    Balancing student data privacy with useful data analysis requires careful thinking. If legislation and policies are too restrictive, the data collected may not be useful in helping policymakers improve student outcomes. If they are too lax, students’ privacy is put at risk. To prevent unintended consequences, policymakers should consult a diverse group (including teachers, instructional technology staff, data managers, principals/administrators, purchasing agents, and educational technology specialists) when reviewing proposed policies or legislation (SREB, 2018). 


(YouTube: IBM Technology)

The Future


    Big Data, AI, and Data Privacy are topics we all need to continue to learn more about. Whether you are in education or the business world, everyone is going to deal with these three issues. 

    Many of my students, and even my own children, might look into careers as Data Scientists or Data Analysts. These are jobs school districts and higher learning institutions are going to have to create, if they haven't already, to help them understand all the data they collect. 

    School districts and higher learning institutions, as well as the business world, are going to have to come to grips with AI technology. They are going to have to implement policies. 

    Data Privacy will also be an ongoing issue for education and business as technology advances and hackers get more skilled. 

    For myself, I need to know this information in order to keep up with what I am teaching my students and to help my fellow teachers and school district administrators as they try to understand all this new technology. 

    We all need to learn about data privacy for ourselves and our family, in order to protect ourselves online. Students need to be taught about their digital footprint and how to stay safe online. 

    Our world is digital and that information needs to remain in safe hands. 


References

Cornell University. (n.d.). Ethical AI for Teaching and Learning. Retrieved July 13, 2024, from https://teaching.cornell.edu/generative-artificial-intelligence/ethical-ai-teaching-and-learning

Focus Daily News. (2024, July 1). Speaker Dade Phelan, Representative Capriglione Tout Launch of Texas Data Privacy & Security Act. Retrieved July 13, 2024, from https://www.focusdailynews.com/speaker-dade-phelan-representative-capriglione-tout-launch-of-texas-data-privacy-security-act/

Cardona, M. (2024, June 28). Texas teachers group says data breach compromised info for more than 414,000 people. KERA. Retrieved July 13, 2024, from https://dentonrc.com/news/state/texas-teachers-group-says-data-breach-compromised-info-for-more-than-414-000-people/article_7b106a1e-3566-11ef-bcb3-e7e6cbe8b9d9.html

Fischer, C., Pardos, Z. A., Baker, R. S., Williams, J. J., Smyth, P., Yu, R., Slater, S., Baker, R., & Warschauer, M. (2020). Mining Big Data in Education: Affordances and Challenges. Review of Research in Education, 44(1), 130-160. https://doi.org/10.3102/0091732X20903304

[Let's Summarize Ed]. (2021, April 6). Big Data: Shaping the Future of Education [Video]. YouTube. https://youtu.be/8dm6YHVIXFM?si=Qtj84Do4BhvGx37r

[Microsoft Education]. (2022, August 19). Using Data Analytics and AI to Transform Education [Video]. YouTube. https://youtu.be/E0kmtQKRzTc?si=4xdgs4lTW7mgYWVK

Sander, A. (2023, February 2). A Closeup Look At Texas Data Privacy Laws. Security Boulevard. Retrieved July 13, 2024, from https://securityboulevard.com/2023/02/a-closeup-look-at-texas-data-privacy-laws/ 

Southern Regional Education Board. (2018, February 1). 10 Issues in Educational Technology. SREB. Retrieved July 13, 2024, from https://www.sreb.org/publication/10-issues-educational-technology

Southern Regional Education Board. (2015, July 15). Data Definitions: Adopt Common Data Definitions Within State Education Data Systems. SREB. Retrieved July 13, 2024, from https://www.sreb.org/data-definitions
    

Comments

  1. Hi Melissa,

    I really enjoyed reading your blog! I completely agree with your point about being surrounded by data as teachers. It feels like we're always collecting, analyzing, and discussing data about our students. As a 6th-grade math teacher, I’ve experienced the same (data on test scores, homework, class participation, IXL and more). It feels as if we are acting as data scientists ourselves.

    I really connected with the story about your son's data binder. I also have my students keep track of their assessment grades and daily assignments. I made a form for their assessments (attached to their composition book) and each student has their own manila folder kept in my classroom for daily assignments. This helps to keep track of their progress and for parent teacher conferences. I forgot to mention that I also have a data wall in my classroom for students to track class period progress. All this data is great for personalized learning, but the large amounts of information can be overwhelming at times.

    The privacy issues you mentioned are particularly concerning. It’s important to find a balance between gathering data and implementing strong privacy measures. During my first year of teaching, my mentor advised me not to announce students' grades to the whole class. It's a no-no. The Texas Data Privacy and Security Act is a positive start, but we need to stay vigilant to ensure ongoing protection.

    With this being said, how do you think we can best prepare our students to understand and manage their own data privacy as they grow older? It seems to be an essential skill in today's increasingly digital world.

    Melissa, thank you for sharing your insights. Looking forward to your next post.
