What Recent Research Suggests About AI’s Role in Education
Artificial intelligence is quickly permeating every part of education. At the 2026 Association for Education Finance & Policy (AEFP) Conference, several presentations showed why researchers and practitioners are paying attention to AI. A common thread ran through the research: AI may help school leaders work faster, but speed is not the same as validity, trust, or good judgment.
That distinction matters. For busy school and district leaders, the appeal of AI is obvious. AI may help summarize text, classify conversations, organize evidence, recommend resources, and reduce administrative load. But education is human-centered work. The question is not whether AI can do more tasks. The better question is: Which tasks should we trust AI to support, and which tasks still require human interpretation?
AI can help manage the volume of educational data
Education produces enormous amounts of text, conversation, and observational evidence. Interviews, survey comments, coaching notes, classroom transcripts, assessment items, and professional learning records all contain useful information. But analyzing those data by hand is slow and difficult to scale.
That is where AI can help.
In one AEFP session on large language models (LLMs) and educational measurement, researchers shared how AI and natural language processing can support tasks such as coding interview transcripts, annotating tutoring conversations, analyzing assessment items, and measuring alignment between college programs and workforce skills.
For example, a presentation from Brown University and Johns Hopkins University examined whether AI could function as an additional qualitative coder. Importantly, the study focused on a narrow task: applying an existing human-designed codebook to interview transcripts. It did not claim that AI could replace the full qualitative research process, including codebook development, interpretation, or judgment.
That distinction is useful for school leaders. AI may be helpful when the task is clearly defined, repetitive, and reviewable. It may be much riskier when the task requires contextual judgment, trust, or high-stakes interpretation.
Efficiency is not the same as validity
A key caution from the research is that AI can be fast and still be wrong.
Researchers raised concerns about privacy, methodological fit, and the meaning of “quality” in qualitative work. Some approaches to qualitative research value interpretation and disagreement among human coders, while others focus more heavily on agreement. That means leaders should be careful before assuming that high agreement, fast coding, or polished summaries automatically mean AI has captured what matters.
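To make the point about agreement concrete, here is a minimal sketch in Python. It compares raw percent agreement with Cohen's kappa, a standard statistic that corrects agreement for chance. The labels below are invented for illustration and do not come from any study discussed here; the names `human_codes` and `ai_codes` are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same label."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: probability both coders pick the same label at random,
    # given how often each coder uses each label.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented example: 20 interview excerpts, where "barrier" is a rare code.
human_codes = ["barrier"] * 2 + ["other"] * 18
ai_codes = ["other"] * 20  # hypothetical AI coder that never applies the rare code

print("Percent agreement:", percent_agreement(human_codes, ai_codes))
print("Cohen's kappa:", cohens_kappa(human_codes, ai_codes))
```

In this made-up case the two coders agree on 90 percent of excerpts, yet kappa is zero, because the AI coder missed every instance of the code that mattered. High agreement alone would have hidden that.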
Classroom observation is a harder test
The most cautionary example came from a presentation from the University of Virginia. The study compared human ratings and AI-supported ratings of pre-K classroom observations. The findings raised important concerns. As presented, AI ratings appeared to overrate or underrate some teaching dimensions, showed less variation than human ratings, and showed near-perfect correlations across different teaching domains. That last point is especially important: If different domains of teaching quality look almost identical to the model, the model may not be distinguishing the instructional differences educators actually care about.
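Two of these warning signs, low variation and near-perfect correlation across domains, are simple to check in any set of ratings. The sketch below shows one way to do it in Python. All scores are invented for illustration and are not data from the study; the domain names and variable names are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


# Invented ratings for six classrooms on two teaching domains (1-7 scale).
human_domain_a = [5.5, 4.0, 6.0, 3.5, 5.0, 4.5]
human_domain_b = [3.0, 4.5, 2.5, 4.0, 5.5, 3.5]
ai_domain_a = [4.8, 4.6, 5.0, 4.5, 4.9, 4.7]
ai_domain_b = [4.3, 4.1, 4.5, 4.0, 4.4, 4.2]

# Check 1: do the AI ratings vary less than the human ratings?
print("Human spread (domain A):", round(pstdev(human_domain_a), 2))
print("AI spread (domain A):", round(pstdev(ai_domain_a), 2))

# Check 2: does each rater distinguish between the two domains?
print("Human cross-domain correlation:", round(pearson(human_domain_a, human_domain_b), 2))
print("AI cross-domain correlation:", round(pearson(ai_domain_a, ai_domain_b), 2))
```

In this made-up data the AI scores cluster tightly and the two AI domains move in lockstep, while the human ratings spread out and treat the domains as distinct. A pattern like that in real ratings would be a reason to look closer, not proof on its own that a tool is invalid.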
The takeaway is not that AI has no place in observations. It’s that classroom observation is not just a scoring task. It requires validity, context, and trust.
AI may be better at supporting coaching than replacing judgment
A more promising direction is using AI to support coaching and professional learning, not to replace the human work of feedback.
Work connected to Teach For America and the University of Chicago’s Cultivate Student Survey explored how classroom observation data and student survey data could be connected to better understand classroom experiences and support teacher coaching. The project used CLASS observations and Cultivate student survey data to help teachers and coaches examine classroom experiences from multiple perspectives.
The data were analyzed with AI models to help coaches identify key focus areas and generate possible coaching recommendations. The participating researchers also cautioned that the framework should remain theory-based and interpretive, should not replace psychometric evaluation, and still requires validity work.
A simple leadership test for AI tools
Before adopting AI for observation, feedback, coaching, or professional learning, district and school leaders should ask the following questions:
- What are we automating? (Summarizing comments is different from judging teaching quality.)
- What evidence supports the tool? (Accuracy claims mean little without validation.)
- Who reviews the output? (AI still needs human interpretation.)
- How will this affect trust? (Teachers may experience AI as support or surveillance.)
- What happens when AI is wrong? (Mistakes can affect careers, relationships, and school culture.)
The guiding principle is simple: Use AI to reduce burden. Do not use it to outsource judgment.
Final points:
- AI can process data quickly.
- AI can summarize patterns.
- AI can help reduce administrative load.
- AI can support coaching and professional learning.
But AI does not understand a school community the way educators do. It does not build trust with teachers. It does not know when a classroom moment needs context. It does not replace the human relationships at the center of teaching and learning.
So, to automate or not to automate?
The answer is not simply yes or no.
Automate carefully. Validate constantly. Keep humans at the center.
Xintong Li is an experienced researcher specializing in quantitative methods and educational research. He joined the NEE team in 2018 as a senior research analyst. He received his PhD in Statistics, Measurement, and Evaluation in Education from the University of Missouri. He has publications in methodological foundations and applications and is skilled in advanced statistical modeling, programming, and large-scale simulations using high-performance computer clusters. His current research interests include causal inference using cross-sectional data and motivation in education.
The Network for Educator Effectiveness (NEE) is a simple yet powerful comprehensive system for educator evaluation that helps educators grow, students learn, and schools improve. Developed by preK-12 practitioners and experts at the University of Missouri, NEE brings together classroom observation, student feedback, teacher curriculum planning, and professional development as measures of effectiveness in a secure online portal designed to promote educator growth and development.
