Posted by sethbannon 5 days ago
On the other hand, can an AI exam really simulate the conditions necessary for improving at this skill? I think this is unlikely. The students' responses indicate not only a general lack of expertise in oral communication but also a discomfort with this particular environment. While the author is taking steps to improve the environment, I think it is fundamentally too different from actual human-to-human discussion to test a student's ability in oral communication. Even if a student could learn to succeed in this environment, it won't produce much improvement in their real-world ability.
But maybe that's not the goal, and it's simply to test understanding. Well, as other commenters have stated, this seems trivially cheatable. So it neither succeeds at improving one's ability in oral communication nor at testing understanding. Other solutions need to be found.
LLM oral exams can provide assessment in a student's native language. This can be very important in some scenarios!
Unlimited attempts won't work in the presented model. No matter how many cases you have, all will eventually find their way to the various cheating sites.
There is no silver bullet. There's no solution that works for all schools. Strategies that work well for M.I.T. with competitive enrollment and large budgets won't work for a small community college in an agricultural state, with large teaching loads per professor, no TAs, and about 15-25 hours of committee or other non-teaching work. That was my situation.
Teaching five courses and eight sections, 20-30 students per section, 10-20 office hours every week (and often more if the professor cared about the students), leaves little time for grading. In desperation I turned to weekly homework assignments, 4-6 programming projects, and multiple choice exams (containing code and questions about it). Not ideal by any means, just the best I could do.
So I smile now (I'm retired) when I hear about professors with several TAs each, explaining how they do assessment of 36 students at a school with competitive enrollment.
Absolutely the easiest solution would have been to have a written exam on the cases and concepts that we discussed in class. It would take a few hours to create and grade the exam.
But at a university you should experiment and learn. What better class to experiment and learn in than “AI Product Management”? Students were actually intrigued by the idea themselves.
The key goal: we wanted to ensure that the projects students submitted were actually their own work, not “outsourced” (in a general sense) to teammates or to an LLM.
Gemini 3 and NotebookLM with slide generation were released in the middle of the class, and we realized that it is feasible for a student to give a flawless presentation in front of the class without deeply understanding what they are presenting.
We could schedule oral exams during finals week, which would be a major disruption for the students, or during the break, which would violate university rules and ruin students' vacations.
But as I said, we learned that AI-driven interviews are more structured and better than human-driven ones, because humans do get tired, and they do have biases based on who they are interviewing. That's why we decided to experiment with voice AI for running the oral exam.
Part 2 is that when you are ready, an examiner sits with you, looks over your work, and asks questions about it: clarifications, errors to see if you can fix them, fake errors to see if you can defend your solution, and sometimes even variations or unrelated questions if they are on the fence about the grade. Typically that takes 3-10 minutes per person.
Works great to catch cheating between students, textbook copying and such.
Given that people finish asynchronously you don't need that many examiners.
As to it being more stressful for students, I never understood this argument. So is real life... being free from challenge-based stress is for kindergarteners.
In a presentation, you are in control. You decide how you will present the information and what is relevant to the theme. Even if you get questions, they will be related to the matter at hand, which you need to master in order to present.
In oral exams, the pressure is just too great. I doubt it translates to a proper job. When I'm doing my job, I don't need to come up with answers right there on the spot. If I don't remember something, I have time to think it through, or to go and check it out. I think most jobs are like this.
I don't mind the pressure when something goes wrong in the job and needs a quick fix. But being right there, in an oral exam, in front of an antagonistic judge (even if they have good intentions) is not really the way to show knowledge, I think.
(I invented some kind of metric based on a Gaussian centered on a country, ahaha)
One big issue I had is that the system asked for a number in dollars, but if I answered "$2000, 2000, 2000 per agent per month," the response was always the same: "I cannot accept a number, give it in words." After many tries I stopped playing; it wasn't clear what it wanted.
I could see myself using the system, though with another voice, as this one was kind of aggressive. More guidelines would be needed to know exactly how to pass a question or specify numbers.
I don't know my grade, so I don't know how much we can bullshit the system and pass
'This next thing is the best idea ever and you will agree! Recruiters want to sell bananas '
'OK, good, what is the... '
I hope this is caught by the grading system afterward.
The student is supposed to submit a whole conversation with an LLM.
The student is asked to answer a question or solve a problem, and the LLM is there to assist. The LLM is instructed to never reveal the answer.
More interesting is the concept that the whole conversation is available to the instructor for grading. So if the LLM makes a mistake, gives away the solution, or the student prompt-engineers around it, it is all there and the instructor can take the necessary corrective measures.
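For anyone curious, a minimal sketch of what that setup might look like is below. Everything in it (the OpenAI client usage, the model name, the prompt wording, the transcript file) is my own assumption about a plausible implementation, not the actual LLTeacher code:

```python
# Hypothetical sketch: an "assist but never reveal the answer" tutor whose
# full transcript is saved for the instructor to grade. Names, prompts,
# and model choice are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TUTOR_SYSTEM_PROMPT = (
    "You are a tutor helping a student work through the assigned problem. "
    "Ask guiding questions and point out flaws in the student's reasoning, "
    "but NEVER state the final answer or write the solution yourself."
)

def run_tutoring_session(problem: str, transcript_path: str) -> None:
    messages = [
        {"role": "system", "content": TUTOR_SYSTEM_PROMPT},
        {"role": "user", "content": f"Problem statement: {problem}"},
    ]
    while True:
        student_turn = input("student> ")
        if student_turn.strip().lower() == "done":
            break
        messages.append({"role": "user", "content": student_turn})
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
        )
        answer = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": answer})
        print(f"tutor> {answer}")

    # The whole conversation, including any slips by the model or
    # prompt-engineering attempts by the student, goes to the instructor.
    with open(transcript_path, "w") as f:
        json.dump(messages, f, indent=2)
```

The point is just that the transcript, not the final answer, becomes the graded artifact.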
87% of the students quite liked it, and we are looking forward to doubling the number of students using it next quarter.
Overall, we are looking for more instructors to use it. So if you are interested, please get in touch.
More info on: https://llteacher.blogspot.com/
I'm still somewhat concerned about exposing kids to this level of sycophancy, but I guess it will be done with or without using it in education directly.
Students are very simply NOT doing the work that is required to learn.
Before LLMs, homework was a great way to force students to engage with the material. Students did not have any other way to get an answer, so they were forced to study and work out the answers themselves. They could always copy from classmates, but that was viewed quite negatively.
LLMs change this completely. Any kind of homework you could assign in undergraduate classes is now completed in less than a second, for free, by LLMs.
We started to see PERFECT homework submitted by students who could not get a 50% grade in class. Overall, grades went down.
This is a common pattern among all the educators I have been talking to. Not a single one has a different experience.
And I do understand students. They are busy, they may not feel engaged by all their classes, and LLMs are a far too easy way to get homework done and free up some time.
But it is not helping them.
Solutions like this exist to force students to put the right amount of work into their education.
And I would love it if none of this were necessary. But it is.
I come from an engineering school in Europe - we simply did not have homework. We had lectures and one big final exam. Courses in which only 10% of the class would pass were not uncommon.
But today's education, especially in the US, is different.
This is not about forcing students to use LLMs. We are trying to force students to think and do the right thing for themselves.
And I know it sounds very paternalistic - but if you have better ideas, I am open.
- The stuff being covered in high school is indeed pretty useless for most people. Not all, but most, and it is not that irrational for many to actually ignore it.
- The reduction in social mobility is decreasing the motivation for people to work hard for anything in general, as they get disillusioned.
- The assessment mechanisms being easily gamed through cheating doesn't help.
It's probably time to re-evaluate what's taught in school, and what really matters. I'm not that anti-school, but a lot of the homework I've experienced simply did not have to be done in the first place, and LLMs are exposing that reality. Switching to in-person oral/written exams and only viewing written work as supplementary, I think, is a fair solution for the time being.
The key implementation detail to me is that the whole class is sitting in on your exam (not super scalable, sure) so you are literally proving to your friends you aren’t full of shit when doing an exam.
I wonder: with a structure like this, it seems feasible to make the LLM exam itself available ahead of time, in its full authentic form.
They say the topic randomization is happening in code, and that this whole thing costs 42¢ per student. Would there be drawbacks to offering more-or-less unlimited practice runs until the student decides they’re ready for the round that counts?
I guess the extra opportunities might allow an enterprising student to find a way to game the exam, but vulnerabilities are something you’d want to fix anyway…
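For what it's worth, the randomization they mention could be tiny. Here is a sketch under my own assumptions (a case pool, student IDs, and an attempt counter; none of it is from the course's actual code), showing how unlimited practice runs could each draw fresh cases without any stored state:

```python
# Hypothetical per-attempt topic randomization: seed on student ID plus an
# attempt counter, so practice runs and the graded run draw different cases
# from the same pool. The pool contents are made up for illustration.
import hashlib
import random

CASE_POOL = [
    "pricing an AI recruiting product",
    "choosing evaluation metrics for a support chatbot",
    "scoping an MVP for meeting-notes summarization",
    # ... a real pool would be much larger
]

def pick_cases(student_id: str, attempt: int, k: int = 2) -> list[str]:
    # Deterministic per (student, attempt), so a given run can be reproduced
    # later if a grade is disputed.
    seed = hashlib.sha256(f"{student_id}:{attempt}".encode()).hexdigest()
    rng = random.Random(seed)
    return rng.sample(CASE_POOL, k)

# Each practice attempt gets its own draw:
print(pick_cases("student-42", attempt=1))
print(pick_cases("student-42", attempt=2))
```

With a large enough pool, extra practice attempts would mostly expose students to more unseen cases rather than let them memorize the graded one.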
To the extent of wondering what value the human instructors add.