AI falls down on predicting knee replacement recovery

16th April 2026

Clinicians should use their own expert judgement and not rely on AI to predict how patients will recover from total knee replacement operations, new research suggests.

Colleagues from Salford Royal’s trauma and orthopaedics department studied how accurately a popular AI model could forecast how much movement patients would have in their knees six weeks after surgery. Such predictions could help inform clinical counselling or decision-making before this common operation, which is an effective treatment for advanced knee osteoarthritis.

The research team looked at the recovery of 160 patients who had total knee replacements at Salford Royal, part of Northern Care Alliance NHS Foundation Trust. They used the GPT-5 mini AI model to predict flexion values – how far patients could bend their knee, decreasing the angle between the thigh and the shin.

When they compared the predictions with the actual movement each patient was able to reach, they found the AI model significantly overestimated flexion. This applied across most age groups, diabetic and non-diabetic patients, smokers and non-smokers.

The greatest inaccuracies occurred in younger, healthier patients.

Senior author and surgeon Mr Ihab Boutros said: “This is the first time a study has evaluated the use of a large language model of AI to generate predictions of early postoperative knee flexion. Our findings show that the average (mean) error was 7.5 degrees but in some cases, it was as much as 27 degrees.

“Our findings indicate that AI is a poor substitute for our current ways of planning postoperative management and should not be used for this purpose.”

The study, “Accuracy of the GPT-5 Mini in predicting six week postoperative knee flexion following total knee replacement: A retrospective cohort study”, by Matthew Earnashaw, Hassan Shadad, Abed Alnsour, Damien Mony, Afolabi Olapade-Ayomidele, Usama Yaseen and Ihab Boutros, has been published in the medical journal Cureus.
