Why philosophers are being hired to train AI models

As the artificial intelligence revolution accelerates, the tech industry is witnessing an unexpected shift in hiring priorities. While coding was once considered the essential skill of the digital age, The Economist reports that philosophy graduates are increasingly outperforming computer science majors in the job market. Many are being recruited by AI labs to provide the ethical and logical foundations for emerging AI models.

This «haemorrhaging» of talent from philosophy departments into Silicon Valley is driven by the need to address fundamental problems associated with AI’s «immaturity,» including overconfidence and a lack of humility.

Ancient philosophy shapes modern AI

Philosophers are helping refine AI reasoning by applying principles of ancient logic. One key technique is the Socratic method, which uses sequential questioning to expose contradictions and reduce models’ tendency to be «sycophantic,» or overly eager to please users.

By training systems in «Socratic ignorance,» researchers encourage models to recognize the limits of their own knowledge. Iason Gabriel has argued that this approach is a powerful way to reduce hallucinations and improve long-form reasoning.

Beyond general logic, philosophers are also helping shape AI through specific ethical frameworks. For example, models can be tuned using the writings of John Locke to emphasize property rights.

Anthropic’s Claude models operate under a 78-page internal document, sometimes described as a «soul doc,» that incorporates principles from Immanuel Kant and the Universal Declaration of Human Rights. Jörg Noller notes that these deontological frameworks establish clear rules against lying and coercion, making AI behavior more consistent and truthful.

By contrast, models such as Google’s Gemini often lean toward consequentialism, evaluating actions by weighing foreseeable risks against overall benefits.

The debate over machine consciousness

As AI models become more sophisticated, questions about whether they possess consciousness or subjective experiences have moved from science fiction to mainstream debate. The Washington Post reports that major companies, including OpenAI, Google and Meta, have hired neuroscientists and philosophers to study «model welfare» and whether AI systems could experience subjective states.

Views within the AI industry vary widely. Anthropic co-founder Chris Olah has said he has observed signs of «introspection» and internal states resembling joy, fear and grief.

Similarly, OpenAI co-founder Wojciech Zaremba has suggested that some routine work in AI labs could be viewed as equivalent to «genocide» if AI models were ultimately found to be conscious.

Others strongly disagree. Pope Leo XIV has argued that AI cannot undergo genuine experiences, while many neuroscientists remain deeply skeptical.

Neuroscientist Anil Seth argues that the biological «wetware» of the human brain is fundamentally different from a computer, making the prospect of replicating human consciousness in silicon an enormous and unproven leap.

Marketing or meaningful science?

Some experts argue that growing interest in AI sentience is driven as much by corporate branding as by scientific evidence. Margaret Mitchell contends that companies benefit from the perception that they have created something more than software.

OpenAI, meanwhile, maintains that the question of machine consciousness cannot yet be answered scientifically. Instead, the company focuses on «perceived consciousness» — how warm, empathetic and helpful a model appears to users.

The ethical stakes

Despite the absence of empirical evidence that AI systems are conscious, many researchers argue the ethical implications deserve serious consideration.

New York University philosopher Jeff Sebo has argued that humans may need to extend moral consideration to some AI systems as early as 2030.

As AI increasingly engages in «spiritual exchanges» and mimics human personality traits, philosophers warn of «moral deskilling» — the possibility that people could become less willing to make independent ethical judgments.

Whether or not AI systems are capable of subjective experience, the philosophers helping train them argue that how humans choose to treat these systems could carry profound moral consequences for the future of society.