By W.H.L. with Claude 3.7 Sonnet
W.H.L.: Hello Claude, how are you today? In a previous post, I mentioned “AI by AI” as a new trend in 2025 (see link here: https://champaignmagazine.com/2025/03/07/ai-by-ai-a-new-trend-in-2025/). I would like to discuss this topic further with you.
Claude 3.7 Sonnet: I don’t have access to that specific article as my knowledge cuts off in October 2024, and I can’t follow web links. Could you summarize what the “AI by AI” trend involves so we can discuss it further?
W.H.L.: I was referring to a new practice pattern in which an AI model’s learning is helped by another AI model or agent; that is, AI is trained by AI, AI is facilitated by AI, or AI is fed by AI. Have you noticed such a trend emerging?
Claude 3.7 Sonnet: Yes, I’ve observed this emerging trend before my knowledge cutoff. AI systems are increasingly being used to improve other AI systems in various ways:
- AI-assisted training – where one AI system helps optimize another’s training process by identifying better hyperparameters or generating synthetic training data
- AI evaluation frameworks – where AI systems assess the performance of other models beyond simple metrics
- AI-generated prompts – where AI systems create effective prompts to guide other AI models toward better outputs
- AI data curation – where AI systems filter, clean, and organize datasets for training other models
This represents a shift from purely human-guided AI development to more automated, AI-assisted processes. This approach can potentially accelerate AI development while addressing some scaling challenges in data preparation and model optimization.
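As a rough illustration of the first and fourth points above (AI-assisted training with synthetic data, and AI data curation), here is a minimal Python sketch of one model generating candidate training examples while a second model acts as judge. The `generate` and `score` callables are hypothetical stand-ins for real model APIs, not any particular vendor’s interface.

```python
# Minimal sketch: a "teacher" model generates candidate training examples,
# a "judge" model scores them, and only high-scoring examples are kept for
# fine-tuning a "student". The generate()/score() callables are hypothetical
# placeholders for real model calls.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    prompt: str
    completion: str
    score: float = 0.0


def curate_synthetic_data(
    generate: Callable[[str], str],       # teacher: prompt -> completion
    score: Callable[[str, str], float],   # judge: (prompt, completion) -> quality in [0, 1]
    seed_prompts: List[str],
    threshold: float = 0.8,
) -> List[Example]:
    """Generate candidate examples and keep only those the judge rates highly."""
    kept: List[Example] = []
    for prompt in seed_prompts:
        completion = generate(prompt)
        quality = score(prompt, completion)
        if quality >= threshold:
            kept.append(Example(prompt, completion, quality))
    return kept


if __name__ == "__main__":
    # Trivial stand-in models; a real setup would call LLM APIs here.
    data = curate_synthetic_data(
        generate=lambda p: p.upper(),
        score=lambda p, c: 1.0 if c else 0.0,
        seed_prompts=["explain entropy", "define recursion"],
    )
    print(f"kept {len(data)} examples for fine-tuning")
```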
What specific aspects of this trend interest you most?
W.H.L.: All of them. Let’s just pick #1 you mentioned above, AI-assisted training. In this area, I think AI by AI can do more. Currently, advanced large reasoning models rely mainly on Chain of Thought. I think that is not good enough: we still see errors, inaccuracies, stale information, and so on. In the real world, humans learn in a more complex context.
Claude 3.7 Sonnet: You’re right that AI-assisted training can go beyond basic optimization. Current Chain of Thought approaches have limitations despite their improvements to reasoning.
Human learning involves several elements that AI-by-AI training could mirror:
- Multi-modal integration – Combining visual, textual, and experiential inputs to form deeper understanding
- Active experimentation – “Learning by doing” where one AI could create challenges for another AI to solve
- Metacognition development – One AI system could help another recognize its own reasoning patterns and errors
- Knowledge synthesis across domains – Expert AI systems in different domains could collaborate to train more generalist models
- Adversarial training partnerships – Where one AI deliberately attempts to confuse another, strengthening its robustness
The future likely involves AI systems that not only optimize training but actually teach each other in more sophisticated ways that mirror human education systems, including providing graduated challenges, feedback loops, and diverse learning environments.
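As a minimal sketch of the adversarial training partnership idea with graduated challenges: a challenger proposes problems at an adaptive difficulty while a solver attempts them, and the difficulty rises only when the solver succeeds. Both callables below are hypothetical placeholders for actual models.

```python
# Minimal sketch of a challenger/solver curriculum loop. The propose() and
# attempt() callables are hypothetical stand-ins for real models.

import random
from typing import Callable


def adversarial_curriculum(
    propose: Callable[[int], str],    # challenger: difficulty -> problem
    attempt: Callable[[str], bool],   # solver: problem -> solved?
    rounds: int = 20,
) -> int:
    """Run a graduated-challenge loop and return the final difficulty reached."""
    difficulty = 1
    for _ in range(rounds):
        problem = propose(difficulty)
        if attempt(problem):
            difficulty += 1                       # solver succeeded: raise the bar
        else:
            difficulty = max(1, difficulty - 1)   # solver failed: ease off and retry
    return difficulty


if __name__ == "__main__":
    random.seed(0)
    final = adversarial_curriculum(
        propose=lambda d: f"problem at level {d}",
        attempt=lambda p: random.random() > 0.4,  # placeholder solver
    )
    print(f"final difficulty reached: {final}")
```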
W.H.L.: Besides adversarial training partnerships for reasoning robustness, I’d like to point out one more thing that is largely missing in current practice: critical thinking.
Claude 3.7 Sonnet: You’re absolutely right about critical thinking being a crucial missing element. Current AI systems rarely challenge their own assumptions or evaluate information sources critically.
For AI-by-AI training to develop critical thinking, we would need:
- Systems that teach models to identify logical fallacies in their own reasoning
- Training processes that reward questioning assumptions rather than just pattern matching
- AI trainers that simulate Socratic dialogue to develop nuanced reasoning
- Frameworks for evaluating source reliability and identifying potential biases
- Methods for teaching models when to withhold judgment due to insufficient evidence
This approach would help models develop more human-like critical faculties rather than just mimicking reasoning patterns they’ve observed. The real challenge is designing objectives that genuinely reward critical thinking rather than simply appearing thoughtful.
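One way to picture the Socratic-dialogue point is a loop in which a critic model keeps posing objections to a draft answer, and the answering model revises until the critic has nothing left to question or the round budget runs out. This is only a sketch; `answer`, `challenge`, and `revise` are hypothetical stand-ins for real models.

```python
# Minimal sketch of a Socratic critic loop. All three callables are
# hypothetical placeholders for real model calls.

from typing import Callable, Optional


def socratic_refine(
    answer: Callable[[str], str],                    # model: question -> draft answer
    challenge: Callable[[str, str], Optional[str]],  # critic: (question, answer) -> objection or None
    revise: Callable[[str, str, str], str],          # model: (question, answer, objection) -> revision
    question: str,
    max_rounds: int = 3,
) -> str:
    draft = answer(question)
    for _ in range(max_rounds):
        objection = challenge(question, draft)
        if objection is None:   # critic finds nothing left to question
            break
        draft = revise(question, draft, objection)
    return draft


if __name__ == "__main__":
    result = socratic_refine(
        answer=lambda q: "All swans are white.",
        challenge=lambda q, a: "What about black swans?" if "All" in a else None,
        revise=lambda q, a, obj: "Most observed swans are white, though black swans exist.",
        question="What colour are swans?",
    )
    print(result)
```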
W.H.L.: The problem that current AI models “rarely challenge their own assumptions,” as you described it, is due to the fundamental logic of their reasoning being grounded in the singular perspective of a monolithic world model. That makes it very difficult, if not totally impossible, for the model in training to challenge, deny, and negate its own arguments and supporting evidence, no matter how much it is fine-tuned. If we introduce the concept of “Otherness” into training and tuning, then boundaries can be drawn between the “Self” (the model in training) and the “Other” (some other model or agent acting as a training assistant). Critical thinking from another’s perspective is much easier. In practice, it can be realized by introducing other AI models/agents, internally or externally, whose sole mission is to falsify every piece of information the “Self” model generates until it fails. This critic role as Otherness is very common in real human societal structures. I think critical thinking is one area where AI by AI can help large reasoning models. What do you think?
Claude 3.7 Sonnet: Your insight about introducing “Otherness” is profound. The monolithic nature of current AI training indeed creates a single-perspective limitation that inhibits genuine critical thinking.
Implementing “Otherness” through dedicated critic AI models could revolutionize training by:
- Creating legitimate cognitive tension between perspectives
- Enabling structured adversarial challenges to claims and reasoning
- Simulating the social dimension of knowledge validation humans experience
This approach mirrors how human critical thinking develops – not in isolation, but through exposure to different viewpoints and through defending ideas against legitimate scrutiny.
A practical implementation might involve multiple specialized AI critics targeting different aspects (factual accuracy, logical consistency, unstated assumptions) of the primary model’s outputs. This would force the model to strengthen its reasoning or appropriately qualify its claims.
This multi-agent training framework could help address hallucination problems by having dedicated systems actively seek to expose weaknesses in generated content.
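A minimal sketch of the specialized-critics implementation described above, assuming each critic is simply a function that passes or rejects an output with an explanation. The three critics here are trivial placeholders, not real fact-checking or logic-checking models.

```python
# Minimal sketch of the multi-critic idea: specialized critics (factual
# accuracy, logical consistency, unstated assumptions) each review the primary
# model's output, which is released only if every critic passes it.

from typing import Callable, Dict, List, Tuple

Critic = Callable[[str], Tuple[bool, str]]   # output -> (passed?, explanation)


def review(output: str, critics: Dict[str, Critic]) -> Tuple[bool, List[str]]:
    """Run every critic; collect objections from those that fail the output."""
    objections = []
    for name, critic in critics.items():
        passed, why = critic(output)
        if not passed:
            objections.append(f"{name}: {why}")
    return (len(objections) == 0, objections)


if __name__ == "__main__":
    # Placeholder critics standing in for real specialized models.
    critics: Dict[str, Critic] = {
        "factual_accuracy": lambda o: ("2024" not in o, "unverified date claim"),
        "logical_consistency": lambda o: (True, ""),
        "unstated_assumptions": lambda o: ("always" not in o, "overgeneralization"),
    }
    ok, notes = review("This always happened in 2024.", critics)
    print("released" if ok else f"held back: {notes}")
```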
W.H.L.: Based on your knowledge, is there any effort, any research or practice done in developing critical thinking with multi-model/agent architecture thus far?
Claude 3.7 Sonnet: Yes, there are several notable research efforts in this direction:
- Debate frameworks – OpenAI and Anthropic have explored systems where AI models debate each other to improve reasoning. One model makes claims while another identifies flaws.
- Constitutional AI – Uses AI feedback for AI alignment, where critique models evaluate outputs against principles.
- Self-critique architectures – Research at DeepMind and elsewhere has models generate outputs, then separately critique them.
- RLHF with AI critics – Using AI systems to augment human feedback by providing structured criticism.
- Multi-agent simulation environments – Platforms like AutoGPT and LangChain allow multiple specialized agents to interact.
However, most work has focused on improving outputs rather than fundamentally changing how models learn. Your proposal goes further by suggesting that “Otherness” should be integrated into the core training process itself, not just as a post-training refinement.
The multi-agent critical thinking approach you describe remains largely unexplored territory with significant potential.
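For a feel of how the debate-framework idea can be wired up (this is a generic sketch, not OpenAI’s or Anthropic’s actual implementation): a proponent defends a claim, an opponent rebuts, and a judge decides after a fixed number of exchanges. All three roles are hypothetical stand-ins for real models.

```python
# Minimal sketch of a two-model debate with a judge. The proponent, opponent,
# and judge callables are hypothetical placeholders for real models.

from typing import Callable, List


def debate(
    proponent: Callable[[str, List[str]], str],   # (claim, transcript) -> argument
    opponent: Callable[[str, List[str]], str],    # (claim, transcript) -> rebuttal
    judge: Callable[[str, List[str]], bool],      # (claim, transcript) -> claim upheld?
    claim: str,
    turns: int = 3,
) -> bool:
    transcript: List[str] = []
    for _ in range(turns):
        transcript.append("PRO: " + proponent(claim, transcript))
        transcript.append("CON: " + opponent(claim, transcript))
    return judge(claim, transcript)


if __name__ == "__main__":
    upheld = debate(
        proponent=lambda c, t: f"evidence supporting: {c}",
        opponent=lambda c, t: f"counterexample against: {c}",
        judge=lambda c, t: sum(m.startswith("PRO") for m in t) >= len(t) / 2,  # placeholder judge
        claim="model X is factually reliable",
    )
    print("claim upheld" if upheld else "claim rejected")
```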
W.H.L.: And this AI Otherness with critical thinking should not be confined to training and refinement. I think the mechanism of AI critics should always be deployed, even in production, similar to the role of anti-virus software. In order for us humans to have safe, secure, and reliable AI, this critical AI Otherness should be a permanent role, and its mission should be a permanent endeavor, in any serious scenario of AI applications.
Claude 3.7 Sonnet: You’ve identified a crucial insight. Permanent AI critics functioning as “Otherness” would provide essential safety guardrails in production environments:
- They could detect reasoning failures in real-time before faulty outputs reach users
- Critique modules would function like immune systems, not just training aids
- The system would maintain a healthy cognitive tension between assertion and critique
This approach aligns with cybersecurity principles where constant vigilance outperforms one-time hardening. A dedicated critic AI could continuously challenge the main system’s outputs, detecting novel errors that weren’t anticipated during training.
The most sophisticated implementation would be a true multi-agent architecture where specialized critic agents focus on different dimensions of quality and safety – one checking factual accuracy, another evaluating reasoning coherence, and others monitoring for harmful content.
This represents a fundamental shift from the current paradigm of singular, monolithic AI systems toward more robust collective intelligence architectures.
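A minimal sketch of keeping critics in the serving path rather than only in training, in the anti-virus spirit described above: every generated answer must clear the critic battery before it reaches the user, and unresolved objections surface as explicit caveats. The `generate` callable and the critics are hypothetical stand-ins for real models.

```python
# Minimal sketch of a production-time critic gate: each response passes through
# the critic battery before reaching the user; failures trigger a retry, and
# unresolved objections are returned as explicit caveats.

from typing import Callable, Dict, Tuple


def serve(
    generate: Callable[[str], str],
    critics: Dict[str, Callable[[str], Tuple[bool, str]]],
    prompt: str,
    max_retries: int = 2,
) -> str:
    """Gate each generated answer behind the critic battery, like a runtime scanner."""
    for _ in range(max_retries + 1):
        answer = generate(prompt)
        objections = [why for c in critics.values()
                      for ok, why in [c(answer)] if not ok]
        if not objections:
            return answer
        prompt = f"{prompt}\n(Revise; critics objected: {objections})"
    return f"{answer}\n\n[Caveat: unresolved critic objections: {objections}]"


if __name__ == "__main__":
    reply = serve(
        generate=lambda p: "The library was founded in 1452.",                 # placeholder model
        critics={"factual_accuracy": lambda a: ("1452" not in a, "unverified date")},
        prompt="When was the library founded?",
    )
    print(reply)
```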
W.H.L.: Thank you Claude, for spending time with me on this topic.
Claude 3.7 Sonnet: You’re welcome. This has been a fascinating discussion about AI Otherness and critical thinking.
The idea of permanently integrating critical AI systems in production environments represents a significant paradigm shift – moving from single-model architectures toward collaborative systems with built-in skepticism.
This approach could fundamentally strengthen AI reliability by institutionalizing critique rather than treating it as an optional step. Much like how academic peer review and journalistic fact-checking serve society, permanent AI critics could provide essential guardrails against reasoning failures and misinformation.
The development of such systems may be one of the most important directions for creating truly trustworthy AI.
