Here is a 461-word summary of the key facts and perspectives from the 5 articles about the dispute between Anthropic and the Pentagon over the use of Anthropic's AI models:

The Pentagon and AI company Anthropic are locked in a high-stakes standoff over the military's use of Anthropic's AI technology, particularly its flagship "Claude" model. The dispute centers on Anthropic's refusal to remove safeguards it has placed on the use of its AI for autonomous weapons and mass surveillance. According to the reports, the Pentagon has given Anthropic an ultimatum: remove these restrictions by Friday or face severe consequences, including potentially being barred from doing business with any company that also works with the U.S. military.

Pentagon officials have accused Anthropic CEO Dario Amodei of having a "God-complex" and being a "liar," while Amodei has argued the technology is simply not reliable enough for such sensitive applications. The conflict escalated after a tense exchange in December, in which Amodei reportedly told Pentagon officials that in the event of a missile attack, they should check with Anthropic before using the company's AI to intercept the missiles. Anthropic denies Amodei made such a statement, saying the company has offered to enable its models to support missile defense.

Anthropic maintains that it is willing to work with the military but cannot in good conscience remove the safeguards it has put in place. Amodei argues that using AI for autonomous weapons and mass surveillance poses unacceptable risks to American troops and civilians, and says Anthropic has offered to collaborate on R&D to improve the reliability of these systems, but the Pentagon has not accepted. The dispute has sparked a broader industry-wide debate, with workers at companies like Google and OpenAI also pushing for limits on how militaries can use AI.
OpenAI CEO Sam Altman has told employees the company will seek similar restrictions as it expands military use of its ChatGPT model. The Pentagon, meanwhile, has accused Amodei of wanting to "personally control the US military" and putting "our nation's safety at risk." Under Secretary of Defense Emil Michael said the department will consider labeling Anthropic a "supply chain risk" if it does not comply with the demands. Anthropic has stated that if the Pentagon does take punitive action, the company will work to enable a smooth transition to another AI provider. But the standoff represents a broader clash over the role of AI in national security and the extent to which tech companies should be able to dictate the terms of how their technologies are used by the military.
Here is a 400-word summary of the key points from the 5 articles:

OpenAI CEO Sam Altman recently addressed concerns about the environmental impact of AI, stating that the energy required to train AI models is comparable to the resources needed to "train" a human over 20 years. Altman argued that the fair comparison is the energy used to run an AI model versus a human for a specific task, and claimed that AI may already be more energy-efficient in that regard. However, experts have pushed back on Altman's claims. While it is true that human development requires significant resources, the carbon emissions from contemporary AI models are a major concern that Altman's comments seem to downplay. The articles note that even efficient AI models still require substantial computing power and energy, contributing to climate change in ways that raising a human does not.

The articles also explore the broader landscape of AI development and adoption. OpenAI is projecting massive growth, with revenue forecasts tripling to $62 billion by 2027. But the company is also revising its cash-burn projections upward by over $100 billion through 2030, as the costs of training and operating AI models continue to spiral. Meanwhile, Anthropic, a rival AI company, is targeting breakeven as early as 2028, suggesting it may be more efficient in its operations. The articles note that there is significant overlap between OpenAI's and Anthropic's investor bases, raising questions about potential conflicts of interest.

Beyond the financial aspects, the articles highlight that while AI agents are thriving in software development, their adoption in other industries remains limited. OpenAI's COO acknowledged that enterprises have yet to see widespread AI integration into their core business processes. This "deployment overhang" suggests there are still significant hurdles to overcome before AI becomes truly ubiquitous in the enterprise world.
Overall, the articles paint a complex picture of the AI industry, with concerns around environmental impact, financial sustainability, and real-world adoption. As AI capabilities continue to advance, the industry will need to grapple with these challenges to ensure the responsible development and deployment of transformative technologies.
Distilling the tool-using capabilities of large language models (LLMs) into smaller, more efficient small language models (SLMs) is a key challenge for their practical application. The predominant approach, supervised fine-tuning (SFT), suffers from poor generalization because it trains models to imitate a static set of teacher trajectories rather than learn a robust methodology. While reinforcement learning (RL) offers an alternative, standard RL with sparse rewards fails to effectively guide SLMs, leaving them stuck in inefficient exploration and suboptimal strategies. To address these distinct challenges, we propose MENTOR, a framework that synergistically combines RL with teacher-guided distillation. Instead of simple imitation, MENTOR employs an RL-based process to learn a more generalizable policy through exploration. In addition, to mitigate reward sparsity, it uses a teacher's reference trajectory to construct a dense, composite teacher-guided reward that provides fine-grained guidance. Extensive experiments demonstrate that MENTOR significantly improves the cross-domain generalization and strategic competence of SLMs compared to both SFT and standard sparse-reward RL baselines.
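The core idea of a dense, composite teacher-guided reward can be sketched as follows. This is a hypothetical illustration, not MENTOR's actual reward: the weight `alpha` and the step-level overlap measure between student and teacher tool calls are assumptions standing in for the paper's fine-grained composite terms.

```python
def teacher_guided_reward(student_actions, teacher_actions, task_solved, alpha=0.5):
    """Combine a sparse outcome reward with a dense teacher-similarity term.

    Hypothetical sketch: `student_actions` and `teacher_actions` are lists of
    tool calls; `task_solved` is the usual sparse success signal; `alpha`
    (assumed) weights the dense term derived from the teacher's reference
    trajectory.
    """
    # Sparse component: 1 only if the final task outcome is correct.
    sparse = 1.0 if task_solved else 0.0
    # Dense component: fraction of the teacher's tool calls the student matched.
    overlap = len(set(student_actions) & set(teacher_actions))
    dense = overlap / max(len(teacher_actions), 1)
    return (1 - alpha) * sparse + alpha * dense
```

Even when the sparse signal is zero, the dense term gives the policy gradient something to climb, which is the intuition behind guiding exploration with a reference trajectory.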
Gradient-based data attribution methods, such as influence functions, are critical for understanding the impact of individual training samples without requiring repeated model retraining. However, their scalability is often limited by the high computational and memory costs of per-sample gradient computation. In this work, we propose GraSS, a novel gradient compression algorithm, along with its variant FactGraSS specialized for linear layers, that explicitly leverages the inherent sparsity of per-sample gradients to achieve sub-linear space and time complexity. Extensive experiments demonstrate the effectiveness of our approach, achieving substantial speedups while preserving data-influence fidelity. In particular, FactGraSS achieves up to 165% faster throughput on billion-scale models compared to previous state-of-the-art baselines. Our code is publicly available at https://github.com/TRAIS-Lab/GraSS.
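To make the sparsity argument concrete, here is a minimal sketch of compressing a per-sample gradient by keeping only its largest-magnitude entries, then estimating an influence-style inner product from the compressed form. This is an illustrative simplification, not GraSS's actual algorithm; the top-k rule and the `sparse_dot` helper are assumptions.

```python
import numpy as np

def sparsify_gradient(grad, k):
    """Keep only the k largest-magnitude entries of a per-sample gradient.

    Storing (index, value) pairs for k << d entries cuts per-sample memory
    from O(d) to O(k), which is the kind of saving sparsity-aware
    compression exploits.
    """
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def sparse_dot(idx_a, val_a, idx_b, val_b):
    """Approximate the influence score <g_i, g_j> from compressed gradients."""
    a = dict(zip(idx_a.tolist(), val_a.tolist()))
    return sum(v * a.get(i, 0.0) for i, v in zip(idx_b.tolist(), val_b.tolist()))
```

Because per-sample gradients are often dominated by a few coordinates, the compressed inner product can track the exact one closely while the pairwise-comparison cost drops from O(d) to O(k).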
Large language models (LLMs) have demonstrated promising performance in generating diagnostic conclusions from imaging findings, thereby supporting radiology reporting, trainee education, and quality control. However, systematic guidance on optimizing prompt design across different clinical contexts remains scarce. Moreover, a comprehensive, standardized framework for assessing the trustworthiness of LLM-generated radiology reports has yet to be established. This study aims to enhance the trustworthiness of LLM-generated liver MRI reports by introducing a Multi-Dimensional Credibility Assessment (MDCA) framework and providing guidance on institution-specific prompt optimization. The proposed framework is applied to evaluate and compare several advanced LLMs, including Kimi-K2-Instruct-0905, Qwen3-235B-A22B-Instruct-2507, DeepSeek-V3, and ByteDance-Seed-OSS-36B-Instruct, using the SiliconFlow platform.