Agent-EvalKit offers a comprehensive evaluation infrastructure for AI agents, tracing tool usage and data fidelity. It integrates with popular AI coding assistants and provides detailed improvement recommendations based on code analysis. Effective evaluation requires measuring multiple dimensions of agent quality, with a combination of code-based and large language model evaluators yielding the...
Jinhua Zhao has been appointed head of MIT's Department of Urban Studies and Planning. He is known for shaping global mobility systems and bridging research with policy. His work with leading transportation agencies worldwide and his founding of the MIT Mobility Initiative highlight his influence on future mobility solutions.
The Hertz Foundation awarded fellowships to MIT students Annika Marschner, Alvin Q. Meng, Zachary S. Siegel, and Matthew Wanta, providing five years of financial support for their research. Recipients gain research autonomy and access to a network of over 1,300 fellows, which supports collaboration across science and technology fields.
Amazon Bedrock Data Automation (BDA) simplifies structured data extraction from varied documents with customizable blueprints. Blueprint instruction optimization improves field extraction accuracy without separate model fine-tuning.
AdaBoost.R2 regression predicts a single numeric value by sequentially improving tree regressors. The demo program achieves 82.50% accuracy on training data and 52.50% on test data.
L. L. Thurstone's 1927 paper on random utility models laid the foundation for understanding human preferences. Recent research by MIT experts reveals new insights and potential improvements to these models.
The GeForce NOW summer sale offers up to $70 off 12-month memberships, providing instant access to high-performance cloud gaming on any device. Guild Wars 3 is also coming to GeForce NOW, enhancing the MMORPG experience for gamers.
Google's AI team, together with DeepMind researchers, has released DiffusionGemma for text generation. It uses text diffusion for faster, parallel generation on GPUs. DiffusionGemma is a multimodal 26B MoE model with bidirectional attention that supports 140+ languages.
RANSAC regression detects outliers in training data, with the aim of producing better linear regression results. In a demo on the Diabetes Dataset, however, RANSAC underperformed plain linear regression.
Developers face a challenge in maximizing AI model performance on hardware. Neuron Agentic Development on AWS offers tools to simplify and accelerate kernel development for ML engineers.
Ferveret, founded by Reza Azizian and Matteo Bucci of MIT, is revolutionizing data center cooling with its waterless, energy-efficient system. Their Adaptive Phase Cooling solution improves computational power efficiency by 15% and allows data centers to get 35% more tokens from AI models.
Robotaxis are becoming a reality with driverless rides in cities worldwide, thanks to collaborations like Uber and Autobrains in Munich, Foxconn in Taiwan, and VinFast in Southeast Asia. NVIDIA's Halos Operating System is revolutionizing robotaxi safety with a certified OS foundation and standardized interfaces for AI-driven vehicles.
Large language models like ChatGPT are increasingly used for news consumption. An MIT study shows an AI dependency paradox: users become worse at detecting misinformation when they no longer have AI assistance.
Physical AI is transitioning from research to production, with robots trained in high-fidelity simulations before deployment. Amazon SageMaker AI streamlines compute infrastructure for robot policy reinforcement learning, offering resiliency with SageMaker HyperPod.
New technology streamlines first notice of loss (FNOL) intake for insurance adjusters, reducing repetitive tasks and improving efficiency. The hands-free system combines the Strands Agents SDK and Amazon Bedrock for faster, more accurate claims processing.