AI/ML News

Stay updated with the latest news and articles on artificial intelligence and machine learning

AI can control a computer just like a human

Anthropic has introduced Claude 3.5 Sonnet, a new AI model capable of controlling a computer similarly to a human. The model uses screenshots of the desktop to navigate applications and perform tasks such as clicking, typing, and gathering information.

Stable Diffusion 3.5 opens new doors in digital art

Stable Diffusion 3.5, the latest release from Stability AI, introduces three powerful model variants that deliver enhanced image quality, speed, and accessibility for consumer hardware. The models are free for non-commercial use, and integrate advanced safety features to prevent misuse.

Controversial science: AI and Nobel Prizes

The 2024 Nobel Prizes in physics and chemistry have set a precedent for acknowledging AI’s contributions to science. While some may question the fit between AI and traditional disciplines, others see this as a necessary step toward recognizing the interdisciplinary nature of modern research.

Movie Gen – the future of AI video generation

Meta has unveiled Movie Gen, an AI-powered tool that creates high-definition videos with synchronized sound from simple text prompts. The model provides advanced video creation and editing features, offering users enhanced control over content generation.

Google releases major updates for Gemini models

With price cuts, increased rate limits, and faster output, new Gemini models by Google make advanced AI more accessible for developers worldwide. They boost speed, reduce costs, and enhance performance across a wide range of text, code, and multimodal tasks.

Autonomous landing innovation – a new era for drones

The Indian Patent Office has granted a patent for an innovative landing system for mini-UAVs. This technology enables precise landings in challenging terrain and has potential applications in both military and civilian logistics, including high-altitude deliveries and emergencies.

Will Ideogram 2.0 overtake MidJourney?

The latest text-to-image model from Ideogram AI introduces significant advancements that could challenge the dominance of established players like MidJourney and Leonardo AI. New features are already available, including multiple distinct styles, enhanced realism, and advanced prompting tools.

Collision avoidance system transforms drone navigation

A low-cost, innovative accident avoidance system for drones uses onboard sensors and cameras to autonomously prevent mid-air collisions. This technology is crucial for UAV operations, ensuring safety and efficiency in increasingly crowded airspaces.

Advanced vision system inspired by praying mantis eyes

A new computer vision system significantly reduces energy consumption while providing real-time, realistic spatial awareness. It enhances AI systems' ability to accurately perceive 3D space – crucial for technologies like self-driving cars and UAVs.

MIT's MAIA: an automated agent for interpreting AI models

MAIA can interpret neural networks by conducting experiments and refining its analysis, enhancing understanding of AI models. This agent can identify neuron activities, remove irrelevant features, and detect biases, making AI systems safer and more transparent.

Creating digital elevation models from open data

Nowadays, users can create DEMs with just one click, thanks to radar satellites providing continuous, high-precision data on the Earth's surface and increasingly fast and accessible open-source software. This allows for effective monitoring of terrain changes and natural phenomena.

From barks to words: AI decodes dog vocalizations

AI has learned to decode dog barks, distinguishing playful from aggressive barks and identifying the dog’s age, sex, and breed. Originally trained on human speech, the AI models have achieved impressive accuracy, offering significant advancements in animal care and communication research.

A new era of multimodal AI with GPT-4o

During its Spring Update event, OpenAI presented GPT-4o, an omni model that integrates text, audio, and image processing, allowing it to work faster and more efficiently than ever before.

Llama 3: the latest advances in LLM

Llama 3, Meta AI's latest advancement, boasts unmatched language understanding, enhancing its capacity for complex tasks. With expanded vocabulary and advanced safety features, the model ensures improved performance and versatility.

The art of AI music: exploring Udio and Suno music generators

Explore the forefront of AI music synthesis with Udio and Suno platforms. Music generators enable users to effortlessly generate full-fledged songs across diverse genres while offering customizable features for experimenting with styles and crafting original melodies in seconds.

Efficient fact-checking in LLMs like ChatGPT with SAFE

Google DeepMind has developed a new method for evaluating long-form factuality in large language models: the Search-Augmented Factuality Evaluator (SAFE). The AI fact-checking tool has demonstrated impressive accuracy rates, outperforming human fact-checkers.

Google introduces Gemma – a new open-source model

Drawing inspiration from its predecessor Gemini, Gemma is focused on openness and accessibility, offering versatile models suitable for various devices and frameworks. The model marks a significant step towards democratizing AI while emphasizing its responsible development and transparency.

StableRep: transforming how AI learns

The StableRep model enhances AI training through the utilization of synthetic imagery. By generating diverse images via text prompts, it not only solves data collection challenges but also provides more efficient and cost-effective training alternatives.

Step into the future: a 48-qubit programmable processor

Researchers have joined forces to create a programmable quantum processor that operates with high fault tolerance based on logical qubits. This opens up new prospects for large-scale and reliable quantum computing, capable of solving previously intractable problems.

Does the Turing test no longer work?

The Turing test, once groundbreaking for machine thinking, is now limited by AI's ability to mimic human reactions. A new study introduces a three-step system to determine whether artificial intelligence can reason like a human.

Google’s Gemini AI is going to surpass ChatGPT

Gemini AI, a groundbreaking NLP model, is set to surpass existing benchmarks. With its multimodal prowess, scalability across various domains, and integration potential within Google's ecosystem, Gemini AI represents a significant leap in AI technology.

Does GPT-4 Pass the Turing Test?

In 1950, British scientist Alan Turing proposed a test to determine whether machines can think. To date, no artificial intelligence has successfully passed it. Will ChatGPT be the first?

Tracking every pixel: motion estimation with OmniMotion

The latest motion estimation method can extract long-term motion trajectories for every pixel in a frame, even in the case of fast movements and complex scenes. Learn more about the exciting technology and the future of motion analysis in this article about OmniMotion.

AI can now translate brain activity into text

A groundbreaking AI system uses non-invasive methods and fMRI scanner data to translate thoughts into continuous text. Given its success rates in converting the content of human thoughts, the semantic decoder opens up new possibilities for enhancing communication.

Generative AI Transforms Virtual Characters

Generative AI is revolutionizing the world of gaming by transforming virtual characters and enhancing their conversational skills. The NVIDIA Avatar Cloud Engine (ACE) for Games empowers developers to infuse intelligence into NPCs, reshaping gaming experiences and pushing the boundaries of what is possible.

A memristor-based Bayesian machine

A group of researchers has created a Bayesian machine, an AI approach that performs computations based on Bayes' theorem, using memristors. It is significantly more energy-efficient than existing hardware solutions and could be used for safety-critical applications.

Benefits of Look to Speak

Look to Speak is designed to help those with motor function impairments and speech difficulties to communicate more easily. The app lets people use their eyes to select pre-written phrases and have them spoken out loud.

How sound can model the world

MIT researchers have developed a machine-learning technique that precisely collects and models the underlying acoustics of a location from just a limited number of sound recordings.

New AI Model Creates 3D Objects and Characters for Virtual Game Worlds

Over the last decade, one of the biggest issues in the gaming industry has been the explosive growth in AAA video game production costs. Studios are always on the lookout for technologies that could help bring down the cost of game development. Recent advances in neural image generation models bring some hope that this goal may not be so far away.

Philosophers vs Transformers: Neural net impersonates a famous cognitive scientist

Can computers think? Can AI models be conscious? These and similar questions often come up in discussions of recent AI progress achieved by natural language models such as GPT-3, LaMDA, and other transformers. They nonetheless remain controversial and on the brink of paradox, because they usually carry many hidden assumptions and misconceptions about how the brain works and what thinking means. The only way forward is to reveal these assumptions explicitly and then explore how human information processing could be replicated by machines.

Old photo restoration using neural networks

Filters that improve photo quality no longer surprise anyone. But the restoration of old portraits still leaves much to be desired. Older photos tend to be too blurry, so standard image sharpening methods won't work on them.

No Language Left Behind

Facebook has released the NLLB (No Language Left Behind) project. Its main feature is coverage of more than two hundred languages, including rare languages of African and Australian peoples. In addition, Facebook has applied a new approach to the machine learning model, in which translation is carried out directly from one language to another, without intermediate translation into English.

Photorealistic clothing animation for avatars

Animated avatars have long been a part of our lives. But realistic modeling of clothing animation remains an open challenge.

On the one hand, modern physical modeling techniques can generate realistic clothing geometry at interactive speed. On the other hand, modeling a photorealistic appearance usually requires physical rendering, which is too expensive for interactive applications.

Rediscovering celestial mechanics with machine learning

A group of scientists using machine learning "rediscovered" the law of universal gravitation.

To do this, they trained a "graph neural network" to simulate the dynamics of the Sun, planets and large moons of the solar system from 30 years of observations. Then they used symbolic regression to discover the analytical expression for the force law implicitly learned by the neural network.

On the way to protect the planet: how analytics can support sustainability

The Nature Conservancy reconsidered its marketing strategy through digital transformation with the help of SAS Customer Intelligence 360. As a result, this international environmental nonprofit had its best year ever for membership revenue. This, more than anything else, helps advance its mission of creating a more sustainable future.

Automation Can Replace over 1.4 Million Jobs

“Employers and employees alike need to change their perspective. The future of work is already here and the introduction of technology does not affect work in a uniform way. We must acknowledge where it supplements existing work and invest in a targeted reskilling approach that recognises the new roles technology is creating and ensures human and machine labour complement one another.”

Practices for Creating an AI Serving Engine

AI-powered engines review and analyze information in the knowledge base, handle model deployment, and monitor performance. They introduce a new approach in which apps can take advantage of artificial intelligence to enhance operational effectiveness and help address different business challenges.

Using Artificial Intelligence to Analyze Vehicle Occupants

“Over the last decade, Affectiva has continuously pursued new patents as we have pioneered and advanced the fields of Emotion AI and Human Perception AI. The breadth and depth of our patent portfolio reflect our commitment to pushing the boundaries of computer vision, machine learning, deep learning and AI at the edge; and, is a testament to our leadership in defining the many creative and diverse applications of Human Perception AI that are shaping industries today and in the future.”

How AI Can Protect Your Digital Life

With social media usage increasing in recent years and more of our lives lived online, we need to develop ways to reduce threats, ensure our safety, and remove interactions that cause concern. Artificial intelligence (AI) is an advanced machine learning technology that plays an important role in contemporary life and is essential to how today's social media networks function.

Development of Artificial Intelligence

Computer systems have become highly capable. The earliest machines not only helped people solve complex mathematical problems but also stored large amounts of information. Today, computers operate complex equipment and systems to prevent human error.

Designing Soft Robots Which Can Sense

Traditional rigid robots are incapable of a wide range of tasks. In contrast, soft robots can interact with people more safely or access narrow spaces more easily. However, for a robot to complete its task, it must know the exact position of its body parts. That’s a complex problem for a soft-bodied robot that can undergo a nearly infinite number of deformations.