
Active learning is the future of generative AI: Here’s how to leverage it

During the past six months, we have witnessed some incredible developments in AI. The release of Stable Diffusion forever changed the art world, and ChatGPT shook up the internet with its ability to write songs, mimic research papers, and provide thorough and seemingly intelligent answers to commonly Googled questions.

These advancements in generative AI offer further evidence that we’re on the precipice of an AI revolution.

However, most of these generative AI models are foundation models: high-capacity, unsupervised learning systems that train on vast amounts of data and take millions of dollars of processing power to do it. Currently, only well-funded institutions with access to a massive amount of GPU power are capable of building these models.

The majority of companies developing the application-layer AI that’s driving the widespread adoption of the technology still rely on supervised learning, using large swaths of labeled training data. Despite the impressive feats of foundation models, we’re still in the early days of the AI revolution and numerous bottlenecks are holding back the proliferation of application-layer AI.

Downstream of the well-known data labeling problem exist additional data bottlenecks that will hinder the development of later-stage AI and its deployment to production environments.

These problems are why, despite the early promise and floods of investment, technologies like self-driving cars have been just one year away since 2014.

These exciting proof-of-concept models perform well on benchmarked datasets in research environments, but they struggle to predict accurately when released in the real world. A major problem is that the technology struggles to meet the higher performance threshold required in high-stakes production environments, and fails to hit important benchmarks for robustness, reliability and maintainability.

For instance, these models often can’t handle outliers and edge cases, so self-driving cars mistake reflections of bicycles for bicycles themselves. They aren’t reliable or robust, so a robot barista makes a perfect cappuccino two out of every five times but spills the cup the other three.

As a result, the AI production gap, the gap between “that’s neat” and “that’s useful,” has been much larger and more formidable than ML engineers first anticipated.

Counterintuitively, the best systems also have the most human interaction.

Fortunately, as more and more ML engineers have embraced a data-centric approach to AI development, the implementation of active learning strategies has been on the rise. The most sophisticated companies will leverage this technology to leapfrog the AI production gap and build models capable of running in the wild more quickly.

What is active learning?

Active learning makes training a supervised model an iterative process. The model trains on an initial subset of labeled data from a large dataset. Then, it tries to make predictions on the rest of the unlabeled data based on what it has learned. ML engineers evaluate how certain the model is in its predictions and, by using a variety of acquisition functions, can quantify the performance benefit added by annotating one of the unlabeled samples.
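To make the idea of an acquisition function concrete, here is a minimal sketch of three common uncertainty scores: least confidence, margin and entropy. The function names, the `rank_for_labeling` helper and the example probabilities are illustrative assumptions, not taken from any particular library or from the companies discussed here.

```python
import math

def least_confidence(probs):
    """1 minus the top class probability; higher means less certain."""
    return 1.0 - max(probs)

def margin_uncertainty(probs):
    """1 minus the gap between the two most likely classes; higher means less certain."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def entropy(probs):
    """Shannon entropy in nats; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def rank_for_labeling(pool_probs, score=entropy, k=2):
    """Return the indices of the k unlabeled samples with the highest score."""
    order = sorted(range(len(pool_probs)),
                   key=lambda i: score(pool_probs[i]),
                   reverse=True)
    return order[:k]

# Hypothetical predicted class probabilities for four unlabeled samples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # fairly unsure
    [0.55, 0.44, 0.01],  # torn between two classes
    [0.34, 0.33, 0.33],  # almost no idea
]

picked = rank_for_labeling(pool_probs, score=entropy, k=2)
print(picked)
```

With entropy as the score, the near-uniform sample and the fairly unsure one are sent to annotators first. Swapping in `margin_uncertainty` would instead prioritize samples where the top two classes are close, which is one reason teams often compare several acquisition functions before settling on one.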

By expressing uncertainty in its predictions, the model is deciding for itself what additional data will be most useful for its training. In doing so, it asks annotators to provide more examples of only that specific type of data so that it can train more intensively on that subset during its next round of training. Think of it like quizzing a student to figure out where their knowledge gap is. Once you know what problems they are missing, you can provide them with textbooks, presentations and other materials so that they can target their learning to better understand that particular aspect of the subject.

With active learning, training a model moves from being a linear process to a circular one with a strong feedback loop.
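The feedback loop can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production recipe: the "model" is a single decision threshold on integer readings, the `oracle` function stands in for a human annotator, and the true cutoff of 62 is invented for the example.

```python
def fit_threshold(labeled):
    """Toy model: midpoint between the largest labeled 0 and the smallest labeled 1."""
    neg = max(x for x, y in labeled if y == 0)
    pos = min(x for x, y in labeled if y == 1)
    return (neg + pos) / 2.0

def oracle(x):
    """Stand-in for a human annotator (hypothetical ground truth: x >= 62)."""
    return 1 if x >= 62 else 0

def active_learning_loop(pool, labeled, rounds):
    """Pool-based uncertainty sampling: train, query the least certain sample, repeat."""
    pool = list(pool)
    labeled = list(labeled)
    queries = []
    for _ in range(rounds):
        if not pool:
            break
        threshold = fit_threshold(labeled)                    # 1. train on current labels
        query = min(pool, key=lambda x: abs(x - threshold))   # 2. sample closest to the boundary
        pool.remove(query)
        labeled.append((query, oracle(query)))                # 3. ask the annotator for a label
        queries.append(query)                                 # 4. loop back and retrain
    return fit_threshold(labeled), labeled, queries

pool = range(1, 100)          # 99 unlabeled readings
seed = [(0, 0), (100, 1)]     # two labeled examples to start from
final_threshold, labeled, queries = active_learning_loop(pool, seed, rounds=6)
print(final_threshold, queries)
```

In this sketch, six annotator queries are enough to place the boundary within one unit of the true cutoff, whereas labeling the pool in arbitrary order could take dozens of labels to do the same. Real models and real data are far messier, but the shape of the loop is the same.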

Why sophisticated companies should be ready to leverage active learning

Active learning is fundamental for closing the prototype-production gap and increasing model reliability.

It’s a common mistake to think of AI systems as static pieces of software, but these systems must be constantly learning and evolving. If not, they make the same mistakes repeatedly, or, when they’re released in the wild, they encounter new scenarios, make new mistakes and don’t have an opportunity to learn from them.

Active learning is the future of generative AI: Here’s how to leverage it by Ram Iyer originally published on TechCrunch

https://techcrunch.com/2023/02/28/active-learning-is-the-future-of-generative-ai-heres-how-to-leverage-it/

