Unlocking the code: AI’s data dilemma

With the capabilities of artificial intelligence (AI) evolving at such a startling pace, one of the most pressing challenges faced by data teams and engineers is how to handle the mass of unstructured and heterogeneous data sources.

Unlike structured data, which fits neatly into tables and databases, unstructured data comes in a vast array of formats, including video, text and images. These formats all have their own intricacies, and the heterogeneity of these data sources can add further levels of complexity.

With this in mind, can teams find a way to optimize the collection and analysis of their data to maximize the impact of AI on their business? Given how activity is trending, agent-based systems and agent-to-agent communication appear to be the golden idea that will take the AI movement to the next level.

Unstructured data’s historical challenge

Historically, unstructured data, like audio, video, and social media interactions, has posed a substantial challenge for companies trying to interpret and convert it into formats that are structured appropriately for analysis and AI applications. For many organizations, the sheer complexity and cost of processing this unstructured data has meant that until recently, it remained heavily underutilized.

As a result, despite unstructured data comprising the majority of available data and possessing significant unrealized potential, organizations have tended to turn towards structured data, such as Excel files and Search Engine Optimization (SEO) tags.

In recent years, however, technological developments in AI, and in generative AI in particular, have transformed the way in which unstructured data can be interpreted and extracted.

For example, major cloud companies, including Microsoft and Google, have expanded their cloud services to support creating “data lakes” from unstructured data. Microsoft’s Azure AI now uses a mixture of text analysis, optical character recognition, voice recognition and machine vision to interpret an unstructured data set that could include text or images. Thanks to this advancement, businesses can now access this richer resource of data and finally unlock its value.

What are the current issues with unstructured data?

Organizations can now tap into a wealth of rich information that was previously inaccessible.

However, this is still not without its challenges. For example, navigating the varying levels of content quality, scope and detail of this unstructured data can pose a significant hurdle. With unstructured data, there tends to be much more irrelevant noise. If there is too much of this, then it can be challenging for even AI to accurately identify answers while sifting through information.

Additionally, the lack of regulation around how unstructured data is created can limit its usefulness. Whilst larger datasets generally offer greater consistency, adapting them for use by AI, so that organizations can leverage them more effectively, remains a challenge.

Using unstructured data effectively typically involves incorporating it into an organization's existing data framework. This integration requires a comprehensive understanding of the data’s properties, connections, and possible uses. A big challenge for many unstructured data projects is simply defining a clear goal so that models can be trained accurately.

Many organizations still struggle to leverage these existing data assets to generate business value.

So, whilst the previous issue of unlocking and obtaining the data has been largely solved, being able to hypothesize its potential value and applications remains a significant obstacle.

What is expected next for the GenAI movement?

In the future, we should expect human involvement in data sourcing and interpretation to decrease. We are instead likely to see an increase in agent-based systems, along with agent-to-agent communication, which minimize the need for human intervention in data handling. The boom in generative AI has paved the way for specialized agents, which include:

  • “Engineering agents” for code generation
  • “Data generation agents” for creating synthetic data for testing
  • “Code testing agents” for validating and testing code
  • “Documentation agents” for generating documentation for various aspects such as code, use cases, and processes

There is no question that a system where specialized AI agents interact with one another can accelerate development and make it more accurate and more consistent.
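As a rough illustration of how the four kinds of agents listed above might hand work to one another, here is a minimal sketch in Python. It is not any vendor's product or API: the `Message` envelope, the agent function names, and `run_pipeline` are all hypothetical, and each agent is a hard-coded stub where a real system would call a generative model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """Hypothetical envelope handed from one agent to the next."""
    requirement: str
    artifacts: Dict[str, object] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


# An agent is modelled as a function that reads and extends the message.
Agent = Callable[[Message], Message]


def engineering_agent(msg: Message) -> Message:
    # Stub for code generation; a real agent would call a model here.
    msg.artifacts["code"] = "def add(a, b):\n    return a + b\n"
    msg.log.append("engineering: generated code")
    return msg


def data_generation_agent(msg: Message) -> Message:
    # Stub for synthetic test data: (arguments, expected result) pairs.
    msg.artifacts["cases"] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    msg.log.append("data: generated 3 synthetic cases")
    return msg


def code_testing_agent(msg: Message) -> Message:
    # Runs the generated code against the synthetic cases.
    namespace: Dict[str, object] = {}
    exec(msg.artifacts["code"], namespace)  # tolerable only for this trusted stub
    add = namespace["add"]
    passed = all(add(*args) == expected
                 for args, expected in msg.artifacts["cases"])
    msg.artifacts["tests_passed"] = passed
    msg.log.append(f"testing: passed={passed}")
    return msg


def documentation_agent(msg: Message) -> Message:
    # Summarizes the requirement and whether the tests passed.
    status = "verified" if msg.artifacts["tests_passed"] else "unverified"
    msg.artifacts["docs"] = f"Requirement: {msg.requirement} (status: {status})"
    msg.log.append("documentation: wrote summary")
    return msg


def run_pipeline(requirement: str, agents: List[Agent]) -> Message:
    """Pass a single message through each agent in order."""
    msg = Message(requirement=requirement)
    for agent in agents:
        msg = agent(msg)
    return msg


PIPELINE = [engineering_agent, data_generation_agent,
            code_testing_agent, documentation_agent]
result = run_pipeline("add two numbers", PIPELINE)
print(result.artifacts["docs"])
```

The point of the sketch is the hand-off: each agent only reads what earlier agents left in the shared message and adds its own output, so no human has to step in between stages. A production system would also need error handling, retries, and sandboxed execution of generated code.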

Organizations can now devote greater resources to utilizing data rather than preparing it. It is highly likely that in the near future, we will see these AI agents offered as a product by service providers. These service providers could take a company’s requirements, then deliver fully tested, spec-compliant code produced by AI agents.

By outsourcing these technical tasks, companies would significantly reduce the time taken to complete them, as well as the need for large in-house development teams. The time appears to be now for companies to consider the specific roles generative AI can play in maximizing value from their data programs, and ultimately to get far greater results from their investment in these recently expanded areas.

It has been known for some time that generative AI has the potential to revolutionize the way organizations operate. However, implementing it effectively within organizations will still mean navigating its weaknesses before its full capabilities can be realized.

Organizations are yet to fully embrace AI-friendly data acquisition and integration. Those that adapt can maximize investment value and change their fortunes for the better.

This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

https://www.techradar.com/pro/unlocking-the-code-ais-data-dilemma

