ChatGPT now supports voice chats and image-based queries

ChatGPT is getting some significant updates that will enable the chatbot to deal with voice commands and image-based queries. Users will be able to have a voice conversation with ChatGPT on Android and iOS and to feed images into it on all platforms. OpenAI is rolling out the features now. They’ll be available to Plus and Enterprise users at first, with other folks gaining access to the image-based features later.

You’ll need to opt in to voice conversations in the ChatGPT app (go to Settings then New Features) if you’d like to try them out. By tapping the microphone button, you’ll be able to choose from five different voices.

OpenAI says the back-and-forth voice conversations are powered by a new text-to-speech model that can generate “human-like audio from just text and a few seconds of sample speech.” It created the five voices with the help of professional actors. Going the other way, the company’s Whisper speech recognition system converts a user’s spoken words into text.

The image-based functions are intriguing too. OpenAI says you can, for instance, show the chatbot a photo of your grill and ask why it won’t start, get it to help plan a meal based on a snap of what’s in your fridge or prompt it to solve a math problem you take a picture of. As it happens, Microsoft highlighted the Copilot AI’s ability to solve math problems in Windows during its Surface event last week.

OpenAI is using GPT-3.5 and GPT-4 to power the image recognition features. To use ChatGPT’s image-based functions, tap the photo button (you’ll need to tap the plus button first on iOS or Android) to take a snap or choose an existing image on your device. You can ask ChatGPT about multiple photos and use a drawing tool to focus on a specific part of the image.

In a blog post announcing the updates, OpenAI noted the potential for harm. Bad actors could use the voice technology to mimic public figures (and everyday folks) and perhaps commit fraud. That’s why OpenAI is limiting the technology to ChatGPT voice conversations for now, while working with select partners on other limited use cases (more on that in a moment).

As for images, OpenAI worked with Be My Eyes, a free app that blind and low-vision people can use to help them better understand their surroundings thanks to volunteers who hop into video calls with them. “Users have told us they find it valuable to have general conversations about images that happen to contain people in the background, like if someone appears on TV while you’re trying to figure out your remote control settings,” OpenAI said. The company noted that it has also limited how ChatGPT can analyze and make direct statements about people that appear in images, “since ChatGPT is not always accurate and these systems should respect individuals’ privacy.” It has published a paper on the safety properties of the image-based functionality, which it calls GPT-4 with vision.

ChatGPT is more effective at understanding English text in images than text in other languages. OpenAI says the chatbot “performs poorly” in other languages for the time being, particularly those that use non-Roman scripts. As such, it suggests that non-English users avoid using ChatGPT to deal with text in images for now.

Meanwhile, Spotify has teamed up with OpenAI to use the voice-based technology for an interesting purpose. The former has announced a pilot of a tool called Voice Translation for podcasters. This can translate podcasts into different languages using the voices of the folks who appear on the show. Spotify says the tool can retain the speech characteristics of the original speaker after converting their voice into other languages.

To start with, Spotify is converting select English-based shows into a few languages. Spanish versions of some Armchair Expert and The Diary of a CEO with Steven Bartlett episodes are available now, with French and German variants to follow.

This article originally appeared on Engadget at https://www.engadget.com/chatgpt-now-supports-voice-chats-and-image-based-queries-144718179.html?src=rss


