OpenAI has introduced major changes to its ChatGPT app, adding voice interaction and image questions to make it more versatile and interactive.
These updates offer users two new features:
- Voice Interaction: You can now talk to ChatGPT using one of five lifelike synthetic voices. It responds to your spoken questions in real time.
- Image Questions: ChatGPT can answer questions about images. Upload pictures and ask it for descriptions or information.
These changes follow the recent announcement that DALL-E 3, OpenAI's image-generation model, will be integrated into ChatGPT, allowing the chatbot to create images.
The voice interaction feature works using two models: Whisper converts your spoken words into text, and a new text-to-speech model turns ChatGPT's responses into spoken words.
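The two-model flow described above can be sketched as a three-stage pipeline: speech to text, text to reply, reply to speech. This is only an illustrative outline, not OpenAI's implementation. The `VoicePipeline` class and the `fake_*` functions are hypothetical stand-ins that make no real API calls; in practice a Whisper-style transcriber, a chat model, and a text-to-speech model would fill those slots.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical three-stage voice pipeline (names are illustrative)."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage (Whisper's role)
    respond: Callable[[str], str]        # chat model stage (ChatGPT's role)
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def handle(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)   # spoken words -> text
        reply = self.respond(text)         # text -> text reply
        return self.synthesize(reply)      # text reply -> spoken audio


# Stand-in stages so the sketch runs without any external service.
# Here "audio" is simply UTF-8 bytes; real audio would need real models.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_respond, fake_synthesize)
result = pipeline.handle(b"hello")
```

Keeping each stage behind a plain callable means any one model can be swapped out without touching the other two.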
OpenAI trained these synthetic voices on actors' voices for a more natural sound. The company may even allow users to create custom voices in the future.
OpenAI is also making its text-to-speech model available to other companies, such as Spotify, which uses it to translate celebrity podcasts into multiple languages.
These updates show how OpenAI is quickly turning experimental models into practical products.
ChatGPT Plus, the premium version of the app, now combines GPT-4 and DALL-E as it competes with voice assistants like Siri, Google Assistant, and Alexa.
The image recognition feature lets you upload images and ask ChatGPT questions about them. It's already used by Be My Eyes, an app for people with visual impairments.
OpenAI is cautious about potential risks and is focusing on preventing misuse and ensuring user safety.
These updates make ChatGPT more useful and user-friendly, giving users a richer, more interactive experience.