In a significant stride for Brazilian AI technology, a national startup has unveiled MariTalk, an artificial intelligence chatbot in Portuguese meant to rival its international counterparts. The launch is an attempt to compete in a field dominated by giants such as OpenAI, the Microsoft-backed maker of ChatGPT.
Six years ago, a team of researchers at the State University of Campinas (Unicamp) in São Paulo delved deep into AI studies. Just a month before OpenAI’s ChatGPT launch, they established a startup dedicated to AI development. Fast forward to May 2023, and their efforts bore fruit with MariTalk, the 100% Brazilian AI chatbot now freely accessible online. Like ChatGPT, MariTalk is trained to assist in various tasks, from tackling exams like Enem (Brazil’s National High School Exam) to understanding laws, all in native Portuguese (although the bot is also capable of speaking English and other languages).
The brain behind MariTalk is Dr. Rodrigo Nogueira, a computer science Ph.D. and volunteer professor at Unicamp. Together with his colleague Ramon Pires and researchers Hugo Abonizio, Thales Almeida, and Thiago Laitz, he formed the Maritaca AI startup. Their ambition? To be Brazil’s answer to OpenAI on the global AI stage.
In a conversation with SBT News, the team shared insights into the startup’s challenges, including MariTalk’s development and their work on an AI that generates images akin to OpenAI’s DALL-E.
Rodrigo Nogueira’s story
Rodrigo Nogueira started working with AI in 2012, in the image field, during the first major AI boom. There, he and his colleagues built a system that could tell whether a fingerprint was real or fake, which turned into a product for various companies. In 2018, they began working with language models, which at the time were small rather than large. They launched the first Brazilian language model focused on Portuguese, called BERTimbau. Today, it has more than 10 million downloads on their website.
Rodrigo graduated in the United States, then returned to Brazil and worked for a series of other companies. The members of his team had also been working at other companies when he decided to create Maritaca and convinced them to join him. They received a $1 million grant from Google to spend on Google Cloud. Google also gave them access to its computers for training, which had a huge impact.
This culminated in Sabiá-2, which is performing better than GPT-3.5 Turbo. Unlike GPT-4, Sabiá-2 is specialized in Portuguese, which lets the team provide much better service for the Brazilian market at radically cheaper prices (ranging from $0.25 to $5 per million tokens, while GPT-4’s API costs a minimum of $10 per million tokens).
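To make the price gap concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the per-million-token prices quoted above; the monthly volume of 50 million tokens and the helper name `cost_usd` are hypothetical, chosen purely for illustration, and real bills depend on each provider's current price list.

```python
# Prices in USD per million tokens, as quoted in the article.
SABIA_2_PRICE_RANGE = (0.25, 5.00)  # cheapest and most expensive quoted price
GPT_4_MIN_PRICE = 10.00             # minimum quoted price for the GPT-4 API


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Return the cost in USD of processing `tokens` at the given price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload: 50 million tokens per month (an assumption).
monthly_tokens = 50_000_000

sabia_low = cost_usd(monthly_tokens, SABIA_2_PRICE_RANGE[0])
sabia_high = cost_usd(monthly_tokens, SABIA_2_PRICE_RANGE[1])
gpt4_min = cost_usd(monthly_tokens, GPT_4_MIN_PRICE)

print(f"Sabiá-2: ${sabia_low:.2f} to ${sabia_high:.2f} per month")
print(f"GPT-4:   at least ${gpt4_min:.2f} per month")
```

Under these assumptions, the same workload costs between $12.50 and $250 on the quoted Sabiá-2 prices, against at least $500 on the quoted GPT-4 minimum.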
Rodrigo and his team are focused on keeping the model up to date so that the chatbot stays informed about recent events. This approach stands out from typical language models, which have a fixed knowledge cutoff. For instance, the knowledge of OpenAI’s GPT-3.5 model goes up to 2021, while GPT-4 covers events until October 2023 (although some versions of the chatbot go up to August or December of last year).
Their goal is to create a continuous learning model at Maritaca, ensuring it learns about new events (from, say, a week ago) while retaining knowledge of past ones.
Additionally, the team aims to empower users by allowing them to customize the models with their own data. This way, users can gain insights into topics they’re interested in. Looking ahead, they plan to introduce multimodal models incorporating both images and text, similar to Google’s Bard.
Maritaca AI Overview
- Founded: October 2022, in Campinas (SP)
- Founder & CEO: Rodrigo Nogueira
- Team Members: Ramon Pires (researcher), Hugo Abonizio (researcher), Thales Almeida (researcher), and Thiago Laitz (researcher)
Based in Campinas, São Paulo, Maritaca AI emerged just before OpenAI’s ChatGPT launch, focusing on AI specialized in particular domains and languages. Their expertise spans advanced AI research, large language models (LLMs), and natural language processing in Portuguese. MariTalk’s launch in May 2023 marked a significant milestone in Brazilian AI innovation.