Developments in language technology are advancing incredibly quickly. The latest AI language models can generate everything from poetry to HTML code and interactive text adventures. Now a new method developed by RISE is being launched to steer output even more intelligently.
According to Fredrik Carlsson, a researcher in deep neural networks at RISE, creating text using AI has advanced considerably in recent years. He considers the GPT-2 language model and its successor, GPT-3, to be total game changers, which have given developers access to well-trained resources that normally require tremendous computational power and a big wallet:
“The world’s largest networks are text models that can generate and read text. It has essentially revolved around extremely powerful autocomplete algorithms, which is the same feature you have on your phone. These models do exactly the same thing but at a higher level.”
175 billion parameters
GPT-3 is a so-called transformer model with a mechanism for learning which words are contextually important. Based on context and its training on a vast text corpus – with 175 billion machine-learning parameters, for those keeping count – the model predicts and generates words and phrases.
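The next-word prediction Carlsson compares to a phone's autocomplete can be illustrated with a toy sketch. The example below is not the GPT-3 architecture – it is a simple bigram model that predicts the most frequent follower of the previous word, the basic idea a transformer scales up with context-aware attention:

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model: dict, word: str) -> str:
    """Return the most frequent follower seen in training (autocomplete)."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else ""

model = train_bigram("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

A large language model does "exactly the same thing but at a higher level": instead of counting word pairs, it weighs the entire preceding context when scoring each candidate next word.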
“Overall, the model is powerful,” says Carlsson. “The volume of intelligible and coherent text that can be generated has increased. On the other hand, steerability has fallen behind – it’s basically still a single autocomplete, albeit a seemingly magical one.
“It’s great that coherent text can be generated. But we realised that we must address the steering problem, and we found an approach that complements other models.”
Recurrent instructions allow for human control
In short, the researchers have introduced a new architecture and algorithm that can be applied to a language model. The instruction route has been separated from the text route, allowing instructions to be given recurrently during generation, each steering the output anew. Traditionally, instructions have existed only at the first stage, when the initial conditions are established.
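The difference between prompt-once and recurrent steering can be sketched in a few lines. This is a hypothetical illustration with made-up function names, not RISE's actual architecture: the model step is a stand-in that simply echoes its instruction, so the effect of re-steering mid-generation is visible:

```python
def generate_step(context: list, instruction: str) -> str:
    # Stand-in for one language-model step; a real model would produce
    # a word conditioned on both the context and the instruction.
    return f"<{instruction}>"

def generate_prompted(prompt: str, instruction: str, steps: int) -> list:
    """Traditional scheme: the instruction is fixed when generation starts."""
    out = prompt.split()
    for _ in range(steps):
        out.append(generate_step(out, instruction))
    return out

def generate_recurrent(prompt: str, instructions: list) -> list:
    """Recurrent scheme: a new instruction may be injected at every step."""
    out = prompt.split()
    for instr in instructions:
        out.append(generate_step(out, instr))
    return out

print(generate_recurrent("Once upon", ["introduce hero", "add conflict"]))
# ['Once', 'upon', '<introduce hero>', '<add conflict>']
```

Because the instruction path is separate from the text path, the caller can change course mid-text – introduce a character, steer towards a fact – rather than hoping the initial prompt carries through.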
The benefits are evident for various text-generating tasks where an author or coder wants to steer the AI towards a certain fact or text construct, introduce characters, or return to previous events.
“It can be used as a tool for journalists, authors, developers, and so on. There are language models that are so proficient that they can generate code.”
“Our method is independent of programming languages. It comes down to the language model available.
“It can also be used in different disciplines. Above all, I feel it allows for more human control.”
The method is available to read here. The scientific paper is being presented at the industry’s biggest conference in Dublin in May, and Carlsson says the technology has already garnered attention:
“The field is progressing incredibly quickly. I wouldn’t be surprised if this becomes an application within six months.”