OpenAI has kept the specifics of the data used to train Sora, its video-generating AI, under wraps. However, it appears that some of the training material may have come from Twitch streams and gaming walkthroughs.
On Monday, Sora made its much-awaited debut, and since then people have been experimenting with it as much as capacity issues allow. From a simple text prompt or an image, Sora can generate a video up to 20 seconds long in a range of aspect ratios and resolutions.
When OpenAI first introduced Sora in February, it hinted that the model had been trained using Minecraft videos. This led users to wonder what other gaming content might be included in its training dataset.
It now looks like a fair amount of gaming footage was included. Sora can generate a video reminiscent of a Super Mario Bros. clone, albeit with some glitches. It can also create gameplay footage that seems to be inspired by first-person shooters such as Counter-Strike and Call of Duty.
In addition, Sora demonstrated an understanding of what a Twitch stream should look like, suggesting it has come across several. Its output has also included a character resembling popular Twitch streamer Raúl Álvarez Genes, who goes by the name Auronplay.
Auronplay is not the only recognizable Twitch personality. Sora has also generated a video featuring a character that bears a resemblance to Imane Anys, better known as Pokimane.
OpenAI has also put some filters in place to prevent Sora from creating clips that depict trademarked characters. For example, entering ‘Mortal Kombat 1 gameplay’ will not give users anything linked to the title.
OpenAI has been somewhat secretive about the sources of its training data, and the company has not commented on the matter.