A neural network has cobbled together a rudimentary vocabulary much as a child learns to speak: by associating images with spoken descriptions. David Harwath and Jim Glass at the Massachusetts Institute of Technology wanted to see whether a machine could learn words without ever seeing them written down. They presented a neural net with more than 200,000 images and corresponding audio captions, then tested it on a fresh set of 1000 images. It learned to pair sounds from the captions with objects in the images.
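The article does not detail the model, but systems of this kind typically map each image and each spoken caption into a shared embedding space, so that a caption can retrieve its matching image by similarity. The sketch below is a hypothetical toy illustration of that retrieval step, assuming hand-made 4-dimensional embeddings rather than anything trained or taken from the authors' work:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between every row of a and every row of b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Hypothetical shared embedding space (toy vectors, not the authors'
# model): training would have pushed each image and its spoken caption
# toward nearby points.
image_emb = np.array([
    [0.9, 0.1, 0.0, 0.0],   # image of a lighthouse
    [0.0, 0.8, 0.2, 0.0],   # image of a bridge
    [0.1, 0.0, 0.9, 0.1],   # image of a windmill
])
audio_emb = np.array([
    [1.0, 0.0, 0.1, 0.0],   # spoken caption mentioning a lighthouse
    [0.1, 0.9, 0.1, 0.0],   # spoken caption mentioning a bridge
    [0.0, 0.1, 1.0, 0.0],   # spoken caption mentioning a windmill
])

# Retrieval: for each audio caption, pick the most similar image.
sims = cosine_sim(audio_emb, image_emb)
matches = sims.argmax(axis=1)
print(matches)  # each caption retrieves its own image: [0 1 2]
```

The point of the toy is only that once both modalities live in one space, "learning a word" reduces to a nearest-neighbour lookup, with no written text involved anywhere.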