Deep Neural Networks
So I’ve been watching some stuff the last few nights before going to sleep about deep neural networks, to get a feel for how we might make use of that sort of thing.
I have a fairly good feel for what's going on with them now, and it's pretty cool. But even though I feel like I could code that sort of thing, and there are libraries becoming available that use GPU resources effectively enough to make larger networks practical, in the end it all becomes fairly useless for most things, because the datasets required to train them are enormous, and likely none exist in the wild for anything beyond the obvious stuff (like character recognition and a few other areas where there are openly available data sets.) And some of them are petabytes in size, since you need hundreds of millions of samples to get really high quality recognition. So if each sample is an image, that adds up fast.
And getting that sort of sample set for a broad range of spoken words as WAVs (for speech recognition training) would be crazy, and anyone who has such a set probably isn't going to share it, since it's a huge advantage for them. Big companies like Amazon or Google can afford to pay tens of thousands of people with different accents to speak thousands of words and record them, but no one else could do that.
The thing is, someone could just make the final trained layers and their weights available, and those could be used to create a working DNN. But that's even less likely, I guess, since it's giving away the family jewels. It's really interesting that, after feeding in all that info, it comes down to a set of very simple data values: basically how many layers, how many neurons in each layer, and the weighting for each neuron's connection to each neuron in the next layer. I'm sure some other housekeeping and optimization data can be part of it, but ultimately that's what all that training generates, and any housekeeping/optimization stuff needed by the engine you use could probably be regenerated after the fact if you just had all the weightings.
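To make that concrete, here's a minimal sketch (the layer sizes and weight values are made up purely for illustration) of the point above: a trained feedforward net really is just a list of weight matrices and bias vectors, and given those, a forward pass is a few lines of matrix math:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, layers):
    """Run one input through a feedforward net defined purely by
    (weight matrix, bias vector) pairs -- the 'family jewels'."""
    for w, b in layers[:-1]:
        x = relu(w @ x + b)       # hidden layers: linear map + nonlinearity
    w, b = layers[-1]
    return w @ x + b              # final layer left linear here

# A made-up 3-4-2 network: everything an engine needs is right here.
rng = np.random.default_rng(42)
layers = [
    (rng.standard_normal((4, 3)), np.zeros(4)),   # 3 inputs -> 4 hidden
    (rng.standard_normal((2, 4)), np.zeros(2)),   # 4 hidden -> 2 outputs
]
out = forward(np.array([1.0, 0.5, -0.2]), layers)
print(out.shape)  # one value per output neuron
```

In a real trained net those matrices would come from a file of shipped weights rather than a random generator, but the engine side of it is no more than this.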
OTOH, that should mean that a large company with access to the required training data could build a small box with some good GPU or DSP boards and a burned-in copy of the weighting data, and do high quality, all-local speech recognition. If it required having all the sample data it would be a joke, but the actual weighting data, though not trivial, is hardly massive by modern standards, and general purpose GPUs should provide enough power. The DNN algorithms are very amenable to the sort of massively parallel processing that modern GPUs are good at.
I'm kind of amazed that someone hasn't done it already. Of course someone like Microsoft could provide that functionality in Windows by shipping the trained DNN and the libraries to process it on the local GPU, but I guess it would be competitive with Cortana which is probably implemented in the same way. 

Anyway, just rambling. Given the training data set requirements, at least for anything we'd likely want to do, I don't see any reason at the moment to dig deeper, other than for my own interest maybe. Anything simple enough that we could do it via a DNN, we could most likely do more easily via algorithmic means anyway.

In theory I guess I could see some sort of use for it in looking at the state of devices, correlating that to some particular situation that is meaningful to you, and learning over time what it should do in response. But that would require you to do a lot of training of it, telling it that the current system state corresponds to situation X. If you didn't regularly do that, no training would be going on.
Dean Roddey
Software Geek Extraordinaire
I took a neural networks course in school back in the day. It is not as simple as it sounds (although I am not up to speed on the latest algorithms). It is difficult to get the weights tuned so that the network produces an acceptable result. And your training sets have to be representative of the problem without causing overtraining. And the popular feedforward algorithms like backprop use offline training, so the network can't run and train at the same time.
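For anyone who hasn't seen it, here's a minimal sketch of what that offline backprop training loop looks like. The toy problem (learning logical OR), the tiny layer sizes, and the learning rate are all invented for illustration; the point is that training repeatedly sweeps the whole fixed data set, which is why the network can't run and train at the same time:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fixed, offline training set: learn the OR of two inputs.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [1.]])

rng = np.random.default_rng(0)
W1 = rng.standard_normal((2, 3))    # 2 inputs -> 3 hidden units (no biases, for brevity)
W2 = rng.standard_normal((3, 1))    # 3 hidden -> 1 output

def loss():
    return float(np.mean((sigmoid(sigmoid(X @ W1) @ W2) - y) ** 2))

before = loss()
for _ in range(2000):               # offline: the whole set, over and over
    H = sigmoid(X @ W1)             # forward pass
    O = sigmoid(H @ W2)
    dO = (O - y) * O * (1 - O)      # backprop: output-layer delta
    dH = (dO @ W2.T) * H * (1 - H)  # hidden-layer delta
    W2 -= 0.5 * H.T @ dO            # gradient steps on the weights
    W1 -= 0.5 * X.T @ dH
after = loss()
print(f"loss went from {before:.3f} to {after:.3f}")
```

Even on a toy like this you can see the tuning problems: pick the learning rate or the initial weights badly and it stalls or oscillates instead of converging.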

But I agree it is probably the next horizon for HA: being able to have your house anticipate your needs. Rather than programming it, having it watch what you (and maybe other users) do, and if it sees a pattern of you doing something manually, have it offer to do it for you from then on. You don't need a lot of horsepower or memory; a small ANN can do a lot of impressive things, and the bigger the network, the harder it becomes to manage. And the horsepower is needed for the training, not the running. People are doing this stuff on Raspberry Pis. One guy in my class, for his class project, wrote one to determine who you are by how you type your password, not the word itself, and this was in the early '90s on whatever we had at the time, an i286? Of course it probably took hours or days to train.... Smile
My Home Theater/Automation Website

[THREAD=5957]BlueGlass CQC Config[/THREAD]
[THREAD=10624]Wuench's CQC Drivers[/THREAD]
With modern DNNs, I thought the whole point is to set the weights via training, so that you never have to get involved with tweaking them by hand. But that requires sufficient training samples. That's for things like character recognition and voice recognition and such though, where you can just pre-feed it with a huge data set.

I'm not sure how that would work for the other type of learning, where it's just watching what you do and learning from that. That would be very slow, and the number of training samples would be tiny. It also has to have some pre-determined idea, in those situations, of what would even remotely be a desirable goal and when such a goal may have been reached (which probably only the user can tell it, and he won't take the time to do that over and over) so that the current state of the system can be taken as a training sample. And even then, which parts of the system are relevant to such a goal, and so forth?

For that sort of stuff, honestly, I think that an algorithmic approach is easier and probably just as effective. Just give the user some questions to answer and use that to set up the algorithmic magic.
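As a hedged sketch of what that algorithmic route might look like (the state name, action name, and threshold here are all invented for illustration): just count how often the same manual action follows the same system state, and offer to automate it once the count crosses a threshold. No training phase, no weights, and it "learns" from the very first observation:

```python
from collections import Counter

class PatternWatcher:
    """Toy rule-miner: counts (state, action) pairs and suggests
    automating an action once it has followed the same state often
    enough. Purely illustrative; the threshold is arbitrary."""
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, state, action):
        self.counts[(state, action)] += 1
        if self.counts[(state, action)] == self.threshold:
            return f"Automate '{action}' whenever state is '{state}'?"
        return None

watcher = PatternWatcher()
suggestion = None
for _ in range(3):  # user turns on the porch light at dusk three times
    suggestion = watcher.observe("dusk", "porch_light_on") or suggestion
print(suggestion)
```

Which device states count as "the same state" is exactly the sort of thing a few setup questions to the user could pin down, rather than hoping a network figures it out from a handful of samples.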
Dean Roddey
Software Geek Extraordinaire
