10-11-2017, 11:06 AM
(This post was last modified: 10-17-2017, 07:16 PM by Dean Roddey.)
This thread is for discussing the 5.3 betas. To start off with, here are some options for what we might do for 5.3. Once we choose what to do, this thread will be to report bugs, discuss options, etc...
 
Obvious Things
--------------------------------------------------
Some of the things we need to do are obvious and will definitely be done unless something very unexpected happens to prevent it, and even then that would only likely delay it somewhat because these are no-brainers.
New Z-Wave Driver
Finish the work on the new Z-Wave driver, the native Z-Stick one. It's well along already but it wasn't ever going to make it in time for 5.2, so I left it alone while finishing up WebRIVA stuff and other pre-5.2 release busy-work. Getting this guy done will mean we can (I hope) never have to write another Z-Wave driver again. Probably about the time I finish it, Z-Wave will go defunct or something. Not that I'd cry about that.
Possible Big Ticket Items
--------------------------------------------------
These seem fairly obvious to me, but if folks can make compelling arguments for other things we can add those to the list. I think that all of these are mutually exclusive, i.e. any one of them would be the one tanker sized feature for the next release, when combined with the Z-Wave driver above and a few smaller things. So anything else would be smaller stuff in addition to that and one of these.
Take the Next Step on CQC Voice
Most likely that would be to update it to support multiple mics in a single room. It would run parallel recognition engines, one for each mic, and use a 'voting mechanism' to work out the most likely best match to what was spoken. That serves two purposes. In large rooms it means more coverage, though the distance may mean that effectively only one mic ever has a good read. In smaller rooms it means higher accuracy since there's more likely to be a mic closer to you and pointing towards your face.
This doesn't mean you can just use regular mics; they still need to be array mics to be of much use. This isn't creating a DSP based scheme to do off axis noise rejection. All that still needs to be done in the mics. If you want something like that, buy a conference room DSP system. You can send it multiple mics, it will do the mic array thing, and you can then feed the one resulting audio feed to CQC Voice.
This will be annoying work, but nothing doctoral thesis-like. Everything is there except for the voting scheme. It's just a matter of adjusting the architecture to support more than one recognition engine, one per mic.
And of course just generally improve it while we have the chance as we are going back through it all.
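To make the voting idea a bit more concrete, here is a minimal sketch of one way it could work. The voting scheme has not been designed yet, so everything here is an assumption for discussion: the MicResult type, the 0.5 confidence cutoff, and the "sum the confidences per phrase" rule are all hypothetical and not how CQC Voice works today.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MicResult:
    """One engine's best guess for a single utterance (hypothetical type)."""
    mic_id: str
    text: str          # recognized phrase
    confidence: float  # engine-reported score, assumed to be in [0.0, 1.0]


def vote(results: Iterable[MicResult], min_confidence: float = 0.5) -> Optional[str]:
    """Return the phrase with the highest summed confidence, or None if no
    mic produced a result at or above the (assumed) cutoff."""
    totals = defaultdict(float)
    for r in results:
        if r.confidence >= min_confidence:
            # Normalize case and whitespace so trivially different
            # transcripts count as the same phrase.
            totals[" ".join(r.text.lower().split())] += r.confidence
    if not totals:
        return None
    return max(totals, key=totals.get)
```

With a rule like this, two mics that agree at moderate confidence can outvote a single mic that is more sure of a different phrase, and a lone good mic in a large room still wins when the others fall below the cutoff.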
Finish RTP Work, Then SIP/SDP
I have a good chunk of an RTP implementation done already from a while back. I was successfully transmitting CD audio to VLC player as a test. We'd be able to send/receive audio between CQC and any RTP enabled device. That would be the basis for a lot of other possible bits, some listed below. This wouldn't be a full on implementation initially. It would probably support CD audio and a couple of the obvious voice audio codec formats. Others could be added over time if needed.
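For anyone curious what is involved on the wire, here is a minimal sketch of the 12-byte fixed RTP header from RFC 3550, using static payload type 10 (16-bit stereo PCM at 44.1 kHz, per RFC 3551), which is the CD audio case. This is only an illustration and not the existing CQC code; the function names are made up, and it ignores padding, header extensions, CSRC lists, and RTCP.

```python
import struct

RTP_VERSION = 2
PT_L16_STEREO_44K = 10  # static payload type for L16, 2 channels, 44.1 kHz (RFC 3551)


def build_rtp_packet(seq: int, timestamp: int, ssrc: int, payload: bytes,
                     payload_type: int = PT_L16_STEREO_44K,
                     marker: bool = False) -> bytes:
    """Prefix payload with a 12-byte RTP fixed header (hypothetical helper)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    # Network byte order: two bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC. Values wrap at their field width.
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed header fields of a packet built as above."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

For L16 audio the timestamp advances by the number of sample frames in each packet, and the samples themselves go out in network byte order, so little-endian PCM has to be byte swapped before it is used as the payload.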
Once we have RTP, then we can implement SIP/SDP. That means we can then send and receive SIP calls. That includes not just phones but other things like SIP based intercom systems and doorbell intercoms and stuff like that. So we could basically implement an intercom engine that should work with most any other SIP based intercoms, and of course amongst our own machines thusly configured.
As far as phones themselves go, CQC could call you and tell you things if you wanted it to. CQC could send you SIP based IMs. CQC could initiate SIP calls for you and act as the phone itself. You just tell it what mic input and audio output to use.
In a subsequent release we could add some extra bits and you could call up CQC on the phone and speak to CQC Voice to tell it to do things. Recognition accuracy should be good since there's no distant recognition concern in that case.
Take a Huge Media Leap
OR, we could concentrate on a big leap forward in media support. There'd be a lot to be said for that. We could basically take our CQSL repo and turn it into something much more like J.River/DVD Profiler, but still with all of the advantages of being tightly integrated that our repo has now.
We would find some new metadata sources that will allow us to deal with individual tracks, and do a major rework on the repo manager to make it vastly better, and more in line with similar products. It won't end up being quite as extensive as J.River or DVD Profiler, which are kind of overkill for most folks, but it will be a lot closer to those, i.e. completely usable, but retaining all its current integration benefits.
It would still support CD ripping, but would also become open to files added from the outside. It would watch for those via file system changes, show them to you as new files, and let you check the metadata and then add them once you are happy with them. It would also support at least one new codec, like FLAC perhaps, maybe a couple more.
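As a rough illustration of that "show new files, then add them once approved" flow, here is a small sketch. It is an assumption-heavy stand-in: it polls the directory tree and diffs snapshots rather than using real OS change notifications, and the PendingImports class, the scan helper, and the extension list are all hypothetical names for discussion purposes.

```python
import os
from typing import Iterable, Set

AUDIO_EXTS = {".flac", ".mp3"}  # assumed set; only FLAC is mentioned above


def scan(root: str, exts: Iterable[str] = AUDIO_EXTS) -> Set[str]:
    """Return the paths of all files under root with a wanted extension."""
    wanted = {e.lower() for e in exts}
    found = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() in wanted:
                found.add(os.path.join(dirpath, name))
    return found


class PendingImports:
    """Tracks files that showed up from outside and await user approval."""

    def __init__(self, root: str) -> None:
        self.root = root
        self.known = scan(root)        # files already in the repo at startup
        self.pending: Set[str] = set() # seen, but not yet approved

    def poll(self) -> Set[str]:
        """Rescan and return only the files not reported before."""
        current = scan(self.root)
        fresh = current - self.known - self.pending
        # Keep earlier pending files, but drop any deleted before approval.
        self.pending = (self.pending | fresh) & current
        return fresh

    def approve(self, path: str) -> None:
        """User has checked the metadata; move the file into the repo set."""
        self.pending.remove(path)
        self.known.add(path)
```

The point of the separate pending set is that nothing lands in the repo until the user has looked at it, which matches the review step described above.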
It would allow for a number of options as to file layout, so that it could work with other products as desired. E.g. movies could be stored in the way that Plex wants them, music files stored in a way that some other player system wants them.
The goal would be to get to a point where, for most folks, they wouldn't feel the need for a third party tool.
Or, Some other Single Big Ticket Item?
Those are some obvious things that come to mind for me in terms of where I think that we might now benefit from some new capabilities or where parts of CQC have gotten old and could use a face lift.
Medium Sized Bits
--------------------------------------------------
In terms of less galactic things, we might consider some things like below. One or two of these may be doable in conjunction with one of the above big ones.
- A hard wired alternative CAB (i.e. not configurable) that works something like Plex's interface, in that tiled cover art sort of scheme.
- Maybe do the work required to get media fully under voice control. That would only be practical currently under the Echo scheme.
- Some big bang for the buck expansion of the auto-gen system, maybe support for web cams.
- An 'exception' mechanism for actions in the IV, i.e. an OnError handler that gets called if an error occurs in any user driven action in the IV, so that you can display a custom error display and/or take remedial action.
- Geolocation support in WebRIVA.
Future Thinking
--------------------------------------------------
Things we might start thinking about and discussing and exploring during the next release cycle for the one after that.
- Some thoughts maybe about learning capabilities to figure out what you want and do it for you. This is a tricky thing and could become more annoying than useful. But, it's sort of the hype du jour right now.
Dean Roddey
Explorans limites defectum