Charmed Quark Systems, Ltd. - Support Forums and Community

Full Version: Official 5.3 Beta Discussion Thread
(05-03-2018, 04:22 AM)batwater Wrote: Dean, if it would help for a sanity check / comparison I can open my system up. On my ZWAVE network I have SmartThings as the Master Controller, the old VRC0P as a secondary and the ZWAVE stick as a secondary. As long as the VRC0P configuration in CQC is not affected, that is...

Let me make the above mentioned changes. I think that needs to be done either way. That will probably take the whole day to do.
I've been sitting here spinning my mental gears, and just thinking out loud... Ultimately, a unit is only usable once we have auto-identified it or the user manually selects a type. In the end, there are only two reasons this information is being gathered. One is to get the manufacturer ids so we can id it. The other is to gather that plus all the other info for device info dumps you can send to us.

So, that means, in terms of the actual running units, all that ultimately matters is the manufacturer ids. That's all we need to get. The gotcha is that, without the node info, we don't know if it supports the man spec CC or, if it does, if that is a secure class. The latter is not likely but possible. We also don't know if it is a listener or not, so we don't know if a failure to respond is expected or a sign of failure.

Another reason we want to do the basic node info call is that it comes directly from the local Z-Stick, so we can do that all up front, synchronously, and figure out fast if any units are no longer valid or don't match the previous info we had, before we try to register fields. Doing it any other way means having to handle units being figured out piecemeal over time which isn't great. And we can do that initial query whether it's a listener or not, because we are just talking to the local controller. Otherwise we can only know about a unit once it wakes up.

So, I guess we have to do that initial call. It's the only practical thing. And we just have to handle the fact that a lot of stuff that happens now should only happen after a replication. But the bad thing about it is, if you lose the driver config file, the driver won't be able to reliably gather its info again. Even though it's in the network and has security info, if it can't get that initial node info reliably except after a replication, then it's hosed. It would require a replication to get it happy again.

I don't think that's too huge an issue. The driver could just consider itself out of the network if it has no previous info, no matter what the Z-Stick says its status is. Having to do a re-replication to recover from a lost config file isn't that big an issue I guess. A reset of a unit would only take it back to trying to get manufacturer ids (or to WaitDevInfo if man spec CC isn't supported.) We'd never do that initial probe again except after a replication. And we'd only consider it a mismatch if the man ids are different from what we last got/figured out, so the query of the man ids would be the only thing really done upon loading the driver.
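As an illustration only, the two rules described above (treat the driver as out of the network when it has no saved info, and flag a mismatch only when the manufacturer ids differ from the last known ones) could be sketched like this in Python. The function names and the three-part id tuple are assumptions made for the example, not the driver's actual code.

```python
from typing import Optional, Tuple

# Hypothetical shape for the manufacturer ids: (manufacturer, product type, product id).
ManIds = Tuple[int, int, int]


def driver_in_network(has_saved_config: bool, stick_reports_in_network: bool) -> bool:
    """With no previous info the driver treats itself as out of the network,
    no matter what the Z-Stick reports, so a replication is needed to recover."""
    return has_saved_config and stick_reports_in_network


def unit_mismatch(saved: Optional[ManIds], queried: Optional[ManIds]) -> bool:
    """A unit only counts as a mismatch when ids were stored previously,
    ids were obtained now, and the two differ."""
    return saved is not None and queried is not None and saved != queried


# Example: the config file was lost, so the Z-Stick's status is ignored.
print(driver_in_network(has_saved_config=False, stick_reports_in_network=True))
```

Under these assumptions a unit that simply has not answered yet is not a mismatch; it just stays in whatever state is waiting for its manufacturer ids.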

Anyway, I think that's how it has to be.
OK, I'm slowly getting it back together after ripping it apart. I've come up with a better way to do things, which should be faster and not rely on the inconsistent information of the low level node info query. In a way, I should have seen this to start. I was sort of misled by looking at how some other systems are done. Those are very generic and so they gather all this information. All we really care about is getting a device info file assigned to a unit. So really all we care about is the minimum effort required to get the manufacturer ids or to figure out that they are not available.

The only reason all of the info is required is when we want to get an info dump for a new, unknown unit type, which we can do right there, synchronously when that's asked for.

So now the process is:

1. Try to ping the device. If it's a frequent listener this will also wake it up. If it's a non-listener it will never get the ping, so this is now the point where you would wake up the device. If the driver gets a wakeup it treats that like a reply to the ping; either way we just want to know the device is now awake.
2. If it gets a reply to the ping it does the real node info query (not from the local Z-Stick but from the device itself.) This gets us reliable info about whether it's a listener, frequent listener, and the non-secure classes.
3. If man spec CC is in the non-secure list it goes to #5. If not but security is, it goes to #4. If neither, it knows it can't auto-id and goes to a 'NoAutoMatch' state.
4. Asks for secure classes. If man spec CC is not in there, then goes to NoAutoMatch, else it goes to #5.
5. Asks for the man ids. If it recognizes the unit, it goes to GetName if naming is supported otherwise to WaitApprove. If it doesn't recognize the unit it goes to NoAutoMatch.
6. If naming support is there we get the name as the default initial name, then go to WaitApprove.
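As a rough illustration, the six steps above can be sketched as a small decision function in Python. This is not the driver's code. The `SimUnit` and `DevInfo` types, the symbolic class names, and the state names `Ping`, `GetNodeInfo`, `GetSecureClasses` and `GetManIds` are made up for the example; only `NoAutoMatch`, `GetName` and `WaitApprove` come from the description. It also assumes that naming support is read from the matched device info file.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Symbolic command class names used only in this sketch.
MAN_SPEC = "ManufacturerSpecific"
SECURITY = "Security"

ManIds = Tuple[int, int, int]  # hypothetical: (manufacturer, product type, product id)


@dataclass
class SimUnit:
    """Hypothetical stand-in for a real unit; the fields simulate its replies."""
    awake: bool                 # did we get a ping reply or a wakeup?
    nonsecure: Set[str]         # classes reported in the node info frame
    secure: Set[str]            # classes reported by the secure class query
    man_ids: Optional[ManIds]   # reply to the manufacturer ids query


@dataclass
class DevInfo:
    """Hypothetical summary of a device info file."""
    model: str
    supports_naming: bool


def identify(unit: SimUnit, known: Dict[ManIds, DevInfo]) -> List[str]:
    """Return the list of states visited while identifying one unit."""
    trace: List[str] = ["Ping"]
    # 1. No ping reply and no wakeup yet: stay here until the unit wakes up.
    if not unit.awake:
        return trace
    # 2. Node info comes from the unit itself, not from the local Z-Stick.
    trace.append("GetNodeInfo")
    # 3. Decide whether the manufacturer ids are reachable at all.
    if MAN_SPEC not in unit.nonsecure:
        if SECURITY not in unit.nonsecure:
            return trace + ["NoAutoMatch"]
        # 4. Look for man spec CC among the secure classes.
        trace.append("GetSecureClasses")
        if MAN_SPEC not in unit.secure:
            return trace + ["NoAutoMatch"]
    # 5. Ask for the manufacturer ids and look them up.
    trace.append("GetManIds")
    info = known.get(unit.man_ids) if unit.man_ids is not None else None
    if info is None:
        return trace + ["NoAutoMatch"]
    # 6. Optionally fetch the name, then wait for the user to approve.
    if info.supports_naming:
        trace.append("GetName")
    return trace + ["WaitApprove"]


# Example with made-up ids: a secure unit whose man spec CC is only in the secure list.
known = {(1, 2, 3): DevInfo("ExampleLock", supports_naming=True)}
print(identify(SimUnit(True, {SECURITY}, {MAN_SPEC}, (1, 2, 3)), known))
```

The real driver does all of this asynchronously, one message at a time; the sketch only captures the branching between the steps.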

So it's a far shorter sequence of events, which should go much quicker and have fewer chances to fail. The actual device info for working units will come from the device info file, not from the device. And there are fewer states to have to understand.

This involved a LOT of weirdness, made worse by having to try to change a finally carefully balanced driver in place to a much different scheme. It also required coming up with a way to do async pings, which is difficult because Z-Wave is completely inconsistent and doesn't respond to them, it just acks them, and all the acking process is totally hidden (for good reason) inside the Z-Stick's I/O thread processing. I had to claw my way over every bit of progress, but I got all that worked out.

I was starting to get some units to flow through the process. I'll tighten it up tomorrow and hopefully get it all the way back up and ready for testing on some user systems again.
I got a lot of stuff worked out today, so it's coming back together, and I think much better all around. There's a lot more fiddly bits in some ways since we don't know anything about units at first. But, in other ways, it's much simplified.
OMG, I don't believe how screwed up Z-Wave is. I made all of these changes because it is apparently not possible to know for sure which node a response to the fundamental node info query is for. Partly that's because the response doesn't include a source node. So, if something goes wrong and you time out and an extra response ends up in the input queue or something, then you can see the wrong one and not know it. And partly it's because the query may not be reliable other than after a replication on some systems, from what I saw on Kevin's system.
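To make the hazard concrete, here is a toy Python simulation of a reply queue whose entries carry no source node id. Everything in it (`FakeLink`, `query`, the payload strings) is invented for the example and does not reflect the real serial API. Draining stale replies before each request is shown only as one possible mitigation, not as what the driver does, and it would not help if the late reply arrived after the drain.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class FakeLink:
    """Simulated input queue; replies carry a payload but no source node id."""

    def __init__(self) -> None:
        self.inbox: Deque[str] = deque()

    def deliver(self, payload: str) -> None:
        self.inbox.append(payload)

    def drain(self) -> int:
        """Throw away anything already queued and report how much was dropped."""
        dropped = len(self.inbox)
        self.inbox.clear()
        return dropped

    def read(self) -> Optional[str]:
        return self.inbox.popleft() if self.inbox else None


def query(link: FakeLink, node_id: int, reply: Optional[str],
          drain_first: bool) -> Tuple[int, Optional[str]]:
    """Ask about node_id and pair it with whatever reply is read next.
    `reply` simulates what the stick sends back for this request."""
    if drain_first:
        link.drain()
    if reply is not None:
        link.deliver(reply)
    # With no source node in the reply, the requester can only assume
    # the next reply belongs to the request it just made.
    return node_id, link.read()


# A reply for node 5 shows up after that query already timed out.
link = FakeLink()
link.deliver("info-for-node-5")
print(query(link, 9, "info-for-node-9", drain_first=False))

link = FakeLink()
link.deliver("info-for-node-5")
print(query(link, 9, "info-for-node-9", drain_first=True))
```

In the first call node 9 is paired with node 5's late reply and nothing in the data reveals the mistake, which is exactly the problem described above.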

So I changed it all completely around to get the actual info from the node itself, which returns a standard node information frame which has this information in it. And I just realized that the serial API that we use to talk to the chip in the Z-Stick removes that one piece of information from the info frame when it passes it back to us, only providing the other information. So the one thing I did all of this to get reliably, I don't get.

Not that this work was a waste; it's a lot better now. But I still don't have the information I want, which is basically whether it's a listening device or a frequently listening device. I can't go by whether it supports the battery class, since frequent listeners usually do as well.

I guess I'm going to have to add back in the low level node info call and try to find some way to get around its limitations.
Hey Dean, I just read this in the maint fee reminder email:

We will likely drop support for the old iOS/Android clients, and hence also the background server that supported them, in the next large release. It's unlikely anyone will still be using them given how much better the new WebRIVA client is. And that will further release us from constraints on making improvements, since we won't have to worry about breaking those products. We have already in 5.2 begun the process of de-emphasizing them and labelling them legacy products so that there will be no new users of them.

Don't drop support - it still works and works very well - better than WebRIVA IMO (iOS). While there is the new app for doing WebRIVA, it does not work on older iPads (I'm using an iPad 2 which is doing its job fine with CQC and the original app).

WebRIVA for android is still not great unless there have been big changes recently but I am still struggling to find a seamless solution that is suitable for non techie users to deal with (i.e. something as simple as an app)
Most likely the next 'large' release would be 6.x, which is still a good ways out. We are just at 5.2 heading to 5.3, and there's nothing on the order of the 5.x UI reworking planned that would justify a 6.x in the near future.

It's hard to imagine that it works better than WebRIVA given the complaints about stability that were around at the time, particularly the Android one, that ultimately drove the work on WebRIVA. Also, it's possible that any iOS upgrade that comes along could break the client and there'd be nothing really to be done about it.

I disagree. I have tried to move from the old iOS app to WebRIVA and the experience is just not as good, mainly because of the reconnect time. It's a good 7-8 seconds to get to a usable point with the WebRIVA interface and only 2-3 seconds with the iOS app. So please don't drop the iOS app.

Something would seem to be awry if it's taking that long. It should be quite fast, and it certainly sounded like it was for everyone during the testing phase. What browser were you using on what platform?

The old apps are going to go at some point, since no one is maintaining them and they will hold back development of new stuff. So we should figure out what is causing the delays for you.
OK, another re-swizzling of the driver to get back the basic node info query, while keeping the many other recent improvements. And I've added some stuff to hopefully help get around the issues that make that query of questionable reliability. It's going quite well for me, much less messing about to get units up and working, and a lot fewer msgs back and forth to get to that point as well, which means less network traffic and fewer opportunities to fail.

I'll do some more testing tonight and tomorrow and get a new drop up for folks to try hopefully tomorrow night.