PDA

View Full Version : CIDLib error writing buffer to USB HID device


Jonathan
08-07-2006, 06:53 PM
I have CML driver code that finds and opens a USB HID device sucessfully and then dies when I try to write 8 bytes to it.

When calling m_USB.Write(m_IOBuf, 8, 2000) I get the following:

http://img391.imageshack.us/img391/6076/1hv9.jpg


Unfortunately, I've no idea what this means so understanding what is going wrong is difficult. Here is the work in progress code that is causing the above error:


Class=[NonFinal]
ClassPath MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl;
ParentClass MEng.System.CQC.Runtime.CQCDriverBase;
EndClass;

Imports=
MEng.System.Runtime.MemBuf;
MEng.System.Runtime.USBHID;
MEng.System.CQC.Runtime.CQCFldDef;
#BeginDebug
MEng.System.Runtime.ConsoleOutStream;
#EndDebug
EndImports;

Literals=
Card2 kVendorId(0x10BF);
Card2 kProductId(0x0004);

// All IBIOS commands are prefixed with STX followed by the command number
Card1 kCmd_STX(0x02);

// 0x80 in the count field indicates the device is ready to receive more data
Card1 kCmd_Ready(0x80);

// IBIOS Commands
Card1 kCmd_Download(0x40);
Card1 kCmd_FixedLengthMessage(0x41);
Card1 kCmd_Upload(0x42);
Card1 kCmd_VariableLengthMessage(0x43);
Card1 kCmd_GetChecksum(0x44);
Card1 kCmd_EventReport(0x45);
Card1 kCmd_Mask(0x46);
Card1 kCmd_SimulatedEvent(0x47);
Card1 kCmd_GetVersion(0x48);
Card1 kCmd_DebugReport(0x49);
Card1 kCmd_X10ByteReceived(0x4A);
Card1 kCmd_InsteonMessageReceived(0x4F);
EndLiterals;

Members=
USBHID m_USB;
MemBuf m_IOBuf;
String m_Path;
Time m_Time;
#BeginDebug
ConsoleOutStream m_Con;
#EndDebug
EndMembers;

//
// Constructor and an entry point for testing purposes
//
Methods=[Public,Final]

Constructor() : m_IOBuf(8, 8);
Begin
EndConstructor;

Method Start() Returns Int4
Begin
Simulate();
Return 0;
EndMethod;

EndMethods;

//
// Helper methods for reading, writing, and parsing messages
//
Methods=[Private,Final]

Method InitIoBuffer([In] Card1 Byte1,
[In] Card1 Byte2,
[In] Card1 Byte3,
[In] Card1 Byte4,
[In] Card1 Byte5,
[In] Card1 Byte6,
[In] Card1 Byte7,
[In] Card1 Byte8)
Begin
m_IOBuf.SetAll(0);
m_IOBuf.PutCard1At(0, Byte1);
m_IOBuf.PutCard1At(1, Byte2);
m_IOBuf.PutCard1At(2, Byte3);
m_IOBuf.PutCard1At(3, Byte4);
m_IOBuf.PutCard1At(4, Byte5);
m_IOBuf.PutCard1At(5, Byte6);
m_IOBuf.PutCard1At(6, Byte7);
m_IOBuf.PutCard1At(7, Byte8);
EndMethod;

Method WriteIoBuffer()
Begin
#BeginDebug
m_Con.FmtStr("S:");
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(0), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(1), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(2), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(3), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(4), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(5), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(6), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(7), Radices.Hex);
m_Con.NewLn();
m_Con.Flush();
#EndDebug
m_USB.Write(m_IOBuf, 8, 2000);
EndMethod;

Method ReadIoBuffer() Returns Boolean
Begin
If (m_USB.Read(m_IOBuf, 8, 100, False))
#BeginDebug
m_Con.FmtStr("R: ");
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(0), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(1), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(2), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(3), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(4), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(5), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(6), Radices.Hex);
m_Con.FmtStr(" 0x");
m_Con.FmtCard1R(m_IOBuf.GetCard1At(7), Radices.Hex);
m_Con.NewLn();
m_Con.Flush();
#EndDebug
Return True;
EndIf;
Return False;
EndMethod;

Method CheckStatus() Returns Boolean
Begin
InitIoBuffer(2, kCmd_STX, kCmd_GetVersion, 0, 0, 0, 0, 0);
WriteIoBuffer();
Return False;
EndMethod;

EndMethods;

//
// Overrides of required methods defined by our base class. We must implement
// these or the compiler will complain. These are how the underlying driver
// infrastructure calls us to let us know events have occured.
//
Methods=[Public,Overrides,Final]

//
// This is called when we are first loaded. It is just for us to do any
// data initialization, to register our fields, set our poll times, and
// so forth. Here we are checking to see if the PowerLinc is present in
// the system.
Method InitializeOther([In] MEng.String CfgString, [In] MEng.System.Runtime.NamedValMap PromptVals) Returns DrvInitRes
Begin
If (!m_USB.FindDevice(kVendorId, kProductId, m_Path))
If (GetVerboseLevel() >= VerboseLvls.Medium)
LogMsg("The Insteon PowerLinc Controller is not present");
EndIf;
#BeginDebug
m_Con.FmtStr("The Insteon PowerLinc Controller is not present\n");
m_Con.Flush();
#EndDebug
Return DrvInitRes.Failed;
EndIf;
#BeginDebug
m_Con.FmtStr("Insteon PowerLinc Controller found.\nPath=");
m_Con.FmtStr(m_Path);
m_Con.NewLn();
m_Con.Flush();
#EndDebug
Return DrvInitRes.WaitCommRes;
EndMethod;

Method GetCommResource() Returns Boolean
Begin
Try
m_USB.Open(kVendorId, kProductId);
EndTry;

Catch
If (GetVerboseLevel() >= VerboseLvls.Medium)
LogMsg($Exception.GetErrorText());
EndIf;
#BeginDebug
m_Con.FmtStr("Failed to open USB port\n");
m_Con.Flush();
#EndDebug
Return False;
EndCatch;
#BeginDebug
m_Con.FmtStr("Opened USB port\n");
m_Con.Flush();
#EndDebug
Return True;
EndMethod;

Method Connect() Returns CommResults
Begin
CheckStatus();
Return CommResults.Success;
EndMethod;

Method Poll() Returns CommResults
Begin
Return CommResults.Success;
EndMethod;

Method CardFldChanged([In] Card4 FldId, [In] Card4 ValWritten) Returns CommResults
Begin
Return CommResults.Success;
EndMethod;

Method FreeCommResource() Returns Boolean
Begin
m_USB.Close();
Return True;
EndMethod;

EndMethods;

Dean Roddey
08-07-2006, 07:50 PM
Hmmm... Looks ok to me. I'll try it with the X-10 controller. It won't actually do anything but if there's a bug with the HID write buffer itself, it should happen to me also.

Dean Roddey
08-07-2006, 07:58 PM
OK, yes, I'm seeing the same, so obviously this HID code, not having actually been used since it was created, has developed some sort of bug. Let me figure out what's up and I'll get a patch together for you.

Dean Roddey
08-07-2006, 08:12 PM
There are two things going on here. One is that the Windows HID buffers are 9 bytes, not 8. Your data only takes up 8, but for whatever reason Windows ones are 9. So try putting a 0 into the 0th byte, then put your data into the other 8 bytes. And when you read, read 9 bytes and ignore the 0th one.

What happened was that the write failed due to too few bytes, and I messed up the error reporting for that particular CML level error, so what you were seeing was the internal failure that occured during that error handling.

I'll fix that for the next release, and you shouldn't see it anyway once you use the 9 byte buffer size.

Jonathan
08-08-2006, 03:39 AM
Thanks Dean that seems to do the trick fine! Glad I could help find and squish a bug ;-)

Dean Roddey
08-08-2006, 08:49 AM
BTW, there's no need to call SetAll() on the buffer since you are settign them all anyway. And it would probably be worth create a couple other buffer init methods that take 2 or 3 bytes and set the others to zero., to avoid creating so many dummy zero value objects.

Jonathan
08-08-2006, 09:52 AM
Yep, turns out I can actually only use the first four bytes anyway otherwise I overrun the hardware write from the USB controller to the external EEPROM (plus you need to give it 20ms after you finish writing too for that operation to complete before you write anything else or alternatively write a single byte at a time and give it 6ms) but I digress...

My next fun is trying to join up the responses I'm getting from reading the USB buffer into an actual command. I.e: lets assume I've read 4 responses from the USB buffer (we'll ignore the question of how I know 4 responses is the right amount for now).

The first byte on each read indicates how many bytes of real data is following so I get the following raw data:

0x01 0x02 0x00 0x00 0x00 0x00 0x00 0x00
0x03 0x48 0xff 0xff 0x00 0x00 0x00 0x00
0x03 0xff 0x04 0x00 0x00 0x00 0x00 0x00
0x02 0x23 0x06 0x00 0x00 0x00 0x00 0x00

My output buffer with the actual data for parsing by higher level methods should be:

0x02 0x48 0xff 0xff 0xff 0x04 0x00 0x23 0x06

I'm open to suggestions since I've only been using CML for about 5 hours now since last night! :-D

Dean Roddey
08-08-2006, 10:14 AM
You would just use two buffers. You have one that you do each read in and another that you use to hold the whole message. Initialize a Count variable to zero and bump it on each round by how many bytes you read that time, CurBytes. So each time through you will append CurBytes bytes from the per-msg buffer into the big buffer at offset Count, then bump count by CurBytes.

Use the CopyInAt() method to copy in CurBytes bytes from another buffer into the current buffer at index Count.

Dean Roddey
08-08-2006, 10:17 AM
Oh, wait, you have to skip the 0th byte, so you can't do that. You'll have to do just do a little loop to copy the per-msg bytes to the big buffer.

Jonathan
10-04-2006, 11:41 AM
Dean, trying to track down a problem that seems to be randomly happening with the PowerLinc driver I wrote. It appears to manifest itself as an issue trying to write to the USB device but I can't make it happen on demand :(

I wouldn't be suprised if it turns out the PowerLinc device has stopped responding but right now I'm not sure.

The following is a partial dump of the log, this same error sequence is repeated over and over again, can you interpret the CIDLIB portions and shed any light as to what is happening under the covers?


mediapc,CQCServer,InsteonThread2
(CIDMacroEng,CIDMacroEng_MethodImpl.cpp.873,Status -App Status,0-0-0)
An exception was thrown from method MEng.User.CQC.Drivers.PowerLincUSB.PlcHelper.Power LincWrite

mediapc,CQCServer,InsteonThread2
(CIDMacroEng,CIDMacroEng_MethodImpl.cpp.873,Status -App Status,0-0-0)
An exception was thrown from method MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl.IGet FirmwareVersion

mediapc,CQCServer,InsteonThread2
(CIDMacroEng,CIDMacroEng_MethodImpl.cpp.873,Status -App Status,0-0-0)
An exception was thrown from method MEng.USBHID.Write

mediapc,CQCServer,InsteonThread2
(CIDLib,CIDLib_Collection.cpp.249,API Failed-Bad Parms,2-0-0)
4294967295 is not a valid index for the TVector<TElem> object. Count=4

mediapc,CQCServer,InsteonThread2
(CQCKit,CQCKit_MEngErrHandler.cpp.69,Status-App Status,0-0-0)
A C++ exception occured during macro execution

mediapc,CQCServer,InsteonThread2
(CIDLib,CIDLib_Collection.cpp.249,API Failed-Bad Parms,2-0-0)
4294967295 is not a valid index for the TVector<TElem> object. Count=4

mediapc,CQCServer,InsteonThread2
(CQCKit,CQCKit_MEngErrHandler.cpp.153,Status-App Status,0-0-0)
MEng.User.CQC.Drivers.PowerLincUSB.PlcHelper.Power LincWrite([In] MEng.Card1 Length,[In] MEng.Card1 Byte1,[In] MEng.Card1 Byte2,[In] MEng.Card1 Byte3) Returns MEng.Void
MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl.IGet FirmwareVersion() Returns MEng.Card1
MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl.Conn ect() Returns MEng.System.CQC.Runtime.CQCDriverBase.CommResults

Dean Roddey
10-04-2006, 12:29 PM
You are sure that you aren't passing a zero length array to be written? This is usually a side effect of assuming you want to do 0 to count-1 and count is zero, so count-1 ends up being Card4.kMaxValue, which is the large index error number you are seeing. It should be getting caught at the CML level and thrown, but it's obviousy not and so the C++ level code is throwing the error.

I'll look into that code to see how the CML level code might be missing such a thing. If you are sure you could never be doing such a wrap around in the PowerLincWrite() method, then I'll look into how I might be causing it internally.

Jonathan
10-04-2006, 12:49 PM
The buffer is 9 bytes and works fine normally (this routine is called ALOT!)

I believe if it was a zero length array I assume I would have got an exception when I tried to put values in it?


// Write the IO buffer (length + 3 bytes of data payload)
// to the USB HID device and wait 20ms for the PLC to write
// the received data to the external EEPROM
//
Method PowerLincWrite([In] Card1 Length,
[In] Card1 Byte1,
[In] Card1 Byte2,
[In] Card1 Byte3)
Begin
m_OutBuf.PutCard1At(1, Length);
m_OutBuf.PutCard1At(2, Byte1);
m_OutBuf.PutCard1At(3, Byte2);
m_OutBuf.PutCard1At(4, Byte3);

// Write buffer to USB HID device
m_USB.Write(m_OutBuf, kBufferSize, kTimeout);
// Wait for PLC to write bytes to external EEPROM
// This eliminates the risk of the PLC to lock up
// by sending more data too soon.
// http://www.insteon.net/sdk/forum/topic.asp?TOPIC_ID=232
m_Time.Sleep(TypeCast(MEng.Card4, Length + 1) * 5);
EndMethod;

Dean Roddey
10-04-2006, 12:57 PM
It's not that the array/vector is zero length, but that the Length value is zero. But you are passing a constant buffer length so that's not very likely. So let me look into it.

Dean Roddey
10-06-2006, 07:40 PM
Can you get this error to happen while it's running in the test harness?

Jonathan
10-06-2006, 08:00 PM
Can you get this error to happen while it's running in the test harness?

Yes, same error and it happens upon calling the USBHID Write method.

http://img142.imageshack.us/img142/1537/errorap9.jpg

Dean Roddey
10-06-2006, 08:06 PM
OK, then do this. Don't start the driver IDE from the start menu. Open a CQC Command Prompt and run it like this:

CQCDrvDev /TraceSev=APIFAILED

This should cause a .ErrorInf file to be generated when this exception occurs. It should be in the CQCData\Logs directory, but search around if not (let me know if not.)

Send me that file and that should give me a better feeling for where you are when the error occurs.

Jonathan
10-06-2006, 08:09 PM
Would appear that the issue is hardware related as unplugging and plugging the PLC device back in clears the error. I'll wait until it happens again though and get a trace.

Dean Roddey
10-07-2006, 10:40 AM
Any success in getting a stack trace? Even if it's a hardware thing, somewhere it's causing me to do something wrong, so I'd like to find it.

Jonathan
10-07-2006, 10:48 AM
Not as of yet, as always, when you want something to happen, it doesn't!

Dean Roddey
10-07-2006, 10:51 AM
Pick up the phone and start talking, where it can just hear you, and pretend like you are bragging to someone about how rock solid your Insteon driver is.

Jonathan
10-17-2006, 05:45 PM
Ok, during a stress test tonight trying to track down my 100% CPU issue this USB issue happened again so I ran it in the IDE with the command line you told me to and I'm attaching the error dump.

It would appear the Insteon PLC has stopped responding based on the fact I have another app that will talk to the device and it is timing out trying to access it. This explains why if I unplug and plug the PLC back in, things return to normal.

However, Windows still sees the device so when the driver checks to see if the device is there using FindDevice() it returns success and there is no error when calling Open() either so CQC then calls Connect() and the situation simply repeats itself at a very rapid rate (would changing the SetPollTimes reconnect value slow this down?)

The log file (also included) shows this:

mediapc,CQCServer,testThread11
(CQCKit,MEng.System.CQC.Runtime.CQCLogger.163,Stat us-App Status,0-0-0)
Sending firmware version request

mediapc,CQCServer,testThread11
(CIDMacroEng,CIDMacroEng_MethodImpl.cpp.873,Status -App Status,0-0-0)
An exception was thrown from method MEng.USBHID.Write

mediapc,CQCServer,testThread11
(CIDLib,CIDLib_Collection.cpp.249,API Failed-Bad Parms,2-0-0)
4294967295 is not a valid index for the TVector<TElem> object. Count=4

mediapc,CQCServer,testThread11
(CQCKit,CQCKit_MEngErrHandler.cpp.69,Status-App Status,0-0-0)
A C++ exception occured during macro execution

mediapc,CQCServer,testThread11
(CIDLib,CIDLib_Collection.cpp.249,API Failed-Bad Parms,2-0-0)
4294967295 is not a valid index for the TVector<TElem> object. Count=4

mediapc,CQCServer,testThread11
(CQCKit,CQCKit_MEngErrHandler.cpp.153,Status-App Status,0-0-0)
MEng.User.CQC.Drivers.PowerLincUSB.PlcHelper.Power LincWrite([In] MEng.Card1 Length,[In] MEng.Card1 Byte1,[In] MEng.Card1 Byte2,[In] MEng.Card1 Byte3) Returns MEng.Void
MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl.IGet FirmwareVersion() Returns MEng.Card1
MEng.User.CQC.Drivers.PowerLincUSB.DriverImpl.Conn ect() Returns MEng.System.CQC.Runtime.CQCDriverBase.CommResults


mediapc,CQCServer,testThread11
(CIDMacroEng,CIDMacroEng_MethodImpl.cpp.873,Status -App Status,0-0-0)
An exception was thrown from method MEng.User.CQC.Drivers.PowerLincUSB.PlcHelper.Power LincWrite



Runtime Error:
Process: CQCDrvDev
Thread: CIDMacroDbgThread1

Error Information:
Facility: CIDLib
Error: The file read/write did not complete in the given timeout period. Name=\\?\hid#vid_10bf&pid_0004#6&1a992bf8&0&0000#{4d1e55b2-f16f-11cf-88cb-001111000030}
Aux Error Text:
Error Id: 1448
Kernel Error Id: 0
Host Error Id: 0
File: CIDLib_FileBase.cpp.711

Dean Roddey
10-17-2006, 06:21 PM
You were running in verbose log mode, right? If so, then those reading at the logical end of stream things are not a problem. That is normal and is used by the parser to deal with the end of the class input, since the end isn't obvious from the content being parsed. So it just reads until it gets an end of stream exception.

Unfortunately, it looks like the actual error didn't cause a stack dump for some reason.

Jonathan
10-17-2006, 06:23 PM
Sorry, hadn't finished editing the message. Try now :)

Dean Roddey
10-17-2006, 06:31 PM
I'm starting to think that the index error is really in response to something else going wrong and the failure is in response to that error, not the actual error itself. There's just nothing that could cause that error in the code path invoked by the Write operation.

The only thing I can see that might cause that, and I don't know how it could happen, is that a default case is never set for some switch statement. But the compiler won't allow that, so I can't see how it could happen.

Dean Roddey
10-17-2006, 06:34 PM
However, Windows still sees the device so when the driver checks to see if the device is there using FindDevice() it returns success and there is no error when calling Open() either so CQC then calls Connect() and the situation simply repeats itself at a very rapid rate (would changing the SetPollTimes reconnect value slow this down?)


So do you think that is what's happening? What is the reconnect time now? It should generally be a few seconds, which wouldn't cause the kind of problem you are seeing. Are you catching exceptions in the connect and returning the lost comm res indicator?

Jonathan
10-17-2006, 06:46 PM
Just to be clear, I think there are two different issues going on here. This one is related to (I assume) the PLC deciding it wants to take its toys home and doesn't want to play anymore. It does NOT peg the cpu, but it does cause a lot of log entries as it cycles the driver around and around (my reconnect was set to 3 seconds though it could be set higher). The only fix for this issue is to power cycle the PLC device so it isn't recoverable but should be "managed".

My other thread is about a seperate issue. In that case, same driver but in that situation we get the 100% cpu issue and the PLC is NOT dead, it is responding. I don't think the two are connected.

As I referenced, the PLC comes with software (called SDM) which, in this situation, also thinks the PLC is there but times out when trying to send data to it so at least we are seeing a consistent issue.

I don't think that regular users of the PLC will likely see this situation happen as the normal amount of traffic through the PLC should be an order of magnitude less than what I put through to induce this failure condition.

In a bigger picture kind of thing, I'd just like be able to catch this situation and find a way to handle it in a nicer way in case it happens when I'm gone for a week :-)

Dean Roddey
10-17-2006, 07:38 PM
I may send you an instrumented version to see if we can force a stack trace. I don't understand why it didn't do the trace in this particular case when the index error occured. That's what I want to find because it's showed up on another user's system and we couldn't figure it out. If it's the same one, I want to nail this one and put it away.

Jonathan
10-17-2006, 07:45 PM
Sounds good. I've also got remote console access to that machine too if you want to get hands on.

Dean Roddey
10-21-2006, 07:49 PM
OK, the C++ exception popup, after all that looking for something obscure, turned out to be something obvious. I was incorrectly calling the method that should only be called if a C++ exception leaks out of the CML code (meaning it was never translated to a CML exception.) Must have been a copy and paste error and I forgot to remove that call.

So anyway, that one will be fixed in the next 2.0 drop. I just ended up writing a simple 'echo back device' type program, and wrote a small driver to talk to it. That was a sufficiently simple test scenario to make the problem simple to find.

Dean Roddey
10-21-2006, 11:25 PM
Oops, this was the wrong thread for the above comment. I'll post it in the correct one.