05-01-2013, 03:29 PM
(This post was last modified: 12-31-2013, 04:14 PM by Dean Roddey.)
General Description
This thread is for discussion of the Media Repository device class. This is one of those cases where a device class effectively defines all of the functionality of a driver, though of course such drivers can implement other functionality beyond this class definition, or in some cases implement other device classes as well.
Media repositories have two fundamental jobs. They load up and cache media metadata from some source (files on disk, exported info from a media management program, etc.). They then serve up that metadata to clients via a fairly extensive set of backdoor commands. These commands are understood by the standard metadata-oriented media widgets, and that is how they display metadata and allow you to interact with it.
There are some important issues that all media repo drivers must deal with, mostly related to loading data, and their current reported state.
Data Loading
Clients need to know if repositories are ready to interact with, which means that they have loaded the required metadata and can now serve that data up correctly. They all have a load status field that indicates their current state, and clients will use this to know if it is safe to interact with them.
The 'load status' field has these values:
- Initializing - The repo is currently in the process of starting up, and of course is not ready for use.
- Loading - The repo is doing an initial load, so it is up and running but has not yet successfully loaded its data. So the client shouldn't try to use it yet, but probably can successfully wait for it to become ready without having to block for too long.
- Ready - The repo has loaded repository contents and is ready to use.
- Failed - The repo has failed to load, and is not going to successfully do so without some sort of change in circumstances and most likely some sort of user intervention. (The LoadStatus field definition below names this value Error.)
Once a repo has reached Ready, it stays there: a reload builds the new database alongside the existing one and swaps it in only if the load succeeds. This scheme ultimately provides a much better experience for the user. If a reload fails, it doesn't leave the repo incapacitated. The last successfully loaded contents remain in place and available. And it also means that data remains available during the actual reload process, which can take a bit of time in some cases. It does of course mean more memory usage since for a short while the database will be in memory twice, but the usage is small by modern standards.
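The reload behavior described here can be sketched roughly as follows. This is only an illustration in Python, not actual driver code: the `Repo` and `RepoData` classes and the `loader` callable are hypothetical names made up for the example.

```python
from typing import Callable, List, Optional


class RepoData:
    """Hypothetical in-memory database: just a list of title names here."""

    def __init__(self, titles: List[str]) -> None:
        self.titles = titles


class Repo:
    def __init__(self, loader: Callable[[], RepoData]) -> None:
        # 'loader' is a hypothetical callable that reads the metadata source
        self._loader = loader
        self.current: Optional[RepoData] = None

    def reload(self) -> bool:
        """Build a new database beside the old one; swap only on success."""
        try:
            # The old data stays in self.current while this runs, so for a
            # short while both copies are in memory.
            fresh = self._loader()
        except Exception as err:
            # A failed reload is only logged; the previous data is untouched.
            print(f"reload failed, keeping previous data: {err}")
            return False
        self.current = fresh
        return True
```

The point of the design is that `current` is only ever replaced by a fully loaded database, so clients never see a half-loaded or empty one after the first good load.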
Note that there is also a more general 'Status' field, which is open-ended and is used by the repo driver to reflect its current state. Unlike the load status field, this field MUST cycle through its states on each reload, and can be used to indicate actual loading status.
The basic strategy for the load status field will be:
- On driver initialization, set it to Initializing.
- Once the driver starts its first load attempt, set it to Loading.
- Once the first load is complete and data is available, set it to Ready and leave it there. If reloads fail, just log information about the failed loads.
- Use the Status field, as opposed to the LoadStatus field, to indicate current loading status on each reload.
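As a rough illustration of the strategy above, here is a minimal Python sketch of how the two fields might move. The class and method names (`RepoDriverState`, `begin_load`, `end_load`) and the Status texts are invented for the example. The sketch deliberately leaves out the failure state for an initial load, since the list above does not say when a driver should enter it.

```python
from enum import Enum


class LoadStatus(Enum):
    INITIALIZING = "Initializing"
    LOADING = "Loading"
    READY = "Ready"
    ERROR = "Error"   # listed for completeness; not entered in this sketch


class RepoDriverState:
    def __init__(self) -> None:
        # On driver initialization
        self.load_status = LoadStatus.INITIALIZING
        self.status = "Starting up"          # free-form, human-readable text

    def begin_load(self) -> None:
        # Only loads before the first success move LoadStatus; once Ready,
        # a reload leaves it alone and only the Status text changes.
        if self.load_status is not LoadStatus.READY:
            self.load_status = LoadStatus.LOADING
        self.status = "Loading metadata"

    def end_load(self, ok: bool, detail: str = "") -> None:
        if ok:
            self.load_status = LoadStatus.READY
            self.status = "Load complete"
        else:
            # Assumption: a failed load just reports through Status (and a
            # log in a real driver); LoadStatus keeps its current value.
            self.status = f"Load failed: {detail}"
```

With this shape, a client watching only LoadStatus sees Initializing, Loading, Ready once, while Status keeps changing on every reload.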
Caching Support
In order to be practical, media data must be cached by clients, so that they don't have to continually come back to the repo driver to get it. As of CQC version 4.4.902, this is accomplished by way of the 'Client Service', which is a service that runs on all machines, including clients, and which will over time provide various services for local clients. Currently one of them is to download and cache media repo data and serve it up to clients on that host.
All media repository drivers MUST implement a few simple strategies to allow the client service to cache data as effectively as possible. There are currently two caching strategies involved:
- Metadata - The metadata itself (the in-memory database that each repository maintains) is one set of data that the client service downloads and caches. To support this all repositories MUST provide a 'database serial number' that changes any time the database changes. Where at all possible this serial number SHOULD be the same if the driver is stopped and restarted, so that the client service doesn't think it needs to re-download the data.
- Cover Art - Cover art is even more of a burden, and all possible steps must be taken to avoid downloading it more than is required. All media repo drivers MUST generate a 'persistent id' for images that will always be the same for a given image, even across a restart of the driver. This way the client service can use this persistent id to know if it already has the image locally. Worst case, the driver must read in the image and hash it, if the metadata source doesn't provide some unique id that it already has in place.
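To show why these two ids matter, here is a small Python sketch of the kind of check a caching client could make. It is not how the actual Client Service is implemented; `ClientCache` and the `download` callables are hypothetical stand-ins.

```python
from typing import Callable, Dict, Optional


class ClientCache:
    def __init__(self) -> None:
        self.db_serial: Optional[str] = None   # serial number of the cached database
        self.database: Optional[object] = None
        self.images: Dict[str, bytes] = {}     # persistent id -> image bytes

    def sync_database(self, driver_serial: str,
                      download: Callable[[], object]) -> bool:
        """Download the database only if the driver's serial number differs."""
        if driver_serial == self.db_serial:
            return False                       # cache is current, nothing to do
        self.database = download()
        self.db_serial = driver_serial
        return True

    def get_image(self, persistent_id: str,
                  download: Callable[[], bytes]) -> bytes:
        """Fetch cover art once per persistent id, then serve it locally."""
        if persistent_id not in self.images:
            self.images[persistent_id] = download()
        return self.images[persistent_id]
```

If a driver handed out a new serial number or new image ids after every restart, both checks would miss and everything would be downloaded again, which is exactly what the requirements above are meant to avoid.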
For the overall metadata database, if the driver reads in an XML file, for instance, it can just read the whole file in and hash the contents. This guarantees that the serial number only changes if the file changes. If it doesn't have such a single source, it can load the database, then flatten the data and hash the resulting memory buffer. Actually, given that the repo driver probably doesn't use many of the fields exported by the source program, the latter might be preferable since it would only change if the actual data used changed. If the metadata source provides a unique id, then that can be used of course, as long as it is guaranteed to change if the metadata changes.
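Both approaches can be sketched in a few lines of Python. The choice of SHA-256, the JSON flattening, and the record fields `title` and `artist` are assumptions made for the example; any stable hash and any deterministic flattening would do.

```python
import hashlib
import json
from typing import Dict, List


def serial_from_file(path: str) -> str:
    """Hash the raw contents of a single source file (e.g. an XML export)."""
    digest = hashlib.sha256()
    with open(path, "rb") as src:
        for chunk in iter(lambda: src.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def serial_from_records(records: List[Dict[str, object]]) -> str:
    """Flatten only the fields the driver actually uses, then hash them."""
    # 'title' and 'artist' are hypothetical; a real driver would list
    # whatever fields it keeps from the source.
    used = [{"title": r["title"], "artist": r["artist"]} for r in records]
    # Sorting is a design choice here so that source ordering doesn't matter.
    used.sort(key=lambda r: (str(r["title"]), str(r["artist"])))
    flat = json.dumps(used, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(flat.encode("utf-8")).hexdigest()
```

With the second function, a change to a field the driver ignores (a rating, say) leaves the serial number untouched, so clients keep their cached copy.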
These values are just strings, so they can be hashes of data, they can be numeric values formatted to text, concatenations of pieces of info provided by the metadata source, or whatever is useful. As long as they only change when the actual data changes, they will serve their purpose. If doing a concatenation of information, and that information could be long, it would be better to hash it, so as to keep the id short and faster to compare.
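A possible way to build such ids, sketched in Python under a few assumptions: the function names, the `|` separator, and the 40 character cutoff are all arbitrary choices for the example, not anything CQC defines.

```python
import hashlib

MAX_PLAIN_LEN = 40   # arbitrary cutoff for when to hash instead of concatenate


def make_id(*parts: str) -> str:
    """Join pieces of source info; hash the result if it would be long."""
    # Assumes '|' does not occur inside the parts themselves.
    joined = "|".join(parts)
    if len(joined) <= MAX_PLAIN_LEN:
        return joined
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def image_persistent_id(image_bytes: bytes, source_id: str = "") -> str:
    """Prefer an id from the metadata source; otherwise hash the image."""
    if source_id:
        return source_id
    # Worst case: the driver reads the image in and hashes its contents.
    return hashlib.sha256(image_bytes).hexdigest()
```

Either way the result depends only on the underlying data, so it comes out the same after a driver restart.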
Fields Provided
[INDENT]The public field interface of media repositories is quite simple, despite the fact that the internals are very complicated. The fields provided by this device class have pre-determined names, and these MUST be implemented as indicated here. They are all prefixed by the device class prefix in the form:
MREPO#fieldname
where MREPO# indicates it is a field of this device class, and fieldname meets the general requirements of CQC field names. There will never be multi-unit considerations for this type of device class.
- DBSerialNum. This field MUST be a string field. Any time the database is modified such that it could invalidate existing information, this MUST be updated to uniquely reflect that loaded version of the database.
- LoadStatus. This field MUST be an enumerated, read-only field that indicates the status of the repository. The enumerated values MUST be Initializing, Loading, Ready, and Error. See the above discussion for how this field MUST be used.
- ReloadDB. This field MUST be a write-only Boolean field. Any value written to it will tell the repository driver to reload data from its metadata source. Where possible it should just kick off an asynchronous process that will load the new data while leaving the driver in a ready state in the meantime, and then just switch over to the new data without ever leaving the Ready state. If the driver can determine that the actual data hasn't changed, it should maintain the same value in DBSerialNum.
- Status. This field MUST be a string field, but the contents are open-ended and purely for human consumption, to provide status primarily during the loading process. Providing information about the current state of the loading process is often helpful for in-the-field diagnosis of issues. In many cases it will be status text gotten from the actual metadata source, just being passed through into this field. Note that this field DOES recycle on every reload of the database, unlike the LoadStatus field.
- TitleCnt. This field MUST be a read-only, unsigned value that is set to zero initially before the first good load. Once data is successfully loaded, it is just updated when new data is swapped in after a successful reload. It MUST NOT be reset to zero on each reload. Leave it reflecting the number of titles in the currently available data.[/INDENT]
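For reference, the field list above can be summarized as a small table in code. This is just a Python sketch for illustration: the `FIELDS` table and the helper functions are made up, `read_field` stands in for whatever mechanism a client uses to read a field value, and access is left as `None` where the list above doesn't state it.

```python
from typing import Callable, Dict, Optional, Tuple

PREFIX = "MREPO#"

# name -> (type, access); access is None where the text does not state it
FIELDS: Dict[str, Tuple[str, Optional[str]]] = {
    "DBSerialNum": ("string", None),
    "LoadStatus": ("enumerated", "read-only"),
    "ReloadDB": ("boolean", "write-only"),
    "Status": ("string", None),
    "TitleCnt": ("unsigned", "read-only"),
}


def full_name(field: str) -> str:
    """Return the prefixed field name, e.g. MREPO#TitleCnt."""
    if field not in FIELDS:
        raise KeyError(field)
    return PREFIX + field


def is_ready(read_field: Callable[[str], Optional[str]]) -> bool:
    """Client-side check: only talk to the repo once LoadStatus is Ready."""
    return read_field(full_name("LoadStatus")) == "Ready"
```

A client would call something like `is_ready(...)` before issuing any backdoor queries, and watch `MREPO#DBSerialNum` to decide whether its cached data is still good.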
Multi-Unit Considerations
[INDENT]This device class does not expect there to ever be multiple repositories within one controllable device, server, etc... at least not any that are considered to be of a piece. There may well be separate media repositories, but they will have separate driver instances.[/INDENT]
Backdoor Commands/Queries
[INDENT]There are many backdoor commands defined for media repositories, but they are not currently publicly defined.[/INDENT]
Dean Roddey
Explorans limites defectum