Containment Domains C++ API  0.1
Error Reporting

Classes

class  cd::SoftMemErrInfo
 Interface to soft memory error information.

class  cd::DegradedMemErrInfo
 Interface to degraded memory error information.

struct  cd::SysErrT
 Type for specifying errors and failure.
 

Enumerations

enum  cd::SysErrNameT {
  cd::kOK =0, cd::kSoftMem = 0b1, cd::kDegradedMem = 0b01, cd::kSoftComm = 0b001,
  cd::kDegradedComm = 0b0001, cd::kSoftComp = 0b00001, cd::kDegradedResource =0b000001, cd::kHardResource = 0b0000001,
  cd::kFileSys = 0b00000001
}
 Type for specifying system errors and failure names.
 
enum  cd::SysErrLocT {
  cd::kOK =0, cd::kIntraCore = 0b1, cd::kCore = 0b01, cd::kProc = 0b001,
  cd::kNode = 0b0001, cd::kModule = 0b00001, cd::kCabinet = 0b000001, cd::kCabinetGroup =0b0000001,
  cd::kSystem = 0b00000001
}
 Type for specifying errors and failure location names.
 
enum  cd::CDErrT { cd::kOK =0, cd::kAlreadyInit, cd::kError }
 Type for specifying error return codes from an API call; signifies some failure of the API call itself, not a system failure.
 

Functions

uint cd::DeclareErrName (const char *name_string)
 Create a new error/failure type name.

CDErrT cd::UndeclareErrName (uint error_name_id)
 Free a name that was created with DeclareErrName().

uint cd::DeclareErrLoc (const char *name_string)
 Create a new error/failure location name.

CDErrT cd::UndeclareErrLoc (uint error_name_id)
 Free a name that was created with DeclareErrLoc().
 

Detailed Description

The Error Reporting module includes the definition of types and methods used for system and CD runtime error/failure reporting.

Enumeration Type Documentation

enum cd::CDErrT

Type for specifying error return codes from an API call. It signifies some failure of the API call itself, not a system failure.

Unlike SysErrNameT, CDErrT is not for system errors, but rather for errors originating from the CD framework itself.

For now, the codes distinguish little beyond OK and error; future versions of this API will make them more elaborate.

Enumerator
kOK 

No errors/failures.

Call executed without error.

kAlreadyInit 

Init called more than once.

kError 

Call did not execute as expected.
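
The return codes above can be handled as in the following minimal sketch. The enum is a local stand-in that mirrors the documented enumerators, and MockInit() and Describe() are hypothetical helpers written only for this illustration; neither is part of the CD API.

```cpp
#include <cassert>
#include <string>

namespace retcode_sketch {

// Local stand-in mirroring the documented enumerators of cd::CDErrT.
enum CDErrT { kOK = 0, kAlreadyInit, kError };

// Hypothetical initializer: succeeds once, then reports kAlreadyInit.
inline CDErrT MockInit() {
  static bool initialized = false;
  if (initialized) return kAlreadyInit;
  initialized = true;
  return kOK;
}

// Hypothetical helper that maps a return code to a readable message.
inline std::string Describe(CDErrT err) {
  switch (err) {
    case kOK:          return "call executed without error";
    case kAlreadyInit: return "init called more than once";
    case kError:       return "call did not execute as expected";
  }
  return "unknown return code";
}

}  // namespace retcode_sketch
```

Keeping API-level codes in a separate type from system error names lets a caller tell "the call itself failed" apart from "the system reported a failure".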

enum cd::SysErrLocT

Type for specifying errors and failure location names.

Please see SysErrNameT for a discussion of intent and definitions.

This set of locations does not suit every topology, but the intent is to be comprehensive enough to maintain portability. How to reconcile the two remains an open question.

Todo:
Is SysErrLocT comprehensive enough for portability?
See also
SysErrNameT, DeclareErrLoc(), UndeclareErrLoc()
Enumerator
kOK 

No errors/failures.

Call executed without error.

kIntraCore 

Within a part of a core.

kCore 

A core.

kProc 

Processor.

kNode 

Same as processor?

kModule 

Module.

kCabinet 

A cabinet.

kCabinetGroup 

Some grouping of cabinets.

kSystem 

Entire system.

enum cd::SysErrNameT

Type for specifying system errors and failure names.

This type represents the interface between the user and the system with respect to errors and failures. The intent is for these error/failure names to be fairly comprehensive with respect to system-based issues, while still being general and abstract enough to be useful to the application programmer. The categories/names are meant to capture, comprehensively, the recovery strategies that may differ depending on the error/failure. The DeclareErrName() method may be used to create a programmer-defined error that may be associated with a programmer-provided detection method.

We considered doing an extensible class hierarchy like GVR, but ended up with a hybrid type system. There are predefined bit vector constants for error/failure names and machine location names. These may be extended by application programmers for specialized detectors. These types are meant to capture abstract general classes of errors that may be treated differently by recovery functions and therefore benefit from easy-to-access and well-defined names. Additional error/failure-specific information will be represented by the SysErrInfo interface class hierarchy, which may be extended by the programmer at compile time. Thus, each error/failure is a combination of SysErrNameT, SysErrLocT, and SysErrInfo.
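
The combination described above might look like the following minimal sketch. Everything in the record_sketch namespace is a stand-in invented for this illustration, not the CD API. In particular, the binary literals in the enumeration summary (0b1, 0b01, 0b001, ...) all denote the value 1 as written, so the sketch assumes the intent is one distinct bit per name. The fields of SoftMemInfo are likewise assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

namespace record_sketch {

// Assumed encoding: one distinct bit per predefined name (only a few shown).
enum SysErrNameBits : std::uint32_t {
  kSoftMem     = 1u << 0,
  kDegradedMem = 1u << 1,
  kSoftComm    = 1u << 2,
};

// Assumed encoding for locations, again one bit each.
enum SysErrLocBits : std::uint32_t {
  kIntraCore = 1u << 0,
  kCore      = 1u << 1,
  kProc      = 1u << 2,
};

// Stand-in for the empty SysErrInfo interface.
struct ErrInfo {
  virtual ~ErrInfo() = default;
};

// Compile-time extension by inheritance; the fields are illustrative only.
struct SoftMemInfo : ErrInfo {
  std::uintptr_t start = 0;
  std::uintptr_t length = 0;
};

// One error/failure: a name mask, a location mask, and optional details.
struct ErrRecord {
  std::uint32_t names = 0;
  std::uint32_t locations = 0;
  std::shared_ptr<ErrInfo> info;
};

// True if the record carries any of the names in `mask`.
inline bool HasName(const ErrRecord& e, std::uint32_t mask) {
  return (e.names & mask) != 0;
}

// True if the record carries any of the locations in `mask`.
inline bool HasLoc(const ErrRecord& e, std::uint32_t mask) {
  return (e.locations & mask) != 0;
}

}  // namespace record_sketch
```

With bit masks, a recovery function can test for a whole class of errors in one comparison, while the detailed payload stays behind the typed interface.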

This needs more thought

Warning
SysErrNameT and SysErrLocT are extensible by a runtime call that generates a new constant, but SysErrInfo is a class hierarchy and is extended at compile time by inheriting the interface. Is this a problem? Should we go the GVR way, with all extensions done at runtime and all accesses going through runtime methods, with no compile-time typing?
Todo:

Is SysErrNameT comprehensive enough for portability?

SEGV (segmentation violation) can be used as a proxy for soft memory errors using the existing kernel infrastructure.

See also
SysErrLocT, DeclareErrName(), UndeclareErrName()
Enumerator
kOK 

No errors/failures.

Call executed without error.

kSoftMem 

Soft memory error (info includes address range and perhaps syndrome)

kDegradedMem 

Hard memory error that disabled some memory capacity (info includes address range(s))

kSoftComm 

Soft communication error (info includes message info)

kDegradedComm 

Some channel loss.

kSoftComp 

Soft compute error (info includes affected PC and perhaps bounds on the error?)

kDegradedResource 

Resource lost some functionality

kHardResource 

Resource entirely lost (control/reachability failure).

kFileSys 

Some file

Function Documentation

uint cd::DeclareErrLoc ( const char *  name_string)

Create a new error/failure location name.

Returns
Returns a "constant" corresponding to a free bit location in the SysErrLocT bitvector.
See also
SysErrNameT, SysErrLocT, UndeclareErrLoc()
Parameters
name_string  user-specified name for a new error/failure location
uint cd::DeclareErrName ( const char *  name_string)

Create a new error/failure type name.

Returns
Returns a "constant" corresponding to a free bit location in the SysErrNameT bitvector.
See also
SysErrNameT, SysErrLocT, UndeclareErrName()
Parameters
name_string  user-specified name for a new error/failure type
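
As an illustration of handing out "a free bit location" and later freeing it, here is a minimal sketch of a registry over a 32-bit vector. The class NameRegistry and its methods are hypothetical and do not describe how the CD runtime implements DeclareErrName() or UndeclareErrName(). Returning 0 when no bit is free and kError for an unknown ID are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

namespace registry_sketch {

// Local stand-in mirroring the documented enumerators of cd::CDErrT.
enum CDErrT { kOK = 0, kAlreadyInit, kError };

// Hands out single-bit IDs from a 32-bit vector and takes them back.
class NameRegistry {
 public:
  // `reserved` marks the bits already taken by predefined names.
  explicit NameRegistry(std::uint32_t reserved) : used_(reserved) {}

  // Returns a mask with one free bit set, or 0 if no bit is free (assumed).
  std::uint32_t Declare(const char* name_string) {
    for (unsigned bit = 0; bit < 32; ++bit) {
      const std::uint32_t mask = std::uint32_t{1} << bit;
      if ((used_ & mask) == 0) {
        used_ |= mask;
        names_[mask] = name_string ? name_string : "";
        return mask;
      }
    }
    return 0;
  }

  // Frees an ID previously returned by Declare(). Predefined bits and
  // unknown IDs are rejected with kError (assumed behavior).
  CDErrT Undeclare(std::uint32_t id) {
    auto it = names_.find(id);
    if (it == names_.end()) return kError;
    names_.erase(it);
    used_ &= ~id;
    return kOK;
  }

 private:
  std::uint32_t used_;                                    // occupied bits
  std::unordered_map<std::uint32_t, std::string> names_;  // user-declared only
};

}  // namespace registry_sketch
```

Tracking user-declared names separately from the reserved mask keeps the predefined constants from being freed by mistake.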
CDErrT cd::UndeclareErrLoc ( uint  error_name_id)

Free a name that was created with DeclareErrLoc()

Returns
Returns kOK on success.
See also
SysErrNameT, SysErrLocT, DeclareErrLoc()
Parameters
error_name_id  ID to free

class cd::SysErrInfo

Interface to error/failure-specific information.

An abstract interface to specific error/failure information, such as address range, core number, degradation, specific lost functionality, ...

This is an empty interface because the information is very much error dependent. Also defining a few specific initial examples below. This follows the GVR ideas pretty closely.

See also
SoftMemErrInfo, DegradedMemErrInfo
CDErrT cd::UndeclareErrName ( uint  error_name_id)

Free a name that was created with DeclareErrName().

Returns
Returns kOK on success.
See also
SysErrNameT, SysErrLocT, DeclareErrName()
Parameters
error_name_id  ID to free