This page is under construction

Introduction

Here are some thoughts on using C++ in high-performance embedded devices.

Two rules

Do not use C++ constructs in an embedded system.
(For advanced developers) You can use C++ if you are very careful.

A side remark. It is not easy to give generic advice on how to write, or not to write, in this or that language. Different tricks and approaches can be used in different layers of the same application. Some things, though, are always wrong. It is wrong, for example, to start a project by saying that the performance of the management part of the application does not matter because it runs at low priority. That can be right in some circumstances after three years of active development, but not at the design phase.
Consistency in the code is very important. Use the same set of tools and reuse the same classes everywhere and as often as possible.
The main purpose of this document is to give some idea of what is probably not good in most cases. For example, blind usage of virtual functions can produce very nice and deep inheritance trees, but at an unbearable performance cost. And once you have started to use virtual methods, getting rid of them at a later phase is not an easy task. Converting a raw data packet into an object and then back into a raw data packet will not work in many (most?) applications.
Multiple inheritance should be used carefully or, better, avoided, because it complicates the class diagram and introduces unintended dependencies between unrelated parent classes.
An embedded system is anything from a smart card with 4K of ROM and 200 bytes of RAM to a high-end server with 16 CPUs interconnected by a proprietary high-speed bus. For every device the decision to use this methodology or that should be reached separately. Fortunately, we are almost never the first to try it.
Returning to C++, there are many different schools of thought. Some say that templates are not good because the object code is too large, and they are right. Some say that virtual methods are not good for exactly the same reason, and they are right too. Some say that we don't need templates because we have virtual methods and multiple inheritance, and that should be more than enough for anybody. And they are right too. There are also those (like me) who prefer larger object code over a performance penalty and advocate templates and limited use of virtual methods. I guess that this side has its points too. It all depends.

Typical limitations of an embedded system

ROM size
RAM size
CPU performance
Maintenance costs

If you think there are plenty of resources on the card when a project starts, after one year of development you will find that you need three times more, and that the most significant constraint is the CPU.
ROM size can dramatically influence compilation time. Tell me the RAM and ROM footprint of the application and I will tell you how long the card takes to boot. Compilation time plus boot time determines the productivity of the SW team and is (unexpectedly) one of the largest contributing factors. Long compilation times make rebuilds less frequent, and a slow boot makes debugging painful. The aim should be to make the compile and debug cycle quick and easy.

From The Unified Modeling Language User Guide by Grady Booch, James Rumbaugh and Ivar Jacobson:

In software, there are several ways to approach a model. The two most common ways are from an algorithmic perspective and from an object-oriented perspective. The traditional view of software development takes an algorithmic perspective. In this approach, the main building block of all software is the procedure or function. This view leads developers to focus on issues of control and the decomposition of larger algorithms into smaller ones. There's nothing inherently evil about such a point of view except that it tends to yield brittle systems. As requirements change (and they will) and the system grows (and it will), systems built with an algorithmic focus turn out to be very hard to maintain.
The contemporary view of software development takes an object-oriented perspective. In this approach, the main building block of all software systems is the object or class. Simply put, an object is a thing, generally drawn from the vocabulary of the problem space or the solution space; a class is a description of a set of common objects. Every object has identity (you can name it or otherwise distinguish it from other objects), state (there's generally some data associated with it), and behavior (you can do things to the object, and it can do things to other objects, as well).

Typical approach to a C++ application

list all objects we need for the application
list all interactions between objects
drop the lists on the design board
start to write code.

In packet-processing systems the data flow dictates significant parts of the design. Processing engines (processors) can be implemented as objects/classes, but not the data packets themselves. Do not throw away the procedural approach. It has its applications. C++ has a subset: regular C. There is no reason to put everything into objects. Or, let's say, some methods can remain static methods. Imagine that you need a driver for an FPGA. There is little doubt that, no matter what happens with the project, you will have only one FPGA because, for example, it is the PLL for the TDM clock. There is no special reason to create more than one object myClockFPGA of type ClockFPGA { ... }. Want to encapsulate the function names and write fpga.setRegister instead of fpgaSetRegister? That is OK and the right thing to do: we have the keyword static. A call to a static method has exactly the same performance as a call to a regular C function. Make sure that only one instance of the object can exist (move the constructor to the protected area and supply a static factory method which creates only one object and fails on subsequent calls).
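A minimal sketch of the single-FPGA driver described above. The register array fakeFpgaRegisters, the register count and the method names getRegister() and create() are assumptions made for illustration; on a real board the registers would sit at a fixed hardware address.

```cpp
#include <cstdint>

// Hypothetical register file standing in for the memory-mapped FPGA.
enum { FPGA_REGISTER_COUNT = 16 };
static std::uint32_t fakeFpgaRegisters[FPGA_REGISTER_COUNT];

class ClockFPGA
{
  public:

  // Static methods take no hidden "this" argument, so a call costs
  // the same as a call to a regular C function.
  static void setRegister(unsigned index, std::uint32_t value)
  {
    if (index < FPGA_REGISTER_COUNT) fakeFpgaRegisters[index] = value;
  }

  static std::uint32_t getRegister(unsigned index)
  {
    return (index < FPGA_REGISTER_COUNT) ? fakeFpgaRegisters[index] : 0;
  }

  // Factory: returns the only object on the first call and
  // a null pointer on every subsequent call.
  static ClockFPGA * create()
  {
    static ClockFPGA instance;
    static bool created = false;
    if (created) return nullptr;
    created = true;
    return &instance;
  }

  protected:

  // The constructor is not accessible to the application.
  ClockFPGA() {}
};
```

The application writes ClockFPGA::setRegister(3, value) and gets the name encapsulation without paying for an object.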
To use or not to use templates

some compilers do not support them
object code and especially debug information are inflated (70% or more)

Bottom line: there is simply no alternative.

Virtual functions

performance
object code size
ROM based data
code is more difficult to maintain

Avoid them at any cost in time-sensitive operations such as access to arrays, cyclic buffers, stacks and memory allocation.
The following is based on http://www.embedded.com/98/9802fe3.htm
The first cost is that it makes objects bigger. Every object of a class with virtual member functions contains a vtable pointer, so each object is one pointer bigger than it would be otherwise. Adding a virtual function can have a disproportionate effect on a small object. An object can be as small as one byte, and if a virtual function is added and the compiler enforces four-byte alignment, the size of the object becomes eight bytes. The second cost is the vtable lookup for a function call, rather than a direct call. The cost is a memory read before every call to a virtual function (to get the object's vtable pointer) and a second memory read (to get the function pointer from the vtable). The actual cost depends on the CPU. The C substitute would look like

(this->vTable[methodIndex])(this);

where methodIndex is a constant known at compilation time and vTable is a hidden field added to every class containing virtual methods.
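Both costs can be seen in a small sketch. The names PlainByte, VirtualByte, CObject and callGet are hypothetical, and the hand-written table only approximates what a compiler generates; exact object sizes depend on the ABI.

```cpp
#include <cstddef>

// The same one-byte payload, without and with a virtual method.
struct PlainByte
{
  char value;
  int get() const { return value; }
};

struct VirtualByte
{
  char value;
  virtual int get() const { return value; }
  virtual ~VirtualByte() {}
};

// Hand-written C-style equivalent of the virtual call.
struct CObject;
typedef int (*Method)(const CObject *);

struct CObject
{
  const Method * vTable;  // the hidden field of the C++ version
  char value;
};

static int cObjectGet(const CObject * self) { return self->value; }

static const Method cObjectVTable[] = { cObjectGet };
enum { METHOD_GET = 0 };  // methodIndex, known at compilation time

inline int callGet(const CObject * obj)
{
  // two memory reads (vTable pointer, then function pointer)
  // before the call itself
  return (obj->vTable[METHOD_GET])(obj);
}
```

Comparing sizeof(PlainByte) with sizeof(VirtualByte) on the target compiler shows the first cost directly.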

Array

What we are looking for in arrays.
There are three allocation models in C++.

allocation on stack (automatic variable)
dynamic allocation with new operator
static allocation with the keyword static

We will prefer arrays which are allocated statically or arrays which are automatic variables. We will prefer arrays with size known at compilation time.

An example of a naive implementation

class IntArray
{
  public:

  IntArray(int size)
  {
    this->size = size;
    data = (int *)malloc(size * sizeof(int));
  }

  int getSize()
  {
    return size;  // reference (this->size) and
                  // one additional argument (this)
  }

  int get(int index)
  {
    int size = getSize();
    if ((index < 0) || (index >= size)) ....
  }

  // call to the method forwards 3 arguments
  // 8060/8250/8260 have no stack
  void setData(int i, int val)
  {
    int size = getSize();
    if ((i < 0) || (i >= size)) ....
    else
      // imagine that this is array which should support
      // not int but array
      // then we need memcpy(). how can i write generic
      // interface which is equally efficient
      // for int, void * and C++ object/C structure ?
      data[i] = val;
  }

  int * data;  // who and when allocates this thing ?
  int size;
};

void IntArray::someOperation()
{
  // be careful - this->size can be read on
  // every loop iteration depending on the optimization
  for (int i = 0; i < size; i++)
  {
    .....
  }
}

void IntArray::someOperation_1()
{
  for (int i = 0; i < size; i++)
  {
    // performance ! is get() an inline function ?
    int val = get(i);
    ....
  }
}

Some of the problems with this implementation. What if we need the array in shared memory or in DPRAM, and our allocation function is not trivial? Another problem is methods in child classes that operate on the data. Calling get() every time to fetch the next entry is clearly not a good idea. The alternative, accessing the data directly, is not much better, because it assumes knowledge of the implementation details of the parent class. Such a child cannot be easily moved from one project to another, and the parent cannot be changed without at least reading the code of all child classes. Because the implementation allocates the data dynamically, we cannot use this class for automatic variables. Imagine a function where you need a temporary array of 10 integers to read HW registers into and then do some processing with this array.
No matter how well the memory management you use performs, objects of this class cannot be used for this task. You have to create only one array and make your function thread safe (the array is a static variable). The alternative is to use a regular C array of integers and lose all the debug statistics, error checking and performance monitors the parent class could provide. This is not going to be our way of solving the problem. But before we consider a different approach, let's notice that everything started from an attempt to bring memory allocation, a data holder and an iterator into one class. Indeed, malloc() is an engine allocating memory, which is not trivial and deserves attention and a separate class (actually a large family of classes). The data holder expressed by the field int * data is an array itself. The method get(int) is an attempt to provide some way to access the data, for example to iterate through the data, which is the most often used operation. Two engines and data in one class, and the result is about zero usefulness and bad performance. Start to use this class everywhere and the project will end pretty soon because of the "poor performance of C++".
How it could be done

// Container is a base class and can contain debug info,
// array name, statistics (important !), etc.
template <class ObjectType> class Container : public Object
{
  public:

  inline Container();
  inline ~Container();

  inline int size() const;

  // return non-zero if the Object belongs to the array
  // the function takes a while:
  // {
  //   int offset = (int)Object - (int)First;
  //   return ( (Object >= First) && (Object <= Last)
  //            && (offset % sizeof(* Data) == 0) );
  // }
  inline bool belongs(const ObjectType * Object) const;

  protected:

  inline void init();

  ObjectType * Data;
  ObjectType * First;
  ObjectType * Last;
  int Size;

  private:
};

A side remark regarding statistics. The base class will normally contain the object name, class name, time and date of creation, a log of access hits of the public methods and other debug statistics. Debug info comes first, subclasses come second. Take an array, for example. You would probably like to know who allocated an 8M array of chars and when, or when exactly (and by what task) the method set() was called with an out-of-range index. Array methods can call some dummy object, a debug info collector, at zero performance cost (an empty inline function). When you start to look for a problem you can replace the collector with some real object (even at run time). When writing a new class the first question to ask is "How will I debug it?"
The method belongs() is probably not a great idea for a public method, but it is a good protected method. There is an assumption, of course, that the data stored in the container is placed in an array, a consecutive memory region. Note that belongs() checks the address of the array entry. Basically it works like if ((index >= 0) && (index < arraySize)), but the test is on the address, not the index, and in many cases it is going to be faster. To improve the performance of the method, allocate aligned arrays and use an ObjectType whose size is a power of 2. A well-designed class Array can provide a constructor with an alignment parameter.
The method init() will set the First and Last fields. This is dirty, because it implies that the subclass should do something more besides just calling the parent constructor, but this may be the only problem. The major gain here is that Container has no idea how the subclass allocates data, but can still provide a lot of useful services, like belongs(). If a subclass fails to call init(), the problem will be apparent from the first run, because typically any iterator will call belongs() to check incoming arguments.
// do not use this class directly
// use ArrayS or ArrayD instead
template <class ObjectType> class Array : public Container<ObjectType>
{
  public:

  inline ObjectType& operator[](int Index);

  class IllegalIndex : public Exception
  {
    public:

    IllegalIndex(int Line, int Index) : Exception(Line, __FILE__)
    {
      this->Index = Index;
    }

    int Index;
  }; // class IllegalIndex

  virtual inline ~Array();

  protected:

  inline Array();
  inline void init();

  private:
}; // class Array

A small detail here. The IllegalIndex exception is part of the array and not of the container. A container can be a set, a map or a hash table. Some other array-related things can be put here, for example a method swap() or an iterator. Talking about exceptions: it is not a good idea to throw an exception every time there is an error in an argument. Typically, in the case of an uncaught exception the OS will suspend the calling thread. Exceptions are handled differently by different OSs and compilers, but we should expect that on throw the OS will allocate (dynamically!) an object of the required type. Throw and catch are very expensive operations.
The constructor of class Array is in the protected area. An application cannot create objects of this type directly. Array is just a placeholder in case we need some common API between all arrays in the system, for example registration of all created arrays and printing of relevant statistics like array name, owner thread, array size and number of fetches from the array in the last 10 seconds.
template <class ObjectType> class ArrayD : public Array<ObjectType>
{
  public:

  ArrayD(int Size);
  virtual ~ArrayD();

  protected:

  private:
}; // class ArrayD

// example of usage
{
  // runtime library (OS loader) calls new() and
  // constructor before application main() is called
  static ArrayD<int> intArray10(10);

  // this line will be executed when CPU reaches the line
  // somebody has to fill the array with actual pointers
  // presumably zero terminated strings
  ArrayD<char *> * stringArray = new ArrayD<char *>(10);
}

// array of strings
class ArrayString : public ArrayD<char *>
{
  public:

  ArrayString(int Size) : ArrayD<char *>(Size)
  {
    // First is (char **)
    for (char ** p = First; p <= Last; p++)
    {
      * p = NULL;
    }
  }
};

The elements will be allocated dynamically:

template <class ObjectType> ArrayD<ObjectType>::ArrayD(int Size) : Array<ObjectType>()
{
  // this-> is required: Data and Size live in a dependent base class
  this->Data = new ObjectType[Size];
  this->Size = Size;
  Array<ObjectType>::init();
}

template <class ObjectType> ArrayD<ObjectType>::~ArrayD()
{
  delete [] this->Data;
}

It is not for automatic variables, unless we have very efficient memory management. One exception is task variables. Usually ArrayD is used in static variables or explicitly created with operator new. An interesting thing here: Container (the parent class) does not handle any allocations. Container assumes that the child class will initialize the pointer to the array (Data) and set the array size in its constructor. This gives us the freedom to use different allocation modes in the child classes while sharing the same API. This trick is considered dirty, because OO suggests that the constructor of a non-abstract class should bring the object to a valid state, that is, initialize all internal fields. The baseline requirement is that after a Container is created, any Container method can be called with a reasonable result. My opinion is that you can use this approach if you do not want to write templates with an Allocator argument.
Talking about Allocator. Imagine you need a pool of blocks.
Sometimes a block is just raw data 96 bytes long, sometimes a block is a structure representing an IP header, sometimes it is a 4-byte address in a shared memory region and sometimes it is a C++ object. The base class pool is implemented as a stack of pointers to the stored objects and provides two methods, get() and free(), plus debug info such as who allocated or released a block and when. Let's say that sometimes we want to allocate the blocks (initially, to fill the pool) using malloc, sometimes we will point to reserved areas in DPRAM, etc. Clearly it is not a great idea to change the pool itself, or to subclass the pool, every time our allocation scheme changes. The pool works very well if we provide it with an allocation interface. We can build many different allocators: to align blocks, to get a power-of-2 block size, to preset a block header, etc. The pool does not care how the application does it. The pool knows very well how to push to and pop from a stack of pointers. This example is good because it shows how the problems should be separated. Bottom line: keep classes small, limit the API, isolate engine and data in separate classes; different functionality often means different classes.

template <class BlockType, class Lock, class AllocatorType> class Pool : public PoolStatistics
{
  public:

  // we need Allocator only once - in the call to the
  // Pool constructor.
  Pool(const char * Name, AllocatorType * Allocator, int NumberOfBlocks)
  {
    for (int i = 0; i < NumberOfBlocks; i++)
    {
      BlockType * block = Allocator->get();
      // now add to the stack of free blocks
    }
  }
};

get() can be the only public method in the Allocator, and it can be a static method too. get() can also be a field (not a method) of function type; that is still OK. Write a regular C-style function allocating blocks, write a class without a constructor containing a static field (the allocating function), and set the function. We can change the Allocator API however we like.
Class Lock is our next topic, but the idea behind it is quite clear.
Sometimes we want to protect allocation from the pool with interrupt disable, and sometimes we do not need any protection at all, because it is the same task that frees and allocates blocks. An example is a timer task receiving requests like timer start, timer stop, etc.
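The pool idea can be sketched in a compilable form. Everything beyond the BlockType, Lock and AllocatorType parameters is an assumption made for illustration: the MaxBlocks parameter, the malloc-based AllocatorMalloc, the available() method and the Block96 type are hypothetical, and PoolStatistics is left out.

```cpp
#include <cstddef>
#include <cstdlib>

// Lock that does nothing: the same task allocates and frees.
class LockDummy
{
  public:
  LockDummy() {}
  ~LockDummy() {}
};

// Hypothetical allocator with a single static method get().
template <std::size_t BlockSize> class AllocatorMalloc
{
  public:
  static void * get() { return std::malloc(BlockSize); }
};

template <class BlockType, class Lock, class AllocatorType, int MaxBlocks>
class Pool
{
  public:

  // The allocator is needed only once, to fill the pool.
  explicit Pool(int numberOfBlocks) : top(0)
  {
    for (int i = 0; (i < numberOfBlocks) && (i < MaxBlocks); i++)
    {
      BlockType * block = static_cast<BlockType *>(AllocatorType::get());
      if (block != nullptr) stack[top++] = block;
    }
  }

  BlockType * get()
  {
    Lock lock;  // released automatically on return
    return (top > 0) ? stack[--top] : nullptr;
  }

  void free(BlockType * block)
  {
    Lock lock;
    if (top < MaxBlocks) stack[top++] = block;
  }

  int available() const { return top; }

  private:

  BlockType * stack[MaxBlocks];  // stack of pointers to free blocks
  int top;
};

// 96 bytes of raw data, as in the text
struct Block96 { unsigned char raw[96]; };
typedef Pool<Block96, LockDummy, AllocatorMalloc<sizeof(Block96)>, 8> RawPool;
```

Switching to blocks in DPRAM or to an interrupt-disable lock changes only the typedef, not the pool. The sketch never returns the blocks to malloc; the pool is assumed to live as long as the application.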
template <class ObjectType, int Size> class ArrayS : public Array<ObjectType>
{
  public:

  ArrayS();
  virtual ~ArrayS();

  protected:

  private:

  ObjectType SData[Size];
}; // class ArrayS

This is our best friend: a static array that can be an automatic variable. The constructor is trivial and does nothing. Size is a compile-time constant. We get maximum performance; the price is bloated object code.

template <class ObjectType, int Size> ArrayS<ObjectType, Size>::ArrayS() : Array<ObjectType>()
{
  // this-> is required: Data and Size live in a dependent base class
  this->Data = SData;
  this->Size = Size;
  Array<ObjectType>::init();
}

How do we use this thing?

void myFunction()
{
  .....................
  {
    // new type - can be used more than once
    typedef ArrayS<int, 100> intArray100T;

    // constructor called - it is the same as to write
    // int SData[100];
    intArray100T intArray100;

    // we need 50 strings array only once
    ArrayS<char * , 50> stringArray50;

  } // call two destructors here (inline and empty
    // methods)
  ....................
}

What we have here is two arrays on the stack, two automatic variables. I am lying when I say that the constructor is empty, but the lie is not as big in reality as it appears. Let's move the method belongs() from Container to Array and we do not need any assignments in the constructor: the field Data (or SData) is initialized by the compiler, and the fields First and Last can be initialized by the compiler too. One can even place the keyword const before the First and Last declarations to make sure that nobody ever changes the address.
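The pieces can be put together in a reduced form that compiles as it stands. This is a sketch under assumptions, not the full design: the Object base, the intermediate Array class and the exceptions are dropped, init() takes the data pointer and size as arguments, and sumOfSquares() is a hypothetical usage function.

```cpp
// Container knows nothing about allocation; it only checks addresses.
template <class ObjectType> class Container
{
  public:

  int size() const { return Size; }

  bool belongs(const ObjectType * object) const
  {
    return (object >= First) && (object <= Last);
  }

  protected:

  Container() : First(nullptr), Last(nullptr), Size(0) {}

  // Called by the subclass once it knows where the data lives.
  void init(ObjectType * data, int size)
  {
    First = data;
    Last = data + size - 1;
    Size = size;
  }

  ObjectType * First;
  ObjectType * Last;
  int Size;
};

// Storage is part of the object, so the array can be an automatic variable.
template <class ObjectType, int N> class ArrayS : public Container<ObjectType>
{
  public:

  ArrayS() { this->init(SData, N); }

  ObjectType & operator[](int index) { return SData[index]; }

  private:

  ObjectType SData[N];
};

int sumOfSquares()
{
  ArrayS<int, 10> squares;  // on the stack, no call to new or malloc
  for (int i = 0; i < squares.size(); i++) squares[i] = i * i;

  int sum = 0;
  for (int i = 0; i < squares.size(); i++) sum += squares[i];
  return sum;
}
```

Copying such an array would copy First and Last as well, leaving them pointing into the source object, so a production version would need to forbid or fix up copies.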

Locks

What we are looking for in locks

different types of locks
automatic release of the lock
both statically and dynamically created locks

Example.


class MyLock
{
  public:

  MyLock()
  {
    semId = new Semaphore();
  }

  void lock()
  {
    ...
  }

  void unlock()
  {
     ..
  }

  private:
  Semaphore * semId;
};

This is a somewhat naive implementation. It will definitely work, but it is not maintainable. Let's say you suddenly decide that a semaphore is not good enough and you need interrupt disable instead. Subclass MyLock and replace the lock() and unlock() methods? The semaphore will still be created by the parent and ignored by the child. Another problem with this class is that if you need a global variable of type MyLock, every module using your lock will have to include the class definition. While it does not appear to be a big deal, it would be nice to have only one class in the include file and a one-line declaration of a synchronization object of a well-known type (included from a common library).
Clearly, locks are not supposed to be global variables unless this is a library. For example, imagine a driver providing an API to an I/O device. The driver can be made reentrant using interrupt disable, task switch disable or a semaphore. In all three cases the driver (a set of functions) makes some assumptions about the underlying CPU and OS. Even worse, the driver assumes that the application is multitasking, which is not always the case. The application has no alternative but to live with interrupt disable, or we have to change the source code of the driver.
The alternative is to call Lock (an internal class of the driver) and require the application to provide a synchronization object. The application can supply a dummy object (empty inline functions) or interrupt disable. We have isolated the low-level driver from the OS and CPU.


// better way to do the same
class MyLock
{
  public:

  inline MyLock(int semId)
  {
    this->semId = semId;
    semGet(semId);
  }

  inline ~MyLock()
  {
    semSend(semId);
  }


  private:

  int semId;
};

A side remark. Imagine a function with multiple return points, that is, one that calls return more than once. Let's say it is a lookup in a database and the whole function is protected by a semaphore. The function starts with getSemaphore(), and on every return line I have to place sendSemaphore(). Without going too deep into the problems of nontrivial critical sections, I will only say that a C++ lock is an absolute must. Returning to the database containing this kind of lookup: there are clearly better solutions. One can always start the project stating that only low-priority tasks will access the database or, for example, that typically only one task will access the database, so it does not matter that we lock the API for a long time. These arguments are all wrong and will not survive even one year of application development.
This is also an example of dividing APIs between separate classes. MyLock is NOT a semaphore. MyLock contains and uses a semaphore or any other synchronization interface. Semaphore is a separate class and MyLock is a separate class. They have some things in common, but not enough to put them in the same inheritance tree.
How do we solve the problem with the global lock here? The semaphore ID is a global variable, and MyLock is a well-known class (LockSemaphore).



// Usage
int main()
{
  {  // critical section is started here
     MyLock lock(globalSemId);  // constructor is called
     ..........
  }  // destructor is called
}
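The case of a lookup with several return points can be sketched as follows. The functions semGet() and semSend() are replaced here by hypothetical stubs that only count, so the example runs without an RTOS; the table and the lookup() function are made up for illustration.

```cpp
// Hypothetical stand-ins for the RTOS semaphore calls.
static int semaphoreDepth = 0;  // greater than zero while the semaphore is held
inline void semGet(int /* semId */)  { semaphoreDepth++; }
inline void semSend(int /* semId */) { semaphoreDepth--; }

class MyLock
{
  public:

  inline MyLock(int semId) : semId(semId) { semGet(semId); }
  inline ~MyLock() { semSend(semId); }

  private:

  int semId;
};

static const int table[] = { 10, 20, 30 };

// Returns the index of key in the table, or -1 if it is not there.
int lookup(int semId, int key)
{
  MyLock lock(semId);            // the only line that mentions the semaphore

  if (key <= 0) return -1;       // first return point
  for (int i = 0; i < 3; i++)
  {
    if (table[i] == key) return i;  // second return point
  }
  return -1;                     // third return point
}
```

The destructor runs on every path out of lookup(), so no return line can forget to release the semaphore.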

But what about different kinds of locks?


// for example
//
//  {
//     Lock();    --- disable interrupts
//
//     do critical section
//
//
//  }                --- call to ~Lock, enable interrupts
class Lock
{
  public:
  
  Lock()
  {
    Mutex.get();  // here i have no idea what kind of mutex it is 
                  // and i do not care i need get/release methods 
  }
  
  ~Lock()
  {
    Mutex.release();
  }
  
  
  protected:
  
  MutexInterrupt Mutex;
  
  
  private:
  
}; // class Lock



class MutexInterrupt : public SynchroObject
{
  public:
  
  MutexInterrupt()
  {
  }
  
  ~MutexInterrupt()
  {
  }
  
  virtual void get()                 
  {                                  
    RTOS::interruptDisable();        
  }                                  
                                     
  virtual void release()
  {
    RTOS::interruptEnable();
  }
  
  protected:
  
  
  private:
  
}; // class MutexInterrupt


class SynchroObject
{
  public:
  
  SynchroObject()
  {
  }
  
  virtual ~SynchroObject()
  {
  }
  
  virtual void get() = 0;
  virtual void release() = 0;
  
  protected:
  
  
  private:
}; // class SynchroObject

Want to improve performance? Remove the dependency on SynchroObject and add the keyword inline; Lock does not require SynchroObject. Want a property shared among all mutexes, like debug counters? No problem: create a parent containing the required methods, but leave the get and release methods to the child. Avoid virtual functions.


// disable/enable context switching
// for example
//
//  {
//
//     LockOS();    --- disable context switching
//
//     do critical section
//
//
//  }  call to ~LockOS, enable context switching
class LockOS
{
  public:
  
  LockOS()
  {
    Mutex.get();
  }
  
  ~LockOS()
  {
    Mutex.release();
  }
  
  
  protected:
  
  MutexOS Mutex;
  
  
  private:
  
}; // class LockOS


// dummy lock
class LockDummy
{
  public:
  
  LockDummy()
  {
  }
  
  ~LockDummy()
  {
  }
  
  
  protected:
  
  private:
  
}; // class LockDummy

This one is not just a placeholder. Applications will use this lock a lot. For example, the database API requires a Lock. The database does not assume any knowledge of the OS or of the design of the application. The database code carefully calls the lock to protect all critical sections. It is the application's decision what kind of lock (if any) to use.
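A sketch of a database that leaves the choice of lock to the application. The Database class, its methods and the LockCounting class are hypothetical; LockCounting only counts calls and stands in for a real semaphore or interrupt-disable lock.

```cpp
// Dummy lock: empty inline constructor and destructor, zero cost.
class LockDummy
{
  public:
  LockDummy() {}
  ~LockDummy() {}
};

// Hypothetical lock that only counts how often it is taken and released.
class LockCounting
{
  public:
  LockCounting() { taken++; }
  ~LockCounting() { released++; }

  static int taken;
  static int released;
};
int LockCounting::taken = 0;
int LockCounting::released = 0;

// The database protects every critical section with Lock
// and knows nothing about the OS behind it.
template <class Lock> class Database
{
  public:

  Database() : count(0) {}

  bool add(int value)
  {
    Lock lock;
    if (count >= Capacity) return false;
    records[count++] = value;
    return true;
  }

  bool contains(int value)
  {
    Lock lock;
    for (int i = 0; i < count; i++)
    {
      if (records[i] == value) return true;
    }
    return false;
  }

  private:

  enum { Capacity = 8 };
  int records[Capacity];
  int count;
};
```

A single-task application declares Database<LockDummy> and pays nothing for the protection; a multitasking one passes a real lock class without touching the database code.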


class MutexOS : public SynchroObject
{
  public:
  
  MutexOS()
  {
  }
  
  virtual ~MutexOS()
  {
  }
  
  virtual void get()
  {
    RTOS::disable();
  }
  
  virtual void release()
  {
    RTOS::enable();
  }
  
  protected:
  
  
  private:
  
}; // class MutexOS

class MutexSemaphore
{
  public:
  
  MutexSemaphore()
  {
    semId = new Semaphore();
  }
  
  ~MutexSemaphore()
  {
  }
  
  void get()
  {
    semGet(semId);
  }

  void release()
  {
    semSend(semId);
  }
  
  protected:
  
  
  private:

  Semaphore * semId;
}; // class MutexSemaphore


class LockSemaphore
{

  public:
  
  LockSemaphore(Semaphore * semaphore)
  {
    this->semaphore = semaphore;
    semaphore->get();
  }
  
  ~LockSemaphore()
  {
    semaphore->release();
  }
  
  protected:
  
  Semaphore * semaphore;
  
}; // class LockSemaphore

This lock could use a semaphore, a signal or any other object. Let's say that we have no idea what kind of synchronization object we are going to use three months from now, because we have decided to move from vxWorks to Linux.



template <class SynchroObject> class LockSemaphore
{
  public:
  
  LockSemaphore(SynchroObject * synchroObject)
  {
    this->synchroObject = synchroObject;
    synchroObject->get();
  }
  
  ~LockSemaphore()
  {
    synchroObject->release();
  }
  
  protected:

  SynchroObject * synchroObject;
}; // class LockSemaphore

SynchroObject is just any class, assuming that it has two public methods, get() and release().
Pay attention to the implicit operator new. A line like static MutexSemaphore myMutex; is a bug, because when static variables are initialized the RTOS is probably not yet initialized or running. In Linux user space the C++ runtime does the job correctly, though.



typedef LockSemaphore<MutexSemaphore> myLockT;
int main()
{
  MutexSemaphore * myMutex = new MutexSemaphore();
  // ...................................
  {  // critical section
    myLockT lock(myMutex);  // calls LockSemaphore() and in turn
                            //  myMutex->get()
  }  // release()
}

Locks and Arrays meet each other in the iterator



#define _ITERATOR_TEMPLATE_ARGS class ListType, class ObjectType,   \
   class IndexType, class Lock
#define _ITERATOR_TEMPLATE_ARG_LIST ListType, ObjectType, IndexType, Lock

// example:
// class MyArrayT : public ArrayS<int, 20>
// {
//   public:
//   
//   bool getNextIndex(int * Index)
//   {
//     * Index = (* Index) + 1;
//     if (* Index < size()) return true;
//     else return false;
//   }
//   
//   int * getEntry(int Index)
//   {
//     return &Data[Index];
//   }
//   
//   bool getFirstIndex(int * Index)
//   {
//     * Index = 0;
//     return true;
//   }
// }; // class MyArrayT
// MyArrayT MyArray;
// Iterator<MyArrayT, int, int, LockDummy> iterator(&MyArray);
// const int * value;
// while (iterator.next(&value))
//    printf("%d\n", * value);

template <_ITERATOR_TEMPLATE_ARGS> class Iterator
{
  
  public:
  
  Iterator(ListType * List);
  ~Iterator();
  
  typedef const ObjectType * PObjectType;
  // return true if Ok
  bool next(PObjectType * PObject);
  
  protected:
  
  IndexType Index;
  ListType * List;
  bool NotFirst;
}; // class Iterator

This iterator is not good enough. Let's add a method bool getNext(Object & o) which, instead of fetching the next index, fetches the next entry of the array. The implementation is obvious: it is exactly the ++ operator we would apply to a regular C pointer. Let's make the method inline and suddenly the performance of our IntArray::someOperation() method is similar to that of C. Imagine that the inline method is already in the code cache; the only price we are going to pay is one additional dereference: instead of * (p) we have * (this->p). Typically "this" is a register variable and the performance overhead is essentially zero on many CPUs.
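A minimal sketch of such an iterator, assuming the array is a consecutive memory region. The class name ArrayIterator, the (first, count) constructor and the choice to hold the lock for the lifetime of the iterator are assumptions made for illustration.

```cpp
// Dummy lock: empty inline constructor and destructor.
class LockDummy
{
  public:
  LockDummy() {}
  ~LockDummy() {}
};

template <class ObjectType, class Lock> class ArrayIterator
{
  public:

  ArrayIterator(ObjectType * first, int count)
    : p(first), end(first + count)
  {
  }

  // Copies the next entry into o; returns false when no entries are left.
  // This is the same pointer increment we would write in C.
  inline bool getNext(ObjectType & o)
  {
    if (p == end) return false;
    o = * p++;
    return true;
  }

  private:

  Lock lock;         // taken on construction, released on destruction
  ObjectType * p;    // next entry to return
  ObjectType * end;  // one past the last entry
};

int sumAll(int * data, int count)
{
  ArrayIterator<int, LockDummy> iterator(data, count);
  int sum = 0;
  int value;
  while (iterator.getNext(value)) sum += value;
  return sum;
}
```

With LockDummy and an inlined getNext() the loop in sumAll() reduces to a plain pointer walk.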

Inheritance

You have probably noticed that I do not use inheritance a lot. For example, one could argue that

array is a container
FIFO is an array
mailbox (message queue) is FIFO

and get a very nice and clean (and, as we will see next, not very useful) inheritance tree. I will put it another way:

array is a container
FIFO is a container
mailbox contains FIFO

Let's compare the two approaches (naturally, I prefer the latter). The basic idea is never to build inheritance for the sake of inheritance. It can, and eventually will, lead to less maintainable code. I am not saying that inheritance should be avoided, but that every case should be considered carefully.
When we write a parent class and then subclass it, we basically limit our freedom to change not only the API of the parent class but also its implementation details. Imagine that the parent class takes a lock in all public functions. Assume also that the subclass should call one of them, but must take the same lock before that. We have a problem here, because our subclass is suddenly aware of the implementation details of the parent. Let's say instead that the parent provides public methods which start with the lock and then call internal (protected) methods. The subclass (child) will call only the protected methods and not the API. Again there is a problem: API performance is not as good as it could be, and the subclass has to play by the rules and take the lock before calling any parent method.
How does the word "contains" solve the problem? Mailbox has access only to the public methods of FIFO. If you decide that CyclicBuffer better suits the Mailbox needs, you just replace the declaration, for example


template <class Message> class Mailbox
{
  public:

  Message * recv()
  {
    return q.remove();
  }

  void send(Message * msg)
  {
    q.add(msg);
  }

  private:

  FIFO<Message * > q;
};

template <class Message> class MailboxNew
{
  public:

  Message * recv()
  {
    return q.remove();
  }

  void send(Message * msg)
  {
    q.add(msg);
  }

  private:

  CyclicBuffer<Message * > q;
};

No changes in the implemenation of Mailbox. We are free to do with FIFO whatever we want. Imagine that Mailbox is a subclass of FIFO. Suddenly Mailbox has more methods and fields to access (polluted name space). But this is relatively small problem.
Another example is a card in a shelf: a single router blade in a grid router. The router blade contains many public properties and only a very limited set of public methods. Most of the properties are themselves nontrivial objects. For example, the parent class Card can contain the methods

reset
get slot
management block
management unblock

and properties

System Log
Alarm Log
LEDs

Log contains a FIFO and is a nontrivial class. Card has a property, the public field Log. Log provides an API like clear(), stop(), pause(), print(), printWithFilter(), etc. The same goes for the alarm log. LEDs is an object containing an array of LED objects (the array can be empty). LEDs has a public method ledsTest().
Will routerCard inherit from the base Card? The answer is clear: yes. If an application has two different blade cards, like a router with a single 1GB interface and a card with 4x100M, we have correct inheritance. The interface itself is a property of the card. There are not a thousand get()/set() methods telling the outside world what kind of interfaces the card has. There is a field, an array of interfaces, or even better, add an iterator over the interfaces as well. Then everyone can find out what interfaces the card has and check the status of every one of them.
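A small sketch of the "array of interfaces as a public property" idea. Everything specific here is an assumption for illustration: the Interface fields, the RouterCard4x100 class, the countUp() helper, and this cut-down ArrayS, which only imitates the array class used elsewhere on this page.

```cpp
// Hypothetical interface description.
struct Interface
{
  int speedMbps;
  bool up;
};

// Cut-down fixed-size array with a class constant Size and iterator-style access.
template <class T, int N> class ArrayS
{
  public:
  static const int Size = N;
  T & get(int i) { return items[i]; }
  T * begin() { return items; }
  T * end() { return items + N; }
  private:
  T items[N];
};

class Card
{
  public:
  int getSlot() const { return slot; }
  protected:
  Card(int slot) : slot(slot) {}
  private:
  int slot;
};

class RouterCard4x100 : public Card
{
  public:
  typedef ArrayS<Interface, 4> Interfaces;
  Interfaces interfaces;  // public property, no get()/set() per interface

  RouterCard4x100(int slot) : Card(slot)
  {
    for (Interface * i = interfaces.begin(); i != interfaces.end(); ++i)
    {
      i->speedMbps = 100;
      i->up = false;
    }
  }
};

// Any client can walk the interfaces; the card needs no extra methods for it.
template <class CardType> int countUp(CardType & card)
{
  int n = 0;
  for (Interface * i = card.interfaces.begin(); i != card.interfaces.end(); ++i)
  {
    if (i->up) n++;
  }
  return n;
}
```

The card class itself stays at two or three methods; everything about the interfaces is reached through the one public field.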
A reasonable number of methods for a class is 3-10, including those inherited from the parent. If you see that you need more methods, consider public property fields.
A simple Property field is another example of making a property public while still providing restricted access to the field and saving multiple get()/set() methods. Consider the following class


template <class FieldType, class ChildType> class PropertyRO : public Object
{

  public:
  
  inline PropertyRO();
  inline ~PropertyRO();
  
  inline operator FieldType() const;

  protected:

  inline PropertyRO(FieldType Value);
  inline PropertyRO(const ChildType & Property);

  FieldType Value;
  
  inline ChildType & operator=(FieldType Value);

  private:
  
}; // class PropertyRO

example of usage


class MyCard 
{

  public: 

  class Slot : public PropertyRO<int, Slot>
  {
     ....
     friend class MyCard;
  };

  // public property field. application can 
  // not change, but can read it. 
  // no additional methods are required
  Slot slot;  
              
};

We can still give MyCard access to the internals of Slot. MyCard is a friend of Slot and can access any protected or private method/field. For example, the MyCard constructor can set up the initial value of the Slot.
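A reduced, compilable version of the read-only property, as a sketch only: the Object parent and the extra constructors are left out, and the MyCard constructor argument is an assumption. One detail that is easy to miss: the implicitly declared copy assignment of Slot hides the operator= inherited from PropertyRO, so Slot has to pull it back in with a using-declaration.

```cpp
// Read-only property: anyone can read, only the child class and its friends can write.
template <class FieldType, class ChildType> class PropertyRO
{
  public:
  inline PropertyRO() : Value() {}
  inline operator FieldType() const { return Value; }

  protected:
  FieldType Value;

  inline ChildType & operator=(FieldType NewValue)
  {
    Value = NewValue;
    return static_cast<ChildType &>(*this);
  }
};

class MyCard
{
  public:

  class Slot : public PropertyRO<int, Slot>
  {
    // Private here: un-hides the base operator= for Slot and its friends only.
    using PropertyRO<int, Slot>::operator=;
    friend class MyCard;
  };

  Slot slot;  // public property field, read-only for the application

  MyCard(int slotNumber)
  {
    slot = slotNumber;  // allowed: MyCard is a friend of Slot
  }
};
```

Application code can write int s = card.slot; but card.slot = 5; does not compile outside MyCard, because the assignment operator is not accessible there.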

operator = can be implemented with a lock if FieldType is not trivial or, for example, it can call a method of the wrapping class (MyCard) to read/write hardware. A property can be a memory mapped HW register, for example. The idea here is to provide the same performance as a regular C construct like a #define around *(UINT32 *)addr = value. Thanks to the inline static methods we can do it without any performance penalty, and the compiler will check all types.


class Register32 : public PropertyRW<UINT32, Register32>
{
};

Your evaluation board contains many different blocks, like CPU, FPGA, etc., and every one of them contains registers. While some of the most often used functionality can be implemented in dedicated methods, in some cases direct access to the registers is reasonable. For example, a HW driver can contain two parts: a low level part (just registers, bit masks, initialization, reset, getVersion routines) and a high level part, the FSM. The FSM can find it more convenient to write directly to the registers instead of calling a separate method for setting every single bit.
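A sketch of how a register property could look. PropertyRW is not shown on this page, so the version below is a guess at its shape, not the real class; fakeControlRegister and enableBlock() are hypothetical and exist only so that the example runs on a PC. On a target the address would be a fixed bus address.

```cpp
#include <stdint.h>

typedef uint32_t UINT32;

// Read/write property bound to an address; assumed shape of PropertyRW.
template <class FieldType, class ChildType> class PropertyRW
{
  public:
  explicit PropertyRW(volatile FieldType * address) : address(address) {}

  inline operator FieldType() const { return *address; }

  inline ChildType & operator=(FieldType value)
  {
    *address = value;  // same store as *(volatile UINT32 *)addr = value
    return static_cast<ChildType &>(*this);
  }

  private:
  volatile FieldType * address;
};

class Register32 : public PropertyRW<UINT32, Register32>
{
  public:
  explicit Register32(volatile UINT32 * address)
    : PropertyRW<UINT32, Register32>(address) {}
  using PropertyRW<UINT32, Register32>::operator=;  // keep the base operator= visible
};

// Stand-in for a device register, so the sketch can run without hardware.
static volatile UINT32 fakeControlRegister = 0;

// FSM-style code can set a bit directly, with type checking by the compiler.
inline void enableBlock(Register32 & control)
{
  control = control | 0x1u;  // read-modify-write through the property
}
```

Both operators are inline, so after optimization the read and the write should reduce to a plain volatile load and store.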

Virtual functions

What is a virtual function?
How can it be implemented?
What exactly does the compiler do with it?
Usually I will sacrifice virtuality and polymorphism to gain performance.
Some examples.
Shelf contains different cards



#include <stdio.h>

// Card goes first: printCards() below needs the complete type
class Card
{
  public:
  virtual const char * getName() = 0;  // pure virtual function, the class is abstract
};                                     // we can not create objects of this type


class NoCard : public Card
{
  public:
  virtual const char * getName() // implements virtual method
  {                              // Objects of type NoCard can be created
    return  "Empty";
  }
};

class Shelf
{
  public:
  class Cards : public ArrayS<Card * , 12>  // array containing 12 cards (12 pointers to the card structure)
  {
  };
  Cards cards;  // array of cards

  void printCards()
  {
    for (int i = 0;i < Cards::Size;i++)  // pay attention to the Size - this is a class constant (remember Array ? )
    {                                    // not the best way to write code but interesting as an example
      printf("%s\n", cards.get(i)->getName());
    }
  }
};
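To make the cost of getName() visible, here is roughly what a typical compiler does behind the scenes: one hidden pointer per object and one indirect call per invocation. This is a hand-written imitation in plain structs with hypothetical names (CardC, CardVTable and so on); real compilers differ in the details of the layout.

```cpp
struct CardC;

// Table of function pointers, one table per class, shared by all its objects.
struct CardVTable
{
  const char * (*getName)(const CardC * self);
};

// Every object carries a hidden pointer to the table of its class.
struct CardC
{
  const CardVTable * vtable;
};

static const char * noCardGetName(const CardC *) { return "Empty"; }
static const char * routerCardGetName(const CardC *) { return "Router"; }

static const CardVTable noCardVTable = { noCardGetName };
static const CardVTable routerCardVTable = { routerCardGetName };

// What a constructor adds: it stores the table pointer in the object.
inline void initNoCard(CardC * card) { card->vtable = &noCardVTable; }
inline void initRouterCard(CardC * card) { card->vtable = &routerCardVTable; }

// What card->getName() turns into: load the table, load the pointer, call through it.
inline const char * getName(const CardC * card)
{
  return card->vtable->getName(card);
}
```

The indirect call is what usually hurts: the compiler can not inline through it, and on small CPUs the extra loads are paid on every call.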

The next question is how we create Card objects. First of all, let us hide the constructor.


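class Card
{
  public:
  static Card * factory(CardType, CardVersion, Slot);  // pay attention to the word static

  ~Card()
  {
  }

  void operator delete(void * block);  // returns the block to the memory pool

  void * operator new(size_t size);    // takes a block from the memory pool


  protected:
  Card()  // hidden: only factory() and the subclasses can create cards
  {
  }
};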

factory() allocates a block of the required size from the memory pool and calls the related initialization routine. For example,


class NoCard : public Card
{
  friend class Card;  // factory() is a member of Card and calls the constructor
  NoCard()
  {
    // init the state of the object
  }
};

Card * Card::factory(CardType type, CardVersion version, Slot slot)
{
  Card * card = NULL;
  switch (type)
  {
    case CardTypeNoCard:
    card = new NoCard();  // call operator new to allocate memory, call NoCard constructor
    break;
  }
  return card;
}

If new() is fast, the code is efficient enough even if you want to create cards as automatic (temporary) variables, assuming that we are talking about functions at the high levels of the application. Why could we not use Array in the same way? In the case of cards I know the maximum size of the block I need to create a card (this is the size of the largest child) and I know the number of simultaneously existing card objects. The implementation of operator new is then trivial: take a block from the stack of free blocks. Pay attention that there is only one virtual function so far, getName(). I would suggest keeping it this way.
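A sketch of the whole arrangement: a stack of free blocks, class-specific operator new/delete and the factory. CardPool, its sizes, the RouterCard class and the slot argument are assumptions for the example. Two deviations from the text should be said out loud: operator new is marked noexcept so that an empty pool yields NULL instead of undefined behaviour, and the destructor is virtual so that delete through a Card * is well defined. That is a second virtual function, but it is paid only on deletion.

```cpp
#include <stddef.h>

// Fixed-size block pool: a stack of free blocks.
class CardPool
{
  public:
  static const size_t BlockSize = 64;  // must cover the largest Card child
  static const int BlockCount = 12;    // one block per slot in the shelf

  static void init()
  {
    freeTop = 0;
    for (int i = 0; i < BlockCount; i++) freeStack[freeTop++] = blocks[i].bytes;
  }
  static void * allocate(size_t size)
  {
    if (size > BlockSize || freeTop == 0) return NULL;
    return freeStack[--freeTop];
  }
  static void release(void * block)
  {
    if (block != NULL) freeStack[freeTop++] = block;
  }
  static int available() { return freeTop; }

  private:
  union Block { char bytes[BlockSize]; long double align; void * alignPtr; };
  static Block blocks[BlockCount];
  static void * freeStack[BlockCount];
  static int freeTop;
};
CardPool::Block CardPool::blocks[CardPool::BlockCount];
void * CardPool::freeStack[CardPool::BlockCount];
int CardPool::freeTop = 0;

class Card
{
  public:
  enum CardType { CardTypeNoCard, CardTypeRouter };

  static Card * factory(CardType type, int slot);

  virtual ~Card() {}
  virtual const char * getName() = 0;
  int getSlot() const { return slot; }

  void * operator new(size_t size) noexcept { return CardPool::allocate(size); }
  void operator delete(void * block) { CardPool::release(block); }

  protected:
  explicit Card(int slot) : slot(slot) {}  // hidden constructor

  private:
  int slot;
};

class NoCard : public Card
{
  public:
  virtual const char * getName() { return "Empty"; }
  private:
  friend class Card;  // only the factory creates cards
  explicit NoCard(int slot) : Card(slot) {}
};

class RouterCard : public Card
{
  public:
  virtual const char * getName() { return "Router"; }
  private:
  friend class Card;
  explicit RouterCard(int slot) : Card(slot) {}
};

Card * Card::factory(CardType type, int slot)
{
  switch (type)
  {
    case CardTypeNoCard: return new NoCard(slot);
    case CardTypeRouter: return new RouterCard(slot);
  }
  return NULL;
}
```

CardPool::init() has to run once before the first call to factory(). Allocation and release are a couple of instructions each, which is what makes creating short-lived cards affordable.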

Abstract interfaces


class RxTask : public Task
{
  static const int BLOCK_SIZE = 1024;
  void loop()
  {
     Array<unsigned char, BLOCK_SIZE> block;
     int res;
     do
     {
       res = io->read(block);  // keep the result, the loop condition needs it
     }
     while (res == OK);
  }
  IO * io;
};

What is IO? IO can be any class implementing the methods

open
read
write
close
ctrl



class IO
{
   virtual int read() = 0;
   .....  
};

for example,


class Socket : public IO
{
  public:
  virtual int read()
  {
    // read from the socket here
    return OK;
  }
};

But I do not like virtual functions, especially in character devices. What can I do?


template <class IO> class RxTask
{
  public:
  RxTask(IO * io)
  {
    this->io = io;  // store io
  }

  void mainloop()
  {
    io->read();  // the call is resolved at compile time
  }

  private:
  IO * io;
};

int main()
{
  Socket * soc = new Socket();
  RxTask<Socket> * rxTask = new RxTask<Socket>(soc);
  return 0;
}

Now IO is a template argument and no virtual functions are required. When creating the task, call operator new for RxTask<Socket> and here we are with the correct type in the task. Another way around is to forward the method instead of the IO object.


class ReadFunction
{
  public:
  int read(void * )
  {
    return socket->read();
  }

  protected:

  ReadFunction(Socket * socket)
  {
    this->socket = socket;
  }
  friend class Socket; // Socket will create ReadFunction objects
  Socket * socket;     // pointer, this is the double reference
};

Socket creates the wrapper ReadFunction. RxTask works only with objects of type ReadFunction. No virtual functions, but a double reference. Still, the idea is good, so let's improve it.


typedef int ( * ReadFunction)(void * self, void * block);
class RxTask
{
  ReadFunction read; // application will initialize this with socketReadFunction
  void * readParam;  // first argument of the ReadFunction()

  void mainloop()
  {
    read(readParam, NULL); // make field read static to get rid of implicit this->
  }
};

// example of read function
inline int socketReadFunction(void * socket, void * block)
{
  // check that socket is indeed pointer to something reasonable
  // call read
  return ((Socket *)socket)->read(block);
}

This is not C++, and the compiler can not do type checking here. Still, it can be useful.
Should we write code like this? I do not know the answer. Probably there is no single answer. Strict OO says "no way" and suggests virtual functions.
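For comparison, the two non-virtual variants side by side in a form that compiles. This is a sketch under assumptions: Socket here is a hypothetical device with a non-virtual read() that only counts calls, and the names RxTaskT, RxTaskF and step() are invented for the example.

```cpp
// Hypothetical device with a non-virtual read(); it counts calls for the demo.
class Socket
{
  public:
  Socket() : reads(0) {}
  int read(void * block)
  {
    (void)block;  // a real driver would fill the block here
    reads++;
    return 0;
  }
  int reads;
};

// Variant 1: the IO type is a template argument, read() is bound at compile time.
template <class IO> class RxTaskT
{
  public:
  explicit RxTaskT(IO * io) : io(io) {}
  int step(void * block) { return io->read(block); }
  private:
  IO * io;
};

// Variant 2: plain function pointer plus an opaque parameter, C style.
typedef int (*ReadFunction)(void * self, void * block);

class RxTaskF
{
  public:
  RxTaskF(ReadFunction read, void * readParam) : read(read), readParam(readParam) {}
  int step(void * block) { return read(readParam, block); }
  private:
  ReadFunction read;
  void * readParam;
};

// Adapter: the only place where the void * is cast back to the real type.
inline int socketReadFunction(void * socket, void * block)
{
  return static_cast<Socket *>(socket)->read(block);
}
```

Variant 1 keeps full type checking and lets the compiler inline read(), at the price of one copy of the task code per IO type. Variant 2 has a single copy of the task code and one indirect call, which is about what a virtual call costs, but without any help from the type system.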

Mailbox and message

An example of the interface


class Message
{
   Message();

   void send();
   Message & receive();
};


// better way (?) separate message and mailbox
class Message
{
  Message();
};

template <class Message> class Mailbox
{
  Mailbox();
  void send(Message &);
  Message & receive();
  Message & receive(int timeout);
};

What we expect from Mailbox

It should be equally efficient with both pointers and structures
Sometimes we want to copy the message, and sometimes we allocate a message from a pool and send a pointer.

Side remark here. Allocating messages from a pool is better than assuming that the mailbox copies the message. Overall performance is going to be better. Consider the following code


struct MessageT
{
  /* some data here */
};
void sender()
{
  MessageT * message;
  // allocate block from pool
  pool->get( & message);
  // setup the data:   message->data = ...
  // and send pointer
  mailbox->send(message); 
}

void receiver()
{
  MessageT * message;
  mailbox->receive( & message); 
  // process message->data
  //free block
  pool->free(message);
}

If we go with copy, we pay for memcpy at least once, and in some mailbox implementations twice. The natural approach here is to restrict the application by providing only a send-pointer mailbox. It dictates the right task design and improves the overall performance of the application. Another gain is that we can use preset message headers when allocating from the pool. The allocator or pool can set the headers at the initialization phase or, for example, clean up the block every time the application allocates it. It is very easy to forget to call memset(data, 0, sizeof(*data)) when setting data in some structure defined somewhere else. And an immediate remark on the remark. One can argue that the application makes use of C++ classes instead of structures and the message constructor will initialize the data. The argument has a flaw though. If messages are allocated from a pool, all constructors will be called only once. The application has only two alternatives here: call a method init(), or dynamically allocate the message and use memcpy() in the mailbox. The right place to call init() from is the pool or allocator, because this is only one place, instead of every time before mailbox->send().
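A minimal sketch of a pool that calls init() itself on every allocation. The fields of MessageT, the preset header value and the Pool template are assumptions for the example; the get()/free() names follow the sender/receiver code above.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical message; the header fields are assumptions for the sketch.
struct MessageT
{
  int source;
  int event;
  unsigned char data[32];

  // Called by the pool on every allocation, not by the sender.
  void init()
  {
    memset(this, 0, sizeof(*this));
    source = -1;  // preset header: "no source yet"
  }
};

// Pool of N messages kept as a stack of free pointers.
template <class Message, int N> class Pool
{
  public:
  Pool() : freeTop(0)
  {
    for (int i = 0; i < N; i++) freeStack[freeTop++] = &storage[i];
  }

  bool get(Message ** message)
  {
    if (freeTop == 0)
    {
      *message = NULL;
      return false;
    }
    *message = freeStack[--freeTop];
    (*message)->init();  // the single place where messages are initialized
    return true;
  }

  void free(Message * message) { freeStack[freeTop++] = message; }

  private:
  Message storage[N];
  Message * freeStack[N];
  int freeTop;
};
```

The sender never sees stale data from the previous user of the block, and nobody has to remember the memset().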
One could also ask what happens if one task sends different messages to different tasks and there is a task which receives messages of different types. The typical advice here is not to use void *. We can try to subclass some base class Message and then typecast the object according to the message event. And how do we know which pool to call to free the message? Here comes the data prefix. The pool can sign any block with a pointer to itself. The method free() is static and can free any block no matter which pool it was allocated from. A safety net like watermarks, flags and even a checksum can and should be put in place.
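A sketch of the data prefix with a static free(). The prefix layout, the watermark value and the sizes are assumptions; a checksum is left out to keep the example short. The free list is a simple scan here, a stack of free blocks would do the same job faster.

```cpp
#include <stddef.h>

class BlockPool;

// Prefix stored in front of every block.
struct BlockPrefix
{
  unsigned int watermark;  // detects corrupted pointers
  BlockPool * owner;       // the pool the block came from
  bool allocated;          // detects double free
};

class BlockPool
{
  public:
  static const unsigned int Watermark = 0xB10CB10Cu;
  static const int BlockCount = 4;
  static const int DataSize = 64;

  BlockPool() : freeCount(BlockCount)
  {
    for (int i = 0; i < BlockCount; i++)
    {
      blocks[i].prefix.watermark = Watermark;
      blocks[i].prefix.owner = this;  // the pool signs every block
      blocks[i].prefix.allocated = false;
    }
  }

  void * get()
  {
    for (int i = 0; i < BlockCount; i++)
    {
      if (!blocks[i].prefix.allocated)
      {
        blocks[i].prefix.allocated = true;
        freeCount--;
        return blocks[i].data.bytes;
      }
    }
    return NULL;
  }

  // Static: the caller does not need to know which pool owns the block.
  static bool free(void * data)
  {
    if (data == NULL) return false;
    Block * block = reinterpret_cast<Block *>(
        static_cast<char *>(data) - offsetof(Block, data));
    BlockPrefix * prefix = &block->prefix;
    if (prefix->watermark != Watermark || !prefix->allocated) return false;
    prefix->allocated = false;
    prefix->owner->freeCount++;
    return true;
  }

  int available() const { return freeCount; }

  private:
  struct Block
  {
    BlockPrefix prefix;
    union { unsigned char bytes[DataSize]; long double align; } data;
  };
  Block blocks[BlockCount];
  int freeCount;
};
```

A receiver that gets messages from several senders can call BlockPool::free(message) without knowing where the block came from, and a double free is caught by the flag instead of corrupting the pool.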


// Real class
template <class MessageType, class Lock> class Mailbox 
   : public Object
{

  public:
  
  Mailbox(const char * Name, int Size);
  ~Mailbox();
  
  void send(const MessageType * Message);
  inline void send(const MessageType & Message);

  void receive(MessageType * Message);
  
  // return 1 if new message
  // return 0 if timeout expired
  int receive(MessageType * Message, int Timeout);
  
  protected:
  
  Semaphore Semaphore;

  FifoD<MessageType, LockDummy> * Fifo;
  
  const char *Name;
  
  inline void fetchMail(MessageType * Message);
  
  private:
  
}; // class Mailbox

Every class in our OO application has a parent, Object. I assume here that we are not going to use multiple inheritance.
Mailbox is equally efficient when sending objects and pointers: pay attention to the two send() methods.
Semaphore (or signal) is used to block the calling thread until a message is available in the message queue.
FifoD contains a dynamically allocated array and two fields, head and tail: this is the message queue.
Name is the name of the mailbox, useful for debugging.
fetchMail() calls lock() and fifo->remove(). This method can be useful if the application decides to subclass Mailbox and redefine receive(): just a backdoor for dirty designs. Example of usage:


class SendReceive
{
  private:

  typedef class Mailbox<int, LockOS> MailboxT;
  static MailboxT * mailbox;  // defined after the class

  SendReceive()
  {
  }
  
  public:

  static void init()
  {
    // create mailbox
    mailbox = new MailboxT("processor", 10);
    // spawn two tasks here if not exist already
    // ...
  }

  static void receiver()
  {
    while (1)
    {
      int message;
      mailbox->receive(&message);  // receive() takes a pointer
    }
  }
  
  static void sender()
  {
    while (1)
    {
      int message = 1;
      mailbox->send(message);
    }
  }
};

SendReceive::MailboxT * SendReceive::mailbox = NULL;
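To show how the pieces fit together, here is a toy mailbox that can be run on a PC without an OS. It is only a sketch: LockDummy is a no-op lock, SemaphorePoll is a hypothetical semaphore that never blocks (a real one would sleep up to the timeout), and the fixed-size queue replaces FifoD. The name MailboxSketch and the Size argument are assumptions.

```cpp
// No-op lock for single-threaded use; a real LockOS would wrap an OS mutex.
class LockDummy
{
  public:
  void lock() {}
  void unlock() {}
};

// Stand-in semaphore that never blocks: wait() only reports whether a
// message was signalled.
class SemaphorePoll
{
  public:
  SemaphorePoll() : count(0) {}
  void signal() { count++; }
  bool wait(int timeout)
  {
    (void)timeout;
    if (count == 0) return false;
    count--;
    return true;
  }
  private:
  int count;
};

template <class MessageType, class Lock, int Size> class MailboxSketch
{
  public:
  MailboxSketch() : head(0), count(0) {}

  // Copies the message into the queue; returns false if the queue is full.
  bool send(const MessageType & message)
  {
    lock.lock();
    bool ok = (count < Size);
    if (ok)
    {
      queue[(head + count) % Size] = message;
      count++;
    }
    lock.unlock();
    if (ok) semaphore.signal();
    return ok;
  }

  // Returns 1 if a message was received, 0 if the timeout expired.
  int receive(MessageType * message, int timeout)
  {
    if (!semaphore.wait(timeout)) return 0;
    fetchMail(message);
    return 1;
  }

  protected:
  // Takes the lock and removes the oldest message.
  void fetchMail(MessageType * message)
  {
    lock.lock();
    *message = queue[head];
    head = (head + 1) % Size;
    count--;
    lock.unlock();
  }

  private:
  Lock lock;
  SemaphorePoll semaphore;
  MessageType queue[Size];
  int head;
  int count;
};
```

The semaphore counts messages and the lock protects only the queue indices, so the lock is never held while a task is waiting.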

From the eCos comments (class Cyg_Mboxt): Message/Mail Box. This template implements a queue of T's. Implemented as a template for maximal flexibility; one would hope that only one, with T==(void *) and the same number of them, is ever used without very good reason.

Inline functions, define and const, static methods

NOT-TO-DO list

The following is mostly from http://www.caravan.net/ec2plus/rationale.html.

...exception handling (time and memory requirements are unpredictable, there is performance penalty)
...RTTI (larger object code without significant gains)
...virtual inheritance (makes sense only if multiple inheritance is used)
...multiple inheritance (code is less readable, less re-usable, and more difficult to maintain)

some links http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/