This page is under construction
Introduction
Here are some thoughts on using C++ in high-performance embedded devices.
Two rules
- Do not use C++ constructs in an embedded system.
- (for advanced developers) You can use C++ if you are very careful.
|
Side remark. It is not easy to give generic advice on how to write, or not to write, in this
or that language. Different tricks and approaches can be used in different layers of the same application. Some things, though,
are always wrong. It is wrong, for example, to start a project by saying that the performance of the management part of the
application does not matter because it runs at low priority. That can be right in some circumstances after three years of
active development, but not at the design phase.
Consistency in the code is very important. Use the same set of tools and reuse the same classes everywhere and as often as possible.
The main purpose of this document is to give some idea of what is probably not good in most cases. For example,
blind use of virtual functions can produce very nice, deep inheritance trees, but at an unbearable performance cost. And once you have started
to use virtual methods, getting rid of them at a later phase is not an easy task. Converting a raw data packet into an object and then back into
a raw data packet will not work in many (most?) applications.
Multiple inheritance should be used carefully or better avoided, because it complicates the class diagram and introduces unintended dependencies between unrelated parent classes.
An embedded system is everything from a smart card with 4K of ROM and 200 bytes of RAM to a high-end server with 16 CPUs
interconnected by a proprietary high-speed bus. For every device the decision to use this methodology or that
should be reached separately. Fortunately, we are almost never the first to try it.
Returning to C++, there are many different schools of thought. Some say that templates are not good because
the object code is too large, and they are right. Some say that virtual methods are not good for exactly the same reason,
and they are right too. Some say that we don't need templates because we have virtual methods and multiple inheritance,
and that should be more than enough for anybody. And they are right too. There are also those (like me) who prefer larger
object code over a performance penalty and advocate templates and limited use of virtual methods. I guess this side
has its points too. It all depends.
|
Typical limitations for embedded system
- ROM size
- RAM size
- CPU performance
- Maintenance costs
If you think there are plenty of resources on the card when a project starts,
after one year of development you will find out that you need three times more, and that
the most significant constraint is the CPU.
ROM size can dramatically influence compilation time. Tell me the application's RAM and ROM footprint and I will tell you how long the card takes to boot.
Compilation time plus boot time influence the productivity of the SW team and are (unexpectedly) among the largest contributing factors. Long compilation times
make rebuilds less frequent, and slow boot makes debugging painful. The goal should be to make the compile-and-debug cycle quick and easy.
From The Unified Modeling Language User Guide by Grady Booch, James Rumbaugh and Ivar Jacobson:
In software, there are several ways to approach a model. The two most common ways are from
an algorithmic perspective and from an object-oriented perspective.
The traditional view of software development takes an algorithmic perspective. In this approach,
the main building block of all software is the procedure or function. This view leads developers to
focus on issues of control and the decomposition of larger algorithms into smaller ones. There's
nothing inherently evil about such a point of view except that it tends to yield brittle systems. As
requirements change (and they will) and the system grows (and it will), systems built with an
algorithmic focus turn out to be very hard to maintain.
The contemporary view of software development takes an object-oriented perspective. In this
approach, the main building block of all software systems is the object or class. Simply put, an
object is a thing, generally drawn from the vocabulary of the problem space or the solution space;
a class is a description of a set of common objects. Every object has identity (you can name it or
otherwise distinguish it from other objects), state (there's generally some data associated with it),
and behavior (you can do things to the object, and it can do things to other objects, as well).
Typical approach to the C++ application
- list all objects we need for the application
- list all interactions between objects
- drop the lists on the design board
- start to write code.
In packet-processing systems, data flow dictates significant parts of the design.
Processing engines (processors) can be implemented as
objects/classes, but not the data packets themselves. Do not throw away the procedural approach. It has its applications. C++
has a subset: regular C. There is no reason to put everything into objects. Or, let's say, some methods can remain static
methods. Imagine that you need a driver for an FPGA. There is little doubt that, no matter what happens with the
project, you will have only one FPGA because, for example, it is the PLL for the TDM clock.
There is no special reason to create more than one object myClockFPGA of type ClockFPGA { ... }.
Want to encapsulate the
function names and write fpga.setRegister instead of fpgaSetRegister? That is OK and the right thing to do: we have the
keyword static. A call to a static method has exactly the same performance as a call to a regular C function. Make sure
that only one instance of the object can exist (move the constructor to the protected area and supply a static factory
method which creates only one object and fails on subsequent calls).
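A minimal sketch of the single-instance driver described above. The class layout, the method names create(), setRegister() and getRegister(), and the register file simulated by a static array are all assumptions made for illustration; a real driver would point at the memory-mapped FPGA address instead.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical driver for the one and only clock FPGA on the board.
// The register file is a static array so the sketch can run on a host.
class ClockFPGA
{
public:
    // Static factory: succeeds once, returns nullptr on any later call.
    static ClockFPGA * create()
    {
        if (created)
            return nullptr;
        created = true;
        static ClockFPGA instance;
        return &instance;
    }

    // Static methods are plain function calls: no "this" argument is passed.
    static void setRegister(std::size_t index, std::uint32_t value)
    {
        registers[index % REGISTER_COUNT] = value;
    }

    static std::uint32_t getRegister(std::size_t index)
    {
        return registers[index % REGISTER_COUNT];
    }

protected:
    ClockFPGA() {}   // the application cannot construct the object directly

private:
    enum { REGISTER_COUNT = 16 };
    static bool created;
    static std::uint32_t registers[REGISTER_COUNT];
};

bool ClockFPGA::created = false;
std::uint32_t ClockFPGA::registers[REGISTER_COUNT];
```

The application writes ClockFPGA::setRegister(3, value) and gets the encapsulated name without paying for an object pointer.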
To use or not to use templates
- some compilers do not support them
- object code and especially debug information is inflated (70% or more)
Bottom line: there is simply no alternative.
Virtual functions
- performance
- object code size
- ROM based data
- code is more difficult to maintain
Avoid them at any cost in time-sensitive operations such as access to arrays, cyclic buffers, stacks and memory allocation.
The following is based on http://www.embedded.com/98/9802fe3.htm
The first cost is that it makes objects bigger. Every object of a class with virtual member functions contains a vtable pointer.
So each object is one pointer bigger than it would be otherwise. Adding a virtual function can have a disproportionate effect on a small object.
An object can be as small as one byte and if a virtual function is added and the compiler enforces four-byte alignment, the size of the
object becomes eight bytes. There is the cost of the vtable lookup for a function call, rather than a direct one. The cost is a memory
read before every call to a virtual function (to get the object's vtable pointer) and a second memory read
(to get the function pointer from the vtable). The actual cost depends on CPU. The C substitute would look like
(this->vTable[methodIndex]) (this);
where methodIndex is a constant known at compilation time and vTable is a hidden field added to every class containing virtual methods.
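To make the two memory reads visible, here is a hand-written emulation of the dispatch in C style. The Shape structure, the rectangle functions and the table layout are hypothetical; real compilers choose their own layout, so treat this as a sketch of the mechanism only.

```cpp
// Hand-written equivalent of what the compiler generates for a virtual call.
struct Shape;
typedef int (*Method)(const Shape * self);

struct Shape
{
    const Method * vTable;   // the hidden pointer added to every object
    int width;
    int height;
};

static int rectangleArea(const Shape * self)
{
    return self->width * self->height;
}

static int rectanglePerimeter(const Shape * self)
{
    return 2 * (self->width + self->height);
}

// One table per class; it is constant and can be placed in ROM.
static const Method rectangleVTable[] = { rectangleArea, rectanglePerimeter };

// Method indexes are constants known at compilation time.
enum { METHOD_AREA = 0, METHOD_PERIMETER = 1 };

// First read: self->vTable. Second read: vTable[methodIndex]. Then the call.
inline int callVirtual(const Shape * self, int methodIndex)
{
    return (self->vTable[methodIndex])(self);
}

int demoVirtualCall()
{
    Shape r = { rectangleVTable, 3, 4 };
    return callVirtual(&r, METHOD_AREA) + callVirtual(&r, METHOD_PERIMETER);
}
```

A direct call to rectangleArea(&r) skips both reads and can be inlined, which is the whole difference in a tight loop.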
Array
What we are looking for in arrays.
There are three allocation models in C++.
- allocation on stack (automatic variable)
- dynamic allocation with new operator
- static allocation with the keyword static
We will prefer arrays which are allocated statically or arrays which are automatic variables. We will prefer arrays with size known at compilation time.
An example of a naive implementation
class IntArray
{
public:
    IntArray(int size)
    {
        this->size = size;
        data = (int *)malloc(size * sizeof(int));
    }
    int getSize()
    {
        return size; // reference (this->size) and
                     // one additional argument (this)
    }
    int get(int index)
    {
        int size = getSize();
        if ((index < 0) || (index >= size))
            ....
    }
    // call to the method forwards 3 arguments
    // 8060/8250/8260 have no stack
    void setData(int i, int val)
    {
        int size = getSize();
        if ((i < 0) || (i >= size))
            ....
        else
            // imagine that this is an array which should hold
            // not int but arrays
            // then we need memcpy(). how can i write a generic
            // interface which is equally efficient
            // for int, void * and C++ object/C structure ?
            data[i] = val;
    }
    int size;
    int * data; // who allocates this thing, and when ?
};
|
void IntArray::someOperation()
{
// be careful - this->size can be re-read on
// every iteration depending on the optimization
for (int i = 0;i < size;i++)
{
.....
}
}
void IntArray::someOperation_1()
{
for (int i = 0;i < size;i++)
{
// performance ! is get() an inline function ?
int val = get(i);
....
}
}
Some of the problems with this implementation. What if we need the array in shared memory or in DPRAM, and our allocation
function is not trivial? Another problem is methods in child classes that operate on the data. Calling
get() every time to fetch the next entry is clearly not a good idea. The alternative, accessing the data directly,
is not much better, because it assumes knowledge of the implementation details of the parent class. Such a child
cannot easily be moved from one project to another, and the parent cannot be changed without at least reading the code of all
child classes. Because the implementation allocates the data dynamically, we cannot use this class for automatic
variables. Imagine a function where you need a temporary array of 10 integers to read HW registers into and then
process. No matter how fast your memory management is, objects of this class
cannot be used for this task. You have to create a single array and make your function thread safe (the array is a static variable). The alternative
is to use a regular C array of integers and lose all the debug statistics, error checking and performance monitors
the parent class could provide. This is not how we are going to solve the problem. But before we consider a different
approach, notice that everything started from an attempt to bring memory allocation, data holder and iterator into
one class. Indeed, malloc() is an engine for allocating memory, which is not trivial and deserves attention and a separate
class (actually a large family of classes). The data holder, expressed by the field int * data, is the array itself. Method get(int)
is an attempt to provide some way to access the data, for example to iterate through it, which is the most common
operation. Two engines and data in one class, and the result is about zero usefulness and bad performance. Start
using this class everywhere and the project will end pretty soon because of the "poor performance of C++".
|
How it could be done
// Container is a base class and can contain debug info,
// array name, statistics (important !), etc.
template <class ObjectType> class Container : public Object
{
public:
    inline Container();
    inline ~Container();
    inline int size() const;
    // return true if the Object belongs to the array
    // the function takes a while:
    // {
    //     int offset = (char *)Object - (char *)First;
    //     return ( (Object >= First) && (Object <= Last)
    //         && (offset%sizeof(ObjectType) == 0) );
    // }
    inline bool belongs(const ObjectType *Object) const;
protected:
    inline void init();
    ObjectType *Data;
    ObjectType *First;
    ObjectType *Last;
    int Size;
private:
};
|
Side remark regarding statistics. The base class will normally contain the object name, class name, time and date of creation, a log of access hits
on the public methods and other debug statistics. Debug info comes first, subclasses come second. Take the array, for example. You
would probably like to know who allocated an 8M array of chars and when, or exactly when (and by what task) method set() was called with an
out-of-range index. Array methods can call a dummy object, a debug info collector, at zero performance cost (an empty inline function). When
you start to look for the problem you can replace the collector with a real object (even at run time). When writing a new class, the first
question to ask is "How will I debug it?".
Method belongs() is probably not a great idea as a public method, but it is good as a protected method. There is an assumption, of course, that
the data stored in the container is placed in an array, a consecutive memory region. Note that belongs() checks the address of the array entry.
Basically it works like if ((index >= 0) && (index < arraySize)), but the test is on the address, not the index, and in many cases it
is going to be faster. To improve the performance of the method, allocate aligned arrays and use an ObjectType whose size is a power of 2.
A well-designed class Array can provide a constructor with an alignment parameter.
Method init() sets the First and Last fields. This is dirty, because it implies that the subclass should do something more
than just call the parent constructor, but this may be the only problem. The major gain is that Container has no
idea how the subclass allocates data, but can still provide a lot of useful services, like belongs(). If a subclass fails to
call init(), the problem will be apparent from the first run, because typically any iterator will call belongs() to
check incoming arguments.
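A sketch of how an address-based belongs() could look, assuming the entries sit in one consecutive region between First and Last. The class name RangeCheck and its constructor are made up for the example; in the design above the check would live in Container and init() would set the two pointers.

```cpp
#include <cstddef>
#include <cstdint>

// Address-based membership test for a consecutive array [First, Last].
template <class ObjectType> class RangeCheck
{
public:
    RangeCheck(ObjectType * data, int size)
        : First(data), Last(data + size - 1) {}

    inline bool belongs(const ObjectType * object) const
    {
        const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(object);
        const std::uintptr_t first = reinterpret_cast<std::uintptr_t>(First);
        const std::uintptr_t last = reinterpret_cast<std::uintptr_t>(Last);
        if (address < first || address > last)
            return false;
        // The address must also fall on an entry boundary. When
        // sizeof(ObjectType) is a power of 2 the compiler can turn
        // the modulo into a simple mask.
        return ((address - first) % sizeof(ObjectType)) == 0;
    }

private:
    ObjectType * First;
    ObjectType * Last;
};
```

The test rejects pointers outside the array as well as pointers into the middle of an entry, which an index check cannot do.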
|
// do not use this class directly
// use ArrayS or ArrayD instead
template <class ObjectType> class Array
: public Container<ObjectType>
{
public:
inline ObjectType& operator[](int Index);
class IllegalIndex : public Exception // IllegalIndex
{
public:
IllegalIndex(int Line, int Index)
: Exception(Line, __FILE__)
{
this->Index = Index;
}
int Index;
}; // class IllegalIndex
virtual inline ~Array();
protected:
inline Array();
inline void init();
private:
}; // class Array
|
Small detail here. The IllegalIndex exception is part of the array and not of the container. A container can be a set, a map or a hash table. Some other array-related
things can be put here, for example method swap() or an iterator.
Talking about exceptions. It is not a good idea to throw an exception every time there is an error in an argument. Typically, in the case
of an uncaught exception the OS will suspend the calling thread. Exceptions are handled differently by different OSs and compilers, but we should expect that on throw
an object of the required type will be allocated (dynamically!). Throw and catch are very expensive operations.
The constructor of class Array is in the protected area. The application cannot create objects of this type directly. Array is just a placeholder
in case we need some common API between all arrays in the system, for example registration of all created arrays and printing relevant statistics
like array name, owner thread, array size, and number of fetches from the array in the last 10 seconds.
|
template <class ObjectType> class ArrayD
: public Array<ObjectType>
{
public:
ArrayD(int Size);
virtual ~ArrayD();
protected:
private:
}; // class ArrayD
// example of usage
{
    // runtime library (OS loader) calls new() and
    // constructor before application main() is called
    static ArrayD<int> intArray10(10);
    // this line will be executed when CPU reaches the line
    // somebody has to fill the array with actual pointers
    // presumably zero terminated strings
    ArrayD<char *> * stringArray = new ArrayD<char *>(10);
}
// array of strings
class ArrayString : public ArrayD<char *>
{
public:
    ArrayString(int Size) : ArrayD<char *>(Size)
    {
        // First is (char * *)
        for (char * * p = First; p <= Last; p++)
        {
            * p = NULL;
        }
    }
};
|
Elements will be allocated dynamically
template <class ObjectType> ArrayD<ObjectType>::ArrayD(int Size)
    : Array<ObjectType>()
{
    // this-> is required: Data is a member of a dependent base class
    this->Data = new ObjectType[Size];
    this->Size = Size;
    Array<ObjectType>::init();
}
template <class ObjectType> ArrayD<ObjectType>::~ArrayD()
{
    delete [] this->Data;
}
It is not for automatic variables, unless we have very efficient memory management. One exception is task variables.
Usually ArrayD is used in static variables or explicitly created with operator new().
An interesting thing here. Container (the parent class) does not handle any allocations. Container assumes that the child class will, in its constructor,
initialize the pointer to the array (Data) and set the array size. This gives us freedom to use different allocation modes in the child classes while
sharing the same API. This trick is considered dirty, because OO suggests that the constructor of a non-abstract class should bring the object to
a valid state, that is, initialize all internal fields. The baseline requirement is that after a Container is created, any Container method can be
called with a reasonable result. My opinion is that you can use this approach if you do not want to write templates with an Allocator argument.
Talking about Allocator. Imagine you need a pool of blocks. Sometimes a block is just raw data 96 bytes long, sometimes it is a
structure representing an IP header, sometimes it is a 4-byte address in a shared memory region and sometimes it is a C++ object.
The base class pool is implemented as a stack of pointers to the stored objects and
provides two methods, get() and free(), plus debug info such as who allocated/released a block and when. Let's say that sometimes we want to allocate
the blocks (initially, to fill the pool) using malloc, sometimes we will point to reserved areas in DPRAM, and so on. Clearly it is not
a great idea to change the pool itself, or to subclass it, every time our allocation scheme changes. The pool works very well if we provide
it with an allocation interface.
We can build many different allocators: to align blocks, to get a power-of-2 block size, to preset a block header, etc. The pool does not care
how the application does it. The pool knows very well how to push/pop to/from a stack of pointers.
This example is good because it shows how the problems should be separated. Bottom line: keep classes small, limit the API, isolate
engine and data in separate classes; different functionality often means different classes.
template <class BlockType, class Lock, class AllocatorType>
class Pool : public PoolStatistics
{
public:
    // we need Allocator only once - in the call to the
    // Pool constructor.
    Pool(const char * Name, AllocatorType * Allocator,
         int NumberOfBlocks)
    {
        for (int i = 0; i < NumberOfBlocks; i++)
        {
            BlockType * block = Allocator->get();
            // now add to the stack of free blocks
        }
    }
};
get() can be the only public method in the Allocator, and it can be a static method too.
get() can also be a field (not a method) of function-pointer type. Still OK.
Write a regular C-style function allocating blocks, write a class without a constructor
containing a static field for the allocating function, and set the function. We can change the Allocator API
however we like.
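Here is a small, self-contained sketch of the pool/allocator split. It leaves out the Lock argument and the PoolStatistics parent to stay short, and the names StaticAllocator, Block96, MaxBlocks and available() are assumptions added for the example. The static storage inside the allocator stands in for DPRAM or any other reserved area.

```cpp
// Example block type: 96 bytes of raw data.
struct Block96
{
    unsigned char raw[96];
};

// Hypothetical allocator: hands out blocks from its own storage.
template <class BlockType, int Capacity> class StaticAllocator
{
public:
    StaticAllocator() : used(0) {}
    BlockType * get()
    {
        return (used < Capacity) ? &storage[used++] : nullptr;
    }
private:
    BlockType storage[Capacity];
    int used;
};

// The pool is only a stack of pointers. It has no idea where blocks come from.
template <class BlockType, class AllocatorType, int MaxBlocks> class Pool
{
public:
    // The allocator is needed only once, here in the constructor.
    Pool(AllocatorType * allocator, int numberOfBlocks) : top(0)
    {
        for (int i = 0; i < numberOfBlocks && i < MaxBlocks; i++)
        {
            BlockType * block = allocator->get();
            if (block == nullptr)
                break;
            stack[top++] = block;   // add to the stack of free blocks
        }
    }
    BlockType * get()
    {
        return (top > 0) ? stack[--top] : nullptr;
    }
    bool free(BlockType * block)
    {
        if (top >= MaxBlocks)
            return false;
        stack[top++] = block;
        return true;
    }
    int available() const { return top; }
private:
    BlockType * stack[MaxBlocks];
    int top;
};
```

Switching to malloc-backed or aligned blocks means writing another allocator class with a get() method; Pool itself stays untouched.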
Class Lock is our next topic, but the idea behind it is quite clear. Sometimes we want to protect allocation from the pool with
interrupt disable, and sometimes we do not need any protection at all, because the same task both frees and allocates blocks.
For example, a timer task receiving requests like timer start, timer stop, etc.
|
template <class ObjectType, int Size> class ArrayS
: public Array<ObjectType>
{
public:
ArrayS();
virtual ~ArrayS();
protected:
private:
ObjectType SData[Size];
}; // class ArrayS
|
This is our best friend: a static array that can be an automatic variable. The constructor is trivial and does nothing. Size is a compile-time constant
supplied as a template argument. Maximum performance; the price is bloated object code.
template <class ObjectType, int Size>
ArrayS<ObjectType, Size>::ArrayS()
    : Array<ObjectType>()
{
    this->Data = SData;
    this->Size = Size;
    Array<ObjectType>::init();
}
How do we use this thing ?
void myFunction()
{
.....................
{
// new type - can be used more than once
typedef ArrayS<int, 100> intArray100T;
// constructor called - it is the same as to write
// int SData[100];
intArray100T intArray100;
// we need 50 strings array only once
ArrayS<char * , 50> stringArray50;
} // call two destructors here (inline and empty
// methods)
....................
}
What we have here is two arrays on the stack, two automatic variables.
I am lying when I say that the constructor is empty, but the lie is not as big as it appears. Let's move method belongs() from
Container to Array, and we no longer need any assignments in the constructor: field Data (or SData) is initialized by the compiler,
and fields First and Last can be initialized by the compiler as well. One can even place the keyword const before the First and Last declarations to make
sure that nobody ever changes the address.
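A sketch of that last idea, assuming a stand-alone class (called FixedArray here, a made-up name) rather than the full Container/Array hierarchy. First and Last are const pointers set in the initializer list, so the constructor body is empty.

```cpp
// Fixed-size array that can live on the stack. First and Last are const
// pointers set once in the initializer list and never changed afterwards.
template <class ObjectType, int Size> class FixedArray
{
public:
    inline FixedArray() : First(&SData[0]), Last(&SData[Size - 1]) {}
    inline ObjectType & operator[](int index) { return SData[index]; }
    inline int size() const { return Size; }
    inline ObjectType * first() const { return First; }
    inline ObjectType * last() const { return Last; }
private:
    ObjectType SData[Size];
    ObjectType * const First;
    ObjectType * const Last;
};

int fillAndSum()
{
    FixedArray<int, 10> array;   // automatic variable, no heap involved
    int sum = 0;
    for (int i = 0; i < array.size(); i++)
        array[i] = i;
    for (int * p = array.first(); p <= array.last(); p++)
        sum += *p;
    return sum;
}
```

One side effect of the const members is that objects of this class cannot be assigned to each other, which is usually what we want for a fixed buffer.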
|
Locks
What we are looking for in locks
- different types of locks
- automatic lock free
- both statically and dynamically created locks
Example.
class MyLock
{
public:
    MyLock()
    {
        semaphore = new Semaphore();
    }
    void lock()
    {
        ...
    }
    void unlock()
    {
        ..
    }
private:
    Semaphore * semaphore;
};
|
This is a somewhat naive implementation. It will definitely work, but it is not maintainable. Let's say you suddenly decide that a semaphore
is not good enough and you need interrupt disable instead. Subclass MyLock and replace the lock() and unlock() methods? The semaphore will still be created
by the parent and ignored by the child. Another problem with this class is that if you need a global variable of type MyLock, every module
using your lock will have to include the class definition. While it does not appear to be a big deal, it would be nice to have only one class in the include
file and a one-line declaration of a synchronization object of a well-known type (included from a common library).
Clearly, locks are not supposed to be global variables unless this is a library. For example, imagine a driver providing an API to an I/O device. The driver
can be made reentrant using interrupt disable, task switch disable or a semaphore. In all three cases the driver (a set of functions) makes some assumptions
about the underlying CPU and OS. Even worse, the driver assumes that the application is multitasking, which is not always the case. The application has
no alternative but to live with interrupt disable, or we have to change the source code of the driver.
The alternative is to call Lock (an internal class of the driver) and require the application to provide the synchronization object. The application
can supply a dummy object (empty inline functions) or interrupt disable. We have isolated the low-level driver from the OS and CPU.
|
// better way to do the same
class MyLock
{
public:
inline MyLock(int semId)
{
this->semId = semId;
semGet(semId);
}
inline ~MyLock()
{
semSend(semId);
}
private:
int semId;
};
|
Side remark. Imagine a function with multiple return points. Let's say it is a lookup in a database and the
whole function is protected by a semaphore. The function starts with getSemaphore(), and I have to place
sendSemaphore() on every return line. Without going too deep into the problems of
nontrivial critical sections, I will only say that a C++ lock object is an absolute must. Returning to the database containing this kind
of lookup: there are clearly better solutions. One can always start the project by stating that
only low-priority tasks will access the database or, for example, that typically only one task
will access the database, so it does not matter that we lock the API for a long time. These arguments are all
wrong and will not survive even one year of application development.
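To illustrate the multiple-return case, here is a sketch with a lookup that has three exits. CountingSemaphore is a host-side stand-in invented for the example (it only counts calls), and ScopedLock and lookup() are hypothetical names; a real system would wrap the RTOS semaphore instead.

```cpp
// Hypothetical semaphore that only counts, so the sketch runs on a host.
struct CountingSemaphore
{
    int taken;
    int released;
    CountingSemaphore() : taken(0), released(0) {}
    void get()     { taken++; }
    void release() { released++; }
};

class ScopedLock
{
public:
    inline explicit ScopedLock(CountingSemaphore * s) : semaphore(s)
    {
        semaphore->get();
    }
    inline ~ScopedLock()
    {
        semaphore->release();
    }
private:
    CountingSemaphore * semaphore;
    ScopedLock(const ScopedLock &);              // not copyable
    ScopedLock & operator=(const ScopedLock &);
};

// Lookup with three return points. The destructor of "lock" releases
// the semaphore on every one of them; nothing can be forgotten.
int lookup(CountingSemaphore * semaphore, const int * table, int size, int key)
{
    ScopedLock lock(semaphore);
    if (table == nullptr)
        return -2;
    for (int i = 0; i < size; i++)
    {
        if (table[i] == key)
            return i;
    }
    return -1;
}
```

Adding a fourth return later does not require touching the locking code at all.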
This is also an example of dividing APIs between separate classes. MyLock is NOT a semaphore. MyLock contains and uses a semaphore or any
other synchronization interface. Semaphore is a separate class and MyLock is a separate class. They have some things in common, but not enough to put
them in the same inheritance tree.
How do we solve the problem with the global lock here? The semaphore ID is a global variable, and MyLock is a well-known class (LockSemaphore).
// Usage
int main()
{
    { // critical section starts here
        MyLock lock(globalSemId); // constructor is called
        ..........
    } // destructor is called
    return 0;
}
|
But what about different kinds of locks?
// for example
//
// {
// Lock(); --- disable interrupts
//
// do critical section
//
//
// } --- call to ~Lock, enable interrupts
class Lock
{
public:
Lock()
{
Mutex.get(); // here i have no idea what kind of mutex it is
// and i do not care i need get/release methods
}
~Lock()
{
Mutex.release();
}
protected:
MutexInterrupt Mutex;
private:
}; // class Lock
class MutexInterrupt : public SynchroObject
{
public:
MutexInterrupt()
{
}
~MutexInterrupt()
{
}
virtual void get()
{
RTOS::interruptDisable();
}
virtual void release()
{
RTOS::interruptEnable();
}
protected:
private:
}; // class MutexInterrupt
|
class SynchroObject
{
public:
SynchroObject()
{
}
virtual ~SynchroObject()
{
}
virtual void get() = 0;
virtual void release() = 0;
protected:
private:
}; // class SynchroObject
Want to improve performance? Remove the dependency on SynchroObject and
add the keyword inline. Lock does not require SynchroObject.
Want to have a shared property among all mutexes, like debug counters?
No problem: create a parent containing the required methods, but leave
the get and release methods to the child. Avoid virtual functions.
|
// disable/enable context switching
// for example
//
// {
//
// LockOS(); --- disable context switching
//
// do critical section
//
//
// } call to ~LockOS, enable context switching
class LockOS
{
public:
LockOS()
{
Mutex.get();
}
~LockOS()
{
Mutex.release();
}
protected:
MutexOS Mutex;
private:
}; // class LockOS
|
// dummy lock
class LockDummy
{
public:
LockDummy()
{
}
~LockDummy()
{
}
protected:
private:
}; // class LockDummy
This one is not just a placeholder. Applications will use this lock a lot. For example, the database API requires a Lock. The database does not assume
any knowledge of the OS or of the design of the application. The database code carefully calls the lock to protect all critical sections. It is the
application's decision what kind of lock (if any) to use.
|
class MutexOS : public SynchroObject
{
public:
MutexOS()
{
}
virtual ~MutexOS()
{
}
virtual void get()
{
RTOS::disable();
}
virtual void release()
{
RTOS::enable();
}
protected:
private:
}; // class MutexOS
class MutexSemaphore
{
public:
    MutexSemaphore()
    {
        semaphore = new Semaphore();
    }
    ~MutexSemaphore()
    {
        delete semaphore;
    }
    void get()
    {
        semaphore->get();
    }
    void release()
    {
        semaphore->release();
    }
protected:
private:
    Semaphore * semaphore;
}; // class MutexSemaphore
|
class LockSemaphore
{
public:
LockSemaphore(Semaphore * semaphore)
{
this->semaphore = semaphore;
semaphore->get();
}
~LockSemaphore()
{
semaphore->release();
}
protected:
Semaphore * semaphore;
}; // class LockSemaphore
This lock could use a semaphore, a signal or any other object. Let's say that we have no idea what kind of synchronization object we are going to
use three months from now, because we have decided to move from vxWorks to Linux.
template <class SynchroObject> class LockSemaphore
{
public:
LockSemaphore(SynchroObject * synchroObject)
{
this->synchroObject = synchroObject;
synchroObject->get();
}
~LockSemaphore()
{
synchroObject->release();
}
protected:
SynchroObject * synchroObject;
}; // class LockSemaphore
SynchroObject is just any class, assuming that it has two public methods, get() and release().
Pay attention to the implicit operator new.
A line like static MutexSemaphore myMutex; is a bug,
because when static variables are
initialized the RTOS is probably not yet initialized/running. In Linux user space, though, the C++ runtime does the job correctly.
typedef LockSemaphore<MutexSemaphore> myLockT;
int main()
{
    MutexSemaphore * myMutex = new MutexSemaphore();
    // ...................................
    { // critical section
        myLockT lock(myMutex); // calls LockSemaphore() and in turn
                               // myMutex->get()
    } // release()
    return 0;
}
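The template lock can be tried on a host with two stand-in synchronization objects. MutexCounting and MutexDummy are invented for this sketch, as are ScopedLockT and criticalIncrement(); a real MutexInterrupt or MutexSemaphore would call the RTOS in get() and release(). Note that the lock must be a named object: a bare myLockT(myMutex); does not create a lock that lives until the end of the block.

```cpp
// Two interchangeable synchronization objects. Neither uses virtual methods,
// so get() and release() can be inlined into the lock.
class MutexCounting
{
public:
    MutexCounting() : depth(0), total(0) {}
    inline void get()     { depth++; total++; }
    inline void release() { depth--; }
    int depth;   // how many locks are currently held
    int total;   // how many times the mutex was taken
};

class MutexDummy
{
public:
    inline void get()     {}
    inline void release() {}
};

template <class SynchroObject> class ScopedLockT
{
public:
    inline explicit ScopedLockT(SynchroObject * s) : synchroObject(s)
    {
        synchroObject->get();
    }
    inline ~ScopedLockT()
    {
        synchroObject->release();
    }
private:
    SynchroObject * synchroObject;
};

// The same code works with any object that has get() and release().
template <class SynchroObject>
int criticalIncrement(SynchroObject * mutex, int value)
{
    ScopedLockT<SynchroObject> lock(mutex);   // named object, lives until return
    return value + 1;
}
```

With MutexDummy the whole lock is expected to compile down to nothing, which is the point of making it a template argument rather than a virtual interface.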
|
Locks and arrays meet each other in the iterator
#define _ITERATOR_TEMPLATE_ARGS class ListType, class ObjectType, \
class IndexType, class Lock
#define _ITERATOR_TEMPLATE_ARG_LIST ListType, ObjectType, IndexType, Lock
// example:
// class MyArrayT : public ArrayS<int, 20>
// {
// public:
//
// bool getNextIndex(int * Index)
// {
// * Index = (* Index) + 1;
// if (* Index < size()) return true;
// else return false;
// }
//
// int * getEntry(int Index)
// {
// return &Data[Index];
// }
//
// bool getFirstIndex(int * Index)
// {
// * Index = 0;
// return true;
// }
// }; // class MyArrayT
// MyArrayT MyArray;
// Iterator<MyArrayT, int, int, LockDummy> iterator(&MyArray);
// const int * value;
// while (iterator.next(&value))
// printf("%d\n", * value);
template <_ITERATOR_TEMPLATE_ARGS> class Iterator
{
public:
Iterator(ListType * List);
~Iterator();
typedef const ObjectType * PObjectType;
// return true if Ok
bool next(PObjectType * PObject);
protected:
IndexType Index;
ListType * List;
bool NotFirst;
}; // class Iterator
|
This iterator is not good enough. Let's add a method bool getNext(Object & o) which, instead of fetching the next index,
fetches the next entry of the array. The implementation is obvious: it is exactly the ++ operation we would apply to
a regular C pointer. Let's make the method inline, and suddenly our IntArray::someOperation()
method performs much like its C equivalent. Imagine that the inline method is already in the code cache; the only price we
pay is one additional reference: instead of * (p) we have * (this->p). Typically "this" is
held in a register, and the performance overhead is essentially zero on many CPUs.
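A sketch of such a getNext(), assuming the iterator is given the first entry and the number of entries. ArrayIterator and sumAll() are illustrative names, and the Lock template argument is left out to keep the example short.

```cpp
// Pointer-walking iterator over a consecutive array.
template <class ObjectType> class ArrayIterator
{
public:
    inline ArrayIterator(ObjectType * first, int size)
        : p(first), end(first + size) {}

    // Copies the next entry into o. Returns false when the array is exhausted.
    inline bool getNext(ObjectType & o)
    {
        if (p == end)
            return false;
        o = *p++;   // the same operation as with a plain C pointer
        return true;
    }

private:
    ObjectType * p;
    ObjectType * end;
};

int sumAll(int * data, int size)
{
    ArrayIterator<int> iterator(data, size);
    int value = 0;
    int sum = 0;
    while (iterator.getNext(value))
        sum += value;
    return sum;
}
```

There is no index, no bounds check per access and no call to get(); the loop body is a compare, a load and an increment.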
|
Inheritance
You have probably noticed that I do not use inheritance a lot.
For example, one could argue that
- array is a container
- FIFO is an array
- mailbox (message queue) is FIFO
and get a very nice and clean (and, as we will see next, not very useful) inheritance tree.
I will put it another way
- array is a container
- FIFO is a container
- mailbox contains FIFO
Let's compare the two approaches (naturally I prefer the latter).
The basic idea is never to build inheritance for the sake of inheritance. It can, and
eventually will, lead to less maintainable code. I am not saying to avoid inheritance, but
to consider each case carefully.
When we write a parent class and then subclass it, we basically limit our freedom to change not only the API of
the parent class, but also its implementation details. Imagine that the parent class takes a lock in all public functions.
Assume also that the subclass should call one of them, but before that should take the same lock. We have a problem here,
because our subclass is suddenly aware of the implementation details of the parent.
Let's say that the parent provides public methods which start with the lock and then call internal (protected) methods.
The subclass (child) will call only the protected methods and not the API. Again a problem: API performance is not as good as
it could be, and the subclass has to play by the rules and take the lock before calling any parent method.
How does the word "contains" solve the problem? Mailbox has access only to the public methods of FIFO.
If you decide that CyclicBuffer better suits the Mailbox's needs, you just replace the declaration, for example
template <class Message> class Mailbox
{
public:
    Message * recv()
    {
        return q.remove();
    }
    void send(Message * msg)
    {
        q.add(msg);
    }
private:
    FIFO<Message * > q;
};
template <class Message> class MailboxNew
{
public:
    Message * recv()
    {
        return q.remove();
    }
    void send(Message * msg)
    {
        q.add(msg);
    }
private:
    CyclicBuffer<Message * > q;
};
No changes in the implementation of Mailbox. We are free to do whatever we want with FIFO. Imagine that Mailbox is a
subclass of FIFO. Suddenly Mailbox has more methods and fields to access (a polluted name space).
But this is a relatively small problem.
Another example is a card in a shelf: a single router blade in a grid router.
A router blade contains many public properties and only a very limited
set of public methods. Most of the properties are themselves nontrivial objects.
For example, the parent class Card can contain methods
- reset
- get slot
- management block
- management unblock
and properties
- System Log
- Alarm Log
- LEDs
Log contains a FIFO and is a nontrivial class. Card has a property: the public field Log.
Log provides an API like clear(), stop(), pause(), print(), printWithFilter(), etc.
The same goes for the alarm log. LEDs is an object containing an array of LED objects (the array can be empty).
LEDs has a public method ledsTest().
Will routerCard inherit from the base Card? The answer is clear: yes. If an application has two different blade cards,
like a router with a single 1GB interface and a card with 4x100M, we have correct inheritance. The interface itself is
a property of the card. There are not a thousand get()/set() methods telling the outside world what kind of interfaces
the card has. There is a field, an array of interfaces, or even better, add an iterator over the interfaces as well. Then
everyone can find out what interfaces the card has and check the status of every one of them.
A reasonable number of methods for a class is 3-10, including those inherited from the parent.
If you see that you need more methods, consider public property fields.
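A small sketch of a card built from public property fields. All class names and members here (Log, Interfaces, RouterCard, QuadCard, countUp) are assumptions made for the illustration; only the overall shape follows the text.

```cpp
#include <cassert>

// Hypothetical property class with its own small API.
class Log {
public:
    void print(const char *) { ++lines_; }
    void clear() { lines_ = 0; }
    int lines() const { return lines_; }
private:
    int lines_ = 0;
};

struct Interface {
    int speedMbps;
    bool up;
};

// Fixed-capacity array property; begin()/end() let callers iterate.
class Interfaces {
public:
    void add(int speedMbps, bool up) {
        assert(count_ < kMax);
        items_[count_].speedMbps = speedMbps;
        items_[count_].up = up;
        ++count_;
    }
    const Interface * begin() const { return items_; }
    const Interface * end() const { return items_ + count_; }
private:
    static const int kMax = 8;
    Interface items_[kMax];
    int count_ = 0;
};

class Card {
public:
    explicit Card(int slot) : slot_(slot) {}
    int getSlot() const { return slot_; }
    void reset() { systemLog.clear(); alarmLog.clear(); }
    // Public property fields instead of dozens of forwarding methods.
    Log systemLog;
    Log alarmLog;
    Interfaces interfaces;
private:
    int slot_;
};

class RouterCard : public Card {
public:
    explicit RouterCard(int slot) : Card(slot) { interfaces.add(1000, true); }
};

class QuadCard : public Card {
public:
    explicit QuadCard(int slot) : Card(slot) {
        for (int i = 0; i < 4; ++i) interfaces.add(100, i != 3); // last port is down
    }
};

// Works for any Card: count the interfaces that are up.
int countUp(const Card & card) {
    int n = 0;
    for (const Interface * it = card.interfaces.begin(); it != card.interfaces.end(); ++it)
        if (it->up) ++n;
    return n;
}
```

Card itself keeps three methods; everything else is reached through the property objects.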
A simple property field is another way to make a property public while still restricting access to
the field, and it saves multiple get()/set() methods. Consider the following class
template <class FieldType, class ChildType> class PropertyRO : public Object
{
public:
    inline PropertyRO();
    inline ~PropertyRO();
    inline operator FieldType() const;
protected:
    inline PropertyRO(FieldType Value);
    inline PropertyRO(const ChildType & Property);
    FieldType Value;
    inline ChildType & operator=(FieldType Value);
private:
}; // class PropertyRO
Example of usage
class MyCard
{
public:
    class Slot : public PropertyRO<int, Slot>
    {
        ....
        friend class MyCard; // MyCard may use the protected operator=
    };
    // public property field. the application can
    // not change it, but can read it.
    // no additional methods are required
    Slot slot;
};
|
We can still give MyCard access to the internals of Slot. MyCard is a friend of Slot and can access
any protected and private method/field.
For example, the MyCard constructor can set up the initial value of the Slot.
|
operator= can be implemented with a lock if FieldType is not trivial or, for example, it can call a method of the wrapping class
(MyCard) to read/write hardware. A property can be a memory mapped HW register, for example. The idea here is
to provide the same performance as a regular C construct like a #define expanding to *(UINT32 *)addr = value. Thanks to the inline
static methods we can do it without any performance penalty, and the compiler will check all types.
class Register32 : public PropertyRW<UINT32, Register32>
{
};
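A self-contained sketch of the register idea, assuming a simplified Register32 that does not derive from PropertyRW and a FakeUart block in which an ordinary variable plays the hardware register. Both are made up for the example; on a real board the address would come from the memory map.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical read/write property bound to a 32-bit register address.
class Register32 {
public:
    explicit Register32(volatile uint32_t * addr) : addr_(addr) { assert(addr != 0); }
    // Write: once inlined this is typically a single store, like the C macro.
    Register32 & operator=(uint32_t value) { *addr_ = value; return *this; }
    // Read: implicit conversion, so the register reads like a plain integer.
    operator uint32_t() const { return *addr_; }
    void setBits(uint32_t mask)   { *addr_ = *addr_ | mask; }
    void clearBits(uint32_t mask) { *addr_ = *addr_ & ~mask; }
private:
    volatile uint32_t * const addr_;
};

// Stand-in for a hardware block: a member variable plays the register.
struct FakeUart {
    uint32_t controlWord;
    Register32 control;   // public property field
    FakeUart() : controlWord(0), control(&controlWord) {}
};

uint32_t demoRegister() {
    FakeUart uart;
    uart.control = 0x10u;            // write the whole register
    uart.control.setBits(0x01u);     // now 0x11
    uart.control.clearBits(0x10u);   // now 0x01
    return uart.control;             // read through the conversion operator
}
```

Assigning anything that does not convert to a 32-bit unsigned value is rejected at compile time, which is the type check the macro version lacks.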
Your evaluation board contains many different blocks, like CPU, FPGA, etc., and every one of them contains registers.
While some of the most often used functionality can be put in dedicated methods, in some cases direct access to the
registers is reasonable. For example, a HW driver can contain two parts: a low level part (just registers, bit masks,
initialization, reset, getVersion routines) and a high level part, the FSM. The FSM can find it more convenient to write
directly to the registers instead of calling a separate method to set every single bit.
Virtual functions
What is a virtual function?
How can it be implemented?
What exactly does the compiler do with it?
Usually I will sacrifice virtuality and polymorphism to gain performance.
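The exact mechanism is up to the compiler, but most implementations use one table of function pointers per class and one hidden pointer per object. The hand-written equivalent below is only a sketch of that usual shape; every name in it (CardVTable, CardBase, callGetName and so on) is invented for the illustration.

```cpp
#include <cassert>
#include <cstring>

// One table per class: a slot for every virtual function.
struct CardVTable {
    const char * (*getName)(const void * self);
};

// One hidden pointer per object, set when the object is constructed.
struct CardBase {
    const CardVTable * vptr;
};

struct NoCardObj {
    CardBase base;      // the base sub-object comes first
};

struct RouterCardObj {
    CardBase base;
    int ports;
};

static const char * noCardGetName(const void *) { return "Empty"; }
static const char * routerGetName(const void *) { return "Router"; }

static const CardVTable noCardVTable = { noCardGetName };
static const CardVTable routerVTable = { routerGetName };

// A "virtual call": load vptr, load the slot, call through the pointer.
inline const char * callGetName(const CardBase * card) {
    assert(card != 0 && card->vptr != 0);
    return card->vptr->getName(card);
}

int demoVtable() {
    NoCardObj none;
    none.base.vptr = &noCardVTable;
    RouterCardObj router;
    router.base.vptr = &routerVTable;
    router.ports = 4;
    const CardBase * shelf[2] = { &none.base, &router.base };
    int matches = 0;
    if (std::strcmp(callGetName(shelf[0]), "Empty") == 0) ++matches;
    if (std::strcmp(callGetName(shelf[1]), "Router") == 0) ++matches;
    return matches;
}
```

The cost is two dependent loads and an indirect call that usually can not be inlined, plus one pointer of RAM per object and one table per class.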
Some examples.
Shelf contains different cards
class Shelf
{
    class Cards : public ArrayS<Card * , 12> // array containing 12 cards (12 pointers to the card structure)
    {
    };
    Cards cards; // array of cards
    void printCards()
    {
        for (int i = 0; i < Cards::Size; i++) // pay attention to Size - this is a class constant (remember Array?)
        {   // not the best way to write code but interesting as an example
            printf("%s\n", cards.get(i)->getName());
        }
    }
};
class Card
{
public:
    virtual const char * getName() = 0; // pure virtual function, the class is abstract
};                                      // we can not create objects of this type
class NoCard : public Card
{
public:
    virtual const char * getName() // implements the virtual method
    {                              // objects of type NoCard can be created
        return "Empty";
    }
};
The next question is how we create Card objects.
First of all, let us hide the constructor.
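class Card
{
public:
    static Card * factory(CardType, CardVersion, Slot); // pay attention to the word static
    ~Card()
    {
    }
    void operator delete(void * block)
    {
        // return the block to the memory pool
    }
    void * operator new(size_t size)
    {
        // take a block from the memory pool and return it
    }
protected:
    Card()
    {
    }
};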
factory() allocates a block of the required size from the memory pool and calls the related initialization routine. For example,
class NoCard : public Card
{
    friend class Card; // factory() needs the hidden constructor
    NoCard()
    {
        // init the state of the object
    }
};
Card * Card::factory(CardType type, CardVersion version, Slot slot)
{
    Card * card = NULL;
    switch (type)
    {
    case CardTypeNoCard:
        card = new NoCard(); // call operator new to allocate memory, call NoCard constructor
        break;
    }
    return card; // NULL for an unknown card type
}
If new() is fast, the code is efficient enough even if you want to create cards that
are automatic (temporary) variables, assuming that we are talking about functions on the high levels of the application.
Why could we not use Array in the same way? In the case of cards I know the maximum size of the block I need to create
a card (this is the size of the largest child) and I know the number of simultaneously existing card objects.
The implementation of operator new is then trivial: take a block from the stack of free blocks.
Pay attention that there is only one virtual function so far: getName(). I would suggest keeping it this way.
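A sketch of such an operator new, assuming a block size of 64 bytes and at most 12 cards. BlockPool, MaxSize, MaxCards and demoPool() are names chosen for the example, and noexcept (C++11) is used so that an exhausted pool yields a null pointer instead of an exception.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-block pool: free blocks are linked into a stack.
template <std::size_t BlockSize, int BlockCount>
class BlockPool {
public:
    BlockPool() : free_(0) {
        for (int i = 0; i < BlockCount; ++i) release(&blocks_[i]);
    }
    void * get() {                    // pop, constant time
        Block * b = free_;
        if (b) free_ = b->next;
        return b;
    }
    void release(void * p) {          // push, constant time
        Block * b = static_cast<Block *>(p);
        b->next = free_;
        free_ = b;
    }
private:
    union Block {
        Block * next;                     // used while the block is free
        unsigned char bytes[BlockSize];   // used while the block holds a card
        long double align;                // crude alignment, enough for a sketch
    };
    Block blocks_[BlockCount];
    Block * free_;
};

class Card {
public:
    static const std::size_t MaxSize = 64;   // assumed size of the largest child
    static const int MaxCards = 12;          // assumed number of live cards
    virtual const char * getName() = 0;
    void * operator new(std::size_t size) noexcept {
        assert(size <= MaxSize);
        return pool().get();                 // null when the pool is empty
    }
    void operator delete(void * p) { if (p) pool().release(p); }
protected:
    Card() {}
    ~Card() {}                               // not virtual, as suggested above
private:
    static BlockPool<MaxSize, MaxCards> & pool() {
        static BlockPool<MaxSize, MaxCards> instance;
        return instance;
    }
};

class NoCard final : public Card {
public:
    NoCard() {}
    virtual const char * getName() { return "Empty"; }
};

// Ask for one card more than the pool holds, then give everything back.
int demoPool() {
    NoCard * cards[Card::MaxCards + 1];
    int created = 0;
    for (int i = 0; i < Card::MaxCards + 1; ++i) {
        cards[i] = new NoCard();
        if (cards[i]) ++created;
    }
    for (int i = 0; i < Card::MaxCards + 1; ++i) delete cards[i];
    return created;
}
```

Because the destructor is not virtual, the sketch deletes cards through a pointer to the concrete type; deleting through Card * would need either a virtual destructor or a release method.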
Abstract interfaces
class RxTask : public Task
{
    static const int BLOCK_SIZE = 1024;
    void loop()
    {
        Array<unsigned char, BLOCK_SIZE> block;
        int res;
        do
        {
            res = io->read(block); // keep reading while the device reports OK
        }
        while (res == OK);
    }
    IO * io;
};
What is IO?
IO can be any class implementing the methods
- open
- read
- write
- close
- ctrl
class IO
{
public:
    virtual int read() = 0;
    .....
};
For example,
class Socket : public IO
{
public:
    virtual int read()
    {
    }
};
But I do not like virtual functions, especially in character devices. What can I do?
template <class IO> class RxTask
{
public:
    RxTask(IO * io)
    {
        this->io = io; // store io
    }
    void mainloop()
    {
        io->read(); // resolved at compile time, no virtual call
    }
private:
    IO * io;
};
int main()
{
    Socket * soc = new Socket();
    RxTask<Socket> * rxTask = new RxTask<Socket>(soc);
}
Now IO is a template argument and no virtual functions are required. When creating the task, call operator new
for RxTask<Socket> and here we are with the correct type in the task.
Another way around is to forward the method instead of the IO object.
class ReadFunction
{
public:
    int read(void * )
    {
        return socket->read();
    }
protected:
    ReadFunction(Socket * socket)
    {
        this->socket = socket;
    }
    friend class Socket; // Socket will create ReadFunction objects
    Socket * socket;     // pointer, not a copy of the socket
};
Socket creates the wrapper ReadFunction. RxTask works only with objects of type ReadFunction. No virtual functions, but
a double dereference. Still, the idea is good; let's improve it.
typedef int ( * ReadFunction)(void * self, void * block); // "this" is a keyword and can not name a parameter
class RxTask
{
    ReadFunction read; // application will initialize this with socketReadFunction
    void * readParam; // first argument of the ReadFunction()
    void mainloop()
    {
        read(readParam, NULL); // make field read static to get rid of implicit this->
    }
};
// example of read function
inline int socketReadFunction(void * socket, void * block)
{
    // check that socket is indeed pointer to something reasonable
    // call read; the cast needs its own parentheses because -> binds tighter
    return ((Socket *)socket)->read(block);
}
This is not C++ and the compiler can not do type checking here. It can still be useful.
Should we write code like this? I do not know the answer. Probably there is no single answer to this.
Strict OO says "no way" and suggests virtual functions.
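One possible middle ground, shown here only as a sketch: let a function template generate the forwarding function, so the cast is written once and the binding is still checked by the compiler. readThunk, poll() and CountingSocket are names invented for this example and are not part of the code above.

```cpp
#include <cassert>

typedef int (*ReadFunction)(void * self, void * block);

// One thunk is instantiated per IO type; the cast lives in exactly one place.
template <class IO>
int readThunk(void * self, void * block) {
    return static_cast<IO *>(self)->read(block);
}

class RxTask {
public:
    // Compiles only if IO has a read(void *) whose result converts to int.
    template <class IO>
    explicit RxTask(IO * io) : read_(&readThunk<IO>), readParam_(io) {}
    int poll(void * block) {
        assert(read_ != 0);
        return read_(readParam_, block);   // plain indirect call, no vtable
    }
private:
    ReadFunction read_;
    void * readParam_;
};

// Hypothetical device used only for the demonstration.
class CountingSocket {
public:
    CountingSocket() : reads_(0) {}
    int read(void * block) {
        *static_cast<unsigned char *>(block) = 0xAB;
        return ++reads_;
    }
private:
    int reads_;
};

int demoThunk() {
    CountingSocket soc;
    RxTask task(&soc);
    unsigned char block[4] = {0, 0, 0, 0};
    task.poll(block);
    int reads = task.poll(block);
    return (block[0] == 0xAB) ? reads : -1;
}
```

RxTask itself stays a non-template class, so one task type can serve any device, at the price of one indirect call per read.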
Mailbox and message
An example of an interface
class Message
{
    Message();
    void send();
    Message & receive();
};
|
// better way (?) separate message and mailbox
class Message
{
    Message();
};
template <class Message> class Mailbox
{
    Mailbox();
    void send(Message &);
    Message & receive();
    Message & receive(int timeout);
};
|
What we expect from Mailbox
- It should be equally efficient with both pointers and structures
- Sometimes we want to copy the message and sometimes we allocate a message from a pool and send a pointer.
|
Side remark here. Allocating messages from a pool is better than assuming that the mailbox copies the message. Overall performance
is going to be better. Consider the following code
typedef struct MessageT
{
    /* some data here */
} MessageT;
void sender()
{
    MessageT * message;
    // allocate block from pool
    pool->get( & message);
    // setup the data: message->data = ...
    // and send pointer
    mailbox->send(message);
}
void receiver()
{
    MessageT * message;
    mailbox->receive( & message);
    // process message->data
    // free block
    pool->free(message);
}
If we go with copying, we pay for memcpy at least once, and in some mailbox implementations twice. The natural
approach here is to restrict the application by providing only a send-pointer mailbox. It dictates the right task design
and improves the overall performance of the application. Another gain is that we can use preset message headers when
allocating from the pool. The allocator or pool can set the headers at the initialization phase or, for example, clean up
the block every time the application allocates it. It is very easy to forget to call memset(data, 0, sizeof(*data))
when setting data in some structure defined somewhere else. And an immediate remark on the remark. One can argue that
the application uses C++ classes instead of structures and the message constructor will initialize the data. The
argument has a flaw though. If messages are allocated from a pool, every constructor is called only once.
The application has only two alternatives here: call a method init(), or dynamically allocate the message and use memcpy() in
the mailbox. The right place to call init() from is the pool or allocator, because this is a single place and not every time
before mailbox->send().
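A sketch of a pool that calls init() on every allocation. MessagePool, AlarmMessage and the array-based free stack are assumptions made for the illustration, not the pool used elsewhere in this text.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical message: init() does what a constructor would do,
// but it can be repeated every time the block is reused.
struct AlarmMessage {
    int event;
    unsigned char payload[16];
    void init() { event = 0; std::memset(payload, 0, sizeof(payload)); }
};

// Pool of statically stored messages; get() is the single place
// where init() is called.
template <class Message, int Count>
class MessagePool {
public:
    MessagePool() : top_(0) {
        for (int i = 0; i < Count; ++i) free_[top_++] = &storage_[i];
    }
    Message * get() {
        if (top_ == 0) return 0;      // pool exhausted
        Message * m = free_[--top_];
        m->init();                    // every allocation starts clean
        return m;
    }
    void release(Message * m) {
        assert(top_ < Count);
        free_[top_++] = m;
    }
private:
    Message storage_[Count];
    Message * free_[Count];
    int top_;
};

int demoInitOnGet() {
    MessagePool<AlarmMessage, 2> pool;
    AlarmMessage * m = pool.get();
    m->event = 7;
    m->payload[0] = 0xFF;                 // leave stale data behind
    pool.release(m);
    AlarmMessage * again = pool.get();    // the block is reused
    return again->event + again->payload[0];
}
```

The sender never has to remember memset() or init(); forgetting it is simply not possible.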
One could also ask what happens if one task sends different messages to
different tasks, and there is a task which receives messages of different types. The typical advice here is not to
use void *. We can subclass some base message class and then typecast the object according to the message event.
And how do we know which pool to call to free the message? Here comes the data prefix. The pool can sign every block with a pointer
to itself. The method free() is static and can free any block no matter which pool it was allocated from. Safety nets
like watermarks, flags and even checksums can and should be put in place.
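A sketch of the data prefix, assuming four blocks of 32 bytes per pool and two arbitrary watermark constants. The layout and all names (Prefix, Slot, kInUse, kFree) are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

class Pool {
public:
    Pool() : freeCount_(kBlocks) {
        for (int i = 0; i < kBlocks; ++i) {
            slots_[i].prefix.owner = this;        // sign every block once
            slots_[i].prefix.watermark = kFree;
            free_[i] = &slots_[i];
        }
    }
    void * get() {
        if (freeCount_ == 0) return 0;
        Slot * s = free_[--freeCount_];
        s->prefix.watermark = kInUse;
        return s->data;                           // the caller sees only the data
    }
    // Static: the caller does not need to know which pool owns the block.
    static bool free(void * data) {
        assert(data != 0);
        Slot * s = reinterpret_cast<Slot *>(
            static_cast<unsigned char *>(data) - offsetof(Slot, data));
        if (s->prefix.watermark != kInUse) return false;  // double free or corruption
        s->prefix.watermark = kFree;
        Pool * owner = s->prefix.owner;
        owner->free_[owner->freeCount_++] = s;
        return true;
    }
    int available() const { return freeCount_; }
private:
    static const int kBlocks = 4;
    static const std::size_t kDataSize = 32;
    static const uint32_t kInUse = 0xA110C8EDu;
    static const uint32_t kFree = 0xF4EEB10Cu;
    struct Prefix { Pool * owner; uint32_t watermark; };
    struct Slot { Prefix prefix; unsigned char data[kDataSize]; };
    Slot slots_[kBlocks];
    Slot * free_[kBlocks];
    int freeCount_;
};

int demoPrefix() {
    Pool a, b;
    void * fromA = a.get();
    void * fromB = b.get();
    Pool::free(fromB);                  // finds its way back to b
    Pool::free(fromA);                  // finds its way back to a
    bool second = Pool::free(fromA);    // rejected by the watermark
    return a.available() * 10 + b.available() + (second ? 100 : 0);
}
```

The receiver can free any message with one call, and a double free is caught instead of corrupting the free stack.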
|
// Real class
template <class MessageType, class Lock> class Mailbox
    : public Object
{
public:
    Mailbox(const char * Name, int Size);
    ~Mailbox();
    void send(const MessageType * Message);
    inline void send(const MessageType & Message);
    void receive(MessageType * Message);
    // return 1 if new message
    // return 0 if timeout expired
    int receive(MessageType * Message, int Timeout);
protected:
    Semaphore Semaphore;
    FifoD<MessageType, LockDummy> * Fifo;
    const char *Name;
    inline void fetchMail(MessageType * Message);
private:
}; // class Mailbox
|
Every class in our OO application has a parent: Object. I assume here that we are not going to use multiple inheritance.
Mailbox is equally efficient when sending objects and pointers; pay attention to the two send() methods.
Semaphore (or signal) is used to block the calling thread until a message is available in the message queue.
FifoD contains a dynamically allocated array and two fields, head and tail; this is the message queue.
Name is the name of the mailbox, useful for debugging.
fetchMail() calls lock() and fifo->remove(). This method can be useful if the application decides to subclass
Mailbox and override receive(); it is just a backdoor for dirty designs. Example of usage:
class SendReceive
{
private:
    typedef Mailbox<int, LockOS> MailboxT;
    static MailboxT * mailbox; // defined below, outside of the class
    SendReceive()
    {
    }
public:
    static void init()
    {
        // create mailbox
        mailbox = new MailboxT("processor", 10);
        // spawn two tasks here if not exist already
        // ...
    }
    static void receiver()
    {
        while (1)
        {
            int message;
            mailbox->receive(&message); // receive() takes a pointer
        }
    }
    static void sender()
    {
        while (1)
        {
            int message = 1;
            mailbox->send(message); // uses send(const MessageType &)
        }
    }
};
SendReceive::MailboxT * SendReceive::mailbox = NULL;
From eCos comments (class Cyg_Mboxt):
Message/Mail Box. This template implements a queue of T's.
Implemented as a template for maximal flexibility; one would hope
that only one, with T==(void *) and the same number of them,
is ever used without very good reason.
|
Inline functions, define and const, static methods
NOT-TO-DO list
The following is mostly from http://www.caravan.net/ec2plus/rationale.html.
Not to use
- ...exception handling (time and memory requirements are unpredictable, there is performance penalty)
- ...RTTI (larger object code without significant gains)
- ...virtual inheritance (makes sense only if multiple inheritance is used)
- ...multiple inheritance (code is less readable, less re-usable, and more difficult to maintain)
|