What Belongs in a C .h Header File?

Wednesday, November 10th, 2010 by Michael Barr

What sorts of things should you (or should you not) put in a C language .h header file? When should you create a header file? And why?

When I talk to embedded C programmers about hardware interfacing in C or Netrino’s Embedded C Coding Standard, I often come to see that they lack basic skills and information about the C programming language. This is usually because we are mostly a gang of electrical engineers who are self-taught in C (and every other programming language we use).

When the subject of header files comes up, here’s my list of do’s and don’ts:

DO create one .h header file for each “module” of the system. A module may comprise one or more compilation units (e.g., .c or .asm source code files). But it should implement just one aspect of the system. Examples of well-chosen modules are: a device driver for an A/D converter; a communication protocol, such as FTP; and an alarm manager that is solely responsible for logging error conditions and alerting the user of the active errors.

DO include in the header file all of the function prototypes for the public interface of the module it describes. For example, a header file adc.h might contain function prototypes for adc_init(), adc_select_input(), and adc_read().
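
As a rough sketch of what such a header might look like, the listing below shows adc.h followed by a stand-in adc.c, placed in one listing only so it compiles on its own. The parameter and return types, the four-channel limit, and the fake sample table are assumptions made for illustration; a real driver would read hardware registers instead.

```c
#include <stdint.h>

/* ---- adc.h: public interface (hypothetical signatures) ---- */
#ifndef ADC_H
#define ADC_H

void     adc_init(void);
void     adc_select_input(uint8_t channel);
uint16_t adc_read(void);

#endif /* ADC_H */

/* ---- adc.c: would normally begin with #include "adc.h" ---- */

/* Simulated conversion results standing in for real hardware registers. */
static const uint16_t fake_samples[4] = { 100u, 200u, 300u, 400u };
static uint8_t current_channel;

void adc_init(void)
{
    current_channel = 0u;
}

void adc_select_input(uint8_t channel)
{
    /* Ignore out-of-range requests rather than index past the table. */
    if (channel < 4u) {
        current_channel = channel;
    }
}

uint16_t adc_read(void)
{
    return fake_samples[current_channel];
}
```

A client module would include only adc.h and call these three functions; nothing else about the driver is visible to it.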

DON’T include in the header file any other function or macro that may lie inside the module source code. It is desirable to hide these internal “helper” functions inside the implementation. If it’s not called from any other module, hide it! (If your module spans several compilation units that need to share a helper function, then create a separate header file just for this purpose.) Module A should only call Module B through the public interface defined in moduleb.h.

DON’T include any executable lines of code in a header file, including variable declarations. But note it is necessary to make an exception for the bodies of some inline functions.
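
The inline exception could look something like the sketch below. The header name bits.h and both helper functions are hypothetical; the point is only that a "static inline" body can sit in a header because each including source file gets its own private copy.

```c
#include <stdint.h>

/* ---- bits.h (hypothetical header) ---- */
#ifndef BITS_H
#define BITS_H

/* 'static inline' gives every including .c file its own copy with
 * internal linkage, so the body can live in the header without
 * causing multiple-definition errors at link time. */
static inline uint8_t bits_set(uint8_t value, uint8_t bit)
{
    return (uint8_t)(value | (uint8_t)(1u << bit));
}

static inline uint8_t bits_clear(uint8_t value, uint8_t bit)
{
    return (uint8_t)(value & (uint8_t)~(1u << bit));
}

#endif /* BITS_H */
```

Both helpers assume the bit index is in the range 0 to 7.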

DON’T expose any variable in a header file, as is too often done by way of the ‘extern’ keyword. Proper encapsulation of a module requires data hiding: any and all internal state data in private variables inside the .c source code files. Whenever possible these variables should also be declared with keyword ‘static’ to enlist the linker’s help in hiding them.
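
One possible shape for this kind of data hiding is sketched below, using the alarm manager mentioned earlier as the example module. All function names, the table size, and the duplicate-handling policy are assumptions for illustration. The header part and the source part are shown in a single listing so that it compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* ---- alarm.h: only the public functions are visible to clients ---- */
void    alarm_init(void);
bool    alarm_raise(uint8_t code);
uint8_t alarm_active_count(void);

/* ---- alarm.c: would normally begin with #include "alarm.h" ---- */
#define ALARM_MAX 4u

/* Private state: 'static' gives these names internal linkage, so no
 * other compilation unit can reach them with an 'extern' declaration. */
static uint8_t active_codes[ALARM_MAX];
static uint8_t active_count;

/* Private helper, hidden from other modules in the same way. */
static bool is_active(uint8_t code)
{
    for (uint8_t i = 0u; i < active_count; i++) {
        if (active_codes[i] == code) {
            return true;
        }
    }
    return false;
}

void alarm_init(void)
{
    active_count = 0u;
}

bool alarm_raise(uint8_t code)
{
    if (is_active(code)) {
        return true;                  /* already logged; nothing to do */
    }
    if (active_count >= ALARM_MAX) {
        return false;                 /* table full */
    }
    active_codes[active_count++] = code;
    return true;
}

uint8_t alarm_active_count(void)
{
    return active_count;
}
```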

DON’T expose the internal format of any module-specific data structure passed to or returned from one or more of the module’s interface functions. That is to say there should be no “struct { … } foo;” code in any header file. If you do have a type you need to pass in and out of your module, so client modules can create instances of it, you can simply “typedef struct foo moduleb_type” in the header file. Client modules should never know, and this way cannot know, the internal format of the struct.
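
A minimal sketch of the opaque-type idea follows. The typedef is the one quoted above; the three functions, the struct members, and the small static pool are hypothetical. In a real project the struct definition would live only in moduleb.c, so a client that includes moduleb.h could hold a pointer to the type but could neither read its members nor declare an instance directly; here it obtains instances through a create function instead.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- moduleb.h: clients see only an incomplete type ---- */
typedef struct foo moduleb_type;

moduleb_type *moduleb_create(void);
void          moduleb_add(moduleb_type *obj, uint16_t sample);
uint16_t      moduleb_total(const moduleb_type *obj);

/* ---- moduleb.c: the layout is known only in this file ---- */
struct foo {
    uint16_t total;
    uint8_t  in_use;
};

/* A small static pool, assumed here to avoid dynamic allocation. */
#define MODULEB_POOL_SIZE 2u
static struct foo pool[MODULEB_POOL_SIZE];

moduleb_type *moduleb_create(void)
{
    for (size_t i = 0u; i < MODULEB_POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1u;
            pool[i].total  = 0u;
            return &pool[i];
        }
    }
    return NULL;                      /* pool exhausted */
}

void moduleb_add(moduleb_type *obj, uint16_t sample)
{
    obj->total = (uint16_t)(obj->total + sample);
}

uint16_t moduleb_total(const moduleb_type *obj)
{
    return obj->total;
}
```

Because the layout is private, the struct members can later change without touching any client code, as long as the three function signatures stay the same.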

Though not really specific to embedded software development, I hope this advice on good C programming practices is useful to you. If it is, please let me know and I will provide more C advice in future blog posts.

Tags: architecture, embedded, firmware, programming, standards

This entry was posted on Wednesday, November 10th, 2010 at 11:23 am and is filed under Coding Standards.

44 Responses to “What Belongs in a C .h Header File?”

Lundin says:

November 11, 2010 at 9:00 am

I fully agree with everything stated. All of this is usually referred to as object-oriented design. A module with a well-defined access interface and private encapsulation: that’s what OO is all about.

Some additional advice regarding function prototypes:
In the context of function declarations, static/extern keywords are C’s equivalents of the more well-known private/public keywords in other languages.

Those functions in the .c file that don’t need to be exposed outside the module should be declared with the keyword static, making it impossible to call the function from outside the .c file:

static void internal_func (void);

To further emphasize which functions are public and which are private, you could also use the extern keyword for function prototypes that are public:

extern void adc_init (void);

In fact, if you write neither static nor extern in front of a function prototype, C implicitly treats it as if extern had been written in front of it.

GroovyD says:

November 11, 2010 at 9:10 pm

‘DON’T include any executable lines of code in a header file, including variable declarations.’

don’t you mean variable definitions? where would the declaration of a global variable go? i have yet to see a program (especially embedded) that doesn’t use them. granted, globals are generally a bad idea.

- Michael Barr says:
  
  November 12, 2010 at 2:50 pm
  
  Yes there is a difference between the definition of a global variable (e.g., uint8_t g_counter = 0;) and an extern declaration of that variable (e.g., extern uint8_t g_counter;). But NO you shouldn’t put EITHER into any header file.
  
Peter Bushell says:

November 12, 2010 at 5:53 am

A good, succinct, no-nonsense article, Michael. However, it didn’t address the important matter of comments – specifically usage comments, in the case of a header file.

This omission prompted me to start blogging again myself. My musings are here:
software-integrity.com/blog/2010/11/12/documenting-code/

All feedback gratefully accepted.

- Michael Barr says:
  
  November 12, 2010 at 2:49 pm
  
  Nice comment and follow-on blog post, Peter. I completely agree that comment blocks describing how to use the public API of a module belong in that module’s header file. Too bad the standard industry practice is to hide this type of comment in the implementation source file instead.
  
Glenn Scheibel says:

November 12, 2010 at 11:23 am

“DON’T expose any variable in a header file”

Boy, if I had a nickel…

The sadness is that there are so many C developers in general (not just embedded) that don’t get this. When you don’t get this part and cannot properly decompose source code into modules, you end up with total spaghetti.

This is the cruft that leads to disgusting stuff like this:

#ifdef _FOO_C_
#define EXTERN
#else
#define EXTERN extern
#endif

- Michael Barr says:
  
  November 12, 2010 at 2:49 pm
  
  I hate that cruft! Never a good sign to find a #define EXTERN extern.
  
  - Philippe Meilleur says:
    
    November 22, 2010 at 5:01 pm
    
    Michael,
    
    What about using the following preprocessor code:
    
    // in filename.h
    #ifdef FILENAME_C
    #define FILENAME_EXTERN
    #else
    #define FILENAME_EXTERN extern
    #endif
    
    Given that the FILENAME_EXTERN macro name is unique within the project, it is used to prototype the public function(s) only:
    
    FILENAME_EXTERN void public_function1(uint8_t arg1, uint16_t arg2, void * p_generic_arg3);
    
    In my opinion this method removes the need to prototype the function locally in the C source file (a duplicate). I have seen cases where the type of some arguments was changed in the C source file and in its local prototype during development, but not in the extern declaration in the header file.
    
    Even if static analysis tools could see such errors, I believe the best is always to minimize the risks from the source by using good work methods.
    
    - Michael Barr says:
      
      November 23, 2010 at 10:15 am
      
      Philippe,
      
      According to the C standard “void public_function1(…)” is equivalent to “extern void public_function1(…)”. That is to say that wherever function prototypes appear they are assumed extern. Thus I believe the code above is a long-winded way of saying nothing. (This “extern by default” property of functions is why we seek to hide the prototypes for “helper” functions in the .C module–and also tag them “static”.)
      
      There is a simpler way to prevent “the need to prototype locally in the c source file (duplicate).” The simple rule is that any .C module should always #include the associated .h file. So, for example, “adc.c” should always #include “adc.h”. Here’s how we’ve worded that rule in Netrino’s Embedded C Coding Standard:
      
      Rule 4.3.c: Each source file shall always #include the header file of the same name (e.g., file adc.c should #include “adc.h”), to allow the compiler to confirm that each public function and its prototype match.
      
      Hope this helps.
      
      Cheers,
      Mike
      
      - David Brown says:
        
        March 22, 2011 at 10:22 am
        
        I don’t agree with everything you write in this article, but I do agree with this. I have never been able to understand the “#define EXTERN extern” nonsense that some people seem to like.
        
        Either a function is local to the C file (e.g., adc.c) and it is declared “static”, or it is exported and there is an “extern void foo(void)” in the header file “adc.h” and “void foo(void) { .. }” in the implementation file. The implementation file “adc.c” always has #include “adc.h”.
        
        Such a rule is simple, clear and reliable, and compilers can check that implementations match the extern declarations.
      - smertrios says:
        
        December 4, 2012 at 11:33 am
        
        Mike, I’m missing something here… when you say “void fn();” is equivalent to “extern void fn();” then what’s the point of using the “extern” keyword when applied to function prototypes? it seems you’re implying it makes no difference. is that correct?
- David Brown says:
  
  March 22, 2011 at 11:07 am
  
  If I had a nickel for every time someone said “don’t use global variables, use accessor functions” I’d be rich. Think about the difference between these two modules:
  
  volume1.h:
  extern int volume;
  
  volume1.c:
  #include "volume1.h"
  int volume = 0;
  
  volume2.h:
  extern int getVolume(void);
  extern void setVolume(int newVolume);
  
  volume2.c:
  #include "volume2.h"
  static int volume = 0;
  int getVolume(void) { return volume;}
  void setVolume(int newVolume) { volume = newVolume;}
  
  There are plenty of programmers who will swear blind that volume2 is the “right” solution, because it hides the global data. But does it /really/ hide anything? No, in fact the user has exactly the same access in the same way. It is just less convenient, less clear, and leads to bigger and slower code.
  
  It is perhaps easier, or at least more tempting, to abuse global variables than to abuse access functions. But in the end you are doing the same thing – you are communicating data into and out of a module. You need the same discipline, and the same understanding of the consequences and validity of using the interface, whether the interface is a global variable or an accessor function.
  
  People will often argue that by abstracting the access with a function, it is easier to change the code later. So what? If you want to change how the global variable is accessed, then change it. You should not be accessing that variable very often in the program anyway – if you /are/ accessing it a lot, then it’s an indication that your code structure and discipline is poor. And if you change the functionality of the accessor functions, you are probably going to need to check all the code that uses it anyway.
  
  I agree that too many global variables is a sign of a messy structure. But too many unnecessary extra functions just make it worse. Global data is one of C’s ways to implement interfaces between modules – it is a poor idea to dismiss it because of religious convictions.
  
  - Michael Barr says:
    
    March 22, 2011 at 1:28 pm
    
    As a general rule, if your API design includes accessor functions of the get/set variety you describe, then you are doing API design wrong. This cruft is all too common.
    
    But that doesn’t make data hiding altogether bad. There are good ways to design APIs around abstract data types too.
    
    - David Brown says:
      
      March 22, 2011 at 3:07 pm
      
      Oh, I agree with that – data hiding in general is a good idea. What I disagree with is /obsessive/ data hiding, especially when people treat something like “thou shalt not use global variables” as an unbreakable rule. If there is one thing that is constant in embedded development, it is that the answer is always “it depends”. It is good to hide the details of your implementation within the relevant modules, but only if the cost of doing so is not too high.
      
  - gallier2 says:
    
    March 22, 2011 at 3:14 pm
    
    I’m with David Brown here. Of course, the need for global variables should be kept to a minimum; if a project is riddled with them, it is difficult to scale. That said, if you do have a clear use for a global variable, there’s no need to force stupid accessors. As an example, in our project we have a clear need for lookup tables, which are constant, relatively big, and used in every sub-module of many sub-projects. Declaring an
    extern const wchar_t cp1252_unicode[MAX]; in the interface of the module is therefore the simplest, fastest and cleanest way of implementing it.
    
    The evil of global variables comes when they are used to pass state between functions. A global variable is a piece of information concerning the state of the application.
    
    - David Brown says:
      
      March 23, 2011 at 3:07 am
      
      It is true that global variables can make a program difficult to scale, and they can make it difficult to re-use modules – it is harder to make a rigid and well-defined interface to a module when you have global variables.
      
      But this is embedded programming – for many programs, you are not interested in scaling. You know what the program has to do, and that is /all/ it has to do. If your program has to control the speed of an electric drill motor, then that is what it is doing – it doesn’t have to control two motors, or play music in the background. It’s okay to have state information such as “motorRunning” as global variables – as long as other modules use it safely. But that is no different from any other way of reading or changing state – functions such as “startMotor()” and “stopMotor()” must be used with the same care.
      
      When programs get bigger, and programming teams get bigger, you have to take more care to make your interfaces rigid and restricted, and to add greater checks on the usage of the interface – and global variables are not popular here (except with C++, where global object instances can be okay). But when programs are smaller, you aim for simple, clear and efficient interfaces, and global variables play a much bigger role.
      
      Regarding constant data, however, it is fine to make your consts global. In fact, it is very common to put small constants directly in the header files as “static const int sizeOfData = 123;”. It is as clear and safe as using an “extern const int sizeOfData;”.
      
  - Tod Gentille says:
    
    May 25, 2011 at 1:35 pm
    
    Re: “People will often argue that by abstracting the access with a function, it is easier to change the code later. So what? If you want to change how the global variable is accessed, then change it.”
    
    OK How? If I decide that my volume variable should now always be read from an external widget with accessors I just change the implementation of getVolume and I can be confident that all access to “volume” is updated. If I exposed a global variable I’m stuck. I have no control of how users of my framework/library/module are using that global variable. That’s why accessors are “generally” a good idea. Even if you’re the only one using your code they’re generally a good idea. Do you really want to package up reusable code into a library that exposes global data? If you need to optimize (and you probably don’t) you can inline the accessor code and it’s probably no less efficient or bloated than the global variable approach. It’s just safer and one heck of a lot easier to debug for the maintenance programmer that inherits your code. I’ve seen lots of hard-to-maintain code that uses global data, and not once has it been justified by performance or code-size issues, just laziness or ignorance. I’ve never cursed a programmer for making me use an accessor method. The commandment by the way is: “Thou shalt not use global data without a damn good reason that the maintenance programmer agrees with.”
    
  - Garry says:
    
    January 16, 2012 at 11:16 am
    
    It is clear that volume1 and volume2 are NOT equivalent.
    
    The bulk of the objection to volume2’s use of access functions seems to be “less convenient, less clear, and leads to bigger and slower code”. Is that a fair summary?
    
    If the syntax of updating ‘volume’ were identical to using volume1 in the source code, would the ‘less convenient, less clear’ objections go away?
    If the speed and size difference could be eliminated by correct use of compiler flags, would those objections go away?
    
    Put another way, are all of those objections potentially soluble by careful use of a programming language like C++ or C#?
    
    Some of the issues comparing the two approaches are discussed, but some are not. If all of those objections go away, then IMHO we should consider the other side of the argument.
    
    The code to actually *update* volume in volume1 is in every part of the program that updates it, but the code to actually *update* volume in volume2 can be forced to exist at only one place. Depending on the development system, it may be possible to choose to distribute the update code to all the update sites using volume2 by changing a few compiler flags and/or a single piece of source code. So, not only are the two approaches functionally different, the scope to change the functional behaviour of the program during development is different.
    
    This wasn’t discussed as a difference, and I think it can be important.
    
    A contrived example: If the debugging system does not support ‘data watchpoints’ but only a limited number of breakpoints, the difference may become extremely significant. An intermittent problem might be hard to find, and access functions may be better than direct data access. One approach might be to augment the code around the setting of volume to log a little bit of extra information to help diagnose the error. For example, the return address of the code attempting to update the variable is available (maybe in a link register, or on a stack). None of the other code would need to be changed, so the ‘Heisenberg effect’ of making code changes is likely more limited in the case of accessors than global data.
    
    I am putting the case that writing code which makes it easy to take a defensive approach, in the absence of other issues, is inherently worth considering as a ‘good thing’.
    
    I don’t accept the “you should just get yourself a better debugger and development system” response. That seems equivalent to saying, “make the world work in a different way, and you would have a solution to that development problem”. IMHO, it is a smaller problem to write robust code than make the world work in such a way that all development problems can be sidestepped.
    
    I have not seen any solid, evidence based, research that supports the view that access functions are inherently less clear than assignment. Maybe someone can direct me to it? I tend to think both sides of the debate are ‘religious war’ stuff in the absence of evidence.
    
    I am willing to suggest that the fact that there *is* an explicit difference between local variables and ‘globals’, where locals can be assigned directly, and all ‘global’ data must be ‘accessed’, might actually help.
    Of course, some programming languages allow that syntactic difference to be hidden, so that may be a moot point.
    
    I especially accept a concern over ‘lack of syntactic clarity’ is important, but that may depend on the language used. If ‘lack of syntactic clarity’ is the main objection, then it may also be an argument for some languages and against others.
    
    I am simply aiming to ensure a more complete set of issues are considered, and clarify the previous objections.
    
david collier says:

November 16, 2010 at 10:06 am

Of course the biggest example of a public-domain C program, that is the Linux sources, seems to follow neither your rules nor any other rhyme or reason.

What I LIKE to see is the .h file imported into the .c file so that the .h file provides function prototypes for the implementation. It’s a swine to implement that properly in C though.

Or we can all learn to program in a language that works in modules, then learn C afterwards.

- Peter Bushell says:
  
  November 26, 2010 at 2:31 pm
  
  David, I don’t see what is difficult about including a header file in its related C file, so that the compiler can check for inconsistencies.
  
  Of course, it’s a swine to enforce!
  
  Sadly, it surprises me not at all that the Linux source coders paid little heed to such things.
  
  - kalpak dabir says:
    
    December 11, 2010 at 1:56 am
    
    If something important is a swine to enforce, isn’t that a flaw in the language design or implementation?
    
    Also, wouldn’t it be better if the definition was done in the .h file rather than write separate code again for global variable initialization?
    
    Ideally, that defined value will be a meaningful #define in the same .h file rather than a magic number.
    
    - Peter Bushell says:
      
      December 22, 2010 at 6:54 am
      
      (a) If C required prototype function declarations for all functions defined in a file (not just called there), that would be a step in the right direction. However, it doesn’t. C is by no means flawless, I have to agree.
      
      (b) Again, I agree with your point, but in C you can/should put only the declarations in the .h file. More importantly, your question becomes academic if there are NO global variables. The real question for me is: “Why does C permit global variables?”. Note that I’m not including static variables in my definition of “global”, but these belong in the .c files anyway. I never define global variables in my code and “extern” has been expunged from my C vocabulary! Except, of course, when I have to work on some other people’s code and don’t have the opportunity to redesign it.
      
      (c) Academic (see (b)), but I don’t agree with you here, anyway. Although I never define global variables, I do sometimes define global constants – but only in C++ which (unlike C) has no problem with their being defined in header files. Having defined such a constant in a header file (near the top, of course), I see no reason to have a separate #define for its value, just a few lines away in the same file. Indeed, C++ treats constants the way it does in order to promote the usage of const instead of #define, in this context. However, this, being about C++, is off-topic…
      
- David Brown says:
  
  March 22, 2011 at 10:31 am
  
  Linux is not public domain. There is a vast difference between public domain code and open source code.
  
  There are also different reasons for having different styles of C programming, and different rules for how code is built up. Michael Barr, and presumably most of the readers of this site, are concerned with embedded C programming. These are typically programs consisting of a few dozen modules with perhaps tens of thousands of code lines, written by one or two programmers. Linux has many thousands of modules, millions of code lines, and thousands of developers scattered around the world. Complaining that Linux code is not organised like your own code is like saying that the books on your shelf are organized by height, because that looks neater, and then complaining when a library sorts them differently.
  
  Linux does have style guides, which are easily found by anyone who wants to read them. I certainly wouldn’t claim that all of the Linux code base follows all the rules, nor that they are the best rules for a project of that size. But the situation is not nearly as bad as you seem to think.
  
dan says: