
File Inclusion

Y-Less edited this page May 22, 2018 · 1 revision

Introduction

File inclusion and include guards in PAWN are surprisingly complex topics. They have been documented, poorly, in a few places, but never comprehensively in one place. YSI exploits their intricacies extensively, which is why #include <YSI/y_ini> will fail while #include <YSI\y_ini> will work.

Simple Includes

Let's look at a very simple include statement:

#include <a_samp>

That will include the file called a_samp.inc (actually, the compiler tries several extensions, including .inc, .p, and .pawn) from the include files directory. Using double quotes instead will include it from the current directory (i.e. the directory containing the file that is doing the including):

#include "a_samp"

When including the file, the compiler also creates an internal define named _inc_ plus the filename, so in this case _inc_a_samp. This is just a normal pre-processor symbol and can be removed or tested for:

#undef _inc_a_samp
// Or:
#if defined _inc_a_samp
#endif

When you include a file a second time, the compiler looks for this internal symbol. If it already exists, the file doesn't get included again. The new compiler DOES NOT create this symbol unless you pass the -Z flag.
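
As a hedged sketch of when removing the symbol is useful, assume a hypothetical file my_template.inc (not a real library file) that has no manual guard and is meant to be included more than once. Testing for the symbol first keeps the code valid on the new compiler without -Z, where the symbol never exists:

```pawn
// main.pwn
// `my_template` is a hypothetical include, used for illustration only.
#include "my_template"

// The automatic guard only exists on the old compiler, or on the new
// compiler with -Z, so test for it before trying to remove it.
#if defined _inc_my_template
	#undef _inc_my_template
#endif

// With the symbol gone, the file is read a second time.
#include "my_template"
```

On the new compiler without -Z the #if block is simply skipped, and both includes should be processed anyway because no automatic guard was created.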

Most includes ALSO have a C-style manual include guard at the top of the file. For a_samp this looks like:

#if defined _samp_included
	#endinput
#endif
#define _samp_included

The first time the file is included, _samp_included is not defined, so it gets defined. The second time, _samp_included is already defined, so the file ends instantly rather than duplicating all the code it contains. However, because of the compiler-internal guard, that file will never be included twice to hit this check. So why does it exist at all? Simply because the internal guard was not known about when the code was originally written. The benefit is that it also works on the new compiler, with no updates or modifications.

Directories

The include guard name is based ONLY on the current filename, not its directory:

#include <dir1\include> // Creates _inc_include
#include <dir2\include> // Finds _inc_include already

That code will only include one of the two files, even though they are different files: the internal symbol is built from the filename alone, and the directories are ignored. The separator between directory and filename is whichever of \ or / appears first in the path, and ONLY that one (per include):

#include <dir1\include>
#include <dir2/include> // Also doesn't work
#include <dir1\subdir\include>
#include <dir2\subdir/include> // DOES work

Why does the second pair work? In its first line, dir1\subdir\include, the first directory separator seen is \, so that is the separator used throughout; the filename is seen as include, and the created symbol is _inc_include. In its second line, dir2\subdir/include, the first separator is also \, so that is taken as the ONLY separator, and / is assumed not to be one. Instead of trying to open the file include in the directory dir2\subdir\, the compiler tries to open the file subdir/include in the directory dir2\. This works perfectly fine, but means that the internal symbol created is _inc_subdir/include. This is NOT the same as the first one, and so the inclusion continues.

This is a little confusing! Also, you might think that _inc_subdir/include is not a valid symbol name, and you'd be right. This method slightly bypasses the compiler's normal checks on names, which means you can't do:

#undef _inc_subdir/include

etc. Once that symbol exists, it can't be disposed of, and can't be checked for.

Includes In Includes

The rules for directory separators are transitive. The compiler first builds the full path for an include, then applies the rules for symbol names:

mode.pwn:

#include <YSI_Storage\y_ini>

YSI_Storage\y_ini:

#include "y_ini/impl"
#include "y_ini/tests"

The first file will include a second one named YSI_Storage\y_ini from the includes directory. That file will in turn include y_ini/impl from a subdirectory relative to itself. This means the full path of that nested include is YSI_Storage\y_ini/impl. Suddenly you have a mixed separator path! Many files in YSI have their implementations in impl.inc - all the same filename, but no collisions because of this file path manipulation.
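
The layout described above might be sketched as follows, using a hypothetical library my_lib with two modules. The names my_lib, y_alpha, and y_beta are made up for illustration, and the symbol names in the comments assume the separator rules described earlier:

```pawn
// mode.pwn
#include <my_lib\y_alpha>
#include <my_lib\y_beta>

// includes\my_lib\y_alpha.inc
// Full path: my_lib\y_alpha/impl, so the symbol is _inc_y_alpha/impl.
#include "y_alpha/impl"

// includes\my_lib\y_beta.inc
// Full path: my_lib\y_beta/impl, so the symbol is _inc_y_beta/impl.
#include "y_beta/impl"
```

Both implementation files are called impl.inc, yet their symbols differ. Written as "y_alpha\impl" and "y_beta\impl" instead, both would map to _inc_impl, and with automatic guards the second would be skipped.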

Relative Locations

Remember that when using a\b\c/d the path is a\b\ and the file is c/d (which the OS correctly interprets as a full path), and that when using "include" the path is relative. What happens if you include a file from a mixed separator file?

main.pwn:

#include <dir1\dir2\dir3/include1>

dir1\dir2\dir3/include1.inc:

#include "include2"

What file will that try to include? Because " is used, the final file will be included relative to the directory of dir1\dir2\dir3/include1.inc - which is dir1\dir2\, NOT dir1\dir2\dir3/. If you put include2 in the same directory as include1, then try to include them both in this way, it won't work. This is the reason why in YSI no impl.inc files or similar have any includes in them - all inclusions are done from the base file such as YSI_Storage\y_ini.inc.
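
A sketch of the consequence, using the same example files: following the rule above, a quoted include written inside include1.inc is resolved from dir1\dir2\, so a file stored next to include1 would have to be named with its subdirectory again. This is an inference from the behaviour described here, not something confirmed against every compiler version:

```pawn
// Inside dir1\dir2\dir3/include1.inc (included as <dir1\dir2\dir3/include1>):

// #include "include2"      // would look for dir1\dir2\include2.inc
#include "dir3/include2"    // assumed to resolve to dir1\dir2\dir3/include2.inc
```

Keeping every include in the base file, as YSI does, avoids having to reason about this at all.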

Depending on the compiler options and OS, you may already have \s in your path even before your first include:

#include <dir_1/include_1>

On Windows the compiler will translate that to something like:

C:\sa-mp\pawno\includes\dir_1/include_1

Giving you a mixed separator path immediately.

Long Paths

Given an include of:

#include <dir_1\dir_2/include>

The compiler with automatic include guards will generate the internal symbol _inc_dir_2/include. This can't be removed, as previously mentioned, because this is invalid syntax:

#undef _inc_dir_2/include

However, there is a way around this. Include guards, like all symbols, are limited to 31 characters (32 including the NUL terminator). Therefore, if the directory is long enough, the / will not appear in the symbol:

#include <dir_1\dir_2__with_a_long_suffix_/include>
#undef _inc_dir_2__with_a_long_suffix_

The symbol name is truncated (without a warning) to _inc_ plus the first 26 characters that follow the separator. If fewer than 26 characters precede the /, the symbol is still truncated, but the / survives in it:

#include <dir_1\dir_2_with_a_long_suffix/include>
#undef _inc_dir_2_with_a_long_suffix/i // Invalid again.

With more than 26 leading characters, you may have different directories and paths that appear to the compiler to be the same:

#include <dir_1\an_even_longer_suffix_on_dir_40/include_40>
// The next file won't get included because `_inc_an_even_longer_suffix_on_d` already exists.
#include <dir_1\an_even_longer_suffix_on_dir_30/include_30>
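
One way to avoid such a collision, sketched here with hypothetical directory names, is to put the distinguishing part of the name within the first 26 characters. In this example the first 26 characters of each directory name differ, and the / falls outside the truncated symbol:

```pawn
// Truncates to _inc_d40_an_even_longer_suffix_ (5 + 26 = 31 characters).
#include <dir_1\d40_an_even_longer_suffix_on_dir/include_40>

// Truncates to _inc_d30_an_even_longer_suffix_, a different symbol,
// so this file is included as well.
#include <dir_1\d30_an_even_longer_suffix_on_dir/include_30>
```

Because neither truncated symbol contains a /, both should also be usable with #if defined and #undef, as in the earlier long-suffix example.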