
NGINX gotchas

NGINX is a wild beast of a program. Its configuration looks straightforward on the surface, yet it is one of the more complex proxy servers out there, and its rich feature set means it can be used for much more than simple proxying. But that complexity and those features often make it unwieldy to use, and it carries numerous bugs that have existed for so long they have essentially become features, because fixing them now could break hundreds of thousands of websites. Here I’ve collected some of the “gotcha” moments I’ve experienced in my time using it. I will update this article should I come across any more.

Inheritance

Martin Fjordvald wrote a nice article about this that I think explains it perfectly:

The default inheritance model is that directives inherit downwards only. […] When it comes to inheritance behaviour there are four types of configuration directives in nginx:

  • Normal directive – One value per context, for example: “root” or “index”.
  • Array directive – Multiple values per context, for example: “access_log” or “fastcgi_param”
  • Action directive – Something which does not just configure, for example: “rewrite” or “fastcgi_pass”
  • try_files directive.

Essentially, normal directives propagate downwards into new contexts. Array directives do too, but declaring one in a child context replaces the entire inherited set rather than adding to it. Action directives usually do not propagate and must be repeated in the child context if they should take effect there. This is not obvious at first, and because the block syntax resembles object-oriented languages, it can give the false impression that everything inherits in all cases, or that NGINX configuration is imperative (it isn’t).
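
As a rough illustration of the array case, here is a minimal sketch. The paths, the APP_ENV parameter and the backend address are placeholders made up for this example and do not come from Martin’s article.

```nginx
# Fragment: belongs inside an http { } block.
server {
    listen 80;
    root /var/www/example;            # normal directive: inherited by every location below

    # array directive declared at server level
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param QUERY_STRING    $query_string;

    location /status {
        # no fastcgi_param here, so both server-level values are inherited
        fastcgi_pass 127.0.0.1:9000;  # action directive: must be repeated per location
    }

    location ~ \.php$ {
        # declaring any fastcgi_param here discards the inherited set,
        # so SCRIPT_FILENAME and QUERY_STRING have to be listed again
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param QUERY_STRING    $query_string;
        fastcgi_param APP_ENV         production;
        fastcgi_pass 127.0.0.1:9000;
    }
}
```

A common way to avoid the repetition is to keep the shared parameters in a file and include it in each location that adds its own.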

If is Evil… with regular expressions

A matching if directive creates a new context. Because of this, directives that are not inherited, like try_files, will not carry over into it. This is well known and documented, even by the developers themselves, leading to the idiom of If Is Evil. However, one often overlooked side effect is that captures from the location block are overwritten when a regex is used in the if: the positional captures ($1, $2 and so on) are replaced by those of the if’s own regex, as is any named capture whose name is reused. Captures must therefore be saved to other variables with the set directive, or taken with the named capture syntax ((?<varname>…)), if they are to be used inside an if with regex. I have not seen this documented elsewhere, but it makes sense and is likely not a bug.
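
Both workarounds can be sketched as follows. This is only an illustration: the URIs, the variable names $saved_path and $file_path, and the user agent check are invented for the example.

```nginx
# Fragment: belongs inside a server { } block.

# Variant 1: copy the positional capture before the if evaluates its regex.
location ~ ^/download/(.+)$ {
    set $saved_path $1;
    if ($http_user_agent ~* "(curl|wget)") {
        # $1 now holds the matched client name, not the path
        return 302 /cli/$saved_path;
    }
}

# Variant 2: a named capture is stored under its own variable name.
location ~ ^/files/(?<file_path>.+)$ {
    if ($http_user_agent ~* "(curl|wget)") {
        return 302 /cli/$file_path;
    }
}
```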

root vs alias

Many people ask about the difference between the root and alias directives. Both set the document root from which files are served. But while root’s functionality ends there, alias changes a number of other things as well.

root

When using the root directive, the document root is set to the path specified in the directive. This path is exposed via the $document_root variable. System filenames in NGINX are often constructed relative to the document root, simply by concatenating the root and desired paths. The target file referenced in a request is also constructed in this manner, and is exposed via $request_filename. For example, with the following configuration and a request URI of /i/top.gif:

location /i/ {
    root /data/w3;
}

The $document_root is /data/w3, and the URI ($uri) is /i/top.gif. The $request_filename is hence /data/w3/i/top.gif. The default behavior in NGINX without an action directive is to serve the file present at that path. If the path points to a directory and the URI lacks a trailing slash, NGINX redirects to the same URI with a slash (/) appended. If the ngx_http_index_module is enabled and the $uri ends with a slash, then index files will be tried under that directory (by default: index.html).
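
To make the index lookup explicit, the same location could be written as in the sketch below. The second file name, index.htm, is an assumption added for illustration.

```nginx
location /i/ {
    root /data/w3;
    # tried in order for URIs ending in a slash:
    # "/i/" checks /data/w3/i/index.html, then /data/w3/i/index.htm
    index index.html index.htm;
}
```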

alias

When using alias, the document root is also set to the path specified in the directive, as with the root directive. However, this also marks the location directive block as using aliasing, which modifies the function of some directives and other code.

location blocks, by default, match against the beginning of request URIs. When constructing filenames in alias mode, NGINX removes the matching location prefix from the beginning of the request URI before appending the remainder to the document root. The argument of the location directive is thus effectively “replaced” with the alias. Let’s modify the example above to illustrate the process:

location /i/ {
    alias /data/w3/images/;
}

The $document_root is /data/w3/images/, and note that the $uri is still /i/top.gif as in the previous example. To compute the $request_filename, NGINX takes the URI, strips the location prefix /i/ from the beginning, and appends the result to the document root, which was set by the alias directive. As such, the $request_filename is now /data/w3/images/top.gif. This also explains why $document_root$uri doesn’t come out as expected when using aliases, which seems to be a common misunderstanding: the $uri variable isn’t changed; rather, the directives and other code strip the prefix themselves before using the URI.
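
A quick way to see these values for yourself is to have the location echo them instead of serving the file. This is a debugging sketch only, not something to leave in a production configuration.

```nginx
location /i/ {
    alias /data/w3/images/;
    default_type text/plain;
    # For a request to /i/top.gif the walkthrough above predicts:
    #   document_root=/data/w3/images/
    #   uri=/i/top.gif
    #   request_filename=/data/w3/images/top.gif
    return 200 "document_root=$document_root\nuri=$uri\nrequest_filename=$request_filename\n";
}
```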

Regular expressions in location and how it affects alias

When using regular expressions in location directives with alias mode enabled, trimming of the beginning of the request URI was never implemented. Regardless of the reasons why, this omission fundamentally and unintuitively changes how alias works, as it leads to compounding differences in functionality:
1. The location prefix is not removed from the request URI in places where it otherwise would be with alias mode enabled.
2. When calculating the $request_filename, adding the whole untrimmed URI to the end of the document root doesn’t make sense in the context of the alias directive, so it isn’t done.
3. To get around this shortcoming, NGINX developers specify that capture group references from the location directive should be used in the alias directive to translate the URI into a fully resolved path. The value of alias is then used to set $request_filename directly.
4. In this configuration, $request_filename and $document_root are equal, which is a major departure from existing conventions and expectations. This could introduce issues with other parts of the configuration that expect an upper level directory as the document root, especially FastCGI scripts.
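
The capture-based form from step 3 looks like the sketch below, which follows the pattern shown in the NGINX documentation for alias. The comments restate the behavior described in the list rather than adding anything new.

```nginx
location ~ ^/users/(.+\.(?:gif|jpe?g|png))$ {
    # The whole path is built from the capture; nothing is trimmed from $uri.
    # For /users/a/b.png:
    #   $request_filename is /data/w3/images/a/b.png
    #   $document_root    is the same full path, not /data/w3/images/
    alias /data/w3/images/$1;
}
```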

Clearly, care needs to be taken that scripts and other components are using the right directories and files in this configuration.

So why would you want to use alias over other options when using regex in location?

  • alias can be used to point to a file or directory explicitly, whereas root always has the request URI appended, even when using regex;
  • alias and try_files can be combined to make the paths in try_files shorter (but note that try_files with alias location prefix replacement only works sometimes);
  • alias is inherited by child blocks where try_files and other solutions are not, potentially reducing code duplication;
  • On a micro-optimization level, alias should be slightly faster than try_files, and only marginally slower than root.
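
As a sketch of the first point, a regex location can map a URI onto a file whose name does not appear in the URI at all. The URI scheme, the capture name user and the avatars directory are hypothetical.

```nginx
location ~ ^/users/(?<user>[^/]+)/avatar$ {
    # /users/alice/avatar is served from /data/w3/avatars/alice.png;
    # with root, the full URI /users/alice/avatar would be appended instead
    alias /data/w3/avatars/${user}.png;
}
```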

try_files with alias location prefix replacement only works sometimes?

try_files handles replacing the location prefix within its own logic when alias is used, as other parts of NGINX do. However, it does not always replace the prefix, and this has caused ongoing confusion for a number of users. Consider the example presented in the bug report:

# bug: request to "/test/x" will try "/tmp/x" (good) and
# "/tmp//test/y" (bad?)
location /test/ {
    alias /tmp/;
    try_files $uri /test/y =404;
}

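Logically, one would assume that the second path would resolve to /tmp/y, as the prefix /test/ matches the prefix in the location directive. However, it turns out that try_files only replaces the prefix when the parameter contains a variable and therefore goes through script processing; a literal parameter such as /test/y is appended to the alias path as-is.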

There has been heated debate over the source of this discrepancy, and whether or not it’s a bug; but looking at the code makes it plain to see that this is either a defect that needs to be fixed, or something that needs to be documented. The documentation specifically says:

The path to a file is constructed from the file parameter according to the root and alias directives.

Yet, this is not true if the alias directive is used and there is no variable or other script processing in the path.

I am fairly confident this could be fixed by simply moving the part that strips the prefix outside the block for script processing.

However, fixing this issue could potentially break configurations, though I consider it unlikely. Since the ticket was opened 12 years ago as of this writing, it might as well be yet another layer of “yes, this is a problem, but it has existed for long enough that it is now expected behavior” that we have all come to know and love in NGINX.
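
Where the on-disk layout can be made to mirror the URI, one way to sidestep the question entirely is to use root instead of alias, since no prefix replacement is involved. A minimal sketch, assuming the files can live under a hypothetical /tmp/test/ directory rather than /tmp/:

```nginx
location /test/ {
    root /tmp;
    # "/test/x" tries "/tmp/test/x", then the fallback "/tmp/test/y"
    try_files $uri /test/y =404;
}
```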

See also

DigitalOcean Community: Understanding the Nginx Configuration File Structure and Configuration Contexts
Understanding Nginx Server and Location Block Selection Algorithms | DigitalOcean
Nginx location directive examples | DigitalOcean
Pitfalls and Common Mistakes | NGINX
Nginx Guts
