Coppermine Photo Gallery v1.5.x: Documentation and Manual

Sanitization of Superglobals using Inspekt

Target audience

This part of the documentation is not meant for end users of Coppermine, but only for developers and skilled users who are familiar with coding. There is no support for this section; it comes as-is.

For Coppermine dev team members, this piece of documentation is meant both as a reference and as a programming guideline.

For users, this section can be helpful if you want to modify your Coppermine gallery, add functionality, and use superglobals (like $_GET and $_POST variables) in those custom scripts.

What's new?

Starting with cpg1.5.0, the individual components of Coppermine (script files like displayimage.php) can no longer access superglobals like $_GET or $_POST directly. Instead, those superglobals are put into a "cage" object when the Coppermine basic includes are loaded and all variables and functions are initialized. Once the cage has been populated, the superglobals are emptied and can no longer be accessed, so you can no longer use them the way you might be used to. Confused? Read on.

Reason

Everyone knows that you should filter your inputs, and most good programmers do. But when a large team of programmers works on an open source project, things slip and errors creep in. At times like this you wish for a mechanism that prevents your team from making such mistakes: something that forces them to declare their intent.

Most of the vulnerabilities discovered in cpg1.4.x were caused by user input that was not properly sanitized, which made XSS attacks possible. To improve Coppermine's security thoroughly, the dev team decided to sanitize all superglobals using the tool "Inspekt" (which is released under the BSD license and can therefore be included in the Coppermine core code without licensing issues).

The idea to use Inspekt was brought up by Coppermine lead developer Dr. Tarique Sani in his blog article "Inspekt - put a firewall in your PHP applications", where he discusses the reasons to use Inspekt in detail.

What Inspekt does

Inspekt acts as a sort of 'firewall' API between user input and the rest of the application. It takes PHP superglobal arrays, encapsulates their data in a "cage" object, and destroys the original superglobal. Data can then be retrieved from the cage using a variety of accessor methods that apply filtering, or it can be checked against validation methods. Raw data can only be accessed via a getRaw() method, forcing the developer to show clear intent. Read more about usage on the Inspekt Wiki: Basic Usage.
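
The pattern can be illustrated with a minimal sketch. The class MiniCage and its methods are hypothetical and are not Inspekt's actual implementation (for instance, returning 0 for a missing key is a choice made only for this sketch). The code merely shows the idea of copying input into an object, emptying the original array, and exposing filtered accessors next to an explicit raw accessor:

```php
<?php
// Hypothetical illustration only: NOT Inspekt's real code.
final class MiniCage
{
    /** @var array Private copy of the input data. */
    private $source;

    private function __construct(array $source)
    {
        $this->source = $source;
    }

    // Copies the input into a cage, then empties the original array,
    // which is passed by reference.
    public static function capture(array &$input): MiniCage
    {
        $cage  = new self($input);
        $input = array();
        return $cage;
    }

    public function keyExists(string $key): bool
    {
        return array_key_exists($key, $this->source);
    }

    // Filtered accessor: always returns an integer (0 if the key is missing).
    public function getInt(string $key): int
    {
        return $this->keyExists($key) ? (int) $this->source[$key] : 0;
    }

    // Unfiltered accessor: the explicit name makes the intent visible in review.
    public function getRaw(string $key)
    {
        return $this->keyExists($key) ? $this->source[$key] : null;
    }
}

// Simulated request data; it stands in for $_GET in this sketch.
$input = array('album' => '42<script>', 'title' => '<b>Holiday</b>');
$cage  = MiniCage::capture($input);

var_dump($input === array());      // true: the original array is now empty
var_dump($cage->getInt('album'));  // int(42): the markup is dropped by the cast
var_dump($cage->getRaw('title'));  // the unfiltered string, fetched on purpose
```

Because the original array is empty after capture, any code that still reads it directly fails visibly instead of silently processing unfiltered input.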

Inspekt accessor methods

One thing which is missing from the Inspekt Wiki page is the list of accessor methods. In brief they are:

Avoid the getRaw() method as far as possible. If you do use it, comment thoroughly on why it is safe in the given circumstances (e.g. the same value was tested against a regex before fetching, or the value is sanitized immediately after fetching). If there is a case where getRaw() cannot be avoided but is still unsafe, please comment on possible solutions. Once again: the final aim is to NOT use getRaw() at all.

How to use Inspekt with Coppermine Photo Gallery

Inspekt provides tools for filtering scalar or array data in two ways:

The guiding principle for Inspekt is to make it easier to create secure PHP applications. As such, ease of use is valued over flexibility, and verbosity is avoided when possible.

Using Inspekt

// Example: creating a cage for $_POST
require_once "Inspekt.php";

$cage_POST = Inspekt::makePostCage();

$userid = $cage_POST->getInt('userid');

if ( !isset($_POST['userid']) ) {
    echo 'Cannot access input via $_POST -- use the cage object';
}

The example above shows that once a cage has been made for $_POST, the superglobal itself is no longer accessible directly. You must use the methods provided by Inspekt to get the data in the correct data type and format.

Inspekt in Coppermine

In CPG, Inspekt is included at the very beginning of the file init.inc.php, and a supercage is created immediately after its inclusion.

set_include_path(get_include_path().PATH_SEPARATOR.dirname(__FILE__).PATH_SEPARATOR.'Inspekt');
echo dirname(__FILE__);
require_once "Inspekt.php";

$superCage = Inspekt::makeSuperCage();

The supercage is an aggregation of all the cages, i.e. EGPCS (Environment, GET, POST, Cookie, Server: the order of variable parsing). Once the supercage is created, none of the EGPCS variables are available directly.

To access a variable within a supercage we have to use the following format:

$qs = $superCage->server->getDigits('QUERY_STRING');
$album_id = $superCage->get->getInt('album');
$title = $superCage->post->getAlpha('title');

To get an instance of $superCage inside a function, call $superCage = Inspekt::makeSuperCage(); again. Do not use the global directive. Note that makeSuperCage() follows the singleton pattern, so calling it multiple times adds no overhead and you are assured of getting the very same object every time.

In other words, you don't have to worry about defining $superCage more than once: it doesn't hurt to call makeSuperCage() many times over. Don't bloat the code by assigning other names:

Bad example:
$yetAnotherCage = Inspekt::makeSuperCage();

Good example:
$superCage = Inspekt::makeSuperCage();
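
The singleton behaviour can be sketched as follows. The class DemoSuperCage and the function cageSeenInsideFunction are hypothetical stand-ins, not Inspekt's code. They only demonstrate why repeated calls to such a factory method return the very same object and why the global directive is unnecessary:

```php
<?php
// Hypothetical stand-in for the supercage: NOT Inspekt's real code.
final class DemoSuperCage
{
    /** @var DemoSuperCage|null The single shared instance. */
    private static $instance = null;

    /** @var int Counts how often the constructor actually ran. */
    public static $constructed = 0;

    private function __construct()
    {
        self::$constructed++;
    }

    // Builds the object on the first call and returns the cached one afterwards.
    public static function makeSuperCage(): DemoSuperCage
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }
}

// Inside a function, fetch the instance again instead of using "global".
function cageSeenInsideFunction(): DemoSuperCage
{
    $superCage = DemoSuperCage::makeSuperCage();
    return $superCage;
}

$superCage = DemoSuperCage::makeSuperCage();

var_dump($superCage === cageSeenInsideFunction()); // true: the very same object
var_dump(DemoSuperCage::$constructed);             // int(1): built only once
```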

Every dev is encouraged to download the latest tarball of Inspekt and check out the API documentation for the list of available methods for accessing data from cages. In addition, there are a number of test functions that check the value of a given key against a pre-determined data type or format.

Consider the methods to use

Before using a particular method, decide what kind of data you expect to retrieve. If you expect just an integer (e.g. "0" or "1" used in a parameter to toggle an option on or off), use getInt. If you expect an alphanumeric string (no spaces, no special chars, no non-latin characters), such as the parameter of a pre-determined action (e.g. "delete", "update" or "add"), use getAlnum or getAlpha. If the parameter may contain something else (e.g. a filename that usually contains latin characters but may also contain utf-8 encoded non-latin characters like umlauts), use getMatched and perform a match against a regular expression. You may have to sanitize the value further after fetching it.
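
The decision process above can be sketched in plain PHP. The helper functions filterInt, filterAlnum and filterMatched are hypothetical and only approximate the filtering that the text attributes to getInt, getAlnum and getMatched. They are not Inspekt's implementation, and the sample values are made up:

```php
<?php
// Hypothetical helpers that approximate the described filters in plain PHP.
// They are NOT Inspekt's implementation.

// Toggle-style parameter: cast to integer.
function filterInt($value): int
{
    return (int) $value;
}

// Pre-determined action name: strip everything except ASCII letters and digits.
function filterAlnum($value): string
{
    return preg_replace('/[^A-Za-z0-9]/', '', (string) $value);
}

// Anything else: accept the value only if the pattern matches.
// Returns the array of matches on success, false otherwise.
function filterMatched($value, string $pattern)
{
    return preg_match($pattern, (string) $value, $matches) === 1 ? $matches : false;
}

$jpgName  = '/^[A-Za-z0-9_\-]+\.jpg$/';

$toggle   = filterInt('1');                               // 1
$action   = filterAlnum('delete;DROP');                   // 'deleteDROP'
$filename = filterMatched('holiday_2008.jpg', $jpgName);  // array, [0] holds the name
$rejected = filterMatched('../etc/passwd', $jpgName);     // false

// A stripped value can still be unexpected, so compare it against a whitelist.
$allowedActions = array('delete', 'update', 'add');
$isKnownAction  = in_array($action, $allowedActions, true);
```

Note that stripping characters does not make a value meaningful: 'deleteDROP' passes the alphanumeric filter but is not a known action, which is why the whitelist comparison at the end is still needed.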

Examples

Here are some real-world examples of how code used to look initially (before Inspekt was introduced) and what was changed to make it work with Inspekt.

File: albmgr.php

Before implementing Inspekt:
$CLEAN['cat'] = isset($_GET['cat']) ? (int)($_GET['cat']) : 0;
[...]
$cat = $CLEAN['cat'];

After implementing Inspekt:
[...]
if ($superCage->get->keyExists('cat')) {
    $cat = $superCage->get->getInt('cat');
} else {
    $cat = 0;
}

File: calendar.php

Before implementing Inspekt:
if ($_REQUEST['action'] == 'browsebydate') {
[...]
$month = (int) $_REQUEST['month'];
$year  = (int) $_REQUEST['year'];

After implementing Inspekt:
if ($matches = $superCage->get->getMatched('action', '/^[a-z]+$/')) {
    $action = $matches[0];
} elseif ($matches = $superCage->post->getMatched('action', '/^[a-z]+$/')) {
    $action = $matches[0];
} else {
    $action = '';
}
if ($action == 'browsebydate') {
[...]
if ($superCage->get->testInt('month')) {
    $month = $superCage->get->getInt('month');
} elseif ($superCage->post->testInt('month')) {
    $month = $superCage->post->getInt('month');
} else {
    $month = 0;
}

if ($superCage->get->testInt('year')) {
    $year = $superCage->get->getInt('year');
} elseif ($superCage->post->testInt('year')) {
    $year = $superCage->post->getInt('year');
} else {
    $year = 0;
}

File: getlang.php

Before implementing Inspekt:
if (isset($_GET['get'])) {
    $file_index = (int)$_GET['get'];
[...]
<img src="images/folder.gif" alt="">&nbsp;<a href="{$_SERVER['PHP_SELF']}?get=$index">$file</a>

After implementing Inspekt:
if ($superCage->get->keyExists('get')) {
    $file_index = $superCage->get->getInt('get');
[...]
<img src="images/folder.gif" alt="">&nbsp;<a href="{$CPG_PHP_SELF}?get=$index">$file</a>

Regular Expressions (regex)

The method getMatched makes use of regular expressions that can be tricky for beginners. It would be beyond the scope of this documentation to explain how regular expressions work in detail - we just added some "typical" regular expressions that you can use in your own code:

Regex: /^[0-9a-z]+$/
Description: Matches lower-case alphanumeric strings.

Regex: /^[0-9A-Z]+$/
Description: Matches upper-case alphanumeric strings.

Regex: /^[0-9A-Za-z]+$/
Description: Matches alphanumeric strings in either case.

Regex: /^[0-9A-Za-z]{3,6}$/
Description: Matches alphanumeric strings in either case with a minimum of 3 characters and a maximum of 6.

Regex: /^[a-z_]+$/
Description: Matches lower-case letters and the underscore, e.g. strings like 'foo' or 'foo_bar'.

Regex: /^[0-9A-Za-z_]+$/
Description: Matches alphanumeric strings (numbers and latin characters) in lower as well as upper case, plus the underscore (_).
Example: 'foobar', 'fooBar', 'foo_bar', 'foobar_', '2foo3bar' will match (true). 'fübar', 'foo-bar', 'foo bar' will not match (false).

Regex: /^[a-zA-Z0-9_\-]*$/
Description: Matches alphanumeric strings (numbers and latin characters) in lower as well as upper case, plus the underscore (_) and the dash (-).
Example: 'foobar', 'fooBar', 'foo_bar', 'foobar_', '2foo3bar', 'foo-bar' will match (true).

Regex: /^[a-z]*$/
Description: Matches only lower-case latin characters.
Example: 'foobar' will match (true). 'fooBar', 'foo5bar', 'foo-bar', 'foo bar' will not match (false).

Regex: /^[a-zA-Z0-9]*$/
Description: Matches alphanumeric strings (numbers and latin characters) in lower as well as upper case. However, you could use the method getAlnum just as well; it makes the code more readable and easier for others to understand.
Example: 'foobar', 'fooBar', '2foo3bar' will match (true). 'fübar', 'foo_bar', 'foo-bar', 'foo bar', 'foobar_' will not match (false).

Regex: /^\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b$/
Description: Matches IP addresses (v4).
Example: '1.2.3.4', '192.168.0.1' will match (true). '1.2.3', '1.2.3.4.5', '192.168.0.10/2', 'coppermine-gallery.net', '192.168.0.300' will not match (false).

Regex: /^([a-zA-Z0-9]((\.|\-|\_){0,1}[a-zA-Z0-9]){0,})@([a-zA-Z]((\.|\-){0,1}[a-zA-Z0-9]){0,})\.([a-zA-Z]{2,4})$/
Description: Matches valid email addresses. However, this regex checks neither whether the domain exists nor whether the TLD is valid.
Example: 'john.doe@example.com', 'john@example.com', 'john-doe@some.example.com' will match (true). 'john.doe', 'john.doe@', '@example.com', 'jürgen.doe@example.com', 'john=doe@example.com' will not match (false).

Regex: /^[+-]?([0-9]{1,2})*\.?[0-9]+$/
Description: Intended for dimensions (for HTML/CSS attributes) in pixels or percent. As written, the pattern matches an optionally signed number with an optional decimal point (e.g. '12', '+100', '-3.5'). It does not limit the value to 0 to 999 and does not accept a trailing percent sign, so a value like '50%' will not match.

Regex: /^#(?:(?:[a-f\d]{3}){1,2})$/i
Description: Matches valid RGB color codes from #000000 to #FFFFFF, including the three-digit short form (e.g. #FFF).
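
Before relying on a pattern, it can help to check it against known good and bad values with PHP's preg_match. The following sketch does this for a few of the patterns above; the variable names and the selection of sample strings are chosen for illustration only:

```php
<?php
// Each pattern maps sample strings to the result we expect from preg_match.
$checks = array(
    '/^[0-9A-Za-z_]+$/' => array(
        'foo_bar' => true, '2foo3bar' => true,
        'fübar' => false, 'foo-bar' => false, 'foo bar' => false,
    ),
    '/^[a-zA-Z0-9_\-]*$/' => array(
        'foo-bar' => true, 'foo bar' => false,
    ),
    '/^#(?:(?:[a-f\d]{3}){1,2})$/i' => array(
        '#FFFFFF' => true, '#0a0' => true, '#12345' => false, 'FFFFFF' => false,
    ),
    '/^\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b$/' => array(
        '192.168.0.1' => true, '1.2.3' => false, '192.168.0.300' => false,
    ),
);

$failures = 0;
foreach ($checks as $pattern => $samples) {
    foreach ($samples as $subject => $expected) {
        // preg_match returns 1 on a match, 0 on no match.
        $matched = preg_match($pattern, (string) $subject) === 1;
        if ($matched !== $expected) {
            $failures++;
        }
        echo ($matched ? 'match    ' : 'no match ') . $subject . "\n";
    }
}

echo $failures === 0 ? "All samples behaved as expected\n" : "$failures unexpected result(s)\n";
```

A small table of samples like this is cheap to keep next to a custom script and catches typos in a character class, such as an incomplete range, before the pattern reaches production code.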
