cURL vs. getimagesize vs. file_get_contents

getimagesize()

The getimagesize() function determines the dimensions of an image, its file type (e.g. JPG, GIF, PNG), and its MIME type, which can be used as the HTTP Content-Type. Although this function is listed under the GD and Image Functions, it does not require the GD image library.

Usage

This function returns an array of image information, including the MIME type, which helps deliver images with a valid Content-Type header. It returns FALSE if the image information could not be retrieved.

Here is an example of how the getimagesize() function can be implemented:

/**
* Validates whether an image exists and whether its headers are valid.
*
* @param $filename
* A URL to the physical file which needs to be validated.
*/
function is_valid($filename) {
  $size = getimagesize($filename);
  print '<pre>';
  print_r($size);
  print '</pre>';
}

This code will print the following:

Array (
  [0] => 190
  [1] => 107
  [2] => 2
  [3] => width="190" height="107"
  [bits] => 8
  [channels] => 3
  [mime] => image/jpeg
)

If the file is not found, the following PHP warning might be displayed:

"PHP Warning: getimagesize(img.jpg): failed to open stream: No such file or directory in [directory]"

To avoid this message, you could use the "Error Control Operator" to ignore the warning, e.g. $size = @getimagesize($filename);
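
Suppressing the warning does not hide the failure itself, because the function still returns FALSE. A minimal sketch of handling that case, assuming a hypothetical helper named image_dimensions() that is not part of PHP:

```php
<?php
/**
 * Returns the width and height of an image, or NULL if they cannot be read.
 * image_dimensions() is a hypothetical helper name used for illustration.
 */
function image_dimensions($filename) {
  // The @ operator hides the warning; the FALSE return still signals failure.
  $size = @getimagesize($filename);
  if ($size === FALSE) {
    return NULL;
  }
  return array('width' => $size[0], 'height' => $size[1]);
}
```

Callers can then test for NULL instead of relying on a warning in the log.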

As you can see, indexes 0 and 1 hold the image's width and height as integers. Index 2 indicates the type of the image as one of the IMAGETYPE_* constants. Index 3 repeats the width and height as a string that can be placed directly in an IMG tag. Index 'bits' is the number of bits for each color. Index 'channels' will be 3 for RGB pictures and 4 for CMYK pictures. Index 'mime' is the corresponding MIME type of the image.
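
As a rough illustration of these indexes, the sketch below turns the returned array into named values. The helper name describe_image_info() is hypothetical; image_type_to_mime_type() and the IMAGETYPE_JPEG constant are part of PHP.

```php
<?php
/**
 * Converts the array returned by getimagesize() into named values.
 * describe_image_info() is a hypothetical helper name used for illustration.
 */
function describe_image_info(array $size) {
  return array(
    'width' => $size[0],
    'height' => $size[1],
    // Index 2 holds an IMAGETYPE_* constant, e.g. IMAGETYPE_JPEG (2).
    'mime' => image_type_to_mime_type($size[2]),
  );
}

// Using the width, height and type from the example output above.
$info = describe_image_info(array(190, 107, IMAGETYPE_JPEG));
```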

Performance

The getimagesize() function has to fetch image data onto the server before it can report anything about the image. With large files this can be very costly and could potentially bring a server down. Consider, for instance, using this function to check images referenced by public users. If those images are gigabytes in size, and multiple users (or a single user) reference multiple images, the server may use so much memory that it slows to a crawl or even crashes.

There are ways to check basic image information, such as the width and height, without this function. One is to inspect the first few bytes of the file's header or, in the case of JPG files, the "frame segments" that contain information such as the width and height. Keep in mind that different image file formats store this information at different header or frame segment offsets.
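
As one example of reading a header directly, a PNG file starts with an 8-byte signature followed by an IHDR chunk that stores the width and height as 32-bit big-endian integers. The sketch below covers PNG only, and png_dimensions_from_header() is a hypothetical helper name; other formats need their own offsets.

```php
<?php
/**
 * Reads the width and height from the first 24 bytes of a PNG file.
 * png_dimensions_from_header() is a hypothetical helper name.
 */
function png_dimensions_from_header($header) {
  $signature = "\x89PNG\r\n\x1a\n";
  // Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
  // then the width and height as 32-bit big-endian integers.
  if (strlen($header) < 24
      || substr($header, 0, 8) !== $signature
      || substr($header, 12, 4) !== 'IHDR') {
    return NULL;
  }
  return unpack('Nwidth/Nheight', substr($header, 16, 8));
}
```

Only 24 bytes are needed, so the rest of the file never has to be read.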

Security

Most image formats allow sections for comments or other data that most users never see. These sections can be used to smuggle PHP code onto the server. If images are stored exactly as sent by the client, a file saved with a ".php" extension can be executed and perform malicious operations.

Client URL (cURL) Library

The cURL library supports connections and transfers for many types of files, so it is not limited to images. To use it, the cURL extension must be installed on the server. cURL offers transfer options such as "CURLOPT_NOBODY": when set to TRUE, the request method becomes HEAD and the body is excluded from the output.

We can also set the "CURLOPT_FAILONERROR" option to TRUE, so that the request fails if the returned HTTP status code is 400 or greater. See the list of HTTP status codes for why codes of 400 and above indicate a failed request.

Usage

If we take advantage of these options, we can use cURL to check a file's information without storing the file on the server and without writing logic to extract information from the file's header or frame segments.

Here is an example of how the cURL library can be implemented:

/**
* Validates whether an image exists and whether its headers are valid.
*
* @param $filename
* A URL to the physical file which needs to be validated.
* @return TRUE/FALSE
* A boolean value representing the validated state of the file.
*/
function is_valid($filename) {
  // Initialize a cURL session
  $curl_session = curl_init();
  // The URL to fetch. This can also be set when initializing a session with curl_init().
  curl_setopt($curl_session, CURLOPT_URL, $filename);
  // TRUE to exclude the body from the response. Request method is then set to
  // HEAD. Changing this to FALSE does not change it to GET.
  curl_setopt($curl_session, CURLOPT_NOBODY, 1);
  // TRUE to fail silently if the HTTP code returned is greater than or equal to 400.
  // The default behavior is to return the page normally, ignoring the code.
  curl_setopt($curl_session, CURLOPT_FAILONERROR, 1);
  // TRUE to return the transfer as a string of the return value of curl_exec() instead
  // of outputting it out directly.
  curl_setopt($curl_session, CURLOPT_RETURNTRANSFER, 1);

  // Perform a cURL session
  if(curl_exec($curl_session) !== FALSE) {
    return TRUE;
  } else {
    return FALSE;
  }
}
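
After a successful HEAD request, the response headers can also be inspected. The sketch below uses curl_getinfo() with CURLINFO_CONTENT_TYPE to check whether the server reports an image type. Both function names are hypothetical, and the header is only what the server claims, not proof of the file's contents.

```php
<?php
/**
 * Returns TRUE if a Content-Type header value names an image type.
 * Both function names in this sketch are hypothetical.
 */
function is_image_content_type($content_type) {
  return is_string($content_type) && stripos($content_type, 'image/') === 0;
}

/**
 * Issues a HEAD request and checks the Content-Type reported by the server.
 */
function remote_file_is_image($url) {
  $curl_session = curl_init($url);
  curl_setopt($curl_session, CURLOPT_NOBODY, TRUE);
  curl_setopt($curl_session, CURLOPT_FAILONERROR, TRUE);
  curl_setopt($curl_session, CURLOPT_RETURNTRANSFER, TRUE);
  if (curl_exec($curl_session) === FALSE) {
    return FALSE;
  }
  // Not a string when the server sent no valid Content-Type header.
  $content_type = curl_getinfo($curl_session, CURLINFO_CONTENT_TYPE);
  return is_image_content_type($content_type);
}
```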

Performance

cURL performs much better than the getimagesize() function, because it can be told not to request the body of the response and to issue only a HEAD request instead. As explained in the code above, this is done by setting the CURLOPT_NOBODY option to 1.

Security

Of course, cURL is not perfect and does have security issues, but thankfully there is a team dedicated to maintaining the cURL library. An online list of existing and previous security flaws is available.

File Get Contents Function

file_get_contents()

If you are unable to get the cURL library on your server and don't want to deal with the vulnerabilities of getimagesize(), then file_get_contents() may be a good alternative. The file_get_contents() function lets us specify which part of a file we're interested in by giving a starting offset and the number of bytes to read after that offset.

Usage

This function returns the file as a string. It can be used to get the source code of a web page or the contents of a file, given a starting point and an (optional) maximum length.

/**
* Checks if the file exists by requesting only the first byte.
*
* @param $filename
* A URL to the physical file which needs to be validated.
* @return TRUE/FALSE
* A boolean value representing the validated state of the file.
* @see file_get_contents Documentation
*/
function is_valid($filename) {
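  // Arguments: filename, use_include_path, context, offset, length.
  // Compare against FALSE explicitly: a first byte of "0" is a falsy string.
  if (file_get_contents($filename, FALSE, NULL, 0, 1) !== FALSE) {
    return TRUE;
  } else {
    return FALSE;
  }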
}
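
The same partial read can do slightly more than an existence check. The sketch below reads the first 8 bytes and compares them against the well-known JPEG, PNG and GIF signatures. The helper name detect_image_signature() is hypothetical, and a matching signature only suggests the format; it does not validate the whole file.

```php
<?php
/**
 * Guesses the image format from the first bytes of a file.
 * detect_image_signature() is a hypothetical helper name.
 */
function detect_image_signature($filename) {
  // Read at most 8 bytes, starting at offset 0.
  $head = @file_get_contents($filename, FALSE, NULL, 0, 8);
  if ($head === FALSE) {
    return NULL;
  }
  if (substr($head, 0, 3) === "\xFF\xD8\xFF") {
    return 'jpeg';
  }
  if ($head === "\x89PNG\r\n\x1a\n") {
    return 'png';
  }
  if (substr($head, 0, 6) === 'GIF87a' || substr($head, 0, 6) === 'GIF89a') {
    return 'gif';
  }
  return NULL;
}
```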

Performance

Compared to the getimagesize() function, this function is a lot faster, since it can limit the maximum length of data to be fetched. Instead of downloading an entire file, we retrieve only the data that is relevant to the application.

Security

It is more secure than the getimagesize() function: even if a file contains hidden malicious code, specifying an offset and a maximum length to fetch means that code may be cut off or never read at all.

References:

  • Client URL Library
  • File Get Contents
  • Get Image Size
  • Error Control Operator
  • HTTP Status Code
