How to Perform Static Code Analysis in PHP
- Use the lint Mode to Perform Static Code Analysis in PHP
- Use the PHPMD or PHP Depend Project to Perform Static Code Analysis in PHP
- Use the pfff Tool to Perform Static Code Analysis in PHP
- Use HHVM to Perform Static Code Analysis in PHP
A vital part of development is identifying errors and eliminating them from your codebase quickly, and static code analysis is one way to achieve this in PHP. This tutorial shows how lint mode and a few other tools perform static code analysis in PHP.
Static code analysis is an effective way to detect bugs and raise developer productivity, and the type information it relies on also improves auto-completion and refactoring support in typed PHP code. Because it inspects source code before execution, it can catch syntax errors, enforce PHP coding standards and styles, and flag security vulnerabilities.
The lint mode is the simplest way to perform static code analysis in PHP, and this tutorial also covers PHPMD, pfff, and HHVM so that you can adopt the approach that suits your needs best. Static analysis leans heavily on PHP's type system: the more information you provide, the better the results, and declaring types in your code is one way to add that information.
For example, a function function exp_funct ($args) {} that gets a list of featured posts can be declared as function exp_funct (array $args) : array {} to provide further information for static code analysis. Alternatively, you can add a PHPDoc comment such as /** @return array<exp_var> */ before the function declaration to describe its input and output types.
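As a minimal sketch of the two approaches above, the following declares exp_funct with both native types and a PHPDoc block. The Post class, its properties, and the sample data are hypothetical and exist only for illustration; the constructor syntax assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical value object, used only for this illustration.
final class Post
{
    public function __construct(public string $title, public bool $featured) {}
}

/**
 * Returns only the featured posts.
 *
 * The native "array" type says nothing about the element type,
 * so the PHPDoc block adds that detail for static analyzers.
 *
 * @param array<Post> $args
 * @return array<Post>
 */
function exp_funct(array $args): array
{
    return array_values(array_filter($args, fn (Post $post): bool => $post->featured));
}

$featured = exp_funct([new Post('Intro', true), new Post('Draft', false)]);
echo count($featured), PHP_EOL;
```

With both declarations in place, an analyzer can check every caller of exp_funct against the array parameter and knows that each returned element is a Post.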
Use the lint Mode to Perform Static Code Analysis in PHP
The PHP lint mode checks a file for syntax errors without executing it. Run php -l FILENAME from a shell or any other command line to validate the syntax of a file. Lint mode only parses the code; finding unused variable assignments, arrays used without initialization, or code style problems requires the more advanced analyzers described below.
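The lint check can also be scripted. The sketch below assumes it runs under the PHP CLI (so that PHP_BINARY points at a php executable) and that exec is not disabled; lint_file is a hypothetical helper name, not part of PHP.

```php
<?php
// Sketch: lint a file by invoking the PHP CLI in lint mode.
// lint_file() is a hypothetical helper, not a built-in function.
function lint_file(string $path): bool
{
    $command = escapeshellarg(PHP_BINARY) . ' -l ' . escapeshellarg($path) . ' 2>&1';
    exec($command, $output, $exitCode);

    // php -l exits with a non-zero status when it finds a syntax error.
    return $exitCode === 0;
}

// Two throwaway files: one valid, one with an unclosed parenthesis.
$good = tempnam(sys_get_temp_dir(), 'lint');
$bad  = tempnam(sys_get_temp_dir(), 'lint');
file_put_contents($good, "<?php echo 'ok';\n");
file_put_contents($bad, "<?php if (true {\n");

$goodIsValid = lint_file($good);
$badIsValid  = lint_file($bad);
var_dump($goodIsValid, $badIsValid);

unlink($good);
unlink($bad);
```

A helper like this could run over every changed file in a pre-commit hook, although calling php -l directly from the shell is usually enough.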
Many static analyzers go further than php -l. For example, php-sat, PHPStan, PHP-CS-Fixer, and phan are higher-level tools; on the other hand, PHP Parser and the built-in token_get_all function are lower-level building blocks for writing your own analysis.
The token_get_all function has the signature token_get_all(string $code, int $flags = 0): array and parses the given code string into PHP language tokens using the Zend engine's lexical scanner. The optional TOKEN_PARSE flag makes the scanner recognize reserved words that are used as identifiers in specific contexts, such as a class constant named PUBLIC.
Each token returned by this function is either a single character or a three-element array containing the token index, the string content of the original token, and the line number in elements 0, 1, and 2, respectively. Below are two examples of token_get_all(): one shows general use, and the other tokenizes a class that uses a reserved word as a constant name.
<?php
// 1st example: general use
$userQuota_getToken = token_get_all('<?php echo; ?>');
foreach ($userQuota_getToken as $get_tokenQ) {
    // single-character tokens are plain strings, so only array tokens are printed
    if (is_array($get_tokenQ)) {
        echo "Line {$get_tokenQ[2]}: ", token_name($get_tokenQ[0]), " ('{$get_tokenQ[1]}')", PHP_EOL;
    }
}

// 2nd example on class (commented out so that the output shown below belongs to the 1st example)
/*
$token_quota_source = <<<'code'
<?php

class A
{
    const PUBLIC = 1;
}
code;
$userQuota_getToken = token_get_all($token_quota_source, TOKEN_PARSE);
foreach ($userQuota_getToken as $get_tokenQ) {
    if (is_array($get_tokenQ)) {
        echo token_name($get_tokenQ[0]), PHP_EOL;
    }
}
*/
// its output will be something similar to
/*
T_OPEN_TAG
T_WHITESPACE
T_CLASS
T_WHITESPACE
T_STRING
.
.
*/
?>
Output:
Line 1: T_OPEN_TAG ('<?php ')
Line 1: T_ECHO ('echo')
Line 1: T_WHITESPACE (' ')
Line 1: T_CLOSE_TAG ('?>')
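To show how a lower-level analysis can be built on token_get_all, here is a deliberately naive sketch that reports variables occurring exactly once in a source string. The function name variables_used_once and the sample source are hypothetical, and counting occurrences is only a rough heuristic for "assigned but never read", not a real data-flow analysis.

```php
<?php
// Sketch: list variables that appear exactly once in the given code.
// variables_used_once() is a hypothetical helper for illustration only.
function variables_used_once(string $code): array
{
    $counts = [];
    foreach (token_get_all($code) as $token) {
        // Array tokens carry [id, text, line]; T_VARIABLE marks names like $total.
        if (is_array($token) && $token[0] === T_VARIABLE) {
            $counts[$token[1]] = ($counts[$token[1]] ?? 0) + 1;
        }
    }

    // Keep only the names that were counted once.
    return array_keys(array_filter($counts, fn (int $n): bool => $n === 1));
}

$source = <<<'code'
<?php
$total = 1 + 2;
$unused = 'never read';
echo $total;
code;

print_r(variables_used_once($source));
```

Here $total appears twice and $unused once, so only $unused is reported. Dedicated analyzers track scopes and control flow instead of simple counts, which is why they produce far fewer false positives.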
Additionally, runtime analyzers complement static analysis and are more useful for some tasks because of PHP's dynamic nature. Xdebug is a runtime analyzer that provides code coverage and function traces.
The phpweaver tool builds on Xdebug function traces and uses a combined static/dynamic approach to analyze code. If you need something light enough for production servers, xhprof is similar to Xdebug but lighter, and it includes a PHP-based interface.
Use the PHPMD or PHP Depend Project to Perform Static Code Analysis in PHP
PHPMD stands for PHP Mess Detector. It is a spin-off project of PHP Depend and aims to be the PHP equivalent of the well-known Java tool PMD. You can install PHP_Depend with Composer: download Composer with curl -s http://getcomposer.org/installer | php and then run php composer.phar require pdepend/pdepend:2.12.0, or run composer require pdepend/pdepend:2.12.0 if Composer is installed globally.
PHPMD is usually preferable to using PHP_Depend directly, as it is a user-friendly and easy-to-configure front-end for the raw metrics measured by PHP_Depend. Its working principle is straightforward: it takes a given PHP source code base and looks for potential problems within that source.
It can detect possible bugs, suboptimal code, overcomplicated expressions, and unused parameters, methods, and properties. As a mature project, PHP Mess Detector offers a large library of pre-defined rules for analyzing PHP source code.
# Usage: phpmd [filename|directory] [report format] [ruleset file]
hassan@kazmi ~ $ phpmd PHP/Depend/DbusUI/ xml rulesets/codesize.xml
<?xml version="1.0" encoding="UTF-8" ?>
<pmd version="0.0.1" timestamp="2009-12-19T22:17:18+01:00">
<file name="/projects/pdepend/PHP/Depend/DbusUI/PHPMD.php">
<violation beginline="54"
endline="359"
rule="TooManyProperties"
ruleset="Code Size Rules"
package="PHP_Depend\DbusUI"
class="PHP_Depend_DbusUI_ResultPrinter"
priority="1">
This class has too many properties; consider refactoring it.
</violation>
</file>
</pmd>
Output:
This class has too many properties; consider refactoring it.
The command line usage of PHP Mess Detector is phpmd [filename|directory] [report format] [ruleset file], where the first argument is the file or directory that contains the PHP source code to analyze. The rulesets/codesize.xml argument looks like a filesystem reference, but the Phar distribution of PHPMD includes the rule set files inside its archive.
Furthermore, PHPMD lets you refer to built-in rule sets by their short names, as in phpmd PHP/Depend/DbusUI/ xml codesize. The command line interface also accepts optional arguments such as --minimumpriority, --reportfile, --suffixes, --strict, and many more.
You can analyze a code base with a single built-in rule set by name, as in ~ $ phpmd /path/to/source text codesize, and apply several rule sets in one run by separating their names with commas. You can also mix custom rule set files with built-in rule sets; for example, ~ $ phpmd /path/to/source text codesize,/my/rules.xml applies the built-in codesize rules together with the custom rules in /my/rules.xml.
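To make the rule sets more concrete, the hypothetical class below contains the kind of code that PHPMD's built-in unusedcode rule set is designed to report: an unused private field, an unused parameter, and an unused local variable. Every name in it is invented for illustration, and the exact messages depend on the PHPMD version and the rule set configuration.

```php
<?php
// Hypothetical class; all names are invented for this illustration.
class ReportBuilder
{
    // Declared but never read or written anywhere in the class.
    private $cache = [];

    // The $format parameter is accepted but never used.
    public function build(array $rows, $format)
    {
        // Assigned but never read afterwards.
        $header = 'Report';

        return count($rows) . ' rows';
    }
}

echo (new ReportBuilder())->build([1, 2, 3], 'text'), PHP_EOL;
```

The class is valid PHP and passes php -l, which is why a rule-based tool is needed to point out the dead code. Saved as ReportBuilder.php, it could be checked with phpmd ReportBuilder.php text unusedcode.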
Use the pfff Tool to Perform Static Code Analysis in PHP
pfff is a set of APIs and tools for static code analysis: it can index, search, navigate, visualize, and refactor source code, and it supports style-preserving source-to-source transformation of PHP code.
It is easy to compile and install pfff; its checker reports findings in a file:line:column format, such as go-automatic.php:14:77: CHECK: Use of undeclared variable $goUrl or login-now.php:7:4: CHECK: Unused Local variable $title. The pfff source is available on GitHub, and once it is built, you can run the checker on a code base with a command like $ ~/sw/pfff/scheck ~/code/github/sc/.
Furthermore, you can embed the parsing library in your own OCaml application by copying the commons/ and parsing_php/ directories into your project directory, adding a recursive make, and linking the application with the parsing_php/parsing_php.cma and commons/commons.cma libraries. See pfff/demos/Makefile for an example. Once the source is compiled, you can test pfff with the following:
$ cd demos/
$ ocamlc -I ../commons/ -I ../parsing_php/ \
../commons/commons.cma ../parsing_php/parsing_php.cma \
show_function_calls1.ml -o show_function_calls
$ ./show_function_calls foo.php
Afterward, you should see on stdout some information about the function calls in foo.php, produced by the code in show_function_calls1.ml from the pfff project in the Facebook archives. The pfff parser is quite robust, and you can test it on the phpBB source code.
# test the pfff parser on the phpBB source code
$ cd /tmp
$ wget http://d10xg45o6p6dbl.cloudfront.net/projects/p/phpbb/phpBB-3.0.6.tar.bz2
$ tar xvfj phpBB-3.0.6.tar.bz2
$ cd <pfff_src_directory>
$ ./pfff -parse_php /tmp/phpBB3/
The pfff program should then iterate over all the .php source files, run the parser on each one, and output some statistics such as NB total files = 265; perfect = 265; =========> 100% and nb good = 183197, nb bad = 0 =========> 100.000000%, which means pfff was able to parse 100% of the PHP source code.
pfff ships several command line programs. The pfff command tests the language parsers, scheck finds bugs in the manner of lint, and stags is an Emacs tag generator that aims to be more precise than generic tag generators.
sgrep is a syntactical grep that makes it easy to find precise code patterns, and spatch is a syntactical patch that makes it easy to refactor PHP code. Other programs in the pfff tool set, such as codemap, pfff_db, codegraph, and codequery, perform global analysis on a set of source files or query information about the structure of your PHP codebase.
Use HHVM to Perform Static Code Analysis in PHP
HHVM has built-in support for the Proxygen and FastCGI server types. It is a fully functional web server with Proxygen built directly into it, and its ease of use in processing source code is the reason it is recommended here for static code analysis.
It serves web requests quickly and provides a high-performance web server equivalent to what FastCGI and nginx offer combined. Run hhvm -m server -p 8080 to use Proxygen when running HHVM in server mode. You can also set the port by passing -d hhvm.server.port=7777 on the command line or by putting hhvm.server.port=7777 in your server.ini file.
You can pass -d hhvm.server.type=proxygen to state the Proxygen server type explicitly, although this is not required because Proxygen is the default. The init scripts shipped with the HHVM packages start in FastCGI mode by default and require a configuration tweak before they automatically start HHVM as a Proxygen server.
The following is an example HHVM configuration with several customizable options, which can go in server.ini or be passed as -d options at the command line. Some of these settings are not strictly necessary because they match the default values, but they are shown for illustration.
; set the server port
hhvm.server.port = 60
; the default server type is `proxygen`
hhvm.server.type = proxygen
hhvm.server.default_document = source.php
hhvm.server.error_document404 = source.php
hhvm.server.source_root = /edit/source/php
Stating options that already match their defaults can be useful for documentation purposes, and hhvm.server.source_root and hhvm.server.port are the ones that most likely need explicit values. HHVM (the HipHop Virtual Machine) is an open-source virtual machine designed to execute programs written in Hack, and it uses JIT (just-in-time) compilation to achieve high performance while maintaining development flexibility.
By default, hhvm.server.source_root is the directory in which the HHVM binary is launched, and you can change it to suit your server. After installing HHVM on your OS, you can use the sudo update-rc.d hhvm defaults and sudo service hhvm restart commands to set HHVM to start at boot as a server.
Hassan is a Software Engineer with a well-developed set of programming skills. He uses his knowledge and writing capabilities to produce interesting-to-read technical articles.