RarestBlog

PHP the Nice Way (Ever-evolving Huge Retailer Website Story)

(This is a post for all the PHP programmers out there who still think that function hello() { print "<b>Hello</b>"; } is darn good code.)

My story

A few years ago I was a really bad PHP programmer. (Right now I’m probably mediocre, but not bad :) )

I’ve been programming for 20+ years, but I’ve been a bad programmer. Actually, everybody around me was saying that I was the fastest programmer they knew and that my websites were very solid and done in no time. Yes, granted, when you create 100+ commercial websites, you learn a thing or two.

I was learning on my own, so I studied PHP from resources such as php.net. It’s a great resource when you consider the abundance of information, but it’s terrible when you look at the quality of the advice.

I’ll explain how things evolved in my head, in the hope that readers can pick up from wherever they are now and understand a few things better.

By no means is this a “definitive” guide to PHP. Just a few tricks that might help you save your precious time and nerves.

A big gig. Too big for my brain

I’m currently a software development consultant for a huge retail chain: tens of thousands of employees, hundreds of stores around Europe.

I started developing that website on my own a few years ago. I had my nice (or so I thought) framework, which was full of stuff like:

function show_basket() {
  if(basket_not_valid($GLOBALS['basket'])) {
    $GLOBALS['error'] = "<b>Basket not valid: ".htmlspecialchars($GLOBALS['basket']['errors'])."</b>";
  }
  render('basket.php');
}

Yeah, terrible :)

Doesn’t look like it? Read on. It’s for those who think this is not terrible that I write this post.

At that time I had no idea what was so bad about this code. Why? Because I started on PHP4, when OOP (object-oriented programming, just in case) was so bad there that you should never, ever have touched it. And there was no general acceptance of MVC. OK, yes, times were dark in all the ways. :)

(Just to note: by that time I had written a lot of code in C++ using a lot of OOP ideas. But it never occurred to me how to apply those to PHP. Nobody ever explained to me why or how I should use OOP in PHP. Even php.net seemed to be against it, with all the commenters writing function after function.)

Still, my code was secure, it was running. I mean, what else should you ask for? :) I’ve made hundreds of websites that used the same codebase and none of them were broken or hacked. Some worked for 5+ years without a touch.

But this time things were going to be different.

The story

Basically, I had a huge index.php file that accumulated functions like n_redirect() or get_stock_left_cached($shopId).

It seemed okay, considering the start was with 30 products and 5 categories. Nice and easy. Functions upon functions.

function n_redirect($location)
{
    @header('Location: ' . $location);
}

Hell yeah! Things were going great! Until things got way too big to manage.

PHP philosophy gets in the way

If I had accumulated any PHP philosophy by that time, it was “let’s do it the simplest way possible”.

Do you need an HTML snippet? Write it and print it! Better yet, write PHP between HTML - that’s the simplest way!

It was. Up to a certain amount of work. Then work started to pile up. New requirements, new rules for how things should work.

More products. More subcategories. Nesting. More complicated ordering. Deliveries. Analytics. Hidden stuff. Admin stuff. Reports.

Suddenly, I realized that having a huge pile of functions was not really good.

Seriously, function get_stock_left_cached() followed by build_sales_report() followed by item_price() makes no sense whatsoever!

index.php that’s 5000 lines long is stupid! I couldn’t navigate it anymore.

Oh, yeah, at that time I was still using Geany (a Linux-based editor that’s basically little more than an advanced notepad with syntax highlighting and indentation. There are some cool features, but by no means is it an IDE).

Nobody was there to prevent what was happening.

Started hiring: PHP programmers, stuck in PHP philosophy

I was assigned to hire more PHP guys. It was hard. I had to explain to each of them that the codebase was messy, but I had no idea what to do about it. I had test assignments, I had questions about enterprise software patterns. I selected people thoroughly, looking at their code. 99% of people did not pass the secure code requirement. There were SQL injections, there were XSS vulnerabilities.

I hired two of the most talented guys I could find. One was an OOP guru. The other knew a lot and had a lot of relevant experience.

First one, of course, was complaining that the code was messy (good!). But at the same time he sat there for weeks without any progress (bad!).

All the while the non-OOP guy and I were churning out feature after feature, he was doing something. I had no idea what. He said he was refactoring.

Turned out he was attempting to write The Greatest ORM Ever. That didn’t happen.

The guy left the company after he left a serious SQL injection in his “refactored” search (the most basic thing) among some other bad things.

In contrast, the other guy was giving me a lot of help. Really understanding, really useful. He didn’t see any problems with the code.

Neither did the next PHP guy.

Nobody saw a problem with functions like urls_txt_split or variables like $t, not even two highly experienced (10+ years of PHP) programmers.

NOBODY!

There was no problem with my code? Are you kidding me?

Geez!

Static methods to the (temporary) rescue

Finally, I understood that something needed to be done. Something at least needed to group stuff together, so that I wouldn’t have to navigate a 5000-line file.

My boss (who understands a lot of business and a lot of C#) advised me to try OOP. But, I mean in C# you write this:

public class MyClass
{
    public int Age { get; set; } 
}

The same thing in PHP would be:

class MyClass {
    private $age;

    public function getAge()
    {
        return $this->age;
    }

    public function setAge($age)
    {
        $this->age = $age;
    }
}

It’s really not fun to write those!

Edit: One of my readers suggested using the __get and __set magic methods. I’ll describe here why it’s not such a good idea, but tell a story where they are useful.

And I didn’t even write this snippet of code myself just now - PHPStorm did it for me! Too lazy :)

Back then I was using a simple editor, so I had no idea why I would have to write this much code to do such a simple thing.

The only reasonable way that I found to deal with this was static methods.

function item_price($item, $shop) { ... }

becomes

class Item {
    static function price($item, $shop) { ... }
}

At least now things are starting to be grouped together.

If you want to get better - at least start by getting an autoloader (see below) and grouping things into static methods.

Simplest autoloading and static calls

It’s easy. What was item_price($item, $shop) in index.php becomes Item::price($item, $shop) in classes/Item.php, plus a simple autoloader:

function autoload($className)
{
    // PSR-0 style: namespaces and underscores map to directories under classes/
    $className = ltrim($className, '\\');
    $fileName  = '';
    if ($lastNsPos = strrpos($className, '\\')) {
        $namespace = substr($className, 0, $lastNsPos);
        $className = substr($className, $lastNsPos + 1);
        $fileName  = str_replace('\\', DIRECTORY_SEPARATOR, $namespace) . DIRECTORY_SEPARATOR;
    }
    $fileName .= str_replace('_', DIRECTORY_SEPARATOR, $className) . '.php';
    require __DIR__ . '/classes/' . $fileName;
}

spl_autoload_register('autoload');

print Item::price(1, 15);

So, at least now all the stuff that was related to reporting was in classes/Reporting.php, the stuff related to items was in classes/Item.php, etc.

It’s not what you would ultimately want, but it’s a start! Read on.

Auto-escaping everywhere

The second thing that I understood was that even though my colleagues were very careful with their output (like using htmlspecialchars, which we aliased to hsc, and <? p($value) ?> for printing stuff), errors were inevitable.

I needed something that made me specifically ask permission to output raw data, not the other way around.

So, basically I wanted

Title: <?= $title ?>

to output ESCAPED code.

There was only one way (and I think still is) to achieve it. That was Twig with auto-escaping enabled.

Ever since we started using it, I don’t understand how anybody could write templates any other way.

Title: {{ title }}   <-- this is auto-escaped 
HTML: {{ titleHTML | raw }}   <-- this is not escaped and it's obvious

By using this, you are protected from the most common cross-site scripting problems in HTML output, which I’d say is problem #3 for PHP programmers. (#1 unreadable code, #2 SQL injections)

Warning! Make sure auto-escaping is really on. Twig 1.x documents the autoescape option as enabled by default, but setting it explicitly costs nothing and guards against it getting switched off somewhere. You can create a simple function like getRenderer() that would do something like this:

require_once '/path/to/lib/Twig/Autoloader.php';
Twig_Autoloader::register();

function getRenderer() 
{
  $loader = new Twig_Loader_Filesystem('/path/to/templates');
  $twig = new Twig_Environment($loader, array(
      'cache' => '/path/to/compilation_cache',
      'autoescape' => true,
  ));
  return $twig;
}

(The super-modern way would be to use Composer, but right now it’s overkill.)

So then you can do stuff like

function showItem() {
  $item = ....;
  print getRenderer()->render('item.twig', get_defined_vars());
}

where item.twig is like:

<title>{{ item.title }}</title>
<body>Price: {{ item.price }}</body>

Super cool, right? Clean and auto-escaped. Cross XSS (cross-site scripting) off the threat list!

If it’s possible to forget - you are going to forget it

This is what led me to auto-escaping. During a security audit of our application, a cross-site scripting problem turned up.

In my code.

In My CODE! The code that I thought was invincible. And it was. Probably in 99.99% of cases. But it was vulnerable in the rest. Because I forgot to escape.

That’s when I had to enforce the rule that everything is auto-escaped from now on (namely, Twig is used for everything).

Oh, and Twig is really cool!

With blocks and macros you can really abstract things away and start living worry-free.

Almost.. But we’ll get to that.

Variables like $a, $b

Yeah, we’re all guilty of that. Senseless variable names. I totally understand why you do that. You’re using a simple editor and it’s easy to write this:

$i = new Item();
$i2 = new Item();
$i->title = DB::getTitle($i->id);
$is = array($i, $i2);
print_r($is);

and it’s harder to write this

$item = new Item();
$additionalItem = new Item();
$item->title = DB::getTitle($item->id);
$setOfItems = array($item, $additionalItem);
print_r($setOfItems);

You can always say that it’s tedious to write $additionalItem and that you’ll end up with a number of $addtionalItems along the way (typos).

But at the same time you can see that the second one is much more readable than the first one.

But I didn’t type the long variable names without typos!

No I didn’t. PHPStorm did it for me.

I wrote the first code (with $i) and then pressed Shift+F6 on $i2 and typed additionalItem. PHPStorm renamed it correctly everywhere.

Want to extract a piece of code to function or method?

Select the code and press Ctrl+Alt+M in PHPStorm. Block of code disappears under the function name. And still works. (Unless you have get_defined_vars next, then it doesn’t work, the bug is open)

Take this code:

$item->title

select it, press Ctrl+Alt+V, then Enter - it becomes:

$title = $item->title;

Easy!

If you have a variable $additionalItem - it’s enough to type ai for PHPStorm to figure out what to write.

Lost a function somewhere? Press Ctrl+Alt+Shift+N and search for it in less than a blink (seriously, even in huge code bases).

OK, I’ll probably quit advertising and just say: get it (there is a 30-day trial) - it will be beautiful!

It’s way too useful to miss out on.

SQL injections

Next on my list was SQL injections. “If it’s possible to forget - you are going to forget it” again.

At first we were doing

mysql_query("SELECT * FROM users WHERE id='".mysql_real_escape_string($userId)."'") 

Yeah, that was awful, but secure. And easy to forget.

We did forget. Many times. Scrupulous code review showed most of those spots, but not all of them.

It needed to be automated.

I think it was Rails that showed me that querying should somehow be like this:

query('SELECT * FROM users WHERE id=? LIMIT ?', $userId, $limit)

I did not find an easy library that could do that. I looked at many libraries out there: some were boring, and things like Propel and Doctrine shocked me. Do I really have to read all that to do queries?

Nope, too lazy.

I ended up writing my own library, which I open-sourced under the name DBix (though I haven’t updated it in a while), which lets us do things like:

db()->query('SELECT * FROM users WHERE id=?', $id)->fetch_all();

I frankly have no idea whether there is another library that does things this way and is well maintained - but there should be!

That’s what you should be striving for - complete automation of escaping. No more thinking about simple stuff!

Keep thinking on a high level.
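
For what it’s worth, the same calling style can be approximated with PDO prepared statements, which ship with PHP. Below is a minimal sketch, not DBix’s actual API: the query() helper, the users table and the in-memory SQLite connection are all assumptions made up for illustration. Values are sent to the database separately from the SQL text, so there is nothing to remember to escape.

```php
<?php
// Hypothetical helper (not part of DBix): runs a parameterized query through PDO.
// Every argument after $sql is bound to the next "?" placeholder.
function query(PDO $pdo, $sql)
{
    $params = array_slice(func_get_args(), 2);
    $statement = $pdo->prepare($sql);
    $statement->execute($params);
    return $statement;
}

// Assumption: an in-memory SQLite database, used only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// The apostrophe needs no manual escaping: the value never becomes part of the SQL text.
query($pdo, 'INSERT INTO users (id, name) VALUES (?, ?)', 1, "O'Brien");

$rows = query($pdo, 'SELECT * FROM users WHERE name = ?', "O'Brien")->fetchAll(PDO::FETCH_ASSOC);

// A classic injection attempt is treated as a plain (non-matching) string.
$hostile = query($pdo, 'SELECT * FROM users WHERE name = ?', "x' OR '1'='1")->fetchAll(PDO::FETCH_ASSOC);
```

Here $rows holds the one matching user and $hostile comes back empty, because the placeholder value is compared as data instead of being spliced into the statement.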

If it’s broke - test it!

The next problem that we came upon was that each time we sent code to production, we were never sure what would break.

It was okay at first, but as things grew more and more complex, we started to hit the same problems over and over again.

“The ‘Add item’ button doesn’t work again!”

…and there is only one reasonable solution for that - TESTING.

Yeah, many PHP programmers have no idea why you should test something.

Yes, it’s kind of tedious to do too. But in the end - it saves you a lot of time! A LOT!

Testing is actually pretty simple.

If you want it super-simple – try this:

(Let’s start in non-OOP way)

Create test.php:

header('Content-type: text/plain');

$item = get_item_from_database();
$basket = get_basket();
add_item_to_basket($basket, $item);

if(basket_total($basket) != item_price($item)) {
  print "ERROR: Basket doesn't add items correctly.\n";
}

add_item_to_basket($basket, $item);
if(basket_total($basket) != item_price($item) * 2) {
  print "ERROR: Basket doesn't count total correctly.\n";
}

(please don’t write code exactly like this, it’s also awful, read until the end)

That’s it, open test.php and you are testing!

Now you can be sure that whatever you do to basket - if you accidentally add something that breaks basket - you’ll know.

Just refresh test.php and sleep tight even if you pushed the code to production Friday night (which is an awful idea, even with testing :) ).

But, really, you should use PHPUnit or one of the other PHP testing libraries.

Installing PHPUnit is almost simple in Linux:

sudo apt-get install php-pear
sudo pear channel-update pear.php.net
sudo pear upgrade-all
pear config-set auto_discover 1
pear install pear.phpunit.de/PHPUnit

If that doesn’t work - try:

sudo pear channel-discover pear.phpunit.de
sudo pear channel-discover components.ez.no
sudo pear channel-discover pear.symfony.com
sudo pear install -a phpunit/PHPUnit

Now write a simple test, StackTest.php:

class StackTest extends PHPUnit_Framework_TestCase
{
  public function testSimple()
  {
    $item = get_item_from_database();
    $basket = get_basket();
    add_item_to_basket($basket, $item);
    // assertEquals takes the expected value first, then the actual one
    $this->assertEquals(item_price($item), basket_total($basket), "ERROR: Basket doesn't add items correctly");
  }
}

Now run phpunit StackTest and you’re done! You can create whole directories of tests.

What did we finally do for testing?

Something like this:

  1. There is a script that kills the database and inserts known values (and it starts with if(isProduction()) exit; :) ), so for example we know that it creates an Item with id = 15, whose price is 123 roubles and whose title is Item15.

  2. We have tests like this:

    $this->get('/basket');
    $this->post('/basket/add', array('item_id'=>'15'));
    $this->assertHtmlContains('123 RUB', 'Basket did not count total correctly');
    $this->assertHtmlContains('<a href="/item/15">Item15</a>', 'Basket did not output item correctly');

… and all other cases, like if item has zero price, zero stock - there should be error messages, etc…

Yes, you have to create get, post and assertHtmlContains yourself.

Use PHP’s curl :)
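
A rough sketch of what such helpers could look like follows. It uses plain functions instead of methods on a test class, and the names httpRequest and htmlContains, the cookie file path and the localhost URL in the usage comment are all made up for illustration. Only the curl_* calls are PHP’s own.

```php
<?php
// Hypothetical helper: performs a GET or POST with PHP's curl extension and returns the body.
function httpRequest($method, $url, array $fields = array(), $cookieFile = '/tmp/test_cookies.txt')
{
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);    // return the body instead of printing it
    curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);    // follow redirects like a browser would
    curl_setopt($handle, CURLOPT_COOKIEJAR, $cookieFile);  // save cookies, so the session survives
    curl_setopt($handle, CURLOPT_COOKIEFILE, $cookieFile); // send saved cookies with each request
    if ($method === 'POST') {
        curl_setopt($handle, CURLOPT_POST, true);
        curl_setopt($handle, CURLOPT_POSTFIELDS, http_build_query($fields));
    }
    $body = curl_exec($handle);
    if ($body === false) {
        throw new RuntimeException("Request to $url failed: " . curl_error($handle));
    }
    return $body;
}

// Hypothetical helper: plain substring check on the returned HTML.
function htmlContains($html, $needle)
{
    return strpos($html, $needle) !== false;
}

// Prints an error line when the expected fragment is missing; returns whether the check passed.
function assertHtmlContains($html, $needle, $message)
{
    if (!htmlContains($html, $needle)) {
        print "ERROR: $message\n";
        return false;
    }
    return true;
}

// Usage against an assumed local test server:
// $html = httpRequest('POST', 'http://localhost:8000/basket/add', array('item_id' => '15'));
// assertHtmlContains($html, 'Item15', 'Basket did not output item correctly');
```

The cookie file is what lets a POST to /basket/add and a later GET of /basket share one session, which is the whole point of testing a basket this way.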

Should you use Selenium?

Well, we tried. That was exhausting. You can’t really debug that stuff. And it’s slow. VERY slow.

Use curl - that would be enough for most cases, unless you write something that’s mostly JavaScript.

OOP and asking for stuff

The next big idea that I understood was how to use OOP in PHP. Actually I understood it from learning Ruby on Rails (which is awesome!)

So, let’s say you have a store. How do you get the price of an item?

– Oh, that’s easy. There is a table items, which has a link to table prices, so you just write an INNER JOIN….

– Yuck! There should be an easier way.

And there is.

$price = Items::findById(15)->getPriceAtShop(3);

Yeah. You should NOT really care about where stuff is kept. Just ask something to do something!

– So, how to create an order?

– Well, you insert an entry into the orders table, while taking the basket from the user’s $_SESSION variable, from which you go by cookie to the baskets table.

– No! You do something like:

$basket = Basket::getCurrentBasket();

if($basket->completeOrder()) {
  print getRenderer()->render('basket_ok.twig', get_defined_vars());
}

That’s it. The next step is to create the actual methods, which know where the basket is stored, where orders go, etc.

Just tell something to do something and be done with it!

Knowing exactly HOW to do it is the SECOND step.

For example:

class Basket {
  static function getCurrentBasket() {
    return Session::getVariable('basket', new Basket());
  }
}

Again, you don’t have to know how Session does what it does. That’s the NEXT step.

class Session {
  static function getVariable($name, $default) {
    @session_start();
    if(!isset($_SESSION[$name])) {
      $_SESSION[$name] = $default;
    }
    return $_SESSION[$name];
  }
}

There, we’re done. But even now you might not know what $_SESSION does for you under the hood.

Don’t be afraid to ask objects to do things for you even if you have no idea how to do it yet.

Another example would be external systems. Like if you want to get stuff from Twitter. You don’t know yet how Twitter API works, so you write something like (you’re guessing that you’ll need some API key, so you initialize an object)

$twitter = new TwitterAPI('MY API KEY');
$twits = $twitter->getTwits('#php');
foreach($twits as $twit) {
  print $twit->getAuthor();
}

Yeah, write that WITHOUT having any idea how Twitter API works yet.

You go and write a stub for Twitter class:

class Twit {
  public $author = '';

  function getAuthor() {
    return $this->author;
  }
}

class TwitterAPI {
  public $apiKey;

  function __construct($key) {
    $this->apiKey = $key;
  }

  function getTwits($search) {
    $twits = array(new Twit);
    return $twits;
  }
}

Again, you have no idea how the Twitter API works, yet you have code which you can use.

Later on you can replace it with something that really works, just by replacing contents of TwitterAPI class.

"Object-oriented code tells objects to do things." — Alec Sharp

So, get this. OOP is mostly a way to tell things to do things. It’s also a way to hide unnecessary complexity. You don’t really need that mysql_query('INSERT....') there, you need DB::createOrder(...) there or $twitter->sendTwit('...') or something that is meaningful.

MVC… what the..?

So, by now you’ve probably heard about MVC. Model-View-Controller. But you have no idea why you would use it.

It’s quite simple actually. First - when you get used to MVC - you know right away WHERE to search for code.

It touches database? It’s in models.

It accepts data from user? It’s in controllers.

It outputs HTML or JSON or whatever? It’s in views.

That’s it! Easy.

Another reason to do MVC is that if you ever need to present your data in multiple forms - it’s your best friend.

Like, what if you need to write an Android application that needs to create orders on your website?

It’s easy. You just create a controller for that, because models do all the database stuff, all the validation, all insertions, etc. You only need to add the controller that will expose that to the Android app.

Compare that to when you have HTML all over the place and mysql_query calls in between. That’s a lot of work!

So, let’s see a simplified controller for the HTML version of order creation:

class OrderController {
  function createOrder() {
    $basket = Basket::getFromSession();
    if($basket->createOrder($_POST['customer_data'])) {
      redirect('/thanks.html');
    } else {
      render('basket.twig', array('basket' => $basket));
    }
  }
}

where in basket.twig, you would call basket.getErrors (in Twig that’s like $basket->getErrors()) to see if there are any errors.

and an Android version, which accepts JSON:

class AndroidOrderController {
  function createOrder() {
    $this->checkCredentials();
    $basket = Basket::getFromSession();
    $customerData = json_decode($_POST['customer_data'], true); // decode to an array, like the HTML form data
    $order = $basket->createOrder($customerData);
    if($order) {
      print json_encode(array('orderNumber' => $basket->getOrderNumber()));
    } else {
      print json_encode(array('errors' => $basket->getErrors()));
    }
  }
}

Easy!

Now imagine doing the same to stuff like this:

<title><?php echo $pageTitle; ?></title>

<body>
  <?php include('header.php'); ?>

  <?php
    // Detect the Android client by user agent (anything else gets the HTML version)
    $android = preg_match('#Android#', $_SERVER['HTTP_USER_AGENT']) ? 1 : 0;
    $errors = array();
    $basket = $_SESSION['basket'];
    if(count($basket['items']) == 0) {
      if(!$android) {
        print "<b>You can't order empty basket</b><a href=\"/\">Go back!</a>";
      } else {
        $errors []= "You can't order empty basket";
      }
    }
    if(item_price($basket['items'][0]) == 0) {
      if(!$android) {
        print "<b>You can't order priceless item!</b><a href=\"/\">Go back!</a>";
      } else {
        $errors []= "You can't order priceless item!";
      }
    }
    // .... etc
    if(!$android) {
      print "Order is OK!";
    } else {
      print json_encode(array('errors' => $errors));
    }
  ?>
  <?php include('footer.php'); ?>
</body>   

All the while discovering that you are already outputting HTML and can’t really print JSON here. Messy!

So, to sum up MVC very, very briefly:

If it accepts data from user or browser - it’s a “controller”.

class BasketController {
  function addItem() {
    $basket = Basket::getFromSession();
    $basket->addItem($_POST['item']);
    print getRenderer()->render('basket.twig', get_defined_vars());
  }
}

If it touches the database or represents some external API (like Twitter, another database, etc.) - it’s a “model”:

class Basket { // model
  private $errors = array();
  private $items = array();

  static function getFromSession() {
    return Session::getVariable('basket', new Basket);
  }

  function addItem($itemId) {
    $item = DB::Items()->find($itemId);
    if(!$item) {
      $this->errors []= 'There is no such item to add';
      return false;
    }
    $this->items []= $item; // simplified
    return true;
  }

  function completeOrder() {
    // don't be afraid to delegate to more relevant classes
    return Orders::completeOrderFromBasket($this);
  }
}

class Menu {
  static function getRootCategories() 
  {
    ....
  }
}

If it’s HTML/JSON/XML - it’s a “view”.

You have {{ basket.getCount }} items in basket.

<ul>
  {% for item in basket.getItems %}
    <li>{{ item.title }} <span class="price">{{ item.getPriceAsImage }}</span></li>
  {% endfor %}
</ul>

Write in a language that doesn’t yet exist!

So, you need to write an Amazon.com clone?

Well, let’s define some URLs:

urls('
  / -> HomeController#index
  /* -> BookController#showCategory
')

What? You don’t have a urls function? Then write one!

Can’t you parse a string and use call_user_func? It’s not that hard :)
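
To show that it really isn’t that hard, here is one possible sketch. Everything in it is an assumption made for illustration: urls() returns a list of routes, matchRoute() and dispatch() are invented helper names, "*" is taken to mean "any run of characters", and the two controllers are stand-in stubs so the snippet runs on its own.

```php
<?php
// Stand-in stubs so the sketch is self-contained; the real controllers would do real work.
class HomeController { function index() { return 'home page'; } }
class BookController { function showCategory() { return 'category page'; } }

// Hypothetical parser for the "pattern -> Controller#method" mini-language.
function urls($definition)
{
    $routes = array();
    foreach (explode("\n", $definition) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines around the definitions
        }
        list($pattern, $target) = array_map('trim', explode('->', $line, 2));
        list($controller, $method) = explode('#', $target, 2);
        $routes[] = array('pattern' => $pattern, 'controller' => $controller, 'method' => $method);
    }
    return $routes;
}

// Returns the first route whose pattern matches the path, or null if none does.
function matchRoute(array $routes, $path)
{
    foreach ($routes as $route) {
        // Quote the pattern for use in a regex, then turn the quoted "*" into ".*"
        $regex = '#^' . str_replace('\*', '.*', preg_quote($route['pattern'], '#')) . '$#';
        if (preg_match($regex, $path)) {
            return $route;
        }
    }
    return null;
}

// Instantiates the matched controller and calls its method via call_user_func.
function dispatch(array $routes, $path)
{
    $route = matchRoute($routes, $path);
    if ($route === null) {
        return null;
    }
    $className = $route['controller'];
    return call_user_func(array(new $className(), $route['method']));
}

$routes = urls('
  / -> HomeController#index
  /* -> BookController#showCategory
');
print dispatch($routes, '/books/php') . "\n"; // prints "category page"
```

Routes are tried in the order they are written, so the exact "/" has to come before the catch-all "/*", just like in the definition above.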

Write a controller:

class BookController {
  function showCategory() {
    $category = Categories::getByUrl(Controllers::currentUrl());
    renderTemplate('category.twig', get_defined_vars());
  }

  function destroyCategory() {
    if(User::isAdmin()) {
      $category = Categories::getByUrl(Controllers::currentUrl());
      $category->destroyAll();
      redirect('/');
    } else {
      Errors::notAuthorized();
    }
  }

  function moveCategoryToOtherCategory() {
    if(User::isAdmin()) {
      $category = Categories::getByUrl(Controllers::currentUrl());
      $otherCategory = Categories::findById($_POST['newId']);
      $category->moveToCategory($otherCategory);
      redirect('/');
    } else {
      Errors::notAuthorized();
    }
  }
}

And it doesn’t matter that none of those exist yet!

You can just create them later. The main thing is that you’ve written what you actually need at the highest level possible.

Three months later you’ll look at this code and be surprised that you actually understand what it does. :)

Happy coding!