PHP References

Something to be aware of when cloning objects that contain other objects:
Object variables hold handles, so passing or returning an object behaves like passing a reference. Cloning the object before returning it is therefore the first thing to do if you want to keep callers from changing your instance. The catch is that clone makes a shallow copy: the clone still points to the same child objects as the original.

To clone the child objects too, you need to override __clone(). Adding this method to the example class prevents this behavior:
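
The example class is not shown here, so the following is only a minimal sketch with hypothetical Person and Address classes. It assumes the child object lives in a property named address. __clone() runs on the new copy right after the shallow clone is made and replaces the shared child with its own clone.

```php
<?php
// Hypothetical classes, used only to illustrate deep cloning.
class Address
{
    public $city;

    public function __construct($city)
    {
        $this->city = $city;
    }
}

class Person
{
    public $name;
    public $address;

    public function __construct($name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }

    // Called on the copy after PHP has made the shallow clone.
    public function __clone()
    {
        // Give the copy its own Address instead of sharing the original's.
        $this->address = clone $this->address;
    }
}

$original = new Person('Ana', new Address('Cluj'));
$copy = clone $original;
$copy->address->city = 'Bucharest';

echo $original->address->city, "\n"; // Cluj
```

Without the __clone() method the last line would print Bucharest, because both objects would share one Address instance.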

Process issue #1: Code Review and QA run in parallel

A real-life git workflow. Why git flow does not work for us

Most standardized git workflows are not suitable for real agile teams dealing with continuous delivery and constantly changing short-term goals.
Most flows assume following a plan over responding to change: planned releases, rigid process phases, waterfall style.
Agile accepts those changes as normal. Small companies usually don’t have the luxury of making planned releases, stopping development, testing the release thoroughly and releasing only what was planned a month ago. And that’s fine. In the end, what drives our business? Our customers, and releasing new features before the competition, without causing problems.

The most used git workflow is git-flow described here:
This model suggests creating two long lived branches: develop and master. Master is the stable version (reflects the production version). Develop is the development branch, where new features are prepared.
At certain intervals a release is scheduled in order to stabilize develop. A release branch is taken from develop and the stabilization is done there. Some people may continue working on develop if they work on a feature to be released later. You create topic branches for hotfixes and new features. Hotfixes are integrated into master (and into release, if a release is in progress), then back-ported into develop. Completed features are integrated into develop.

A first problem can be seen here: teams do not finish their jobs at the same time. Planned releases are fine in theory, but in practice some features from develop are far from done, even if they seemed done when they were integrated into develop. Taking a branch from develop and trying to finish those features on release would be bad practice, because releases should take 1-2 weeks and only bug-fixing should be done there, not development.

A second problem with this approach is that you can’t know for sure that only the planned features need to go live. I have seen numerous cases where the client pushes for a new feature (for him this is usually a key missing piece, so he doesn’t see it as a feature but as a bug), and if the customer is really important, we will try to satisfy him, in the agile spirit.

A third problem is that features are assumed to be independent of each other, which they are not. One feature can introduce a change that makes integrating another feature impossible. Furthermore, the flow recommends (but does not force) that features are developed locally, which means no integration with other branches. For experienced people this has a big “merge conflict” sign on it: a high probability of regressions due to code incompatibilities, and it’s usually a pain to integrate.

What happens now with the release? We have a release branch with an incomplete feature, probably due to a new feature that was not originally part of the plan. We will probably throw away the release branch once we see that it is highly unstable, and if we were smart enough to work on feature branches (the cold truth is that most people work on dev) we will integrate the completed feature branches into another release branch. But the release takes longer than planned because one feature is not really done: the client realized he needs something more and asks for changes, yet you can’t wait for that to finish before releasing another important feature. On top of that, we have applied some hotfixes taken from master that need additional tests to ensure compatibility with the code on release. What a mess!

The solution is not to take the methodology everyone is hyped about right now and try to adapt the business to that development workflow, but to create the most suitable workflow for the current business, especially when you can’t control the business side.

I am suggesting a methodology that incorporates instructions about QA as well, not only how branches are made and integrated with each other. It’s a release methodology, based on a basic feature branch flow.


  1. For each feature or hotfix we create branches from master.
  2. Internal Testing is done on the feature branches, which are kept in sync with master.
    This is the same thing as testing on the integration to master. Internal testing uses the dev environment.
  3. Once a feature is completed and tested by QA, the feature branch gets integrated into the UAT branch.
    UAT – User Acceptance Testing is a special environment very similar to production, usually sharing the same machine as the production, but using a copy of the database.
  4. Client acceptance is done on the UAT branch which is kept in sync with master.
    Clients confirm that features on UAT comply with their specifications.
    UAT and master accept only merges, not commits or cherry-picks.
    Steps 3 and 4 can be skipped if the feature is minor and does not need acceptance.
  5. The accepted features get merged into master.
  6. Automated tests should run on each build, at least for UAT and master, and block the update if they fail.

As in the GitLab flow, we use long-lived branches for different environments like UAT, but we don’t need a separate branch for production because we can safely use master. We also do not merge from pre-production (our UAT) into master, because then UAT would become a release branch, and we would need to stabilize it and handle all the problems we had before. For us, UAT is just a demo that is as close as possible to what will be integrated into production, and a place to see how features interact with each other. Completed features are merged from their feature branches, not from UAT, which may contain partially done features.

If you fully understand how git rebase works, you can develop features locally and rebase on master until you publish the branch. You will usually publish the branch when the feature is done or when you are collaborating with someone else on it.

Switching to Jira from Asana

Moving towards a more standardized Agile methodology means finding the suitable tool for our processes.

Asana helped us develop new features and track the issues, but we can see some difficulties in organizing it in the way we want.

The main problems with Asana:

  1. inability to easily find what the teams are actually working on
  2. no support for creating and tracking epics (combining multiple stories in order to implement flows and track them across several sprints).

    You can mitigate this by creating different projects, adding epics as tasks and stories as sub-tasks, then assigning these stories to sprints, but you still don’t have any way of seeing a progress report for a certain epic.
    We may track the epic status if we create it as a project, but we won’t be able to prioritize it as a whole.

  3. poor reporting – we have only a burn up chart.
  4. poor overall responsiveness – it freezes and goes down often
  5. hard to use it in a traditional scrum workflow – does not use or enforce any process

On the other hand, Jira is an older tool created by a well-known company (Atlassian) with rich experience in project management software. Atlassian also owns Bitbucket, a service similar to GitHub (Jira integrates well with Bitbucket). Atlassian serves 85 of the Fortune 100.

Jira was designed for teams wanting to enforce a standardized flow.

Main advantages of Jira over Asana:

  1. more mature product
  2. built in support for Scrum and Kanban. Also, you can define your own flow by using a visual representation.
  3. supports epics
  4. you can prioritize entire product backlog including epics in addition to prioritization of individual epic backlog items
  5. easily see active sprints
  6. interactive scrum or Kanban boards (see what’s in progress/done and change status by moving items like you do with the post-its)
  7. supports estimates using several methods (classic time, issue count, business value, story points)
  8. advanced agile reporting (sprint burn-down chart, epic burn-down, velocity, cumulative flow diagram, etc)
  9. can use estimate method (story points for example) to take into account story complexity in reporting
  10. can track time and lets you edit the remaining time for tasks
  11. supports working with versions
  12. supports components (ex: Database, User Interface, etc.)
  13. configurable screen types for each type (story, bug)
  14. configurable fields
  15. integrates with GitHub to link issues to commits. Also integrates well with other Atlassian tools like Bitbucket, Confluence, Bamboo.
  16. faster and more reliable

Main advantages of Asana over Jira:

  1. nicer UI/UX. Asana is a newer product. Every UI interaction is quicker: assign, add labels, comments, upload file, change state, set due dates, add followers, etc.
  2. more flexible. Does not impose any flow. This can be either a plus or a minus, depending on what we want.
  3. it is more visible who is working on each issue, and it is easier to retrieve the list of items assigned to a person

As a personal impression, it feels very natural to work in Asana, and I have a hard time finding my way in Jira. If I could combine Asana’s ease of use with Jira’s flows and reporting, I would say that would be a good choice. For now it seems we need to choose between ease of use and better processes.

Coming from a flexible tool like Asana to something more rigid like Jira means we will definitely need to follow stricter procedures, and some frustration may arise because people may feel that the procedures stand in their way. That’s why a transition from loose procedures to more rigid ones needs to be carefully analyzed.

My recommended workflow using Jira:

  1. Preferably use a single project in order to have a single backlog and prioritize the project from a centralized place
  2. Use Components to organize related items (Broker Area, Employer Area, etc). Components can have Component Leads: people who are automatically assigned issues with that component. Components add some structure to projects, breaking them up into features, teams, modules, sub-projects, and more. Using components, you can generate reports, collect statistics, display them on dashboards, etc. Project components can be managed only by users who have project administrator permissions. They should have unique names within a project. Nothing prevents users from adding an issue to more than one component.
  3. Use Epics to group related stories and track flows. Epics or complex stories may be re-organized during the backlog refinement meetings
  4. Use Labels as the simplest way to categorize items. Anyone can create new labels on the fly while editing an item. All project labels are displayed in the Labels tab of the project as a tag cloud. We can have labels like Production emergency, Feature requests, etc.
  5. Use parallel sprints (this is an experimental feature in Jira, but our current process uses parallel sprints)
    Where to enable this:
  6. use this workflow

  7. use this board configuration:


The usual procedure is to write stories in the product backlog using the standard format:
As a <actor> I’d like to <action> in order to <benefit>.
A story describes a feature from the business perspective.

Stories can be grouped into epics (a flow or a complex story, for example, is an epic and can span multiple sprints).
In Jira you can filter to see backlog items from an epic, those without epics or all.
This way you can track epic progress, prioritize stories inside epics.

There is no easy way to prioritize the epics themselves. To accomplish this you need to add a Kanban board and filter only epics. This can be used as a roadmap or as a Scrumban bucket.

Stories are split into tasks by the dev team. Tasks focus on the “how” while stories focus on the “what”. A story can have sub-tasks, and sub-tasks can represent the technical part.
The product owner should never create tasks, but he will create stories.

For the big picture, organizing a project always involves starting with a roadmap. This is used to create epics, then stories, then tasks.

Definition of done

Scrum was created to allow for an iterative, incremental development. This means at the end of an iteration a piece of working software is delivered.
And this means we need to clarify the definition of done. This is generally accepted as: the code is functional and does what the requirements say, development is final (it passed code review and refactoring), automated tests were written, and manual QA tests passed.

More on definition of done:

… Scrum asks that teams deliver “potentially shippable software” at the end of every sprint. To me, potentially shippable software is a feature(s) that can be released, with limited notice, to end users at the product owner’s discretion.

This does not mean that all sprints declared successful are bug-free, but they should be production-ready. Each sprint ends with a sprint review, after which the product owner can declare the sprint successful or failed. In Scrum, a failed sprint is not considered a bad thing; it is accepted as normal, and actions are taken to improve what went wrong.
Scrum teams must include a QA specialist on the team and QA is done during the sprint, not after the sprint. If the QA is done after the sprint or the sprint is not declared successful or failed, I think this is not Scrum.

What about a separate sprint for QA ?

Some people may advocate the idea that from their own experience QA will always remain behind, and a separate sprint for QA will improve the process.
QA also may feel that this will allow for a slower, more systematic approach to testing.
I know for a fact such procedure is implemented in some organizations, but I find it to be anti-agile because it does not take into account the definition of done.
See the comments on this methodology (the author himself does not advise using it, but he notes it as a possibility):

Having QA work in their own sprint is not how Agile/Scrum was designed.
There are some important aspects of doing Scrum that I want to point out:
– Teams share the responsibility of completing work (ideally means a potentially shippable product at product owner discretion) during a sprint
– Teams include QA in order to get things done without defects.
– Teams accomplish this by trusting each other, good communication and constant adjustment and improvement of future sprints by using retrospective meetings.

The whole purpose of Agile is building the product incrementally. The sprint is not completed if it is not QAd.
We didn’t have a QA specialist on our team, yet we still managed to deliver software that was bug-free or had only minor bugs, because we wrote automated tests and did the manual testing ourselves.

From my experience a good QA specialist will not remain behind and has enough time to create test plans, test manually and make some automated tests during the sprint, as long as Scrum is done correctly. Remember that the team members can help each other, and developers should not feel uncomfortable to help with the testing or documentation if required.

Sub-tasks for testing ?

In Scrum or Kanban an issue/story is completed in stages. In fact these are the same stages as in waterfall, but done inside a smaller timeframe. That’s why you don’t need separate tasks for testing: the sprint backlog item can flow from analysis to dev to testing to finished. The item is done ONLY when all stages are completed.

Uncompleted items, bugs from previous sprints, production emergencies

If some items remain uncompleted, the item is not presented during the sprint review meeting and is scheduled for another sprint.
If issues are found AFTER the sprint was delivered but the functionality hasn’t been released, the bugs are included in upcoming sprints according to the priority given by the Product Owner.
If issues reached production, they are usually added with high priority to the current sprint, and the team can negotiate the removal of other, less important items in order to complete the sprint on time.
There are also bug-fixing only sprints, in which a team tries to solve as many bugs as possible from the backlog, usually during and after a release.


The ideal workflow is one that offers production-ready code at any time, not only at certain intervals. Continuous Integration helps here: automated tests make high-quality releases possible even twice a day. CI can be used for maintenance releases, and it does not exclude formal V2 or V3 releases. But with CI you usually find that you don’t need to prepare big releases months ahead; you stay ready to release at any time and use feature toggles to enable a feature once it is finished.

In conclusion, I would advise keeping QA in the team, making the team as a whole (including QA) responsible for completing items, declaring the sprint successful or failed after the sprint review meeting, and adding unfinished items or bugs to the next sprints.

Widgetize your app! Reusing code needed to show blocks of content in ZF2 with Controller Plugins and child views

So you are developing a ZF2 application and you have a block of content which needs to be inserted in several places within your application. Using a forward could work, but it renders the whole page, not just the part you are interested in. Two concepts come to the rescue here: controller plugins and child views.

Here is how a controller action looks with this method:

It looks good, doesn’t it?
“employer” is registered as controller plugin.
We create a parent view called $viewModel. The child view is returned by the plugin method getProfile(). It is then inserted as child to $viewModel and made available as “employerInfo”.

In the view:

And here is a trick to show variables from the child view inside the parent view:

In order to accomplish all this, we need to create the controller plugin:

and register it in the service manager:

The only thing left to do is to create the view for the content block. In our example it is located in module/Employer/view/employer/plugin/info.phtml

Now you can insert this content block in another place by adding a different controller action, which renders a custom parent view.

In the view you can put whatever you want, and include $this->employerInfo in the place where you want to render the content block.

What about having different links ?

In some situations, your content blocks may contain links or actions to use different URLs. These need to be dynamic also. I’ve created a view helper for this.

Here is the helper class:

To use it, I pass an array with link information to the controller plugin. Bear with me, because I will explain later how this array is used.
controller code:

From inside the plugin method I make the links variable available to the view:

plugin code:

Finally I use the PluginLink helper to create the URLs. pluginLink has two parameters.
The first parameter contains the link configuration array, having some keys like: route, param, options.
The second parameter contains a list of variables used to replace those $ placeholders inside the link definition.
$0 is replaced by the first item. $1 is replaced by the second item and so on.
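
The placeholder substitution described above can be sketched in plain PHP. This is not the PluginLink helper itself: the function name fillLinkPlaceholders, the route name and the exact array layout are assumptions made for illustration, and only values that consist entirely of a placeholder such as $0 are replaced.

```php
<?php
/**
 * Hypothetical sketch: replaces "$0", "$1", ... inside a nested link
 * definition with the matching item from $values.
 */
function fillLinkPlaceholders(array $definition, array $values): array
{
    array_walk_recursive($definition, function (&$item) use ($values) {
        // Only whole-string placeholders like '$0' are considered.
        if (is_string($item) && preg_match('/^\$(\d+)$/', $item, $match)) {
            $index = (int) $match[1];
            if (array_key_exists($index, $values)) {
                $item = $values[$index];
            }
        }
    });

    return $definition;
}

// Assumed link definition; the route name is made up for this example.
$viewLink = [
    'route' => 'employer/default',
    'param' => ['action' => 'view', 'secondary_id' => '$0'],
];

$userId = 42;
$filled = fillLinkPlaceholders($viewLink, [$userId]);

echo $filled['param']['secondary_id'], "\n"; // 42
```

A real helper would then hand the filled route name and params to the framework's URL helper to build the final URL.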

view code:

Notice the line with ‘secondary_id’ => ‘$0’ in the $options definition from the controller? It instructs the helper to create a URL from the link definition in “view” and replace the secondary_id route param with the first array item given (the user id).

Here is an extended example where I pass dynamic query params:
controller code:

the view for this link:


Using controller plugins is an easy way to reuse controller-related code, and child views allow easy reuse of view blocks. Your plugin logic can read the route params and use them to filter results or make decisions. Returning a ViewModel (the child view model) lets you render its HTML as is, or use only the variables declared in it to build a totally different view.

You could use a normal service instead of a controller plugin, but controller plugins are types of services especially provided by the framework for handling controller related logic – you have the getController() method included and they are available on all your controllers.

Simplify handling of tables, entities, forms and validations in ZF2 by using annotations

If you have developed any application using ZF2, you may have become frustrated by the tedious work of creating boilerplate code for common tasks like a simple form which is validated and then saved in a database. The Zend manual recommends creating a table class, an entity class, a form class and a validator class, along with the common MVC prerequisites like controller, action and view, plus the Zend config stuff for paths, etc. Coming from the convention-over-configuration world of CakePHP, this seems ridiculous.

Here is how you can speed up your workflow while still benefiting from all the enterprise features and flexibility you like in ZF2:

Annotations are special docblocks which store metadata in PHP classes. This information is available at runtime through reflection, unlike regular comment blocks, which are not. Note the difference:
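
As a small illustration of that difference, the sketch below uses PHP's reflection API to read both kinds of comment. The class name and the @Required(true) text are made up for the example; no annotation engine is involved, the code only shows which comment PHP keeps.

```php
<?php
// Hypothetical class used only to compare the two comment styles.
class AnnotatedExample
{
    /**
     * @Required(true)
     */
    public $withDocblock;

    /* @Required(true) */
    public $withPlainComment;
}

// getDocComment() returns the docblock text, or false when there is none.
$docblock = (new ReflectionProperty('AnnotatedExample', 'withDocblock'))->getDocComment();
$plain = (new ReflectionProperty('AnnotatedExample', 'withPlainComment'))->getDocComment();

var_dump(is_string($docblock)); // bool(true): the docblock is available at runtime
var_dump($plain);               // bool(false): the regular comment is discarded
```

Only comments that open with /** count as docblocks. Note that an opcode cache configured to drop comments would also remove docblocks, so annotation-based code depends on comments being preserved.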


There is no support in the PHP core for annotations, but there are some engines built on the reflection API which can be used successfully with annotations. Common choices are the ones present in Symfony and phpDocumentor. ZF2 supports annotations through its AnnotationBuilder class and doctrine/common (a Doctrine package that Symfony also uses).

You can add the required package to your project by using composer:
Edit composer.json

then run
php composer.phar install

I am using a TableGateway factory to return a generic table instance, or custom table instances if they exist. The table service will take care of the CRUD operations and hydrate the result set.

We start with a base entity:

To demonstrate, I will use two entities: User and Address. User contains another object called Address.
I’ve put some examples to annotate validations, filters and define how the form looks.
As you can see, the validators, filters and their options are the ones shipped with Zend Framework.

Address Entity:

Basic usage

Controller action

You obtain the form from the entity with an already bound object, then work with the form as usual.

Extracting an array representation of the object, including composed objects:

View – test.phtml

Here is a basic method to populate the entities with data:

You can hydrate by providing either an object or an array.

Then you can validate your entity like this:

A very nice feature is to automatically hydrate the composed objects as well when using queries with joins. This can be done automatically if you attach this custom hydrator to the result set prototype and prepare the query for this behavior.

Any composed objects will be populated. Other joins will populate a special property called VF (from virtual fields). You can get virtual fields later by using getVF($name = null).

For example if we join ContactDetail it will populate the properties from ContactDetail as well and if we have also an aggregate expression like COUNT(*) then you will find this value in the virtual fields. The purpose of virtual fields is to store any data outside the scope of the entity.
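
The idea of virtual fields can be illustrated with a small framework-free sketch. The class names, the ad_count column and the hydrate() logic below are assumptions for illustration only; the real hydrator works on the ZF2 result set and also fills composed objects, which this sketch leaves out. Only getVF($name = null) follows the signature mentioned above.

```php
<?php
// Hypothetical base class: known properties are filled, everything else
// is stored as a virtual field.
class EntitySketch
{
    protected $vf = [];

    public function hydrate($data)
    {
        // Accept either an array or an object with public properties.
        if (is_object($data)) {
            $data = get_object_vars($data);
        }
        foreach ($data as $key => $value) {
            if ($key !== 'vf' && property_exists($this, $key)) {
                $this->$key = $value;
            } else {
                $this->vf[$key] = $value;
            }
        }

        return $this;
    }

    // Returns one virtual field, or all of them when no name is given.
    public function getVF($name = null)
    {
        if ($name === null) {
            return $this->vf;
        }

        return isset($this->vf[$name]) ? $this->vf[$name] : null;
    }
}

class UserSketch extends EntitySketch
{
    public $id;
    public $name;
}

// A row as it might come back from a query with an aggregate column.
$user = new UserSketch();
$user->hydrate(['id' => 7, 'name' => 'Ana', 'ad_count' => 3]);

echo $user->name, ' has ', $user->getVF('ad_count'), " ads\n"; // Ana has 3 ads
```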


Then prepare the query to format the column names so that the hydrator can detect them.

For further reference, here is a list of available annotations:

  • AllowEmpty: mark an input as allowing an empty value. This annotation does not require a value.
  • Attributes: specify the form, fieldset, or element attributes. This annotation requires an associative array of
    values, in a JSON object format: @Attributes({"class":"zend_form","type":"text"}).
  • ComposedObject: specify another object with annotations to parse. Typically, this is used if a property
    references another object, which will then be added to your form as an additional fieldset. Expects a string
    value indicating the class for the object being composed @ComposedObject("Namespace\Model\ComposedObject") or an array to compose a collection: @ComposedObject({
    "target_object":"Namespace\Model\ComposedCollection", "is_collection":"true", "options":{"count":2}})

    target_object is the element to compose, is_collection flags this as a collection and options can take an array
    of options to pass into the collection.
  • ErrorMessage: specify the error message to return for an element in the case of a failed validation. Expects a
    string value.
  • Exclude: mark a property to exclude from the form or fieldset. This annotation does not require a value.
  • Filter: provide a specification for a filter to use on a given element. Expects an associative array of values,
    with a “name” key pointing to a string filter name, and an “options” key pointing to an associative array of
    filter options for the constructor: @Filter({"name": "Boolean", "options": {"casting":true}}). This annotation
    may be specified multiple times.
  • Flags: flags to pass to the fieldset or form composing an element or fieldset; these are usually used to
    specify the name or priority. The annotation expects an associative array: @Flags({"priority": 100}).
  • Hydrator: specify the hydrator class to use for this given form or fieldset. A string value is expected.
  • InputFilter: specify the input filter class to use for this given form or fieldset. A string value is expected.
  • Input: specify the input class to use for this given element. A string value is expected.
  • Instance: specify an object class instance to bind to the form or fieldset.
  • Name: specify the name of the current element, fieldset, or form. A string value is expected.
  • Object: specify an object class instance to bind to the form or fieldset.
    (Note: this is deprecated in 2.4.0; use Instance instead.)
  • Options: options to pass to the fieldset or form that are used to inform behavior – things that are not
    attributes; e.g. labels, CAPTCHA adapters, etc. The annotation expects an associative array: @Options({"label":
  • Required: indicate whether an element is required. A boolean value is expected. By default, all elements are
    required, so this annotation is mainly present to allow disabling a requirement.
  • Type: indicate the class to use for the current element, fieldset, or form. A string value is expected.
  • Validator: provide a specification for a validator to use on a given element. Expects an associative array of
    values, with a “name” key pointing to a string validator name, and an “options” key pointing to an associative
    array of validator options for the constructor: @Validator({"name": "StringLength", "options": {"min":3, "max":
    . This annotation may be specified multiple times.

Unique record validation in ZF2 forms

Controller code:

Validator class:

Configure Sphinx Search server with a main + delta indexing scheme, including updates & deletes

Sphinx Search is an open source full-text search server developed in C++. It is a very fast and scalable solution, superior to what database servers offer. It works on all major operating systems, but in this example I will show you how to install and configure it on Linux, which is the most common choice.

The datasource will be a MySQL database.


Installing is simple. You can download the sources, and use the standard procedure (configure and make). If you are using CentOS, you can download the latest RPM and install it like this:

rpm -ihv <the-URL-of-RPM-from-sphinx-website>

CentOS usually has an old version in the official yum repo, so you will want to download the latest version, because new features are added all the time.

If it complains about missing libraries, like odbc, use yum to locate them.


If you used rpm to install, the configuration file is located at /etc/sphinx/sphinx.conf

Sample config:


As you can see I used a table named ads.

You need to create two tables for sphinx:

  1. sphinx_ads_deleted – Will contain deleted items from ads. The deleted items are inserted by the DELETE trigger on ads
  2. sphinx_counter – Will contain the updated last id and modification date since the last reindex

You need to define the DELETE trigger found below on the ads table.

I will include the structure for my ads table also.

Useful commands

start/stop/restart service:
service searchd restart
indexer --rotate ads_main

--rotate will update the index named ads_main even if it is in use

Cron jobs (update schedule)

Usually the main index is rebuilt once a day, and the delta updates more frequently.

Make sure the crond service is running with:
service crond status
It should say the service is running.

Create a file for each job in /etc/cron.d

sphinx_main – runs at 2:12 AM each night
sphinx_delta – runs every minute

Faster updates ?

1. Use merge

Instead of reindexing main, you could merge delta into main. This still consumes a lot of memory, but it would be faster.

The basic command syntax is as follows:

So you will have something like this:

The problem is that you can’t use the shell for this because you will need to update the sphinx_counter table also, and that is why you will need to do this from a script.
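
As a starting point for such a script, here is a minimal PHP sketch that only builds the merge command. It relies on the documented form indexer --merge DSTINDEX SRCINDEX [--rotate]; the function name buildMergeCommand and the delta index name ads_delta are assumptions, since only ads_main appears in this post.

```php
<?php
/**
 * Hypothetical helper: builds the indexer command that merges a delta
 * index into a main index.
 */
function buildMergeCommand(string $mainIndex, string $deltaIndex, bool $rotate = true): string
{
    // Accept only simple index names so nothing unexpected reaches the shell.
    foreach ([$mainIndex, $deltaIndex] as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Unexpected index name: $name");
        }
    }

    // indexer --merge DSTINDEX SRCINDEX [--rotate]
    return sprintf(
        'indexer --merge %s %s%s',
        $mainIndex,
        $deltaIndex,
        $rotate ? ' --rotate' : ''
    );
}

// ads_delta is an assumed name for the delta index.
$command = buildMergeCommand('ads_main', 'ads_delta');

echo $command, "\n"; // indexer --merge ads_main ads_delta --rotate
```

A complete script would run this command (for example with exec()) and, only if it succeeds, update the sphinx_counter table so that the next delta run starts from the right place.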

I prefer to rebuild the index each night to make sure I am using a synchronized version of the database.
A full reindex of a table with 100K records takes only a few seconds.

2. Use Real Time indexes for live updates

Real Time indexes were introduced with version 1.10-beta. Updates on an RT index can appear in the search results in 1-2 milliseconds, i.e. 0.001-0.002 seconds. However, RT indexes are less efficient for bulk indexing huge amounts of data.

HTML Cleaner

How many of you needed to clean up those messy MS Word files in order to integrate them into valid W3C pages, or just integrate them in the overall design ?
I’ve looked for a good HTML Cleaner and didn’t find a good free one.

Meanwhile, I’ve developed my own HTML Cleaner class in PHP, because I needed to clean up tons of Word-generated code at that time.

I’ve combined the strong HTML Tidy library with my own regular expression-based cleaning algorithms. I wanted a simple method to strip all unnecessary tags and styles yet keep the result W3C standards compliant.

Syntax checking is being done only when using Tidy.
Note that this tool is designed to strip/clean useless tags and attributes back to HTML basics and optimize code, not sanitize (like HTMLPurifier).

Without the tidy PHP extension, the class can:
– remove styles, attributes
– strip useless tags
– fill empty table cells with non-breaking spaces
– optimize code (merge inline tags, strip empty inline tags, trim excess new lines)
– drop empty paragraphs
– compress (trim space and new-line breaks).

In conjunction with tidy, the class can apply all tidy actions (clean-up, fix errors, convert to XHTML, etc) and then optionally perform all actions of the class (remove styles, compress, etc).

Currently the following cleaning method is implemented: tag whitelist/attribute blacklist
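
To give an idea of what a tag whitelist combined with an attribute blacklist means, here is a minimal sketch. It is not the HTML Cleaner class: the function name cleanHtmlSketch and the sample markup are made up, and the attribute regex is deliberately naive (it would also match attribute-like text outside of tags).

```php
<?php
/**
 * Hypothetical sketch of "tag whitelist / attribute blacklist" cleaning.
 */
function cleanHtmlSketch(string $html, array $allowedTags, array $blockedAttributes): string
{
    // Tag whitelist: strip_tags() keeps only the tags listed in its
    // second argument and removes every other tag (not its text).
    $allowed = $allowedTags ? '<' . implode('><', $allowedTags) . '>' : '';
    $html = strip_tags($html, $allowed);

    // Attribute blacklist: drop name="..." or name='...' for each
    // blocked attribute name.
    foreach ($blockedAttributes as $attribute) {
        $pattern = '/\s+' . preg_quote($attribute, '/') . '\s*=\s*("[^"]*"|\'[^\']*\')/i';
        $html = preg_replace($pattern, '', $html);
    }

    return $html;
}

// Sample of the kind of markup Word tends to produce.
$dirty = '<p class="MsoNormal" style="margin:0"><span lang="EN-US">Hello</span> <b>world</b></p>';
$clean = cleanHtmlSketch($dirty, ['p', 'b'], ['class', 'style']);

echo $clean, "\n"; // <p>Hello <b>world</b></p>
```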

See it in action:
Download latest version

Licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported
for personal, non-commercial use

For commercial use one developer licence costs 15 EUROs

-taken from RC6

v. 1.0 RC6
-added option to apply tidy before internal cleanup
-added function TidyClean() that cleans only with Tidy the source from html, modifying it
-changed license to Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported

v. 1.0 RC5
-tidy cleanup works also with PHP 4.3 now. Correction: class is compatible with PHP >=4.3. PHP 5 recommended. Basic cleanup (no tidy) can work with earlier versions of PHP 4
-removed drop-empty-paras option from default tidy config since there is already an internal drop-empty-paras mechanism
-Optimize now defaults to true since it is very useful
-new default tidy config options:
‘preserve-entities’ => true, // preserve the well-formed entities as found in the input (to display correctly some chars)
‘quote-ampersand’ => true,//output unadorned & characters as &amp; (as required by W3C)
-default Encoding set to latin1

v. 1.0 RC4
-the class is now compatible with PHP 4.4 or higher (maybe 4.0, but never tested)
-minor bugfix for Optimize (loop until optimized now works correctly)

v. 1.0 RC3
-cleaning is now done case insensitive
-improved optimize, removed EXPERIMENTAL tag
-default tidy config now sets word-2000 to false
