Posted on November 2nd, 2009.
After you deploy a CMS, the client types in their content, and as time goes on that is all there is to it. At least that is the proverbial theory.
Sometimes, though, all your efforts to provide a consistent appearance to the content via CSS just don’t work. The trouble is that your client is often copying and pasting content from all manner of sources, including their favourite word processing package.
The richtext editor bundled with MODx is TinyMCE, and a very good job it does. It even has a ‘Paste from Word’ button that, if configured, can filter out any styles that have been pasted in from Word, thus getting rid of the original font and font size related styling so that your CSS can do its job in providing a consistent look to the site.
Unfortunately you will need to configure it to strip out those styles, and pasting is not the only way some bright spark in your client’s organisation can insert styles. So maybe there is another, more bombproof, way.
The solution lies with MODx events, or rather the event OnDocFormSave. Basically, each time a document is saved you run its content through a regular expression that removes any style attributes. Unfortunately, you may want to keep some useful style attributes, since TinyMCE itself uses some, e.g. for image positioning and margins.
Personally I have (so far) found only the image tag’s style="float: left/right" worth keeping. Anything else just results in the consistent site look being, erm, inconsistent. Thus I have the following code:
/*
* Clean styles from richtext content
*
* Last modified: TS, 26/8/2009
*
* for OnDocFormSave
*
* Note $unchanged_text_tv_fields in the config. This must include ALL unchanged text fields from modx_site_content.
*/
/* ----- CONFIG ----- */
$tv_fields = array('content');
$custom_tv_fields = array('subcontent');
$unchanged_text_tv_fields = array('pagetitle', 'longtitle', 'menutitle', 'introtext', 'type', 'contentType', 'description', 'alias', 'link_attributes'); // VERY IMPORTANT FOR SECURITY!!!
// Regex to allow floats (no other styles must be included to preserve the float)
$regex = '/(\<[^>]+)style\="(?!float\:\s*[a-z]+;?")[^"]+"([^>]*\>)/';
// Regex to strip all styles
//$regex = '/(\<[^>]+)style\="[^"]+"([^>]*\>)/';
/* ----- ------ ----- */
require_once($modx->config['base_path'].'assets/libs/docmanager/document.class.inc.php');
$doc = new Document($id);
$doc_tvs = $modx->getTemplateVarOutput(array_merge($tv_fields, $custom_tv_fields), $id);
if ($doc_tvs)
{
    foreach ($tv_fields as $field)
    {
        $doc->Set($field, $modx->db->escape(preg_replace($regex, '$1 $2', $doc_tvs[$field])));
    }
    foreach ($custom_tv_fields as $field)
    {
        if (isset($doc_tvs[$field]))
            $doc->SetTV($field, $modx->db->escape(preg_replace($regex, '$1 $2', $doc_tvs[$field])));
    }
    foreach ($unchanged_text_tv_fields as $field)
    {
        $doc->Set($field, $modx->db->escape($doc->Get($field)));
    }
    $doc->Save();
}
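To get a feel for what the two patterns in the config actually do, here is a small standalone sketch that runs them outside MODx. The sample markup and the variable names ($keep_floats, $strip_all, $samples) are made up for illustration; only the two regexes and the '$1 $2' replacement come from the plugin.

```php
<?php
// The two patterns from the plugin config, copied unchanged.
$keep_floats = '/(\<[^>]+)style\="(?!float\:\s*[a-z]+;?")[^"]+"([^>]*\>)/';
$strip_all   = '/(\<[^>]+)style\="[^"]+"([^>]*\>)/';

// Hypothetical sample markup, roughly what a paste from Word can leave behind.
$samples = array(
    'word'  => '<p style="font-family: Arial; font-size: 14pt">Hello</p>',
    'float' => '<img src="a.jpg" style="float: left;" />',
    'mixed' => '<img src="a.jpg" style="float: left; margin: 10px" />',
);

$kept = array();     // results using the float-preserving pattern
$stripped = array(); // results using the strip-everything pattern
foreach ($samples as $name => $html) {
    $kept[$name]     = preg_replace($keep_floats, '$1 $2', $html);
    $stripped[$name] = preg_replace($strip_all, '$1 $2', $html);
    echo $name, ' (keep floats): ', $kept[$name], "\n";
    echo $name, ' (strip all):   ', $stripped[$name], "\n";
}
```

With the float-preserving pattern, the 'float' sample should come through untouched, while 'word' and 'mixed' lose their style attribute (a float combined with any other style is not preserved, as the comment in the config warns). The replacement leaves a little extra whitespace inside the tag where the attribute used to be, which is harmless in HTML.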
A few points… I use the docmanager class on a regular basis and so I have used it here also. It does have a potential security flaw if you are not aware of how it works, so I suggest you look up my posting on the issue in the MODx forums. If you use the above code as is, you will be fine; if you change it, read the comments.
By way of example, the above config is set up to ‘clean’ styles from both the normal content field and a template variable (TV) named ‘subcontent’. If you have no richtext TVs other than the normal or standard one then you can change
$custom_tv_fields = array('subcontent');
to
$custom_tv_fields = array();
(actually you can just use the code above as is – it will ignore any TVs it cannot find rather than producing an error – which means it will work fine on different templates).
If you have other richtext TVs, then change it to (for example)
$custom_tv_fields = array('content2','content3');
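The way missing TVs get skipped can be sketched outside MODx as well. In the snippet below, a plain array stands in for whatever $modx->getTemplateVarOutput() would return on a template that has ‘content2’ but not ‘content3’; that array, its contents and the $cleaned variable are assumptions for illustration only, not real MODx output.

```php
<?php
// Float-preserving pattern from the plugin config, copied unchanged.
$regex = '/(\<[^>]+)style\="(?!float\:\s*[a-z]+;?")[^"]+"([^>]*\>)/';

// Config as in the example above: two extra richtext TVs.
$custom_tv_fields = array('content2', 'content3');

// Hypothetical stand-in for the TV lookup result: this template has no 'content3'.
$doc_tvs = array(
    'content'  => '<p style="color: red">Main</p>',
    'content2' => '<p style="font-size: 20px">Extra</p>',
);

$cleaned = array();
foreach ($custom_tv_fields as $field) {
    // Same guard as the plugin: TVs that are not present are silently skipped.
    if (isset($doc_tvs[$field])) {
        $cleaned[$field] = preg_replace($regex, '$1 $2', $doc_tvs[$field]);
    }
}

print_r(array_keys($cleaned));
```

Only ‘content2’ ends up in $cleaned here, which is why one config can be shared across templates that carry different sets of TVs.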
There you have it – now enjoy consistent-looking websites! It is easy to install, as per any plugin: just create a new plugin, paste in the code, and tick the OnDocFormSave event. Done!