tag:blogger.com,1999:blog-32235958040022850012024-03-06T06:26:41.507+01:00Web development with Python and PloneBertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.comBlogger19125tag:blogger.com,1999:blog-3223595804002285001.post-14451109565729121202009-06-02T16:52:00.003+02:002009-06-08T13:46:14.817+02:00Clean up trailing whitespaces in sourcesMy editor (emacs) is configured to remove trailing whitespace in python files when I save them. This way I never commit modifications related to whitespace changes, making the diffs readable since they contain relevant modifications only.<br /><br />Unfortunately not everyone does that, and when it comes to contributing to an existing project it can be very difficult to produce readable patches: sometimes, while the actual patch is just a one-line change, the diff will show dozens of blank changes due to whitespace cleanup.<br /><br />Diff has a switch to ignore whitespace changes, but it is of little use with python: a change of block level (indentation) would simply be ignored.<br /><br />To clean up all python files found under a directory I use a shell one-liner:<br /><pre>$ find . -name '*.py' -exec sed -i -e 's/[ \t]*$//' {} ';'</pre>As usual it worked-for-me™ but comes with no warranty.<br /><br />My workflow for contributing a clean patch is like this:<br /><ol><li>create a local branch with bazaar, mercurial or git. 
Of course it depends whether the project is already using one of them, but if you branch from a subversion repository it's just a matter of preference<br /></li><li>clean whitespace and commit (locally)</li><li>create and submit the patch normally<br /></li></ol>And here is the related configuration part for emacs:<br /><pre>;; whitespace cleanup<br />(defun my-py-no-trailing-space ()<br />  ;; this hook is buffer local, can't add it globally<br />  (add-hook 'write-contents-functions 'delete-trailing-whitespace)<br />  ;; if enabled, clean the buffer at load time (this will automatically put<br />  ;; the buffer in modified state, which might be annoying)<br />  ;;(whitespace-cleanup)<br />  )<br /><br />(add-hook 'python-mode-hook 'my-py-no-trailing-space)<br /></pre>I believe that other editors and environments (including Eclipse) can be configured for this, too.Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com7tag:blogger.com,1999:blog-3223595804002285001.post-19305657258582290212009-05-29T14:54:00.015+02:002009-06-01T12:43:25.850+02:00Installing Django, Solr, Varnish and Supervisord with BuildoutHere I'll detail my buildout configuration for an install of <a href="http://www.djangoproject.com/">Django</a>, <a href="http://lucene.apache.org/solr/">Solr</a> (HTTP search server), <a href="http://varnish.projects.linpro.no/">Varnish</a> (HTTP cache), and <a href="http://supervisord.org/">supervisord</a> for controlling solr and varnish. I'll show how to get a Debian init script for supervisord (the instructions are of course valid for Ubuntu too). I'll detail the parts of base.cfg for each service, and I'll try to explain what they do and why.<br /><br />This may not be the best way to do it, but at least it works for me(tm), so I think it deserves to be shared.<br /><br />My buildout doesn't handle (yet?!) the apache configuration, so I will not cover this. 
For the curious, here is the intended http chain:<br /><ul><li>Apache listens on port 80 and forwards requests to Varnish on port 3128. Unlike a typical Zope setup I don't need rewrite rules (here), simple proxying is enough</li><li>Varnish reaches the backend (django) on port 8000</li><li>Apache listens on port 8000, and serves Django with wsgi. It should serve localhost only.</li><li>django may query solr on port 8983<br /></li></ul><h2>Buildout file organisation</h2>In my buildout directory I have:<br /><ul><li>base.cfg: this file contains the core configuration. Specific settings (for development, production...) are made in files that extend base.cfg.</li><li>templates/: this directory contains file templates used in my buildout; for example the template of the supervisord init script lives here</li><li>varnish-conf/, solr-conf/: I'm versioning the configuration for these services, since the configurations generated by the recipes needed adjustments<br /></li></ul>Here is the "buildout" part in base.cfg:<br /><pre>[buildout]<br />newest = false</pre>newest: by default I don't want to check if eggs can be updated<br /><pre>versions = versions</pre>I want an exact version for a given egg; it will be declared in the "versions" section. For example one can set "my.app = 1.0.3" (if you are developing "my.app" you can unset this in dev.cfg by declaring "my.app = ")<br /><pre>parts =<br /> svn-products<br /> django<br /> solr-files<br /> solr<br /> solr-conf<br /> varnish-build<br /> varnish<br /> supervisor<br /> supervisord_init_script<br /></pre>parts are installed in order. I'll detail them in turn.<br /><pre>find-links =<br /> http://dist.repoze.org/<br /></pre>A distribution of <a href="http://www.pythonware.com/products/pil/">PIL</a> can be found here (it is poorly referenced at pypi). Also if you cannot upload "my.app" to pypi (customer project anyone?) 
and you don't have an egg server, you can put egg tarballs of "my.app" on a local web server and put the link in "find-links".<br /><pre>eggs =<br /> PIL<br /> lxml<br /> psycopg2<br /> django-extensions<br /> django-cachepurge<br /> Werkzeug<br /> my.app<br /><br />[versions]<br />djangorecipe = 0.17.4<br />django-extensions = 0.4<br />Werkzeug = 0.5<br /></pre><h2>Django</h2>The parts related to django are "svn-products" and "django". "svn-products" allows me to get solango (not yet released as an egg :-( ).<br /><pre>[svn-products]<br />recipe = iw.recipe.subversion<br />urls =<br /> http://django-solr-search.googlecode.com/svn/trunk/solango solango<br /></pre>The django part. Note that I'm using django 1.0.2, since 1.1 has not had a final release yet. This is a matter of choice. Django 1.0.2 is available as an egg, but the recipe doesn't use it and downloads django itself.<br /><pre>[django]<br />recipe = djangorecipe<br />version = 1.0.2<br />control-script = django<br />wsgi = true<br />projectegg = my.app<br />eggs = ${buildout:eggs}<br />extra-paths = ${svn-products:location}<br /></pre><h2>Solr</h2>The solr installation is made of 4 parts:<br /><ul><li>solr-files: downloads and unpacks the solr distribution:<br /><pre>[solr-files]<br />recipe = hexagonit.recipe.download<br />url = ftp://mir1.ovh.net/ftp.apache.org/dist/lucene/solr/1.3.0/apache-solr-1.3.0.tgz<br />md5sum = 23774b077598c6440d69016fed5cc810<br />strip-top-level-dir = true<br /></pre></li><li>solr: creates a runnable instance of solr<br /><pre>[solr]<br />recipe = collective.recipe.solrinstance<br />solr-location = ${buildout:parts-directory}/solr-files<br />host = localhost<br />port = 8983<br /><br />unique-key = uniqueID<br />default-search-field = text<br /><br />index =<br /> name:uniqueID type:string indexed:true stored:true required:true<br /> name:text type:string indexed:true stored:true required:false omitnorms:false multivalued:true<br /></pre></li><li>solr-conf: I have added this to overwrite some 
config files in the solr instance directory<br /><pre>[solr-conf]<br />recipe = iw.recipe.cmd<br />on_install = true<br />on_update = true<br />cmds =<br /> cp -v ${buildout:directory}/solr-conf/jetty.xml ${solr:jetty-destination}<br /> cp -v ${buildout:directory}/solr-conf/schema.xml ${solr:schema-destination}<br /> cp -v ${buildout:directory}/solr-conf/stopwords_fr.txt ${solr:schema-destination}<br /></pre>Why? Because:<br /><ul><li>for jetty.xml I made solr listen on localhost only, which is not the default. If you choose to customize jetty.xml you must replace absolute paths with relative ones. For example for "RequestLog", the path must be changed to: "../../var/solr/log/jetty-yyyy_mm_dd.request.log"</li><li>For schema.xml it is a bit different. At first I let the recipe generate it, but solango can output the field definitions from your application. Thus there is no reason to maintain them in the buildout (in the "solr" part). The command is:<br /></li></ul><pre>bin/django solr --fields --path=/tmp</pre>Then update schema.xml with the output.<br /></li><br /><li>solr-rebuild: "command" for reindexing django content (clear & rebuild)<br /><pre>[solr-rebuild]<br />recipe = iw.recipe.cmd<br />on_install = true<br />on_update = true<br /><br /># since solr is not started by solr-instance but supervisord, solr-instance has<br /># no pid file and thinks that solr is down. 
Thus we must run it with<br /># solr-instance to be able to "solr-instance purge"<br />cmds =<br /> ${buildout:bin-directory}/supervisorctl stop solr<br /> cp -v ${buildout:directory}/solr-conf/schema.xml ${solr:schema-destination}<br /> ${buildout:bin-directory}/solr-instance start<br /> COUNT=15; echo "Waiting $COUNT s"; sleep $COUNT<br /> ${buildout:bin-directory}/solr-instance purge<br /> time ${buildout:bin-directory}/${django:control-script} solr --reindex --batch-size 100<br /> ${buildout:bin-directory}/solr-instance stop<br /> ${buildout:bin-directory}/supervisorctl start solr<br /></pre>Actually I could have made a shell script template with collective.recipe.template, and I'll probably switch to that solution; I made this quickly and didn't yet know about the possibilities of the template recipe. Right now, to rebuild the solr index I have to type:<br /><pre>$ bin/buildout install solr-rebuild</pre>Note that the solr-rebuild part is not listed in buildout:parts, because I don't want to run it by default.</li></ul><h2>Varnish</h2>Nothing really advanced here. I have just customized the varnish configuration to change a few things, and to add a ping url (important for supervisord).<br /><pre>[varnish-build]<br />recipe = zc.recipe.cmmi<br />url = http://downloads.sourceforge.net/varnish/varnish-2.0.4.tar.gz<br /><br />[varnish]<br />recipe = plone.recipe.varnish<br />daemon = ${varnish-build:location}/sbin/varnishd<br />bind = 127.0.0.1:3128<br />config = ${buildout:directory}/varnish-conf/varnish.vcl<br />telnet = localhost:8888<br />cache-size = 1G<br /><br /># foreground is needed for supervisor to control varnish correctly<br />mode = foreground<br /></pre>How to add a ping url? 
in varnish.vcl, at the beginning of vcl_recv:<br /><pre># This url will always reply 200 whenever varnish is running<br />if (req.request == "GET" && req.url ~ "/varnish-ping") {<br />    error 200 "OK";<br />}<br /></pre>For this I must admit I made a (very) quick search on the net; if anyone has a better solution please let me know!<br /><h2>Supervisor</h2><pre>[supervisor]<br />recipe = collective.recipe.supervisor<br />port = localhost:9001<br />user = admin<br />password = admin<br />plugins =<br /> superlance<br /><br /># solr security settings: see<br /># http://docs.codehaus.org/display/JETTY/Connectors+slow+to+startup<br />programs =<br /> 10 varnish (startsecs=10) ${buildout:directory}/bin/varnish true<br /> 20 solr (startsecs=10) java [-Djava.security.egd=file:/dev/urandom -jar start.jar] ${buildout:parts-directory}/solr true<br /><br />eventlisteners =<br /> SolrHttpOk TICK_60 ${buildout:bin-directory}/httpok [-p solr -t 20 http://localhost:8983/solr/]<br /> VarnishHttpOk TICK_60 ${buildout:bin-directory}/httpok [-p varnish -t 20 http://localhost:3128/varnish-ping]<br /></pre>For the programs I set "startsecs" to 10 seconds. This tells supervisor to wait 10 seconds before considering that the program is properly running. This is important if your services take a bit of time before serving properly: if an event listener runs and finds a failure, it may ask supervisor to restart the service again (i.e. before the service could ever complete its startup).<br /><br />Solr is not started with "bin/solr-instance fg", mainly because I needed to pass an additional parameter (without it solr startup time was very long, from 1 to 5 min...).<br /><br />The event listeners are configured to check varnish and solr every minute. 
They order to restart them if they fail to answer.<br /><h3>Supervisor Init script for Debian</h3><pre>[supervisord_init_script]<br />recipe = collective.recipe.template<br />input = templates/supervisord_init.in<br />output = ${buildout:bin-directory}/supervisord_rc<br /></pre>For making "templates/supervisord_init.in" I copied /etc/init.d/skeleton and edited it. Important: do "chmod +x templates/supervisord_init.in", the permission will be reported on the generated file. Here is the diff:<br /><br /><pre>--- /etc/init.d/skeleton 2009-03-31 11:01:55.000000000 +0200<br />+++ templates/supervisord_init.in 2009-05-26 16:45:24.000000000 +0200<br />@@ -1,31 +1,31 @@<br />#! /bin/sh<br />### BEGIN INIT INFO<br />-# Provides: skeleton<br />+# Provides: supervisord<br /># Required-Start: $remote_fs<br /># Required-Stop: $remote_fs<br /># Default-Start: 2 3 4 5<br /># Default-Stop: 0 1 6<br />-# Short-Description: Example initscript<br />+# Short-Description: initscript for supervisord at ${buildout:bin-directory}<br /># Description: This file should be used to construct scripts to be<br /># placed in /etc/init.d.<br />### END INIT INFO<br /><br />-# Author: Foo Bar <foobar@baz.org><br />+# Author: Bertrand Mathieu <your_email@provider.tld><br />#<br />-# Please remove the "Author" lines above and replace them<br />-# with your own name if you copy and modify this script.<br />-<br /># Do NOT "set -e"<br /><br /># PATH should only include /usr/* if it runs after the mountnfs.sh script<br />PATH=/sbin:/usr/sbin:/bin:/usr/bin<br />-DESC="Description of the service"<br />-NAME=daemonexecutablename<br />-DAEMON=/usr/sbin/$NAME<br />-DAEMON_ARGS="--options args"<br />-PIDFILE=/var/run/$NAME.pid<br />+DESC="Start/Stop supervisord at ${buildout:bin-directory}"<br />+NAME=supervisord<br />+DAEMON=${buildout:bin-directory}/$NAME<br />+DAEMON_ARGS=""<br />+PIDFILE=${buildout:directory}/var/$NAME.pid<br />SCRIPTNAME=/etc/init.d/$NAME<br /><br />+# file owner will be used to run 
daemon<br />+OWNER=$(stat -c %U $DAEMON)<br />+<br /># Exit if the package is not installed<br />[ -x "$DAEMON" ] || exit 0<br /><br />@@ -48,9 +48,9 @@<br /># 0 if daemon has been started<br /># 1 if daemon was already running<br /># 2 if daemon could not be started<br />- start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --test > /dev/null \<br />+ start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --chuid $OWNER --test > /dev/null \<br /> || return 1<br />- start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON -- \<br />+ start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON --chuid $OWNER -- \<br /> $DAEMON_ARGS \<br /> || return 2<br /># Add code here, if necessary, that waits for the process to be ready<br />@@ -68,7 +68,7 @@<br /># 1 if daemon was already stopped<br /># 2 if daemon could not be stopped<br /># other if a failure occurred<br />- start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $PIDFILE --name $NAME<br />+ start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $PIDFILE --chuid $OWNER --name $NAME<br />RETVAL="$?"<br />[ "$RETVAL" = 2 ] && return 2<br /># Wait for children to finish too if this is a daemon that forks<br />@@ -77,7 +77,7 @@<br /># that waits for the process to drop all resources that could be<br /># needed by services started subsequently. A last resort is to<br /># sleep for some time.<br />- start-stop-daemon --stop --quiet --oknodo --retry=0/30/KILL/5 --exec $DAEMON<br />+ start-stop-daemon --stop --quiet --oknodo --retry=0/30/KILL/5 --chuid $OWNER --exec $DAEMON<br />[ "$?" 
= 2 ] && return 2<br /># Many daemons don't delete their pidfiles when they exit.<br />rm -f $PIDFILE<br />@@ -93,7 +93,7 @@<br /># restarting (for example, when it is sent a SIGHUP),<br /># then implement that here.<br />#<br />- start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --name $NAME<br />+ start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE --chuid $OWNER --name $NAME<br />return 0<br />}<br /></your_email@provider.tld></foobar@baz.org></pre>Notes:<br /><ul><li>the daemon is run by the owner of bin/supervisord. Most of the time it is the user who ran buildout (hopefully not root!)</li><li>I have used a bash construct to get the owner ("OWNER=$(stat -c %U $DAEMON)"), this could be changed to pure sh<br /></li><li>thus bin/supervisord_rc (start | stop) can be run by this user, without the need for "sudo". Without this, "solr-rebuild" could not work.<br /></li></ul>To install it in init.d:<br /><pre>$ cd /etc/init.d<br />$ sudo ln -s /path/to/buildout/bin/supervisord_rc my_preferred_service_name<br />$ sudo update-rc.d my_preferred_service_name defaults<br /></pre>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com10tag:blogger.com,1999:blog-3223595804002285001.post-62742202158971958922009-05-25T17:30:00.007+02:002009-05-25T18:23:38.646+02:00django-cachepurge 0.1aI have released django-cachepurge 0.1a. It is available as an egg for easy installation with buildout or virtualenv+easy_install, for example.<br /><br />This package allows django to purge an HTTP cache when a model instance is changed or deleted. It does this by sending asynchronous "PURGE" requests to one or more upstream HTTP caches (such as Squid or Varnish). 
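On the wire such a purge is just an HTTP request with a PURGE method. Here is a minimal self-contained sketch (not the package's actual code); the toy in-process server stands in for a cache like Squid or Varnish:

```python
# Illustration only: send an HTTP "PURGE" request, as a cache would receive it.
# The FakeCache server below is a stand-in for Squid/Varnish.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

purged = []

class FakeCache(BaseHTTPRequestHandler):
    def do_PURGE(self):              # a real cache would invalidate self.path
        purged.append(self.path)
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):    # keep the demo quiet
        pass

def purge(host, port, path):
    conn = http.client.HTTPConnection(host, port)
    conn.request("PURGE", path)
    status = conn.getresponse().status
    conn.close()
    return status

server = HTTPServer(("127.0.0.1", 0), FakeCache)
t = threading.Thread(target=server.handle_request)  # serve exactly one request
t.start()
status = purge("127.0.0.1", server.server_port, "/some/page/")
t.join()
server.server_close()
print(status, purged)  # prints: 200 ['/some/page/']
```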
It is inspired by <a href="http://plone.org/products/cachefu">Plone CacheFu components</a> (more specifically: CMFSquidTool).<br /><br />Unfortunately Django does not have a "post_commit" signal (it would be the best place to do such a job), so purge requests are sent once the response has been computed: if an exception occurs while building the response, the urls are not purged. This is done by the middleware.<br /><br />Pre-requisite: the cache must be configured to accept and handle "PURGE" requests from the server where the django application is hosted.<br /><h2>Configuration on the django side:</h2><ol><li>The application must be the first app declared in settings.INSTALLED_APPS. The reason is that it listens to the <a href="http://docs.djangoproject.com/en/dev/ref/signals/#class-prepared">class_prepared</a> signal to connect <a href="http://docs.djangoproject.com/en/dev/ref/signals/#post-save">post_save</a> and <a href="http://docs.djangoproject.com/en/dev/ref/signals/#post-delete">post_delete</a> handlers on eligible models (more on that below). If you put other apps before django-cachepurge it may miss their models. Note that the package name uses an underscore.<br /><code>INSTALLED_APPS = (<br />    'django_cachepurge',<br />    ...<br />)<br /></code><br /></li><li>add "django_cachepurge.middleware.CachePurge" to settings.MIDDLEWARE_CLASSES</li><li>set settings.CACHE_URLS to the cache root for django. CACHE_URLS can be a single string or an iterable of strings. For example:<br /><code>CACHE_URLS = 'http://127.0.0.1:3128'</code><br /></li></ol><h2>How are urls found?</h2>If the model has a get_absolute_url method, this url will be purged. Additionally you can define "get_purged_urls": it should return a list of urls. This is useful, for example, for "through" models used in M2M relations to invalidate the urls of linked contents. 
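For illustration, a hypothetical model exposing both hooks (the class, slug and urls are invented for the example; no django import is needed to show the idea):

```python
# Hypothetical model showing the two hooks django-cachepurge looks for.
class Entry:
    slug = "hello"

    def get_absolute_url(self):
        return "/blog/%s/" % self.slug

    def get_purged_urls(self):
        # also invalidate the listing page that displays this entry
        return [self.get_absolute_url(), "/blog/"]

print(Entry().get_purged_urls())  # prints: ['/blog/hello/', '/blog/']
```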
If the model has none of these methods, nothing happens (the signals are not connected).<br /><br />Pypi: <a href="http://pypi.python.org/pypi/django-cachepurge/">http://pypi.python.org/pypi/django-cachepurge/</a><br />Launchpad: <a href="http://launchpad.net/django-cachepurge/">http://launchpad.net/django-cachepurge/</a>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com3tag:blogger.com,1999:blog-3223595804002285001.post-70364584738139450392009-02-10T14:29:00.007+01:002009-02-10T14:54:08.699+01:00Defining and accessing macros located in browser:page templateThe case: I want to define a simple browser page (let's name it "mypage") for a page template where I defined some metal macros (in my case it is a template for an archetypes field). My product does not provide a skin for portal_skins, and I don't want to add a layer and all the generic setup stuff just for a single template. I'm using plone 3.1.<br /><br />The problem: @@mypage/macros does not work as legacy portal_skins page templates used to.<br /><br />Solution: define a simple class like this:<br /><pre>from Products.Five import BrowserView<br /><br />class MacrosView(BrowserView):<br /><br />    @property<br />    def macros(self):<br />        return self.index.macros<br /></pre>The ZCML for "mypage":<br /><pre><browser:page<br />    for="*"<br />    name="mypage"<br />    class=".macros.MacrosView"<br />    template="mypage.pt"<br />    allowed_attributes="macros"<br />    permission="zope.Public" /><br /></pre>There could be a better, less verbose solution (like providing a meta definition for zcml, in order to avoid declaring "class" and "allowed_attributes"). We could also patch Five's BrowserView.<br /><br />In my case I have been able to use mypage as a template for my archetypes widget:<br /><pre>MyWidget(macro="@@mypage",)</pre>Compared to legacy PT you will lose some builtins (like python: test()), but that kind of logic should be (easily) moved into a dedicated view class. 
This is especially noticeable when you are customizing an old template (like archetypes/widgets/file.pt ;-))<br /><br />Dunno if it is the "right way of doing things", at least it worked-for-me(tm).Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2tag:blogger.com,1999:blog-3223595804002285001.post-77363400070073389432009-01-21T11:30:00.006+01:002009-01-21T16:53:10.030+01:00Useful script in a plone developer toolboxSometimes something weird happens on the production site and you have to investigate data from that site, because you can't reproduce the problem on the development site. When it's really hard you have to copy the Data.fs and run a separate instance to work on it. What I'm putting here is a script that changes all users' passwords to their ids, and also changes the email property: this allows you to log in as anybody easily, and no mail can be sent to the actual users. All you have to do is create a "Script (Python)" in the ZMI at the portal root, put this code in and click "test". It's not a revolution, it's not-so-good-practice(tm), it's just a convenience ;-)<br /><pre>mtool = context.portal_membership.aq_inner<br />pu = context.plone_utils.aq_inner<br />acl = context.acl_users.aq_inner<br />count = 0<br /><br />for uid in acl.getUserIds():<br />    count += 1<br />    acl.userSetPassword(uid, uid)<br />    member = mtool.getMemberById(uid)<br />    pu.setMemberProperties(member, email='me.the.developer@mydomain.tld')<br />    print uid<br /><br />print<br />print count, "users"<br />return printed<br /></pre>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com3tag:blogger.com,1999:blog-3223595804002285001.post-72613830718896145042009-01-02T15:00:00.010+01:002009-01-02T15:51:00.121+01:00Profiling made easyRecently I had to profile some pages on a Plone 2.5 (zope 2.9). 
I collected some data on interesting pages with the help of the well-known <a href="http://www.dieter.handshake.de/pyprojects/zope/#bct_sec_4.8">ZopeProfiler 1.7.2</a> but I had to patch it to avoid <a href="http://plone.org/support/region/de#nabble-td1344811">an error</a>:<pre>--- ZopeProfiler.py~ 2007-06-26 10:43:25.000000000 +0200<br />+++ ZopeProfiler.py 2008-12-22 18:02:26.000000000 +0100<br />@@ -393,10 +393,10 @@<br /># Five broke 'getPhysicalPath' for its view classes -- work around<br />try: p= gP()<br />except:<br />- _log.error("calling 'getPhysicalPath' failed for %r", s,<br />- exc_info=sys.exc_info()<br />- )<br />- return<br />+ # _log.error("calling 'getPhysicalPath' failed for %r", s,<br />+ # exc_info=sys.exc_info()<br />+ # )<br />+ return ('?', _Empty, fn)<br />if type(p) is StringType: fi= p<br />else: fi= '/'.join(p)<br />return (fi,_Empty,fn)<br /></pre>A few years ago we had no option other than digging into the raw stats as they come from Stats.sort_stats().print_stats(). Since then <a href="http://tarekziade.wordpress.com/2008/08/25/visual-profiling-with-nose-and-gprof2dot/">there is a new tool</a>: <a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot">Gprof2dot</a>. 
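Outside Zope, the .pstats files that gprof2dot consumes are ordinary cProfile dumps; a minimal sketch ("some_page" is an invented stand-in for the code you want to profile):

```python
# Produce a .pstats file like the ones fed to gprof2dot.
# "some_page" is a made-up stand-in for the code being profiled.
import cProfile
import pstats

def some_page():
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
some_page()
pr.disable()
pr.dump_stats("some_page.pstats")   # binary stats file, readable by gprof2dot

stats = pstats.Stats("some_page.pstats")
stats.sort_stats("cumulative")      # same ordering as Stats.sort_stats() above
```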
The author also made something more than handy: <a href="http://code.google.com/p/jrfonseca/wiki/XDot">xdot.py</a>.<br /><br />Now just add a little bash function:<br /><pre>$ function build_dot() { ./gprof2dot.py -f pstats -o $(basename $1 .pstats).dot $1; }</pre>Then my workflow for profiling some pages could be faster and easier:<br /><ol><li>Profile a page, and save "some_page.pstats"</li><li>run "build_dot some_page.pstats"</li><li>run "./xdot.py some_page.dot"</li><li>visit the graph<br /></li></ol>Here is the first overview:<br /><div style="text-align: left;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXm2UHcv4oiUGAqe_Hx-w2tPDxmmfkjVJg1lNpP7oCfTryb2ir4C8__1t24s9kHz_DSWZZ-j8iTCrS6Pkqc6h7SBqziX9073YM3HtKgKfsP4Dk3QmeDmD8feOgifWngJqSAMo-wMGijD4s/s1600-h/graph-stats-overview.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 148px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXm2UHcv4oiUGAqe_Hx-w2tPDxmmfkjVJg1lNpP7oCfTryb2ir4C8__1t24s9kHz_DSWZZ-j8iTCrS6Pkqc6h7SBqziX9073YM3HtKgKfsP4Dk3QmeDmD8feOgifWngJqSAMo-wMGijD4s/s320/graph-stats-overview.png" alt="" id="BLOGGER_PHOTO_ID_5286702966839475538" border="0" /></a>The mouse wheel allows to zoom in/out, holding left-click and moving the mouse will move the graph. 
It's quite easy to quickly find some hotspots, sometimes they will appear very obviously:<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHk9qOd7PCWFcK-kjqhmrt2RtewgAsYbQOQ0BKPsM0lQQiNW5wUYTqSQOXwBr67ZAWM4QW_IByFPjfomDkZlnItywxQ0vdDKNgrz3McSAW2AkdihT4bFPE8Gives93BTVCOTNuwqySwwa4/s1600-h/schema-copy-hotspot.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 141px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHk9qOd7PCWFcK-kjqhmrt2RtewgAsYbQOQ0BKPsM0lQQiNW5wUYTqSQOXwBr67ZAWM4QW_IByFPjfomDkZlnItywxQ0vdDKNgrz3McSAW2AkdihT4bFPE8Gives93BTVCOTNuwqySwwa4/s320/schema-copy-hotspot.png" alt="" id="BLOGGER_PHOTO_ID_5286704136306417026" border="0" /></a><br /></div>I can read: 69% of total time spent in schema copy. In this particular case I know there is just one object with a "Schema" method, so probably it would be a good idea to review the code here to reduce the number of schema copies, or thinking about adding some cache decorator if it's possible (like plone.memoize). The graph does not tell what to do, though ;-)<br /><br />Another interesting hotspot (in plone 2.5): for some pages up to 15% of the time in spent in... getAllowedTypes (just 1 call - nearly 11% in pythonproducts.py __bobo_traverse__).Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com5tag:blogger.com,1999:blog-3223595804002285001.post-20935725005381854362009-01-02T11:52:00.008+01:002009-01-02T12:14:32.740+01:00GenericSetup and dependence on circular dependencies problemAs of Plone 3.1.6 there is a problem with import step dependencies: if you register a custom import step through zcml, and if this step depends on "portlets", "content" or "plone-final", then your import step will be inserted <span style="font-weight: bold;">before</span> its dependencies. This is because local steps (i.e. 
defined in an import_steps.xml file) are listed after the ZCML ones, and in its final loop the GS ordering method will insert the remaining steps as they come.<br /><br />The big problem is when you must execute "mysite-final" after "portlets", for example.<br /><br />There is a <a href="http://dev.plone.org/plone/ticket/8350">related ticket</a> on plone.org; I have added a comment with a patch (and tests) for GS to deal better with this kind of dependency. It may be useful <span style="font-weight: bold;">now</span> for someone. This ticket may not be the best place to put that, but sadly I really don't have the time to discuss it on the right mailing list.<br />Here is the idea: basically the final loop is modified to first insert any step involved in a circular chain, and then it tries to insert the remaining ones with dependency resolution. Thus "mysite-final" will always be inserted after "portlets".Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com0tag:blogger.com,1999:blog-3223595804002285001.post-59547497514116720542008-12-04T12:59:00.007+01:002008-12-04T13:54:18.602+01:00Managing Apache virtualhosts and squid with iw.recipe.squidI have a number of plone sites on a server; generally one site is associated with one zope instance. I am using buildout and <a href="http://pypi.python.org/pypi/iw.recipe.squid">iw.recipe.squid</a> to easily add a new virtualhost with squid proxying. In my case the server is running Ubuntu server 8.04, but this should apply without change to a Debian server. 
This may look overkill for just one or two sites, but in my case I may have up to 30 different sites (of course, not all zope instances are on localhost!).<br /><br />Let's suppose I have these sites:<br /><ul><li>www.site1.net, zope instance on localhost (127.0.0.1) port 8080, plone path located at /site1/portal<br /></li><li>www.someothersite.com, zope instance on 10.2.0.5 port 9080, plone path /somepath/portal</li></ul>buildout.cfg:<br /><br /><pre>[buildout]<br />parts = squid<br />versions = versions<br /><br />[versions]<br />iw.recipe.squid = 0.9<br /><br />[squid]<br />recipe = iw.recipe.squid<br />squid_owner = proxy<br />squid_visible_hostname = myservername<br />squid_cache_dir = /var/cache/squid<br />squid_log_dir = /var/log/squid<br /><br />squid_accelerated_hosts =<br /> www.site1.net: 127.0.0.1:8080/site1/portal<br /> www.someothersite.com: 10.2.0.5:9080/somepath/portal<br /></pre>After running buildout for the first time:<br /><ul><li>make a symbolic link "/etc/squid.conf" pointing to parts/squid/etc/squid.conf</li><li>run "bin/squidctl createswap" if required</li><li>check that squid starts normally and that the helper processes are running, too (iRedirector.py, squidAcl.py, squidRewriteRules.py)<br /></li></ul>After having added one or more sites:<br /><ul><li>in /etc/apache2/sites-available create a symbolic link for all config files located in parts/squid/apache. 
In our case they should be named "vhost_www.site1.net_80.conf", etc.</li><li>run a2ensite to activate the sites ("a2ensite vhost_www.site1.net_80.conf")</li><li>reload apache<br /></li></ul>Adding a new site will be just a matter of adding a new line in squid_accelerated_hosts, running buildout, making the symbolic links and reloading apache.<br /><br />As of <a href="http://pypi.python.org/pypi/iw.recipe.squid">iw.recipe.squid</a> 0.9, the squid and apache logs reside in the same directory, but the next release should allow different directories; it should also allow you to set the "combined" rather than "common" log format for apache.<br /><br />Of course do not forget to properly configure CacheSetup on every plone site.Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2tag:blogger.com,1999:blog-3223595804002285001.post-79447405447275982882008-10-16T18:21:00.006+02:002008-10-16T18:56:02.717+02:00How to change the add permission of another product's contentIn a customer project I have had to use CalendarX. CalendarX uses "Add portal content" as its content add permission. But the customer wanted only "Manager" to be able to add a calendar. I could have taken the straightforward route and overridden the "getNotAddableTypes" script in portal_skins, but I chose to try to patch the product from mine (let's call it "my.product").<br /><br />I first tried to import CalendarX and change the permission name with a simple monkey patch, but I immediately had to dig into the zope initialization process: "my.product" is imported after Products.CalendarX, because "my.product" gets loaded by the Products.Five initialisation, and Five is loaded after CalendarX... so my patch arrives too late in the init process. What to do? Try harder! 
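The load-order problem can be reproduced outside Zope; here is a self-contained toy (all names invented, not the actual CalendarX code) showing why a patch applied after the other product has initialised changes nothing it already registered:

```python
# Toy reproduction of the load-order problem (names invented): the "product"
# copies the permission constant at initialisation time, so a monkey patch
# applied afterwards cannot affect what was already registered.
import types

calendarx = types.ModuleType("calendarx")
calendarx.DEFAULT_ADD_CONTENT_PERMISSION = "Add portal content"

registered_permissions = []

def initialize_calendarx():
    # simulates CalendarX registering its content types at product load
    registered_permissions.append(calendarx.DEFAULT_ADD_CONTENT_PERMISSION)

initialize_calendarx()  # CalendarX is loaded before my.product...

# ...so by the time my.product patches the constant, it is too late:
calendarx.DEFAULT_ADD_CONTENT_PERMISSION = "My: Add CalendarX content"
print(registered_permissions)  # prints: ['Add portal content']
```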
I found I could reload the product after having patched the permission.<br /><br />Honestly I don't know if it can raise issues, and I would prefer something that looks less "hacky", but...<br />So, here is the guilty code!<br /><pre># this is my/product/__init__.py<br />import logging<br /><br />from Products.CMFCore.permissions import setDefaultRoles<br /><br />LOG = logging.getLogger(__name__)<br />PROJECT_NAME = "my.product"  # usually imported from your config module<br /><br /><br />def initialize(context):<br />    """Initializer called when used as a Zope 2 product."""<br /><br />    from Products import CalendarX<br />    cxf_perm = "%s: Add CalendarX content" % (PROJECT_NAME,)<br />    setDefaultRoles(cxf_perm, ('Manager',))<br />    CalendarX.DEFAULT_ADD_CONTENT_PERMISSION = cxf_perm<br />    CalendarX.config.DEFAULT_ADD_CONTENT_PERMISSION = cxf_perm<br /><br />    # also patch this, else it will try to register the profile twice<br />    # and the registry will complain<br />    from Products.GenericSetup.registry import ProfileRegistry<br />    dummy_registry = ProfileRegistry()<br />    std_registry = CalendarX.profile_registry<br />    CalendarX.profile_registry = dummy_registry<br /><br />    # FIXME: is there anything cleaner to get app?<br />    app = context._ProductContext__app<br />    from OFS import Application<br />    Application.reinstall_product(app, 'CalendarX')<br />    CalendarX.profile_registry = std_registry<br />    LOG.info("patched CalendarX.DEFAULT_ADD_CONTENT_PERMISSION: use '%s'"<br />             % cxf_perm)<br /></pre>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2tag:blogger.com,1999:blog-3223595804002285001.post-72119201882607075422008-09-22T11:08:00.004+02:002008-09-22T12:06:34.316+02:00iw.eggproxy 0.2.0We have released <a href="http://pypi.python.org/pypi/iw.eggproxy">iw.eggproxy 0.2.0</a>.
Bugs fixed:<br /><ul><li>package index/download files: skip modules installed in the local system (this resulted<br /> in copying a directory instead of downloading a file)</li><li>the update script crashed on an invalid/obsolete package name</li><li>get egg distributions for all versions/platforms, instead of only the system's ones<br /></li><li>malformed tag in generated indexes</li></ul>We are going to rename it to <span style="font-style: italic;">collective.eggproxy</span>. The next release will provide a standalone server and (hopefully) a WSGI application. This should make it easier to get started with it.<br /><br />A sprint occurred this summer on the topic, a few months after iw.eggproxy was released, and some people chose to create a full mirroring tool called <a href="http://pypi.python.org/pypi/z3c.pypimirror">z3c.pypimirror</a>.<br />Here are some differences with <a href="http://pypi.python.org/pypi/z3c.pypimirror">z3c.pypimirror</a>:<br /><ul><li><span style="font-style: italic;">eggproxy</span> relies on setuptools. It sees what "easy_install" can see: no more, no less. It also does not include its own machinery to read pypi indexes and follow links.<br /></li><li><span style="font-style: italic;">eggproxy</span> provides on demand any egg available at pypi. You don't need to know in advance what packages you will need, and you don't need to download them beforehand: just ask for them as if you were directly on the pypi server</li><li>OTOH if the server has network problems reaching pypi, <span style="font-style: italic;">eggproxy</span> may not be able to serve an egg, where <span style="font-style: italic;">z3c.pypimirror</span> would have already downloaded it.
This case happens only if <span style="font-style: italic;">eggproxy</span> has not already served the egg once.</li><li><span style="font-style: italic;">z3c.pypimirror</span> creates a static directory layout suitable for any HTTP server; because of its nature <span style="font-style: italic;">eggproxy</span> needs <span style="font-style: italic;">apache</span> and<span style="font-style: italic;"> mod_python</span> (the next release should relax this strict requirement by providing a standalone service and/or a WSGI application).</li></ul>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com0tag:blogger.com,1999:blog-3223595804002285001.post-37522640555718033672008-06-24T18:39:00.004+02:002008-06-24T18:53:18.287+02:00Five 1.5.6 testbrowser and recent versions of mechanizeDoing development with Plone 3.1, I encountered a KeyError when I tried to use Five.testbrowser.Browser:<br /><pre>Browser()<br />KeyError: '_seek'</pre>I eventually found that my system (currently Ubuntu 8.04) had the package "python-mechanize" 0.1.7b, which masks the one provided by Zope 2.10 (0.1.2b). Unfortunately there have been API changes between 0.1.2b and 0.1.7b (see the classes UserAgent/UserAgentBase).<br /><br />The solution is to tell buildout to get the mechanize 0.1.2b egg: this way it will mask the system library.<br /><pre>[buildout]<br />versions = versions<br />eggs =<br /> ...<br /> mechanize<br /><br />[versions]<br />mechanize = 0.1.2b<br /></pre>Et voila!Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2tag:blogger.com,1999:blog-3223595804002285001.post-18890364643130571082008-06-09T17:41:00.002+02:002008-06-09T18:02:25.902+02:00Buildout, ploneldap and ldap products (2)In my <a href="http://zebert.blogspot.com/2008/04/buildout-ploneldap-and-ldap-products.html">previous post</a> I described how to get a buildout part installing PloneLDAP with an up-to-date LDAPUserFolder.
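Once products are installed as eggs, it is easy to verify what setuptools actually picked up (useful both for the mechanize pin above and for LDAP product upgrades). A minimal, hedged sketch; `pkg_resources` ships with setuptools, the helper name is mine, and it should be run with the buildout's interpreter:

```python
import pkg_resources


def active_version(project_name):
    """Return the version setuptools resolves for a distribution, or None."""
    try:
        return pkg_resources.get_distribution(project_name).version
    except pkg_resources.DistributionNotFound:
        return None


# e.g. active_version("mechanize") or active_version("Products.LDAPUserFolder")
```

If the returned version is not the one pinned in buildout.cfg, a system-wide package is probably still masking the egg.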
Things are even simpler now since all the involved products have been released as eggs. So just list these in your buildout along with your other eggs:<br /><br />eggs =<br /> ...<br /> <a href="http://pypi.python.org/pypi/Products.LDAPUserFolder/">Products.LDAPUserFolder</a><br /> <a href="http://pypi.python.org/pypi/Products.LDAPMultiPlugins/">Products.LDAPMultiPlugins</a><br /> <a href="http://pypi.python.org/pypi/Products.PloneLDAP/">Products.PloneLDAP</a><br /><br />Second important thing: LDAPUserFolder 2.9 has been released. If you used 2.9-beta you must upgrade, since an important bug related to the negative cache has been fixed. PloneLDAP documents usage with LDAPUserFolder 2.8, but 2.9 is definitely worth it: the negative cache feature avoids doing too many ldap requests, especially useless/unsuccessful ones! Many thanks to <span>Jens Vagelpohl (<a href="http://www.dataflake.org/software/ldapuserfolder">http://www.dataflake.org/software/ldapuserfolder</a>).<br /></span>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com1tag:blogger.com,1999:blog-3223595804002285001.post-64628138459145632872008-06-09T15:53:00.003+02:002008-06-09T16:11:24.356+02:00Quick workflow migration from plone 2.x to plone 3When migrating a site from plone 2.x to plone 3, I often have to port customized workflows. Back in the plone 2.0 days we didn't have generic setup, and workflows were created with python code (with the help of DCWorkflowDump), or created directly in the ZMI and installed on the final site by importing a zexp.<br /><br />Nowadays we want to use generic setup (GS).<br /><br />The simplest way I have found is to export the workflow as a zexp, import this zexp into the new site, and then make a GS export. Done!
You've got a nice XML definition of your workflow, ready to be included in your product's GS profile.<br /><br />I once encountered a caveat: if the original workflow contains strings that are not plain ascii (in titles, etc...), the GS export will fail. You'll have to relabel properly everywhere, and add all those labels (msgids) to your translation files (I think i18ndude does this for you).Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com0tag:blogger.com,1999:blog-3223595804002285001.post-57882100901882523192008-06-06T16:36:00.004+02:002008-06-06T16:45:59.397+02:00iw.eggproxy: a proxy for pypiToday we have released <a href="http://pypi.python.org/pypi/iw.eggproxy">iw.eggproxy</a>. It's a module for apache mod_python. Its purpose is to serve as a <a href="http://pypi.python.org/">pypi</a> proxy.<br /><br />The first motivation for making it was that we had to work in a private network with a very, very slow internet access: for example, updating a plone buildout could take more than one hour (just checking egg freshness) when it should take no more than a few minutes. This condition also prevented running any kind of rsync against pypi. So the obvious solution was to proxy the eggs we need, on demand.<br /><br />The module is installed as a handler on a <a href="http://httpd.apache.org/docs/2.2/mod/core.html#location">Location</a>.
When accessing this location, eggproxy will serve an index similar to pypi's simple view.<br /><br />This is done like this:<br /><ul><li>we already have the information in index.html: just let apache serve the file</li><li>or, we use setuptools to fetch the index information, build index.html, and let apache serve the file<br /></li></ul>This allows eggproxy to serve all available eggs from pypi, without actually having to download the whole pypi content.<br /><br />Then easy_install can see a package is available on our server, and tries to fetch information on the available eggs:<br /><ul><li>the subdirectory and its index.html already exist: just let apache serve the file</li><li>or, we use setuptools again to get the package information, make the subdirectory (package name) and build index.html<br /></li></ul><br />Finally, when trying to fetch an egg we do the same:<br /><ul><li>the file is already present and is served by apache</li><li>or we use setuptools to get the file from the upstream server<br /></li></ul>iw.eggproxy also provides an update script: "eggproxy_update". This script refreshes the main index and all proxied eggs whose index.html is older than the interval specified in the configuration file (24h by default).<br /><br />We have installed it here: <a href="http://release.ingeniweb.com/pypi.python.org-mirror">http://release.ingeniweb.com/pypi.python.org-mirror</a><br /><br />Known bugs: some packages on pypi don't have eggs, in which case eggproxy does not respond. This is the case with "reportlab" for example.<br /><br />Enhancements:<br /><ul><li>index aggregation. At ingeniweb we plan to install it on a local server and aggregate pypi and some private egg indexes.</li><li>Standalone/pluggable server.
Currently we are bound to apache + mod_python, which may not suit everyone.<br /></li></ul>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com1tag:blogger.com,1999:blog-3223595804002285001.post-60920462634225513572008-04-22T14:36:00.004+02:002008-06-09T18:03:38.732+02:00Buildout, ploneldap and ldap products<span style="font-weight: bold;">UPDATE: </span>all products are available as eggs. See <a href="http://zebert.blogspot.com/2008/06/buildout-ploneldap-and-ldap-products-2.html">this post</a>.<br /><br /><a href="http://plone.org/products/ploneldap">PloneLDAP</a> is a great product, but its bundle ships a not-so-recent version of <a href="http://www.dataflake.org/software/ldapuserfolder">LDAPUserFolder</a>. The <a href="http://www.dataflake.org/software/ldapuserfolder">LDAPUserFolder</a> and <a href="http://www.dataflake.org/software/ldapuserfolder">LDAPMultiPlugins</a> download urls both end with "/download". That seems to confuse plone.recipe.distros, because it installs just one of them. Furthermore, at plone.org the PloneLDAP single product (as opposed to the PloneLDAP bundle) download url ends with... a space! (%20).<br />So we have uploaded these tarballs to release.ingeniweb.com.
Here is a working part for buildout:<br /><pre><br />[ldap_products]<br />recipe = plone.recipe.distros<br />urls =<br />http://release.ingeniweb.com/third-party-dist/LDAPUserFolder-2.9-beta.tgz<br />http://release.ingeniweb.com/third-party-dist/LDAPMultiPlugins-1.5.tgz<br />http://release.ingeniweb.com/third-party-dist/PloneLDAP-1.0.tar.gz<br /></pre>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com4tag:blogger.com,1999:blog-3223595804002285001.post-44920728440250980202008-02-11T14:56:00.000+01:002008-02-11T15:13:13.241+01:00iw.rejectanonymous: private site with plone 3.0We have made a small package providing the functionality described <a href="http://zebert.blogspot.com/2008/01/making-private-site-with-plone-30.html">in my previous post</a>. It is named "<a href="http://pypi.python.org/pypi/iw.rejectanonymous">iw.rejectanonymous</a>".<br /><br />Quick recipe to use it from an integration product (i.e. a product responsible for setting up a plone site for your particular environment/customer/...):<br /><br /><ul><li>Add in configure.zcml:<pre><include package="iw.rejectanonymous" /></pre></li><br /><li>Add python code to activate it for your site.
This is probably done in a function called by generic setup, often located in setuphandlers.py:<br /><pre>from zope.interface import alsoProvides<br />from iw.rejectanonymous import IPrivateSite<br /><br />def setupPortal(portal):<br />    if not IPrivateSite.providedBy(portal):<br />        alsoProvides(portal, IPrivateSite)<br /></pre></li></ul>The second step can also be done through the ZMI with the "Interfaces" tab.Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com0tag:blogger.com,1999:blog-3223595804002285001.post-67143697894532255402008-01-31T16:04:00.001+01:002008-04-13T13:02:29.407+02:00Making a private site with plone 3.0<span style="font-weight: bold;">UPDATE</span>: we have made a product for this: <a href="http://zebert.blogspot.com/2008/02/iwrejectanonymous-private-site-with.html">iw.rejectanonymous</a><br /><br />There is a <a href="http://plone.org/documentation/how-to/creating-private-plone-site">document at plone.org</a> suggesting the use of plone 3.0's builtin "intranet" workflows; however, this will not make a site absolutely private, i.e. force users to log in before they can view anything. This is the use case for an extranet, for example.<br /><br />In the past we used to put something like tal:define="dummy here/rejectAnonymous" in global_defines.pt, where rejectAnonymous was a skin script. Now with the help of events we can do far better, and it will work for any content/object within a plone site. As a consequence we must be careful about what is allowed to be retrieved anonymously, since anonymous users must still be able to see a themed login page.<br /><br />The idea has been taken from <a href="http://svn.zope.de/plone.org/plone/plone.aftertraverse/trunk/">plone.aftertraverse</a>. An event is sent before traversal, but not immediately after. The problem is that authentication is performed after traversal.
Fortunately the request object allows registering post-traverse hooks, with arbitrary parameters.<br /><br />The code, zcml part:<br /><br /><pre><subscriber handler=".hooks.insertRejectAnonymousHook"<br /> for="Products.CMFCore.interfaces.ISiteRoot<br /> zope.app.publication.interfaces.IBeforeTraverseEvent"<br /> /></pre><br /><br />and hooks.py:<br /><br /><pre># -*- coding: utf-8 -*-<br />from zExceptions import Unauthorized<br /><br />valid_subparts = set(('login.js', 'spinner.gif',<br />                      'portal_css', 'portal_javascripts'))<br /><br />def rejectAnonymous(portal, request):<br />    mtool = portal.portal_membership<br />    if mtool.isAnonymousUser():<br />        url = request.physicalPathFromURL(request['URL'])<br />        if url and not (url[-1] in ('login_form', 'require_login')<br />                        or [path for path in url<br />                            if path in valid_subparts]):<br />            raise Unauthorized, "You must be authenticated"<br /><br /><br />def insertRejectAnonymousHook(portal, event):<br />    """Register the post-traverse hook with the current request."""<br />    event.request.post_traverse(rejectAnonymous, (portal, event.request))<br /><br /></pre><br />The code checking for allowed paths may not be the best, and it could certainly be more clever, but for-me-it-worked(tm).Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com0tag:blogger.com,1999:blog-3223595804002285001.post-66209659288131764192008-01-09T16:38:00.000+01:002008-01-31T16:37:44.657+01:00VersionConflict in buildoutToday I tried to update my buildout from plone 3.0.4 to 3.0.5.
In the relevant section I just changed:<br /><pre>- recipe = plone.recipe.plone==3.0.4<br />+ recipe = plone.recipe.plone==3.0.5<br /></pre><br />and re-ran buildout, but I got a "VersionConflict" error:<br /><br /><pre>[...]<br />Uninstalling plone.<br />While:<br /> Installing.<br /> Uninstalling plone.<br /> Loading recipe 'plone.recipe.plone==3.0.4'.<br /><br />An internal error occured due to a bug in either zc.buildout or in a<br />recipe being used:<br /><br />VersionConflict:<br />(plone.recipe.plone 3.0.5 (/home/bmathieu/.buildout/eggs/plone.recipe.plone-3.0.5-py2.4.egg), Requirement.parse('plone.recipe.plone==3.0.4'))<br /></pre><br />The only way I found to get rid of this was to edit the hidden file named ".installed.cfg", and replace the line:<br /><pre>- recipe = plone.recipe.plone==3.0.4<br />+ recipe = plone.recipe.plone<br /></pre><br />Then buildout could finish its job. I don't know if this is clean, but it may help.Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2tag:blogger.com,1999:blog-3223595804002285001.post-34112743411057148682007-12-07T15:30:00.000+01:002007-12-11T17:14:17.994+01:00Maintenance mode for Apache<span style="font-size:85%;"><span style="font-style: italic;">This has been tested with Plone behind Apache but should apply to any site with Apache as a front server.<br /></span></span><span style="font-size:100%;"><span>In maintenance mode we want Apache to serve a single static page with its associated resources, and have any other request redirected to this page.
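The ".installed.cfg" fix described in the VersionConflict post above can also be scripted rather than done by hand. A hedged sketch, using plain string handling and assuming the pinned line appears exactly as shown (the function name is mine):

```python
def unpin_recipe(text, recipe="plone.recipe.plone"):
    """Drop the '==X.Y.Z' pin from 'recipe = plone.recipe.plone==X.Y.Z' lines."""
    fixed = []
    for line in text.splitlines():
        if line.strip().startswith("recipe = %s==" % recipe):
            # keep everything before the version pin
            line = line.split("==", 1)[0].rstrip()
        fixed.append(line)
    return "\n".join(fixed)


# usage sketch: rewrite .installed.cfg in place
# data = open(".installed.cfg").read()
# open(".installed.cfg", "w").write(unpin_recipe(data))
```

This is a one-off workaround, not something buildout itself supports; keep a backup of .installed.cfg before rewriting it.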
The static files are:<br /></span></span><br /><ul><li>maintenance.html : the main file</li><li>ressources/ : directory containing CSS, javascript, images...</li><li>robots.txt (instructs robots not to index maintenance.html)<br /></li></ul><br /><span><span style="font-size:100%;"><span>To create a page with the same design as the real site, the firefox extension </span></span></span><a id="gn15_6" href="https://addons.mozilla.org/fr/firefox/addon/4723" title="https://addons.mozilla.org/fr/firefox/addon/4723">Save Complete</a> is a great help.<br /><br /><span style="font-size:100%;">There are two strategies:<br /></span><ul><li><span style="font-size:100%;">Doing it dynamically by testing a parameter; in our case it will be the presence of a file</span></li><li><span style="font-size:100%;">With an alternative configuration file, swapped with the normal one at maintenance time<br /></span></li></ul><span style="font-size:100%;">In this example we are running a plone site on the same host, port 8080.</span><br /><br /><span style="font-size:85%;"><span style="font-size:130%;">1. Dynamically:</span></span><br /><br /><pre>DocumentRoot /static/files<br /><Directory /static/files><br />Options FollowSymLinks<br />AllowOverride None<br />Order deny,allow<br />Allow from all<br />Satisfy all<br /><br />DirectoryIndex maintenance.html<br /></Directory><br />RewriteEngine on<br /></pre><br />The next rule tests for the presence of the regular file "/some/path/maintenance.txt"; if it exists we set the environment variable "MAINTENANCE" to 1.<br /><pre>RewriteCond /some/path/maintenance.txt -f<br />RewriteRule ^(.*)$ - [env=MAINTENANCE:1]<br /></pre>In maintenance mode we want the browser to never cache anything: as soon as we return to production mode, the normal site should appear.
Important: the <span style="font-style: italic;">headers</span> module must be enabled.<br /><br /><pre>Header set cache-control "max-age=0,must-revalidate,post-check=0,pre-check=0" env=MAINTENANCE<br />Header set Expires -1 env=MAINTENANCE<br /></pre><br />The next rules let any request matching the maintenance files (maintenance.html plus the CSS/JS/images ressources) pass through, and redirect everything else to the maintenance page.<br /><br /><pre>RewriteCond %{ENV:MAINTENANCE} 1<br />RewriteCond %{REQUEST_URI} ^/ressources [OR]<br />RewriteCond %{REQUEST_URI} ^/maintenance.html [OR]<br />RewriteCond %{REQUEST_URI} ^/robots.txt<br />RewriteRule .* - [L]<br /><br />RewriteCond %{ENV:MAINTENANCE} 1<br />RewriteCond %{REQUEST_URI} !^/maintenance.html<br />RewriteRule ^.* /maintenance.html [L,R]<br /></pre><br />The next and last rewrite rule is for normal mode.<br /><br /><pre>RewriteCond %{ENV:MAINTENANCE} !1<br />RewriteCond %{HTTP:Authorization} ^(.*)<br />RewriteRule ^(.*) http://localhost:8080/VirtualHostBase/http/%{HTTP_HOST}:80/plone/site/VirtualHostRoot/$1 [P]<br /></pre><br />Now, to put the site in maintenance mode, just do "touch /some/path/maintenance.txt"; delete maintenance.txt to go back to normal mode.<br /><span style="font-size:85%;"><span style="font-size:130%;">2. Using an alternative configuration file:<br /><span style="font-size:100%;"><br />This case is much simpler.
First create a static configuration:<br /></span></span></span><br /><pre>DocumentRoot /static/files<br />RewriteEngine on<br /><br />RewriteCond %{REQUEST_URI} ^/ressources [OR]<br />RewriteCond %{REQUEST_URI} ^/maintenance.html [OR]<br />RewriteCond %{REQUEST_URI} ^/robots.txt<br />RewriteRule .* - [L]<br /><br />RewriteCond %{REQUEST_URI} !^/maintenance.html<br />RewriteRule ^.* /maintenance.html [L,R]<br /><br />Header set cache-control "max-age=0,must-revalidate,post-check=0,pre-check=0"<br />Header set Expires -1<br /></pre><br /><br />On Debian this file must be placed in <span style="font-style: italic;">/etc/apache2/sites-available</span>. If the site configuration file is named "plone", you can name its maintenance counterpart "plone-maint", for example. To switch to maintenance mode:<br /><br />> sudo a2dissite plone<br />> sudo a2ensite plone-maint<br />> sudo /etc/init.d/apache2 reload<br /><br />To go back to normal mode just swap "plone" and "plone-maint" in the commands above.<br /><br /><span style="font-size:130%;">Enhancement not covered here:<br /><br /></span><span style="font-size:100%;">While the site is supposed to show a maintenance page, it may be desirable to let the maintainer access the real site through apache: this can be done with a rewrite condition/rule placed first (a condition based on the IP address, for example).</span>Bertrand Mathieuhttp://www.blogger.com/profile/08397928766523129786noreply@blogger.com2