David Glick – Plone developer
reflections on PyCon 2010
Some highlights of the talks I attended were:
- "Building Leafy Chat, DjangoDose, and Hurricane: Lessons Learned on the Real-Time Web with Python" by Alex Gaynor – Introduced me to Orbited, Twisted, Redis, and other tools for building scalable, interactive websites.
- "Managing the world's oldest Django project" by James Bennett – I found myself drawing parallels between the evolution of Django and Ellington that James presented and that of Zope and Plone. The Django community is learning the same lessons about testing and reusability that we have.
- "What every developer should know about database scalability" by Jonathan Ellis – good general overview of different strategies for replication and caching (focused on concepts rather than any particular software)
- "Powerful Pythonic Patterns" by Alex Martelli – philosophizing on software patterns and anti-patterns in the Python context
- "Demystifying Non-Blocking and Asynchronous I/O" by Peter A Portante – very helpful beginner-level overview
- "Unladen Swallow: fewer coconuts, faster Python" by Collin Winter – an update on the state of Unladen Swallow, which was approved for merging into CPython during the language summit just before PyCon
- "Pynie: Python 3 on Parrot" by Allison Randal – This one was for fun...I might keep an eye on Pynie just to see how a language actually gets implemented.
- "How Python is guiding infrastructure construction in Africa" by Roy Hyunjin Han – Covered the use of Python for recognizing buildings in satellite imagery to help with planning development, etc.
- "Why not run all your tests all the time? A Study of continuous integration systems" by C. Titus Brown – Bottom line: "Use Hudson."
- the infamous Testing in Python BoF, which was a 3-hour lightning talk session organized one evening by the folks from Disney, complete with pizza, beer, heckling, and goats (the goat meme was introduced by Terry Peppers as an alternative to lolcats in slides, and ended up being adopted as a testing mascot).
- "Tests and Testability" by Ned Batchelder – Not a lot new here for me, but a good overview by the creator of coverage.py.
Selecting which talk to go to was sometimes excruciating, and I'm looking forward to catching up with some of the ones I missed. Some of the ones I've heard recommended are:
- "Deployment, development, packaging, and a little bit of the cloud" by Ian Bicking
- "The state of Packaging" by Tarek Ziadé
- "Scaling your Python application on EC2" by Jeremy Edberg – learnings from reddit
- "Dude, Where's My Database?" by Eric Florenzano
- "Understanding the Python GIL" by David Beazley – the hot topic of the conference
- "The Python and the Elephant: Large Scale Natural Language Processing with NLTK and Dumbo" by Nitin Madnani and Jimmy L. Lin
Videos of the talks are, amazingly, already becoming available. Kudos to the A/V team.
On Sunday my attention waned and I got a bit mischievous. The Eldarion guys, who created Type War, set up OHWar, a Type War clone where you compete to correctly guess who said various quotes that were overheard at PyCon. After playing for far too long and still failing to hold first place, I decided it was a job for Python and created an OHWar-playing bot. I left it running in screen and came back a few hours later to find that I had not only topped the leaderboard but also hit the game's built-in score limit. :) This was also the evening that David Brenneman and I found the Django Pony unattended and added some "enhancements." ;)

Zope and Plone were not very visible in the conference schedule (there was one talk on Plone GetPaid and Satchmo, one on using Plone with Salesforce in which I contributed a few minutes of technical material to go with Chris Johnson's high-level overview, and one on the interface/adapter concepts...as well as a couple relating to repoze.bfg which has a Zopish ancestry). On the other hand, I believe Plone was, surprisingly, the only open source project with a booth in the exhibition hall. We had a nice-looking display with the Plone banner that continues to be passed around to US events, a bunch of collateral and books for display, and a big monitor for demoing Plone 4. Various people took turns staffing the booth, including members of the Atlanta Plone group, and Chris Calloway for much of Saturday. The Plone Foundation also subsidized World Plone Day T-shirts which a bunch of us wore on Saturday. We gathered for a photo and ended up with around 30 people.

During the conference, a highlight for me was meeting and eating meals with various luminaries, including Jason Huggins (of Selenium fame), Holger Krekel (founder of PyPy), Wesley Chun (author of Core Python Programming) and even Guido himself (well, way down at the other end of the table). I also got to interact briefly with Allison Randal (from the Perl community), while trying out and submitting a new test for Pynie, a nascent Python implementation for the Parrot VM. I also now have a face to put with many additional names that I had only seen online before.
I was only able to join the sprints for one day, and mostly spent my time working on some miscellaneous tasks I hadn't been getting to. However, we were able to have a good meeting of GetPaid folks, to try to determine how to move forward with Brandon Rhodes' work to clean up payment processor configuration. I also did some refactoring of the GetPaid development buildout to clean it up, make sure it still works, and pave the way for updating the product for compatibility with Plone 4. If I had been able to stay longer, I think it would have been fun to participate in the great work being done in the Python packaging sprint, led by Tarek Ziadé and the Packaging Pig. Next year I will have to be sure to attend the entire sprint.
Reflections on building a member directory using Plone and Salesforce.com
The goal of the project is to provide web access to a directory of businesses who have paid for membership and inclusion in our client's directory -- while keeping the master data for the directory within Salesforce.com, not Plone. This involves several crucial challenges:
- How to present views for searching and browsing the Salesforce directory data within Plone
- How to provide the ability for businesses to log in and update their member profile
- How to provide the ability for businesses to apply and complete payment for membership, as well as to renew membership each year.
In this article, I'm going to focus on explaining how I approached the first two challenges. This is much more of a hand wave in the right direction, assuming a fair amount of background in Plone, than a detailed tutorial. That said, feel free to ask me questions about aspects of the implementation that I gloss over here.
Exposing the directory within Plone
Querying Salesforce directly on each request is a non-starter for many use cases. That's because Salesforce puts a pretty low limit on the number of API requests allowed per day (something like 1000 per user license). This means that we need a way to mirror data from Salesforce within Plone, and then update it in batch (thereby using fewer API requests) every night. (Building the directory as VisualForce pages within Salesforce Sites would be a valid alternative in some cases -- though requiring more work to integrate visually. But for this project it was a requirement that we be able to store additional data such as logos within Plone, as well as link to related content items for a business.)
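To make the API-budget argument concrete, here is a rough back-of-the-envelope sketch. The record count, page-view count, and batch size of 500 are illustrative assumptions, not figures from this project, and the helper names are made up for the example.

```python
import math


def api_calls_for_sync(record_count, batch_size):
    """API calls used by one batched sync: the initial query plus any
    follow-up calls needed to page through the remaining records."""
    if record_count <= 0:
        return 1  # the initial query is still issued even if nothing matches
    return int(math.ceil(record_count / float(batch_size)))


def api_calls_for_live_queries(page_views, calls_per_view=1):
    """API calls used if every directory page view queried Salesforce."""
    return page_views * calls_per_view


# Illustrative numbers only (assumptions, not measurements).
nightly = api_calls_for_sync(record_count=1200, batch_size=500)
live = api_calls_for_live_queries(page_views=5000)
print("nightly sync: %d calls, live querying: %d calls" % (nightly, live))
```

Under these assumed numbers the nightly mirror costs a handful of calls per day, while live querying scales with traffic and can exhaust a small daily quota quickly.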
How do we model data from Salesforce within Plone? It depends on what you need to do with the content in Plone. If you just need to be able to search and display a listing of results, then there is no reason to create full-fledged content items. In the past, for a case like this, I have just created temporary stub objects during a nightly dump of data from Salesforce, indexed them in a custom catalog, and then discarded the stubs. This is the most lightweight option; you have a catalog full of data for building your search views, but no unnecessary data hanging around.
If you actually need to be able to navigate to a full page view of a particular directory item, then you probably need an actual content item. I think Dexterity would be promising for this sort of thing, but for the project I'm just now wrapping up, I used Archetypes because I needed image scaling and the ability to link to other AT content as related items, both of which Dexterity doesn't have great support for yet.
Note that you don't actually need to define most aspects of the schema, if there are fields you want to display but don't need to have editable within Plone. For example, my schema looks something like this:
from Products.Archetypes import atapi
from Products.ATContentTypes.content import document

MemberProfileSchema = document.ATDocumentSchema.copy() + atapi.Schema((
    atapi.TextField('sf_id'),
    atapi.TextField('mailingAddress'),
    # etc...
))

# hide most fields on the edit form (they stay visible on the view)
for field in MemberProfileSchema.fields():
    if field.schemata == 'default' and field.__name__ not in ('text',):
        field.widget.visible = {'edit': 'invisible', 'view': 'visible'}
Fields like mailingAddress get populated during the nightly data dump, but don't appear on the edit form if you edit the member profile. Why not? Well, mostly because I figured it would be hard to get an Archetypes edit form to save things to Salesforce as well as Plone. Alex Tokar at Web Collective tells me he has successfully taken this approach, though.
Here is an abbreviated version of the browser view that is called once a night to pull in the data from Salesforce:
"""
SFDC sync view. This is intended to be run via cron every night to update
the member profiles based on data from Salesforce.com.

It will:
 * Find all Accounts with a member status of 'Current' or 'Grace Period' (in
   our client's Salesforce schema this is a custom rollup field based on various
   criteria).
 * For each Account, find an existing Member Profile object in Plone whose
   'sf_id' field value equals the Id of the Account, and update it.
 * Or, if no existing Member Profile was found, create a new one and publish it.
 * Retract any existing Member Profiles that were no longer found as Accounts
   with the Current or Grace Period membership status in Salesforce, so they are
   still present but not publicly visible.
"""
import logging
import transaction
from zope.component import getUtility
from Products.Five import BrowserView
from Products.CMFCore.utils import getToolByName
from Products.CMFCore.WorkflowCore import WorkflowException
from plone.i18n.normalizer.interfaces import IIDNormalizer
from Products.CMFPlone.utils import safe_unicode
from Products.CMFPlone.utils import _createObjectByType

SOBJECT_TYPE = 'Account'
FIELDS_TO_FETCH = (
    'Id',
    'Name',
    'Description',
    'BillingStreet',
    'BillingCity',
    'BillingState',
    'BillingPostalCode',
    # etc...
)
FETCH_CRITERIA = "Member_Status__c = 'Current' OR Member_Status__c = 'Grace Period'"
DIRECTORY_ID = 'directory'
PROFILE_PORTAL_TYPE = 'Member Profile'

logger = logging.getLogger('SFDC Import')


class UpdateMemberProfilesFromSalesforce(BrowserView):

    def __init__(self, context, request):
        BrowserView.__init__(self, context, request)
        self.catalog = getToolByName(self.context, 'portal_catalog')
        self.wftool = getToolByName(self.context, 'portal_workflow')
        self.normalizer = getUtility(IIDNormalizer)

    def getDirectoryFolder(self):
        portal = getToolByName(self.context, 'portal_url').getPortalObject()
        # create the directory folder if it doesn't exist yet
        try:
            directory = portal.unrestrictedTraverse(DIRECTORY_ID)
        except KeyError:
            _createObjectByType('Large Plone Folder', portal, id=DIRECTORY_ID)
            directory = getattr(portal, DIRECTORY_ID)
        return directory

    def findOrCreateProfileBySfId(self, name, sf_id):
        res = self.catalog.searchResults(getSf_id = sf_id)
        if res:
            # update existing profile
            profile = res[0].getObject()
            logger.info('Updating %s' % '/'.join(profile.getPhysicalPath()))
            return profile
        else:
            # didn't match sf_id or UID: create new profile
            name = safe_unicode(name)
            profile_id = self.normalizer.normalize(name)
            directory = self.getDirectoryFolder()
            profile_id = directory.invokeFactory(PROFILE_PORTAL_TYPE, profile_id)
            profile = getattr(directory, profile_id)
            profile.setSf_id(sf_id)
            profile.reindexObject(idxs=['getSf_id'])
            logger.info('Creating %s' % '/'.join(profile.getPhysicalPath()))
            return profile

    def updateProfile(self, profile, data):
        profile.setSf_id(data.Id)
        profile.setTitle(data.Name)
        if not profile.getText():
            profile.setText(data.Description, mimetype='text/x-web-intelligent')
        profile.setMailingAddress("%s\n%s, %s %s" % (data.BillingStreet, data.BillingCity,
                                                     data.BillingState, data.BillingPostalCode))
        # etc...

        # publish and reindex
        try:
            self.wftool.doActionFor(profile, 'publish')
        except WorkflowException:
            # already published; catching only WorkflowException avoids
            # swallowing ZODB conflict errors the way a bare except would
            pass
        profile.reindexObject()

    def hideProfileBySfId(self, sf_id):
        res = self.catalog.searchResults(getSf_id = sf_id)
        profile = res[0].getObject()
        try:
            self.wftool.doActionFor(profile, 'reject')
        except WorkflowException:
            # already retracted
            pass

    def queryMembers(self):
        """ Returns an iterator over the records of active members from Salesforce.com """
        sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
        where = '(' + FETCH_CRITERIA + ')'
        soql = "SELECT %s FROM %s WHERE %s" % (
            ','.join(FIELDS_TO_FETCH),
            SOBJECT_TYPE,
            where)
        logger.debug(soql)
        res = sfbc.query(soql)
        logger.info('%s records found.' % res['size'])
        for member in res:
            yield member
        # large result sets come back in batches; keep fetching until done
        while not res['done']:
            res = sfbc.queryMore(res['queryLocator'])
            for member in res:
                yield member

    def __call__(self, queryMembers=queryMembers):
        """ Updates the member directory based on querying Salesforce.com """
        # 0. get list of sf_ids for the profiles we already know about, so we
        # can keep track of which ones we need to make private
        sf_ids_not_found = set(self.catalog.uniqueValuesFor('getSf_id'))

        # 1. fetch active Member Profile records, update ones that match,
        # and create new ones
        for i, data in enumerate(queryMembers(self)):
            profile = self.findOrCreateProfileBySfId(name = data.Name, sf_id = data.Id)
            self.updateProfile(profile, data)
            # commit periodically (every 10) to avoid conflicts
            if not i % 10:
                transaction.commit()
            # keep track of which profiles we need to hide
            try:
                sf_ids_not_found.remove(data.Id)
            except KeyError:
                pass

        # 2. hide any profiles that are no longer active
        for sf_id in sf_ids_not_found:
            self.hideProfileBySfId(sf_id)
All that's left is writing the view which actually queries the catalog for these member profiles and presents them as a listing, which is relatively straightforward, and left as an exercise for the reader. :)
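Stripped of the Plone and Salesforce specifics, the bookkeeping in the sync view comes down to a little set arithmetic. The sketch below is a plain-Python illustration of that idea only; `plan_sync` and the dict-shaped records are hypothetical stand-ins, not part of the real view.

```python
def plan_sync(known_ids, active_records):
    """Split one sync run into ids to create, ids to update, and ids to hide.

    known_ids: ids of the profiles already mirrored locally.
    active_records: iterable of dicts with an 'Id' key, as fetched remotely.
    """
    known = set(known_ids)
    not_found = set(known)  # everything is a hide candidate until seen
    to_create, to_update = [], []
    for record in active_records:
        record_id = record['Id']
        if record_id in known:
            to_update.append(record_id)
        else:
            to_create.append(record_id)
        # seen in the remote data, so it must stay visible
        not_found.discard(record_id)
    return to_create, to_update, sorted(not_found)


create, update, hide = plan_sync(
    known_ids=['a', 'b', 'c'],
    active_records=[{'Id': 'b'}, {'Id': 'd'}],
)
print(create, update, hide)
```

Anything left in the "not found" set at the end of the run is a profile whose Account no longer matches the membership criteria, which is exactly the group the real view retracts.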
Allowing updates to directory profiles
So if the Archetypes content type doesn't allow edits to most of its fields, how did I provide for logged-in members to edit profile info? Well, there are 2 parts:
- The Salesforce Auth Plugin allows logins to Plone based on Account records in Salesforce (by matching on custom username and password fields on the Account).
- A custom z3c.form form reads values from the Account associated with the currently logged-in user, and writes to both that Account record in Salesforce and also to the associated Member Profile archetype within Plone (so that updates appear in the directory immediately).
I won't go into detail on the configuration of the Auth Plugin, as it is covered in the package's documentation. I configured it to load the Salesforce Id of the Account and several other fields into PAS member properties, for easy access within Plone. I did not configure all of the account fields as member properties -- while I could have done so, I didn't see much utility in that, since Plone can't (at least not yet) automatically generate an edit form for all the member properties.
Instead, I built a custom z3c.form form that reads and writes directly to Salesforce. This turned out to be less complicated than I anticipated, mostly thanks to a new ORM-style library I built for wrapping the objects returned from Salesforce by beatbox (with attributes corresponding to Salesforce field names) with a model whose attribute names match the field names of the form schema -- allowing use of the wrapper as the context of a z3c.form form. I'm not yet going to post the implementation of this library, as I intend to make some significant changes to the API before releasing it (real soon now?). But let me at least show you what using it looks like (again I have simplified from the real code):
from zope import schema
from zope.interface import Interface, implements
from z3c.form import form, field, button
from plone.z3cform.layout import wrap_form
from plone.memoize.instance import memoize
from Products.CMFCore.utils import getToolByName
from sforzando import SFObject, SFField


class IAccountGeneralInfo(Interface):
    """ Schema for member profile edit form """
    business_name = schema.TextLine(title = u'Business Name')
    # etc...


class SFAccount(SFObject):
    """ Adapts a Salesforce Account to the profile edit form schema"""
    implements(IAccountGeneralInfo)
    _sObjectType = 'Account'

    sf_id = SFField('Id')
    business_name = SFField('Name')
    # etc...


class ProfileEditForm(form.Form):
    """ An edit form for the current authenticated member's Account """
    label = u'Update Profile'
    fields = field.Fields(IAccountGeneralInfo)

    def _get_sf_id(self):
        """ Find the Salesforce Account Id corresponding to the current logged in member. """
        mtool = getToolByName(self.context, 'portal_membership')
        member = mtool.getAuthenticatedMember()
        sf_id = member.getProperty('sf_id')
        if not sf_id:
            raise Exception("Did not find valid Salesforce ID for member '%s'" % member.getId())
        return sf_id

    @memoize
    def getContent(self):
        """ Provides the object this form will edit.
        Memoized so we always get the same one for a given request. """
        sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
        return SFAccount(sfbc, "Id='%s'" % self._get_sf_id())

    @button.buttonAndHandler(u'Update Profile')
    def handleUpdate(self, action):
        """ Handler for the Update Profile button """
        data, errors = self.extractData()
        if not errors:
            self.status = u'Changes saved.'
            # save changes to Salesforce
            sf_id = self._get_sf_id()
            sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
            SFAccount.update(sfbc, id=sf_id, **data)
            # etc...additional code to update the local AT-based copy of the Account data...

ProfileEditView = wrap_form(ProfileEditForm)
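Since the wrapper library itself isn't shown, here is a minimal sketch of how this kind of attribute-to-field mapping could be done with a Python descriptor. This is not the sforzando implementation: `MappedField`, `RecordWrapper`, and `AccountRecord` are hypothetical names, and a plain dict stands in for the record that would come back from Salesforce.

```python
class MappedField(object):
    """Descriptor exposing one key of the wrapped record under a local name."""

    def __init__(self, remote_name):
        self.remote_name = remote_name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # accessed on the class, not an instance
        return instance._record.get(self.remote_name)

    def __set__(self, instance, value):
        instance._record[self.remote_name] = value
        instance._dirty.add(self.remote_name)


class RecordWrapper(object):
    """Wraps a dict-like remote record and tracks which fields were changed."""

    def __init__(self, record):
        self._record = dict(record)
        self._dirty = set()

    def changes(self):
        """Remote field names and new values for everything set so far."""
        return dict((name, self._record[name]) for name in self._dirty)


class AccountRecord(RecordWrapper):
    # local attribute name -> remote (Salesforce-style) field name
    sf_id = MappedField('Id')
    business_name = MappedField('Name')


account = AccountRecord({'Id': '001', 'Name': 'Acme'})
account.business_name = 'Acme Co'
print(account.business_name, account.changes())
```

Because the local attribute names match the form schema, an object like this can serve as the form's content, and `changes()` yields just the remote fields that would need to be written back.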
Formlib would probably also work just as well as z3c.form. And certainly using a PloneFormGen form with the 'update' feature of the salesforcepfgadapter would work without need for coding, if you don't need a particularly fancy form. As long as you mapped the Salesforce object Id as a member property in the Auth Plugin configuration, it's pretty easy to use that as the basis for determining which object the form should edit.
In conclusion
I'm pretty excited about the results of this project, which is one of the deeper integrations of Plone and Salesforce.com that I have worked on, and which builds on the tools Groundwire has led the development of over the past few years -- especially the Salesforce Auth Plugin. Giving Plone the ability to accept logins based on a CRM system opens the door to a lot of exciting possibilities -- think about being able to show visitors targeted content based on what your database knows about their interests or location, or allowing them to share content with other visitors from the same geographic area.
If you are putting to good use the tools and code discussed here, or are finding other cool things to do by integrating Plone and Salesforce, I'd love to hear about it.
Using HAProxy with Zope via Buildout
First, a few words about what HAProxy offers. For the past couple years I've been using Pound to load balance between multiple backend Zope instances. But recently I've been hearing recommendations from people I trust (such as Jarn and Elizabeth Leddy) to try HAProxy instead.
HAProxy offers some nice features:
- Backend health checks
- Various load-balance algorithms for how requests get distributed to backends
- Can do sticky sessions so that an authenticated user always hits the same backend
- Warmup time (don't send as many requests to a Zope instance while it's starting up)
- Provides a status page giving info on backend status and uptime, # of queued requests, # of active sessions, # of errors, etc.
Some of these are possible with Pound too, but the status screen was really the "killer app" for me. This is fun to watch but also very useful for doing rolling restarts when new code needs to be deployed without an interruption in service.

Configuration
In my buildout.cfg I added:
[buildout]
...
parts =
    ...
    haproxy-build
    haproxy-conf

[haproxy-build]
recipe = plone.recipe.haproxy
url = http://dist.plone.org/thirdparty/haproxy-1.3.22.zip

[haproxy-conf]
recipe = collective.recipe.template
input = ${buildout:directory}/haproxy.conf.in
output = ${buildout:directory}/etc/haproxy.conf
maxconn = 24000
ulimit-n = 65536
user = zope
group = staff
bind = 127.0.0.1:8080
Here, we add a part called "haproxy-build" which uses the plone.recipe.haproxy recipe to build haproxy from source and add a bin/haproxy script for running it, and a part called "haproxy-conf" which builds the HAProxy configuration file by filling in variables in a template file called haproxy.conf.in.
Be sure to set the user and group variables to the user and group you want HAProxy to run as, and update the bind variable to set the port to which HAProxy should bind.
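If the template step seems magical, the idea is just string substitution of `${part:option}` references. The sketch below illustrates that idea in plain Python; it is not the recipe's actual code, and `render` and the `sections` dict are hypothetical names used only here.

```python
import re

# Matches references of the form ${section:option}
_VARIABLE = re.compile(r'\$\{([^:}]+):([^}]+)\}')


def render(template, sections):
    """Replace each ${section:option} reference with its configured value."""
    def lookup(match):
        section, option = match.group(1), match.group(2)
        return str(sections[section][option])
    return _VARIABLE.sub(lookup, template)


sections = {'haproxy-conf': {'maxconn': 24000, 'user': 'zope'}}
template = "maxconn ${haproxy-conf:maxconn}\nuser ${haproxy-conf:user}"
print(render(template, sections))
```

A reference to a part or option that doesn't exist raises a `KeyError` in this sketch, which mirrors the useful property that a typo in a variable name fails loudly instead of producing a broken config file.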
I run most of my Plone stack using supervisord, so I also updated my supervisord configuration in buildout to run HAProxy:
[supervisor]
recipe = collective.recipe.supervisor
...
programs =
    ...
    10 haproxy ${buildout:directory}/bin/haproxy [ -f ${buildout:directory}/etc/haproxy.conf -db ]
In a real life deployment, you'll probably also want a caching reverse proxy like squid or varnish sitting in front of HAProxy.
What about the contents of haproxy.conf.in? Here's mine:
global
    log 127.0.0.1 local6
    maxconn ${haproxy-conf:maxconn}
    user ${haproxy-conf:user}
    group ${haproxy-conf:group}
    daemon
    nbproc 1

defaults
    mode http
    option httpclose
    # Remove requests from the queue if people press stop button
    option abortonclose
    # Try to connect this many times on failure
    retries 3
    # If a client is bound to a particular backend but it goes down,
    # send them to a different one
    option redispatch
    monitor-uri /haproxy-ping
    timeout connect 7s
    timeout queue 300s
    timeout client 300s
    timeout server 300s
    # Enable status page at this URL, on the port HAProxy is bound to
    stats enable
    stats uri /haproxy-status
    stats refresh 5s
    stats realm Haproxy\ statistics

frontend zopecluster
    bind ${haproxy-conf:bind}
    default_backend zope

# Load balancing over the zope instances
backend zope
    # Use Zope's __ac cookie as a basis for session stickiness if present.
    appsession __ac len 32 timeout 1d
    # Otherwise add a cookie called "serverid" for maintaining session stickiness.
    # This cookie lasts until the client's browser closes, and is invisible to Zope.
    cookie serverid insert nocache indirect
    # If no session found, use the roundrobin load-balancing algorithm to pick a backend.
    balance roundrobin
    # Use / (the default) for periodic backend health checks
    option httpchk
    # Server options:
    # "cookie" sets the value of the serverid cookie to be used for the server
    # "maxconn" is how many connections can be sent to the server at once
    # "check" enables health checks
    # "rise 1" means consider Zope up after 1 successful health check
    server plone0101 127.0.0.1:${zeoclient1:http-address} cookie p0101 check maxconn 2 rise 1
    server plone0102 127.0.0.1:${zeoclient2:http-address} cookie p0102 check maxconn 2 rise 1
This assumes that I have Zope instances built by parts called "zeoclient1" and "zeoclient2" in my buildout; you'll probably need to update those names.
You may want to adjust the "option httpchk" line to use a different URL for checking whether the Zope instances are up -- you want to point at something that can be rendered as quickly as possible (in my case it's the Zope root information screen, so I'm not too worried).
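To see what "rise 1" buys you, it can help to model the health check as a small state machine: a backend flips state only after enough consecutive checks disagree with its current state. This is a toy illustration, not HAProxy's implementation; the `HealthState` class is hypothetical, and the `fall=3` threshold is an assumption since the config above does not set it.

```python
class HealthState(object):
    """Tracks whether a backend is considered up, given check results."""

    def __init__(self, rise=2, fall=3, up=True):
        self.rise = rise    # consecutive passes needed to mark a down server up
        self.fall = fall    # consecutive failures needed to mark an up server down
        self.up = up
        self._streak = 0    # consecutive results contradicting the current state

    def record(self, check_passed):
        """Feed in one health check result; returns the resulting state."""
        if check_passed == self.up:
            self._streak = 0  # result agrees with current state
            return self.up
        self._streak += 1
        threshold = self.fall if self.up else self.rise
        if self._streak >= threshold:
            self.up = not self.up
            self._streak = 0
        return self.up


state = HealthState(rise=1, fall=3)
history = [state.record(ok) for ok in (False, False, False, True)]
print(history)
```

With a rise of 1, a restarted Zope instance goes back into rotation after its first successful check, which is part of why the check URL should be something cheap to render.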
The maxconn setting for each backend should be at least the number of threads that that Zope instance is running. Laurence Rowe pointed out to me that it should probably not be set to 1, since Zope also serves some things (blobs and ) via file stream iterators, which happens apart from the main ZPublisher threads. (So setting maxconn to 1 would mean serving a large blob could block other requests to that backend, for instance.)
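A toy model makes the maxconn argument easier to picture: each backend has a fixed number of slots, and anything beyond that waits in a queue. This is only an illustration of the reasoning, not how HAProxy is implemented; the `Backend` class and the request labels are hypothetical.

```python
from collections import deque


class Backend(object):
    """A backend with a fixed number of concurrent connection slots."""

    def __init__(self, name, maxconn):
        self.name = name
        self.maxconn = maxconn
        self.active = 0
        self.queue = deque()

    def accept(self, request):
        """Start the request if a slot is free, otherwise queue it."""
        if self.active < self.maxconn:
            self.active += 1
            return True
        self.queue.append(request)
        return False

    def finish(self):
        """Release one slot and hand it to the next queued request, if any."""
        self.active -= 1
        if self.queue:
            self.queue.popleft()
            self.active += 1


def queued_behind_blob(maxconn):
    """How many requests wait while one long blob download is in flight."""
    backend = Backend('plone0101', maxconn)
    backend.accept('large-blob-download')
    backend.accept('page-view')
    return len(backend.queue)


print(queued_behind_blob(1), queued_behind_blob(2))
```

With a single slot, the page view sits in the queue until the download finishes; with two, it is served right away.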
See the HAProxy configuration documentation for more details on the settings that can be used in this file.