Ticket #9232 (closed defect: fixed)

Opened 8 years ago

Last modified 8 years ago

Ampersand in Lens comment causes site error

Reported by: ew2 Owned by: bnwest
Priority: critical Milestone: Hot Fixes
Component: Lenses Version: Live
Severity: severe Keywords:
Cc: System Area: Content Display
Primary Skill: Several Equally
Site URL:
Suppress email to reporter: no

Description

If you add an ampersand in the lens comments or tags, any module/collection for which that lens info will show up will have render problems (because ampersand needs to be converted to an entity reference, &). Noticed this looking at two of Bill Wilson's modules that just got added (via their parent collection) to the Featured Content lens, which had the phrase 'E&M' in the comments field.

Change History

  Changed 8 years ago by ew2

  • status changed from new to assigned

  Changed 8 years ago by bnwest

  • status changed from assigned to working

To recreate the problem:

  1. Go to a published module (or collection)
  2. add module to an existing lens
  3. add 'E&M' (with or without the quotes) as the comment
  4. hit 'Add to lens'
  5. refresh page
  6. get XMLParserError Site Error
    Error Type
        XMLParserError
    Error Value
        Error: could not parse document: Line 75: EntityRef: expecting ';' 
    

  Changed 8 years ago by bnwest

Note that the comment for the lens entry can take HTML. The HTML does not have to be well formed to be saved.

The lens entry comment uses the HTML in the lens_content_view page.

The lens entry comment does not use the HTML in the tooltip for the lens entry in the module/collection content page.

HTML that is not well formed will throw an exception while rendering the tooltip for the module/collection content page.

  Changed 8 years ago by bnwest

The lens Description field and the lens entry Comment field have both been instrumented to perform a 'simple_safe_html' validation (see Lensmaker/SimpleHtmlValidator.py).

Empirically, the validation is performed on the lens Description field but not on the lens entry Comment field.

follow-up: ↓ 6   Changed 8 years ago by bnwest

In Lensmaker/SimpleHtmlValidator.py, we override the VALID TAGS for CMFDEfault's StrippingParser. Empirically, these overrides are not efficacious and the originals value for the VALID TAGS is being used.

in reply to: ↑ 5   Changed 8 years ago by bnwest

Replying to bnwest:

Empirically, these overrides are not efficacious and the originals value for the VALID TAGS is being used.

The originals are not being used either :(

We appear to be changing the Products.CMFDefault.utils.VALID_TAGS but the parser does not always respect this.

Tags that the parser accepts which are in not in VALID_TAGS inlcude:

del
pre
h1
small
blockquote

Looking at the parsing code, methinks that we are not using it correctly or as intended. We are calling a "strip parser" to validate, while it is intended to validate and return a stripped version of its input.

  Changed 8 years ago by bnwest

List of Problems:

  1. Simple Html Validator, which is used for Lens Description and targeted for Lens Entry Comment, does not flag a standalone '&' as an error. Simple Html Validator also does not flag mismatched beginning and end tags, beginning tags without end tags, etc. We need a better validator.
  2. Lens Entry Comment is not being validated either under the Lenses tab or in the "Add to lens" dialog box. Validating needs to added in these places as well as workflow to handle validation errors (which may be nontrivial for the "Add to lens" dialog box.
  3. Lens Entry Comment is being displayed as straight text with ellipsis and not HTML. This might be the right thing to do.

  Changed 8 years ago by bnwest

As a short term workaround, we may be able to change the Lens Description and the Lens Entry Comment field types to text from HTML. We would need to data mine the repository to see if anyone has used HTML in these fields (which thankfully is a undocumented feature).

  Changed 8 years ago by bnwest

Lens Entry Comments that are added via the "Add to lens" dialog box get surrounding <p> and </p> added to the comment value on the way to the database.

Lens Entry Comments that are editted under the Lenses tab do not get the surrounding <p> and </p> added when written to the database.

Editing in both places will not see the surrounding <p> and </p>. This implies that the surrounding <p> and </p> are stripped prior to editing.

Thus we have Lens Entry Comments in the database with and without surrounding <p> and </p>. This implies that we will have a data conversion issue.

  Changed 8 years ago by bnwest

I used the following code to datamine the repository:

from AccessControl.SecurityManagement import newSecurityManager
from Products.CMFCore.tests.base.security import AnonymousUser, OmnipotentUser
#user = app.plone.acl_users.getUserById('bnwest').__of__(app.plone.acl_users)
user = app.plone.acl_users.getUserById('jccman').__of__(app.plone.acl_users)
newSecurityManager(None, user)

from Products.CMFCore.utils import getToolByName
lenstool = getToolByName(app.plone, 'lens_catalog')
brains = lenstool(portal_type='SelectedContent')

iNonZeroCommentCount = 0
for b in brains:
    selContent = b.getObject()
    comment = selContent.getComment()
    if comment is not None and len(comment) > 0:
        iNonZeroCommentCount += 1
        print "%s has the comment:" % b.getPath()
        print comment

I determined that 431 of our 1752 SelectedContent objects have nonzero comments.

  Changed 8 years ago by ew2

  • owner changed from bnwest to ew2
  • status changed from working to assigned
  • milestone changed from Hot Fixes to future

Moving out of hot fixes since we don't have time to fix it right now.

  Changed 8 years ago by ew2

  • owner changed from ew2 to bnwest
  • milestone changed from future to Hot Fixes

After team discussion, I am moving back to hot fixes. This is ticket has the potential to be a major problem.

  Changed 8 years ago by bnwest

  • status changed from assigned to testing

(In [28499]) naked '&' in lens entry comments was creating havoc. the lens entry comment preview on module content was taking down the entire page, in the presence of '&'.

the lens entry comment field is now treated as plain text and is appropriatedly escaped when rendered in HTML.

fixes ticket:9232. see ticket for more info wrt why we fixed it this way.

  Changed 8 years ago by jenn

  • status changed from testing to assigned

I updated to latest trunk and ran the Lensmaker upgrade script; but the problem is still there. It doesn't look like any of that except the Lensmaker part actually got merged to trunk.

  Changed 8 years ago by bnwest

  • status changed from assigned to testing

  Changed 8 years ago by jenn

  • status changed from testing to closed
  • resolution set to fixed

  Changed 8 years ago by bnwest

(In [29216]) CAM merge to trunk stomped on this change. reinstating.

  Changed 8 years ago by jenn

  • status changed from closed to testing

Noted that, while having a bare "&" as a tag no longer causes the page not to render, clicking on the link to view the "&" tag goes to a page that says "Error: no tag provided". This is probably NAR relative to the way it worked before the merge problem.

  Changed 8 years ago by jenn

  • status changed from testing to closed

Re-rolled out 8/20.

Note: See TracTickets for help on using tickets.