nh3 clean doesn't include html, head or body tags even when included in ALLOWED_TAGS · Issue #32 · messense/nh3

While using the nh3 library, we came across a use case where a field is expected to contain HTML, but we need to remove any content that could enable an XSS attack. Calling `nh3.clean()` directly on the input doesn't give the expected result: a lot of useful markup is stripped, which ends up modifying the HTML template input.

```python
import nh3

text = '''
<!DOCTYPE html>
<html>
<head>
<title>HTML Tutorial</title>
</head>
<body>
<h1>This is a heading</h1>
<p>This is a paragraph.</p>
</body>
</html>
'''

nh3.ALLOWED_TAGS.add('title')
nh3.ALLOWED_TAGS.add('head')
nh3.ALLOWED_TAGS.add('html')
nh3.ALLOWED_TAGS.add('div')
nh3.ALLOWED_TAGS.add('body')

print(nh3.clean(text, tags=nh3.ALLOWED_TAGS, strip_comments=False))

# Output:
# <title>HTML Tutorial</title>
# <h1>This is a heading</h1>
# <p>This is a paragraph.</p>
```

We don't want the `html`, `head`, or `body` tags to be stripped. Is there a limitation in nh3 that prevents these tags from being allowed?
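One possible workaround, sketched under an assumption the issue does not confirm: nh3 wraps the Rust ammonia crate, which appears to parse its input as an HTML fragment rather than as a full document. If so, the parser drops document-level tags such as `html`, `head`, and `body` before the allow-list is consulted. In that case, one option is to sanitize only the body fragment and then wrap the result in a page skeleton that your own code controls. The helper name `build_document` and its `sanitize` parameter are hypothetical and not part of nh3.

```python
from html import escape
from typing import Callable, Optional


def build_document(
    body_html: str,
    title: str,
    sanitize: Optional[Callable[[str], str]] = None,
) -> str:
    """Sanitize an untrusted body fragment and wrap it in a trusted skeleton.

    `sanitize` defaults to nh3.clean; it is a parameter (an assumption of this
    sketch) so that a different sanitizer or a test stub can be swapped in.
    """
    if sanitize is None:
        import nh3  # imported lazily; only needed for the default sanitizer

        sanitize = nh3.clean

    safe_body = sanitize(body_html)  # untrusted markup goes through the sanitizer
    safe_title = escape(title)       # the title is plain text, so escape it

    # The document-level tags come from this template, not from user input.
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{safe_title}</title>\n"
        "</head>\n<body>\n"
        f"{safe_body}\n"
        "</body>\n</html>\n"
    )
```

With nh3 installed, `build_document(fragment, "HTML Tutorial")` returns a full page whose `html`, `head`, and `body` tags are supplied by the template, while everything inside `body` has passed through `nh3.clean()`. This only fits cases where the document skeleton can be fixed by the application and does not need to come from the user.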

The issue was opened by barkhabol and has received 1 comment.

