While using the nh3 library, we came across a use case where a field is expected to contain HTML, but we need to remove any content that could enable an XSS attack. Calling `nh3.clean()` directly on the input text doesn't give the expected result: a lot of useful markup is stripped, which ends up modifying the HTML template input.

```python
import nh3

text = '''
<!DOCTYPE html>
<html>
<head>
<title>HTML Tutorial</title>
</head>
<body>

<h1>This is a heading</h1>
<p>This is a paragraph.</p>

</body>
</html>
'''

# Extend the default allowlist with the document-level tags we want to keep.
nh3.ALLOWED_TAGS.add('title')
nh3.ALLOWED_TAGS.add('head')
nh3.ALLOWED_TAGS.add('html')
nh3.ALLOWED_TAGS.add('div')
nh3.ALLOWED_TAGS.add('body')

print(nh3.clean(text, tags=nh3.ALLOWED_TAGS, strip_comments=False))

# Output:
# <title>HTML Tutorial</title>
# <h1>This is a heading</h1>
# <p>This is a paragraph.</p>
```

We don't want the `html`, `head`, or `body` tags to be stripped. Is there a limitation in the nh3 library that prevents these tags from being allowed?
This issue was opened by barkhabol and has received 1 comment.