### Motivation

The removal of accents is causing valid names to be marked as censored.

### Summary

While investigating why words with accented characters were marked as censored, I noticed this comment on the `censor` function:

```
/// # Unfortunate Side Effects
///
/// All diacritical marks (accents) are removed by the current implementation. This is subject
/// to change, as a better implementation would make this optional.
```

I made a test case to see how the text changes once it is run through `censor`, and what it is marked as:

```
#[test]
fn test() {
    let filter = RustrictFilter {
        ignore_false_positives: false,
        ignore_self_censoring: false,
        temp_censor_replacement: '\u{00a0}',
        regex: Regex::new(format!("{}+", '\u{00a0}').as_str()).unwrap(),
        censor_replacement: "🤫".to_string(),
    };
    let valid_name = "Ernésto Jose Durán Lar";
    println!("{}", valid_name);
    let result = filter.filter(valid_name, Severity::ModerateOrHigher);
    println!("{}", result);
    assert!(!filter.is_censored(valid_name, Severity::ModerateOrHigher));
}
```

The test fails on the assertion that the name is not censored:

```
Ernésto Jose Durán Lar
Ernesto Jose Duran Lar
thread 'filter::test::test' panicked at 'assertion failed: !filter.is_censored(valid_name, Severity::ModerateOrHigher)', libraries/s6-validations/src/filter.rs:218:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:117:5
   3: s6_validations::filter::test::test
             at ./src/filter.rs:218:9
   4: s6_validations::filter::test::test::{{closure}}
             at ./src/filter.rs:205:15
   5: core::ops::function::FnOnce::call_once
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/ops/function.rs:250:5
   6: core::ops::function::FnOnce::call_once
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test filter::test::test ... FAILED
```

Note that the filtered output contains no replacement characters: the only difference from the input is that the accents are gone. Could a change be made so that, at minimum, this is not considered censored text?

### Alternatives

Accented characters are not removed at all, and they are not marked as censored.

### Context

I am using `rustrict` version `0.4.0`.
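To make the minimum request concrete, here is a rough sketch of a caller-side check that does not treat accent removal alone as censoring. It assumes the "censored" decision is made by comparing the filtered output with the original input, which I have not confirmed. `strip_accents` and `is_censored_ignoring_accents` are hypothetical names for illustration, not part of `rustrict`, and the small lookup table stands in for proper Unicode normalization.

```rust
/// Hypothetical helper: maps a few precomposed accented Latin letters to
/// their base letter. This table is illustrative only; a real
/// implementation would more likely rely on Unicode normalization.
fn strip_accents(text: &str) -> String {
    text.chars()
        .map(|c| match c {
            'á' | 'à' | 'â' | 'ä' | 'ã' => 'a',
            'é' | 'è' | 'ê' | 'ë' => 'e',
            'í' | 'ì' | 'î' | 'ï' => 'i',
            'ó' | 'ò' | 'ô' | 'ö' | 'õ' => 'o',
            'ú' | 'ù' | 'û' | 'ü' => 'u',
            'ñ' => 'n',
            'Á' | 'À' | 'Â' | 'Ä' | 'Ã' => 'A',
            'É' | 'È' | 'Ê' | 'Ë' => 'E',
            'Í' | 'Ì' | 'Î' | 'Ï' => 'I',
            'Ó' | 'Ò' | 'Ô' | 'Ö' | 'Õ' => 'O',
            'Ú' | 'Ù' | 'Û' | 'Ü' => 'U',
            'Ñ' => 'N',
            other => other,
        })
        .collect()
}

/// Hypothetical check: text counts as censored only if the filtered output
/// still differs from the original after accents are stripped from both.
/// `filtered` is whatever string the filter returned for `original`.
fn is_censored_ignoring_accents(original: &str, filtered: &str) -> bool {
    strip_accents(original) != strip_accents(filtered)
}
```

With this comparison, the pair from the failing test (`"Ernésto Jose Durán Lar"` and `"Ernesto Jose Duran Lar"`) would not count as censored, while an output in which a word was actually replaced still would.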
This issue was opened by shemmings6, has received 4 comments, and appears to be resolved.