fix regarding the issues raised during security audit

In the polyseed-examples repository, the `utf8_nfc` and `utf8_nfkd` functions will never return a value exceeding `POLYSEED_STR_SIZE - 1`
In your code, the utf8_norm function has variable return behavior that seems odd
In case of a normalization error, the underlying normalizer will return a negative value, at which point your function just returns POLYSEED_STR_SIZE (this is unclear)
In case the buffer isn't large enough, the normalizer will return the required buffer size but have undefined internal behavior, at which point your function returns a value exceeding POLYSEED_STR_SIZE
Otherwise, it uses the normalizer's return value (indicating the written size) to continue with re-encoding

tobtoht: Czarek Nakamoto: polyseed asserts that the return value < POLYSEED_STR_SIZE, so if normalization fails the program crashes..
> I think my idea was to have have polyseed check the return value and return an error code instead of asserting, which would in turn throw the "Unicode normalization failed" error
> I'll upstream that. In the meantime you can replace the injected function with
```cpp
    inline size_t utf8_norm(const char* str, polyseed_str norm, utf8proc_option_t options) {
      utf8proc_int32_t buffer[POLYSEED_STR_SIZE];
      utf8proc_ssize_t result;

      result = utf8proc_decompose(reinterpret_cast<const uint8_t*>(str), 0, buffer, POLYSEED_STR_SIZE, options);
      if (result < 0 || result > (POLYSEED_STR_SIZE - 1)) {
        throw std::runtime_error("Unicode normalization failed");
      }

      result = utf8proc_reencode(buffer, result, options);
      if (result < 0 || result > POLYSEED_STR_SIZE) {
        throw std::runtime_error("Unicode normalization failed");
      }

      strcpy(norm, reinterpret_cast<const char*>(buffer));
      sodium_memzero(buffer, sizeof(buffer));
      return result;
    }
```
This commit is contained in:
Czarek Nakamoto
2024-04-19 16:37:42 +02:00
parent 05569f7b80
commit 22f6fb4b63

View File

@@ -517,14 +517,14 @@ index 000000000..b26f37574
+ utf8proc_ssize_t result;
+
+ result = utf8proc_decompose(reinterpret_cast<const uint8_t*>(str), 0, buffer, POLYSEED_STR_SIZE, options);
+ if (result < 0) {
+ return POLYSEED_STR_SIZE;
+ if (result < 0 || result > (POLYSEED_STR_SIZE - 1)) {
+ throw std::runtime_error("Unicode normalization failed");
+ }
+ if (result > POLYSEED_STR_SIZE - 1) {
+ return result;
+ }
+
+
+ result = utf8proc_reencode(buffer, result, options);
+ if (result < 0 || result > POLYSEED_STR_SIZE) {
+ throw std::runtime_error("Unicode normalization failed");
+ }
+
+ strcpy(norm, reinterpret_cast<const char*>(buffer));
+ sodium_memzero(buffer, POLYSEED_STR_SIZE);