[bug] strrev destroys smiley input #17241

Closed
remco-pc opened this issue Dec 22, 2024 · 8 comments

Comments

@remco-pc

remco-pc commented Dec 22, 2024

Description

The following code:

<?php

$name = ๐Ÿ‘Œ
$name = strrev($name);
echo $name; //๏ฟฝ๏ฟฝ๏ฟฝ๏ฟฝ

Resulted in this output:

๏ฟฝ๏ฟฝ๏ฟฝ๏ฟฝ

But I expected this output instead:

๐Ÿ‘Œ or inverse codepoints

PHP Version

8.2.26

Operating System

Debian 12

@NattyNarwhal
Member

I was about to ask if mb_strrev works for you, except we don't have that one. Good opportunity for a feature request!

The problem is that strrev reverses the string byte by byte, not one multibyte UTF-8 character at a time. Of course, even a character-wise reverse is pretty weird for emoji, because ZWJ sequences and modifiers mean that multiple code points (e.g. 👌 plus a skin tone modifier) become one grapheme.
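
To make the byte-level behaviour visible, here is a minimal sketch that prints the bytes before and after strrev. It assumes the input is UTF-8 and uses only functions that ship with PHP; the variable names are illustrative only.

```php
<?php
// U+1F44C (OK hand sign), written as an escape so the encoding of
// this source file does not matter.
$name = "\u{1F44C}";
$reversed = strrev($name);

echo strlen($name), "\n";       // 4: UTF-8 needs four bytes for this code point
echo bin2hex($name), "\n";      // f09f918c
echo bin2hex($reversed), "\n";  // 8c919ff0

// With the /u flag PCRE rejects subjects that are not valid UTF-8,
// so preg_match() returns false here instead of 1.
$reversedIsValidUtf8 = preg_match('//u', $reversed) === 1;
var_dump($reversedIsValidUtf8); // bool(false)
```

The reversed string starts with a continuation byte (0x8c), which is why it is displayed as replacement characters.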

@kamil-tekiela
Member

It works as intended. strrev works on a byte-string. It reverses all bytes in a string. Your Unicode character is composed of 4 bytes so it works correctly. If you are missing the functionality of mb_strrev then you could request that as a feature, but it can be easily done in userland already https://3v4l.org/piYNQ
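
For readers who cannot open the link, a userland reverse by code point might look roughly like the sketch below. This is not necessarily the code behind the 3v4l link; the function name strrev_codepoints is hypothetical, and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: PHP has no built-in mb_strrev().
function strrev_codepoints(string $text): string
{
    // The /u flag makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between code points rather than between bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return implode('', array_reverse($chars));
}

echo strrev_codepoints("ab\u{1F44C}"), "\n"; // 👌ba
```

This keeps each code point intact, but it still reorders code points that belong to the same visible character.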

@kamil-tekiela closed this as not planned on Dec 22, 2024
@remco-pc
Author

remco-pc commented Jan 1, 2025

@kamil-tekiela thanks for the usable example to get smiley function names (code points) working

@kamil-tekiela
Member

@remco-pc You are welcome, but please pay attention to what @NattyNarwhal said. What I showed works only with some emojis. Many emojis are actually a combination of 2 or more code points. If you reverse them, the emoji will break.

@youkidearitai
Contributor

Even if an mb_strrev function existed, some text (including emoji) would still break. For example, 邉󠄀 (U+9089 U+E0100) would break under mb_strrev, because it would work in units of Unicode code points.
On the other hand, a function that works on grapheme clusters would solve this issue, so it could be named grapheme_strrev. @remco-pc, if you want a strrev that works for emoji and more, I can propose a grapheme_strrev function to PHP internals. What do you think?
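
Until such a function exists, a reverse by grapheme cluster can be approximated in userland. The sketch below relies on the PCRE escape \X, which matches one extended grapheme cluster; the function name strrev_graphemes is hypothetical, the input is assumed to be UTF-8, and how newer emoji sequences are grouped depends on the Unicode tables of the PCRE library PHP is built against.

```php
<?php
// Hypothetical helper: not a built-in PHP function.
function strrev_graphemes(string $text): string
{
    // \X matches one extended grapheme cluster, so a base character
    // stays together with its variation selector or combining marks.
    if (preg_match_all('/\X/u', $text, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return implode('', array_reverse($matches[0]));
}

// U+9089 followed by variation selector U+E0100, between two ASCII letters.
$word = "a\u{9089}\u{E0100}b";
var_dump(strrev_graphemes($word) === "b\u{9089}\u{E0100}a"); // bool(true)
```

A reverse by code point would instead place U+E0100 in front of U+9089, which detaches the selector from its base character.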

@remco-pc
Author

remco-pc commented Jan 15, 2025

command:

app raxon/parse compile -source=/mnt/Vps3/Mount/Package/Raxon/Parse/Test/Tpl/101-200/Smiley.Boot.3.tpl -options=1 --flags=2 -duration

template

{{👌🤣('⛳', app.flags(), app.options())}}

trait

<?php
/**
 * @package Plugin\Modifier
 * @author Remco van der Velde
 * @since 2024-08-19
 * @license MIT
 * @version 1.0
 * @changeLog
 *    - all
 */
namespace Plugin;

use Raxon\App as Framework;

trait CodePoint_128076_129315 {

    /**
     * @throws Exception
     */
    protected function codepoint_128076_129315($type = '', $flags = null, $options = null)
    {
        d($type);
        d($flags);
        d($options);
    }
}

This works like a charm if you make all code points available, like I did!
Look at the options this creates: about 1,000,000 * 1,000,000 possibilities with only 2 visible characters.
With constants you could make the quotes disappear.
@youkidearitai, should I make a test with your (Chinese) character? I cannot read the symbols (I tried Egyptian ones in the Valley of the Kings).

{
    "class": "mnt_Vps3_Mount_Package_Raxon_Parse_Test_Tpl_101_200_Smiley_Boot_3_tpl",
    "namespace": "Package\\Raxon\\Parse",
    "duration": {
        "require": "1.29 ms",
        "parse": "0.1 ms",
        "total": "147.86 ms",
        "finish": "2025-01-16 00:19:40.7987"
    }
}
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Source_Plugin_CodePoint_128076_129315.php:22
string(3) "⛳"
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Source_Plugin_CodePoint_128076_129315.php:23
object(stdClass)#731 (1) {
  ["flags"]=>
  int(2)
}
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Source_Plugin_CodePoint_128076_129315.php:24
object(stdClass)#729 (7) {
  ["source"]=>
  string(70) "/mnt/Vps3/Mount/Package/Raxon/Parse/Test/Tpl/101-200/Smiley.Boot.3.tpl"
  ["options"]=>
  int(1)
  ["duration"]=>
  bool(true)
  ["hash"]=>
  string(64) "96a500c972948f122add3af67da24276afefe306c0250430df1042aeb9391d80"
  ["class"]=>
  string(134) "mnt_Vps3_Mount_Package_Raxon_Parse_Test_Tpl_101_200_Smiley_Boot_3_tpl_96a500c972948f122add3af67da24276afefe306c0250430df1042aeb9391d80"
  ["namespace"]=>
  string(19) "Package\Raxon\Parse"
  ["debug"]=>
  bool(true)
}
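
The trait name above appears to be built from the decimal code points of the template function name (128076 is U+1F44C, 129315 is U+1F923). How Raxon derives it internally is not shown in this thread; the sketch below only illustrates the mapping, with hypothetical helper names and UTF-8 input assumed.

```php
<?php
// Hypothetical helper: decode one UTF-8 encoded character into its code point.
function utf8_code_point(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0]; // ASCII: the byte is the code point
    }
    // The lead byte starts with $count one-bits and a zero-bit; mask them off.
    $codePoint = $bytes[0] & (0xFF >> ($count + 1));
    // Each continuation byte contributes its low six bits.
    for ($i = 1; $i < $count; $i++) {
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codePoint;
}

// Hypothetical helper: build a name like the trait name shown above.
function code_point_name(string $name): string
{
    $chars = preg_split('//u', $name, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return 'CodePoint_' . implode('_', array_map('utf8_code_point', $chars));
}

echo code_point_name("\u{1F44C}\u{1F923}"), "\n"; // CodePoint_128076_129315
```

The same mapping applied to "\u{9089}\u{E0100}" twice gives 37001 and 917760 twice, matching the CodePoint_37001_917760_37001_917760 paths in the lookup list later in the thread.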

@remco-pc
Author

remco-pc commented Jan 15, 2025

@youkidearitai I tried to copy and paste your character into a test script in PhpStorm and it breaks there too. I cannot test it, but the symbol may also be missing from the current font!

here you go:

root@ceb49d5d9721:/Application#  app raxon/parse compile -source=/mnt/Vps3/Mount/Package/Raxon/Parse/Test/Tpl/101-200/Smiley.Boot.4.tpl -options=1 --flags=2 -duration
/mnt/Vps3/Mount/Package/Raxon/Parse/Service/Build.php:1487
array(5) {
  ["tag"]=>
  string(57) "{{邉󠄀邉󠄀('邉󠄀', app.flags(), app.options())}}"
  ["line"]=>
  int(1)
  ["length"]=>
  int(42)
  ["column"]=>
  array(2) {
    ["start"]=>
    int(1)
    ["end"]=>
    int(43)
  }
  ["method"]=>
  array(2) {
    ["name"]=>
    string(14) "邉󠄀邉󠄀"
    ["argument"]=>
    array(3) {
      [0]=>
      array(2) {
        ["string"]=>
        string(9) "'邉󠄀'"
        ["array"]=>
        array(1) {
          [0]=>
          array(4) {
            ["value"]=>
            string(9) "'邉󠄀'"
            ["execute"]=>
            string(9) "'邉󠄀'"
            ["type"]=>
            string(6) "string"
            ["is_single_quoted"]=>
            bool(true)
          }
        }
      }
      [1]=>
      array(2) {
        ["string"]=>
        string(12) " app.flags()"
        ["array"]=>
        array(1) {
          [0]=>
          array(6) {
            ["type"]=>
            string(6) "method"
            ["method"]=>
            array(2) {
              ["name"]=>
              string(9) "app.flags"
              ["argument"]=>
              array(0) {
              }
            }
            ["tag"]=>
            string(0) ""
            ["line"]=>
            string(7) "unknown"
            ["length"]=>
            string(7) "unknown"
            ["column"]=>
            array(2) {
              ["start"]=>
              int(0)
              ["end"]=>
              int(0)
            }
          }
        }
      }
      [2]=>
      array(2) {
        ["string"]=>
        string(14) " app.options()"
        ["array"]=>
        array(1) {
          [0]=>
          array(6) {
            ["type"]=>
            string(6) "method"
            ["method"]=>
            array(2) {
              ["name"]=>
              string(11) "app.options"
              ["argument"]=>
              array(0) {
              }
            }
            ["tag"]=>
            string(0) ""
            ["line"]=>
            string(7) "unknown"
            ["length"]=>
            string(7) "unknown"
            ["column"]=>
            array(2) {
              ["start"]=>
              int(0)
              ["end"]=>
              int(0)
            }
          }
        }
      }
    }
  }
}

Press  enter  to continue or  ctrl-c  to break...

Raxon\Exception\LocateException

Plugin not found (邉󠄀邉󠄀) exception: "{{邉󠄀邉󠄀(\'邉󠄀\', app.flags(), app.options())}}" on line: 1, column: 1 in source: /mnt/Vps3/Mount/Package/Raxon/Parse/Test/Tpl/101-200/Smiley.Boot.4.tpl
file: /mnt/Vps3/Mount/Package/Raxon/Parse/Service/Build.php
line: 1488
Locations:
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Compile/CodePoint_37001_917760_37001_917760.php
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Source_Plugin_CodePoint_37001_917760_37001_917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint.37001.917760.37001.917760/CodePoint.37001.917760.37001.917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint_37001_917760_37001_917760/CodePoint_37001_917760_37001_917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint/.CodePoint.37001.917760.37001.917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint/CodePoint.37001.917760.37001.917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint/.CodePoint_37001_917760_37001_917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint/CodePoint_37001_917760_37001_917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint.37001.917760.37001.917760.php
/mnt/Vps3/Mount/Source/Plugin/CodePoint_37001_917760_37001_917760.php

Trace:
/Application/vendor/raxon/framework/src/Exception/LocateException.php (27) setDebugTrace
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Service_Build.php (1488) __construct
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Service_Build.php (2394) plugin
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Service_Build.php (215) method
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Service_Build.php (37) document_tag
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Service_Parse.php (425) create
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Class/_mnt_Vps3_Mount_Package_Raxon_Parse_Trait_Main.php (34) compile
/tmp/raxon/org/61c0681b-70d5-4a76-88b9-4a6f2cdcbf00/0/Compile/Template_Compile_b64d1262113ac06e3d5d62e915dc9e6df210c43e.php (62) compile
/Application/vendor/raxon/framework/src/Module/Parse.php (890) run
/Application/vendor/raxon/framework/src/Module/Controller.php (742) compile
/mnt/Vps3/Mount/Package/Raxon/Parse/Controller/Cli.php (212) response
/Application/vendor/raxon/framework/src/App.php (503) run
/Application/Bin/Raxon.php (35) run
root@ceb49d5d9721:/Application#

root@ceb49d5d9721:/Application#

@youkidearitai
Contributor

@remco-pc You can use "\u{9089}\u{e0100}" instead of "邉󠄀".
Anyway, Unicode can represent one character with multiple code points. Another example: é can be written as "\u{00e9}" or as "\u{0065}\u{0301}".
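
A small sketch of the difference between those two spellings of é, using only byte and PCRE functions; the variable names are illustrative and the strings are assumed to be UTF-8.

```php
<?php
$precomposed = "\u{00e9}";         // é as a single code point
$decomposed  = "\u{0065}\u{0301}"; // e followed by a combining acute accent

// The two strings look the same on screen but are different byte strings.
var_dump($precomposed === $decomposed); // bool(false)

// Count code points ('.' with /u) and grapheme clusters (\X) for each form.
$countCodePoints = fn(string $s): int => (int) preg_match_all('/./su', $s);
$countGraphemes  = fn(string $s): int => (int) preg_match_all('/\X/u', $s);

printf("%d %d %d\n", strlen($precomposed), $countCodePoints($precomposed), $countGraphemes($precomposed)); // 2 1 1
printf("%d %d %d\n", strlen($decomposed), $countCodePoints($decomposed), $countGraphemes($decomposed));    // 3 2 1
```

Both forms are one grapheme cluster, which is why a reverse that works on grapheme clusters treats them the same way while a reverse by code point does not.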
