OP_SUBSTR_LEFT - a specialised OP_SUBSTR variant #22785
Force-pushed from d6f958e to f258d43
leonerd
left a comment
A couple of small comments but overall nothing troubling-looking here.
I wonder a bit about the name though. I've usually seen the word "nibble" to mean a half-byte; i.e. a 4-bit value. I wondered if that is what is going on here at first. If there are other candidate names to call it, perhaps something else would be better? Not a huge problem though.
How about food-related alternatives:
Not sure what's going on with the ABRT test failures. I don't get them locally.
Looks like an op_private flags assertion. I'll dig into it soon.
Force-pushed from f258d43 to 98d187d
I'm rebasing and renaming it to
Doesn't perl's
The Perl
@richardleach, merge conflicts ^^
Force-pushed from 98d187d to 24108a7
Oh wow. Huh. In that case, might as well call this one Otherwise my thoughts were going to be something like
Consider ltrim, with inspiration from PHP and Redis (or lstrip a la Ruby/Python but that sounds more whitespace-specific). Though it is also unrelated to builtin::trim, I think it's a bit more descriptive at least.
Hmmm, I'm not sure about this. It seems only more descriptive to someone who already is familiar with
Maybe
Ok, that seems straightforward enough without colliding with Perlspace. Will rename.
Force-pushed from 24108a7 to 2712d8f
Variants are named to match the style of macros in op.h
Force-pushed from 2712d8f to 3c55fa6
Force-pushed from 3c55fa6 to c468cc5
OP renamed to
Force-pushed from c468cc5 to 0e93328
BINOPs like PP
On Thu, Dec 19, 2024 at 02:35:50AM -0800, bulk88 wrote:
BINOPs like PP
``
if (index($str, 'ZZZZZZ') == -1) {
}
``
XS has no concept of "G_BOOL" context. There is definitely a need to deliver bool context from the runloop to XS.
What has this got to do with the proposed OP_SUBSTR_LEFT op?
--
"You may not work around any technical limitations in the software"
-- Windows Vista license
Force-pushed from 0e93328 to 422f35c
This commit adds OP_SUBSTR_LEFT and associated machinery for fast
handling of the constructions:
substr EXPR,0,LENGTH,''
and
substr EXPR,0,LENGTH
Where EXPR is a scalar lexical, the OFFSET is zero, and either there
is no REPLACEMENT or it is the empty string. LENGTH can be anything
that OP_SUBSTR supports. These constraints allow for a very stripped
back and optimised version of pp_substr.
The primary motivation was for situations where a scalar, containing
some network packets or other binary data structure, is being parsed
piecemeal. Nibbling away at the scalar can be useful when you don't
know how exactly it will be parsed and unpacked until you get started.
It also means that you don't need to worry about correctly updating
a separate offset variable.
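The piecemeal parsing described above might look something like the following minimal sketch. The record layout (a one-byte type, a two-byte big-endian length, then the payload) and the sub name `parse_records` are hypothetical, invented only for this illustration, and the code assumes well-formed input. The point is that each four-argument `substr` call both returns and removes the leading bytes, so no separate offset is tracked.

```perl
use strict;
use warnings;

# Hypothetical record layout, for illustration only:
#   1 byte type, 2 bytes big-endian payload length, then the payload.
# Assumes the buffer is well-formed (no truncated records).
sub parse_records {
    my ($buf) = @_;    # a scalar lexical with OFFSET 0 and an empty REPLACEMENT
    my @records;
    while (length $buf) {
        # Each call returns the first N bytes and removes them from $buf.
        my $type = unpack 'C', substr($buf, 0, 1, '');
        my $len  = unpack 'n', substr($buf, 0, 2, '');
        my $data = substr($buf, 0, $len, '');
        push @records, [ $type, $data ];
    }
    return @records;
}

my @recs = parse_records(
    pack('C n a*', 1, 3, 'abc') . pack('C n a*', 2, 2, 'xy')
);
printf "%d: %s\n", @$_ for @recs;
```

Because the length of each payload is only known after its header has been read, the amount to remove on each step is decided as parsing proceeds, which is the situation the commit message describes.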
This operator also turns out to be an efficient way to (destructively)
break an expression up into fixed size chunks. For example, given:
my $x = ''; my $str = "A"x100_000_000;
This code:
$x = substr($str, 0, 5, "") while ($str);
is twice as fast as doing:
for ($pos = 0; $pos < length($str); $pos += 5) {
$x = substr($str, $pos, 5);
}
Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and
`$y = substr($x, 0, 5, '')` runs 45% faster.
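As a small sanity check on the chunking comparison above, the sketch below runs both approaches on a short string and collects the chunks so they can be compared. It does not attempt to reproduce the timings quoted in the commit message; the variable names `@nibbled` and `@indexed` are illustrative only.

```perl
use strict;
use warnings;

my $str = join '', 'a' .. 'm';    # 13 characters, so the last chunk is short

# Destructive: repeatedly take the first 5 characters off a copy.
my $copy = $str;
my @nibbled;
push @nibbled, substr($copy, 0, 5, '') while length($copy);

# Non-destructive: walk an explicit offset through the original.
my @indexed;
for (my $pos = 0; $pos < length($str); $pos += 5) {
    push @indexed, substr($str, $pos, 5);
}

print "@nibbled\n";
print "@indexed\n";
```

The loop condition here tests `length($copy)` rather than the truth of the string itself, since a remaining string of `"0"` is false in Perl and would end a `while ($str)` loop one chunk early.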
As suggested in Perl#22785
Force-pushed from 422f35c to 7a2b126
This commit adds `OP_SUBSTR_NIBBLE` and associated machinery for fast handling of the constructions: `substr EXPR,0,LENGTH,''` and `substr EXPR,0,LENGTH`
Where `EXPR` is a scalar lexical, the `OFFSET` is zero, and either there is no `REPLACEMENT` or it is the empty string. `LENGTH` can be anything that `OP_SUBSTR` supports. These constraints allow for a very stripped back and optimised version of pp_substr.
The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know how exactly it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.
This operator also turns out to be an efficient way to (destructively) break an expression up into fixed size chunks. For example, given: `my $x = ''; my $str = "A"x100_000_000;`
This code: `$x = substr($str, 0, 5, "") while ($str);`
is twice as fast as doing: `for ($pos = 0; $pos < length($str); $pos += 5) { $x = substr($str, $pos, 5); }`
Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.