Finally, getting nowhere with that approach, but still believing that the problem was in the DB layer (or MLDBM), I decided just to replace that section with DBM::Deep. I had been wanting to start using it anyway. Luckily it was easy because I wrote that interface code after I learned to modularize, so I only had to change a couple of functions deep in my library.
Of course the problem survived the code transplant, so I started looking at the few bits of suspect code left, when I came across this in something I wrote long long ago (variable names have been changed to protect the innocent; I don't really use names like that in my code :-P ):
$hashref->{key} |= $fluctuating_val;
$hashref was also tied to MLDBM, which is why I had been concentrating on that subsystem. In any case, I started following $fluctuating_val around using Devel::Peek instead of just printing the value itself. Voilà! $fluctuating_val was coming from MLDBM as a PV (string value), and so was $hashref->{key}, and the bitwise operation wasn't giving the expected result. But I found that the bitwise operation sometimes succeeded when I added debugging print statements. This started to make sense once I looked with Devel::Peek, because one can follow the internal state of a Perl variable and see it accumulate different kinds of values as it is used in different contexts. One-liner demo:
% perl -e 'use Devel::Peek; my $i="1234"; printf "%s\n",$i; Dump($i); printf "%d\n",$i; Dump($i);'
1234
SV = PV(0x8154b00) at 0x8154714
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x8169758 "1234"\0
CUR = 4
LEN = 8
1234
SV = PVIV(0x8155b10) at 0x8154714
REFCNT = 1
FLAGS = (PADBUSY,PADMY,IOK,POK,pIOK,pPOK)
IV = 1234
PV = 0x8169758 "1234"\0
CUR = 4
LEN = 8
$i acquires an integer value (IV) when it is accessed as an integer. That is the way Perl variables are supposed to work. But what if we access the variable with a bitwise operator?
% perl -e 'use Devel::Peek; my $i="1234"; $i|="5678"; printf "%s\n", $i; Dump($i);'
567<
SV = PV(0x8154b00) at 0x8154714
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x8169748 "567<"\0
CUR = 4
LEN = 8
The result of the operation between two PVs is another PV, and the value is not 1234|5678 = 0x4D2|0x162E = 0x16FE = 5886, which is the value I expected. But what if one operand has a numeric value?
% perl -e 'use Devel::Peek; my $i="1234"; $i|=5678; printf "%s\n", $i; Dump($i);'
5886
SV = PVIV(0x8155b10) at 0x8154714
REFCNT = 1
FLAGS = (PADBUSY,PADMY,IOK,POK,pIOK,pPOK)
IV = 5886
PV = 0x8169748 "5886"\0
CUR = 4
LEN = 8
A PVIV! It behaves differently! And in the way that I want! I changed my problem code to
$hashref->{key} |= 1*$fluctuating_val;
and voilà! again. My problem disappeared, because multiplying by 1 gave the variable an internal numerical value, making the bitwise operator reach the answer I was expecting.
But why? I started searching Google: http://www.google.com/search?hl=en&q=perl+bitwise+pv+iv, which led me to a Stack Overflow post entitled "How does Perl decide to treat a scalar as a string or a number?", and a comment made by Leon Timmermans inside it:
Perl [remembers] when a variable is both a valid integer, float or string when either of those is used. However this does not affect the semantics of the variable (except in two cases, bitwise operators and syscall). – Leon Timmermans Dec 1 '08 at 16:08
Bingo. But how are bitwise operators different? I emailed him to ask, and he helpfully pointed me to the perlop manpage (duh!), specifically a section near the bottom called Bitwise String Operators, which says
If you are intending to manipulate bitstrings, be certain that you're supplying bitstrings: If an operand is a number, that will imply a numeric bitwise operation. You may explicitly show which type of operation you intend by using "" or 0+.
Turns out 1* works too. This is the first time I've thought about additive and multiplicative identities in who knows how long.
Leon offered this comment in his reply to my email:
This is generally considered dubious behavior, and in Perl 6 string and integral bit-operators will be split, just like all other string and numeric operators. For now, we'll have to live with it though.
Big thanks to Leon (http://search.cpan.org/~leont/).