From 5fca9a61bc903e7d07eb11cbb10d555093625a52 Mon Sep 17 00:00:00 2001 From: chromatic Date: Sat, 22 Oct 2011 19:25:29 -0700 Subject: [PATCH] Edited chapter 11. --- sections/barewords.pod | 149 ++++++++++++---------- sections/chapter_11.pod | 9 +- sections/indirect_objects.pod | 85 ++++++------- sections/method_sub_equivalence.pod | 80 +++++------- sections/prototypes.pod | 189 +++++++++++++--------------- sections/tie.pod | 129 ++++++++++--------- 6 files changed, 315 insertions(+), 326 deletions(-) diff --git a/sections/barewords.pod b/sections/barewords.pod index abb44eaf..cecb5054 100644 --- a/sections/barewords.pod +++ b/sections/barewords.pod @@ -2,42 +2,41 @@ Z -Perl uses sigils and other punctuation pervasively to help both the parser and -the programmer identify the categories of named entities. Even so, Perl is a -malleable language. You can write programs in the most creative, maintainable, -obfuscated, or bizarre fashion as you prefer. Maintainability is a concern of -good programmers, but the developers of Perl itself don't presume to dictate -what I find most maintainable. +Perl is a malleable language. You can write programs in the most creative, +maintainable, obfuscated, or bizarre fashion you prefer. Maintainability is a +concern of good programmers, but Perl doesn't presume to dictate what I +consider maintainable. X X pragma> X> -Perl's parser understands the builtin Perl builtins and operators; it knows -that C means you're making objects (L). These are -rarely ambiguous... but Perl programmers can add complexity to parsing by using -I. A bareword is an identifier without a sigil or other attached -disambiguation as to its intended syntactical function. Because there's no -Perl 5 builtin C, the literal word C appearing in source code is -ambiguous. Did you intend to use a variable C<$curse> or to call a function -C? The C pragma warns about use of such ambiguous barewords -for good reason. 
- -Even so, barewords are permissible in several places in Perl 5 for good reason. +Perl's parser understands Perl's builtins and operators. It uses sigils to +identify variables and other punctuation to recognize function and method +calls. Yet sometimes the parser has to guess what you mean, especially when +you use a I--an identifier without a sigil or other syntactically +significant punctuation. =head2 Good Uses of Barewords X + +Though the C pragma (L) rightly forbids ambiguous barewords, +some barewords are acceptable. + +=head3 Bareword hash keys + X X; unary operator> -Hash keys in Perl 5 are barewords. These are usually not ambiguous because -their use as keys is sufficient for the parser to identify them as the -equivalent of single-quoted strings. Yet be aware that attempting to evaluate -a function call or a builtin operator (such as C) to I a hash -key may not do what you expect, unless you disambiguate by providing arguments, -using function argument parentheses, or prepending unary plus to force the -evaluation of the builtin rather than its interpretation as a string: +Hash keys in Perl 5 are usually I ambiguous because the parser can +identify them as string keys; C in C<$games{pinball}> is obviously a +string. + +Occasionally this interpretation is not what you want, especially when you +intend to I a builtin or a function to produce the hash key. In this +case, disambiguate by providing arguments, using function argument parentheses, +or prepending unary plus to force the evaluation of the builtin: =begin programlisting @@ -52,16 +51,21 @@ evaluation of the builtin rather than its interpretation as a string: =end programlisting +=head3 Bareword package names + X -Package names in Perl 5 are barewords in a sense. 
Good naming conventions for -packages (initial caps) help prevent unwanted surprises, but the parser uses a -complex heuristic based on the code it's already compiled within the current -namespace to determine whether C<< Package->method() >> means to call a -function named C and then call the C method on its results -or whether to treat C as the name of a package. You can disambiguate -this with the postfix package separator (C<::>), but that's rare and admittedly -ugly: +Package names in Perl 5 are also barewords. If you hew to naming conventions +where package names have initial capitals and functions do not, you'll rarely +encounter naming collisions, but the Perl 5 parser must determine how to parse +C<< Package->method() >>. Does it mean "call a function named C and +call C on its return value?" or does it mean "Call a method named +C in the C namespace?" The answer varies depending on what +code the parser has already encountered in the current namespace. + +Force the parser to treat C as a package name by appending the package +separator (C<::>)N: =begin programlisting @@ -73,6 +77,8 @@ ugly: =end programlisting +=head3 Bareword named code blocks + X> X> X> @@ -81,10 +87,9 @@ X> X> X> -The special named code blocks provide their own types of barewords. -C, C, C, C, C, C, and C -I functions, but they do not need the C builtin to do so. You -may be familiar with the idiom of writing C without C: +The special named code blocks C, C, C, C, +C, C, and C are barewords which I functions +without the C builtin. You've seen this before (L): =begin programlisting @@ -92,10 +97,13 @@ may be familiar with the idiom of writing C without C: BEGIN { initialize_simians( __PACKAGE__ ) } + sub AUTOLOAD { ... } + =end programlisting -You I leave off the C on C declarations, but that's -uncommon. +While you I elide C from C declarations, few people do. 
+ +=head3 Bareword constants X @@ -113,36 +121,44 @@ Constants declared with the C pragma are usable as barewords: =end programlisting -Be aware that these constants do I interpolate in interpolation contexts -such as double-quoted strings. +Note that these constants do I interpolate in double-quoted strings, for +example. X -Constants are a special case of prototyped functions (L). If -you've predeclared a prototype for a function, you may use that function as a -bareword; Perl 5 knows everything it needs to know to parse all occurrences of -that function appropriately. The other drawbacks of prototypes still apply. +Constants are a special case of prototyped functions (L). When you +predeclare a function with a prototype, the parser knows how to treat that +function and will warn about ambiguous parsing errors. All other drawbacks of +prototypes still apply. =head2 Ill-Advised Uses of Barewords X -Barewords should be rare in modern Perl code; their ambiguity produces fragile -code. You can avoid them in almost every case, but you may encounter several -poor uses of barewords in legacy code. +No matter how cautiously you code, barewords still produce ambiguous code. You +can avoid most uses, but you will encounter several types of barewords in +legacy code. + +=head3 Bareword filehandles X Prior to lexical filehandles (L), all file and directory -handles used barewords. You can almost always safely rewrite this code to use +handles used barewords. You can almost always safely rewrite this code to use lexical filehandles; the exceptions are C, C, and C. +Fortunately, Perl's parser recognizes these. + +=head3 Bareword function calls X +X> + +Code written without C may use bareword function names. Adding +parentheses makes the code pass strictures. Use C (see +C) to discover how Perl parses them, then parenthesize +accordingly. -Code written without C in effect may use bareword function -names. 
You may safely parenthesize the argument lists to these functions -without changing the intent of the codeN to -discover how Perl parses them, then parenthesize accordingly.>. +=head3 Bareword hash values X @@ -160,17 +176,20 @@ pairs appropriately: =end programlisting -Because neither the C nor C functions exist, Perl parses -these hash values as strings. The C pragma makes the parser -give an error in this situation. +When neither the C nor C functions exist, Perl will +interpret these barewords as strings. C will produce an error in +this situation. + +=head3 Bareword sort functions X X> X> Finally, the C builtin can take as its second argument the I of a -function to use for sorting. Instead provide a I to the function to -use for sorting to avoid the use of barewords: +function to use for sorting. While this is rarely ambiguous to the parser, it +can confuse I readers. Prefer instead a I to the sorting +function: =begin programlisting @@ -183,12 +202,11 @@ use for sorting to avoid the use of barewords: =end programlisting -The result is one line longer, but it avoids the use of a bareword. Unlike -other bareword examples, Perl's parser needs no disambiguation for this syntax. -There is only one way for it to interpret C. However, the -clarity of an explicit reference can help human readers. - -Perl 5's parser I understand the single-line version: +The result is one line longer, but it avoids the use of a bareword. +Unfortunately, Perl 5's parser I understand the single-line version +due to the special parsing of C; you cannot use an arbitrary expression +(such as taking a reference to a named function) where a block or a scalar +might otherwise go. 
=begin programlisting @@ -196,8 +214,3 @@ Perl 5's parser I understand the single-line version: my @sorted = sort \&compare_lengths @unsorted; =end programlisting - -This is due to the special parsing of C; you cannot use an arbitrary -expression (such as taking a reference to a named function) where a block or a -scalar might otherwise go. - diff --git a/sections/chapter_11.pod b/sections/chapter_11.pod index 5fd94e1a..c52823e6 100644 --- a/sections/chapter_11.pod +++ b/sections/chapter_11.pod @@ -1,10 +1,9 @@ =head0 What to Avoid -Perl 5 isn't perfect. Some features seemed like good ideas at the time, but -they're difficult to use correctly. Others don't work as anyone might expect. -A few more are simply bad ideas. These features will likely persist--removing -a feature from Perl is a serious process reserved for only the most egregious -offenses--but you can and should avoid them in almost every case. +Perl 5 isn't perfect. Some features are difficult to use correctly. Others +have never worked well. A few are quirky combinations of other features with +strange edge cases. While you're better off avoiding these features, knowing +why to avoid them will help you find better solutions. L diff --git a/sections/indirect_objects.pod b/sections/indirect_objects.pod index ebc6c830..455f491c 100644 --- a/sections/indirect_objects.pod +++ b/sections/indirect_objects.pod @@ -2,11 +2,10 @@ Z -A constructor in Perl 5 is anything which returns an object; C is not a -builtin operator. By convention, constructors are class methods named -C, but you have the flexibility to choose a different approach to meet -your needs. Several old Perl 5 object tutorials promote the use of C++ and -Java-style constructor calls: +Perl 5 has no operator C; a constructor in Perl 5 is anything which +returns an object. By convention, constructors are class methods named +C, but you can choose anything you like. 
Several old Perl 5 object +tutorials promote the use of C++ and Java-style constructor calls: =begin programlisting @@ -14,7 +13,7 @@ Java-style constructor calls: =end programlisting -... instead of the unambiguous: +... instead of the obvious method call: =begin programlisting @@ -22,26 +21,26 @@ Java-style constructor calls: =end programlisting -These syntaxes are equivalent in behavior, except when they're not. +These syntaxes produce equivalent behavior, except when they don't. + +=head2 Bareword Indirect Invocations X X -The first form is the indirect object form (more precisely, the I -case), where the verb (the method) precedes the noun to which it refers (the -object). This is fine in spoken languages, but it introduces parsing +In the indirect object form (more precisely, the I case) of the first +example, the verb (the method) precedes the noun to which it refers (the +object). This is fine in spoken languages, but it introduces parsing ambiguities in Perl 5. -=head2 Bareword Indirect Invocations - -One problem is that the name of the method is a bareword (L). The -parser must apply several heuristics to determine the proper interpretation. -While these heuristics are well-tested and I always correct, their -failure modes are confusing. Worse, they're fragile in the face of the -I of compilation and module loading. +As the method name is a bareword (L), the parser must divine the +proper interpretation of the code through the use of several heuristics. While +these heuristics are well-tested and I always correct, their failure +modes are confusing. Worse yet, they depend on the order of compilation of code +and modules. -Parsing is more difficult for humans I the computer when the constructor -takes arguments. The indirect style may resemble: +Parsing difficulty increases when the constructor takes arguments. The indirect +style may resemble: =begin programlisting @@ -50,21 +49,21 @@ takes arguments. 
The indirect style may resemble: =end programlisting -... thus making the class name C look like a function call. Perl 5 -I disambiguate many of these cases, but its heuristics depend on which -package names the parser has seen at the current point in the parse, which -barewords it has already resolved (and how it resolved them), and the I -of functions already declared in the current package. +... thus making the name of the class look like a function call. Perl 5 I +disambiguate many of these cases, but its heuristics depend on which package +names the parser has seen, which barewords it has already resolved (and how it +resolved them), and the I of functions already declared in the current +package. -Imagine running afoul of a prototyped function (L) with a -name which just happens to conflict somehow with the name of a class or a -method called indirectly. This is infrequent, but so difficult to debug that -avoiding this syntax is always worthwhile. +Imagine running afoul of a prototyped function (L) with a name +which just happens to conflict somehow with the name of a class or a method +called indirectly. This is rare, but so unpleasant to debug that it's worth +avoiding indirect invocations. =head2 Indirect Notation Scalar Limitations Another danger of the syntax is that the parser expects a single scalar -expression as the object. Printing to a filehandle stored in an aggregate +expression as the object. Printing to a filehandle stored in an aggregate variable I obvious, but it is not: =begin programlisting @@ -74,6 +73,8 @@ variable I obvious, but it is not: =end programlisting +Perl will attempt to call C on the C<$config> object. + X> X> X> @@ -81,9 +82,8 @@ X> C, C, and C--all builtins which operate on filehandles--operate in an indirect fashion. This was fine when filehandles were package globals, but lexical filehandles (L) make the -indirect object syntax problems obvious. 
In the previous example, Perl will -try to call the C method on the C<$config> object. The solution is to -disambiguate the expression which produces the intended invocant: +indirect object syntax problems obvious. To solve this, disambiguate the +subexpression which produces the intended invocant: =begin programlisting @@ -103,24 +103,17 @@ construct an object, call the constructor method on the class name directly: =end programlisting -X> +X> For the limited case of filehandle operations, the dative use is so prevalent that you can use the indirect invocation approach if you surround your intended -invocant with curly brackets. Another option is to use the core C -module which adds IO methods to lexical filehandles. - -=begin sidebar - -For supreme paranoia, disambiguate class method calls further by appending -C<::> to the end of class names, such as C<< CGI::->new() >>. Very little code -does this in practice, however. - -=end sidebar +invocant with curly brackets. If you're using Perl 5.14 (or if you load +C or C), you can use methods on lexical +filehandlesN< and C though.>>. -X> -X> -X> +X> +X> +X> The CPAN module C (a plugin for C) can identify indirect invocations during code reviews. The diff --git a/sections/method_sub_equivalence.pod b/sections/method_sub_equivalence.pod index 0c7d6439..420add65 100644 --- a/sections/method_sub_equivalence.pod +++ b/sections/method_sub_equivalence.pod @@ -5,21 +5,18 @@ Z X> Perl 5's object system is deliberately minimal (L). -Because a class is a package, Perl itself makes no strong distinction between a -function stored in a package and a method stored in a package. The same -builtin, C, expresses both. Documentation and the convention of treating -the first parameter as C<$self> can imply intent to readers of the code, but -Perl itself will treat any function of the appropriate name it can find in an -appropriate package as a method if you try to call it as a method. 
+Because a class is a package, Perl does not distinguish between a function and +a method stored in a package. The same builtin, C, declares both. +Documentation can clarify your intent, but Perl will happily dispatch to a +function called as a method. Likewise, you can invoke a method as if it were a +function--fully-qualified, exported, or as a reference--if you pass in your own +invocant manually. -Likewise, you can invoke a method as if it were a function--fully-qualified, -exported, or as a reference--if you pass in your own invocant manually. - -Both approaches have their problems; avoid them. +Invoking the wrong thing in the wrong way causes problems. =head2 Caller-side -Suppose you have a class which contains several methods: +Consider a class with several methods: =begin programlisting @@ -37,8 +34,8 @@ Suppose you have a class which contains several methods: =end programlisting -If you have an C object C<$o>, the following invocations of this method -I seem equivalent: +Given an C object C<$o>, the following invocations of this method I +seem equivalent: =begin programlisting @@ -50,22 +47,21 @@ I seem equivalent: =end programlisting Though in this simple case, they produce the same output, the latter violates -the encapsulation of objects in subtle ways. It avoids method lookup -altogether. +object encapsulation by avoiding method lookup. X If C<$o> were instead a subclass or allomorph (L) of C which -overrode C, calling the method as a function would produce -the wrong behavior. Any change to the implementation of C, -such as a modification of inheritance or delegation through -C--might break calling code. +overrode C, bypassing method dispatch would call the wrong +method. Any change to the implementation of C, such as a +modification of inheritance or delegation through C--might break +calling code. X X; C> X> -Perl has one circumstance where this behavior may seem necessary. 
If you force +Perl has one circumstance where this behavior may seem necessary. If you force method resolution without dispatch, how do you invoke the resulting method reference? @@ -96,7 +92,7 @@ The second is to use the reference itself with method invocation syntax: =end programlisting When C<$meth_ref> contains a function reference, Perl will invoke that -reference with C<$o> as the invocant. This works even under strictures, as it +reference with C<$o> as the invocant. This works even under strictures, as it does when invoking a method with a scalar containing its name: =begin programlisting @@ -107,13 +103,12 @@ does when invoking a method with a scalar containing its name: =end programlisting There is one small drawback in invoking a method by reference; if the structure -of the program has changed between storing the reference and invoking the -reference, the reference may no longer refer to the current, most appropriate -method. If the C class has changed such that C -is no longer the right method to call, the reference in C<$meth_ref> will not -have updated. +of the program changes between storing the reference and invoking the +reference, the reference may no longer refer to the most appropriate method. If +the C class has changed such that C is no longer +the right method to call, the reference in C<$meth_ref> will not have updated. -If you use this form of invocation, limit the scope of the references. +When you use this invocation form, limit the scope of the references. =head2 Callee-side @@ -123,22 +118,15 @@ X Because Perl 5 makes no distinction between functions and methods at the point of declaration and because it's I (however inadvisable) to invoke a given function as a function or a method, it's possible to write a function -callable as either. - -The core C module is a prime offender. Its functions manually inspect -C<@_> to determine whether the first argument is a likely invocant. 
If so, -they ensure that any object state the function needs to access is available. -If the first argument is not a likely invocant, the function must consult -global data elsewhere. - -As with all heuristics, there are corner cases. It's difficult to predict -exactly which invocants are potentially valid for a given method, especially -when considering that users can create their own subclasses. The documentation -burden is also greater, given the need to explain the dichotomy of the code and -the desire to avoid misuse. What happens when one part of the project uses the -procedural interface and another uses the object interface? - -Providing separate procedural and object interfaces to a library may be -justifiable. Some designs make some techniques more useful than others. -Conflating the two into a single API will create a maintenance burden. Avoid -it. +callable as either. The core C module is a prime offender. Its functions +apply several heuristics to determine whether their first arguments are +invocants. + +The drawbacks are many. It's difficult to predict exactly which invocants are +potentially valid for a given method, especially when you may have to deal with +subclasses. Creating an API that users cannot easily misuse is more difficult +too, as is your documentation burden. What happens when one part of the project +uses the procedural interface and another uses the object interface? + +If you I provide a separate procedural and OO interface to a library, +create two separate APIs. diff --git a/sections/prototypes.pod b/sections/prototypes.pod index 0e55309b..e12f4ef5 100644 --- a/sections/prototypes.pod +++ b/sections/prototypes.pod @@ -4,48 +4,39 @@ Z X -A I is a piece of optional metadata attached to a function -declaration. Novices commonly assume that these prototypes serve as function -signatures; they do not. 
Instead they serve two separate purposes: they offer -hints to the parser to change the way it parses functions and their arguments, -and they modify the way Perl 5 handles arguments to those functions. +A I is a piece of optional metadata attached to a function which +changes the way the parser understands its arguments. While they may +superficially resemble function signatures in other languages, they are very +different. -To declare a function prototype, add it after the name: +X> -=begin programlisting - - sub foo (&@); - sub bar ($$) { ... } - my $baz = sub (&&) { ... }; - -=end programlisting +Prototypes allow users to define their own functions which behave like +builtins. Consider the builtin C, which takes an array and a list. While +Perl 5 would normally flatten the array and list into a single list passed to +C, the parser knows not to flatten the array so that C can modify +it in place. -You may add prototypes to function forward declarations. You may also omit -them from forward declarations. If you use a forward declaration with a -prototype, that prototype must be present in the full function declaration; -Perl will give a prototype mismatch warning if not. The converse is not true: -you may omit the prototype from a forward declaration and include it for the -full declaration. +Function prototypes are part of declarations: -=begin sidebar +=begin programlisting -There's little reason to omit the prototype from a forward declaration except -for the desire to write too-clever code. + sub foo B<(&@)>; + sub bar B<($$)> { ... } + my $baz = sub B<(&&)> { ... }; -=end sidebar +=end programlisting -The original intent of prototypes was to allow users to define their own -functions which behaved like (certain) builtin operators. Consider the -behavior of the C operator, which takes an array and a list. 
While Perl -5 would normally flatten the array and list into a single list at the call -site, the Perl 5 parser knows that a call to C must effectively pass the -array as a single unit so that C can operate on the array in place. +Any prototype attached to a forward declaration must match the prototype +attached to the function declaration. Perl will give a warning if this is not +true. Strangely you may omit the prototype from a forward declaration and +include it for the full declaration--but there's no reason to do so. X> The builtin C takes the name of a function and returns a string -representing its prototype. To see the prototype of a builtin, use the -C form: +representing its prototype. Use the C form to see the prototype of a +builtin: =begin programlisting @@ -58,8 +49,8 @@ C form: =end programlisting -Some builtins have prototypes you cannot emulate. In these cases, -C will return C: +C will return C for those builtins whose functions you cannot +emulate: =begin programlisting @@ -73,7 +64,7 @@ C will return C: =end programlisting -Look at C again: +Remember C? =begin programlisting @@ -82,10 +73,10 @@ Look at C again: =end programlisting -The C<@> character represents a list. The backslash forces the use of a -I to the corresponding argument. Thus this function takes a -reference to an array (because you can't take a reference to a list) and a list -of values. C might be: +The C<@> character represents a list. The backslash forces the use of a +I to the corresponding argument. This prototype means that C +takes a reference to an array and a list of values. You might write C +as: =begin programlisting @@ -97,15 +88,15 @@ of values. C might be: =end programlisting -Valid prototype characters include C<$> to force a scalar argument, C<%> to -mark a hash (most often used as a reference), and C<&> which marks a code -block. See C for full documentation. 
+Other prototype characters include C<$> to force a scalar argument, C<%> to +mark a hash (most often used as a reference), and C<&> to identify a code +block. See C for full documentation. =head2 The Problem with Prototypes -Prototypes can change the parsing of subsequent code and they can coerce the -types of arguments. They don't serve as documentation to the number or types -of arguments functions expect, nor do they map arguments to named parameters. +Prototypes change how Perl parses your code and can cause argument type +coercions. They do not document the number or types of arguments functions +expect, nor do they map arguments to named parameters. Prototype coercions work in subtle ways, such as enforcing scalar context on incoming arguments: @@ -124,7 +115,7 @@ incoming arguments: =end programlisting -... but do I work on anything more complex than a simple expression: +... but only work on simple expressions: =begin programlisting @@ -136,23 +127,23 @@ incoming arguments: =end programlisting -Those aren't even the I kinds of confusion you can get from -prototypes. +To debug this, users of C must know both that a prototype exists, and +the limitations of the array prototype. Worse yet, these are the I +errors prototypes can cause. =head2 Good Uses of Prototypes -As long as code maintainers do not confuse them for full function signatures, -prototypes have a few valid uses. - X X> X pragma> -First, they are often necessary to emulate and override builtins with -user-defined functions. You must first check that you I override the -builtin by checking that C does not return C. Once you know -the prototype of the builtin, use a forward declaration of a function with the -same name as the core builtin: +Few uses of prototypes are compelling enough to overcome their drawbacks, but +they exist. + +First, they can allow you to override builtins. First check that you I +override the builtin by examining its prototype in a small test program. 
Then +use the C pragma to tell Perl that you plan to override a builtin, +and finally declare your override with the correct prototype: =begin programlisting @@ -167,10 +158,11 @@ regardless of any lexical scoping. X -The second reason to use prototypes is to define compile-time constants. A -function declared with an empty prototype (as opposed to I prototype) -which evaluates to a single expression becomes a constant rather than a function -call: +The second reason to use prototypes is to define compile-time constants. When +Perl encounters a function declared with an empty prototype (as opposed to +I prototype) I this function evaluates to a single constant +expression, the optimizer will turn all calls to that function into constants +instead of function calls: =begin programlisting @@ -178,31 +170,26 @@ call: =end programlisting -After it processed that prototype declaration, the Perl 5 optimizer -knows it should substitute the calculated value of pi whenever it -encounters a bareword or parenthesized call to C in the rest of the source -code (with respect to scoping and visibility). +All subsequent code will use the calculated value of pi in place of the +bareword C or a call to C, with respect to scoping and visibility. X pragma> X> -X> +X> -Rather than defining constants directly, the core C pragma handles -the details for you and may be clearer to read. If you want to interpolate -constants into strings, the C module from the CPAN may be more -useful. +The core pragma C handles these details for you. The C +module from the CPAN creates constant scalars which you can interpolate into +strings. -X> -X> +X> +X> -The final reason to use a prototype is to extend Perl's syntax to operate on -anonymous functions as blocks. The CPAN module C uses this to -good effect to provide a nice API with delayed computationN is a -newer alternative which may prove more popular in the future.>. 
Its -C function takes three arguments: a block of code to run, a -regular expression to match against the string of the exception, and an -optional description of the test. Suppose that you want to test Perl 5's -exception message when attempting to invoke a method on an undefined value: +A final reasonable use of prototypes is to extend Perl's syntax to operate on +anonymous functions as blocks. The CPAN module C uses this to +good effect to provide a nice API with delayed computationN>. Its C function takes three arguments: a block of +code to run, a regular expression to match against the string of the exception, +and an optional description of the test: =begin programlisting @@ -216,23 +203,15 @@ exception message when attempting to invoke a method on an undefined value: =end programlisting -The exported C function has a prototype of C<&$;$>. Its first -argument is a block, which Perl upgrades to a full-fledged anonymous function. -The second requirement is a scalar. The third argument is optional. - -The most careful readers may have spotted a syntax oddity notable in its -absence: there is no trailing comma after the end of the anonymous function -passed as the first argument to C. This is a quirk of the Perl 5 -parser. Adding the comma causes a syntax error. The parser expects -whitespace, not the comma operator. - -=begin sidebar - -The "no commas here" rule is a drawback of the prototype syntax. +The exported C function has a prototype of C<&$;$>. Its first +argument is a block, which becomes an anonymous function. The second argument +is a scalar. The third argument is optional. -=end sidebar +Careful readers may have spotted the absence of a comma after the block. This +is a quirk of the Perl 5 parser, which expects whitespace after a prototyped +block, not the comma operator. This is a drawback of the prototype syntax. -You can use this API without the prototype. 
It's slightly less attractive: +You may use C without taking advantage of the prototype: =begin programlisting @@ -248,11 +227,21 @@ You can use this API without the prototype. It's slightly less attractive: X> -A sparing use of function prototypes to remove the need for the C builtin -is reasonable. Another is when defining a custom function to use with -CN. Declare this function with a -prototype of C<($$)> and Perl will pass its arguments in C<@_> rather than the -package globals C<$a> and C<$b>. This is a rare case, but it can save you time -debugging. +A final good use of prototypes is when defining a custom named function to use +with CN: + +=begin programlisting + + sub length_sort ($$) + { + my ($left, $right) = @_; + + return length($left) <=> length($right); + } + + my @sorted = sort length_sort @unsorted; + +=end programlisting -Few other uses of prototypes are compelling enough to overcome their drawbacks. +The prototype of C<$$> forces Perl to pass the sort pairs in C<@_> instead of +the package globals C<$a> and C<$b>. diff --git a/sections/tie.pod b/sections/tie.pod index 34e9107e..6909a389 100644 --- a/sections/tie.pod +++ b/sections/tie.pod @@ -2,38 +2,42 @@ Z +Where overloading (L) allows you to customize the behavior of +classes and objects for specific types of coercion, a mechanism called I +allows you to customize the behavior of primitive variables (scalars, arrays, +hashes, and filehandles). Any operation you might perform on a tied variable +translates to a specific method call. + X> +X> -Overloading (L) lets you give classes custom behavior for specific -types of coercions and accesses. A similar mechanism exists for making classes -act like built-in types (scalars, arrays, and hashes), but with more specific -behaviors. This mechanism uses the C builtin; it is I. +The C builtin originally allowed the use of hashes stored on disk, so that +Perl could access files larger than could easily fit in memory. 
The core module +C provides a similar system, and allows you to treat files as if +they were arrays. -The original use of C was to produce a hash stored on disk, rather than in -memory. This allowed the use of DBM files from Perl, as well as the ability to -access files larger than could fit in memory. The core module C -provides a similar system by which to handle data files too large to fit in -memory. +X> +X> +X> The class to which you C a variable must conform to a defined interface -for the specific data type. C is the primary source of -information about these interfaces, though the core modules C, -C, and C are more useful in practice. Inherit -from them to start, and override only those specific methods you need to -modify. +for a specific data type. See C for an overview, then consult +the core modules C, C, and C for +specific details. Start by inheriting from one of those classes, then override +any specific methods you need to modify. =begin sidebar -C, C, and C define the necessary interfaces -to tie scalars, arrays, and hashes, but C, C, -and C provide the default implementations. If C hasn't -confused you, the organization of this code might. +If C weren't confusing enough, C, C, and +C define the necessary interfaces to tie scalars, arrays, and +hashes, but C, C, and C provide +the default implementations. =end sidebar =head2 Tying Variables -Given a variable to tie, tie it with the syntax: +To tie a variable: =begin programlisting @@ -42,53 +46,51 @@ Given a variable to tie, tie it with the syntax: =end programlisting -... where the first argument is the variable to tie, the second is the name of -the class into which to tie it, and C<@args> is an optional list of arguments -required for the tying function. In the case of C, this is the name -of the file to which to tie the array. 
+The first argument is the variable to tie, the second is the name of the class +into which to tie it, and C<@args> is an optional list of arguments required +for the tying function. In the case of C, this is a valid filename. X> X> Tying functions resemble constructors: C, C, C, or C for scalars, arrays, hashes, and filehandles -respectively. Each function returns a new object which represents the tied -variable. Both the C and C builtins return this object, but most -people ignore it in favor of checking its boolification to determine whether a -given variable is tied. +respectively. Each function returns a new object which represents the tied +variable. Both the C and C builtins return this object. Most people +use C in a boolean context, however. =head2 Implementing Tied Variables To implement the class of a tied variable, inherit from a core module such as C, then override the specific methods for the operations you -want to change. In the case of a tied scalar, you probably need to override -C and C, may need to override C, and can often -ignore C. +want to change. In the case of a tied scalar, these are likely C and +C, possibly C, and probably not C. 
You can create a class which logs all reads from and writes to a scalar with very little code: =begin programlisting - package Tie::Scalar::Logged; - - use Modern::Perl; - - use Tie::Scalar; - use parent -norequire => 'Tie::StdScalar'; - - sub STORE - { - my ($self, $value) = @_; - Logger->log("Storing <$value> (was [$$self])", 1); - $$self = $value; - } - - sub FETCH + package Tie::Scalar::Logged { - my $self = shift; - Logger->log("Retrieving <$$self>", 1); - return $$self; + use Modern::Perl; + + use Tie::Scalar; + use parent -norequire => 'Tie::StdScalar'; + + sub STORE + { + my ($self, $value) = @_; + Logger->log("Storing <$value> (was [$$self])", 1); + $$self = $value; + } + + sub FETCH + { + my $self = shift; + Logger->log("Retrieving <$$self>", 1); + return $$self; + } } 1; @@ -96,9 +98,14 @@ very little code: =end programlisting Assume that the C class method C takes a string and the number -of frames up the call stack of which to report the location. Be aware that -C does not have its own F<.pm> file, so you must use -C to make it available. +of frames up the call stack of which to report the location. + +=begin tip Using C + +C lacks its own F<.pm> file; C to make it +available. + +=end tip Within the C and C methods, C<$self> works as a blessed scalar. Assigning to that scalar reference changes the value of the scalar and @@ -119,19 +126,19 @@ F. =head2 When to use Tied Variables -Tied variables seem like fun opportunities for cleverness, but they make for -confusing interfaces in almost all cases, due mostly to their rarity. Unless -you have a very good reason for making objects behave as if they were built-in -data types, avoid creating your own ties. +Tied variables seem like fun opportunities for cleverness, but they can produce +confusing interfaces. Unless you have a very good reason for making objects +behave as if they were builtin data types, avoid creating your own ties. 
+C is also much slower than using the builtin types due to various reasons +of implementation. Good reasons include to ease debugging (use the logged scalar to help you understand where a value changes) and to make certain impossible operations -possible (accessing large files in a memory-efficient way). Tied variables are +possible (accessing large files in a memory-efficient way). Tied variables are less useful as the primary interfaces to objects; it's often too difficult and constraining to try to fit your whole interface to that supported by C. -The final word of warning is both sad and convincing; far too much code does -not expect to work with tied variables. Code which violates encapsulation may -prohibit good and valid uses of cleverness. This is unfortunate, but violating -the expectations of library code tends to reveal bugs that are often out of -your power to fix. +The final word of warning is both sad and convincing; too much code goes out of +its way to I use of tied variables, often by accident. This is +unfortunate, but violating the expectations of library code tends to reveal +bugs that are often out of your power to fix.