@@ -1911,40 +1911,177 @@ present. See L<overload/fallback> for more details.
19111911
19121912=head2 The Structure of an XSUB
19131913
1914- XXX TBC
1914+ Following any file-scoped XS keywords and directives, an XSUB may appear.
1915+ The start of an XSUB is usually indicated by a blank line followed by
1916+ something starting on column one which isn't otherwise recognised as an
1917+ XSUB keyword or file-scoped directive.
1918+
1919+ An XSUB definition consists of a declaration (typically two lines),
1920+ followed by an optional body. The declaration specifies the XSUB's name,
1921+ parameters and return type. The body consists of sections started by
1922+ keywords, which may specify how its parameters and any return value
1923+ should be processed, and what the main C code body of the XSUB consists
1924+ of. Other keywords can change the behaviour of the XSUB, or affect how it
1925+ is registered with Perl, e.g. with extra named aliases. In the absence of
1926+ an explicit main C code body specified by the C<CODE> or C<PPCODE>
1927+ keywords, the parser will generate a body automatically; this is referred
1928+ to as L<autocall|/"Auto-calling a C function"> in this document.
1929+
1930+ Nothing can appear between keyword sections apart from POD, XS comments,
1931+ and trailing blank lines, all of which are stripped out before the main
1932+ parsing takes place. Anything else will either raise an error, or be
1933+ interpreted as the start of a new XSUB.
1934+
1935+ An XSUB's body can be thought of as having up to five parts. These are, in
1936+ order of appearance, the L<Input|/"The XSUB Input Part">, L<Init|/"The
1937+ XSUB Init Part">, L<Code|/"The XSUB Code Part">, L<Output|/"The XSUB
1938+ Output Part"> and L<Cleanup|/"The XSUB Cleanup Part"> parts. There is no
1939+ formal syntax to define this structure; it's just an understanding that
1940+ certain keywords may only appear in certain parts and thus may only appear
1941+ after certain other keywords etc.
19151942
1916- XXX mention that XS comments and POD can appear between keywords
19171943
19181944=head2 An XSUB Declaration
19191945
1920- XXX TBC
1946+ # A simple declaration:
1947+
1948+ int
1949+ foo1(int i, char *s)
1950+
1951+ # All on one line; plus a default parameter value:
1952+
1953+ int foo2(int i, char *s = "")
1954+
1955+ # Complex parameters; plus variable argument count:
1956+
1957+ int
1958+ foo3(OUT int i, IN_OUTLIST char *s, STRLEN length(s), ...)
1959+
1960+ # No automatic argument processing:
1961+
1962+ void
1963+ foo4(...)
1964+ PPCODE:
19211965
1922- =head3 The NO_OUTPUT Keyword
1966+ # C++ method; plus various return type qualifiers:
19231967
1924- The NO_OUTPUT can be placed as the first token of the XSUB. This keyword
1925- indicates that while the C subroutine we provide an interface to has
1926- a non-C<void> return type, the return value of this C subroutine should not
1927- be returned from the generated Perl subroutine.
1968+ NO_OUTPUT extern "C" static int
1969+ X::Y::foo5(int i, char *s) const
19281970
1929- With this keyword present the C<RETVAL> variable is created, and in the
1930- generated call to the subroutine this variable is assigned to, but the value
1931- of this variable is not going to be used in the auto-generated code.
19321971
1933- This keyword makes sense only if C<RETVAL> is going to be accessed by the
1934- user-supplied code. It is especially useful to make a function interface
1935- more Perl-like, especially when the C return value is just an error condition
1936- indicator. For example,
1972+ An XSUB declaration consists of a return type, name, parameters, and
1973+ optional C<NO_OUTPUT>, C<extern "C">, C<static> and C<const> keywords.
1974+
1975+ =head3 An XSUB's return type and the NO_OUTPUT keyword
1976+
1977+ The return type can be any valid C type, including C<void>. When non-void,
1978+ it serves two purposes. First, it causes a C auto variable of that type
1979+ to be declared, called C<RETVAL>. Second, it (usually) makes the XSUB
1980+ return a single SV whose value is set to C<RETVAL>'s value at the time of
1981+ return. In addition, a non-void autocall XSUB will call the underlying C
1982+ library function and assign its return value to C<RETVAL>.
1983+
1984+ If the return type is prefixed with the C<NO_OUTPUT> keyword, then the
1985+ C<RETVAL> variable is still declared, but code to return its value is
1986+ suppressed. This is typically useful when making an autocall function
1987+ interface more Perl-like, especially when the C return value is just an
1988+ error condition indicator. For example,
19371989
19381990 NO_OUTPUT int
19391991 delete_file(char *name)
1992+ # implicit autocall code here: RETVAL = delete_file(name);
19401993 POSTCALL:
19411994 if (RETVAL != 0)
19421995 croak("Error %d while deleting file '%s'", RETVAL, name);
19431996
1944- Here the generated XS function returns nothing on success, and will die()
1945- with a meaningful error message on error.
1997+ Here the generated XS function returns nothing on success, and will
1998+ C<die()> with a meaningful error message on error. The XSUB's return type
1999+ of C<int> is only meaningful for declaring C<RETVAL> and for doing the
2000+ autocall.
2001+
2002+ The return type can also include the C<extern "C"> and C<static>
2003+ modifiers, which if present must be in that order, and come between any
2004+ C<NO_OUTPUT> keyword and the return type. The C<extern> declaration must
2005+ be written exactly as shown, i.e. with a single space and with double
2006+ quotes around the C<C>. These two modifiers are mainly of use for XSUBs
2007+ written in C++. A C++ XSUB declaration is also allowed to have a trailing
2008+ C<const> keyword, which mimics the C++ syntax. See L</"Using XS With C++">
2009+ for more details.
2010+
2011+ =head3 An XSUB's name
2012+
2013+ The name of the XSUB is usually put on the line following the type, in
2014+ which case it must be on column one. It is permissible for both the return
2015+ type and name to be on the same line.
2016+
2017+ The name can be any valid Perl subroutine name. The C<PACKAGE> value from
2018+ the most recent C<MODULE> declaration is used to give the XSUB its
2019+ fully-qualified Perl name.
2020+
2021+ If the name includes the package separator, C<::>, then it is treated as
2022+ a C++ method declaration, and various extra bits of processing take
2023+ place, such as declaring an implicit C<THIS> parameter. The XSUB's I<Perl>
2024+ package name is still determined by the current XS package, and not the
2025+ C++ class name. See L</"Using XS With C++"> for more details.
2026+
2027+ =head3 An XSUB's parameter list
2028+
2029+ Following the XSUB's name, there is a comma-separated list of parameters
2030+ within parentheses. Although this looks superficially the same as a C
2031+ function declaration, it is different. In particular, it is parsed by the
2032+ XS compiler, which is a simple regex-based text processor and which
2033+ doesn't understand the full C type syntax; nor does it recognise C-style
2034+ comments.
2035+
2036+ In fact all it does is extract the text between the C<(...)> and split on
2037+ commas, while having enough intelligence to ignore commas and a closing
2038+ parenthesis within a double-quoted string. Once each parameter declaration
2039+ is extracted, it is processed, as described below in
2040+ L</"An XSUB Parameter">.
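The splitting rule described above can be sketched in C. This is only an illustration under stated assumptions, not the real XS parser (which is written in Perl): the names `split_params`, `demo_split` and `PARAM_MAX` are hypothetical, the input is assumed to be the text already extracted from between the outer parentheses, and backslash escapes inside strings are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PARAM_MAX 64   /* illustrative fixed buffer size per parameter */

/* Hypothetical helper: split a parameter list at commas, except commas
 * that appear inside a double-quoted string. Leading and trailing blanks
 * of each parameter are dropped. Stores at most max_out parameters and
 * returns how many were stored. */
size_t
split_params(const char *sig, char out[][PARAM_MAX], size_t max_out)
{
    size_t count = 0;      /* parameters stored so far */
    size_t len = 0;        /* length of the parameter being built */
    int in_string = 0;     /* currently inside a double-quoted string? */

    for (const char *p = sig; ; p++) {
        char c = *p;

        if (c == '\0' || (c == ',' && !in_string)) {
            /* End of one parameter: trim trailing blanks and store it. */
            if (count < max_out) {
                while (len > 0 && out[count][len - 1] == ' ')
                    len--;
                out[count][len] = '\0';
                if (len > 0)
                    count++;
            }
            len = 0;
            if (c == '\0')
                break;
            continue;
        }

        if (c == '"')
            in_string = !in_string;

        if (len == 0 && c == ' ' && !in_string)
            continue;      /* skip leading blanks */

        if (count < max_out && len < PARAM_MAX - 1)
            out[count][len++] = c;
    }
    return count;
}

/* Self-check: the comma and ')' inside the default string value do not
 * end the second parameter. */
void
demo_split(void)
{
    char out[4][PARAM_MAX];
    size_t n = split_params("int i, char *s = \"a,b)\", ...", out, 4);

    assert(n == 3);
    assert(strcmp(out[0], "int i") == 0);
    assert(strcmp(out[1], "char *s = \"a,b)\"") == 0);
    assert(strcmp(out[2], "...") == 0);
}
```

Each piece returned by such a splitter would then be handled as a single parameter declaration. A plain comma split would instead cut the default value `"a,b)"` in two.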
2041+
2042+ Each parameter declaration usually generates a C auto variable declaration
2043+ of the same name, along with initialisation code which assigns the value
2044+ of the corresponding passed argument to that variable. Under some
2045+ circumstances code can also be generated to return the value.
2046+
2047+ Note that the original XS syntax required the type for each parameter to
2048+ be specified separately in one or more INPUT sections, mimicking pre-C89
2049+ "K&R" C syntax. To support this, directly after the declaration there is an
2050+ implicit INPUT section, without a need to include the actual keyword. You
2051+ will see this pattern very frequently in older XS code.
2052+
2053+ Old style with an implicit INPUT keyword (a common pattern):
2054+
2055+ int
2056+ foo(a, b)
2057+ long a
2058+ char *b
2059+ CODE:
2060+ ...
2061+
2062+ Old style with explicit INPUT keyword (unusual):
2063+
2064+ int
2065+ foo(a, b)
2066+ INPUT:
2067+ long a
2068+ char *b
2069+ CODE:
2070+ ...
2071+
2072+ New style (recommended for new code):
2073+
2074+ int
2075+ foo(long a, char *b)
2076+ CODE:
2077+ ...
2078+
2079+ Generally there is no reason to use the old style any more, apart from a few
2080+ obscure features that can be specified on an INPUT line but not in the
2081+ signature.
2082+
19462083
1947- =head2 XSUB Parameters
2084+ =head2 An XSUB Parameter
19482085
19492086=head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
19502087
@@ -1971,7 +2108,7 @@ pointers.
19712108
19722109The return list of the generated Perl function consists of the C return value
19732110from the function (unless the XSUB is of C<void> return type or
1974- C<The NO_OUTPUT Keyword > was used) followed by all the C<OUTLIST>
2111+ C<NO_OUTPUT> was used) followed by all the C<OUTLIST>
19752112and C<IN_OUTLIST> parameters (in the order of appearance). On the
19762113return from the XSUB the C<IN_OUT>/C<OUT> Perl parameter will be
19772114modified to have the values written by the C function.
@@ -2614,7 +2751,7 @@ executed after the C subroutine call is performed. When the POSTCALL:
26142751keyword is used it must precede OUTPUT: and CLEANUP: blocks which are
26152752present in the XSUB.
26162753
2617- See examples in L<"The NO_OUTPUT Keyword ">.
2754+ See an example in L</"An XSUB Declaration">.
26182755
26192756The POSTCALL: block does not make a lot of sense when the C subroutine
26202757call is supplied by user by providing either CODE: or PPCODE: section.