Commit e5346fd

perlxs.pod: update XSUB Structure + Declaration
Populate the new "=head2 The Structure of an XSUB" and "=head2 An XSUB Declaration" sections
1 parent 9af5c74 commit e5346fd

1 file changed

+157
-20
lines changed

dist/ExtUtils-ParseXS/lib/perlxs.pod

Lines changed: 157 additions & 20 deletions
@@ -1911,40 +1911,177 @@ present. See L<overload/fallback> for more details.
19111911

19121912
=head2 The Structure of an XSUB
19131913

1914-
XXX TBC
1914+
Following any file-scoped XS keywords and directives, an XSUB may appear.
1915+
The start of an XSUB is usually indicated by a blank line followed by
1916+
something starting on column one which isn't otherwise recognised as an
1917+
XSUB keyword or file-scoped directive.
1918+
1919+
An XSUB definition consists of a declaration (typically two lines),
1920+
followed by an optional body. The declaration specifies the XSUB's name,
1921+
parameters and return type. The body consists of sections started by
1922+
keywords, which may specify how its parameters and any return value
1923+
should be processed, and what the main C code body of the XSUB consists
1924+
of. Other keywords can change the behaviour of the XSUB, or affect how it
1925+
is registered with Perl, e.g. with extra named aliases. In the absence of
1926+
an explicit main C code body specified by the C<CODE> or C<PPCODE>
1927+
keywords, the parser will generate a body automatically; this is referred
1928+
to as L<autocall|/"Auto-calling a C function"> in this document.
1929+
1930+
Nothing can appear between keyword sections apart from POD, XS comments,
1931+
and trailing blank lines, all of which are stripped out before the main
1932+
parsing takes place. Anything else will either raise an error, or be
1933+
interpreted as the start of a new XSUB.
1934+
1935+
An XSUB's body can be thought of as having up to five parts. These are, in
1936+
order of appearance, the L<Input|/"The XSUB Input Part">, L<Init|/"The
1937+
XSUB Init Part">, L<Code|/"The XSUB Code Part">, L<Output|/"The XSUB
1938+
Output Part"> and L<Cleanup|/"The XSUB Cleanup Part"> parts. There is no
1939+
formal syntax to define this structure; it's just an understanding that
1940+
certain keywords may only appear in certain parts and thus may only appear
1941+
after certain other keywords etc.
19151942

1916-
XXX mention that XS comments and POD can appear between keywords
19171943

19181944
=head2 An XSUB Declaration
19191945

1920-
XXX TBC
1946+
# A simple declaration:
1947+
1948+
int
1949+
foo1(int i, char *s)
1950+
1951+
# All on one line; plus a default parameter value:
1952+
1953+
int foo2(int i, char *s = "")
1954+
1955+
# Complex parameters; plus variable argument count:
1956+
1957+
int
1958+
foo3(OUT int i, IN_OUTLIST char *s, STRLEN length(s), ...)
1959+
1960+
# No automatic argument processing:
1961+
1962+
void
1963+
foo4(...)
1964+
PPCODE:
19211965

1922-
=head3 The NO_OUTPUT Keyword
1966+
# C++ method; plus various return type qualifiers:
19231967

1924-
The NO_OUTPUT can be placed as the first token of the XSUB. This keyword
1925-
indicates that while the C subroutine we provide an interface to has
1926-
a non-C<void> return type, the return value of this C subroutine should not
1927-
be returned from the generated Perl subroutine.
1968+
NO_OUTPUT extern "C" static int
1969+
X::Y::foo5(int i, char *s) const
19281970

1929-
With this keyword present the C<RETVAL> variable is created, and in the
1930-
generated call to the subroutine this variable is assigned to, but the value
1931-
of this variable is not going to be used in the auto-generated code.
19321971

1933-
This keyword makes sense only if C<RETVAL> is going to be accessed by the
1934-
user-supplied code. It is especially useful to make a function interface
1935-
more Perl-like, especially when the C return value is just an error condition
1936-
indicator. For example,
1972+
An XSUB declaration consists of a return type, name, parameters, and
1973+
optional C<NO_OUTPUT>, C<extern "C">, C<static> and C<const> keywords.
1974+
1975+
=head3 An XSUB's return type and the NO_OUTPUT keyword
1976+
1977+
The return type can be any valid C type, including C<void>. When non-void,
1978+
it serves two purposes. First, it causes a C auto variable of that type
1979+
to be declared, called C<RETVAL>. Second, it (usually) makes the XSUB
1980+
return a single SV whose value is set to C<RETVAL>'s value at the time of
1981+
return. In addition, a non-void autocall XSUB will call the underlying C
1982+
library function and assign its return value to C<RETVAL>.
1983+
1984+
If the return type is prefixed with the C<NO_OUTPUT> keyword, then the
1985+
C<RETVAL> variable is still declared, but code to return its value is
1986+
suppressed. This keyword is typically useful when making an autocall function
1987+
interface more Perl-like, especially when the C return value is just an
1988+
error condition indicator. For example,
19371989

19381990
NO_OUTPUT int
19391991
delete_file(char *name)
1992+
# implicit autocall code here: RETVAL = delete_file(name);
19401993
POSTCALL:
19411994
if (RETVAL != 0)
19421995
croak("Error %d while deleting file '%s'", RETVAL, name);
19431996

1944-
Here the generated XS function returns nothing on success, and will die()
1945-
with a meaningful error message on error.
1997+
Here the generated XS function returns nothing on success, and will
1998+
C<die()> with a meaningful error message on error. The XSUB's return type
1999+
of C<int> is only meaningful for declaring C<RETVAL> and for doing the
2000+
autocall.
2001+
2002+
The return type can also include the C<extern "C"> and C<static>
2003+
modifiers, which if present must be in that order, and come between any
2004+
C<NO_OUTPUT> keyword and the return type. The C<extern> declaration must
2005+
be written exactly as shown, i.e. with a single space and with double
2006+
quotes around the C<C>. These two modifiers are mainly of use for XSUBs
2007+
written in C++. A C++ XSUB declaration is also allowed to have a trailing
2008+
C<const> keyword, which mimics the C++ syntax. See L</"Using XS With C++">
2009+
for more details.
2010+
2011+
=head3 An XSUB's name
2012+
2013+
The name of the XSUB is usually put on the line following the type, in
2014+
which case it must be on column one. It is permissible for both the return
2015+
type and name to be on the same line.
2016+
2017+
The name can be any valid Perl subroutine name. The C<PACKAGE> value from
2018+
the most recent C<MODULE> declaration is used to give the XSUB its
2019+
fully-qualified Perl name.
2020+
2021+
If the name includes the package separator, C<::>, then it is treated as
2022+
a C++ method declaration, and various extra bits of processing take
2023+
place, such as declaring an implicit C<THIS> parameter. The XSUB's I<Perl>
2024+
package name is still determined by the current XS package, and not the
2025+
C++ class name. See L</"Using XS With C++"> for more details.
2026+
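The naming rule described in this part of the diff can be sketched in a few lines. This is a minimal Python illustration, not the XS compiler's actual code (which is written in Perl). The helper names `is_cxx_method` and `perl_name` and the package name `Foo::Bar` are hypothetical, and the sketch assumes that only the final component of a C++-style name becomes the Perl subroutine name. It ignores any other feature that might alter the name.

```python
def is_cxx_method(xsub_name: str) -> bool:
    """True if the declared name contains the package separator '::'."""
    return "::" in xsub_name


def perl_name(package: str, xsub_name: str) -> str:
    """Return the fully-qualified Perl name for an XSUB (illustrative only).

    The Perl package always comes from the current XS PACKAGE value,
    never from a C++ class qualifier in the declared name.
    """
    # Assumption: only the last component of 'X::Y::foo' names the Perl sub.
    func = xsub_name.rsplit("::", 1)[-1]
    return f"{package}::{func}"


if __name__ == "__main__":
    # 'Foo::Bar' stands in for the current PACKAGE value (hypothetical).
    for name in ("foo1", "X::Y::foo5"):
        print(name, is_cxx_method(name), perl_name("Foo::Bar", name))
```

Under these assumptions, both a plain name and a C++ method name end up in the current XS package; the class qualifier affects only how the C++ call is generated.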
2027+
=head3 An XSUB's parameter list
2028+
2029+
Following the XSUB's name, there is a comma-separated list of parameters
2030+
within parentheses. Although this looks superficially the same as a C
2031+
function declaration, it is different. In particular, it is parsed by the
2032+
XS compiler, which is a simple regex-based text processor and which
2033+
doesn't understand the full C type syntax; nor does it recognise C-style
2034+
comments.
2035+
2036+
In fact all it does is extract the text between the C<(...)> and split on
2037+
commas, while having enough intelligence to ignore commas and a closing
2038+
parenthesis within a double-quoted string. Once each parameter declaration
2039+
is extracted, it is processed, as described below in
2040+
L</"An XSUB Parameter">.
2041+
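The splitting behaviour described above can be illustrated outside the XS compiler. The following is a minimal Python sketch, not the actual parser code (which is Perl and regex-based). The function name `split_params` is hypothetical. Two details are assumptions made for illustration: the list is taken to end at the last `)` in the declaration, and a backslash inside a string is taken to escape the next character.

```python
def split_params(decl: str) -> list[str]:
    """Split an XSUB declaration's parameter list on top-level commas.

    Illustrative only: commas inside a double-quoted string are kept as
    ordinary text, and the list is assumed to end at the last ')'.
    """
    start = decl.index("(") + 1
    end = decl.rindex(")")  # assumption: the last ')' closes the list
    text = decl[start:end]

    params: list[str] = []
    current: list[str] = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string and ch == "\\" and i + 1 < len(text):
            # Assumption: backslash escapes the next character in a string.
            current.append(text[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_string = not in_string
        if ch == "," and not in_string:
            params.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1

    last = "".join(current).strip()
    if last or params:
        params.append(last)
    return params


if __name__ == "__main__":
    print(split_params('int foo2(int i, char *s = "")'))
    # prints ['int i', 'char *s = ""']
```

Note that the sketch does no type parsing at all: each returned string is an unprocessed parameter declaration, mirroring the point that the text is only split here and interpreted later.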
2042+
Each parameter declaration usually generates a C auto variable declaration
2043+
of the same name, along with initialisation code which assigns the value
2044+
of the corresponding passed argument to that variable. Under some
2045+
circumstances code can be generated to return the value too.
2046+
2047+
Note that the original XS syntax required the type for each parameter to
2048+
be specified separately in one or more INPUT sections, mimicking pre-C89
2049+
"K&R" C syntax. To support this, directly after the declaration there is an
2050+
implicit INPUT section, without a need to include the actual keyword. You
2051+
will see this pattern very frequently in older XS code.
2052+
2053+
Old style with an implicit INPUT keyword (a common pattern):
2054+
2055+
int
2056+
foo(a, b)
2057+
long a
2058+
char *b
2059+
CODE:
2060+
...
2061+
2062+
Old style with explicit INPUT keyword (unusual):
2063+
2064+
int
2065+
foo(a, b)
2066+
INPUT:
2067+
long a
2068+
char *b
2069+
CODE:
2070+
...
2071+
2072+
New style (recommended for new code):
2073+
2074+
int
2075+
foo(long a, char *b)
2076+
CODE:
2077+
...
2078+
2079+
Generally there is no reason to use the old style any more, apart from a few
2080+
obscure features that can be specified on an INPUT line but not in the
2081+
signature.
2082+
19462083

1947-
=head2 XSUB Parameters
2084+
=head2 An XSUB Parameter
19482085

19492086
=head3 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords
19502087

@@ -1971,7 +2108,7 @@ pointers.
19712108

19722109
The return list of the generated Perl function consists of the C return value
19732110
from the function (unless the XSUB is of C<void> return type or
1974-
C<The NO_OUTPUT Keyword> was used) followed by all the C<OUTLIST>
2111+
C<NO_OUTPUT> was used) followed by all the C<OUTLIST>
19752112
and C<IN_OUTLIST> parameters (in the order of appearance). On the
19762113
return from the XSUB the C<IN_OUT>/C<OUT> Perl parameter will be
19772114
modified to have the values written by the C function.
@@ -2614,7 +2751,7 @@ executed after the C subroutine call is performed. When the POSTCALL:
26142751
keyword is used it must precede OUTPUT: and CLEANUP: blocks which are
26152752
present in the XSUB.
26162753

2617-
See examples in L<"The NO_OUTPUT Keyword">.
2754+
See an example in L<"An XSUB Declaration">.
26182755

26192756
The POSTCALL: block does not make a lot of sense when the C subroutine
26202757
call is supplied by user by providing either CODE: or PPCODE: section.
