Commit 1278636

Merge pull request #173 from dgelessus/php_serialize_phar
Add PHP serialized value and phar archive format
2 parents a9d65e9 + 24191ef commit 1278636

2 files changed

+636
-0
lines changed

archive/phar_without_stub.ksy

Lines changed: 310 additions & 0 deletions
@@ -0,0 +1,310 @@
1+
meta:
2+
id: phar_without_stub
3+
title: PHP phar archive (without stub)
4+
application: PHP
5+
file-extension: phar
6+
xref:
7+
wikidata: Q1269709
8+
license: CC0-1.0
9+
ks-version: 0.9
10+
imports:
11+
- /serialization/php_serialized_value
12+
endian: le
13+
doc: |
14+
A phar (PHP archive) file. The phar format is a custom archive format
15+
from the PHP ecosystem that is used to package a complete PHP library
16+
or application into a single self-contained archive.
17+
All phar archives start with an executable PHP stub, which can be used to
18+
allow executing or including phar files as if they were regular PHP scripts.
19+
PHP 5.3 and later include the phar extension, which adds native support for
20+
reading and manipulating phar files.
21+
22+
The phar format was originally developed as part of the PEAR library
23+
PHP_Archive, first released in 2005. Later, a native PHP extension
24+
named "phar" was developed, which was first released on PECL in 2007,
25+
and is included with PHP 5.3 and later. The phar extension has effectively
26+
superseded the PHP_Archive library, which has not been updated since 2010.
27+
The phar extension is also no longer released independently on PECL;
28+
it is now developed and released as part of PHP itself.
29+
30+
Because of current limitations in Kaitai Struct
31+
(see kaitai-io/kaitai_struct#158 and kaitai-io/kaitai_struct#538),
32+
the executable PHP stub that precedes the rest of the archive is not handled
33+
by this spec. Before parsing a phar using this spec, the stub must be
34+
removed manually.
35+
36+
A phar's stub is terminated by the special token `__HALT_COMPILER();`
37+
(which may be followed by at most one space, the PHP tag end `?>`,
38+
and an optional line terminator). The stub termination sequence is
39+
immediately followed by the remaining parts of the phar format,
40+
as described in this spec.
41+
42+
The phar stub usually contains code that loads the phar and runs
43+
a contained PHP file, but this is not required. A minimal valid phar stub
44+
is `<?php __HALT_COMPILER();` - such a stub makes it impossible to execute
45+
the phar directly, but still allows loading or manipulating it using the
46+
phar extension.
47+
48+
Note: The phar format does not specify any encoding for text fields
49+
(stub, alias name, and all file names), so these fields may contain arbitrary
50+
binary data. The actual text encoding used in a specific phar file usually
51+
depends on the application that created the phar, and on the
52+
standard encoding of the system on which the phar was created.
53+
doc-ref:
54+
- 'https://www.php.net/manual/en/phar.fileformat.php'
55+
- 'https://github.com/php/php-src/tree/master/ext/phar'
56+
- 'https://svn.php.net/viewvc/pecl/phar/'
57+
- 'https://svn.php.net/viewvc/pear/packages/PHP_Archive/'
58+
seq:
59+
- id: manifest
60+
type: manifest
61+
doc: |
62+
The archive's manifest, containing general metadata about the archive
63+
and its files.
64+
- id: files
65+
size: manifest.file_entries[_index].len_data_compressed
66+
repeat: expr
67+
repeat-expr: manifest.num_files
68+
doc: |
69+
The contents of each file in the archive (possibly compressed,
70+
as indicated by the file's flags in the manifest). The files are stored
71+
in the same order as they appear in the manifest.
72+
- id: signature
73+
type: signature
74+
size-eos: true
75+
if: manifest.flags.has_signature
76+
doc: |
77+
The archive's signature - a digest of all archive data before
78+
the signature itself.
79+
80+
Note: Almost all of the available "signature" types are actually hashes,
81+
not signatures, and cannot be used to verify that the archive has not
82+
been tampered with. Only the OpenSSL signature type is a true
83+
cryptographic signature.
84+
enums:
85+
signature_type:
86+
0x1:
87+
id: md5
88+
-orig-id: PHAR_SIG_MD5
89+
doc: Indicates an MD5 hash.
90+
0x2:
91+
id: sha1
92+
-orig-id: PHAR_SIG_SHA1
93+
doc: Indicates a SHA-1 hash.
94+
0x4:
95+
id: sha256
96+
-orig-id: PHAR_SIG_SHA256
97+
doc: |
98+
Indicates a SHA-256 hash. Available since API version 1.1.0,
99+
PHP_Archive 0.12.0 and phar extension 1.1.0.
100+
0x8:
101+
id: sha512
102+
-orig-id: PHAR_SIG_SHA512
103+
doc: |
104+
Indicates a SHA-512 hash. Available since API version 1.1.0,
105+
PHP_Archive 0.12.0 and phar extension 1.1.0.
106+
0x10:
107+
id: openssl
108+
-orig-id: PHAR_SIG_OPENSSL
109+
doc: |
110+
Indicates an OpenSSL signature. Available since API version 1.1.1,
111+
PHP_Archive 0.12.0 (even though it claims to only support
112+
API version 1.1.0) and phar extension 1.3.0. This type is not
113+
documented in the phar extension's documentation of the phar format.
114+
115+
Note: In older versions of the phar extension, this value was used
116+
for an undocumented and unimplemented "PGP" signature type
117+
(`PHAR_SIG_PGP`).
118+
types:
119+
serialized_value:
120+
seq:
121+
- id: raw
122+
size-eos: true
123+
doc: The serialized value, as a raw byte array.
124+
instances:
125+
parsed:
126+
pos: 0
127+
type: php_serialized_value
128+
doc: The serialized value, parsed as a structure.
129+
file_flags:
130+
seq:
131+
- id: value
132+
type: u4
133+
doc: The unparsed flag bits.
134+
instances:
135+
permissions:
136+
value: value & 0x1ff
137+
-orig-id: PHAR_ENT_PERM_MASK
138+
doc: The file's permission bits.
139+
zlib_compressed:
140+
value: (value & 0x1000) != 0
141+
-orig-id: PHAR_ENT_COMPRESSED_GZ
142+
doc: Whether this file's data is stored using zlib compression.
143+
bzip2_compressed:
144+
value: (value & 0x2000) != 0
145+
-orig-id: PHAR_ENT_COMPRESSED_BZ2
146+
doc: Whether this file's data is stored using bzip2 compression.
147+
file_entry:
148+
seq:
149+
- id: len_filename
150+
type: u4
151+
doc: The length of the file name, in bytes.
152+
- id: filename
153+
size: len_filename
154+
doc: |
155+
The name of this file. If the name ends with a slash, this entry
156+
represents a directory, otherwise a regular file. Directory entries
157+
are supported since phar API version 1.1.1.
158+
(Explicit directory entries are only needed for empty directories.
159+
Non-empty directories are implied by the files located inside them.)
160+
- id: len_data_uncompressed
161+
type: u4
162+
doc: The length of the file's data when uncompressed, in bytes.
163+
- id: timestamp
164+
type: u4
165+
doc: |
166+
The time at which the file was added or last updated, as a
167+
Unix timestamp.
168+
- id: len_data_compressed
169+
type: u4
170+
doc: The length of the file's data when compressed, in bytes.
171+
- id: crc32
172+
type: u4
173+
doc: The CRC32 checksum of the file's uncompressed data.
174+
- id: flags
175+
type: file_flags
176+
doc: Flags for this file.
177+
- id: len_metadata
178+
type: u4
179+
doc: The length of the metadata, in bytes, or 0 if there is none.
180+
- id: metadata
181+
size: len_metadata
182+
type: serialized_value
183+
if: len_metadata != 0
184+
doc: |
185+
Metadata for this file, in the format used by PHP's
186+
`serialize` function. The meaning of the serialized data is not
187+
specified further; it may be used to store arbitrary custom data
188+
about the file.
189+
api_version:
190+
meta:
191+
endian: be
192+
seq:
193+
- id: release
194+
type: b4
195+
- id: major
196+
type: b4
197+
- id: minor
198+
type: b4
199+
- id: unused
200+
type: b4
201+
doc: |
202+
A phar API version number. This version number is meant to indicate
203+
which features are used in a specific phar, so that tools reading
204+
the phar can easily check that they support all necessary features.
205+
206+
The following API versions exist so far:
207+
208+
* 0.5, 0.6, 0.7, 0.7.1: The first official API versions. At this point,
209+
the phar format was only used by the PHP_Archive library, and the
210+
API version numbers were identical to the PHP_Archive versions that
211+
supported them. Development of the native phar extension started around
212+
API version 0.7. These API versions could only be queried using the
213+
`PHP_Archive::APIversion()` method, but were not stored physically
214+
in archives. These API versions are not supported by this spec.
215+
* 0.8.0: Used by PHP_Archive 0.8.0 (released 2006-07-18) and
216+
later development versions of the phar extension. This is the first
217+
version number to be physically stored in archives. This API version
218+
is not supported by this spec.
219+
* 0.9.0: Used by later development/early beta versions of the
220+
phar extension. Also temporarily used by PHP_Archive 0.9.0
221+
(released 2006-12-15), but reverted back to API version 0.8.0 in
222+
PHP_Archive 0.9.1 (released 2007-01-05).
223+
* 1.0.0: Supported since PHP_Archive 0.10.0 (released 2007-05-29)
224+
and phar extension 1.0.0 (released 2007-03-28). This is the first
225+
stable, forwards-compatible and documented version of the format.
226+
* 1.1.0: Supported since PHP_Archive 0.12.0 (released 2015-07-06)
227+
and phar extension 1.1.0 (released 2007-04-12). Adds SHA-256 and
228+
SHA-512 signature types.
229+
* 1.1.1: Supported since phar extension 2.0.0 (released 2009-07-29 and
230+
included with PHP 5.3 and later). (PHP_Archive 0.12.0 also supports
231+
all features from API version 1.1.1, but it reports API version 1.1.0.)
232+
Adds the OpenSSL signature type and support for storing
233+
empty directories.
234+
global_flags:
235+
seq:
236+
- id: value
237+
type: u4
238+
doc: The unparsed flag bits.
239+
instances:
240+
any_zlib_compressed:
241+
value: (value & 0x1000) != 0
242+
-orig-id: PHAR_HDR_COMPRESSED_GZ
243+
doc: |
244+
Whether any of the files in this phar are stored using
245+
zlib compression.
246+
any_bzip2_compressed:
247+
value: (value & 0x2000) != 0
248+
-orig-id: PHAR_HDR_COMPRESSED_BZ2
249+
doc: |
250+
Whether any of the files in this phar are stored using
251+
bzip2 compression.
252+
has_signature:
253+
value: (value & 0x10000) != 0
254+
-orig-id: PHAR_HDR_SIGNATURE
255+
doc: Whether this phar contains a signature.
256+
manifest:
257+
seq:
258+
- id: len_manifest
259+
type: u4
260+
doc: |
261+
The length of the manifest, in bytes.
262+
263+
Note: The phar extension does not allow reading manifests
264+
larger than 100 MiB.
265+
- id: num_files
266+
type: u4
267+
doc: The number of files in this phar.
268+
- id: api_version
269+
type: api_version
270+
doc: The API version used by this phar manifest.
271+
- id: flags
272+
type: global_flags
273+
doc: Global flags for this phar.
274+
- id: len_alias
275+
type: u4
276+
doc: The length of the alias, in bytes.
277+
- id: alias
278+
size: len_alias
279+
doc: |
280+
The phar's alias, i.e. the name under which it is loaded into PHP.
281+
- id: len_metadata
282+
type: u4
283+
doc: The size of the metadata, in bytes, or 0 if there is none.
284+
- id: metadata
285+
size: len_metadata
286+
type: serialized_value
287+
if: len_metadata != 0
288+
doc: |
289+
Metadata for this phar, in the format used by PHP's
290+
`serialize` function. The meaning of the serialized data is not
291+
specified further; it may be used to store arbitrary custom data
292+
about the archive.
293+
- id: file_entries
294+
type: file_entry
295+
repeat: expr
296+
repeat-expr: num_files
297+
doc: Manifest entries for the files contained in this phar.
298+
signature:
299+
seq:
300+
- id: data
301+
size: _io.size - _io.pos - 8
302+
doc: |
303+
The signature data. The size and contents depend on the
304+
signature type.
305+
- id: type
306+
type: u4
307+
enum: signature_type
308+
doc: The signature type.
309+
- id: magic
310+
contents: "GBMB"
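
The spec's doc string says the stub must be removed manually before parsing. As a rough illustration only, the Python sketch below shows one way that might be done, followed by a hand-decoding of the first manifest fields so the layout can be checked without a generated parser. The helper names `strip_stub` and `parse_manifest_header` are hypothetical and not part of this commit. The sketch assumes that the first occurrence of `__HALT_COMPILER();` ends the stub, and that the optional trailer is at most one space, then `?>`, then at most one line terminator, as described in the doc string; PHP's own handling may differ in edge cases.

```python
import struct

HALT_TOKEN = b"__HALT_COMPILER();"


def strip_stub(data: bytes) -> bytes:
    """Return the bytes that follow the stub termination sequence."""
    # Assumption: the first occurrence of the token terminates the stub.
    start = data.find(HALT_TOKEN)
    if start < 0:
        raise ValueError("no __HALT_COMPILER(); token found")
    pos = start + len(HALT_TOKEN)
    # Optional trailer: at most one space, then the PHP closing tag.
    for closer in (b" ?>", b"?>"):
        if data.startswith(closer, pos):
            pos += len(closer)
            # Optional single line terminator after the closing tag.
            for eol in (b"\r\n", b"\n"):
                if data.startswith(eol, pos):
                    pos += len(eol)
                    break
            break
    return data[pos:]


def parse_manifest_header(body: bytes) -> dict:
    """Decode the fixed-size start of the manifest plus the alias."""
    # len_manifest and num_files are little-endian u4 values.
    len_manifest, num_files = struct.unpack_from("<II", body, 0)
    # api_version is two bytes of big-endian 4-bit fields:
    # release, major, minor, unused.
    b0, b1 = body[8], body[9]
    flags, len_alias = struct.unpack_from("<II", body, 10)
    alias = body[18:18 + len_alias]
    return {
        "len_manifest": len_manifest,
        "num_files": num_files,
        "api_version": (b0 >> 4, b0 & 0x0F, b1 >> 4),
        "any_zlib_compressed": bool(flags & 0x1000),   # PHAR_HDR_COMPRESSED_GZ
        "any_bzip2_compressed": bool(flags & 0x2000),  # PHAR_HDR_COMPRESSED_BZ2
        "has_signature": bool(flags & 0x10000),        # PHAR_HDR_SIGNATURE
        "alias": alias,
    }
```

The output of `strip_stub` is what this spec expects as its input stream. `parse_manifest_header` stops before the metadata and file entries; those are left to a parser generated from the spec.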
