README.md: 74 additions & 6 deletions
@@ -11,27 +11,95 @@ Currently the following engines are supported:
 Usage Example
 -------------

+The recommended way to use this library is to fetch results from Hive/Impala via the memory-efficient iterator, which keeps the connection open and scrolls through the results a few rows at a time. This allows large result sets to be processed one record at a time, minimizing PHP's memory consumption.
+
 ```php
 // Load this lib
 require_once __DIR__ . '/ThriftSQL.phar';

-// Try out a Hive query
+// Try out a Hive query via iterator object
+$hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
+$hiveTables = $hive
+  ->connect()
+  ->getIterator( 'SHOW TABLES' );
+
+// Try out an Impala query via iterator object
+$impala = new \ThriftSQL\Impala( 'impala.host.local' );
+$impalaTables = $impala
+  ->connect()
+  ->getIterator( 'SHOW TABLES' );
+
+// Execute the Hive query and iterate over the result set
+foreach( $hiveTables as $rowNum => $row ) {
+  print_r( $row );
+}
+
+// Execute the Impala query and iterate over the result set
+foreach( $impalaTables as $rowNum => $row ) {
+  print_r( $row );
+}
+
+// Don't forget to close socket connection once you're done with it
+$hive->disconnect();
+$impala->disconnect();
+```
+
+The downside of the memory-efficient iterator is that the result set can only be iterated over once. If a second `foreach` is called on the same iterator object, an exception is thrown by default. This prevents the same query from executing on Hive/Impala again, since results are not cached within the PHP client. This check can be turned off; however, be aware that iterating over the same iterator object again may produce different results because the query is rerun.
+
+Consider the following example:
+
+```php
+// Connect to Hive and get a rerun-able iterator
+$hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
+$results = $hive
+  ->connect()
+  ->getIterator( 'SELECT UNIX_TIMESTAMP()' )
+  ->allowRerun( true );
+
+// Execute the Hive query and get results
+foreach( $results as $rowNum => $row ) {
+  echo "Hive server time is: {$row[0]}\n";
+}
+
+sleep(3);
+
+// Execute the Hive query a second time
+foreach( $results as $rowNum => $row ) {
+  echo "Hive server time is: {$row[0]}\n";
+}
+```
+
+Which will output something like:
+
+```
+Hive server time is: 1517875200
+Hive server time is: 1517875203
+```
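The single-pass rule described above can be pictured with a small self-contained stand-in. This is only a sketch and not ThriftSQL's code: the `OneShotRows` class, the callable it wraps, and the use of `RuntimeException` are all assumptions made for illustration, since the README does not say which exception class the library throws.

```php
<?php
// Hypothetical stand-in for a single-pass result iterator. This is NOT the
// ThriftSQL implementation; the class name and exception type are assumptions.
final class OneShotRows implements IteratorAggregate
{
    /** @var callable Produces the rows; called once per permitted pass. */
    private $fetch;
    private $allowRerun = false;
    private $passes = 0;

    public function __construct(callable $fetch)
    {
        $this->fetch = $fetch;
    }

    // Opt-in switch, modelled on the allowRerun( true ) call shown in the README.
    public function allowRerun(bool $value): self
    {
        $this->allowRerun = $value;
        return $this;
    }

    public function getIterator(): Iterator
    {
        // Refuse a second pass unless reruns were explicitly enabled.
        if ($this->passes > 0 && !$this->allowRerun) {
            throw new RuntimeException('Already iterated; rerun is disabled.');
        }
        $this->passes++;
        return new ArrayIterator(($this->fetch)());
    }
}

$queryRuns = 0;
$rows = new OneShotRows(function () use (&$queryRuns) {
    $queryRuns++;          // counts how often the pretend query executes
    return [[time()]];     // one row with one column
});

foreach ($rows as $row) {
    echo "First pass: {$row[0]}\n";
}

$refused = false;
try {
    foreach ($rows as $row) {
        echo "Not reached\n";
    }
} catch (RuntimeException $e) {
    $refused = true;       // the second pass is rejected by default
}

// Opting in re-executes the callable, so values may differ between passes.
foreach ($rows->allowRerun(true) as $row) {
    echo "Rerun pass: {$row[0]}\n";
}
```

With the real library, the same idea would presumably mean wrapping a second `foreach` in a `try`/`catch`, or calling `allowRerun( true )` up front when re-executing the query is acceptable.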
+
+If the result set is small and it would be easier to load all of it into PHP memory, use the `queryAndFetchAll()` method, which returns the full result set as a plain, numerically indexed multidimensional array.
+
+```php
+// Try out a small Hive query
 $hive = new \ThriftSQL\Hive( 'hive.host.local', 10000, 'user', 'pass' );
 $hiveTables = $hive
   ->connect()
   ->queryAndFetchAll( 'SHOW TABLES' );
+$hive->disconnect();
+
+// Print out the cached results
 print_r( $hiveTables );
+```

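To make the "plain numeric multidimensional array" shape concrete, here is a minimal sketch that works on hand-written sample data, not on a live connection. The table names are invented, and the assumption is that each row is itself a numerically indexed array, as the description of `queryAndFetchAll()` suggests.

```php
<?php
// Hypothetical sample standing in for a fetched one-column result set:
// a list of rows, each row a numerically indexed array of column values.
$tables = [
    ['events'],
    ['users'],
    ['orders'],
];

// Rows and columns are both addressed by integer position.
$firstTable = $tables[0][0];

// array_column() flattens a single column into a simple list.
$names = array_column($tables, 0);

foreach ($names as $name) {
    echo $name, "\n";
}
```

Because the whole result is already in memory, it can be traversed as many times as needed without rerunning the query.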
-// Try out an Impala query
+```php
+// Try out a small Impala query
 $impala = new \ThriftSQL\Impala( 'impala.host.local' );
 $impalaTables = $impala
   ->connect()
   ->queryAndFetchAll( 'SHOW TABLES' );
-print_r( $impalaTables );
-
-// Don't forget to clear the client and close socket.