38 | 38 | "cell_type": "markdown", |
39 | 39 | "metadata": {}, |
40 | 40 | "source": [ |
41 | | - "Look at filesystem to see files extracted from BigQuery" |
| 41 | + "Look at filesystem to see files extracted from BigQuery (or Kaggle: https://www.kaggle.com/davidshinn/github-issues/)" |
42 | 42 | ] |
43 | 43 | }, |
44 | 44 | { |
45 | 45 | "cell_type": "code", |
46 | | - "execution_count": 6, |
| 46 | + "execution_count": 9, |
47 | 47 | "metadata": {}, |
48 | 48 | "outputs": [ |
49 | 49 | { |
50 | 50 | "name": "stdout", |
51 | 51 | "output_type": "stream", |
52 | 52 | "text": [ |
53 | | - "-rw-r--r-- 1 root root 272M Jan 16 00:41 seq2seqdata000000000000.csv\r\n", |
54 | | - "-rw-r--r-- 1 root root 272M Jan 16 00:41 seq2seqdata000000000001.csv\r\n", |
55 | | - "-rw-r--r-- 1 root root 272M Jan 16 00:41 seq2seqdata000000000002.csv\r\n", |
56 | | - "-rw-r--r-- 1 root root 273M Jan 16 00:41 seq2seqdata000000000003.csv\r\n", |
57 | | - "-rw-r--r-- 1 root root 273M Jan 16 00:41 seq2seqdata000000000004.csv\r\n", |
58 | | - "-rw-r--r-- 1 root root 273M Jan 16 00:41 seq2seqdata000000000005.csv\r\n", |
59 | | - "-rw-r--r-- 1 root root 273M Jan 16 00:41 seq2seqdata000000000006.csv\r\n", |
60 | | - "-rw-r--r-- 1 root root 273M Jan 16 00:41 seq2seqdata000000000007.csv\r\n", |
61 | | - "-rw-r--r-- 1 root root 272M Jan 16 00:41 seq2seqdata000000000008.csv\r\n", |
62 | | - "-rw-r--r-- 1 root root 272M Jan 16 00:41 seq2seqdata000000000009.csv\r\n" |
| 53 | + "-rw-r--r-- 1 40294 40294 2.7G Jan 18 2018 github_issues.csv\r\n" |
63 | 54 | ] |
64 | 55 | } |
65 | 56 | ], |
66 | 57 | "source": [ |
67 | | - "!ls -lah | grep csv" |
| 58 | + "!ls -lah | grep github_issues.csv" |
68 | 59 | ] |
69 | 60 | }, |
70 | 61 | { |
|
76 | 67 | }, |
77 | 68 | { |
78 | 69 | "cell_type": "code", |
79 | | - "execution_count": 8, |
| 70 | + "execution_count": 11, |
80 | 71 | "metadata": {}, |
81 | 72 | "outputs": [ |
82 | 73 | { |
|
115 | 106 | " </thead>\n", |
116 | 107 | " <tbody>\n", |
117 | 108 | " <tr>\n", |
118 | | - " <th>344296</th>\n", |
119 | | - " <td>\"https://github.com/Tendrl/node-agent/issues/617\"</td>\n", |
120 | | - " <td>node_agent should handle sds native alerts also</td>\n", |
121 | | - " <td>some of the sds alerts do not have clearing alerts. so it always present in alerting directory. these kinds of alerts should be stored in etcd under /alerting/notify, it never goes to alerting/alerts directory and it is not displayed under alerts in ui also. these kinds of alerts are notified via notification channel and deleted via ttl. node_agent should have a logic to handle this in alerting framework.</td>\n", |
| 109 | + " <th>3165423</th>\n", |
| 110 | + " <td>\"https://github.com/1000hz/bootstrap-validator/issues/574\"</td>\n", |
| 111 | + " <td>uncaught typeerror: f b is not a function when using $ ... .validator 'update'</td>\n", |
| 112 | + " <td>the above error is being thrown when i try and run update via js to include some new fields that have been added dynamically. i'm using backbone.js rendering a script template element to add a new set up fields based on user interaction. the full error message is: uncaught typeerror: f b is not a function at htmlformelement.<anonymous> validator.min.js:9 at function.each jquery.min.js:2 at n.fn.init.each jquery.min.js:2 at n.fn.init.b as validator validator.min.js:9 at n.initskillgroup app.l...</td>\n", |
122 | 113 | " </tr>\n", |
123 | 114 | " <tr>\n", |
124 | | - " <th>177469</th>\n", |
125 | | - " <td>\"https://github.com/Eonasdan/bootstrap-datetimepicker/issues/2032\"</td>\n", |
126 | | - " <td>dst problems with some timezones</td>\n", |
127 | | - " <td>hello! i have created a datetimepicker with approximataly following config: $element.datetimepicker { locale: 'ru', timezone: 'europe/moscow', defaultdate: moment 614116800000 , format: 'dd.mm.yyyy' } but it shows the date 17.06.1989 instead of 18.06.1989. where can be the problem and what are the ways to resolve it? the plugin version is 4.17.47</td>\n", |
| 115 | + " <th>2763145</th>\n", |
| 116 | + " <td>\"https://github.com/quasar-analytics/quasar/issues/2821\"</td>\n", |
| 117 | + " <td>invoke endpoint regression</td>\n", |
| 118 | + " <td>problem accures in versions: 21.x.x , 23.x.x and 24.x.x didn't check 22.x.x first query is put to view mount sql select from /test-mount/testdb/flatviz the second one sql select row.seriesone as seriesone, row.seriestwo as seriestwo, min row.measureone as measureone from output_of_first_query as row group by row.seriesone, row.seriestwo order by row.seriesone asc, row.seriestwo asc the third one is sql select from output_of_second_query where seriesone = one-one in 20.14.13 this works as exp...</td>\n", |
128 | 119 | " </tr>\n", |
129 | 120 | " <tr>\n", |
130 | | - " <th>243616</th>\n", |
131 | | - " <td>\"https://github.com/Simperium/simperium-js/issues/22\"</td>\n", |
132 | | - " <td>two way sync not working as expected.</td>\n", |
133 | | - " <td>i'm having an issue syncing data. can someone tell me if i'm doing it wrong. i posted this in stackoverflow and got no responses. whats happening is in window one. if i update my teams array then do a bucket.update 'team-1',teams ; in console 2 i see the new updated teams object and its put into simperium properly. however in window 2 when i do the exact same thing after receiving the new teams object window 1 doesn't get the update nor does simperium. code is bellow. var bucket = simperium....</td>\n", |
| 121 | + " <th>3882729</th>\n", |
| 122 | + " <td>\"https://github.com/msharov/ustl/issues/79\"</td>\n", |
| 123 | + " <td>build ustl with clang on linux</td>\n", |
| 124 | + " <td>hi, on ubuntu 14.04 clang 3.4, gcc 4.8.4 and fedora 22 clang 3.5, gcc 5.3.1 : cc=clang cxx=clang++ ./configure --libdir=path/to/libsupc++.a without --libdir it searches for libcxxabi when cc=clang make works fine, make check however shows quite a few diffs. is such configuration supposed to work? thanks!</td>\n", |
134 | 125 | " </tr>\n", |
135 | 126 | " </tbody>\n", |
136 | 127 | "</table>\n", |
137 | 128 | "</div>" |
138 | 129 | ], |
139 | 130 | "text/plain": [ |
140 | | - " issue_url \\\n", |
141 | | - "344296 \"https://github.com/Tendrl/node-agent/issues/617\" \n", |
142 | | - "177469 \"https://github.com/Eonasdan/bootstrap-datetimepicker/issues/2032\" \n", |
143 | | - "243616 \"https://github.com/Simperium/simperium-js/issues/22\" \n", |
| 131 | + " issue_url \\\n", |
| 132 | + "3165423 \"https://github.com/1000hz/bootstrap-validator/issues/574\" \n", |
| 133 | + "2763145 \"https://github.com/quasar-analytics/quasar/issues/2821\" \n", |
| 134 | + "3882729 \"https://github.com/msharov/ustl/issues/79\" \n", |
144 | 135 | "\n", |
145 | | - " issue_title \\\n", |
146 | | - "344296 node_agent should handle sds native alerts also \n", |
147 | | - "177469 dst problems with some timezones \n", |
148 | | - "243616 two way sync not working as expected. \n", |
| 136 | + " issue_title \\\n", |
| 137 | + "3165423 uncaught typeerror: f b is not a function when using $ ... .validator 'update' \n", |
| 138 | + "2763145 invoke endpoint regression \n", |
| 139 | + "3882729 build ustl with clang on linux \n", |
149 | 140 | "\n", |
150 | | - " body \n", |
151 | | - "344296 some of the sds alerts do not have clearing alerts. so it always present in alerting directory. these kinds of alerts should be stored in etcd under /alerting/notify, it never goes to alerting/alerts directory and it is not displayed under alerts in ui also. these kinds of alerts are notified via notification channel and deleted via ttl. node_agent should have a logic to handle this in alerting framework. \n", |
152 | | - "177469 hello! i have created a datetimepicker with approximataly following config: $element.datetimepicker { locale: 'ru', timezone: 'europe/moscow', defaultdate: moment 614116800000 , format: 'dd.mm.yyyy' } but it shows the date 17.06.1989 instead of 18.06.1989. where can be the problem and what are the ways to resolve it? the plugin version is 4.17.47 \n", |
153 | | - "243616 i'm having an issue syncing data. can someone tell me if i'm doing it wrong. i posted this in stackoverflow and got no responses. whats happening is in window one. if i update my teams array then do a bucket.update 'team-1',teams ; in console 2 i see the new updated teams object and its put into simperium properly. however in window 2 when i do the exact same thing after receiving the new teams object window 1 doesn't get the update nor does simperium. code is bellow. var bucket = simperium.... " |
| 141 | + " body \n", |
| 142 | + "3165423 the above error is being thrown when i try and run update via js to include some new fields that have been added dynamically. i'm using backbone.js rendering a script template element to add a new set up fields based on user interaction. the full error message is: uncaught typeerror: f b is not a function at htmlformelement.<anonymous> validator.min.js:9 at function.each jquery.min.js:2 at n.fn.init.each jquery.min.js:2 at n.fn.init.b as validator validator.min.js:9 at n.initskillgroup app.l... \n", |
| 143 | + "2763145 problem accures in versions: 21.x.x , 23.x.x and 24.x.x didn't check 22.x.x first query is put to view mount sql select from /test-mount/testdb/flatviz the second one sql select row.seriesone as seriesone, row.seriestwo as seriestwo, min row.measureone as measureone from output_of_first_query as row group by row.seriesone, row.seriestwo order by row.seriesone asc, row.seriestwo asc the third one is sql select from output_of_second_query where seriesone = one-one in 20.14.13 this works as exp... \n", |
| 144 | + "3882729 hi, on ubuntu 14.04 clang 3.4, gcc 4.8.4 and fedora 22 clang 3.5, gcc 5.3.1 : cc=clang cxx=clang++ ./configure --libdir=path/to/libsupc++.a without --libdir it searches for libcxxabi when cc=clang make works fine, make check however shows quite a few diffs. is such configuration supposed to work? thanks! " |
154 | 145 | ] |
155 | 146 | }, |
156 | | - "execution_count": 8, |
| 147 | + "execution_count": 11, |
157 | 148 | "metadata": {}, |
158 | 149 | "output_type": "execute_result" |
159 | 150 | } |
160 | 151 | ], |
161 | 152 | "source": [ |
162 | 153 | "#read in data sample 2M rows (for speed of tutorial)\n", |
163 | | - "traindf, testdf = train_test_split(\n", |
164 | | - " pd.concat([\n", |
165 | | - " pd.read_csv(f) for f in glob.glob('*.csv')\n", |
166 | | - " ]).sample(n=2000000), \n", |
| 154 | + "traindf, testdf = train_test_split(pd.read_csv('github_issues.csv').sample(n=2000000), \n", |
167 | 155 | " test_size=.10)\n", |
168 | 156 | "\n", |
169 | 157 | "\n", |
|
1544 | 1532 | "outputs": [], |
1545 | 1533 | "source": [ |
1546 | 1534 | "# Read All 5M data points\n", |
1547 | | - "all_data_df = pd.concat([pd.read_csv(f) for f in glob.glob('*.csv')])\n", |
| 1535 | + "all_data_df = pd.read_csv('github_issues.csv')\n", |
1548 | 1536 | "# Extract the bodies from this dataframe\n", |
1549 | 1537 | "all_data_bodies = all_data_df['body'].tolist()" |
1550 | 1538 | ] |
|
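Review note: the diff swaps `glob.glob('*.csv')` over the BigQuery shards for a hard-coded read of `github_issues.csv`. One way to keep both data layouts working is to resolve the input paths first, then read them. The following is a minimal sketch, not the notebook author's code. The helper name `find_issue_csvs`, its parameters, and the `seq2seqdata*.csv` pattern (inferred from the shard names in the removed `ls` output) are assumptions for illustration.

```python
import glob
import os


def find_issue_csvs(data_dir=".",
                    single_file="github_issues.csv",
                    shard_pattern="seq2seqdata*.csv"):
    """Return the CSV paths to load (hypothetical helper, not in the notebook).

    Prefers the single Kaggle-style file when it exists; otherwise falls back
    to the BigQuery export shards, sorted by name. Choosing one source avoids
    loading the same issues twice if both layouts sit in the same directory.
    """
    single_path = os.path.join(data_dir, single_file)
    if os.path.exists(single_path):
        return [single_path]
    return sorted(glob.glob(os.path.join(data_dir, shard_pattern)))
```

In the notebook this could be used as `pd.concat([pd.read_csv(f) for f in find_issue_csvs()])`, which reduces to a single `pd.read_csv` call when only `github_issues.csv` is present.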