Commit 839b70e

Merge branch 'develop'
2 parents 9361d88 + 0c854a6 commit 839b70e

5 files changed: +1835 -90 lines changed

.coveragerc

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,8 @@
22
omit =
33
venv/*
44
bin/*
5+
build/*
6+
dist/*
57
.pytest_cache/*
68
LICENSE
79
README.md

README.md

Lines changed: 87 additions & 56 deletions
Original file line number | Diff line number | Diff line change
@@ -1,115 +1,146 @@
11
![Test Coverage](coverage.svg)
22

33
# Protobuf Decoder
4-
Simple protobuf decoder for python
54

5+
A simple protobuf decoder for Python
66

77
# Motivation
8+
89
The goal of this project is to decode protobuf binary data without proto files.
910

1011
# Installation
12+
1113
Install using pip
1214

1315
`pip install protobuf-decoder`
1416

1517
# Simple Examples
16-
```
18+
19+
``` python
1720
"""
18-
# proto
19-
message Test1 {
20-
string a = 1;
21-
}
22-
23-
# message
24-
{
25-
"a": "테스트"
26-
}
27-
28-
# binary
29-
0A 09 ED 85 8C EC 8A A4 ED 8A B8
21+
# proto
22+
message Test1 {
23+
string a = 1;
24+
}
25+
26+
# message
27+
{
28+
"a": "테스트"
29+
}
30+
31+
# binary
32+
0A 09 ED 85 8C EC 8A A4 ED 8A B8
3033
"""
3134
from protobuf_decoder.protobuf_decoder import Parser
3235

3336
test_target = "0A 09 ED 85 8C EC 8A A4 ED 8A B8"
3437
parsed_data = Parser().parse(test_target)
35-
>> parsed_data
36-
>> [ParsedResult(field=1, wire_type="string", data='테스트')]
38+
assert parsed_data == ParsedResults([ParsedResult(field=1, wire_type="string", data='테스트')])
39+
assert parsed_data.to_dict() == {'results': [{'field': 1, 'wire_type': 'string', 'data': '테스트'}]}
3740
```
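
To make the example above easier to follow, here is a minimal sketch that decodes the same bytes by hand using only the standard library. It does not use `protobuf_decoder`; the helper name `split_tag` is hypothetical and only illustrates the wire format described in the Google encoding document. It assumes a single-byte tag and a single-byte length, which holds for this input.

```python
def split_tag(tag_byte):
    """Split a single-byte protobuf tag into (field_number, wire_type)."""
    # The low three bits hold the wire type, the remaining bits the field number.
    return tag_byte >> 3, tag_byte & 0x07


raw = bytes.fromhex("0A 09 ED 85 8C EC 8A A4 ED 8A B8")

field_number, wire_type = split_tag(raw[0])  # 0x0A: field 1, wire type 2 (length-delimited)
length = raw[1]                              # 0x09: nine payload bytes follow
payload = raw[2:2 + length]
text = payload.decode("utf-8")               # the payload is valid UTF-8 text
```

Wire type 2 only says "length-delimited"; whether the payload is a string, raw bytes, or a nested message has to be guessed when no proto file is available.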
3841

39-
40-
```
42+
``` python
4143
"""
42-
# proto
43-
message Test1 {
44-
int32 a = 1;
45-
}
46-
47-
message Test2 {
48-
Test1 b = 3;
49-
}
50-
51-
# message
52-
{
53-
"a": {
54-
"b": 150
55-
}
56-
}
57-
58-
# binary
59-
1a 03 08 96 01
44+
# proto
45+
message Test1 {
46+
int32 a = 1;
47+
}
48+
49+
message Test2 {
50+
Test1 b = 3;
51+
}
52+
53+
# message
54+
{
55+
"b": {
56+
"a": 150
57+
}
58+
}
59+
60+
# binary
61+
1a 03 08 96 01
6062
"""
6163
from protobuf_decoder.protobuf_decoder import Parser
6264

6365
test_target = "1a 03 08 96 01"
6466
parsed_data = Parser().parse(test_target)
67+
assert parsed_data == ParsedResults(
68+
[
69+
ParsedResult(field=3, wire_type="length_delimited", data=ParsedResults([
70+
ParsedResult(field=1, wire_type="varint", data=150)
71+
]))
72+
])
73+
assert parsed_data.to_dict() == {'results': [{'field': 3, 'wire_type': 'length_delimited', 'data': {
74+
'results': [{'field': 1, 'wire_type': 'varint', 'data': 150}]}}]}
6575

66-
>> parsed_data
67-
>> [ParsedResult(field=3, wire_type="length_delimited", data=[ParsedResult(field=1, wire_type="varint", data=150)])]
6876
```
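
The nested example can also be traced by hand. The sketch below is a hypothetical helper (`decode_varint` is not part of the library's documented API) that decodes base-128 varints and walks the five bytes of the example. It assumes well-formed input and does no bounds checking.

```python
def decode_varint(data, pos=0):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # the low 7 bits carry the payload
        if not byte & 0x80:              # a clear high bit marks the last byte
            return value, pos
        shift += 7


raw = bytes.fromhex("1a 03 08 96 01")

outer_tag, pos = decode_varint(raw)       # 0x1a: field 3, wire type 2
outer_len, pos = decode_varint(raw, pos)  # three bytes of nested payload
inner = raw[pos:pos + outer_len]          # 08 96 01

inner_tag, ipos = decode_varint(inner)    # 0x08: field 1, wire type 0 (varint)
value, ipos = decode_varint(inner, ipos)  # 0x96 0x01 decodes to 150
```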
6977

70-
```
78+
``` python
7179
"""
72-
# proto
73-
message Test1 {
74-
required string a = 1;
75-
}
80+
# proto
81+
message Test1 {
82+
required string a = 1;
83+
}
7684
77-
# message
78-
{
79-
"a": "✊"
80-
}
85+
# message
86+
{
87+
"a": "✊"
88+
}
8189
82-
# binary
83-
0A 03 E2 9C 8A
90+
# binary
91+
0A 03 E2 9C 8A
8492
85-
"""
93+
"""
8694
from protobuf_decoder.protobuf_decoder import Parser
8795

8896
test_target = "0A 03 E2 9C 8A"
8997
parsed_data = Parser().parse(test_target)
90-
>> parsed_data
91-
>> [ParsedResult(field=1, wire_type="string", data='✊')]
98+
assert parsed_data == ParsedResults([ParsedResult(field=1, wire_type="string", data='✊')])
99+
assert parsed_data.to_dict() == {'results': [{'field': 1, 'wire_type': 'string', 'data': '✊'}]}
100+
92101
```
93102

94103
# Nested Protobuf Detection Logic
95-
Our project implements a distinct method to determine whether a given input is possibly a nested protobuf.
96-
The core of this logic is the `is_maybe_nested_protobuf` function.
104+
105+
Our project implements a distinct method to determine whether a given input is possibly a nested protobuf.
106+
The core of this logic is the `is_maybe_nested_protobuf` function.
97107
We recently enhanced this function to provide a more accurate distinction and handle nested protobufs effectively.
98108

99109
### Current Logic
110+
100111
The `is_maybe_nested_protobuf` function works by:
101112

102113
- Attempting to convert the given hex string to UTF-8.
103114
- Checking the ordinal values of the first four characters of the converted data.
104115
- Returning `True` if the data might be a nested protobuf based on certain conditions, and `False` otherwise.
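
The steps above could be sketched roughly as follows. This is not the library's `is_maybe_nested_protobuf`: the function name `looks_like_nested_message` and the exact condition (a control character among the first four characters) are assumptions chosen for illustration, since the README does not spell out the real conditions.

```python
def looks_like_nested_message(hex_string):
    """Hypothetical heuristic; NOT the library's is_maybe_nested_protobuf."""
    raw = bytes.fromhex(hex_string)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so it cannot be a plain string; treat it as possibly nested.
        return True
    # Assumed condition: control characters near the start look more like
    # tag/length bytes than readable text.
    return any(ord(ch) < 0x20 for ch in text[:4])


nested = looks_like_nested_message("08 96 01")                    # invalid UTF-8
plain = looks_like_nested_message("ED 85 8C EC 8A A4 ED 8A B8")   # readable text
```

A heuristic like this can misclassify input, for example text that starts with a tab or newline, which is one reason the function is described as a "maybe" check.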
105116

106117
### Extensibility
107-
You can extend or modify the `is_maybe_nested_protobuf` function based on your specific requirements or use-cases.
108-
If you find a scenario where the current logic can be further improved,
118+
119+
You can extend or modify the `is_maybe_nested_protobuf` function based on your specific requirements or use-cases.
120+
If you find a scenario where the current logic can be further improved,
109121
feel free to adapt the function accordingly.
110122

111123
(A big shoutout to **@fuzzyrichie** for their significant contributions to this update!)
112124

125+
# Remaining Bytes
126+
127+
If bytes are left over after parsing, the parser returns them as a hex string in `remain_data`.
128+
129+
```python
130+
from protobuf_decoder.protobuf_decoder import Parser
131+
132+
test_target = "ed 85 8c ec 8a a4 ed 8a b8"
133+
parsed_data = Parser().parse(test_target)
134+
135+
assert parsed_data.has_results is False
136+
assert parsed_data.has_remain_data is True
137+
138+
assert parsed_data.remain_data == "ed 85 8c ec 8a a4 ed 8a b8"
139+
assert parsed_data.to_dict() == {'results': [], 'remain_data': 'ed 85 8c ec 8a a4 ed 8a b8', }
140+
141+
```
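
One plausible reason this input yields no results (an assumption about the wire format, not taken from the parser's source) is that every byte has its high bit set, so a varint tag never terminates. The hypothetical helper `try_read_varint` below illustrates this with the standard library only.

```python
def try_read_varint(data, pos=0):
    """Return (value, next_pos), or None if the data ends mid-varint."""
    value = 0
    shift = 0
    while pos < len(data):
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # a clear high bit would end the varint
            return value, pos
        shift += 7
    return None              # ran out of bytes before the varint ended


raw = bytes.fromhex("ed 85 8c ec 8a a4 ed 8a b8")

all_continuation = all(b & 0x80 for b in raw)  # every byte asks for another byte
result = try_read_varint(raw)                  # no complete tag can be read
```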
142+
113143

114144
# Reference
145+
115146
- [Google protocol-buffers encoding document](https://developers.google.com/protocol-buffers/docs/encoding)

coverage.svg

Lines changed: 2 additions & 2 deletions